UCLA Statistics Seminar
Interactive Discovery in Large Data Sets

Center for Applied Statistics Presents

Applied Statistics Seminar Series

Tue, 10/16/2012, 11:00 AM–12:00 PM
5225 Math Sciences Bldg.

Kiri Wagstaff

Jet Propulsion Laboratory

What is the best way to dive in and explore a new data set? I will discuss a new machine learning problem, iterative discovery, that seeks to enable users to interactively explore a large data set and quickly identify items of interest. Our solution uses an incremental Principal Components Analysis strategy to incorporate user feedback and to explain its selections. This makes it useful to mission scientists, especially those working with instruments that produce very high data volumes. I will share results of experiments with hyperspectral data from instruments on Mars orbiters and rovers, as well as text data (log files) from a ground-based radio telescope.

