Automated Feature Engineering

Xin Hunt, Machine Learning Developer, SAS Institute Inc.

Feature engineering plays a significant role in the success of a machine learning model. Most of the effort in training a model goes into data preparation and choosing the right representation. In this talk, I will focus on a robust feature engineering method, Randomized Union of Locally Linear Subspaces (RULLS). We generate sparse, non-negative, and rotation-invariant features in an unsupervised fashion. RULLS aggregates features from a random union of subspaces by describing each point using globally chosen landmarks. These landmarks serve as anchor points for choosing subspaces. Our method provides a way to select features that are relevant in the neighborhood around these chosen landmarks. Distances from each data point to its k closest landmarks are encoded in the feature matrix. The final feature representation is a union of features from all chosen subspaces. We demonstrate the effectiveness of our algorithm on various real-world datasets for tasks such as clustering and classification, both on raw data and in the presence of noise. We also compare our method with existing feature generation methods. The results show that our method performs well on both classification and clustering tasks.
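
The landmark encoding described in the abstract can be illustrated with a small sketch. This is not the RULLS implementation from the talk; it is a simplified stand-in that makes several assumptions: landmarks are drawn uniformly at random from the data points, each random draw of landmarks plays the role of one "subspace", and Euclidean distance is used. The function name `landmark_distance_features` and its parameters `n_landmarks`, `k`, `n_rounds`, and `seed` are hypothetical.

```python
import numpy as np


def landmark_distance_features(X, n_landmarks=8, k=3, n_rounds=4, seed=0):
    """Illustrative sketch only, not the RULLS algorithm itself.

    Assumptions: landmarks are sampled uniformly from the rows of X,
    and each round of sampling stands in for one chosen subspace.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    blocks = []
    for _ in range(n_rounds):
        # Pick landmarks among the data points themselves.
        idx = rng.choice(n, size=n_landmarks, replace=False)
        landmarks = X[idx]
        # Euclidean distance from every point to every landmark: (n, n_landmarks).
        dists = np.linalg.norm(X[:, None, :] - landmarks[None, :, :], axis=2)
        # Column indices of the k closest landmarks for each point.
        nearest = np.argpartition(dists, k - 1, axis=1)[:, :k]
        # Keep only those k distances; every other entry stays zero (sparse).
        block = np.zeros_like(dists)
        rows = np.arange(n)[:, None]
        block[rows, nearest] = dists[rows, nearest]
        blocks.append(block)
    # The final representation is the union (concatenation) of all rounds.
    return np.hstack(blocks)


if __name__ == "__main__":
    X = np.random.default_rng(1).normal(size=(50, 5))
    F = landmark_distance_features(X)
    print(F.shape)
```

Because each entry is a Euclidean distance between two data points, the features in this sketch are non-negative and unchanged when the data is rotated, and each row has at most `k` non-zero entries per round. How the actual method selects landmarks and subspaces is not specified in the abstract.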