Feature Selection

During feature engineering, many features may be created. How do we select the ones that are relevant for modeling?

Feature selection methods can be grouped into several main types:

Filter-based: Features are filtered according to a metric computed independently of any model. Examples include correlation and eta-squared for numerical variables, and the chi-square statistic and Cramér's V for categorical variables.
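
Below is a minimal sketch of filter-based selection with scikit-learn. The DataFrame `df`, the column names, and the target column `target` are illustrative assumptions, not a prescribed setup.

```python
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2

# Numeric features: rank by absolute Pearson correlation with the target.
numeric_cols = ["age", "income", "tenure"]                    # hypothetical columns
corr = df[numeric_cols].corrwith(df["target"]).abs()
top_numeric = corr.sort_values(ascending=False).head(2).index.tolist()

# Categorical features (one-hot or count encoded, non-negative values):
# keep the k features with the largest chi-square statistic.
cat_cols = ["plan_basic", "plan_pro", "region_eu"]            # hypothetical columns
selector = SelectKBest(score_func=chi2, k=2).fit(df[cat_cols], df["target"])
top_categorical = [c for c, keep in zip(cat_cols, selector.get_support()) if keep]
```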

Wrapper-based: Feature selection is treated as a search problem, in which a model is repeatedly trained and evaluated on candidate feature subsets. An example is Recursive Feature Elimination (RFE).
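
A minimal sketch of wrapper-based selection using scikit-learn's RFE follows; the synthetic dataset, the logistic regression estimator, and the choice of five features are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           random_state=0)

# RFE repeatedly fits the estimator, drops the weakest feature(s),
# and refits until only the requested number of features remains.
rfe = RFE(estimator=LogisticRegression(max_iter=1000),
          n_features_to_select=5, step=1)
rfe.fit(X, y)

selected_mask = rfe.support_   # boolean mask of the retained features
ranking = rfe.ranking_         # 1 = selected; larger values were eliminated earlier
```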

Embedded: Embedded methods use algorithms that perform feature selection as part of model training. Examples include Lasso regression and Random Forest; a short sketch follows the note below.

  • In tree-based methods such as Random Forest, the importance of a feature in a single decision tree is the total decrease in node impurity at the nodes that split on that feature, divided by the total impurity decrease across all nodes of the tree. Within each tree, these importances are then normalized by their sum so that each lies between 0 and 1. The final importance of a feature is the average of its per-tree importances over all decision trees in the forest.
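
Here is a minimal sketch of embedded selection in scikit-learn. The synthetic regression data and the regularization strength are illustrative assumptions; `feature_importances_` on the Random Forest exposes the impurity-based importances described above.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=500, n_features=10, n_informative=4,
                       noise=10.0, random_state=0)

# Lasso: the L1 penalty shrinks uninformative coefficients exactly to zero,
# so the remaining non-zero coefficients act as the selected features.
lasso = Lasso(alpha=1.0).fit(X, y)
lasso_selected = np.flatnonzero(lasso.coef_)

# Random Forest: feature_importances_ holds the impurity-based importances,
# averaged over trees and normalized to sum to 1.
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
importances = rf.feature_importances_
rf_ranked = np.argsort(importances)[::-1]   # feature indices, most important first
```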