During feature engineering, many candidate features may be created. How should we select the ones that are relevant for modeling?
Feature selection methods fall into three main types:
Filter-based: Features are ranked by a statistic computed independently of any model, such as correlation with the target or mutual information.
Wrapper-based: Feature selection is framed as a search over feature subsets, where each subset is evaluated by training a model on it. An example is Recursive Feature Elimination (RFE).
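A minimal sketch of wrapper-based selection using scikit-learn's `RFE`; the synthetic dataset, the logistic-regression estimator, and the choice of keeping 3 features are illustrative assumptions, not part of the method itself.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Synthetic data: 10 features, only 3 of which are informative (assumption).
X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=3, random_state=0)

# RFE repeatedly fits the estimator and drops the weakest feature
# (smallest coefficient magnitude) until 3 features remain.
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=3)
selector.fit(X, y)

print(selector.support_)   # boolean mask of the selected features
print(selector.ranking_)   # rank 1 marks a selected feature
```

Because the model is retrained at every elimination step, wrapper methods are more expensive than filter methods but can capture feature interactions the model actually uses.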
Embedded: Embedded methods use algorithms with built-in feature selection. Examples include Lasso regression and random forests.
- In Lasso, the L1 penalty shrinks the coefficients of irrelevant features to exactly zero, effectively removing them from the model.
- In tree-based methods such as random forests, a feature's importance within a single decision tree is the total decrease in node impurity at the nodes that split on that feature, divided by the total impurity decrease across all nodes in the tree. These per-tree importances are normalized so they sum to 1 across features, and the final importance of each feature is its average over all trees in the forest.
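The two embedded methods above can be sketched with scikit-learn; the synthetic regression dataset and the regularization strength `alpha=1.0` are illustrative assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso

# Synthetic data: 10 features, only 3 informative (assumption).
X, y = make_regression(n_samples=200, n_features=10,
                       n_informative=3, noise=0.1, random_state=0)

# Lasso: the L1 penalty drives irrelevant coefficients to exactly zero.
lasso = Lasso(alpha=1.0).fit(X, y)
print("Nonzero Lasso coefficients:", (lasso.coef_ != 0).sum())

# Random forest: impurity-based importances, averaged over trees and
# normalized so they sum to 1 across features.
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print("Importances sum to:", rf.feature_importances_.sum())
```

Features with zero Lasso coefficients, or near-zero forest importances, are candidates for removal. Note that impurity-based importances are biased toward high-cardinality features; permutation importance is a common alternative check.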