• Antoreep Jana

Feature Selection vs Feature Extraction: Dimensionality Reduction

Dimensionality Reduction is a set of methods used to counter the curse of dimensionality. When the number of independent features increases significantly, the dataset becomes more likely to include redundant and unnecessary features. Cutting the feature set down to what actually carries information is known as Dimensionality Reduction.

There are two main ways to do this:

1) Feature Selection

In Feature Selection, the aim is to find the 'k' most significant features out of all the available ones, so that the chosen subset is a good representative of the entire dataset. Whether a feature is significant enough is usually decided by scoring it and comparing the score against a threshold value.

Various tests and methods can be used to produce these scores: ANOVA, Pearson correlation, Spearman correlation, the Chi-Squared test, and tree-based methods such as Random Forest.
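To make the idea concrete, here is a minimal sketch in plain Python. It assumes Pearson correlation with the target as the score and simply keeps the top 'k' features. The function names `pearson` and `select_k_best`, and the toy data, are made up for illustration; in practice a library routine would normally be used instead.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

def select_k_best(features, target, k):
    """Rank features by |correlation with target| and keep the top k names."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Toy data (invented): two informative columns and one unrelated column.
features = {
    "size": [50, 60, 80, 100, 120],
    "rooms": [1, 2, 2, 3, 4],
    "noise": [7, 1, 9, 3, 5],
}
target = [100, 125, 160, 205, 240]

selected, scores = select_k_best(features, target, k=2)
print(selected)
```

A threshold-based variant would keep every feature whose score is above a chosen cutoff instead of a fixed 'k'.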

2) Feature Extraction

In Feature Extraction, a new, transformed set of features is generated from the original ones. PCA (Principal Component Analysis) is one of the most commonly used methods; it builds new features that capture the majority of the variance of the dataset. Neural Networks also perform Feature Extraction in their hidden layers. Linear Discriminant Analysis is another widely used method, which, unlike PCA, makes use of class labels. t-SNE is mainly used to visualize a high-dimensional distribution in 2 or 3 dimensions.
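The variance-capturing idea behind PCA can be sketched in plain Python. This is only an illustration under stated assumptions: it finds just the first principal component, and it uses power iteration on the covariance matrix rather than the decomposition a real library would use. The helper names and the toy data are invented for the example.

```python
import math

def center(rows):
    """Subtract the column mean from every value."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_component(cov, iters=200):
    """Leading eigenvector and eigenvalue of cov via power iteration."""
    d = len(cov)
    v = [1.0] * d  # assumed starting vector, not orthogonal to the answer here
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cov_v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(a * b for a, b in zip(v, cov_v))
    return v, eigenvalue

# Toy data (invented): two nearly collinear features.
data = [[2.0, 1.9], [4.0, 4.1], [6.0, 6.2], [8.0, 7.8], [10.0, 10.0]]

centered = center(data)
cov = covariance(centered)
component, eigenvalue = first_component(cov)

# Share of total variance captured by the single new feature.
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained_ratio = eigenvalue / total_variance

# The extracted feature: each row projected onto the component.
projected = [sum(x * c for x, c in zip(row, component)) for row in centered]
print(round(explained_ratio, 4), projected)
```

Because the two original columns move together, one extracted feature is enough to retain almost all of the variance, which is the sense in which the new feature set "captures" the dataset.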
