Introduction to Feature Engineering
Chapter 1: The Role of Features in Machine Learning
Revisiting the Machine Learning Workflow
What Constitutes a Feature?
Impact of Feature Quality on Model Performance
Common Data Types and Their Challenges
Overview of Feature Engineering Tasks
Chapter 2: Handling Missing Data
Identifying Missing Values
Mechanisms of Missing Data (MCAR, MAR, MNAR)
Simple Imputation Strategies: Mean, Median, Mode
Creating Missing Value Indicators
Multivariate Imputation: KNN Imputer
Multivariate Imputation: Iterative Imputer
Comparing Imputation Methods
Hands-on Practical: Imputing Missing Data
Chapter 3: Encoding Categorical Features
Challenges with Categorical Data
Nominal vs. Ordinal Categories
One-Hot Encoding for Nominal Features
Ordinal Encoding for Ordered Features
Handling High Cardinality Features
Target Encoding (Mean Encoding)
Comparing Encoding Methods
Hands-on Practical: Applying Encoding Techniques
Chapter 4: Feature Scaling and Transformation
The Need for Feature Scaling
Standardization (Z-score Scaling)
Normalization (Min-Max Scaling)
Log Transformation for Skewed Data
Yeo-Johnson Transformation
Choosing the Right Scaling/Transformation Method
Hands-on Practical: Scaling and Transforming Features
Chapter 5: Feature Creation
Motivation for Creating New Features
Feature Creation from Date/Time Data
Binning Numerical Features
Domain-Specific Feature Engineering
Automated Feature Creation (Introduction)
Hands-on Practical: Engineering New Features
Chapter 6: Feature Selection
Importance of Feature Selection
Filter Methods: Variance Threshold
Filter Methods: Univariate Statistical Tests (ANOVA F-value, Chi-Squared)
Filter Methods: Correlation Analysis
Wrapper Methods: Recursive Feature Elimination (RFE)
Wrapper Methods: Sequential Feature Selection (SFS)
Embedded Methods Overview
Embedded Methods: L1 Regularization (Lasso)
Embedded Methods: Tree-Based Feature Importance
Hands-on Practical: Selecting Features