One case study, five models from data preprocessing to implementation with Python, with some examples where no coding is required.
We will cover the following topics in this case study
Problem Statement
Data
Data Preprocessing 1
Understanding Dataset
Data change and Data Statistics
Data Preprocessing 2
Missing values
Replacing missing values
Correlation Matrix
Data Preprocessing 3
Outliers
Outliers Detection Techniques
Percentile-based outlier detection
Mean Absolute Deviation (MAD)-based outlier detection
Standard Deviation (STD)-based outlier detection
Majority-vote based outlier detection
Visualizing outlier
Data Preprocessing 4
Handling outliers
Feature Engineering
Models Selected
·K-Nearest Neighbor (KNN)
·Logistic regression
·AdaBoost
·GradientBoosting
·RandomForest
·Performing the Baseline Training
Understanding the testing matrix
·The Mean accuracy of the trained models
·The ROC-AUC score
ROC
AUC
Performing the Baseline Testing
Problems with this Approach
Optimization Techniques
·Understanding key concepts to optimize the approach
Cross-validation
The approach of using CV
Hyperparameter tuning
Grid search parameter tuning
Random search parameter tuning
Optimized Parameters Implementation
·Implementing a cross-validation based approach
·Implementing hyperparameter tuning
·Implementing and testing the revised approach
·Understanding problems with the revised approach
Implementation of the revised approach
·Implementing the best approach
Log transformation of features
Voting-based ensemble ML model
·Running ML models on real test data
Best approach & Summary
Examples with No Code
Downloads – Full Code