Welcome to the course on Statistics For Data Scientists!
Learn about the key concepts in statistics, and how to apply them to your data analysis.
A highly practical and hands-on approach.
A focus on building an intuitive understanding of each topic.
Learn to use Python code to simulate various scenarios in a plug-and-play manner.
What is included in the course:
Detailed Course Notes (100-page textbook with 50+ illustrative figures)
Deck of 360 slides
Lectures with 10+ hours of content spread over 40+ videos
All of the code in Jupyter Notebooks (7 notebooks, 2000+ lines of code)
Bonus Chapter: Introduction to Machine Learning
Topics that the course covers:
The Histogram
Generating artificial data sets
The central tenet of Statistics
The Central Limit Theorem
Distribution functions
Percentiles
Data Ranges
Cumulative Distribution Function
Different Distribution types:
Normal Distribution
Uniform Distribution
Exponential Distribution
Poisson Distribution
Bernoulli Distribution
Rayleigh Distribution
Statistical Testing
Reasoning behind statistical testing
P-value
Statistical Significance
Different Statistical Tests:
Shapiro-Wilk test
Levene's test
Student's t-test / Welch's t-test
ANOVA test
Kolmogorov-Smirnov test
Non-parametric tests
Two real-life examples
Detect a biased coin at the 95% confidence level
Real-life A/B testing
Correlation
Linear correlation - Pearson correlation coefficient + alternatives
Categorical correlation - Chi-Squared test + contingency tables
EXTRA: Regression and intro to Machine Learning
Linear Regression
Logistic Regression + ML pipeline
Who is this course for:
Students on a data science track or in any other technical field.
Professionals who want to pivot into a data science career.
Managers who want to make data-driven decisions.
Practicing Data Scientists who want to add this valuable skill to their tool belt.