Welcome to this course. Pandas is a popular Python library used by data scientists and analysts worldwide to manipulate and analyze data. The library is massive, and even frequent users are often unaware of many of its more impressive features. This course presents data manipulation techniques in pandas for complex analysis across various domains, so that you can be more productive with data and generate business insights that inform your decisions. You will be guided through real-world data science problems and shown how to apply key techniques in realistic examples and exercises. Activities then challenge you to apply your new skills in a way that prepares you for real data science projects.
You’ll see how experienced data scientists tackle a wide range of problems with pandas. You will start with an overview of data analysis and progress step by step from modeling data, to accessing data from remote sources, to numeric and statistical analysis, to indexing and aggregate analysis, and finally to visualizing statistical data and applying pandas to finance.
In this course, you'll learn how to:
Access and load data from different sources using pandas
Master the fundamentals of pandas to quickly begin exploring any dataset
Isolate any subset of data by properly selecting and querying the data
Work with a range of data types and structures to understand your data
Split data into independent groups before applying aggregations and transformations to each group
Restructure data into tidy form to make data analysis and visualization easier
Transform data to prepare it for analysis
Prepare real-world messy datasets for machine learning
Combine and merge data from different sources through pandas' SQL-like operations
Use Matplotlib for data visualization to create a variety of plots
Create data models to find relationships and test hypotheses
Manipulate time-series data to perform date-time calculations
Utilize pandas' unparalleled time series functionality
Create beautiful and insightful visualizations through pandas' direct hooks to Matplotlib and Seaborn
Optimize your code to ensure more efficient business data analysis
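To give a flavor of a few of the techniques listed above (merging with SQL-like operations, splitting data into groups, and applying aggregations and transformations to each group), here is a minimal sketch. The `orders` and `customers` tables, their column names, and their values are hypothetical and invented purely for illustration; they are not taken from the course material.

```python
import pandas as pd

# Hypothetical sample data, invented purely for illustration.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 11, 10, 12],
    "amount": [25.0, 40.0, 15.0, 30.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "region": ["North", "South", "North"],
})

# SQL-like left join: attach each customer's region to their orders.
merged = orders.merge(customers, on="customer_id", how="left")

# Split-apply-combine: total and mean order amount per region.
summary = merged.groupby("region")["amount"].agg(["sum", "mean"])

# Group-wise transformation: each order's share of its region's total.
region_total = merged.groupby("region")["amount"].transform("sum")
merged["share_of_region"] = merged["amount"] / region_total

print(summary)
print(merged)
```

The `transform` call returns a result aligned with the original rows, which is why it can be divided into the `amount` column directly, whereas `agg` collapses each group to a single row.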
At the end of this course, you’ll have the knowledge, skills, and confidence you need to solve your own challenging data science problems with pandas.