Having explored numerical computation with NumPy, we now turn our attention to managing and manipulating structured data, a fundamental task in any machine learning project. Real-world data is rarely clean or perfectly formatted for analysis. This chapter introduces the Pandas library, the standard Python tool for data wrangling.
You will learn about the core Pandas data structures, the one-dimensional Series and the two-dimensional DataFrame, which provide powerful and flexible ways to handle tabular data. We will cover essential operations including:
Selecting data by label with .loc and by integer position with .iloc
Grouping and aggregating data with .groupby
By the end of this chapter, you will be equipped to use Pandas to efficiently prepare diverse datasets for analysis and machine learning model building.
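As a quick preview of the structures and accessors named above, here is a minimal sketch. The data, column names, and index labels are hypothetical, invented purely for illustration; the later sections cover each operation in detail.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.

# A Series is a one-dimensional labeled array.
prices = pd.Series([1.5, 2.0, 0.5], index=["apple", "banana", "cherry"])

# A DataFrame is a two-dimensional table with labeled rows and columns.
df = pd.DataFrame(
    {
        "store": ["north", "south", "north", "south"],
        "item": ["apple", "apple", "banana", "banana"],
        "units": [10, 4, 7, 3],
    },
    index=["a", "b", "c", "d"],
)

# .loc selects by label: the row labeled "b", the column named "units".
by_label = df.loc["b", "units"]

# .iloc selects by integer position: second row, third column (same cell).
by_position = df.iloc[1, 2]

# .groupby splits the rows by a key, then aggregates within each group.
units_per_store = df.groupby("store")["units"].sum()

print(prices["banana"])
print(by_label, by_position)
print(units_per_store)
```

In this sketch, .loc and .iloc happen to reach the same cell because the label "b" sits at position 1. With a different index, the two accessors would generally diverge, which is why the distinction matters.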
3.1 Introduction to Pandas Data Structures
3.2 Loading Data from Various Sources
3.3 Data Indexing and Selection
3.4 Handling Missing Data
3.5 Data Cleaning and Transformation Techniques
3.6 Grouping and Aggregation Operations
3.7 Merging, Joining, and Concatenating DataFrames
3.8 Time Series Data Handling in Pandas
3.9 Practice: Data Wrangling with Pandas
© 2025 ApX Machine Learning