Lesson 13: Intro to Pandas – DataFrames & Data Exploration

1. What is Pandas?

Pandas is a widely used Python library for data manipulation and analysis, built on top of NumPy.

Core object: **DataFrame**, a two-dimensional labeled table (like an Excel sheet or a SQL table)

Install: pip install pandas

Import convention: import pandas as pd

import pandas as pd
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "city": ["NY", "LA", "SF"]
})
print(df)
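
To make the table analogy concrete, here is a minimal sketch that pulls pieces out of the same DataFrame. It reuses the example data above; the variable names `ages` and `age_array` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "city": ["NY", "LA", "SF"],
})

ages = df["age"]                 # selecting one column gives a pandas Series
print(type(ages).__name__)       # Series
print(ages.mean())               # 30.0
print(df.shape)                  # (3, 3)
print(df.loc[1, "name"])         # Bob (row label 1, column "name")

age_array = ages.to_numpy()      # the column's values as a NumPy array
print(type(age_array).__name__)  # ndarray
```

Columns are selected by name and rows by index label, which is what separates a DataFrame from a plain list of lists.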

Exercise 1

What does df.head() usually show?

2. Loading & Exploring Data

The most common starting point is reading a CSV file, since many real ML datasets are distributed in that format.

df = pd.read_csv("data.csv")
print(df.head())          # first 5 rows
df.info()                 # column types & non-null counts (prints directly, returns None)
print(df.describe())      # stats (mean, std, min/max)
print(df.shape)           # (rows, columns)
print(df.columns)         # column names (an Index object)

These calls are used constantly in ML: always inspect a dataset before modeling.
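
The inspection calls can be tried without a file on disk. The sketch below assumes a small made-up CSV held in a string (`csv_text` and its rows are hypothetical) and reads it through `io.StringIO`, which `pd.read_csv` accepts in place of a file path.

```python
import io
import pandas as pd

# Hypothetical CSV contents, kept in memory so no file is needed.
csv_text = """name,age,city
Alice,25,NY
Bob,30,LA
Charlie,35,SF
Dana,40,NY
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)                    # (4, 3)
print(list(df.columns))            # ['name', 'age', 'city']
df.info()                          # prints its own summary and returns None
summary = df.describe()            # covers numeric columns only by default
print(summary.loc["mean", "age"])  # 32.5
```

Because `describe()` returns a DataFrame, individual statistics can be looked up by row and column label rather than read off the printed output.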

Exercise 2

After loading df = pd.read_csv("iris.csv"), what does df.shape return if there are 150 rows and 5 columns?
(___, ___)

Exercise 3

Which Pandas methods help find missing values?