Lesson 14: Pandas – Filtering, Grouping & Basic Cleaning

1. Filtering Data

Select the rows that satisfy a condition, much like a SQL WHERE clause.

# Single condition
df[df["age"] > 30]

# Multiple conditions: combine with & (and), | (or), ~ (not); wrap each condition in parentheses
df[(df["age"] > 25) & (df["city"] == "NY")]

# Select columns
df[["name", "age"]]

Filtering is very common in ML, for example to select training data or to remove outliers.
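To make the filtering patterns above concrete, here is a minimal runnable sketch. The sample data, the variable names, and the 0 to 100 age range used for outlier removal are all assumptions made up for illustration; they are not part of the lesson.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Dee", "Eli"],
    "age": [22, 31, 45, 28, 120],
    "city": ["NY", "LA", "NY", "NY", "LA"],
})

# Single condition: rows where age is over 30
over_30 = df[df["age"] > 30]

# Combined conditions: older than 25 AND living in NY
ny_over_25 = df[(df["age"] > 25) & (df["city"] == "NY")]

# Negation with ~: everyone who does NOT live in NY
not_ny = df[~(df["city"] == "NY")]

# Simple outlier removal: keep ages inside an assumed plausible range
plausible = df[df["age"].between(0, 100)]

# Filter rows first, then select columns
print(ny_over_25[["name", "age"]])
```

The parentheses around each condition matter: & and | bind more tightly than comparison operators in Python, so leaving the parentheses out typically raises an error or gives an unintended result.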

Exercise 1

How do you get the rows where score >= 90?

2. Grouping & Aggregation

Group rows by a category and compute statistics per group, similar to an Excel pivot table.

# Average age per city
df.groupby("city")["age"].mean()

# Different aggregation per column: mean salary and total bonus per department
df.groupby("department").agg({
    "salary": "mean",
    "bonus": "sum"
})

In ML, grouping is used to summarize data before modeling and to compute per-group feature statistics.
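As a rough sketch of what "feature stats" can look like in practice, the example below computes a per-department summary and then attaches each row's group mean as a new column. The data and the column names dept_mean_salary and salary_vs_dept are hypothetical, chosen only for this illustration.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({
    "department": ["eng", "eng", "ops", "ops", "ops"],
    "salary": [100, 120, 80, 90, 70],
    "bonus": [10, 20, 5, 5, 0],
})

# One row per department: mean salary and total bonus
summary = df.groupby("department").agg({"salary": "mean", "bonus": "sum"})

# transform returns a value for every original row, so the group mean
# can be stored next to each employee as a feature
df["dept_mean_salary"] = df.groupby("department")["salary"].transform("mean")

# Derived feature: how far each salary is from its department average
df["salary_vs_dept"] = df["salary"] - df["dept_mean_salary"]

print(summary)
print(df)
```

The difference between the two calls is the output shape: agg collapses each group to a single row, while transform keeps the original number of rows, which is usually what is needed when building model features.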

Exercise 2

Suppose df has the columns "species" and "petal_length".
Fill in the blanks to get the average petal_length per species:
df.groupby("____")["____"].____()

Exercise 3

Which are common Pandas cleaning tasks?
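The lesson text does not show any cleaning calls, so the following is only a hedged sketch of three operations that are often grouped under basic cleaning: removing duplicate rows, dropping rows with missing values, and filling missing values. The data and the choice to fill with the column mean are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one duplicate row and one missing score
raw = pd.DataFrame({
    "name": ["Ana", "Ben", "Ben", "Cy"],
    "score": [90.0, 82.0, 82.0, np.nan],
})

# Remove exact duplicate rows (the second "Ben" row)
deduped = raw.drop_duplicates()

# Option A: drop rows where score is missing
dropped = deduped.dropna(subset=["score"])

# Option B: fill the missing score with the column mean (an assumed strategy)
filled = deduped.fillna({"score": deduped["score"].mean()})

print(dropped)
print(filled)
```

Whether to drop or fill depends on the dataset; dropping loses rows, while filling keeps them at the cost of introducing estimated values.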