Select rows based on conditions, like a SQL WHERE clause.
# Single condition
df[df["age"] > 30]
# Multiple conditions: combine with & (and), | (or), ~ (not), and wrap each condition in parentheses
df[(df["age"] > 25) & (df["city"] == "NY")]
# Select columns
df[["name", "age"]]
Very common in ML: selecting the rows to train on, or dropping outliers before fitting.
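As a rough illustration of filtering out outliers, here is a minimal sketch. The toy data and the 0 to 100 age bounds are assumptions made up for this example, not values from the notes above.

```python
import pandas as pd

# Hypothetical toy data; column names follow the examples above.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy", "Dee", "Eve"],
    "age": [22, 35, 41, 29, 120],
    "city": ["NY", "LA", "NY", "SF", "NY"],
})

# Drop implausible ages (assumed bounds; between() is inclusive on both ends).
clean = df[df["age"].between(0, 100)]

# Negate a condition with ~ : everyone who is NOT in NY.
not_ny = clean[~(clean["city"] == "NY")]

# .loc filters rows and selects columns in a single step.
ny_over_25 = clean.loc[
    (clean["age"] > 25) & (clean["city"] == "NY"),
    ["name", "age"],
]
```

Using .loc with a row mask and a column list avoids chaining two separate indexing steps.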
Group rows by a category and compute statistics per group (like an Excel pivot table).
# Average age per city
df.groupby("city")["age"].mean()
# Different aggregation per column: mean salary and total bonus per department
df.groupby("department").agg({
    "salary": "mean",
    "bonus": "sum"
})
Used in ML: aggregate data before modeling, or compute per-group statistics to use as features.
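One way to turn group statistics into features is sketched below. The toy data and the new column names (dept_mean_salary, salary_vs_dept, mean_salary, total_bonus) are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical toy data; column names follow the examples above.
df = pd.DataFrame({
    "department": ["eng", "eng", "ops", "ops", "ops"],
    "salary": [100, 120, 60, 70, 80],
    "bonus": [10, 20, 5, 5, 0],
})

# One row per department, with named output columns.
summary = df.groupby("department").agg(
    mean_salary=("salary", "mean"),
    total_bonus=("bonus", "sum"),
)

# transform() returns one value per original row, so the group mean
# can be attached directly as a new feature column.
df["dept_mean_salary"] = df.groupby("department")["salary"].transform("mean")

# Derived feature: how far each salary is from its department mean.
df["salary_vs_dept"] = df["salary"] - df["dept_mean_salary"]
```

agg() collapses each group to a single row, while transform() keeps the original row count, which is usually what a feature column needs.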