Pandas is one of the most widely used Python libraries for data manipulation and analysis, and it is built on NumPy.
Core object: **DataFrame** (like an Excel sheet or a SQL table)
Install: pip install pandas
Import convention: import pandas as pd
import pandas as pd
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "city": ["NY", "LA", "SF"]
})
print(df)
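To make the table analogy concrete, here is a minimal sketch that reuses the same illustrative data and reads one column by name and one row by position. The variable names `ages` and `first_row` are only for illustration.

```python
import pandas as pd

# Same illustrative data as above
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "city": ["NY", "LA", "SF"],
})

ages = df["age"]          # one column, returned as a pandas Series
first_row = df.iloc[0]    # one row, selected by position

print(ages.mean())        # 30.0
print(first_row["name"])  # Alice
print(df.shape)           # (3, 3)
```

Columns behave like labeled arrays, which is why column-wise operations such as `mean()` are short to write.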
Most common way to load data: read a CSV file (many real ML datasets are distributed as CSV)
df = pd.read_csv("data.csv")
print(df.head()) # first 5 rows
df.info() # types & non-null counts (prints directly; returns None, so no print() needed)
print(df.describe()) # stats for numeric columns (count, mean, std, min/max, quartiles)
print(df.shape) # (rows, columns)
print(df.columns) # column names (a pandas Index)
These calls are used constantly in ML to inspect a dataset before modeling.
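The inspection calls above can be tried without a file on disk. What follows is a minimal sketch, assuming a small made-up CSV held in memory with `io.StringIO`. The content of `csv_text` is hypothetical and stands in for `data.csv`, and one age is left blank on purpose so the missing-value reporting has something to show.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for data.csv; Bob's age is missing on purpose
csv_text = """name,age,city
Alice,25,NY
Bob,,LA
Charlie,35,SF
"""

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())         # first rows of the table
df.info()                # prints dtypes and non-null counts itself
print(df.shape)          # (3, 3)
print(list(df.columns))  # ['name', 'age', 'city']
print(df.isna().sum())   # number of missing values per column
```

Because of the blank entry, pandas stores `age` as a float column with a `NaN`, which is the kind of detail worth catching before modeling.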