Real ML projects combine multiple sources (users + purchases + logs).
Pandas merge = SQL JOIN: combine the rows of two DataFrames on matching key values.
Types (plus concat for stacking rows):
import pandas as pd
# Inner merge on user_id (keep only users that have at least one order)
pd.merge(users, orders, on="user_id", how="inner")
# Left merge (keep all users; order columns are NaN for users without orders)
pd.merge(users, orders, on="user_id", how="left")
# Key columns have different names (both id and user_id are kept in the result)
pd.merge(df1, df2, left_on="id", right_on="user_id")
# Concatenate vertically (stack rows of frames with the same columns; index is renumbered)
pd.concat([df1, df2], ignore_index=True)
Used in ML: merge features from different tables into one table with one row per example.
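A minimal sketch of that feature-assembly step, assuming two toy tables whose names and values (users, orders, country, amount) are made up for illustration. Orders are first aggregated to one row per user, then left-merged onto users so that nobody is dropped. This is one common way to do it, not the only one.

```python
import pandas as pd

# Hypothetical toy tables; names and values are invented for illustration.
users = pd.DataFrame({"user_id": [1, 2, 3], "country": ["DE", "US", "FR"]})
orders = pd.DataFrame({"user_id": [1, 1, 3], "amount": [20.0, 35.0, 10.0]})

# Aggregate orders to one row per user so the merge cannot duplicate users.
order_features = (
    orders.groupby("user_id", as_index=False)
    .agg(n_orders=("amount", "size"), total_spent=("amount", "sum"))
)

# Left merge keeps every user; validate raises if a key is duplicated on either side.
features = pd.merge(
    users, order_features, on="user_id", how="left", validate="one_to_one"
)

# Users without orders get NaN in the new columns; treat that as "no activity".
cols = ["n_orders", "total_spent"]
features[cols] = features[cols].fillna(0)

# An inner merge, by contrast, drops user 2, who has no orders.
inner = pd.merge(users, order_features, on="user_id", how="inner")

print(features)
print(inner)
```

Aggregating before merging matters: merging the raw orders table directly would repeat user 1 once per order, which silently changes the number of training rows.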