Columns with the wrong type (e.g. numbers stored as strings) cause errors in ML, so check and fix types before modeling.
import pandas as pd
df.dtypes # show the type of each column
df["age"] = df["age"].astype(int) # convert to integer (raises on missing or non-numeric values)
df["date"] = pd.to_datetime(df["date"]) # parse dates
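The conversions above assume every value is clean. As a minimal sketch of what could be done when that is not the case, the example below uses a small made-up DataFrame (the sample values are assumptions, not data from this text) where one age is not a number. It coerces the bad value to a missing value with pd.to_numeric and then uses the nullable "Int64" type so the column can stay integer despite the gap.

```python
import pandas as pd

# Hypothetical sample data: "age" arrives as strings, with one bad entry.
df = pd.DataFrame({
    "age": ["25", "31", "unknown"],
    "date": ["2024-01-05", "2024-02-10", "2024-03-15"],
})

# astype(int) would raise on "unknown", so coerce unparseable values to NaN instead.
df["age"] = pd.to_numeric(df["age"], errors="coerce")

# The nullable integer type keeps whole numbers while allowing missing values.
df["age"] = df["age"].astype("Int64")

# Parse the date strings into datetime values.
df["date"] = pd.to_datetime(df["date"])

print(df.dtypes)
```

Whether to coerce, drop, or fix bad values by hand depends on the dataset; coercing is only one option and it hides the original text of the bad entry.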
Common conversions:
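- .astype(int/float/str): direct cast; raises an error on values that cannot be converted
- pd.to_numeric(): safe numeric conversion (with errors="coerce", unparseable values become NaN)
- pd.to_datetime(): parse dates
- pd.Categorical(): for categories

Other common cleaning steps:
df = df.drop_duplicates() # remove duplicate rows (returns a new DataFrame, so assign the result)
df = df.rename(columns={"old": "new"}) # rename a column (also returns a new DataFrame)
df["name"] = df["name"].str.strip() # trim leading and trailing whitespace
df["gender"] = df["gender"].replace({"M": "Male", "F": "Female"}) # standardize labels
df["city"] = df["city"].str.lower() # normalize case

Together, these steps fix the most common data quality issues: wrong types, duplicate rows, unclear column names, stray whitespace, and inconsistent labels or casing.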
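To show how these steps might fit together, here is a minimal sketch that applies them to a small made-up DataFrame. The sample rows and the clean helper are hypothetical and exist only for illustration; the column names follow the ones used in the text.

```python
import pandas as pd

# Hypothetical raw data with stray whitespace, mixed case, short codes, and a duplicate row.
raw = pd.DataFrame({
    "old": [1, 2, 2],
    "name": ["  Ana ", "Ben", "Ben"],
    "gender": ["F", "M", "M"],
    "city": ["Lisbon", "OSLO", "OSLO"],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the cleaning steps from the text and return a new DataFrame."""
    out = df.copy()  # leave the caller's DataFrame untouched
    out["name"] = out["name"].str.strip()
    out["gender"] = out["gender"].replace({"M": "Male", "F": "Female"})
    out["city"] = out["city"].str.lower()
    out = out.rename(columns={"old": "new"})
    # Drop duplicates last so rows that differ only by whitespace or case collapse too.
    out = out.drop_duplicates().reset_index(drop=True)
    return out


cleaned = clean(raw)
print(cleaned)
```

Running the text normalization before drop_duplicates is a design choice in this sketch: values such as "OSLO" and "oslo" only count as duplicates once their case has been made consistent.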