Lesson 7: Data Types, Conversion & Basic Cleaning

1. Understanding Data Types

Columns with the wrong type cause errors or silently wrong results in ML, for example numbers stored as strings.

import pandas as pd

df.dtypes                                 # show the type of every column
df["age"] = df["age"].astype(int)         # convert to integer (raises on missing or non-numeric values)
df["date"] = pd.to_datetime(df["date"])   # parse dates

Common conversions:
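As a minimal sketch, a few frequently used conversions might look like the following. The DataFrame and its column names ("age", "income", "date", "city") are hypothetical sample data, not part of the lesson's dataset.

```python
import pandas as pd

# Hypothetical sample data: every column starts out stored as strings.
df = pd.DataFrame({
    "age": ["25", "31", "47"],
    "income": ["52000.5", "n/a", "61000"],
    "date": ["2024-01-05", "2024-02-10", "2024-03-15"],
    "city": ["Paris", "Lyon", "Paris"],
})

df["age"] = df["age"].astype(int)                            # string -> integer
df["income"] = pd.to_numeric(df["income"], errors="coerce")  # unparseable values become NaN
df["date"] = pd.to_datetime(df["date"])                      # string -> datetime
df["city"] = df["city"].astype("category")                   # repeated labels -> category

print(df.dtypes)
```

astype raises an error when a value cannot be converted, while pd.to_numeric with errors="coerce" turns such values into NaN so they can be handled in a later cleaning step.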

Exercise 1

If df["price"] has dtype object (string), how do you convert it to float?

2. Basic Cleaning Tasks

These steps fix most common data quality issues.
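As a rough sketch, typical basic cleaning steps could be chained like this. The data, the column names ("Customer Name", "amount") and the choice to drop rows with missing values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical raw data with a duplicate row, a missing value, and untidy text.
raw = pd.DataFrame({
    "Customer Name": [" Ann ", "Bob", "Bob", "Cara"],
    "amount": [10.0, 20.0, 20.0, None],
})

clean = (
    raw.drop_duplicates()                               # remove exact duplicate rows
       .rename(columns={"Customer Name": "customer"})   # use a simpler column name
       .dropna(subset=["amount"])                       # drop rows with no amount
       .reset_index(drop=True)                          # renumber the remaining rows
)
clean["customer"] = clean["customer"].str.strip()       # trim stray whitespace

print(clean)
```

Whether to drop rows with missing values or fill them in (for example with fillna) depends on the dataset and the model, so treat the dropna step as one option rather than a rule.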

Exercise 2

To remove duplicate rows:
df = df.____()

To rename "old_name" to "new_name":
df = df.rename(columns={"____": "____"})

Exercise 3

Which are good cleaning steps before ML?