Today I officially kicked off my Titanic Survival Prediction project 🚀
I focused on understanding the data and cleaning it up properly.

🔍 What I accomplished today

Loaded the train and test datasets
Inspected the structure and distributions of the data with head(), info(), and describe(include='all')
Identified and handled missing values:
- Age: Filled with median values
- Embarked: Dropped rows with missing values (only 2)
- Fare: Dropped the row with the missing value (only 1, in the test set)
- Cabin: Created a new binary feature (Has_Cabin) and dropped the original column
Confirmed that no missing values remain in either the train or test dataset
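
A minimal pandas sketch of the cleaning steps above, for anyone following along. The helper name clean and the tiny sample frame are made up for illustration, not taken from my notebook or the real data. The median is passed in as a parameter, so the sketch takes no position on which data it is computed from.

```python
import pandas as pd


def clean(df: pd.DataFrame, age_median: float) -> pd.DataFrame:
    """Apply the cleaning steps listed above to a Titanic-style DataFrame."""
    out = df.copy()
    # Age: fill missing values with the supplied median
    out["Age"] = out["Age"].fillna(age_median)
    # Cabin: keep only whether a cabin was recorded, then drop the raw column
    out["Has_Cabin"] = out["Cabin"].notna().astype(int)
    out = out.drop(columns=["Cabin"])
    # Embarked / Fare: drop the few rows where either value is missing
    out = out.dropna(subset=["Embarked", "Fare"])
    return out.reset_index(drop=True)


# Tiny made-up sample (hypothetical values, not the real dataset)
sample = pd.DataFrame({
    "Age": [22.0, None, 35.0, 54.0],
    "Embarked": ["S", "C", None, "S"],
    "Fare": [7.25, 71.28, 8.05, None],
    "Cabin": [None, "C85", None, "E46"],
})

cleaned = clean(sample, age_median=sample["Age"].median())
# Per-column count of remaining missing values
print(cleaned.isnull().sum())
```

On the real files, the same function would be called once on train and once on test, followed by the same isnull().sum() check.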

🧠 Key Takeaway

Effective EDA isn’t just about pretty charts — it’s about preparing clean, model-ready data.
Today I made sure the dataset is clean and ready for feature engineering.

🧩 Next Steps

Create new features:
- Extract Title from Name
- Calculate FamilySize from SibSp and Parch
Explore additional feature engineering ideas
Begin baseline modeling
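
A rough sketch of how the two planned features might look in pandas. It assumes names follow the "Surname, Title. Given names" pattern and that FamilySize counts the passenger too (SibSp + Parch + 1), which is a common convention but my assumption here. The add_features helper and the sample rows are hypothetical.

```python
import pandas as pd


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add Title and FamilySize columns to a Titanic-style DataFrame."""
    out = df.copy()
    # Title: the text between the comma and the first period, e.g. "Mr"
    out["Title"] = (
        out["Name"].str.extract(r",\s*([^.]+)\.", expand=False).str.strip()
    )
    # FamilySize: siblings/spouses + parents/children + the passenger (assumed +1)
    out["FamilySize"] = out["SibSp"] + out["Parch"] + 1
    return out


# Made-up rows in the assumed name format
sample = pd.DataFrame({
    "Name": ["Doe, Mr. John", "Roe, Miss. Anna", "Poe, Mrs. Jane (Ann Smith)"],
    "SibSp": [1, 0, 1],
    "Parch": [0, 0, 2],
})

featured = add_features(sample)
print(featured[["Title", "FamilySize"]])
```

Rare titles would probably need grouping into a single bucket before modeling, but that is a decision for the feature engineering step.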

Feels great to clear the first major hurdle 🏁
Let’s engineer some powerful features next!