Today I officially kicked off my Titanic Survival Prediction project.
I focused on understanding the data and cleaning it up properly.
Loaded the train and test datasets
Took a first look with head(), info(), and describe(include='all')
Handled missing values:
Age: Filled with median values
Embarked: Dropped rows with missing values (only 2)
Fare: Dropped row with missing value (only 1 in test set)
Cabin: Created a new binary feature (Has_Cabin) and dropped the original column

Effective EDA isn't just about pretty charts; it's about preparing clean, model-ready data.
Today I made sure the dataset is clean and ready for feature engineering.
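
Here is a rough pandas sketch of the cleaning steps described above. It is not my exact notebook code: the `clean` helper and the tiny sample frame are made up for illustration, and for brevity it drops missing Embarked and Fare rows in a single call and takes the Age median from whichever frame is passed in.

```python
import numpy as np
import pandas as pd


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: apply the cleaning steps to a Titanic-style frame."""
    df = df.copy()

    # Age: fill missing values with the median of this frame (assumption)
    df["Age"] = df["Age"].fillna(df["Age"].median())

    # Cabin: keep a binary Has_Cabin flag, then drop the original column
    df["Has_Cabin"] = df["Cabin"].notna().astype(int)
    df = df.drop(columns=["Cabin"])

    # Embarked / Fare: drop the few rows that have missing values
    df = df.dropna(subset=["Embarked", "Fare"])

    return df.reset_index(drop=True)


# Made-up sample standing in for the real CSV files
sample = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, 35.0],
    "Embarked": ["S", "C", None, "S"],
    "Fare": [7.25, 71.28, 7.92, 53.10],
    "Cabin": [None, "C85", None, "C123"],
})

cleaned = clean(sample)
print(cleaned)
```

On the real data, the same function would be called once on the train frame and once on the test frame after reading them with `pd.read_csv`.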
Next up:
Title from Name
FamilySize from SibSp and Parch

Feels great to clear the first major hurdle.
Let's engineer some powerful features next!
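
As a preview, the two planned features could be derived roughly like this. This is only a sketch under assumptions: the `add_features` helper is hypothetical, the regex assumes names follow the "Surname, Title. Given names" pattern, and counting the passenger themselves in FamilySize (the `+ 1`) is a common convention that I have not settled on yet.

```python
import pandas as pd


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: add Title and FamilySize columns."""
    df = df.copy()

    # Title: the text between the comma and the first period in Name,
    # e.g. "Braund, Mr. Owen Harris" -> "Mr"
    df["Title"] = (
        df["Name"].str.extract(r",\s*([^.]+)\.", expand=False).str.strip()
    )

    # FamilySize: siblings/spouses + parents/children + the passenger (assumption)
    df["FamilySize"] = df["SibSp"] + df["Parch"] + 1

    return df


# Small illustrative sample in the dataset's Name format
sample = pd.DataFrame({
    "Name": [
        "Braund, Mr. Owen Harris",
        "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
        "Heikkinen, Miss. Laina",
    ],
    "SibSp": [1, 1, 0],
    "Parch": [0, 0, 0],
})

featured = add_features(sample)
print(featured[["Title", "FamilySize"]])
```

Rare titles would probably need to be grouped into a single bucket before modeling, but that is a decision for the next step.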