Today I officially kicked off my Titanic Survival Prediction project!
I focused on understanding the data and cleaning it up properly.
I loaded the train and test datasets and inspected them with head(), info(), and describe(include='all').
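Here's a rough sketch of that first look, in case it helps. It uses a tiny made-up DataFrame so it runs anywhere; in the real project the data would come from something like pd.read_csv("train.csv") (the file name is an assumption). The isna().sum() count at the end is an extra step I'm adding for illustration, not one of the calls listed above.

```python
import pandas as pd

# Tiny stand-in for the Titanic training data (values are illustrative only).
# In practice: sample = pd.read_csv("train.csv")  # file name assumed
sample = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4],
    "Survived": [0, 1, 1, 0],
    "Age": [22.0, 38.0, None, 35.0],
    "Embarked": ["S", "C", None, "S"],
    "Cabin": [None, "C85", None, None],
})

print(sample.head())                   # first few rows
sample.info()                          # dtypes and non-null counts per column
print(sample.describe(include="all"))  # summary stats for numeric and text columns

# Count missing values per column to decide how to handle each one.
missing_counts = sample.isna().sum()
print(missing_counts)
```

The per-column missing counts are what drive the decisions about Age, Embarked, Fare, and Cabin.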
Age: Filled with median values
Embarked: Dropped rows with missing values (only 2)
Fare: Dropped row with missing value (only 1 in test set)
Cabin: Created a new binary feature (Has_Cabin) and dropped the original column

Effective EDA isn't just about pretty charts; it's about preparing clean, model-ready data.
Today I made sure the dataset is clean and ready for feature engineering.
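The cleaning steps can be sketched roughly like this. The clean() helper and the toy data are hypothetical, and I'm assuming the Age median is computed on the training set and reused for the test set; the post only says "median values", so treat that choice as one reasonable option rather than the exact method.

```python
import pandas as pd

# Toy stand-ins for the train and test sets (values are illustrative only).
train = pd.DataFrame({
    "Age": [22.0, None, 26.0, 35.0],
    "Embarked": ["S", "C", None, "S"],
    "Fare": [7.25, 71.28, 7.92, 53.10],
    "Cabin": [None, "C85", None, "C123"],
})
test = pd.DataFrame({
    "Age": [34.5, None, 62.0],
    "Embarked": ["Q", "S", "Q"],
    "Fare": [7.83, None, 9.69],
    "Cabin": [None, None, "B45"],
})


def clean(df, age_median):
    """Hypothetical helper applying the four missing-value rules."""
    df = df.copy()
    # Age: fill gaps with the supplied median.
    df["Age"] = df["Age"].fillna(age_median)
    # Cabin: keep only a 0/1 flag for "cabin known", then drop the column.
    df["Has_Cabin"] = df["Cabin"].notna().astype(int)
    df = df.drop(columns=["Cabin"])
    # Embarked / Fare: drop the few rows where either is missing.
    return df.dropna(subset=["Embarked", "Fare"]).reset_index(drop=True)


# Assumption: median taken from train and reused for test.
age_median = train["Age"].median()
train_clean = clean(train, age_median)
test_clean = clean(test, age_median)

print(train_clean)
print(test_clean)
```

One caveat: dropping rows from the test set changes how many predictions you produce, so if a submission needs every test row, imputing Fare instead may be the safer choice.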
Features to build next:
Title from Name
FamilySize from SibSp and Parch
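A minimal sketch of how those two features might be derived. The regex assumes names follow the "Surname, Title. Given names" pattern, and FamilySize is assumed to be SibSp + Parch + 1 (counting the passenger themselves); neither detail is settled yet, so adjust as needed.

```python
import pandas as pd

# A few names in the dataset's "Surname, Title. Given names" format.
df = pd.DataFrame({
    "Name": [
        "Braund, Mr. Owen Harris",
        "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
        "Heikkinen, Miss. Laina",
    ],
    "SibSp": [1, 1, 0],
    "Parch": [0, 0, 0],
})

# Title: the text between the comma and the first period.
df["Title"] = df["Name"].str.extract(r",\s*([^.]+)\.", expand=False).str.strip()

# FamilySize: siblings/spouses + parents/children + the passenger (assumed +1).
df["FamilySize"] = df["SibSp"] + df["Parch"] + 1

print(df[["Title", "FamilySize"]])
```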
Feels great to clear the first major hurdle!
Let's engineer some powerful features next!