🧠 Daily Study Log [2025-06-29]
Today was fully dedicated to optimizing ensemble strategies for the SCU_Competition.
I experimented with weighted voting, seed diversity, and OOF-based stacking to push the model’s generalization performance.

📊 SCU_Competition — Submission 26–30

Focus: Enhancing AUC score with tuned ensemble strategies and classic feature refinement
Model: LGBMClassifier, soft voting ensemble, OOF-based stacking

Strategies Applied:

Submission 26: Weighted soft voting (model_11: model_9 = 7:3)
Submission 27: VotingClassifier with seed-tuned model_11/9/21 (weights 5:3:2)
Submission 28: OOF-based StackingClassifier (meta model: LGBM)
Submission 29: Clean model with only essential features (classic structure)
Submission 30: Engineered interaction features based on top variables

Key Observations:

Submission 30 achieved Kaggle AUC 0.8968, the best so far
Overfitting in submission 28 (CV AUC 0.9995 vs. Kaggle AUC 0.8363) proved OOF stacking was risky
Simpler structures with meaningful features continued to perform more reliably

✅ Takeaways

Ensemble synergy depends on model diversity, not quantity
High local CV scores can mislead — real test is on unseen (Kaggle) data
Strategic feature engineering often outperforms complex modeling tricks

🎯 Next Steps

Submission 31: Combine classic model with filtered feature interactions
Revisit meta-feature stacking with regularization and robust validation
Log feature importances and start SHAP-based interpretation

✅ TL;DR

📍 Voting: Weighted ensemble of well-tuned models gave stable results
📍 Stacking: OOF-based meta-model overfitted — needs better regularization
📍 Classic: Clean features + smart interaction boosted performance (best AUC 0.8968)