https://doi.org/10.1038/s41598-024-83475-4
Day 3 - Alignment, Ensemble, and SHAP: Inside the Core Mechanism
Today I focused on the core algorithmic design of the proposed framework: how it turns raw joint angles into explainable scores using alignment, ensemble regression, and SHAP-based interpretation.
What I Learned: From Feature to Final Score
- MediaPipe Pose provides 33 3D landmarks (joint coordinates) per frame.
- From these, 18 angles are computed across limbs and trunk.
- Angle calculation formula:
\[A_i = \arccos \left( \frac{\vec{v_1} \cdot \vec{v_2}}{|\vec{v_1}||\vec{v_2}|} \right) \cdot \frac{180}{\pi}\]
→ This captures detailed movement articulation, including elbow bends, knee twists, and torso rotation.
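To check my understanding of the formula, here is a minimal sketch of one angle computation. My notes don't record how the paper picks the two vectors, so I'm assuming they run from the joint to its two neighbouring landmarks. `joint_angle` and the sample coordinates are my own illustrative names and values, not MediaPipe output.

```python
import math

def joint_angle(a, b, c):
    """Angle at joint b (in degrees) between the segments b->a and b->c."""
    v1 = [ai - bi for ai, bi in zip(a, b)]
    v2 = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm1 = math.sqrt(sum(x * x for x in v1))
    norm2 = math.sqrt(sum(x * x for x in v2))
    if norm1 == 0.0 or norm2 == 0.0:
        raise ValueError("degenerate segment: two landmarks coincide")
    # Clamp so floating-point rounding cannot push acos outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.degrees(math.acos(cos_theta))

# Hypothetical shoulder, elbow, wrist positions as (x, y, z).
shoulder, elbow, wrist = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
elbow_angle = joint_angle(shoulder, elbow, wrist)
print(round(elbow_angle, 1))  # 90.0
```

Repeating this for each of the 18 joint triplets would give one angle vector per frame, which is what the later stages consume.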
Feature Alignment
- Procrustes Analysis for spatial normalization (scale, rotation, translation)
- Dynamic Time Warping (DTW) aligns temporal sequences to a 32-frame reference
- Combined, they remove rhythm and execution bias, ensuring fair comparisons
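To make the temporal half concrete, here is a small sketch of classic DTW on sequences of angle vectors, followed by resampling onto the reference timeline. It covers only the DTW step, not Procrustes. Averaging the query frames matched to each reference frame is my own assumption about how a fixed-length output could be produced. The toy sequences use 4 reference frames instead of 32 to stay readable.

```python
def dtw_path(query, reference):
    """Return (total_cost, warping_path) between two sequences of angle vectors."""
    def dist(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v)) ** 0.5

    n, m = len(query), len(reference)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(query[i - 1], reference[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end to recover which query frame matches which reference frame.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path

def warp_to_reference(query, reference):
    """Resample query onto the reference timeline by averaging matched frames."""
    _, path = dtw_path(query, reference)
    buckets = [[] for _ in reference]
    for qi, rj in path:
        buckets[rj].append(query[qi])
    return [[sum(col) / len(frames) for col in zip(*frames)] for frames in buckets]

# Toy one-angle sequences: the query performs the same motion, but slower.
reference = [[0.0], [10.0], [20.0], [30.0]]
query = [[0.0], [0.0], [10.0], [20.0], [20.0], [30.0]]
total_cost, path = dtw_path(query, reference)
aligned = warp_to_reference(query, reference)
print(len(aligned), total_cost)  # 4 0.0
```

After warping, every performance has the same number of frames as the reference, so a slow but correct execution is no longer penalised for its rhythm.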
Regression Models + Adaptive Ensemble
- 7 regressors: Linear, Lasso, SVM, KNN, DT, RF, Bagging
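- Adaptive weighting assigns more weight to models with lower RMSE. The worst model gets raw weight e^0 = 1, and a larger exponent k concentrates weight on the best models: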
\[w_i' = \left( e^{|\text{RMSE}_i - \text{RMSE}_{\max}|} \right)^k, \quad w_i = \frac{w_i'}{\sum_j w_j'}\]
\[\hat{y}_{\text{final}} = \sum_i w_i \cdot \hat{y}_i\]
→ The ensemble behaves like a human judging panel, where the better-performing models carry more weight.
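The weighting rule is easy to try out in a few lines. This is a sketch of the two formulas above only. The RMSE values, predictions, and `k=2.0` are made-up numbers for illustration, not results or settings from the paper.

```python
import math

def adaptive_weights(rmses, k=1.0):
    """w_i' = exp(|RMSE_i - RMSE_max|)^k, normalised so the weights sum to 1."""
    rmse_max = max(rmses)
    raw = [math.exp(abs(r - rmse_max)) ** k for r in rmses]
    total = sum(raw)
    return [w / total for w in raw]

def ensemble_predict(predictions, weights):
    """Weighted sum of the per-model predictions for one sample."""
    return sum(w * p for w, p in zip(weights, predictions))

# Hypothetical validation RMSEs for three regressors, best model first.
rmses = [0.5, 1.0, 2.0]
weights = adaptive_weights(rmses, k=2.0)

# Hypothetical scores predicted by the same three models for one motion.
predictions = [80.0, 78.0, 70.0]
score = ensemble_predict(predictions, weights)
print([round(w, 3) for w in weights], round(score, 2))
```

Because the weights are normalised, the final score always stays between the lowest and highest individual prediction.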
Explainability with SHAP
- SHAP values dissect the final score into feature-level contributions
- Global SHAP: which angles matter most across all data
- Local SHAP: why this specific motion got its score
- Feedback becomes joint-aware, not just numerical
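To see what "dissecting a score into feature-level contributions" means, here is a toy sketch that computes exact Shapley values by enumerating feature subsets. It does not use the `shap` library, which relies on faster approximations, and brute-force enumeration only works for a handful of features. `toy_score`, the baseline pose, and the angle values are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their observed value; the rest stay at the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                total += weight * (value(set(subset) | {i}) - value(set(subset)))
        phi.append(total)
    return phi

# Hypothetical scoring model over three joint angles (elbow, knee, trunk), in degrees.
def toy_score(angles):
    elbow, knee, trunk = angles
    return 100.0 - 0.5 * abs(elbow - 90.0) - 0.3 * abs(knee - 120.0) - 0.2 * abs(trunk - 10.0)

baseline = [90.0, 120.0, 10.0]  # reference pose, which scores 100
attempt = [70.0, 120.0, 25.0]   # elbow 20 degrees off, trunk 15 degrees off
phi = shapley_values(toy_score, attempt, baseline)
print([round(p, 2) for p in phi])
```

The contributions add up to the gap between this attempt's score and the baseline score. That is the local view: here the elbow costs the most points and the knee costs none. Averaging the absolute contributions over many motions would give the global ranking of angles.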
Insight Snapshot
"Spatial/temporal alignment + explainable ensemble = evaluation system that's both accurate and usable."
- The alignment phase is not just preprocessing; it makes regression meaningful
- Even simple models (like KNN, DT) become competitive with good features
- SHAP allows transparent, joint-level feedback, like a virtual coach
What I'll Build Next
- Implement DTW-aligned angle extraction using sample dance or fitness videos
- Create ensemble regressors with adaptive RMSE weighting
- Add SHAP visualizations per joint/frame
- Build Streamlit UI for real-time scoring + feedback
Day 4 Preview
- Deep dive into experimental results:
  - Ablation: aligned vs. unaligned
  - Scoring vs. human expert scores
  - SHAP global/local visualization case studies
- See whether this pipeline actually generalizes across datasets
Reflection
Today clarified how much alignment and angle definition impact downstream models.
I also liked how explainability wasn't treated as an afterthought; it's built into the pipeline.
I'm now seriously thinking about how to embed this structure into dance feedback systems or rehab movement scoring.
Even without deep learning, this paper shows a lot can be achieved with clear structure and interpretability.