Paper: https://doi.org/10.1038/s41598-024-83475-4
Today I explored how the proposed system performs across different datasets, how it compares with human experts, and how SHAP helps make the scoring explainable and educational.
What I Learned: From Metrics to Meaning
Evaluation Metrics
- Used MAE, RMSE, sMAPE, R², Pearson correlation, and ICC
- LOOCV (leave-one-out cross-validation) was used because the sample size is small
- These metrics cover both accuracy and consistency with expert scores
- Feature alignment greatly improved performance across all models
- The proposed adaptive weighted ensemble achieved the best scores
- On the XSQ and PBB datasets, the model even achieved a lower MAE than the human experts
- TaiChi results still favored the expert judges, possibly because of subtler motion nuances
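To make the metrics and the LOOCV loop concrete for myself, here is a minimal pure-Python sketch. It is not the paper's code: the single-feature linear model, the toy data, and the sMAPE variant (mean of |error| over the average of |true| and |predicted|, in percent) are my own assumptions, and ICC is left out because I did not note which ICC form the paper uses.

```python
import math

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def smape(y_true, y_pred):
    # One common sMAPE definition, in percent; a 0/0 term counts as 0.
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2
        total += abs(t - p) / denom if denom else 0.0
    return 100 * total / len(y_true)

def r2(y_true, y_pred):
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def pearson(a, b):
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def fit_line(xs, ys):
    # Least-squares slope and intercept for one feature (stand-in model).
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def loocv_predictions(xs, ys):
    # Hold out each sample once, fit on the rest, predict the held-out one.
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return preds

# Toy data, made up for illustration only.
features = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
expert_scores = [6.1, 6.9, 8.2, 8.8, 10.1, 10.9]

preds = loocv_predictions(features, expert_scores)
for name, fn in [("MAE", mae), ("RMSE", rmse), ("sMAPE", smape),
                 ("R2", r2), ("Pearson", pearson)]:
    print(f"{name}: {fn(expert_scores, preds):.3f}")
```

With only a handful of samples, every LOOCV prediction comes from a model that never saw that sample, which is why it suits small datasets like these.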
Comparison with Experts

| Dataset | MAE (Proposed) | MAE (Expert Avg) |
|---|---|---|
| XSQ | 0.237 | 0.371–0.420 |
| PBB | 0.261 | 0.319–0.457 |
| TaiChi | 0.290 | 0.130–0.270 |

This indicates strong generalization in most styles, but expert intuition still matters in complex domains.
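As a quick sanity check on these numbers, the sketch below places each model MAE relative to the expert range and computes how far it sits below the lowest expert value. The figures are copied from my notes above; treating each range as a low/high pair and the helper names `position` and `relative_gap` are my own choices.

```python
# (model MAE, expert low, expert high), taken from the comparison above.
results = {
    "XSQ": (0.237, 0.371, 0.420),
    "PBB": (0.261, 0.319, 0.457),
    "TaiChi": (0.290, 0.130, 0.270),
}

def position(model_mae, expert_low, expert_high):
    # Lower MAE is better, so "below" means the model beats the whole range.
    if model_mae < expert_low:
        return "below expert range"
    if model_mae > expert_high:
        return "above expert range"
    return "within expert range"

def relative_gap(model_mae, expert_low):
    # Fraction by which the model MAE undercuts the lowest expert MAE.
    return (expert_low - model_mae) / expert_low

for dataset, (model_mae, low, high) in results.items():
    print(dataset, position(model_mae, low, high),
          f"gap vs. lowest expert value: {relative_gap(model_mae, low):+.1%}")
```

A negative gap (as for TaiChi) means the model's error is higher than even the lowest expert value.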
SHAP in Action
- Global SHAP: Key angles in frames 13–17, 19–23, and 31 had the biggest influence
- Local SHAP: Showed which joints in which frames led to deductions or boosts
- Partial dependence plots revealed optimal angle ranges per joint
- Feedback is interpretable and coachable, not just numerical
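To see how local SHAP values turn into feedback, I find the linear case easiest to reason about: for a linear model with features treated as independent, the SHAP value of feature i is w_i * (x_i - mean_i). The sketch below uses that closed form in plain Python rather than the `shap` library, and it is not the paper's model. The joint names, weights, and angles are all made up; only the frame numbers echo the ones highlighted above.

```python
def predict(weights, bias, x):
    return bias + sum(w * xi for w, xi in zip(weights, x))

def feature_means(background):
    n = len(background)
    return [sum(row[i] for row in background) / n for i in range(len(background[0]))]

def linear_shap(weights, x, background):
    # Closed-form SHAP for a linear model under feature independence:
    # each feature's contribution relative to the background mean.
    means = feature_means(background)
    return [w * (xi - m) for w, xi, m in zip(weights, x, means)]

def feedback(names, phi, top_k=2):
    # Most negative contributions first: these are the biggest deductions.
    ranked = sorted(zip(names, phi), key=lambda pair: pair[1])
    return [f"{name}: {value:+.2f} points" for name, value in ranked[:top_k] if value < 0]

# Hypothetical joint-angle features (degrees) and a made-up linear scorer.
feature_names = ["elbow@frame13", "knee@frame15", "hip@frame31"]
weights, bias = [0.04, -0.02, 0.03], 5.0
background = [[90, 120, 170], [100, 130, 160], [110, 110, 180]]
sample = [80, 140, 175]

phi = linear_shap(weights, sample, background)
base_score = predict(weights, bias, feature_means(background))

print("base score:", round(base_score, 2))
print("sample score:", round(predict(weights, bias, sample), 2))
for line in feedback(feature_names, phi):
    print("deduction from", line)
```

The contributions add up exactly to the difference between the sample's score and the base score, which is what lets each deduction be read as "points lost because of this joint in this frame".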
Insight Snapshot
“The model doesn’t just score; it teaches.”
- Even basic regressors perform well with good features and alignment
- SHAP transforms black-box regression into personalized feedback
- Ensemble + alignment bridges statistical accuracy with human-level usability
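For the ensemble idea, here is one simple way a weighted ensemble can adapt to its base models: weight each regressor by the inverse of its validation error. I did not write down the paper's actual weighting rule, so inverse-MAE weighting is my stand-in, and the error values and predictions are invented.

```python
def inverse_error_weights(val_errors):
    # Lower validation error -> larger weight; weights sum to 1.
    inverse = [1.0 / e for e in val_errors]
    total = sum(inverse)
    return [v / total for v in inverse]

def ensemble_predict(model_preds, weights):
    # Weighted average of the base models' predictions for one sample.
    return sum(w * p for w, p in zip(weights, model_preds))

# Made-up validation MAEs for three hypothetical base regressors.
val_mae = [0.20, 0.40, 0.40]
weights = inverse_error_weights(val_mae)

# Made-up predictions from the same three models for one performance.
score = ensemble_predict([8.0, 9.0, 8.5], weights)
print([round(w, 3) for w in weights], round(score, 3))
```

The model with half the error gets twice the weight of each of the others, so the combined score leans toward the most reliable regressor without ignoring the rest.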
What I’ll Build Next
- Compare SHAP output with actual dance instructor annotations
- Build a SHAP-driven feedback system to flag common joint issues
- Expand pipeline to support multi-style templates (e.g., waacking vs. hip-hop)
- Explore real-time scoring using simplified keypoint subsets
Reflection
This part of the paper really convinced me that alignment and explanation together can make ML feedback systems actually useful, not just technically accurate.
It also gave me inspiration for using this approach in rehabilitation or education, where transparency matters more than raw accuracy.