📚 https://arxiv.org/abs/1706.03762
πŸ† Published in NeurIPS 2017

✅ Day 3 – Multi-Head Attention


📌 Motivation


📌 Mechanism


📌 Dimensions


📌 Benefits


📌 Key Takeaways (Day 3)


✅ Day 4 – Feed-Forward Networks & Positional Encoding


📌 Feed-Forward Networks


📌 Positional Encoding


📌 Why Sinusoidal?


📌 Key Takeaways (Day 4)


🧠 Final Thoughts (Day 3 & 4)

Next, I'll study the training strategies and optimization details described in the paper.