📌 Paper Info

- Title: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
- Authors: Mingxing Tan, Quoc V. Le (Google Research, Brain Team)
- Venue: ICML 2019 (arXiv: 1905.11946)

🧠 Day 3 – Deep Dive into EfficientNet-B0 Architecture & Scaling Coefficients

Today’s study focused on understanding how the EfficientNet-B0 baseline is constructed via NAS, and how the compound scaling coefficients \( \alpha, \beta, \gamma \) are derived and applied in practice.


⚙️ EfficientNet-B0: NAS-Based Baseline

EfficientNet-B0 is designed with Neural Architecture Search (NAS) using the same search space as MnasNet, but optimizing for FLOPs rather than hardware latency, with a larger FLOPs target of 400M.
The architecture balances accuracy and efficiency using a multi-objective function:

\[\text{Objective} = \text{ACC}(M) \cdot \left( \frac{\text{FLOPs}(M)}{T} \right)^w\]

where \( T \) is the target FLOPs (400M) and \( w = -0.07 \) controls the accuracy/FLOPs trade-off.
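
To make the objective concrete, here is a minimal Python sketch of that reward; the exponent \( w = -0.07 \) and the 400M target are the values reported in the paper, while the function and its argument names are just my own illustration, not the authors' search code.

```python
def nas_reward(acc: float, flops: float, target_flops: float = 400e6, w: float = -0.07) -> float:
    """Multi-objective reward used to rank candidate architectures.

    acc          : top-1 accuracy of candidate model M
    flops        : measured FLOPs of M
    target_flops : FLOPs budget T (400M for EfficientNet-B0)
    w            : trade-off exponent (-0.07 in the paper)
    """
    return acc * (flops / target_flops) ** w

# A model over budget is penalized relative to one on budget:
print(nas_reward(acc=0.770, flops=400e6))  # ~0.770
print(nas_reward(acc=0.772, flops=600e6))  # ~0.750, the FLOPs penalty outweighs the small accuracy gain
```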

🔧 Core Components:

- MBConv blocks (mobile inverted bottleneck convolutions, as in MobileNetV2/MnasNet) as the main building block
- Squeeze-and-Excitation (SE) modules inside each block for channel-wise attention
- Swish (SiLU) activation in place of ReLU
- A simple 3×3 conv stem and a 1×1 conv + pooling + FC classification head

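Below is a rough PyTorch sketch of a single MBConv block, just to make the structure concrete; it is my own simplification (drop-connect, the expand-ratio-1 special case, and exact padding rules are omitted), not the reference implementation.

```python
import torch
import torch.nn as nn

class MBConv(nn.Module):
    """Minimal mobile inverted bottleneck block: expand -> depthwise -> SE -> project."""

    def __init__(self, in_ch, out_ch, expand_ratio=6, kernel_size=3, stride=1, se_ratio=0.25):
        super().__init__()
        mid_ch = in_ch * expand_ratio
        self.use_skip = (stride == 1 and in_ch == out_ch)

        # 1x1 expansion conv
        self.expand = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.SiLU(),  # swish activation
        )
        # depthwise conv
        self.depthwise = nn.Sequential(
            nn.Conv2d(mid_ch, mid_ch, kernel_size, stride, kernel_size // 2,
                      groups=mid_ch, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.SiLU(),
        )
        # squeeze-and-excitation: global pool -> reduce -> expand -> sigmoid gate
        se_ch = max(1, int(in_ch * se_ratio))
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(mid_ch, se_ch, 1),
            nn.SiLU(),
            nn.Conv2d(se_ch, mid_ch, 1),
            nn.Sigmoid(),
        )
        # 1x1 projection conv (no activation)
        self.project = nn.Sequential(
            nn.Conv2d(mid_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        out = self.expand(x)
        out = self.depthwise(out)
        out = out * self.se(out)  # channel-wise re-weighting
        out = self.project(out)
        return x + out if self.use_skip else out


x = torch.randn(1, 16, 112, 112)
print(MBConv(16, 16)(x).shape)  # torch.Size([1, 16, 112, 112])
```
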
🧮 Compound Scaling Revisited

After defining B0, the paper introduces compound scaling, where model dimensions grow in a coordinated manner:

\[\text{depth} \propto \alpha^{\phi}, \quad \text{width} \propto \beta^{\phi}, \quad \text{resolution} \propto \gamma^{\phi}\]

\[\text{subject to } \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2, \quad \alpha \ge 1,\ \beta \ge 1,\ \gamma \ge 1\]

Since FLOPs scale roughly as \( \text{depth} \cdot \text{width}^2 \cdot \text{resolution}^2 \), the total cost grows by \( (\alpha \cdot \beta^2 \cdot \gamma^2)^{\phi} \approx 2^{\phi} \); each unit increase in \( \phi \) roughly doubles the FLOPs, making the scaling predictable and efficient.
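
The arithmetic is easy to verify with a few lines of Python; \( \alpha = 1.2, \beta = 1.1, \gamma = 1.15 \) below are the coefficients the paper reports from its grid search on B0, and the helper simply evaluates the formulas above.

```python
# Compound scaling: check that FLOPs grow roughly 2x per unit of phi.
alpha, beta, gamma = 1.2, 1.1, 1.15  # grid-searched on B0 (values reported in the paper)

def scale_factors(phi: float):
    depth = alpha ** phi        # multiplier on the number of layers
    width = beta ** phi         # multiplier on the number of channels
    resolution = gamma ** phi   # multiplier on the input image size
    flops_growth = depth * width ** 2 * resolution ** 2  # FLOPs ~ d * w^2 * r^2
    return depth, width, resolution, flops_growth

for phi in range(4):
    d, w, r, f = scale_factors(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, res x{r:.2f}, FLOPs x{f:.2f}")
# FLOPs multipliers come out close to 1, 2, 4, 8 because alpha * beta^2 * gamma^2 ~= 1.92
```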


💡 Why Find Coefficients on a Small Model?

Searching for \( \alpha, \beta, \gamma \) directly on a large model would require training many large networks, which is prohibitively expensive. Instead, the paper fixes \( \phi = 1 \) (roughly a 2× FLOPs budget) and runs a small grid search on the lightweight B0 baseline, then reuses the resulting coefficients \( \alpha = 1.2, \beta = 1.1, \gamma = 1.15 \) to scale up to B1–B7 with larger \( \phi \).

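A toy sketch of that step is shown below; the candidate step values, the tolerance, and the `evaluate_b0` callback are my own assumptions, since the paper only states that it runs a small grid search over \( \alpha, \beta, \gamma \) with \( \phi = 1 \) under the constraint \( \alpha \cdot \beta^2 \cdot \gamma^2 \approx 2 \).

```python
import itertools

def grid_search_coefficients(evaluate_b0,
                             steps=(1.0, 1.05, 1.1, 1.15, 1.2, 1.25, 1.3),
                             tol=0.1):
    """Search alpha, beta, gamma with phi fixed to 1 on the small B0 model.

    evaluate_b0(alpha, beta, gamma) -> validation accuracy of B0 scaled by those
    factors (hypothetical callback supplied by the caller).
    Only candidates whose FLOPs growth alpha * beta^2 * gamma^2 stays near 2 are kept.
    """
    best = None
    for a, b, g in itertools.product(steps, repeat=3):
        if abs(a * b ** 2 * g ** 2 - 2.0) > tol:
            continue  # violates the ~2x FLOPs constraint
        acc = evaluate_b0(a, b, g)  # cheap: only the small baseline is trained
        if best is None or acc > best[0]:
            best = (acc, a, b, g)
    return best

# The expensive part (training the scaled-up B1..B7) then reuses the single
# (alpha, beta, gamma) triple found here, with larger values of phi.
```
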
🔍 Key Insights

- Scaling all three dimensions in a balanced way outperforms scaling depth, width, or resolution alone at the same FLOPs budget.
- The coefficients only need to be found once, on the small baseline, and then transfer to every larger variant (B1–B7).
- A strong baseline matters: compound scaling amplifies whatever the base architecture already does well.

💬 Personal Reflection

Using a small, well-designed base model (B0) and then scaling it uniformly with a few simple coefficients is both elegant and practical.
Instead of hand-tuning each model variant, EfficientNet grows predictably in all three dimensions, delivering SOTA accuracy with fewer resources.

🔖 This post is part of an ongoing paper review series for deeper learning and long-term retention!