πŸ“Œ Paper Info

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Mingxing Tan and Quoc V. Le, ICML 2019 (arXiv:1905.11946)

🧠 Day 2 – Architecture & Compound Scaling in Practice

βœ… Focus

Today’s reading focused on Section 3: EfficientNet Architecture and the practical implementation of the compound scaling method.
The paper outlines how the authors derive the baseline model EfficientNet-B0 using Neural Architecture Search (NAS), and then scale it up to EfficientNet-B1 through B7 using a simple, principled rule.


βš™οΈ EfficientNet-B0: The Baseline

EfficientNet-B0 is the baseline model discovered via NAS, using a search space built around MobileNetV2-style mobile inverted bottleneck (MBConv) blocks (the same search space as MnasNet).

Key architectural elements:

- MBConv (mobile inverted bottleneck) blocks, as in MobileNetV2 (a minimal sketch follows the B0 summary below)
- Squeeze-and-excitation (SE) optimization inside each block
- Swish (SiLU) activation in place of ReLU

The NAS process jointly optimizes accuracy and FLOPs (the paper targets FLOPs rather than hardware latency, since no specific device is assumed), yielding a compact and efficient architecture.
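The paper reuses the MnasNet-style multi-objective reward, ACC(m) Γ— (FLOPS(m)/T)^w with w = -0.07 and target T = 400M FLOPs. A minimal Python sketch of that trade-off (the function name and example numbers are my own, not from the paper):

```python
def nas_reward(acc: float, flops: float, target_flops: float = 400e6, w: float = -0.07) -> float:
    """MnasNet-style NAS objective used in the paper: ACC(m) * (FLOPS(m) / T)^w.

    Models above the FLOPs target T are penalized; w = -0.07 softly
    trades accuracy against compute instead of enforcing a hard cap.
    """
    return acc * (flops / target_flops) ** w

# Hypothetical candidates: a cheap 77%-accurate model vs. a costlier 78% one
print(nas_reward(0.77, 0.39e9))  # ~0.771
print(nas_reward(0.78, 0.80e9))  # ~0.743: lower reward despite higher raw accuracy
```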

πŸ“Œ B0 Summary: ~5.3M params / 0.39B FLOPs / 77.1% Top-1 (ImageNet)
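To make the block structure concrete, here is a minimal PyTorch sketch of an MBConv block with SE, pieced together from the paper and MobileNetV2; naming and layer details are my own simplification, not the authors' reference code:

```python
import torch.nn as nn

class MBConv(nn.Module):
    """Simplified mobile inverted bottleneck with squeeze-and-excitation:
    1x1 expand -> depthwise conv -> SE gate -> 1x1 project,
    with a residual connection when shapes allow."""

    def __init__(self, c_in, c_out, expand_ratio=6, kernel_size=3, stride=1, se_ratio=0.25):
        super().__init__()
        c_mid = c_in * expand_ratio
        self.use_residual = stride == 1 and c_in == c_out

        # 1x1 expansion (skipped when expand_ratio == 1)
        self.expand = nn.Sequential(
            nn.Conv2d(c_in, c_mid, 1, bias=False),
            nn.BatchNorm2d(c_mid),
            nn.SiLU(),  # Swish activation, as in the paper
        ) if expand_ratio != 1 else nn.Identity()

        # Depthwise convolution (groups == channels)
        self.depthwise = nn.Sequential(
            nn.Conv2d(c_mid, c_mid, kernel_size, stride,
                      padding=kernel_size // 2, groups=c_mid, bias=False),
            nn.BatchNorm2d(c_mid),
            nn.SiLU(),
        )

        # Squeeze-and-excitation: global pool -> reduce -> expand -> sigmoid gate
        c_se = max(1, int(c_in * se_ratio))
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(c_mid, c_se, 1), nn.SiLU(),
            nn.Conv2d(c_se, c_mid, 1), nn.Sigmoid(),
        )

        # 1x1 projection back down, no activation afterwards
        self.project = nn.Sequential(
            nn.Conv2d(c_mid, c_out, 1, bias=False),
            nn.BatchNorm2d(c_out),
        )

    def forward(self, x):
        h = self.expand(x)
        h = self.depthwise(h)
        h = h * self.se(h)
        h = self.project(h)
        return x + h if self.use_residual else h
```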


πŸ“ Compound Scaling Method

Once B0 is established, the authors introduce a compound scaling rule that ties network depth, width, and input resolution to a single coefficient Ο†:

depth: d = Ξ±^Ο†, width: w = Ξ²^Ο†, resolution: r = Ξ³^Ο†
subject to Ξ± Β· Ξ²Β² Β· Ξ³Β² β‰ˆ 2, with Ξ± β‰₯ 1, Ξ² β‰₯ 1, Ξ³ β‰₯ 1

A small grid search on B0 fixes Ξ± = 1.2, Ξ² = 1.1, Ξ³ = 1.15. Since FLOPs grow roughly with d Β· wΒ² Β· rΒ², the constraint means compute approximately doubles for each unit increase of Ο†. This leads to uniform, predictable scaling, resulting in the EfficientNet-B1 to B7 series.
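A quick sketch of the rule in Python (Ξ±, Ξ², Ξ³ are the paper's values; the loop and print format are my own illustration, and real models still round these multipliers to concrete channel counts and layer repeats):

```python
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # from the paper's grid search on B0

def compound_scale(phi: float):
    """Return (depth, width, resolution) multipliers for coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi

for phi in range(8):
    d, w, r = compound_scale(phi)
    flops_factor = d * w**2 * r**2  # FLOPs scale ~ depth * width^2 * resolution^2
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, res x{r:.2f}, "
          f"FLOPs x{flops_factor:.1f} (roughly 2^{phi})")
```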

πŸ“Œ Example: EfficientNet-B7 achieves 84.3% Top-1 with just 66M params, matching the accuracy of earlier large-scale models such as GPipe (557M params) with about 8.4Γ— fewer parameters.


πŸ“Š Performance Highlights

Compound scaling consistently improves performance with fewer resources:

| Model           | Params | FLOPs | Top-1 Acc (ImageNet) |
|-----------------|--------|-------|----------------------|
| EfficientNet-B0 | 5.3M   | 0.39B | 77.1%                |
| EfficientNet-B4 | 19M    | 4.2B  | 83.0%                |
| EfficientNet-B7 | 66M    | 37B   | 84.3%                |

This validates the method’s effectiveness: state-of-the-art results with lower computational cost.


πŸ’¬ Personal Reflection

What stood out most today was the elegance of the compound scaling approach.
Instead of treating depth, width, and resolution separately, the authors proposed a single, constraint-based rule β€” simple yet powerful.

Also, by running NAS only once to obtain a well-balanced base model (B0), they avoid repeating the costly architecture search for every model size; each scaled variant simply reuses the same block structure. This makes EfficientNet extremely practical for real-world deployment.


πŸ“ This summary reflects my personal understanding and is written as a reference log for deeper learning.