πŸ“Œ Paper Info

Title: MobileNetV2: Inverted Residuals and Linear Bottlenecks
Authors: Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen
Venue: CVPR 2018 (arXiv:1801.04381)

🧠 Day 2 Review – Core Architecture Breakdown

βœ… Step 1: Depthwise Separable Convolution

To reduce computation, MobileNetV2 employs depthwise separable convolutions, splitting a standard convolution into two steps:

- A depthwise convolution that applies a single 3Γ—3 filter to each input channel
- A pointwise (1Γ—1) convolution that combines the depthwise outputs across channels

This reduces computation by a factor of 8–9Γ— (with 3Γ—3 kernels) while maintaining accuracy, making it ideal for mobile devices.
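
As a concrete reference, here is a minimal PyTorch sketch of the factorization (the channel counts in the cost comparison are illustrative, not taken from the paper):

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Standard conv split into depthwise (per-channel) + pointwise (1x1) convs."""
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups == in_channels)
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   stride=stride, padding=1,
                                   groups=in_channels, bias=False)
        # Pointwise: 1x1 conv mixes information across channels
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1,
                                   bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

# Rough parameter comparison for 3x3 kernels, 64 -> 128 channels:
# standard conv:  3*3*64*128 = 73,728
# separable conv: 3*3*64 + 64*128 = 8,768   (~8.4x fewer)
```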


βœ… Step 2: Linear Bottlenecks

MobileNetV2 removes the ReLU activation from the final 1Γ—1 projection layer in the bottleneck block.
Why? Because ReLU can destroy information in low-dimensional spaces by zeroing out values.
By keeping this projection linear, the block preserves the information in the compressed representation, while non-linearity is still provided by the ReLU6 activations earlier in the block.
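
A toy illustration of the argument (the numbers are made up, not from the paper): once features are projected into a low-dimensional space, any negative component that ReLU zeroes out cannot be recovered by later layers, whereas a linear projection keeps it.

```python
import torch

# A "low-dimensional" feature with mixed signs, e.g. the output of the
# final 1x1 projection inside a bottleneck block.
projected = torch.tensor([0.7, -1.2, 0.1, -0.4])

relu_out = torch.relu(projected)   # tensor([0.7000, 0.0000, 0.1000, 0.0000])
linear_out = projected             # all four values survive

# Half of the channels are zeroed by ReLU; with so few channels available,
# that information is permanently lost.
```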


βœ… Step 3: Inverted Residuals

Unlike traditional residual blocks (e.g., ResNet) that skip across wide layers, MobileNetV2 connects compressed bottlenecks.
This inverted design:

- Expands the narrow input with a 1Γ—1 convolution, applies the depthwise convolution in the wide space, then projects back down to a narrow output
- Places the skip connection between the narrow bottleneck layers, so the tensors carried across the shortcut stay small and memory-efficient

πŸ“Œ Residual connections are only applied if:

- stride = 1 (the spatial resolution is unchanged)
- the input and output channel dimensions match

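In code, this condition typically reduces to a single flag set when the block is built (a sketch; the function and argument names here are my own, not from the paper):

```python
def uses_residual(stride: int, in_channels: int, out_channels: int) -> bool:
    """Shortcut is added only when spatial size and channel count are preserved."""
    return stride == 1 and in_channels == out_channels

print(uses_residual(1, 32, 32))   # True  -> add the skip connection
print(uses_residual(2, 32, 64))   # False -> plain feed-forward block
```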

βœ… Step 4: Block Structure Summary

Each MobileNetV2 block follows this sequence:

Input β†’ 1Γ—1 Conv (Expansion, ReLU6)  
     β†’ 3Γ—3 Depthwise Conv (Stride s, ReLU6)  
     β†’ 1Γ—1 Conv (Projection, Linear)  
     β†’ + Residual connection (if stride = 1 and input/output dims match)

This structure is lightweight, modular, and extremely efficient for mobile environments.
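
Putting Steps 1–3 together, here is a minimal PyTorch sketch of one such block. The expansion factor of 6 matches the paper's usual setting, but the class and argument names are my own, not taken from any official implementation:

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1, expansion=6):
        super().__init__()
        hidden = in_channels * expansion
        # Residual only when spatial size and channel count are preserved
        self.use_residual = (stride == 1) and (in_channels == out_channels)

        self.block = nn.Sequential(
            # 1x1 expansion with ReLU6
            nn.Conv2d(in_channels, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 3x3 depthwise convolution with ReLU6
            nn.Conv2d(hidden, hidden, 3, stride=stride, padding=1,
                      groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 1x1 linear projection (no activation)
            nn.Conv2d(hidden, out_channels, 1, bias=False),
            nn.BatchNorm2d(out_channels),
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_residual else out

# Quick shape check
x = torch.randn(1, 32, 56, 56)
print(InvertedResidual(32, 32)(x).shape)  # torch.Size([1, 32, 56, 56])
```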


βœ… Key Insights (3-Line Summary)

1. Depthwise separable convolutions cut the cost of standard convolutions by roughly 8–9Γ— with 3Γ—3 kernels.
2. Linear bottlenecks drop the final ReLU so the compressed, low-dimensional features are not destroyed.
3. Inverted residuals connect the narrow bottleneck layers, keeping each block lightweight and memory-efficient.

πŸ“˜ New Terms

- Depthwise separable convolution: a standard convolution factored into a per-channel (depthwise) convolution followed by a 1Γ—1 (pointwise) convolution
- Linear bottleneck: a 1Γ—1 projection layer with no following ReLU, so low-dimensional features are not zeroed out
- Inverted residual: a block that expands, filters, and re-compresses channels, with the skip connection placed between the narrow ends
- ReLU6: ReLU clipped at 6, i.e., min(max(x, 0), 6), which behaves well under low-precision arithmetic on mobile hardware

πŸ—‚ GitHub Repository

Detailed markdown summary:
πŸ”— github.com/hojjang98/Paper-Review


πŸ’­ Reflections

The architecture design feels even more elegant now that I’ve broken down its components.
The use of linear layers and depthwise convolutions shows how theory and engineering blend together.
Next, I plan to look at experiments, ablation results, and comparisons with MobileNetV1 and ShuffleNet.