Today was a productive day filled with hands-on practice in image segmentation and a solid CNN refresher.
I trained a U-Net model to predict pixel-wise masks on real urban scenes and verified that the predictions matched the ground truth closely.
In parallel, I revisited the classic Dog vs. Cat classifier to reinforce CNN basics.
This project tackles semantic segmentation on urban environments using the Cityscapes dataset.
The goal was to assign each pixel a semantic label (e.g., road, sky, building), a key component for autonomous driving systems.
Setup:
- Images: `_leftImg8bit.png`, paired with `_gtFine_labelIds.png` label masks
- Model: `Unet(resnet34)` from `segmentation_models_pytorch` with `classes=34`
- Loss: `CrossEntropyLoss()`
As a complementary activity, I reviewed a basic CNN model using the Dog vs. Cat dataset.
This helped reinforce my understanding of convolutional layers, activation functions, and data preprocessing.
Key takeaways: using Keras's `ImageDataGenerator` for loading and preprocessing the images, and seeing how the cross-entropy loss actually works in practice. Lesson learned. This session combined practical low-level vision tasks with high-level architectural understanding.
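A minimal sketch of the Dog vs. Cat refresher, assuming the usual `train_dir/dogs` and `train_dir/cats` folder layout (the directory name and layer sizes are illustrative, not my exact configuration). Keras's binary cross-entropy plays the role that `CrossEntropyLoss()` does in the PyTorch project:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Small CNN: two conv/pool stages, then a sigmoid head for dog vs. cat.
model = models.Sequential([
    layers.Input(shape=(150, 150, 3)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Rescale pixels to [0, 1]; flow images straight from the directory tree.
datagen = ImageDataGenerator(rescale=1.0 / 255)
# train_gen = datagen.flow_from_directory(
#     "train_dir", target_size=(150, 150), batch_size=32, class_mode="binary")
# model.fit(train_gen, epochs=5)
```

With a single sigmoid output, `class_mode="binary"` gives 0/1 labels directly, so no one-hot encoding is needed.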
Next, I plan to experiment with SegFormer and DeepLab v3+, and evaluate the models using mIoU, pixel accuracy, and visual comparisons.
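The planned metrics can be sketched in a few lines of NumPy. Definitions assumed here: pixel accuracy is the fraction of matching pixels, and mIoU averages per-class IoU over classes present in either the prediction or the ground truth (absent classes are skipped rather than counted as zero):

```python
import numpy as np

def pixel_accuracy(pred: np.ndarray, gt: np.ndarray) -> float:
    """Fraction of pixels where the predicted label matches the ground truth."""
    return float((pred == gt).mean())

def mean_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int) -> float:
    """Mean IoU over classes that appear in either map."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:                 # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

# Tiny 2x2 example with three classes.
pred = np.array([[0, 0], [1, 2]])
gt = np.array([[0, 1], [1, 2]])
print(pixel_accuracy(pred, gt))      # 0.75
```

In a real validation loop these would be accumulated over the whole set (summing intersections and unions per class before dividing), not averaged per image.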
✅ Segmentation: Deployed and working
🐶 CNN: Refreshed and reinforced
🚀 Next Step: Validation loop, mIoU scoring, model deployment