Today, my food image classifier reached its highest accuracy yet: 59%.
This wasn't just a technical achievement; it was also a reminder that how you collect data matters as much as how you train a model.


πŸ‡ What Changed?

After hitting 56% with 100 images per class, I decided to push the data frontier further and expand the dataset.

But I also stopped to think:

Is it okay to use all these Google Images?


βš–οΈ Data Ethics – Thinking Beyond Accuracy

Although this dataset is entirely self-collected using Selenium-based crawlers, the images themselves come from the web, so I took care to respect copyright:

- None of the image files will be uploaded to GitHub.
- The dataset is for non-commercial, academic use only.
- I explicitly note in the README that the copyright of all images belongs to their original owners.

You can reproduce the dataset using the script I provide; that's as far as I'll go in "sharing" the data.
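The crawler itself isn't shown in this post, but the approach can be sketched roughly as below. Everything here is an assumption for illustration, not the actual script: the search URL, the CSS selector, the directory layout, and the helper names (`slugify`, `search_url`, `image_filename`) are all hypothetical, and the Selenium part needs a local ChromeDriver.

```python
import hashlib
import os
import re
from urllib.parse import quote_plus


def slugify(class_name: str) -> str:
    """Turn a food class name into a safe directory name (assumed convention)."""
    return re.sub(r"[^a-z0-9]+", "_", class_name.lower()).strip("_")


def search_url(query: str) -> str:
    """Build a Google Images search URL (illustrative, not the real script's URL)."""
    return f"https://www.google.com/search?tbm=isch&q={quote_plus(query)}"


def image_filename(url: str) -> str:
    """Derive a stable filename from an image URL so re-runs skip duplicates."""
    return hashlib.sha1(url.encode("utf-8")).hexdigest() + ".jpg"


def crawl(classes, out_dir="data", per_class=100):
    """Hypothetical Selenium loop: open the search page and collect <img> sources.

    The CSS selector is a guess; Google's markup changes often and the real
    script would need its own selector and scrolling logic.
    """
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        for cls in classes:
            os.makedirs(os.path.join(out_dir, slugify(cls)), exist_ok=True)
            driver.get(search_url(cls))
            thumbs = driver.find_elements(By.CSS_SELECTOR, "img")[:per_class]
            for img in thumbs:
                src = img.get_attribute("src")
                # ... download src to out_dir/slugify(cls)/image_filename(src)
    finally:
        driver.quit()
```

Hashing the URL into the filename is one simple way to make the crawl idempotent: re-running the script skips images it has already saved.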


📊 Results

The model still uses MobileNetV2, with no fine-tuning yet.
But it's learning. Clearly.


🚀 Next Steps


💡 Final Reflection

Better data still beats better models,
but better ethical data is what lasts.

This project helped me grow as both a practitioner and a responsible engineer.
The next 1% will be harder, but it'll be built on the solid foundation I now have.