After digging through Seoul’s commercial sales data for a few days,
I finally feel like today was the day I nailed down the direction of this project’s EDA.

Instead of aimlessly slicing data, I asked:

“If I want to predict how much a business can earn in a specific neighborhood, what do I actually need to understand?”


🔍 What I focused on today

  1. Revisited the core goal:
    “Estimate revenue based on district (행정동) and service type (업종).”
  2. Made sure all data processing aligns with this goal
  3. Grouped and sorted revenue by district + industry + quarter
  4. Identified potential gaps:
  5. Realized that additional data (e.g. number of stores, population, foot traffic) will be crucial for modeling

🧠 Key Takeaway

EDA isn’t just about exploring — it’s about preparing your data to answer real questions.
Today I stopped floating in “what’s interesting?” and locked in on “what’s useful for prediction?”


🧩 Next Steps


Feels good to finally have a compass 🧭
Let’s move from exploration → modeling.