Today marks the first real step in my Sales Forecasting for Small Business project.
Why?
Because after exploring Seoul's open business datasets, and having run my own store in the past, I wanted to visualize and analyze actual sales trends across different neighborhoods and industries.
What I did today
- Loaded the raw dataset from Seoul's commercial analysis platform
- Format: .xlsx (2024 data)
- Structure: Quarterly sales by district (행정동) and industry category (서비스_업종_코드_명)
- Cleaned the dataset
- Checked for missing values and basic structure using df.info() / df.describe()
- Grouped sales by district + quarter
- Created a pivot-like structure to analyze trends
- Visualized Top 10 industries in Q4 2024
- Used matplotlib and configured a Korean font (Malgun Gothic) so Hangul labels render instead of showing as empty boxes
- Horizontal bar plot to show industry-wise revenue dominance
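The steps above can be sketched roughly as follows. This is a minimal sketch, not the exact notebook code: the rows are made up, and the English column names (district, industry, quarter, sales) are hypothetical stand-ins for the Korean headers in the real file. Dropping rows with missing sales is also an assumption.

```python
import pandas as pd

# Made-up sample rows standing in for the real .xlsx (which would be loaded
# with pd.read_excel). Column names are hypothetical English translations.
df = pd.DataFrame(
    {
        "district": ["District A", "District A", "District A",
                     "District B", "District B", "District B"],
        "industry": ["Seafood", "Seafood", "Fruits/Vegetables",
                     "Seafood", "Fruits/Vegetables", "Cafe"],
        "quarter": ["2024Q3", "2024Q4", "2024Q4", "2024Q4", "2024Q3", "2024Q4"],
        "sales": [120.0, 150.0, 80.0, 200.0, 60.0, None],
    }
)

# Count missing values per column (df.info() shows the same as non-null counts).
missing = df.isna().sum()

# Assumption: rows without a sales figure are dropped before aggregating.
clean = df.dropna(subset=["sales"])

# Total sales per district and quarter, then reshaped to one row per district
# and one column per quarter.
by_district_quarter = clean.groupby(["district", "quarter"], as_index=False)["sales"].sum()
trend = by_district_quarter.pivot(index="district", columns="quarter", values="sales")

# Top 10 industries by total Q4 2024 sales.
q4 = clean[clean["quarter"] == "2024Q4"]
top_q4 = q4.groupby("industry")["sales"].sum().nlargest(10)


def plot_top_industries(top, font="Malgun Gothic"):
    """Draw a horizontal bar chart of the given industry totals."""
    import matplotlib.pyplot as plt  # imported here so the data steps run without it

    plt.rcParams["font.family"] = font          # a font that includes Hangul glyphs
    plt.rcParams["axes.unicode_minus"] = False  # keep minus signs readable with this font
    fig, ax = plt.subplots(figsize=(8, 5))
    top.sort_values().plot.barh(ax=ax)  # ascending sort puts the largest bar on top
    ax.set_xlabel("Sales")
    ax.set_title("Top industries by sales, Q4 2024")
    fig.tight_layout()
    return fig
```

Calling plot_top_industries(top_q4) returns the figure. Malgun Gothic ships with Windows, so on macOS or Linux a different Hangul-capable font name would need to be passed in.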
Takeaways
- Basic structure is ready: the data is clean and well-organized
- Certain industries clearly dominate (especially seafood, fruits/vegetables)
- Need to investigate how trends shift over time and by region
Next steps
- Add time-series plots by industry and neighborhood
- Try a geographical visualization (Seoul choropleth map?)
- Consider a log transformation to address the large variance in sales
- Create README and translate key column names
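As a rough illustration of the log-transformation idea, here is a small sketch using np.log1p on made-up sales figures (the numbers are not from the dataset). log1p computes log(1 + x), so zero sales stay defined, and np.expm1 reverses it.

```python
import numpy as np
import pandas as pd

# Made-up sales totals spanning several orders of magnitude.
sales = pd.Series([1_200, 15_000, 98_000, 4_500_000, 120_000_000], name="sales")

# log1p compresses large values far more than small ones.
log_sales = np.log1p(sales)

# Compare how far the largest value sits from the median before and after.
spread_raw = sales.max() / sales.median()
spread_log = log_sales.max() / log_sales.median()

# expm1 is the inverse, so the original scale can be recovered for reporting.
restored = np.expm1(log_sales)
```

On the raw scale the largest value is more than a thousand times the median, while on the log scale it is under twice the median. That is why charts and models tend to be less dominated by a few huge industries after the transform.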
That's it for Day 1: clean start, clear structure, and one solid chart!
Stay tuned for more insights as this project unfolds.