Journal Article
Research on the distribution patterns and interrelationships of sales volume of various vegetable categories and individual products
by Hao Li, Siyi Huang, Rui Hu, Minghao Liu, Junhao Li and Fuqiang Huang
Abstract
This study conducts a comprehensive analysis of vegetable sales data. Using Python, the collected data were integrated, categorized, and checked for missing values with the "missingno" library. Box plots were used to detect anomalies; they revealed a few outliers, which were retained because they reflect real business phenomena. The analysis explored the distribution patterns and interrelationships of sales quantities across vegetable categories and individual items. Time series decomposition revealed significant seasonal variation in sales volumes throughout the year. Model accuracy was validated through residual analysis, and missing values were imputed using the Prophet model. Dynamic Time Warping (DTW) was used to compute distance matrices between categories and uncover similarities among them. K-means clustering was applied to the sales trends and seasonal patterns of individual items, with DTW providing a more detailed similarity analysis within each cluster. This approach identified correlated sales trends, such as the high correlation between Chinese cabbage and bell peppers, which suggests that consumers may prefer to purchase these vegetables together. These findings offer valuable insights for optimizing supermarket restocking strategies.
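To make the DTW step concrete, the following is a minimal sketch of how a pairwise DTW distance matrix between category sales series might be computed. It is not the authors' implementation: the function names `dtw_distance` and `dtw_matrix`, the absolute-difference cost, and the sales figures are all assumptions chosen for illustration, and the abstract does not say which DTW variant or library was used.

```python
from itertools import combinations


def dtw_distance(a, b):
    """Textbook dynamic time warping with an absolute-difference step cost.

    Assumption: unconstrained warping window and no normalisation.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] also matched the previous b
                cost[i][j - 1],      # b[j-1] also matched the previous a
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m]


def dtw_matrix(series):
    """Return a symmetric nested dict of pairwise DTW distances."""
    names = list(series)
    dist = {name: {name: 0.0} for name in names}
    for x, y in combinations(names, 2):
        d = dtw_distance(series[x], series[y])
        dist[x][y] = d
        dist[y][x] = d
    return dist


# Hypothetical weekly sales (kg); these numbers are made up, not taken from the study.
sales = {
    "leafy": [10, 12, 15, 12, 10, 9],
    "leafy_shifted": [10, 10, 12, 15, 12, 10],
    "peppers": [3, 3, 4, 8, 9, 8],
}
distances = dtw_matrix(sales)
```

In this toy data, `leafy_shifted` follows the same shape as `leafy` one week later, so DTW places the two much closer together than either is to `peppers`. A point-by-point comparison would penalise the time shift, which is one common reason to prefer DTW for comparing sales curves.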