1Department of Data Science, Trendyol, Turkey
2Department of Computer Engineering, Çukurova University, Turkey
Cite this as
Çetin E, Özbek MB, Biner S, Ulus C, Akay MF. Development of Machine Learning Based Product Collection Creation and Recommendation System. Trends Comput Sci Inf Technol. 2024;9(3):094-102. Available from: 10.17352/tcsit.000087
Copyright License
© 2024 Çetin E, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
E-commerce has become increasingly popular in recent years and is growing at a dynamic pace. In this sector, maximizing customer satisfaction and increasing sales volumes are of great importance. Developing sales strategies aligned with customer interest is crucial. The aim of this study is to develop a system that enables the creation and recommendation of collections using machine learning-based methods. To achieve this, models were developed using machine learning techniques including DT, LR, XGBoost, LightGBM, RF, and ANN. These models were designed to predict the popularity of collections, rank them on product detail pages and homepages, and provide personalized collection rankings. The performance of the developed models was evaluated using the metrics R^2, MSE, and MAE. The results indicate that the XGBoost Regressor model demonstrates the most successful prediction performance. The developed system led to a 5% increase in the time spent by the customers on the application and a 10% increase in the number of product clicks.
DT: Decision Trees; LR: Logistic Regression; XGBoost: Extreme Gradient Boosting; LightGBM: Light Gradient Boosting Machine; RF: Random Forest; ANN: Artificial Neural Networks; R^2: Coefficient of Determination; MSE: Mean Squared Error; MAE: Mean Absolute Error; HED: Hypergraph-Enhanced Dual Convolutional Neural Network; BundleGT: Bundle Graph Transformer; GNN: Graph Neural Network; IMBR: Interactive MultiRelation Bundle Recommendation with Graph Neural Network; BGN: Bundle Generation Network; CTR: Click Through Rate; SHAP: SHapley Additive exPlanations; CR: Click Rate
Changing consumer behavior and rapid technological developments have triggered a significant transformation in the e-commerce sector [1]. This transformation has heightened the competition businesses face, making it essential for them to develop innovative strategies to sustain their market presence. In this highly competitive environment, companies’ integration into digital systems enables them to expand market share, increase brand awareness, and build a global customer base. Additionally, it offers strategic advantages by allowing small and medium-sized enterprises to compete in the global market, fostering sustainable growth. However, businesses must implement various strategic actions to stand out in this market.
Collection sales emerge as a key strategic element, enabling businesses to introduce new products, set sales targets, and influence fashion trends. In the fashion and clothing sector, collection sales refer to the process of offering wholesale products, designed by a brand for a specific season, to retailers, distributors, or other business partners. This process not only ensures the sustainability of brands’ revenue streams but also plays a crucial role in anticipating consumer demand and optimizing stock management. Successful collection sales are vital for long-term brand growth and competitiveness. Furthermore, developing new collections based on insights from user behavior data represents a strategic advantage.
Offering products that align with customers’ preferences and needs at the right time is considered a key strategy for enhancing business success. Particularly in the fashion sector, where customer preferences are constantly evolving, companies that can swiftly adapt gain a competitive edge [2]. Accurately analyzing customer needs and managing the supply chain accordingly is crucial. However, offering a vast array of products can complicate the decision-making process for customers. Therefore, providing the right products to the right customers is a critical factor for success. In this context, presenting the products expected to receive the highest demand from collections through ranking algorithms enhances customer satisfaction and increases return rates, giving businesses a significant competitive advantage.
Collection suggestions in e-commerce increase customer satisfaction and strengthen customer loyalty by providing personalized experiences. Additionally, these suggestions positively affect sales by creating cross-selling and upselling opportunities. Collection suggestions facilitate stock management and contribute to the effective management of inventory by promoting popular products. As an important tool for gaining a competitive advantage, collection suggestions help shape future marketing strategies by enabling the analysis of user behavior and facilitating the development of more effective campaigns for the target audience. In this context, collection suggestions play a critical role in the success of e-commerce.
This study aims to facilitate users’ discovery of products on the site and access to new content by creating new collections based on user behavior data and highlighting the products expected to receive the most attention from these collections on relevant product detail pages using ranking algorithms.
This study is organized as follows: Section 2 includes relevant literature. The methodology is presented in Section 3. Section 4 presents the development of the collection creation and recommendation system. The results of the study are presented in Section 5. Section 6 presents the discussion of the study. Section 7 concludes the paper.
A new preference-based approach model for bundle recommendation using the Choquet integral was proposed. In this context, preferences for coalitions of environmental attributes were formalized, and product bundles were recommended by taking into account the synergies between product attributes. The results obtained from user feedback showed that the Choquet integral recommended the most suitable packages for various user prototypes [3]. A method that incorporates personalized recommendations to boost e-commerce fashion sales was presented. The model combined Singular Value Decomposition Reranking with customer grouping to offer tailored product recommendations for different customer segments. Multiple recommendation techniques were comprehensively examined and evaluated using Mean Average Precision scores. The results indicated that the proposed model achieved a superior validation score in providing personalized product recommendations [4]. An encoder-decoder framework aimed at bundle generation through a non-autoregressive mechanism was proposed. In this framework, called BundleNAT, pre-training techniques and a GNN were employed to fully embed user-based preferences and item-based compatibility information. Additionally, a self-attention-based encoder was used to extract global dependency information. The results demonstrated that the proposed model outperformed existing methods in terms of the Precision and Recall metrics [5]. A recommendation system for fashion retail stores, based on a multi-clustering approach for products and user profiles in both online and physical stores, was introduced. Mining techniques were used to predict customers’ purchasing behaviors, addressing cold start problems typical of state-of-the-art systems. The proposed system was validated both in-store and online [6]. The literature on GNN-based recommendation systems was reviewed.
It was noted that existing research in recommendation systems can be categorized along four dimensions: stage, scenario, goal, and application. Subsequently, challenges in graph construction, embedding propagation/aggregation, model optimization, and computational efficiency were systematically analyzed [7]. The Hyperbolic Mutual Learning model for Bundle Recommendation (HyperMBR), a new approach to bundle recommendation, was proposed. In HyperMBR, the entities (users, items, bundles) of two view interaction graphs were encoded in hyperbolic space to learn accurate representations. Additionally, a hyperbolic distance-based mutual distillation was proposed to encourage information transfer between the two views and improve recommendation performance. The results indicated that HyperMBR performed well [8]. A combined model called the HED for bundle recommendations was proposed. First, a hypergraph was created to capture the interaction dynamics between users, products, and bundles. User-behavior interaction information was then used to enhance the representation of user and bundle embedding vectors. The results confirmed that HED achieved strong performance [9]. AI techniques for personalized recommendation systems in online fashion retail stores were explained. It was indicated that customer models are the most crucial component for personalized recommendation systems [10]. An implementation strategy for recommendations based on user clustering was described. Users were clustered according to their scores for commodity categories, and only the nearest neighbors within their categories were considered. Accuracy, Recall, and Specificity metrics were used to measure the performance of this approach [11]. A new bundle recommendation model called BundleGT, which models strategy-aware user and bundle representations, was introduced. BundleGT included three core layers: token embedding, hierarchical graph transformer, and prediction layer.
This model considered strategy-aware user representation for predicting user-bundle interactions, taking into account user preferences on item content. Comprehensive experimental results showed that BundleGT outperformed existing models [12]. The IMBR model was proposed to integrate multiple complex interactions with bundles and provide high-quality bundle recommendations. A multi-relation interaction graph was created to capture relationships from the user perspective, while bundle sub-relationship dependencies were obtained from the item perspective. A bundle frequent term constraint algorithm was designed to constrain item composition within a bundle and to emphasize the similarity between bundles. Finally, a multi-task learning framework was employed to capture personalized user preferences and enhance bundle recommendation performance. Experiments with two real-world datasets demonstrated that IMBR outperformed existing methods [13]. The incentives for e-commerce platforms to provide personalized recommendations and their impact on performance were discussed. A theoretical framework was developed to characterize an optimal decision policy for a company, considering the current state of shoppers. It was noted that the company should consistently offer high-quality recommendations that exceed a certain price or value threshold. The recommendations resulted in a significant increase in the company’s revenue [14]. Collaborative Recommender system techniques that offer recommendations based on customers’ interests, facilitating their search and helping them select suitable products, were explained. Collaborative recommender techniques were analyzed comprehensively to determine the value of recommendation systems [15]. GRAM-SMOT, a graph attention-based framework for creating personalized new bundles and recommending existing bundles to users, was proposed. 
Relationships between users, items, and bundles were examined, and the relative influence of items within a bundle was modeled. A loss function based on a Metric Learning approach was defined to efficiently learn entity embeddings. The use of strategies based on submodular function maximization demonstrated that GRAM-SMOT had superior performance compared to the latest existing models [16]. A BGN for personalized bundle list recommendations was proposed. The BGN employed a typical encoder-decoder framework with a feature-aware softmax to address the limitations of traditional softmax representation. Masked beam search and Determinantal Point Process selection were integrated to produce high-quality and diverse bundle lists with appropriate bundle sizes. Experiments conducted on three general datasets and one industrial dataset showed that BGN achieved a 3.85-fold improvement in response time compared to other methods in bundle list recommendation [17]. The collection recommendation problem was addressed by proposing a product collection that shares a common theme and can potentially be purchased together in a single transaction. Product hierarchies were combined with operational data or domain knowledge to identify candidate product collection clusters. A deep similarity model leveraging textual attributes was then created to generate product collection recommendations from these candidate clusters. The results indicated that this approach could recommend product collections with higher Accuracy [18].
One of the most important procedures in pattern recognition and image analysis is image segmentation technology. Image segmentation directly influences the accuracy of subsequent image processing and analysis. This procedure establishes the foundation for both the image and the evaluation of the analysis results. The primary technique involves dividing the image into distinct feature areas and then extracting data from each area. The goal is to segment the various parts of the image that have different meanings. Shape is a crucial component that helps viewers recognize and understand a scene. Image segmentation and shape feature description are the two components of shape feature extraction. The shape can be defined by its outline (or edge), or by the area it occupies within the scene; in other words, the outline or area can delineate the extent of the relevant scene from the image [19].
The DT approach is a supervised machine learning method that can be used to solve problems involving both classification and regression (for discrete and continuous output values). Its name comes from the tree-like structure in which class labels are the leaves or terminal nodes, and features (or conditions) are the branches. DT’s main advantages are that it is inherently straightforward, comprehensible, and observable. It also allows for the incorporation of decision-making procedures into the tree framework. This method excels in modeling datasets with complex nonlinear relationships between input and output variables. It is important to recognize, though, that it is prone to overfitting and that its effectiveness in handling classification tasks with several output classes is limited [20].
LightGBM is a gradient-boosting framework that uses tree-based learning algorithms. It is regarded as computationally efficient and fast. Whereas many other tree-based algorithms grow trees level-wise (horizontally), LightGBM grows trees leaf-wise (vertically) [21].
Breiman’s RF is a popular ensemble learning method for interaction detection, regression, clustering, and classification. A single DT is a poor classifier because of its high variance and bias. RF usually produces more robust models because it uses an ensemble of trees to mitigate these problems. RF builds hundreds of randomized binary trees to form a forest. Each tree is grown from a bootstrap sample using the Classification and Regression Trees procedure, with a random subset of variables considered at each node. The out-of-bag (OOB) error rate of each tree is calculated on the data that were not included in its bootstrap sample. Final decisions about class membership are made by majority vote among all trees. Two variable importance measures are computed: the mean decrease in accuracy and the mean decrease in the Gini index. These measures are widely used to rank and select variables. When running an RF model, the user must tune two key parameters, the total number of trees in the forest and the number of variables evaluated at each node, in order to minimize the OOB error and maximize model performance [22].
XGBoost generates decision trees sequentially through an optimized gradient tree boosting system. It performs the relevant computations comparatively quickly across computing environments. Because of its effectiveness in modeling new attributes and classifying labels, XGBoost is frequently used, and its popularity has grown rapidly since it was applied to tabular and structured datasets. The evolution of the XGBoost algorithm began with the decision tree, which builds a graphical representation of possible decision outcomes based on predetermined criteria. Then “bagging,” an ensemble meta-algorithm based on majority voting, was developed to aggregate predictions from different decision trees. By selecting features at random, bagging developed further into a forest, or aggregation, of decision trees. Building models sequentially, so that each model reduces the errors of the previous ones, improved performance, and the gradient descent algorithm was then used to lower the errors of the sequential model further. Finally, XGBoost was found to be a useful way of optimizing gradient boosting through its handling of missing values, its use of parallel processing, and its measures against overfitting [23].
ANN regression is based on the gradient learning approach. It is a nonparametric nonlinear model that mimics how the human brain receives and processes information by simulating neural networks that propagate signals between layers. An ANN consists of an input layer, a hidden layer, and an output layer. Training involves network initialization (the number of neurons is determined by the input and the expected output, and the weights between neurons are initialized) followed by the calculations for the hidden and output layers. The error values and weights are then updated to determine the final weights. An ANN is a large-scale learning technique that can suffer from over-learning and reduced generalization capacity when the sample and the network structure are complex. The number of neurons is the most crucial parameter in ANN regression models; the more neurons, the higher the learning accuracy and the lower the generalization ability [24].
In LR, the probability of an event occurring is estimated as a value between 0 and 1, and normality of the predictor variables is not assumed. Binomial LR analysis predicts the presence or absence of an attribute from a set of independent variables and is applied when the dependent variable is binary (nominal with two levels). In simple linear regression, one variable is used to predict another (for example, predicting temperature from altitude), whereas in multiple regression the relationship between several independent variables and a single dependent variable is measured. LR can be viewed as multiple regression with a discrete dependent variable. The LR model describes the relationship between a set of predictor variables and a dichotomous response variable, that is, the presence or absence of an attribute. The predictor variables do not need to be normally distributed; they can be either continuous or discrete [25].
One of the most commonly used models in technical and scientific applications for determining the relationship between two variables is linear regression. Depending on the characteristics of the data set, two families of statistical methods—Type-I (Ordinary Least Squares, or OLS) and Type-II (Standard Major Axis, or SMA)—have been developed to perform linear regression. In the field of optical oceanography, the rationale for selecting a particular approach to calculate a linear regression fit is often overlooked and rarely supported by statistical data [26].
The study has been conducted in two stages. In the first stage, new collections have been created using data obtained from user behavior. In the second stage, the collections predicted to receive the highest interest among all collections have been displayed on product detail pages using ranking algorithms.
In addition to the previously created rule-based collections, the collection creation process used machine learning techniques to capture relationships that cannot be detected by the human eye. In this context, the Prod2Vec approach has been employed. Prod2Vec adapts the Word2Vec algorithm, originally used in Natural Language Processing studies, so that products and collections take the place of words and sentences. This model is used to understand and discover similarities between products on an e-commerce site. Like the Word2Vec model, Prod2Vec analyzes the probability of products co-occurring and represents each product in a vector space.
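The co-occurrence idea behind Prod2Vec can be illustrated with a minimal sketch. It does not train a Word2Vec model: raw co-occurrence counts are used here as a simplified stand-in for learned embeddings, and the product identifiers, the collections, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical collections: each list of product IDs plays the role
# that a sentence plays in Word2Vec.
collections = [
    ["dress", "heels", "clutch"],
    ["dress", "heels", "earrings"],
    ["sneakers", "jeans", "hoodie"],
    ["sneakers", "jeans", "cap"],
]

def cooccurrence_vectors(collections):
    """Count how often each pair of products appears in the same collection."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in collections:
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    # Return plain dicts so later lookups cannot create new keys by accident.
    return {product: dict(neighbours) for product, neighbours in counts.items()}

def cosine(u, v):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def most_similar(product, vectors, top_n=3):
    """Rank the other products by similarity of their co-occurrence profiles."""
    scores = [(other, cosine(vectors[product], vectors[other]))
              for other in vectors if other != product]
    return sorted(scores, key=lambda s: (-s[1], s[0]))[:top_n]

vectors = cooccurrence_vectors(collections)
print(most_similar("clutch", vectors))
```

In this toy data, "clutch" and "earrings" never share a collection, yet they come out as most similar because they appear alongside the same products. This is the kind of indirect relationship that an embedding model is meant to surface.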
The collection set used for the learning process has been selected from collections created by influencers. The primary reason for this selection is that influencers have the ability to create effective product sets.
Various modeling methods, including DT, LR, XGBoost, LightGBM, RF, and ANN, have been employed for purposes such as estimating the appreciation rate of the collections, determining their rankings on product detail pages and homepages, and creating personalized collection rankings. 10-fold cross-validation was applied during the model development process. Cross-validation is a technique that evaluates a model’s accuracy by performing training and testing on different subsets of the data to measure its overall performance. The optimal values for the hyperparameters were determined using the grid search method. Grid search is an optimization technique that aims to find the parameter combination yielding the highest performance by systematically searching the hyperparameters of a model within a specified range. The hyperparameter values for the developed models are presented in (Table 1).
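The combination of 10-fold cross-validation and grid search can be sketched as follows. This is an illustration under stated assumptions rather than the study's pipeline: a one-feature ridge regressor without intercept stands in for the models listed above, the data are synthetic, the folds are contiguous (no shuffling, to keep the example deterministic), and the candidate values for the hypothetical hyperparameter `alpha` are arbitrary.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, test) index pairs."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds

def fit_ridge(xs, ys, alpha):
    """Closed-form slope of a no-intercept, one-feature ridge regression."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

def cv_mse(xs, ys, alpha, k=10):
    """Mean of the per-fold test MSE values for one hyperparameter value."""
    fold_errors = []
    for train, test in k_fold_indices(len(xs), k):
        slope = fit_ridge([xs[i] for i in train], [ys[i] for i in train], alpha)
        errors = [(ys[i] - slope * xs[i]) ** 2 for i in test]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / len(fold_errors)

def grid_search(xs, ys, grid, k=10):
    """Evaluate every candidate in the grid and keep the lowest CV error."""
    scores = {alpha: cv_mse(xs, ys, alpha, k) for alpha in grid}
    best = min(scores, key=scores.get)
    return best, scores

# Synthetic data: y is roughly 2x with a small alternating disturbance.
xs = [float(i) for i in range(1, 21)]
ys = [2.0 * x + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]

best_alpha, scores = grid_search(xs, ys, grid=[0.0, 1.0, 10.0, 100.0])
print(best_alpha, scores)
```

With several hyperparameters, the grid becomes the Cartesian product of the candidate values, so the number of model fits grows as the product of the grid sizes multiplied by the number of folds.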
An A/B test is an experimental method used to determine which of two different versions (A and B) performs better. It is generally used in digital marketing, e-commerce, software development, and user experience improvement processes. The A/B test supports data-driven decision-making processes and is considered a critical tool for strategic improvements.
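One common way to decide whether version B really outperforms version A is a two-proportion z-test on a rate such as CTR. The sketch below assumes this test purely for illustration; the source does not state which statistical test was applied, and the click and impression counts are hypothetical.

```python
from math import erf, sqrt

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Compare two rates; returns (relative uplift of B over A, z, two-sided p)."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal CDF, expressed through erf.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    uplift = (p_b - p_a) / p_a
    return uplift, z, p_value

# Hypothetical counts: clicks and impressions for control (A) and test (B).
uplift, z, p_value = two_proportion_z_test(500, 10_000, 575, 10_000)
print(f"uplift={uplift:.1%}, z={z:.2f}, p={p_value:.4f}")
```

A small p-value suggests that the observed difference is unlikely to be due to chance alone, although the threshold to use is a design decision for the experiment.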
Among all collections—those created by users, influencers, and the data science team—the ones predicted to receive the highest interest have been identified through ranking algorithms and presented to users as suggestions on product detail pages on both web and mobile interfaces.
These collections, created by users (over 32 million), influencers (over 2 million), and the data science team (over 22 million), have undergone comprehensive data analysis and processes for use in the recommendation system. The relationship matrix has been created between the recommended product and other products in the collection. The relationships between products within the collection have been examined. This relationship matrix helps understand how frequently products are preferred or used together. The quality and sales potential of the products have been evaluated, allowing the system to provide more accurate suggestions by identifying which products are of higher quality and more likely to be sold. Additionally, data on how the collection is used by other users has been analyzed to understand its popularity and interaction level. The collection CTR has been predicted, providing insight into the expected success of the collection and the level of interest it is likely to generate among users.
A large dataset has been created during the factor preparation process. The attributes in the dataset are listed below:
− Stock status, comments, likes, number of stars, visual quality, etc., of the products in the collection
− Whether the person who created the collection is an influencer
− Click, add to cart, and purchase performances of products in the collection in the last 3, 7, and 30 days
− Category, brand, gender, and age similarities of the products in the collection
− Frequency of creation and updates of the collection
− Performance in saving and sharing the collection
− Whether the collection contains video or visual media
− Organic traffic the collection receives from outside Trendyol
Using the prepared variable pool, the click performance of the collections has been predicted. The goal has been to determine which collections are expected to receive the highest interest. During the modeling phase, linear and tree-based machine learning models, including RF, XGBoost, and Linear Regression, have been utilized. R^2, MSE, and MAE metrics have been employed to evaluate model success. XGBoost produced the most successful results among these models and has been selected.
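For reference, the three evaluation metrics can be computed from their standard definitions as in the sketch below. The true and predicted click rates are hypothetical values chosen for illustration and are not results from the study.

```python
def regression_metrics(y_true, y_pred):
    """Return R^2, MSE, and MAE for paired true and predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "r2": 1 - ss_res / ss_tot,
        "mse": ss_res / n,
        "mae": sum(abs(r) for r in residuals) / n,
    }

# Hypothetical collection click rates and model predictions.
y_true = [0.10, 0.20, 0.30, 0.40]
y_pred = [0.12, 0.18, 0.33, 0.37]
print(regression_metrics(y_true, y_pred))
```

MSE and MAE are expressed in the (squared) units of the target, so their magnitude depends on the scale of the click rate, whereas R^2 is scale-free and measures the share of variance explained.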
The trained XGBoost model ranks the collections by forecasting their click performance. In the next stage, the collections suggested by this model are re-ranked by considering any available previous performance data and the performance in the display area.
For the reranking model, the following variables have been used to estimate the click performance of the products:
- Type of concept collections
- CTR of the collection in the relevant area over the last 60 days
- Click, add-to-cart, add-to-favorite, and purchase performance of the products in the collection over the last 1 and 7 days, as well as within the same session
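The two-stage idea of ranking by predicted clicks and then re-ranking with observed performance might look like the sketch below. The source describes the reranker as a model over the variables above and does not give its form, so the blending rule here is an assumption made for illustration: the predicted CTR is averaged with a smoothed observed CTR whenever display data exist. The field names, the prior, and the blend weight are all hypothetical.

```python
def smoothed_ctr(clicks, impressions, prior_ctr=0.05, prior_weight=100):
    """Shrink the observed CTR toward a prior so sparse data is not over-trusted."""
    return (clicks + prior_ctr * prior_weight) / (impressions + prior_weight)

def rerank(candidates, blend=0.5):
    """Re-order candidates by a blend of predicted CTR and observed CTR."""
    def final_score(c):
        if c["impressions"] == 0:
            # No display history yet: fall back to the model prediction alone.
            return c["predicted_ctr"]
        observed = smoothed_ctr(c["clicks"], c["impressions"])
        return (1 - blend) * c["predicted_ctr"] + blend * observed
    return sorted(candidates, key=final_score, reverse=True)

# Hypothetical first-stage output: collections with model scores and
# their recent performance in the display area.
candidates = [
    {"id": "A", "predicted_ctr": 0.080, "clicks": 20, "impressions": 1000},
    {"id": "B", "predicted_ctr": 0.070, "clicks": 150, "impressions": 1000},
    {"id": "C", "predicted_ctr": 0.075, "clicks": 0, "impressions": 0},
]

print([c["id"] for c in rerank(candidates)])
```

In this example the first-stage order by predicted CTR would be A, C, B, while the observed performance moves B to the top and A to the bottom.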
Sample screenshots of the collections are shown in (Figures 1,2). The attribute comparison graph with SHAP values is presented in (Figure 3). In Figure 3, variables such as “avg_similarity_score”, “perc_stock_out” and “avg_review_rating” stood out positively in the context of performance analysis, while variables such as “days_since_last_add” exhibited a negative impact. The two different groups represented by blue and pink colors highlighted the differences in segment behavior.
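To make the meaning of a SHAP value concrete, the sketch below computes exact Shapley values for a toy model by averaging each feature's marginal contribution over all feature orderings. It is a conceptual illustration only: practical SHAP implementations for tree models use faster algorithms and a background dataset rather than a single baseline, and the linear model and its coefficients are hypothetical. Only the feature names are taken from Figure 3.

```python
from itertools import permutations

def toy_ctr_model(x):
    """Hypothetical linear click-rate model used only for this illustration."""
    return (0.02
            + 0.10 * x["avg_similarity_score"]
            + 0.01 * x["avg_review_rating"]
            - 0.001 * x["days_since_last_add"])

def shapley_values(predict, instance, baseline):
    """Exact Shapley values: mean marginal contribution over all feature orders."""
    features = list(instance)
    totals = {f: 0.0 for f in features}
    orders = list(permutations(features))
    for order in orders:
        current = dict(baseline)      # start from the baseline input
        previous = predict(current)
        for f in order:
            current[f] = instance[f]  # switch one feature to its actual value
            value = predict(current)
            totals[f] += value - previous
            previous = value
    return {f: total / len(orders) for f, total in totals.items()}

instance = {"avg_similarity_score": 0.8, "avg_review_rating": 4.5,
            "days_since_last_add": 20}
baseline = {"avg_similarity_score": 0.5, "avg_review_rating": 4.0,
            "days_since_last_add": 5}

contributions = shapley_values(toy_ctr_model, instance, baseline)
print(contributions)
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the property that lets a SHAP plot decompose a single prediction into positive and negative feature effects.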
The results of the A/B test in which collections created with Prod2Vec are recommended first are presented in (Table 2). The collection CTR (15.24%) and add-to-cart rate (27.6%) were high, indicating that these collections attract user attention and are effective.
The training and test values of the XGBoost model are given in (Table 3). The model has demonstrated consistent performance on both the training and test data. The low and similar MSE values indicate that the model’s error remains small, and the similarly low MAE values suggest that the model generalizes well. However, the R^2 value shows that the model explains only 56% of the variance in the target variable, which indicates that its explanatory power is limited and needs improvement.
The obtained results are presented in (Table 4), and the graph of feature importance weights is shown in (Figure 4). As seen in (Table 4), the error metrics of the model were low, indicating that the predictions were generally accurate. Note that “kombin” is a Turkish word meaning “combination”. It has been observed that the curve_mapping_ctr attribute has the highest importance value.
The click performances of the collections have been predicted using the developed ranking and reranking algorithms. The collections, ranked based on these estimates, have been subjected to A/B testing. The test results are presented in (Table 5). The A/B test results provided a detailed picture of the effects of sequential collection strategies on various performance metrics. In general, the “Collection Click” metric showed a steady increase in each test period, resulting in a 321.57% improvement. In contrast, the “Collection Product CTR” metric decreased by 13.06%. The “Post-click Conversion Rate” metric showed an average increase of 2.01%, with some declines, but the overall trend remained positive. Although the “Post-click Revenue/Session” and “Post-click Profit/Session” metrics yielded negative results in a few periods, both metrics had a positive impact overall, with improvements of 4.21% and 3.21%, respectively. Finally, the “Post-click Quantity/Session” metric increased by 2.81%, though negative results were observed in some tests. These findings suggested that the sequential collection strategy is generally successful.
According to the test results, the test group, where the ranking algorithm was applied, showed higher performance compared to the control group, where the algorithm was not used.
Using the ranking algorithm, the click performance of the collections has been predicted, and the collections have been ranked based on these predictions. The overall performance of the created collections has been monitored daily through a control panel, categorized by channel, date, and category. The CR and CTR performance graph is presented in (Figure 5). In (Figure 5), the “collection” (dark pink) and “collection (previous 8 days)” (light pink) metrics have been compared between June 12 and June 19. The dark pink line showed a downward trend, decreasing from 7.2% to 6.8%, while the light pink line demonstrated an upward trend, rising from 6.8% to 7.2%. The intersection of the two lines on June 14 indicated that the performance of the previous 8 days had surpassed the current performance.
With the developed system,
− Users can discover more products and access new content more easily.
− Users can easily find other products encountered in product images.
− The discovery of products on the site and the finding of new content have also been facilitated.
− Among the created collections, those expected to receive high appreciation are listed on product detail pages.
− Collections likely to interest each individual user are identified and ranked.
The results of the tests conducted with 10,000 users are presented below:
- Time spent on the Trendyol mobile application increased by 5% ± 0.01%.
- Product clicks increased by 7% ± 0.05%.
A 5% increase in time spent on the Trendyol mobile app and a 10% increase in product clicks indicate a positive development in user interaction. These increases suggest that users are spending more time on the platform and are browsing more products. Specifically, the increase in the click-through rate indicates that users are more likely to click on products that interest them and make purchases. However, the rate at which clicks are converted into sales determines the value of this interaction for the e-commerce platform. These increases are a positive indicator of customer experience.
This study differs from previous references by incorporating the following features:
− Data obtained from user behaviors have been analyzed using the Prod2Vec algorithm, and new collections have been created based on this data.
− Collection creation and the recommendation of collections on product detail pages through collection ranking algorithms have been implemented together.
− Among the collections, those with the highest click-through rates have been assumed to generate the most interest and have been displayed on product detail pages using ranking algorithms.
− RF, XGBoost, and Linear Regression-based models have been developed for CTR prediction, with XGBoost achieving the highest predictive performance.
Future studies could expand the scope and contributions of this work by creating collections of combination products using machine learning methods and by designing “Shop the Look” themed collections with image processing techniques.
E-commerce is a rapidly growing sector today. In this competitive environment, businesses need to increase customer loyalty and increase their sales volumes in order to survive. In this context, creating new collections and recommending these collections on product detail pages according to their interest rates stands out as a strategic action that will increase sales volume. In this study, the main contribution is developing a personalized system that allows the creation of collections and ranking of these collections. As a result of the developed system, it has become possible for users to discover more products and access new content more easily.