Machine learning benefits the retail industry in many areas, from predicting sales to locating customers. Market basket analysis (MBA) is one of its most prominent retail applications. By studying customers' previous purchases, it helps retailers learn which products people buy together so that the store or website layout can be designed accordingly. Companies also use it to cross-sell products on their online platforms. It is not limited to the retail sector, though; it is also applied to detecting fraudulent insurance claims and credit card transactions.
Amazon is a well-known example of a company that uses this analysis to cross-sell products. These are the products in the suggested items list that might interest you along with your current purchase. Your browsing history, what other customers have bought with a given product, and other factors determine which products appear in the suggested category.
Market basket analysis is a data mining technique that analyzes patterns of co-occurrence and determines the strength of the link between products purchased together. We also refer to it as frequent itemset mining or association analysis. It leverages these patterns recognized in any retail setting to understand the behavior of the customer by identifying the relationships between the items bought by them. To put it simply, market basket analysis helps the retailers know about the products frequently bought together so as to keep those items always available in their inventory.
These patterns are found in the vast amount of data that is continually collected and stored. Frequent itemset mining makes it easier to discover correlations between items in huge relational or transactional datasets. It considerably helps in decision-making processes related to cross-marketing, catalog design, and consumer shopping analytics.
The following example illustrates the idea.
When people buy green tea, they are also likely to buy honey with it. This relationship is written as a conditional rule, as given below.
IF {green tea} THEN {honey}
The rule states that the items on the right are likely to be ordered together with the items on the left. Market basket analysis in data mining helps us measure that relationship and decide how useful it would be to change our decisions based on it.
Machine learning engineers develop data-driven strategies based on detailed market basket analysis that further help retailers improve their revenues in the following ways.
Market basket analysis comprises the following types.
1. Descriptive market basket analysis
This type of market basket analysis offers actionable insights based on historical data. It is a frequently used approach that does not make any predictions but rates the association between products using statistical techniques. Because of the way it is modeled, it is also referred to as unsupervised learning.
2. Predictive market basket analysis
Although “predict” and “analysis” make up the word predictive analysis, it actually works in reverse. It first analyzes and then predicts what the future holds. This type utilizes supervised learning models like regression and classification. It is a valuable tool for marketers even if it is less used than descriptive market basket analysis.
So when we talk about the predictive market basket analysis, it considers items purchased in sequence to evaluate cross-sell. For instance, when a consumer purchases a laptop, they are more likely to buy an extended warranty with it. This analysis thus helps in recognizing those considered items in a sequence so they can be sold together.
It finds application in the retail industry mainly to determine the item baskets that are purchased together.
3. Differential market basket analysis
Differential market basket analysis is a useful tool for competitive analysis. It can help you determine why consumers prefer to purchase a product from one particular platform even when the same product is listed at the same price on another platform.
This decision of the consumers is often based on several factors, as listed below.
By considering all the factors behind the consumers' decision, organizations can benefit from differential market basket analysis. They can adjust these parameters to match what consumers prefer, improve the user experience, and increase sales on their platform.
Note: People without expertise in data mining should confirm the results before sharing them with stakeholders, since various parameters and formulas are taken into account at every step.
Here are some terminologies you should keep in mind while working with market basket analysis.
We can further understand antecedent and consequent with the below example.
Market basket analysis uses association rules of the form {IF} -> {THEN} to estimate the probability of certain products being purchased together. The analysis counts how frequently items occur together and looks for associations that occur more often than expected.
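As a rough illustration of counting items that occur together and comparing the count with what would be expected, here is a minimal Python sketch. The transactions and the helper name `pair_stats` are invented for illustration, and "expected" is taken to mean the count we would see if the two items were bought independently of each other, which is an assumption about what the text intends.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"green tea", "honey"},
    {"green tea", "honey", "lemon"},
    {"green tea", "lemon"},
    {"bread", "butter"},
    {"bread", "honey"},
]

def pair_stats(transactions):
    """Return {pair: (observed, expected)} co-occurrence counts.

    `expected` is the count we would see if the two items were
    bought independently of each other.
    """
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    # Sort each basket so that a pair always appears in the same order.
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    return {
        pair: (observed, item_counts[pair[0]] * item_counts[pair[1]] / n)
        for pair, observed in pair_counts.items()
    }

stats = pair_stats(transactions)
observed, expected = stats[("green tea", "honey")]
print(observed, expected)
```

In this made-up data, green tea and honey appear together slightly more often than independence would suggest, which is the kind of signal an association rule tries to capture.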
Some algorithms that leverage these association rules are AIS, Apriori, and SETM.
Apriori is an algorithm commonly cited by data scientists that identifies frequent itemsets in a database. It is an unsupervised technique, so it requires no training and makes no predictions. It is used especially for large data sets where useful relationships among the items are to be determined.
The Apriori algorithm relies on a shortcut known as the Apriori property. This property states that every subset of a frequent itemset must also be frequent. Consequently, any itemset that contains an infrequent subset can be discarded without counting it, which saves a lot of computational time.
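The Apriori property is usually stated as "every subset of a frequent itemset is itself frequent", which lets the search discard larger candidates early. The sketch below shows one way this pruning might look in plain Python. The function name `frequent_itemsets`, the sample transactions, and the 0.5 threshold are all assumptions made for illustration; this is not a reference implementation of the algorithm.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise search for itemsets whose support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in `itemset`.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}
    result = {s: support(s) for s in current}

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step (Apriori property): drop a candidate if any of its
        # (k-1)-item subsets is not frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
        result.update({c: support(c) for c in current})
        k += 1
    return result

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]
itemsets = frequent_itemsets(transactions, min_support=0.5)
for itemset, s in sorted(itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With this sample data, the three single items and the three pairs pass the threshold, while the three-item set does not. The prune step matters more on large catalogs, where most candidates can be rejected without scanning the transactions.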
The Apriori algorithm works in two steps that are illustrated below.
It is further classified into three components.
Let’s understand each one of them with an example of how to calculate them.
Assume that a popular eCommerce website records 100 transactions. Now if we want to calculate the support, confidence, and lift for two products, say books and pencils, here’s how we will proceed.
Let’s say 10 transactions include books, 8 transactions include pencils, and 6 transactions include both products.
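1. Support: It is the number of transactions that contain a particular itemset divided by the total number of transactions. Zero represents no support while one represents the highest support. The higher the support, the more important the itemset is in the data.
support(A ⇒ B) = P(A ∪ B), that is, the share of transactions that contain both A and B
Support (Books, Pencils) = Freq (Books and Pencils)/Total transactions made
Support (Books, Pencils) = 6/100 = 0.06, or 6%
Similarly, Support (Books) = 10/100 = 0.10 and Support (Pencils) = 8/100 = 0.08.
2. Confidence: It is the ratio of the combined transactions to the transactions that contain the item on the IF side of the rule. Here the rule is IF {pencils} THEN {books}.
confidence(A ⇒ B) = P(B|A)
Confidence (Pencils ⇒ Books) = Support (Books, Pencils)/Support (Pencils)
Confidence (Pencils ⇒ Books) = 0.06/0.08 = 0.75
3. Lift: It is the ratio of the confidence of the rule to the support of the item on the THEN side of the rule.
Lift (Pencils ⇒ Books) = Confidence (Pencils ⇒ Books)/Support (Books) = 0.75/0.10 = 7.5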
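Market basket analysis is used to search for the rules that result in a lift value greater than 1, which means the items are bought together more often than would be expected if they were purchased independently.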
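The arithmetic for the books and pencils example can be checked with a few lines of Python. This sketch assumes the rule being measured is IF {pencils} THEN {books}, since that is the reading under which the figures 0.06/0.08 and 0.75/0.10 work out; the variable names are chosen for illustration.

```python
# Counts taken from the books and pencils example.
total = 100      # all transactions
books = 10       # transactions containing books
pencils = 8      # transactions containing pencils
both = 6         # transactions containing books and pencils

support_both = both / total        # share of baskets with both items
support_books = books / total      # baseline share of baskets with books

# Rule: IF {pencils} THEN {books}
confidence = both / pencils        # P(books | pencils)
lift = confidence / support_books  # confidence relative to the baseline

print(round(support_both, 2), round(confidence, 2), round(lift, 2))
```

The printed values agree with the support of 0.06, confidence of 0.75, and lift of 7.5 worked out by hand. For the opposite rule, IF {books} THEN {pencils}, the confidence would be 6/10 = 0.6, while the lift stays the same because lift is symmetric.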
Note: Confidence and support can be used to control the Apriori algorithm by setting cut-off values for the search. For instance, by setting a minimum support value of 0.5 and a minimum confidence value of 0.65, we tell the computer to report only those association rules that are above these cut-off points. It eliminates useless rules that add no value to the decision-making process.
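To make the effect of the cut-offs concrete, the following sketch lists single-item rules and keeps only those that meet a minimum support of 0.5 and a minimum confidence of 0.65, the values used in the note. The transactions and the helper names `support` and `pair_rules` are hypothetical, and the sketch only considers rules with one item on each side.

```python
from itertools import permutations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
    {"soda"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(set(itemset) <= t for t in transactions) / len(transactions)

def pair_rules(transactions, min_support, min_confidence):
    """List rules IF {a} THEN {b} that pass both cut-offs."""
    items = sorted({item for t in transactions for item in t})
    rules = []
    for a, b in permutations(items, 2):
        s = support({a, b}, transactions)
        if s < min_support:
            continue  # too rare to be worth reporting
        c = s / support({a}, transactions)
        if c >= min_confidence:
            rules.append((a, b, s, c))
    return rules

rules = pair_rules(transactions, min_support=0.5, min_confidence=0.65)
for a, b, s, c in rules:
    print(f"IF {{{a}}} THEN {{{b}}}  support={s:.2f} confidence={c:.2f}")
```

Every rule involving soda is dropped because soda appears in only one basket, so no pair containing it reaches the support cut-off. Raising either threshold shrinks the list further.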
Market basket analysis is based on association rule mining, which uses the following construct.
IF {}, THEN {}
Everything that is within the brackets is referred to as an itemset, which is some form of data. The process initiates from a data set of transactions. Every transaction depicts a group of products that are bought together by the customers and are often referred to as itemsets. These transactions are analyzed through the market basket analysis to identify the rules of the association.
For example, the rule IF {bread, butter} THEN {milk} means that if a customer made a transaction that consisted of bread and butter, then they are likely to purchase milk too. However, before acting on any rule, the store manager or retailer must have sufficient evidence to back up the decision so that the results are beneficial. The components discussed above, namely support, confidence, and lift, help in measuring the strength of a rule so that you can make an informed decision.
Let’s understand market basket analysis with an example that uses the Apriori implementation in the arules package, which can be installed and run in R.
STEP 1: Take a look at the table below to understand how the data is loaded into the engine.
Source: Smartbridge
STEP 2: Next, each transaction is aggregated across records into a single record as an array that converts the data set to an R transaction. Check out the below image that depicts the result of the aggregation process.
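The aggregation idea in this step, collapsing one-row-per-item records into one basket per transaction, can be sketched independently of R. The Python below is an analogue written for illustration and is not the arules API; the records and the helper name `to_transactions` are hypothetical and do not reproduce the data in the original table.

```python
from collections import defaultdict

# Hypothetical (transaction_id, item) records, invented for illustration.
records = [
    (1, "soda"), (1, "chips"),
    (2, "soda"), (2, "ice cream"), (2, "chips"),
    (3, "ice cream"),
]

def to_transactions(records):
    """Group one-row-per-item records into one item set per transaction."""
    baskets = defaultdict(set)
    for transaction_id, item in records:
        baskets[transaction_id].add(item)
    return dict(baskets)

baskets = to_transactions(records)
for transaction_id, items in baskets.items():
    print(transaction_id, sorted(items))
```

Using a set per transaction means a product scanned twice in the same basket is counted once, which is the usual convention when only co-occurrence matters.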
STEP 3: Lastly, the Apriori logic is implemented in the transactions. Check the below-given image for the result set.
The above image clearly depicts the number of strong consequent combinations in which soda is a keystone product category. Leveraging this information, the store manager can drive sales volume by keeping the price and margins low on soda.
Another notable result is that all the rules containing ice cream show a confidence of one with a significant lift. Thus, the store manager can further promote ice cream with the expectation that customers will purchase other items along with it.
Organizations around the world are taking advantage of the benefits that market basket analysis has to offer. This is also one reason for the rise in hiring of ML engineers at companies around the world.
Read on to learn about some of its key advantages.
The popularity of market basket analysis in machine learning extends beyond the boundaries of the retail industry. We have listed some other areas where it is applied below.
Cross-selling and upselling are core tactics of the retail industry that push the consumer to buy more. Businesses that find patterns with market basket analysis in data mining use the resulting customer insights to improve their brand's performance.
An urban legend states that a grocery store increased its sales after it placed beer and diapers together, because a market basket analysis showed that men frequently purchased both.
Organizations are using this technique to increase revenue by understanding how customers shop. It can be an effective way of improving your sales without putting extra effort into marketing that may not deliver comparable results. So go ahead and try it on the data you have in your repository to find patterns that may surprise you.
Srishti is a competent content writer and marketer with expertise in niches like cloud tech, big data, web development, and digital marketing. She looks forward to growing her tech knowledge and skills.