Have you ever noticed that when you buy a certain product, there are usually other products that are often purchased together? For example, when someone buys bread, he might also buy butter or jam. Market basket analysis is a method used to identify product purchasing patterns that often occur together.
In this article, we will discuss the concept of market basket analysis, examples of its application in data mining, how to perform market basket analysis using Python, and how to create data visualizations using Tableau.
What is Market Basket Analysis and How Does the Concept Work?
Market basket analysis is a data analysis technique used to identify the relationship between products purchased together by customers. The main objective of this analysis is to discover hidden relationships among purchased products and to use this information to improve sales and marketing strategies.
The basic concept of market basket analysis is “lift”. Lift is the ratio between the probability that two products will be purchased together and the probability that they will be purchased at random. If lift is 1, then the two products are likely to be purchased at random. However, if the lift is greater than 1, then the two products are likely to be purchased together.
Example of Market Basket Analysis in Data Mining
Let’s say you have data on product purchases at the grocery store for the last month. You want to find a relationship between products that are frequently purchased together. The following is an example of a market basket analysis for that purchasing data.
- Data Preprocessing
The first step in market basket analysis is preparing the data. You must ensure that the data you use is free from errors and duplications, and that it is properly encoded. Next, you need to convert the data into a transaction format, which is a list of all products purchased by a customer in a single transaction. - Frequent Itemset Mining
After the data is converted into a transaction format, the next step is to find frequent itemsets, namely sets of products that are frequently purchased together. Frequent itemsets are obtained using algorithms such as Apriori or FP-Growth. - Association Rule Mining
After getting the frequent itemset, the next step is to find the association rule, namely the relationship between products purchased together. The association rule is obtained by calculating the lift between every two products in the frequent itemset. - Interpretation of Results
After getting the association rule, the final step is to interpret the results. You can use this information to improve your sales and marketing strategy. Suppose you find that people who buy bread also tend to buy butter, then you can place butter near the bread to increase the likelihood of a joint purchase.
How to Use Python to Analyze Market Baskets
Python is one of the popular programming languages for data analysis. In this section, we will cover how to use Python to perform a market basket analysis. why python is it a good language to learn?
1. Data Preprocessing
First, you need to import the required libraries, such as pandas and numpy. Next, you need to read the data from the CSV file or database and prepare the data.
Example:
import pandas as pd
import numpy as np
# read data from CSV file
df = pd.read_csv('data.csv')
# remove unused data
df = df.drop(['id'], axis=1)
# converting data into a transactional format
transactions = []
for i in range(len(df)):
transactions.append([str(df.values[i,j]) for j in range(len(df.columns))])
2. Frequent Itemset Mining
Next, you need to import a library to use the Apriori or FP-Growth algorithms to find frequent itemsets.
Example:
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, fpgrowth
# convert data to boolean format
te = TransactionEncoder()
te_ary = te.fit_transform(transactions)
df = pd.DataFrame(te_ary, columns=te.columns_)
# search for frequent itemsets with the Apriori algorithm
frequent_itemsets = apriori(df, min_support=0.01, use_colnames=True)
# looking for frequent itemsets with the FP-Growth algorithm
frequent_itemsets = fpgrowth(df, min_support=0.01, use_colnames=True)
3. Association Rule Mining
After getting the frequent itemset, you can use the mlxtend library to find the association rule.
Example:
from mlxtend.frequent_patterns import association_rules
# find association rule with lift metric
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)
4. Interpretation of Results
After getting the association rules, you can use pandas to interpret the results.
Example:
# sort the association rule based on the lift value
rules = rules.sort_values(['lift'], ascending=False)
# displays the association rule with the largest lift
print(rules.head())
Read more : How to count a word in string Python
Creating a Market Basket Analysis Visualization with Tableau
Tableau is one of the popular business software for creating data visualizations. In this section, we will cover how to create a market basket analysis visualization with Tableau.
- Import Data
First, you need to import product purchase data into Tableau. - Create Frequent Itemsets
Next, you can use the “Create Sets” feature to create frequent itemsets. Select the product column and select “Create Sets”. Select “Use All” to use all products, then set the support limit to search for frequent itemsets. - Create Association Rules
After getting the frequent itemset, you can use the “Create Combined Sets” to create an association rule. Select two frequent itemsets and set the lift limit to find the association rule. - Create Visualization
After getting the association rules, you can create visualizations using the “Visualizations” feature in Tableau. For example, you can create a scatter plot to display association rules based on support and lift values.
In conclusion, Market basket analysis is a useful data analysis technique for finding related product purchasing patterns. By using this technique, you can improve your sales and marketing strategy.
Python and Tableau are two popular tools for doing market analysis. In this article, we’ve covered how to use Python and Tableau to perform market analysis and create a visualization of the results.
In carrying out basket analysis, you need to carry out several stages such as data preprocessing, frequent itemset mining, association rule mining, and interpretation of results. Once you get results, you can take action to improve your sales and marketing strategy.
We hope this article will be useful for those of you who are interested in market basket analysis. Don’t forget to try it yourself using Python or Tableau to do market basket analysis on your data.