# Module 10: Market Basket Analysis

## Lesson - 11: Market Basket Analysis

In retail and e-commerce, understanding what customers buy together helps drive sales, guide product placement, and improve customer satisfaction. Market Basket Analysis (MBA) is a data mining technique for finding associations between products purchased together. This lesson covers frequent itemsets, association rules, the Apriori algorithm, real-world applications, and a practical implementation in Python using the mlxtend library. By the end, you should be able to extract purchase patterns from transaction data and use them to inform business decisions.

Market Basket Analysis revolves around the concept of association rule learning, seeking to uncover patterns and relationships between items frequently purchased together. Key components of Market Basket Analysis include:

- Association Rules: If-then statements that describe relationships between items in a transactional dataset. For example, "If {milk, bread} then {butter}". The "if" part is called the antecedent and the "then" part the consequent.

- Support: The proportion of transactions that contain a particular itemset. It indicates how frequently the itemset occurs in the dataset.

- Confidence: For a rule "If A then B", the conditional probability that a transaction containing A also contains B, computed as support(A and B together) / support(A). It measures how reliable the rule is.

- Lift: The ratio of the observed support of A and B together to the support expected if A and B were independent, that is, support(A and B together) / (support(A) × support(B)). Lift > 1 indicates a positive association, lift = 1 indicates independence, and lift < 1 indicates a negative association.
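To make these definitions concrete, here is a minimal sketch that computes support, confidence, and lift by hand for the rule "If {milk, bread} then {butter}". The toy transactions and the helper names `support`, `confidence`, and `lift` are illustrative assumptions, not part of any library.

```python
# Hypothetical toy transactions, used only to illustrate the metrics
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "bread", "butter", "eggs"},
    {"eggs", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(A and B together) / support(A).

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence divided by the consequent's support; 1.0 means independence."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)


antecedent, consequent = {"milk", "bread"}, {"butter"}
print(f"support    = {support(antecedent | consequent, transactions):.2f}")    # 0.40
print(f"confidence = {confidence(antecedent, consequent, transactions):.2f}")  # 0.67
print(f"lift       = {lift(antecedent, consequent, transactions):.2f}")        # 1.11
```

In this toy data, two of the five baskets contain all three items (support 0.40), and two of the three baskets containing milk and bread also contain butter (confidence 0.67). The lift of about 1.11 is only slightly above 1, so the association is positive but weak.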

### Apriori Algorithm for Association Rule Learning

The Apriori algorithm is a classic algorithm for mining association rules from transactional datasets. It follows a bottom-up approach, growing frequent itemsets one item at a time, and relies on the Apriori property: every subset of a frequent itemset must itself be frequent, so any candidate with an infrequent subset can be discarded without counting it. Key steps of the Apriori algorithm include:

1. Generating Candidate Itemsets: Initially, individual items are considered as candidate itemsets. Subsequently, candidate itemsets of higher order (containing multiple items) are generated based on the frequent itemsets of lower order.
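2. Pruning Infrequent Itemsets: Candidate itemsets that do not meet the minimum support threshold are pruned from further consideration. Steps 1 and 2 repeat, with itemsets growing by one item on each pass, until no new frequent itemsets are found.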
3. Deriving Association Rules: Association rules are derived from the frequent itemsets, considering various metrics such as support, confidence, and lift.
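The three steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not an optimized implementation and not how mlxtend works internally; the function names `apriori_frequent_itemsets` and `derive_rules` and the toy transactions are hypothetical.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return {frozenset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent_among(candidates):
        # Step 2: count each candidate and keep those meeting the threshold
        kept = {}
        for candidate in candidates:
            s = sum(candidate <= t for t in transactions) / n
            if s >= min_support:
                kept[candidate] = s
        return kept

    # Step 1 (first pass): every individual item is a candidate
    current = frequent_among({frozenset([item]) for t in transactions for item in t})
    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Step 1 (later passes): join frequent (k-1)-itemsets into k-itemsets
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Drop candidates that have any infrequent (k-1)-subset
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        current = frequent_among(candidates)
        frequent.update(current)
        k += 1
    return frequent


def derive_rules(frequent, min_confidence):
    """Step 3: split each frequent itemset into antecedent -> consequent."""
    rules = []
    for itemset, sup in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, size)):
                consequent = itemset - antecedent
                # Subsets of a frequent itemset are frequent, so lookups succeed
                conf = sup / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, consequent, sup, conf,
                                  conf / frequent[consequent]))
    return rules


# Hypothetical toy transactions, used only for illustration
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "bread", "butter", "eggs"},
    {"eggs", "milk"},
]
frequent = apriori_frequent_itemsets(transactions, min_support=0.5)
rules = derive_rules(frequent, min_confidence=0.7)
for antecedent, consequent, sup, conf, lift in rules:
    print(set(antecedent), "->", set(consequent),
          f"support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

With a minimum support of 0.5, {milk, butter} appears in only two of five baskets and is pruned, so the candidate {milk, bread, butter} is discarded without ever being counted. That is the saving the Apriori property provides.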

### Applications of Market Basket Analysis

Market Basket Analysis finds diverse applications across various industries, including:

- Retail: Identifying product bundles, optimizing shelf layouts, and devising targeted marketing strategies based on customer preferences.

- E-commerce: Personalizing product recommendations, cross-selling and upselling strategies, and improving the overall shopping experience.

- Supply Chain Management: Forecasting demand, optimizing inventory management, and streamlining procurement processes.

### Implementation in Python using mlxtend

Let's walk through a practical implementation of Market Basket Analysis using Python's mlxtend library:

```python

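import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
from mlxtend.preprocessing import TransactionEncoder

# Illustrative toy transactions (hypothetical); replace with your own baskets,
# one list of item names per transaction
data = [
    ['milk', 'bread', 'butter'],
    ['milk', 'bread'],
    ['bread', 'butter'],
    ['milk', 'bread', 'butter', 'eggs'],
    ['eggs', 'milk'],
]

# Perform one-hot encoding: one boolean column per item, one row per transaction
encoder = TransactionEncoder()
encoded_data = pd.DataFrame(encoder.fit(data).transform(data), columns=encoder.columns_)

# Generate frequent itemsets using the Apriori algorithm
frequent_itemsets = apriori(encoded_data, min_support=0.05, use_colnames=True)

# Derive association rules, keeping those with lift of at least 1.2
rules = association_rules(frequent_itemsets, metric='lift', min_threshold=1.2)

# Filter rules based on desired metrics (e.g., confidence)
filtered_rules = rules[(rules['confidence'] > 0.7) & (rules['support'] > 0.05)]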

# Display the top rules