Consumer-Driven Marketing:
Consumer-driven marketing is an approach where marketing strategies and initiatives are shaped by the preferences, behaviors, and needs of the target audience or consumers. Instead of relying solely on company-driven decisions and assumptions, consumer-driven marketing emphasizes understanding and responding to what consumers want.
Applying Data Analytics For Consumer-Driven Marketing:
Recognizing and Customizing Offers: Ever since businesses began recognizing customers through personal information such as warranty registrations and credit cards, they have aimed to tailor their offers and marketing strategies. These customized offers can be generated by analyzing past customer behavior or by using more sophisticated models. They may involve suggesting products likely to interest a customer, personalized promotions, or customized product displays on websites.
Changing Communication Methods: While traditional direct mail was used in the past, today’s communication channels include email, websites, apps, customer care, and in-store applications, especially with the rise of online sales.
Investing in Customer Analytics Programs: Companies have access to an abundance of data that can be analyzed for profit, which is why fashion brands are actively investing in consumer analytics programs. The focus on consumers has become a key strategy for brands to expand their market share, transitioning from a product-centric to a consumer-centric approach.
Transition Challenges: Many companies that strongly believe in focusing on their brand are finding it difficult to adapt to this change, but it has become essential to stay competitive in the market. Brands must now provide ongoing value to customers throughout their entire relationship. Essentially, brands need to build consumer equity to meet the demands of the market.
Role of Marketing and Analysis: Personalized consumer relationships are entirely driven by consumer preferences. Marketing and analytics play a crucial role in enabling large organizations to manage such personalized connections. The key lies in tailoring offers and promotions that align with the individual needs of customers.
Changing Approaches in Offer Creation: In the past, offers were crafted based on psycho-demographic segments or clusters derived from syndicated data. However, contemporary retailers now prioritize actual customer behavior as a reliable indicator of future buying patterns. They leverage internal behavioral data to precisely target offers and initiate lead generation campaigns.
Enhancing Focus Through Analysis: Regardless of the data source, whether internal or external, the focus is on achieving more successful targeting. This involves identifying unique customers at the individual or household level to enhance the effectiveness of personalized marketing strategies.
Consumer Focus and Advanced Analysis: Ideal consumer centricity involves establishing a personalized, individual connection between the brand and consumers throughout every phase of the individual consumer’s journey. Achieving this level of personalization at the individual consumer level requires the application of advanced analytics. In the realm of analytics, fashion has been relatively slow to embrace this approach compared to other industries. It is only in recent times that the fashion industry has begun to utilize advanced analytics to create personalized experiences with its consumers.
Increasing Data Through Digital Channels: Digital channels like email, online navigation, web applications, and social interaction have played a significant role in dramatically expanding the amount of accessible data. Handling such extensive data sets presents technological difficulties: to use advanced analytics for consumers, brands need the technology to store, retrieve, analyze, and share specific information about each individual consumer.
Value of Consumer Data: Consider Rana, a customer who signs up on the website, browses product pages, clicks on the latest email, visits the store, uses the app, interacts with ads, and maybe even shares a brand’s Instagram post.
While these actions may seem like isolated pieces of information, they are valuable data for any brand adopting a consumer-centric approach. The ability to establish a comprehensive view of the consumer’s interactions across all channels is often a significant and enduring investment for brands.
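To make the idea of a comprehensive, cross-channel view concrete, here is a minimal sketch in pandas. The event log, its column names, and the second customer are all made up for illustration; a real setup would pull these rows from several source systems.

```python
import pandas as pd

# Hypothetical event log: one row per interaction, across several channels.
events = pd.DataFrame(
    {
        "customer_id": ["rana", "rana", "rana", "rana", "omar", "omar"],
        "channel": ["web", "email", "store", "app", "web", "email"],
        "event": ["signup", "click", "visit", "session", "page_view", "click"],
    }
)

# Count interactions per customer and channel, then pivot so that each
# customer becomes a single row with one column per channel.
customer_view = (
    events.groupby(["customer_id", "channel"])
    .size()
    .unstack(fill_value=0)
)
customer_view["total_interactions"] = customer_view.sum(axis=1)
print(customer_view)
```

Each row of `customer_view` is one consumer, which is the shape that clustering and scoring models typically expect as input.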
Extracting Consumer Insight From Big Data: We can use techniques such as consumer clustering (for example, hierarchical or K-means clustering) and consumer scoring to extract valuable information about consumers from big data.
Consumer clustering: This technique, also called customer segmentation, is widely used across industries, including fashion, to identify groups of customers with similar traits. These groups, known as customer archetypes or personas, represent distinct demographics and shopping habits.
In the retail sector, for instance, customers may vary in shopping frequency, from monthly to seasonal patterns. Fast fashion brands like Zara, with frequent collection launches, attract shoppers more often. To customize advertising effectively, the RFM technique (recency, frequency, monetary) describes each customer by how recently, how often, and how much they buy, which helps keep each group distinct.
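As a rough illustration of RFM, the sketch below derives the three values from a transaction log. The table, its column names, and the choice of reference date are assumptions made for this example only.

```python
import pandas as pd

# Hypothetical transaction log (customer IDs, dates, and amounts are made up).
transactions = pd.DataFrame(
    {
        "customer_id": [1, 1, 1, 2, 2, 3],
        "order_date": pd.to_datetime(
            ["2024-01-05", "2024-02-10", "2024-03-01",
             "2023-11-20", "2024-01-15", "2023-09-30"]
        ),
        "amount": [50.0, 30.0, 70.0, 200.0, 20.0, 15.0],
    }
)

# Reference date for recency (assumed here: the day after the last order).
snapshot = transactions["order_date"].max() + pd.Timedelta(days=1)

rfm = transactions.groupby("customer_id").agg(
    recency_days=("order_date", lambda d: (snapshot - d.max()).days),  # R
    frequency=("order_date", "count"),                                 # F
    monetary=("amount", "sum"),                                        # M
)
print(rfm)
```

The resulting three columns can be binned into scores or passed directly to a clustering algorithm.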
Recognizing that different customer groups respond uniquely to ads, targeted advertising strategies are essential. Once groups are identified, decisions on advertising targeting can be made strategically, considering the relevance to each persona. The practical approach involves employing cluster analysis.
Clustering:
Although I will describe clustering in detail in the machine learning chapter, here is a quick overview. Clustering is a data analysis technique that groups similar observations or data points together into homogeneous subgroups. The goal is to identify patterns or structures within the data, making it easier to understand and analyze. By organizing data into clusters, you can gain insights into the underlying structure and relationships, which can be valuable for various applications such as customer segmentation, anomaly detection, and pattern recognition.
Types of Clustering:
01. Hierarchical Clustering
02. DBSCAN
03. K-Means Clustering
Hierarchical clustering is a method aimed at constructing a cluster hierarchy, and it typically involves two approaches.
The first approach, known as Agglomerative clustering, follows a bottom-up strategy where individual observations start as separate clusters and are gradually merged together.
The second approach, called Divisive clustering, takes a top-down approach by initially placing all observations in a single cluster and then progressively splitting into more clusters at each step.
Despite its simplicity, hierarchical clustering is computationally expensive.
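The agglomerative (bottom-up) approach can be tried with scikit-learn's `AgglomerativeClustering`. This is only a small sketch on made-up 2-D points; the Ward linkage and the choice of two clusters are assumptions for the example.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Two well-separated synthetic groups of 2-D points (made-up data).
rng = np.random.default_rng(0)
points = np.vstack(
    [
        rng.normal(loc=[0.0, 0.0], scale=0.3, size=(20, 2)),
        rng.normal(loc=[5.0, 5.0], scale=0.3, size=(20, 2)),
    ]
)

# Bottom-up merging with Ward linkage; the hierarchy is cut at 2 clusters.
agg = AgglomerativeClustering(n_clusters=2, linkage="ward")
agg_labels = agg.fit_predict(points)
print(agg_labels)
```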
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a clustering algorithm that groups data points based on their density in the feature space. It identifies core points with a minimum number of neighbors within a specified radius, expands clusters by connecting core points and their neighbors, and categorizes border points and outliers. DBSCAN is flexible in handling clusters of different shapes, robust against noise, and efficient for large datasets, but requires careful parameter tuning.
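A minimal DBSCAN sketch with scikit-learn follows. The data is synthetic, and the `eps` and `min_samples` values are assumptions picked for this toy example; real data usually needs its own tuning.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Made-up data: two dense groups plus one isolated point far from both.
rng = np.random.default_rng(1)
dense_a = rng.normal(loc=[0.0, 0.0], scale=0.2, size=(30, 2))
dense_b = rng.normal(loc=[4.0, 4.0], scale=0.2, size=(30, 2))
outlier = np.array([[10.0, -10.0]])
points = np.vstack([dense_a, dense_b, outlier])

# eps is the neighbourhood radius; min_samples is the number of neighbours
# a point needs within eps to count as a core point.
db = DBSCAN(eps=0.6, min_samples=5)
db_labels = db.fit_predict(points)

# Points labelled -1 are treated as noise (outliers).
print("clusters found:", len(set(db_labels) - {-1}))
print("label of the isolated point:", db_labels[-1])
```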
In K-means clustering, the ‘K’ represents the number of clusters to be identified. Unlike hierarchical clustering, where the number of clusters can be chosen after the hierarchy has been built, in K-means the user specifies the number of desired clusters at the outset. The algorithm then establishes initial centroids for each cluster and iteratively reassigns points and recalculates the centroids to minimize the sum of squared distances between each data point and its cluster centroid. This process continues until no further improvement can be achieved. In practice, the ideal ‘K’ is often unknown initially, leading to an exploration of different cluster numbers, and the final choice is based on comparing results among potential solutions to determine the most effective clustering. K-means is widely used for applications like image segmentation, customer segmentation, and pattern recognition when the number of clusters is known or can be estimated.
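To show the assign-and-update loop described above, here is a bare-bones NumPy sketch of K-means (Lloyd's algorithm). The function name `kmeans`, its parameters, and the toy data are illustrative; for real work, a library implementation such as scikit-learn's `KMeans` is usually the better choice.

```python
import numpy as np


def kmeans(points, k, n_iter=100, seed=0):
    """Plain K-means (Lloyd's algorithm) on an (n_samples, n_features) array."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids with k distinct, randomly chosen data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)]

    def assign(current):
        # Distance from every point to every centroid -> index of the nearest.
        dists = np.linalg.norm(points[:, None, :] - current[None, :, :], axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        # Move each centroid to the mean of its points (keep it if it has none).
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break  # centroids stopped moving: no further improvement
        centroids = updated

    labels = assign(centroids)
    # Within-cluster sum of squares: the quantity K-means tries to minimise.
    wcss = float(((points - centroids[labels]) ** 2).sum())
    return labels, centroids, wcss


# Made-up data: two tight groups of points.
rng = np.random.default_rng(0)
toy = np.vstack([
    rng.normal([0.0, 0.0], 0.3, size=(20, 2)),
    rng.normal([5.0, 5.0], 0.3, size=(20, 2)),
])
toy_labels, toy_centroids, toy_wcss = kmeans(toy, k=2)
print(toy_labels, toy_wcss)
```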
The techniques discussed above help marketers organize more descriptive attributes than they could handle by hand. This lets marketers customize communication for different groups. However, as the number of clusters and touch points increases, managing them manually becomes too complex. What sets consumer-driven marketing at scale apart is the use of consumer scoring.
Consumer scoring involves creating predictive variables to understand and quantify consumer needs, preferences, and purchase motivations. It employs various models, like logistic regression and decision trees, to predict buying likelihood, acceptable price points, and responses to commercial events.
Marketers use consumer scoring to optimize automation, recommend products, and run targeted lead acquisition campaigns. The primary advantage is automating consumer communication, identifying the best actions, and determining optimal reconnection moments.
Benefits include lower acquisition and retention costs, more relevant communication, and happier, longer-lasting customers. Fashion brands like Kenzo and Gucci invest in advanced analytics, but implementation usually involves a mixed model that combines in-house and external resources, because building these capabilities internally is resource-intensive.
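The consumer scoring idea described above can be sketched with a logistic regression that outputs a purchase probability per customer. Everything here is an assumption for illustration: the two features, the synthetic data, and the coefficients used to generate it.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 500

# Hypothetical features: site visits in the last 30 days and days since
# the last purchase. The data is synthetic, not taken from any real brand.
visits = rng.poisson(3, size=n)
days_since = rng.integers(1, 120, size=n)

# Synthetic ground truth: more visits and a more recent purchase make a
# new purchase more likely.
logit = 0.6 * visits - 0.03 * days_since - 0.5
bought = rng.random(n) < 1 / (1 + np.exp(-logit))

features = np.column_stack([visits, days_since])
scorer = LogisticRegression(max_iter=1000).fit(features, bought)

# Score two hypothetical customers: an active one and a lapsed one.
candidates = np.array([[8, 5], [0, 100]])
scores = scorer.predict_proba(candidates)[:, 1]
print(scores)
```

A score like this could then drive automated decisions, for example sending a promotion only to customers above a chosen threshold.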
Python Implementation for Customer Segmentation:
Problem Statement:
To understand which customers can be easily converted (target customers), so that this insight can be shared with the marketing team, who can then plan their strategy accordingly.
Domain Analysis:
The dataset contains information about customers’ demographic attributes. The features include customer ID, gender, age, annual income (in thousands of dollars), and spending score (ranging from 1 to 100).
The data comes from a supermarket mall, which collects these basic details about its customers through membership cards.
The spending score is a value the mall assigns to each customer based on defined parameters such as customer behavior and purchasing data.
- CustomerID: Unique identification number for each customer.
- Gender: Gender of the customer.
- Age: Age of the customer.
- Annual Income: Annual income of the customer in thousands of dollars.
- Spending Score (1-100): Spending score indicating customer spending behavior (ranging from 1 to 100).
Importing Necessary Libraries:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
import warnings
warnings.filterwarnings("ignore")
Load Dataset:
data = pd.read_csv("/kaggle/input/mall-customer-segmentation/Mall_Customers.csv")
data
     CustomerID  Gender  Age  Annual Income (k$)  Spending Score (1-100)
0             1    Male   19                  15                      39
1             2    Male   21                  15                      81
2             3  Female   20                  16                       6
3             4  Female   23                  16                      77
4             5  Female   31                  17                      40
...         ...     ...  ...                 ...                     ...
195         196  Female   35                 120                      79
196         197  Female   45                 126                      28
197         198    Male   32                 126                      74
198         199    Male   32                 137                      18
199         200    Male   30                 137                      83

200 rows × 5 columns
Basic Checks:
data.head()
   CustomerID  Gender  Age  Annual Income (k$)  Spending Score (1-100)
0           1    Male   19                  15                      39
1           2    Male   21                  15                      81
2           3  Female   20                  16                       6
3           4  Female   23                  16                      77
4           5  Female   31                  17                      40
data.tail()
     CustomerID  Gender  Age  Annual Income (k$)  Spending Score (1-100)
195         196  Female   35                 120                      79
196         197  Female   45                 126                      28
197         198    Male   32                 126                      74
198         199    Male   32                 137                      18
199         200    Male   30                 137                      83
data.describe()
       CustomerID         Age  Annual Income (k$)  Spending Score (1-100)
count  200.000000  200.000000          200.000000              200.000000
mean   100.500000   38.850000           60.560000               50.200000
std     57.879185   13.969007           26.264721               25.823522
min      1.000000   18.000000           15.000000                1.000000
25%     50.750000   28.750000           41.500000               34.750000
50%    100.500000   36.000000           61.500000               50.000000
75%    150.250000   49.000000           78.000000               73.000000
max    200.000000   70.000000          137.000000               99.000000
- The average customer age is approximately 39 years (mean 38.85), and the median age (36) is close to the average
- The average customer income is about $60K (mean 60.56), and the median income ($61.5K) is also close to the average
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 5 columns):
 #   Column                  Non-Null Count  Dtype
---  ------                  --------------  -----
 0   CustomerID              200 non-null    int64
 1   Gender                  200 non-null    object
 2   Age                     200 non-null    int64
 3   Annual Income (k$)      200 non-null    int64
 4   Spending Score (1-100)  200 non-null    int64
dtypes: int64(4), object(1)
memory usage: 7.9+ KB
- No null values are present in the data set
- Only one categorical variable is present: Gender
data.drop("Gender",axis=1).corr()
                        CustomerID       Age  Annual Income (k$)  Spending Score (1-100)
CustomerID                1.000000 -0.026763            0.977548                0.013835
Age                      -0.026763  1.000000           -0.012398               -0.327227
Annual Income (k$)        0.977548 -0.012398            1.000000                0.009903
Spending Score (1-100)    0.013835 -0.327227            0.009903                1.000000
- Age and annual income are essentially uncorrelated (r ≈ -0.01)
- Age and spending score show a moderate negative correlation (r ≈ -0.33)
Exploratory Data Analysis:
# Distribution of annual income
# (sns.distplot is deprecated in recent seaborn versions; histplot with kde=True is its replacement)
plt.figure(figsize=(12,8))
sns.histplot(data['Annual Income (k$)'], kde=True)
plt.title("Distribution of Annual Income (k$)",fontsize=20)
plt.xlabel('Range of Annual Income (k$)')
plt.ylabel('Count')
# Distribution of age
plt.figure(figsize=(12,8))
sns.histplot(data['Age'], kde=True)
plt.title('Distribution of Age', fontsize = 20)
plt.xlabel('Range of Age')
plt.ylabel('Count')
# Distribution of spending score
plt.figure(figsize=(12,8))
sns.histplot(data['Spending Score (1-100)'], kde=True)
plt.title('Distribution of Spending Score (1-100)', fontsize = 20)
plt.xlabel('Range of Spending Score (1-100)')
plt.ylabel('Count')
# Check Gender dataset
data_gender = data.Gender.value_counts()
plt.figure(figsize=(8,6))
sns.barplot(x=data_gender.index,y=data_gender.values)
plt.xlabel("Gender")
plt.ylabel("No of People")
plt.title("Customer Comparison based on Gender")
- There are more female than male customers, by approximately 20%
Feature Engineering & selection:
data_AnnualIncome = data[['Annual Income (k$)','Spending Score (1-100)']]
data_AnnualIncome
     Annual Income (k$)  Spending Score (1-100)
0                    15                      39
1                    15                      81
2                    16                       6
3                    16                      77
4                    17                      40
...                 ...                     ...
195                 120                      79
196                 126                      28
197                 126                      74
198                 137                      18
199                 137                      83

200 rows × 2 columns
plt.figure(figsize=(8,6))
sns.scatterplot(x='Annual Income (k$)',y='Spending Score (1-100)',data=data_AnnualIncome,s=60)
plt.xlabel('Annual Income (k$)')
plt.ylabel('Spending Score (1-100)')
plt.title('Spending Score (1-100) vs Annual Income (k$)')
- Based on annual income and spending score, the customers appear to fall into 5 clusters
data_Age = data[['Age','Spending Score (1-100)']]
data_Age
     Age  Spending Score (1-100)
0     19                      39
1     21                      81
2     20                       6
3     23                      77
4     31                      40
...  ...                     ...
195   35                      79
196   45                      28
197   32                      74
198   32                      18
199   30                      83

200 rows × 2 columns
plt.figure(figsize=(8,6))
sns.scatterplot(x='Age',y='Spending Score (1-100)',data=data_Age,s=60)
plt.xlabel('Age')
plt.ylabel('Spending Score (1-100)')
plt.title('Spending Score (1-100) vs Age')
- Customers aged 20 to 40 have the highest spending scores, while customers over 40 tend to spend less
- Based on the above analysis, we will use annual income and spending score for customer segmentation
# To keep the code short, store data_AnnualIncome in the variable X
X= data_AnnualIncome
Model Creation:
# Importing KMeans from sklearn
from sklearn.cluster import KMeans
# Fit K-means for K = 1..10 and record the within-cluster sum of squares (WCSS)
wcss=[]
for i in range(1,11):
    model = KMeans(n_clusters=i)
    model.fit(X)
    wcss.append(model.inertia_)
# Making elbow curve
plt.figure(figsize=(8,6))
plt.plot(range(1,11),wcss,linewidth=2,color='green',marker='o')
plt.xticks(np.arange(1,11))
plt.xlabel("K Value")
plt.ylabel("WCSS")
plt.show()
- Based on the elbow curve above, we take K = 5, which means 5 clusters
# Taking K = 5, let's create our final model
# Note: no random_state is set, so the cluster numbering can differ between runs
model_final = KMeans(n_clusters=5)
model_final.fit(X)
label=model_final.predict(X)
label
array([4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 0, 4, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1, 2, 0, 2, 1, 2, 1, 2, 0, 2, 1, 2, 1, 2, 1, 2, 1, 2, 0, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2], dtype=int32)
data['label']=label
data.head()
   CustomerID  Gender  Age  Annual Income (k$)  Spending Score (1-100)  label
0           1    Male   19                  15                      39      4
1           2    Male   21                  15                      81      3
2           3  Female   20                  16                       6      4
3           4  Female   23                  16                      77      3
4           5  Female   31                  17                      40      4
#Scatterplot of the clusters
plt.figure(figsize=(10,6))
sns.scatterplot(x = 'Annual Income (k$)',y = 'Spending Score (1-100)',hue="label",
palette=['green','orange','brown','dodgerblue','red'], legend='full',data = data ,s = 60 )
plt.xlabel('Annual Income (k$)')
plt.ylabel('Spending Score (1-100)')
plt.title('Spending Score (1-100) vs Annual Income (k$)')
plt.show()
Model Evaluation:
from sklearn.metrics import silhouette_score
silhouette_score(X,label)
0.553931997444648
- The silhouette score ranges from -1 to 1, and higher is better; a value of about 0.55 indicates reasonably well-separated clusters, so we can say it is a good model
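As a complement to the elbow curve, the silhouette score can also be compared across several values of K. The sketch below uses synthetic blobs as a stand-in (not the mall data), and the names `X_demo`, `scores_by_k`, and `best_k` are introduced only for this example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in data: four well-separated groups at assumed centres.
X_demo, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [6, 0], [0, 6], [6, 6]],
    cluster_std=0.6,
    random_state=7,
)

# Fit K-means for K = 2..8 and record the silhouette score for each K.
scores_by_k = {}
for k in range(2, 9):
    labels_k = KMeans(n_clusters=k, n_init=10, random_state=7).fit_predict(X_demo)
    scores_by_k[k] = silhouette_score(X_demo, labels_k)

# The K with the highest silhouette score is a reasonable candidate.
best_k = max(scores_by_k, key=scores_by_k.get)
print(scores_by_k)
print("best K by silhouette:", best_k)
```

On the mall data, the same loop could be run with `X` in place of `X_demo` to cross-check the choice of 5 clusters.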
Customer Segmentation:
# Print the size and the customer IDs of each of the 5 clusters
for cluster_id in range(5):
    customer_type = data[data['label']==cluster_id]
    name = f"type{cluster_id + 1:02d}"
    print(f"Number of people in {name} is ",len(customer_type))
    print("Their customer ids are: ",customer_type['CustomerID'].values)
    print("===============================================================")
print("""WOW!!! , We have done our Customer Segmentation,
we will give these list to marketing team to send customised promotion...""")
Number of people in type01 is 81
Their customer ids are: [ 44 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81
 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117
 118 119 120 121 122 123 127 133 143]
===============================================================
Number of people in type02 is 35
Their customer ids are: [125 129 131 135 137 139 141 145 147 149 151 153 155 157 159 161 163 165
 167 169 171 173 175 177 179 181 183 185 187 189 191 193 195 197 199]
===============================================================
Number of people in type03 is 39
Their customer ids are: [124 126 128 130 132 134 136 138 140 142 144 146 148 150 152 154 156 158
 160 162 164 166 168 170 172 174 176 178 180 182 184 186 188 190 192 194
 196 198 200]
===============================================================
Number of people in type04 is 22
Their customer ids are: [ 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 46]
===============================================================
Number of people in type05 is 23
Their customer ids are: [ 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45]
===============================================================
WOW!!! , We have done our Customer Segmentation,
we will give these list to marketing team to send customised promotion...
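Lists of customer IDs are easier to act on when each cluster also has a short profile. The sketch below computes the average income, average spending score, and size of each cluster. The helper name `profile_clusters` and the tiny demo frame are made up for illustration; the column names follow the data set used above.

```python
import pandas as pd


def profile_clusters(frame, label_col="label"):
    """Return mean income, mean spending score, and size for each cluster."""
    cols = ["Annual Income (k$)", "Spending Score (1-100)"]
    profile = frame.groupby(label_col)[cols].mean().round(1)
    profile["customers"] = frame.groupby(label_col).size()
    return profile


# Tiny made-up frame with the same column names as the mall data.
demo = pd.DataFrame(
    {
        "Annual Income (k$)": [20, 22, 90, 94],
        "Spending Score (1-100)": [80, 76, 15, 19],
        "label": [0, 0, 1, 1],
    }
)
demo_profile = profile_clusters(demo)
print(demo_profile)
```

In the notebook, calling `profile_clusters(data)` after the `label` column has been added would give one such row per cluster.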
You can also view this notebook on Kaggle at the link below:
https://www.kaggle.com/code/masudranaiba/consumer-driven-marketing-customer-segmentation/