How to build a customer segmentation technically?


I want to segment my customers

Build your customer segmentation easily just by uploading a CSV with our self-service analytics platform

61 days full access for only $75

See how it works

You will mind in this post some information on how to build a customer segmentation from a technical standpoint. Which table? Where to start to build a customer segmentation? Should I use Internal or external data? To which frequency should you update your segmentation and so on. Before jumping into the reading of this technical post about segmentation, you might want to read our central guide regarding everything you need to know about customer segmentation. We have covered the topic extensively and you might find some valuable information. 

Start by building a customer table at a customer ID level

No segmentation without programming. SQL rules.

You don't have many options when it comes to build your segmented customer database. The usage of SQL code and the storage of the data into a table in a database is mandatory. You don't want to build it on excel or use a CSV. Databases and tables are made for that and the usage of a table dedicated to it wille be the best options. 

Using programmation to build your customer segmentation will also allow you to (1) automate the updates and batches, (2) work according to ACID principles (Atomicity, Consistency, Isolation and Durability), (3) and fetch the data directly from APIs and other databases. 

Building a customer segmentation will require some support from your IT department and your data science department.

Deduplicate and consolidate your customer information

You need to collect all your customers, at a customer ID level. This implies that you have cleaned and deduplicated your customer information first. You need to deduplicate the addresses, first names and last names, make sure you don't have some fake/test accounts. 

You might also have prospect information into your salesforce or CRM databases. Use them to consolidate prospects + actual customers. Your goal is to promote your business and drive sales so don't miss this opportunity to get a bigger customer database.


Upload a CSV now and score your customer using the RFM (Recency, Frequency, Monetary) methodology

See how it works

Design multiple intermediary tables

Once you have done this customer ID consolidation work, you need to design and create multiple separated sub-tables. Each sub-table will serve a different goal and gather different set of information.

Behavioral tables - e.g. RFM 

In this table, you might want to get for each customer their recency, frequency and monetary data. This table should reflect checkout behaviours and get checkout related information. You could add to the RFM segmentation, the following pieces of information to make this customer table truely behavioural and capture good customer insight:

  • date of first transaction
  • date of last transaction
  • device used during last transaction
  • number of pages viewed during last transaction
  • most purchased product category
  • number of claims and returns
  • percentage of product returned since activation
  • cumulated number of vouchers redeemed during checkout
  • number of items purchased with a price reduction
  • number of connections to the website over the past 365 days
  • number of items purchased per transaction on average
  • Lifetime value and gross margin information over the past 12 months
  • number of friends refered
  • is member to your customer fidelization program : yes / no

Demographic information

External databases can work but might be sensitive in terms of data privacy (especially at the time of the European GDPR) and expensive to acquire. You should first rely on your internal databases. You might want to create a table with demographic information about your customers:

  • country of residence
  • city of residence
  • language spoken
  • postcode
  • proximity with a beach / mountain
  • date of birth
  • man / woman
  • households size - if you know the customer purchased items for kids for instance.

This information can be collected during the checkout when the customer need to enter their delivery details. Some information can be enriched: you know that a customer lives in Miami? Then assuming the proximity with the beach is fair to do. This can be convenient if you sell solar cream.

When customers don't want to be contacted...

You might have during the checkout or sign-up of your customer a checkbox where they tell you whether or not they would like to be contacted. Sometimes, your customers indicate their preferred channel to be contacted... or not. This information needs to be stored into a separate and dedicated tables. Last thing you want to do is to upset your customers by a wrong communication.

These prefered communication channel can change throughout time. This table should reflect it and get data directly from your production database.

You should also be compliant and apply the GDPR legislation.

Aggregate all the customer data into one central table

One central customer table to rule them all

This customer table will have :

  • All the customer IDs identified in step 1
  • All the behavioral information
  • All the demographical information
  • A timestamp update to store the historical evolution of your customers.
  • A clever combination of the fields

You should combine all this data and create 'clever flags'

An example is better than a thousand words.

Let's say you sell shoes online. Your customer purchased 3 pair of boots over the past 360 days? You might want to create a 'boot lover' flag to later promote your new range of boots in exclusivity or track this segment of customer behaviour. Or you have some customer who purchased product in 3 or more categories over the past 12 months? You should create a "is an explorator yes/no " flag. And so on as long as this allows you to increase your knowledge about your customers. 

You should combine both demographic and behavioral information into 'aggregated flags'. You know your customer is living in a city of more than 1 million inhabitants, purchased expensive items multiple times a year, purchased always after 8PM and used the latest iPhone model at checkout? You could build a "high income" category. This would need some tests, but you see the point. 

If you manage to do so, your customer database will be far more than a collection of ID, but a true and unique asset you can exploit to drive your activity and become a true customer champion.

Build a table to store the touch points

You will use this information for:

  1. Promoting your product and services
  2. Increase your knowledge about your customers and track performance

Regarding the first point, you don't want to push 3 newsletters to a customer every week. Hence, you will need to build a table to store some very important information for each customer ID:

  • Date of last contact
  • Channel used last time to reach out to the customer
  • Date when the customer can be reached next time

Refresh your customer segmentation every month

You don't need or want to refresh your customer segmentation every day or every week. Every month is a good rythm to capture the latest trends at a customer ID level. 

If you keep track and store each customer segmentation table update, you will be able to understand how your customer database is evolving throughout time.You should then build reportings and dashboards to track the evolution and deepen your knowledge about your customers. You might also want to assign product manager around the segment of customers you built. 

Our self-service platform can help your build your RFM segmentation... you should have a look at it!

This post was rated: 0 / 5 by 0 readers
You rated: Not rated yet... but please be kind!

I want to segment my customers

We develop self-service analytics solution allowing to build a customer scoring based on their behaviours. Discover our products and our affordable pricing

61 days full access for only $75

See how it works