Cluster analysis (in marketing)

Cluster analysis is a class of statistical techniques that can be applied to data that exhibit “natural” groupings. It sorts through the raw data and groups the cases into clusters. A cluster is a group of relatively homogeneous cases or observations: objects in a cluster are similar to each other and dissimilar to objects outside it, particularly objects in other clusters.

The diagram below illustrates the results of a survey that studied drinkers’ perceptions of spirits (alcohol). Each point represents the results from one respondent. The research indicates there are four clusters in this market.

Illustration of clusters

Another example is the vacation travel market. Recent research has identified three clusters or market segments: 1) the demanders - they want exceptional service and expect to be pampered; 2) the escapists - they want to get away and just relax; 3) the educationalists - they want to see new things, go to museums, go on a safari, or experience new cultures.

Cluster analysis, like factor analysis and multidimensional scaling, is an interdependence technique: it makes no distinction between dependent and independent variables, and the entire set of interdependent relationships is examined. It is similar to multidimensional scaling in that both examine inter-object similarity across the complete set of interdependent relationships. The difference is that multidimensional scaling identifies underlying dimensions, while cluster analysis identifies clusters. Cluster analysis is the obverse of factor analysis. Whereas factor analysis reduces the number of variables by grouping them into a smaller set of factors, cluster analysis reduces the number of observations or cases by grouping them into a smaller set of clusters.

Table of contents
1 In marketing, cluster analysis is used for:
2 The basic procedure is:
3 Clustering Procedures

In marketing, cluster analysis is used for:

The basic procedure is:

  1. Formulate the problem - select the variables to which the clustering technique will be applied
  2. Select a distance measure - various ways of computing distance:
    • Squared Euclidean distance - the sum of the squared differences in value for each variable (its square root is the ordinary Euclidean distance)
    • Manhattan distance - the sum of the absolute differences in value for each variable
    • Chebychev distance - the maximum absolute difference in values for any variable
  3. Select a clustering procedure (see below)
  4. Decide on the number of clusters
  5. Map and interpret clusters - draw conclusions - illustrative techniques like perceptual maps, icicle plots, and dendrograms are useful
  6. Assess reliability and validity - various methods:
    • repeat the analysis using a different distance measure
    • repeat the analysis using a different clustering technique
    • split the data randomly into two halves and analyze each half separately
    • repeat the analysis several times, deleting one variable each time
    • repeat the analysis several times, using a different order of cases each time
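
The distance measures listed in step 2 can be illustrated with a minimal Python sketch. The function names and the two respondents' ratings are made up for illustration and do not come from any survey mentioned in this article; each respondent is assumed to be a list of numeric ratings on the same variables.

```python
from math import sqrt

def squared_euclidean(a, b):
    """Sum of the squared differences for each variable (no square root)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean(a, b):
    """Square root of the squared Euclidean distance."""
    return sqrt(squared_euclidean(a, b))

def manhattan(a, b):
    """Sum of the absolute differences for each variable."""
    return sum(abs(x - y) for x, y in zip(a, b))

def chebychev(a, b):
    """Largest absolute difference on any single variable."""
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical ratings from two respondents on three variables.
respondent_a = [1, 5, 3]
respondent_b = [4, 1, 3]

print(squared_euclidean(respondent_a, respondent_b))  # 25
print(euclidean(respondent_a, respondent_b))          # 5.0
print(manhattan(respondent_a, respondent_b))          # 7
print(chebychev(respondent_a, respondent_b))          # 4
```

The same pair of respondents is 25, 7 or 4 units apart depending on the measure, which is one reason why repeating the analysis with a different distance measure is a useful reliability check.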

Clustering Procedures

There are several types of clustering methods:

See also: marketing, marketing research, factor analysis, multidimensional scaling, quantitative marketing research, positioning, perceptual mapping