Time Series of Counts Data in Marketing
Time series of counts (TSC) arise whenever countable, time-dependent observations can be made about a phenomenon. For example, a fisherman's daily catches, a hospital's weekly swine flu admissions, and a salesman's monthly used-car sales are all instances of TSC. Count data are abundant in the study of consumer behavior. Observations such as a consumer's product purchase patterns, online click streams, store visits, and rate of product consumption are examples of count data and can be analyzed for marketing purposes. In fact, in creating a company's marketing mix or marketing strategy, managers rely heavily on relevant, accurate, and timely (RAT) information about consumer behavior [1]. In an increasingly technology- and information-dependent society, companies scramble to acquire RAT information in order to understand the new, emerging demands of consumers and to gain and keep a competitive edge.
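To make the notion of a TSC concrete, the following minimal sketch turns a list of purchase dates for one consumer into a weekly count series. The function name `weekly_counts`, the 7-day binning, and the purchase dates are illustrative assumptions, not part of the data used in this study.

```python
from collections import Counter
from datetime import date


def weekly_counts(purchase_dates, start, n_weeks):
    """Count events in each of n_weeks consecutive 7-day bins beginning at start."""
    counts = Counter()
    for d in purchase_dates:
        week = (d - start).days // 7  # index of the 7-day bin containing d
        if 0 <= week < n_weeks:       # ignore events outside the observation window
            counts[week] += 1
    # Weeks with no events are reported as zero counts.
    return [counts[w] for w in range(n_weeks)]


# Hypothetical purchase dates for a single consumer.
purchases = [
    date(2024, 1, 1), date(2024, 1, 3), date(2024, 1, 10),
    date(2024, 1, 24), date(2024, 1, 25), date(2024, 1, 27),
]
series = weekly_counts(purchases, start=date(2024, 1, 1), n_weeks=4)
print(series)
```

Each element of `series` is a non-negative integer indexed by time, which is the structure that distinguishes a TSC from a continuous-valued time series.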
Perhaps the most important use of consumer behavior data is market segmentation, in which consumers are identified and grouped based on their distinctive traits and characteristics. Market segmentation begins with the realization that consumer demand for goods and services is heterogeneous and that the heterogeneous market is made up of “a number of smaller homogeneous markets, in response to differing preferences, attributable to the desires of consumers for more precise satisfaction of their varying wants” [9]. Market segmentation, if conducted correctly and effectively, is an extremely powerful and informative tool for managers, since segmentation analysis can reveal both the size and the traits of the consumer groups, for which appropriate group-specific marketing strategies can then be devised. Correct and relevant market segmentation is critical for businesses, since incorrect evaluations of consumer groups and their characteristics lead to an unproductive marketing mix and wasted resources. Numerous publications and books have been written on effective methods of consumer segmentation. For a survey and review of market segmentation methods, see [11].
Current Market Segmentation Methods
A review of the relevant literature makes it evident that the statistical process of cluster analysis lies at the core of market segmentation. Although classifying consumers into a preset number of groups with already identified traits is also common in segmentation, clustering offers the advantages of flexibility and accuracy because the number of groups, their sizes, and their characteristics are data-driven.
Consequently, clustering models and methodology for marketing research have developed substantially over the past 60 years. During the initial stages of market segmentation research in the 1950s, the proposed statistical models depended heavily on existing Operations Research and Management Science methods. These models were either too complicated to implement in practice or too unrealistic to represent real-world situations accurately. As researchers gained more computing power, models became more realistic and implementable [7]. However, current methods still have shortcomings. First, many segmentation models have difficulty capturing the relationship between the exogenous variables (consumer traits) and the response variables (e.g., purchase frequencies and profits) in a segment [3]. Second, even when a model successfully relates segment traits to the response, the time-dependent nature of consumer behavior is often ignored. Finally, many models involve only one dependent variable. Consider an analysis of printer purchases among different groups of consumers. Such a narrowly defined analysis may be sufficient for, say, a printer manufacturer. For an electronics store manager deciding how to stock, shelve, and promote printers, however, it is helpful to extend the analysis to closely related products (printing paper, scanners, toner, and so on), since consumers who buy printers are likely to be interested in these products and vice versa. Thus, a multivariate segmentation approach can be more informative and useful.
Model-Based Clustering
A novel model-based method of clustering TSC was proposed by Thomas, Ray and Ensor [10], who applied it to Houston air pollution monitoring data. In that study, air quality monitoring stations, each represented by a time series of pollution readings, were clustered to identify regions of the city with similar pollution patterns. The results were promising and motivate the use of model-based clustering (MBC) in other applications. In this study, our goal is to address the shortcomings of current market segmentation methods by applying MBC to consumer purchase data.
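As a rough illustration of what clustering count series with a probability model can look like, the sketch below fits a mixture of Poisson distributions by expectation-maximization (EM) and assigns each series to its most probable component. This is a generic textbook technique and is not claimed to be the method of [10]. It rests on a strong simplifying assumption, namely that the counts within a series are independent Poisson draws with a single cluster-specific rate, so time dependence is ignored. The function name `poisson_mixture_em` and the six consumer series are hypothetical.

```python
import math


def poisson_mixture_em(series, k, n_iter=100):
    """Cluster count series with a k-component Poisson mixture fitted by EM.

    Assumption (illustrative only): counts within a series are i.i.d. Poisson
    with a cluster-specific rate, so temporal dependence is not modeled.
    """
    n = len(series)
    totals = [sum(s) for s in series]    # sufficient statistic of each series
    lengths = [len(s) for s in series]   # number of time points in each series

    # Deterministic start: pick initial rates from spread-out quantiles of the series means.
    means = sorted(t / m for t, m in zip(totals, lengths))
    rates = [max(means[(2 * j + 1) * n // (2 * k)], 1e-6) for j in range(k)]
    weights = [1.0 / k] * k

    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each series.
        resp = []
        for t, m in zip(totals, lengths):
            # Log-likelihood up to a term that is constant across clusters.
            logp = [math.log(weights[j]) + t * math.log(rates[j]) - m * rates[j]
                    for j in range(k)]
            top = max(logp)
            p = [math.exp(v - top) for v in logp]  # subtract max for numerical stability
            z = sum(p)
            resp.append([v / z for v in p])

        # M-step: update mixing weights and Poisson rates.
        for j in range(k):
            rj = sum(r[j] for r in resp)
            weights[j] = max(rj / n, 1e-12)        # floor avoids log(0) in the next E-step
            num = sum(r[j] * t for r, t in zip(resp, totals))
            den = sum(r[j] * m for r, m in zip(resp, lengths))
            if den > 0:
                rates[j] = max(num / den, 1e-6)

    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return labels, rates, weights


# Hypothetical weekly purchase counts for six consumers.
consumers = [
    [0, 1, 0, 2, 1, 0], [1, 0, 1, 1, 0, 2], [0, 0, 1, 0, 1, 1],
    [5, 7, 6, 8, 5, 6], [7, 6, 9, 5, 8, 7], [6, 8, 5, 7, 6, 9],
]
labels, rates, weights = poisson_mixture_em(consumers, k=2)
print(labels, [round(r, 2) for r in rates])
```

In this toy setting the fitted rates summarize each segment's purchase intensity, and the number of components `k` is fixed in advance. A model that addresses the shortcomings discussed above would additionally need to account for serial dependence and multiple response variables.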
Outline for the Remainder of the Discussion
We explain MBC in more detail in Section 2. The data are described in Section 3. Ongoing work and project implications are presented in Sections 4 and 5, respectively. Finally, directions for future research are laid out in Section 6.











