Personalization Technologies: A Process-Oriented Perspective

Gediminas Adomavicius
Information and Decision Sciences
Carlson School of Management
University of Minnesota
gedas@umn.edu

Alexander Tuzhilin
Information, Operation and Management Sciences
Stern School of Business
New York University
atuzhili@stern.nyu.edu

1. Introduction

Over the past several years, there has been much work done in personalization focusing on the

development of new technologies, understanding personalization from the business point of

view, and developing novel personalization applications [CACM00]. Since personalization

constitutes a young and rapidly developing field, academics and practitioners still hold different

views on what personalization is. In this article we synthesize these

various points of view and describe personalization technologies from a process-oriented

perspective.

Industry practitioners and academic researchers have made several attempts to define

personalization. Some representative definitions include:

• “Personalization is the ability to provide content and services that are tailored to individuals based

on knowledge about their preferences and behavior” [“Smart Personalization,” Forrester Report by

Paul Hagen, 1999].

• “Personalization is the combined use of technology and customer information to tailor electronic

commerce interactions between a business and each individual customer. Using information either

previously obtained or provided in real-time about the customer and other customers, the exchange

between the parties is altered to fit that customer's stated needs so that the transaction requires less

time and delivers a product best suited to that customer” [www.personalization.com].

• “Personalization is the capability to customize communication based on knowledge preferences

and behaviors at the time of interaction” [Jill Dyche, CRM Handbook, Addison-Wesley, 2002].

• “Personalization is about building customer loyalty by building a meaningful one-to-one

relationship; by understanding the needs of each individual and helping satisfy a goal that

efficiently and knowledgeably addresses each individual’s need in a given context” [Doug

Riecken, “Personalized Views of Personalization,” Communications of the ACM, 43(8), 2000].

These definitions cover various aspects of personalization, and several important features of

personalization emerge from them. In particular, these definitions state collectively that

personalization tailors certain offerings (e.g., content, services, product recommendations,

communications, and e-commerce interactions) by providers (e.g., e-commerce Web sites) to the

consumers of these offerings (e.g., customers, visitors, users, etc.) based on knowledge about

them with certain goal(s) in mind. In the rest of this section, we will elaborate on these concepts.

Personalization Participants. As stated above, personalization takes place between one or

several providers of personalized offerings and one or several consumers of these offerings, such

as customers, users, and Web site visitors. Personalized offerings can be delivered from

providers to consumers by personalization engines in three ways presented in Fig. 1. In these

diagrams, providers and consumers of personalized offerings are denoted by white boxes,

personalization engines by grey boxes, and the interactions between consumers and providers by

lines. Fig. 1(a) presents a provider-centric personalization approach that assumes that each

provider has its own personalization engine that tailors the provider’s content to its consumers.

This is the most common approach to personalization, as popularized by Amazon.com. The

second approach, presented in Fig. 1(b), is a consumer-centric approach, which assumes that

each consumer has its own personalization engine (or agent) that “understands” this particular

consumer and provides personalization services based on this knowledge. This type of

consumer-centric personalization, delivered across a broad range of providers and offerings, is

called an e-Butler service [AT02]. Rudimentary e-Butler services are provided by such Web

sites as Gator.com, MySimon.com and MyGeek.com. The third approach, presented in Fig. 1(c),

is a market-centric approach that provides personalization services for a marketplace in a certain

industry or sector. In this case, the personalization engine performs the role of an infomediary

[GT01] by knowing the needs of the consumers and the providers’ offerings and trying to match

the two parties in the best ways according to their internal goals. While this approach has not

been extensively used before, it has a high potential, especially because of the proliferation of

electronic marketplaces, such as Covisint [www.covisint.com] for the automobile industry.

[Figure 1 shows three diagrams of Consumers and Providers: (a) Provider-centric, (b) Consumer-centric, (c) Market-centric.]

Figure 1. Classification of Personalization Approaches.

What Is Being Personalized? The result of personalization is the delivery of various offerings

to consumers by the personalization engine(s) on behalf of the providers using any of the

approaches shown in Fig. 1. Examples of the personalized offerings include

• personalized content, such as personalized Web pages and links;

• product and service recommendations, e.g., for books, CDs and vacations;

• personalized email;

• personalized information searches;

• personalized (dynamic) prices;

• personalized products for individual customers, such as custom-made CDs.

Personalization Goals. The personalization objectives usually are multifaceted. They may

range from simply improving the consumer’s browsing and shopping experience (e.g., by

presenting only the content that is relevant to the consumer) to much more complex objectives,

such as building long-term relationships with consumers, improving consumer loyalty, and

generating a measurable value for the company. Currently, the most commonly used metrics are

accuracy metrics that measure how much the consumer liked a specific personalized offering, e.g.,

how accurate the recommendation was [BS97, Paz99]. Although important, accuracy metrics

are fairly simplistic and do not reflect the “bigger picture”, i.e., to what extent the more complex

personalization objectives have been met. For this purpose, there is a need for more

sophisticated metrics, such as the consumer lifetime value, consumer loyalty value, purchasing

and consumption experiences, and other ROI-based metrics [CS00, RNE+02].

Consumer Knowledge. Successful personalization depends to a very large extent on the

knowledge about personal preferences and behavior of the consumers that is usually distilled

from the large volumes of granular information about the consumers and stored in consumer

profiles [Paz99, AT01]. We will cover this topic in Section 3.

The definitions listed above collectively cover several major points about personalization.

However, the point that personalization constitutes an iterative process that takes place over time

has not been sufficiently addressed before, and we describe it in the next section.

2. Personalization Process

Personalization constitutes an iterative process that can be defined by the Understand-Deliver-

Measure cycle taking place in time and consisting of the following stages shown in Fig. 2:

• Understand consumers by collecting comprehensive information about them and

converting it into actionable knowledge stored in consumer profiles.

• Deliver personalized offerings based on the knowledge about each consumer, as stored in

the consumer profiles. The personalization engine must be able to find the most relevant

offerings and deliver them to the consumer.

• Measure personalization impact by determining how much the consumer is satisfied with

the delivered personalized offerings. This measurement provides information that can enhance our

understanding about consumers or point out the deficiencies of the methods for

personalized delivery. Therefore, this additional information serves as feedback for

possible improvements to each of the other components of the personalization process. This

feedback information completes one cycle of the personalization process, and sets the

stage for the next cycle where improved personalization techniques can make better

personalization decisions.

The technical implementation of the Understand-Deliver-Measure cycle consists of the six

stages presented in Fig. 2 and described below.

Data collection. The personalization process begins with collecting data across different

channels of interaction between consumers and providers (e.g., Web, phone, direct mail, and

other channels) and from various other heterogeneous data sources. Such data can be solicited

explicitly (e.g., via surveys) or tracked implicitly and may include histories of consumers’

purchasing and searching activities, as well as demographic and psychographic information. The

objective is to obtain the most comprehensive “picture” of a consumer. After the data is

collected, it is usually processed, cleaned, and stored in a consumer-oriented data warehouse.

Building consumer profiles. A key issue in developing personalization applications is

constructing accurate and comprehensive consumer profiles based on the collected data. We

discuss the techniques for building consumer profiles in more detail in Section 3.

Matchmaking. Personalization systems must match appropriate content and services to

individual consumers. There are many matchmaking technologies including user-specified rule-

based content delivery systems, e.g., as deployed by BroadVision [http://www.broadvision.com]

and ATG [http://www.atg.com], statistics-based predictive approaches, and recommender

systems. However, in this paper we will focus on recommender systems technologies because of

the space limitation and because they represent the most developed matchmaking technologies

applicable to various types of personalized offerings. We will describe them further in Section 3.

[Figure 2 shows the six stages grouped into three phases and connected by a feedback loop. Understand the Consumer: Data Collection, Building Consumer Profiles. Deliver Personalized Offering: Matchmaking, Delivery and Presentation. Measure Impact of Personalization: Measuring Personalization Impact, Adjusting Personalization Strategy.]

Figure 2. Personalization process.

Delivery and presentation. E-companies deliver personalized information to consumers in

several ways. One classification of delivery methods is pull, push, and passive [SKR01]. Push

methods reach a consumer who is not currently interacting with the system, e.g., by sending an

email message. Pull methods notify consumers that personalized information is available but

display this information only when the consumer explicitly requests it. Passive delivery displays

personalized information as a by-product of other activities of the consumer. For example, while

looking at a product on a Web site, a consumer also sees recommendations for related products.

The system can present personalized information in various forms: narrative, a list ordered by

relevance, an unordered list of alternatives, or various types of visualization.

Measuring personalization impact. As was explained before, various accuracy metrics as well

as consumer lifetime value, loyalty value, and purchasing and consumption experience can be

used to evaluate the effectiveness of personalization. The quality of recommendations, as

measured by these metrics, depends on the sophistication of technologies deployed in the

previous four stages in Fig. 2.

Adjusting personalization strategy. As Fig. 2 shows, measuring personalization impact serves

as feedback for possible improvements to each of the other five steps of the personalization

process. This feedback should be used to decide whether to collect additional data, build better

user profiles, develop better matchmaking algorithms, improve information delivery and

presentation, or to use additional measures of personalization impact. If this feedback is properly

integrated in the personalization process, the quality of interactions with individual consumers,

as measured by the metrics discussed above, should grow over time resulting in the virtuous

cycle of personalization. If the feedback is not properly integrated in the personalization process,

then the metrics can decrease over time producing the effect of de-personalization, when the

consumer is getting frustrated with the personalization system and stops using it. The de-

personalization effect is largely responsible for failures of some of the personalization projects

reported in the literature [PR01]. Therefore, one of the main challenges of personalization is the

ability to achieve the virtuous cycle of personalization and not fall into the de-personalization

trap. From the algorithmic sophistication perspective, the technologies that contribute the most

to this goal are the profiling and the matchmaking technologies. We describe them and their

interaction in the next section.

3. Profiling and Matchmaking Technologies

One of the key issues in developing personalization applications is the problem of constructing

accurate and comprehensive profiles of individual consumers. Such profiles should provide the

most relevant information describing who the consumers are and how they behave.

Traditionally, consumer profiles consist of simple factual information. For example, this

information may include the consumer’s demographics, such as name, gender, date of birth, and

address, or be derived from the past transactions of a consumer, such as the largest purchase

value made at a Web site. This factual profile information is usually defined as a record of

values and stored in a relational database, one record per consumer.

In addition to the factual information, [AT01] considers profiles that capture more complex

behavioral information of consumers. These profiles are modeled using such techniques as

• Conjunctive rules. For example, the rule “John Doe prefers to see action movies on

weekends” (i.e., Name = “John Doe” & MovieType = “action” → TimeOfWeek =

“weekend”) can be a part of John Doe’s profile that describes his movie viewing habits

[AT01]. Such rules can be learned from the transactional history of the consumer using

various data mining techniques, including association and classification rule discovery

methods [HMS01].

• Sequences, such as sequences of Web browsing activities. For example, we may want to

store in John Doe’s profile his typical browsing sequence “when John Doe visits the book

Web site XYZ, he usually first accesses the home page, then goes to the

Home&Gardening section of the site, then browses the Gardening section and then

leaves the Web site” (i.e., XYZ: StartPage → Home&Gardening → Gardening → Exit).

Such sequences can be learned from the transactional histories of consumers using

frequent episodes and other sequence learning methods [HMS01].

• Signatures, i.e., the data structures that are used to capture the evolving behavior learned

from large data streams of simple transactions [CFP+00]. For example, “top 5 most

frequently browsed product categories over the last 30 days” would be an example of a

signature that could be stored in individual consumer profiles in a Web store application.
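
The three behavioral representations above can be sketched as plain data structures. The following Python fragment is only an illustration: the class names, field names, and sample events are hypothetical and do not come from [AT01] or [CFP+00]. In particular, a real signature would typically be updated incrementally from a data stream, whereas this sketch simply recomputes it from stored events.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass(frozen=True)
class ConjunctiveRule:
    """A rule of the form: condition_1 & ... & condition_n -> consequent."""
    conditions: tuple  # (attribute, value) pairs that must all hold
    consequent: tuple  # a single (attribute, value) pair

    def fires(self, context: dict) -> bool:
        # The rule applies when every condition matches the given context.
        return all(context.get(attr) == val for attr, val in self.conditions)


@dataclass
class ConsumerProfile:
    name: str
    rules: list = field(default_factory=list)       # conjunctive rules
    sequences: list = field(default_factory=list)   # typical page sequences
    page_views: list = field(default_factory=list)  # (date, category) events

    def top_categories(self, today: date, days: int = 30, k: int = 5) -> list:
        """Signature: the k most browsed categories over the last `days` days."""
        cutoff = today - timedelta(days=days)
        recent = Counter(cat for day, cat in self.page_views if day >= cutoff)
        return [cat for cat, _ in recent.most_common(k)]


profile = ConsumerProfile(name="John Doe")
profile.rules.append(ConjunctiveRule(
    conditions=(("Name", "John Doe"), ("MovieType", "action")),
    consequent=("TimeOfWeek", "weekend"),
))
profile.sequences.append(["StartPage", "Home&Gardening", "Gardening", "Exit"])
profile.page_views += [
    (date(2003, 3, 1), "Gardening"),
    (date(2003, 3, 2), "Gardening"),
    (date(2003, 3, 2), "Cooking"),
    (date(2002, 12, 1), "Travel"),  # outside the 30-day window used below
]

print(profile.top_categories(today=date(2003, 3, 10)))
```

Keeping rules, sequences, and signatures side by side in one profile object mirrors the idea that an advanced profile holds several kinds of behavioral knowledge about the same consumer.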

In summary, all profiling approaches can be classified into simple ones, which support unstructured

factual information about the consumers (e.g., demographic information), and advanced ones, which

support the behavioral information about consumers expressed in the form of rules, sequences,

signatures, and other knowledge representation methods.

Besides profiling, delivering targeted content and services for the consumers is another crucial

aspect of personalization that depends significantly on the quality of the underlying

matchmaking technologies. There has been much research done on this subject, including rule-

based matchmaking, statistics-based predictive approaches, and recommender systems.

However, as was explained in Section 2, in this paper we will focus on recommender systems-

based matchmaking techniques.

In the context of recommender systems, matchmaking technologies are often classified into

broad categories according to their recommendation approach as well as their algorithmic

technique as described below.

Classification based on the recommendation approach [BS97]:

• Content-based recommendations: the consumer is recommended items (e.g., content,

services, products) similar to the ones the consumer preferred in the past. In other words,

content-based methods analyze the commonalities among the items the consumer has rated

highly in the past. Then, only the items that have high similarity with the consumer’s past

preferences would get recommended.

• Collaborative recommendations (or collaborative filtering): the consumer is recommended

items that people with similar tastes and preferences liked in the past. Collaborative methods

first find the closest peers for each consumer, i.e., the ones with the most similar tastes and

preferences. Then, only the items that are most liked by the peers would get recommended.

• Hybrid approaches: these methods combine collaborative and content-based methods. This

combination can be done in many different ways, e.g., separate content-based and

collaborative systems are implemented and their results are combined to produce the final

recommendations. Another approach would be to use content-based and collaborative

techniques in a single recommendation model, rather than implementing them separately.
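
The three approaches above can be illustrated with a small Python sketch. It rests on several assumptions that are not taken from [BS97]: items are described by keyword sets, content similarity is measured with the Jaccard coefficient, the consumer's peers have already been selected, and the hybrid is a simple weighted average of two separately computed scores. All function names and item names are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap of two keyword sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def content_scores(liked: list, candidates: list, item_keywords: dict) -> dict:
    """Content-based: score a candidate by its best match to a liked item."""
    return {
        c: max((jaccard(item_keywords[c], item_keywords[l]) for l in liked),
               default=0.0)
        for c in candidates
    }


def collaborative_scores(peers_liked: list, candidates: list) -> dict:
    """Collaborative: score a candidate by the share of peers who liked it."""
    if not peers_liked:
        return {c: 0.0 for c in candidates}
    return {
        c: sum(c in liked for liked in peers_liked) / len(peers_liked)
        for c in candidates
    }


def hybrid_scores(content: dict, collab: dict, weight: float = 0.5) -> dict:
    """Hybrid: combine the two separately computed scores."""
    return {c: weight * content[c] + (1 - weight) * collab[c] for c in content}


item_keywords = {
    "space_horror": {"sci-fi", "horror"},
    "space_action": {"sci-fi", "action"},
    "heist_film": {"crime", "action"},
    "paris_romance": {"romance", "comedy"},
}
liked = ["space_horror"]
candidates = ["space_action", "heist_film", "paris_romance"]
peers_liked = [{"heist_film", "space_action"}, {"heist_film"}]

content = content_scores(liked, candidates, item_keywords)
collab = collaborative_scores(peers_liked, candidates)
hybrid = hybrid_scores(content, collab)
ranking = sorted(hybrid, key=hybrid.get, reverse=True)
print(ranking)
```

In this toy data the content-based scores favor the item that shares a keyword with the consumer's history, the collaborative scores favor the item both peers liked, and the hybrid ranking reflects both signals.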

Classification based on the algorithmic technique [BHK98]:

• Heuristic-based techniques constitute heuristics that calculate recommendations based on the

previous transactions made by the consumers. An example of such a heuristic for a movie

recommender system could be to find the person whose taste in movies is the closest to mine,

and recommend me everything this person liked that I have not seen yet.

• Model-based techniques use the previous transactions to learn a model (usually using some

machine learning or statistical learning technique), which is then used to make

recommendations. For example, based on the movies that I have seen, a probabilistic model

is built to estimate the probability of how I would like each of the unseen movies.
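
The movie heuristic described above can be written down almost literally. The sketch below makes two assumptions the text does not specify: "closest taste" is measured with cosine similarity over rating vectors, and an item counts as "liked" when its rating is at least 4 on a 1 to 5 scale. The function names and the ratings are hypothetical. A model-based technique would instead fit a predictive model to the same ratings and is not shown here.

```python
from math import sqrt


def cosine(u: dict, v: dict) -> float:
    """Cosine similarity of two rating dictionaries (item -> rating)."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend_from_closest_peer(me: str, ratings: dict, like_threshold: int = 4) -> list:
    """Find the single most similar consumer and return what they liked
    that `me` has not rated yet."""
    mine = ratings[me]
    others = [u for u in ratings if u != me]
    peer = max(others, key=lambda u: cosine(mine, ratings[u]))
    return sorted(
        item for item, rating in ratings[peer].items()
        if rating >= like_threshold and item not in mine
    )


ratings = {
    "me":  {"A": 5, "B": 4, "C": 1},
    "ann": {"A": 5, "B": 5, "C": 1, "D": 5, "E": 2},
    "bob": {"A": 1, "B": 2, "C": 5, "F": 5},
}
print(recommend_from_closest_peer("me", ratings))
```

Here "ann" rates the shared movies much like "me" does, so she is chosen as the closest peer, and only the unseen movie she rated highly is recommended.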

As we have done with profiling techniques, we classify various matchmaking methods into

simple and advanced. Existing empirical research suggests that hybrid approaches outperform

the pure content-based and pure collaborative approaches [BS97, Paz99] and that model-based

techniques outperform heuristic-based ones in terms of recommendation accuracy [BHK98].

Based on these results, we classify content-based and collaborative approaches that use heuristic

algorithms as “simple,” and hybrid heuristics and model-based matchmaking techniques as

“advanced.”

Combining the profiling and the matchmaking classifications, personalization technologies can

be characterized by the 2x2 matrix presented in Table 1 emphasizing that various

recommendation techniques often use different types of consumer profiles for recommendation

purposes. In particular, many collaborative techniques only use the ratings of items that were

provided by individuals as a feedback to the recommender system, although some techniques

take into account simple demographic attributes (e.g., age, gender) in the recommendation

process. Content-based techniques commonly use keywords to represent tastes and preferences

of individuals. In other words, traditional recommendation techniques use very limited,

“factual” profiling information. Even more advanced approaches to recommender systems, such

as the current generation of hybrid recommendation heuristics and model-based techniques, still

use the same limited profiling information as simple heuristics (i.e., ratings, keywords,

demographic attributes). Therefore, all these approaches to recommender systems were

classified as having “simple” profiling components. Thus, heuristic techniques for content-based

and collaborative recommendations were placed in the upper left quadrant of Table 1. Similarly,

the more advanced recommendation approaches, such as hybrid heuristics and model-based

techniques, also utilize simple profiling methods, as discussed above. Therefore, they fall into

the lower left quadrant of Table 1.

                                       PROFILING
                        Simple                              Advanced
MATCHMAKING  Simple     Content-based heuristics            Rule-based matching
                        Collaborative filtering heuristics
             Advanced   Hybrid heuristics                   Future work?
                        Model-based approaches

Table 1. Existing personalization techniques according to their profiling and matchmaking components.

Furthermore, the upper right quadrant of Table 1 corresponds to the advanced profiling methods,

such as rule-based consumer profiling techniques, described above. However, the advanced

profiling methods are underutilized in the modern recommender systems, and there has been

relatively little work done on developing matchmaking technologies that fit into this quadrant.

The exceptions are fairly straightforward rule-matching techniques, such as the

ones used by BroadVision [www.broadvision.com] and ATG [www.atg.com]. For

example, if there are rules in the consumer’s profile indicating that this consumer reads politics-

related news in the morning (i.e., rule morning → politics) and sports-related news in the

evening (i.e., evening → sports), then in the morning the news Web site will present more

politics-related stories and in the evening more sports-related stories to this consumer. Since

such systems use advanced profiling and simple recommendation methods, they fall into the

upper right quadrant of Table 1.
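
The news example above amounts to looking up which profile rule applies at the current time and promoting matching stories. The following Python sketch shows one way this could work. It is not a description of the BroadVision or ATG products: the hour boundaries for "morning" and "evening", the function names, and the sample stories are all illustrative assumptions.

```python
from datetime import time

# Profile rules of the form: time window -> preferred topic.
# The hour boundaries are assumptions made for this illustration.
PROFILE_RULES = [
    (time(5, 0), time(12, 0), "politics"),  # morning -> politics
    (time(17, 0), time(23, 0), "sports"),   # evening -> sports
]


def preferred_topic(now: time, rules: list):
    """Return the topic of the first rule whose time window contains `now`."""
    for start, end, topic in rules:
        if start <= now < end:
            return topic
    return None


def order_stories(stories: list, now: time, rules: list) -> list:
    """Move stories on the preferred topic to the front.

    sorted() is stable, so all other stories keep their original order.
    """
    topic = preferred_topic(now, rules)
    return sorted(stories, key=lambda story: story["topic"] != topic)


stories = [
    {"title": "Cup final", "topic": "sports"},
    {"title": "Budget vote", "topic": "politics"},
    {"title": "Storm warning", "topic": "weather"},
]

morning = [s["title"] for s in order_stories(stories, time(8, 0), PROFILE_RULES)]
evening = [s["title"] for s in order_stories(stories, time(19, 0), PROFILE_RULES)]
print(morning)
print(evening)
```

The matching step is a plain lookup, which is why such systems pair an advanced profile with a simple matchmaking method.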

Finally, there has been very little prior work done for the lower right quadrant of Table 1 because

most of the personalization research has focused on the development of matchmaking techniques

that do not require a comprehensive understanding of consumers’ tastes, preferences, and

behavior. Similarly, the advanced profiling methods have only been applied for simple

matchmaking problems. Since it is crucial to have a comprehensive understanding of the

consumers and deploy sophisticated recommendation methods in order to provide accurate

recommendations, the lower right quadrant of Table 1 represents an important research

opportunity in personalization technologies.

4. Future Work on Personalization Process

Some of the outstanding issues in personalization have been previously pointed out by several

researchers and discussed extensively in the literature. Such issues include the degree of

personalization, privacy, scalability, trustworthiness, intrusiveness, and usage of various metrics

to measure effectiveness of personalization [CACM00, SKR01]. The integration of advanced

profiling and matchmaking techniques also constitutes an important research problem that we

discussed in Section 3.

In the context of the process-oriented view of personalization, one of the most important

problems lies in understanding the dynamics between various stages of the personalization

process and in their proper integration using the feedback loop resulting in the virtuous cycle of

personalization discussed in Section 2. As depicted in Fig. 2, personalization is a process

consisting of several stages, and it is crucial to understand how much each stage contributes

towards the success (or failure) of the overall personalization strategy, as measured by various e-

business and other metrics. In other words, if these metrics suggest that our personalization

strategy is not performing well, is it because of poor data collection, inaccurate consumer

profiles, or poorly chosen techniques for matchmaking or content delivery? Alternatively, the

selected metrics may not be well suited for the application at hand and may not be giving us

accurate measurements. We call this a feedback integration problem, since the main issue is

how to improve the personalization process by integrating the feedback into the overall process.

Note that the feedback integration problem is a recursive one, i.e., if we are able to identify the

underperforming stages of the personalization process, we may still face similar challenges when

deciding on the specific adjustments within each stage. For example, if we need to improve the

data collection phase of the personalization process, should we collect more data, collect

different data, or use better data pre-processing techniques [SMB+03]?

The problem of feedback integration in the personalization process has not been extensively

studied before. Therefore, more research is needed in order to achieve a more comprehensive

understanding of how to transform the e-business measurements into specific adjustments to

various stages of the personalization process. Much personalization research has

focused on only a few stages of the personalization process. The process-oriented view of

personalization described in this paper suggests the importance and the need for vertical

personalization research, i.e., research that integrates all the stages of the personalization process.

References

[AT01] G. Adomavicius and A. Tuzhilin. Expert-Driven Validation of Rule-Based User

Models in Personalization Applications. Data Mining and Knowledge Discovery,

5(1-2): 33-58, 2001.

[AT02] G. Adomavicius and A. Tuzhilin. e-Butler: An Architecture of a Customer-

Centric Personalization System. International Journal of Computational

Intelligence and Applications, 2(3): 313-327, 2002.

[BHK98] J. S. Breese, D. Heckerman, and C. Kadie. Empirical analysis of predictive

algorithms for collaborative filtering. Proceedings of the Fourteenth Conference

on Uncertainty in Artificial Intelligence, 1998.

[BS97] M. Balabanovic and Y. Shoham. Fab: Content-based, collaborative

recommendation. Communications of the ACM, 40(3): 66-72, 1997.

[CACM00] Communications of the ACM, 43(8), 2000. Special issue on personalization.

[CFP+00] C. Cortes, K. Fisher, D. Pregibon, A. Rogers, and F. Smith. Hancock: a language

for extracting signatures from data streams. Proceedings of the 6th ACM SIGKDD

International Conference on Knowledge Discovery and Data Mining, 2000.

[CS00] M. Cutler and J. Sterne. E-Metrics: Business Metrics for the New Economy,

NetGenesis Corporation, 2000. [http://www.emetrics.org/articles/emetrics.pdf]

[GT01] V. Grover and J. Teng. E-commerce and the information market.

Communications of the ACM, 44(4), 2001.

[HMS01] D. Hand, H. Mannila, and P. Smyth. Principles of Data Mining, MIT Press, 2001.

[Paz99] M. Pazzani. A Framework for Collaborative, Content-Based and Demographic

Filtering. Artificial Intelligence Review, 13(5-6): 393-408, 1999.

[PR01] D. Peppers and M. Rogers. Why CRM Initiatives Fail and What You Can Do

about It. Inside 1to1. Oct. 8, 2001.

[RNE+02] S. Rosset, E. Neumann, U. Eick, N. Vatnik, and Y. Idan. Customer Lifetime Value

Modeling and Its Use for Customer Retention Planning. Proceedings of the 8th

ACM SIGKDD International Conference on Knowledge Discovery and Data

Mining, 2002.

[SKR01] J. B. Schafer, J. A. Konstan, and J. Riedl. E-commerce recommendation

applications. Data Mining and Knowledge Discovery, 5(1-2): 115-153, 2001.

[SMB+03] M. Spiliopoulou, B. Mobasher, B. Berendt, and M. Nakagawa. A Framework for

the Evaluation of Session Reconstruction Heuristics in Web-Usage Analysis.

INFORMS Journal on Computing, 15(2), 2003.
