1. Introduction
Over the past several years, there has been much work done in personalization focusing on the
constitutes a young and rapidly developing field, there still exist different points of view on what
perspective.
Several attempts have been made to define personalization by the industry practitioners and the
• “Personalization is the ability to provide content and services that are tailored to individuals based
on knowledge about their preferences and behavior” [“Smart Personalization,” Forrester Report by
• “Personalization is the combined use of technology and customer information to tailor electronic
commerce interactions between a business and each individual customer. Using information either
previously obtained or provided in real-time about the customer and other customers, the exchange
between the parties is altered to fit that customer's stated needs so that the transaction requires less
• “Personalization is the capability to customize communication based on knowledge preferences
and behaviors at the time of interaction” [Jill Dyche, CRM Handbook, Addison-Wesley, 2002].
relationship; by understanding the needs of each individual and helping satisfy a goal that
efficiently and knowledgeably addresses each individual’s need in a given context” [Doug
These definitions cover various aspects of personalization, and several important features of
personalization emerge from them. In particular, these definitions state collectively that
communications, and e-commerce interactions) by providers (e.g., e-commerce Web sites) to the
consumers of these offerings (e.g., customers, visitors, users, etc.) based on knowledge about
them with certain goal(s) in mind. In the rest of this section, we will elaborate on these concepts.
several providers of personalized offerings and one or several consumers of these offerings, such
as customers, users, and Web site visitors. Personalized offerings can be delivered from
diagrams, providers and consumers of personalized offerings are denoted by white boxes,
personalization engines by grey boxes, and the interactions between consumers and providers by
lines. Fig. 1(a) presents a provider-centric personalization approach that assumes that each
provider has its own personalization engine that tailors the provider’s content to its consumers.
The second approach, presented in Fig. 1(b), is a consumer-centric approach, which assumes that
each consumer has its own personalization engine (or agent) that “understands” this particular
consumer and provides personalization services based on this knowledge. This type of
consumer-centric personalization, delivered across a broad range of providers and offerings, is
called an e-Butler service [AT02]. Rudimentary e-Butler services are provided by such Web
sites as Gator.com, MySimon.com and MyGeek.com. The third approach, presented in Fig. 1(c),
industry or sector. In this case, the personalization engine performs the role of an infomediary
[GT01] by knowing the needs of the consumers and the providers’ offerings and trying to match
the two parties in the best ways according to their internal goals. While this approach has not
been extensively used before, it has high potential, especially because of the proliferation of
What Is Being Personalized? The result of personalization is the delivery of various offerings
to consumers by the personalization engine(s) on behalf of the providers using any of the
• product and service recommendations, e.g., for books, CDs and vacations;
• personalized email;
Personalization Goals. Personalization objectives are usually multifaceted. They may
range from simply improving the consumer’s browsing and shopping experience (e.g., by
presenting only the content that is relevant to the consumer) to much more complex objectives,
such as building long-term relationships with consumers, improving consumer loyalty, and
generating a measurable value for the company. Currently, the most commonly used metrics are
accuracy metrics that measure how much the consumer liked a specific personalized offering, e.g.,
how accurate the recommendation was [BS97, Paz99]. Although important, accuracy metrics
are fairly simplistic and do not reflect the “bigger picture”, i.e., to what extent the more complex
personalization objectives have been met. For this purpose, there is a need for more
sophisticated metrics, such as the consumer lifetime value, consumer loyalty value, purchasing
knowledge about personal preferences and behavior of the consumers that is usually distilled
from the large volumes of granular information about the consumers and stored in consumer
The definitions listed above collectively cover several major points about personalization.
However, the point that personalization constitutes an iterative process that takes place over time
has not been sufficiently addressed before, and we describe it in the next section.
2. Personalization Process
Measure cycle taking place in time and consisting of the following stages shown in Fig. 2:
• Deliver personalized offering based on the knowledge about each consumer, as stored in
the consumer profiles. The personalization engine must be able to find the most relevant
• Measure personalization impact by determining how satisfied the consumer is with
the delivered personalized offerings. This measurement provides information that can enhance our
understanding about consumers or point out the deficiencies of the methods for
feedback information completes one cycle of the personalization process, and sets the
stage for the next cycle where improved personalization techniques can make better
personalization decisions.
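
The iterative nature of the process can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the profile is a single preference score per category, `run_cycle`, `satisfaction`, the catalog entries, and the 0.5 learning rate are hypothetical, and a real system would use far richer profiles and metrics.

```python
from collections import defaultdict

def run_cycle(profile, catalog, get_feedback, rate=0.5):
    """Run one pass of the cycle: match, deliver, measure, feed back."""
    # Matchmaking: pick the offering whose category scores highest in the profile.
    offering = max(catalog, key=lambda item: profile[item["category"]])
    # Delivery and measurement: get_feedback returns satisfaction in [0, 1].
    score = get_feedback(offering)
    # Feedback: move the stored category score toward the observed satisfaction.
    category = offering["category"]
    profile[category] += rate * (score - profile[category])
    return offering, score

# Hypothetical consumer who likes music offerings and dislikes book offerings.
def satisfaction(offering):
    return 1.0 if offering["category"] == "music" else 0.0

profile = defaultdict(lambda: 0.5)  # no prior knowledge: neutral scores
catalog = [{"id": "b1", "category": "books"},
           {"id": "m1", "category": "music"}]

delivered = []
for _ in range(3):
    offering, _score = run_cycle(profile, catalog, satisfaction)
    delivered.append(offering["id"])
```

In this toy run the first offering misses, the measured dissatisfaction lowers the score for that category, and the following cycles deliver the better-matching offering.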
Data collection. The personalization process begins with collecting data across different
channels of interaction between consumers and providers (e.g., Web, phone, direct mail, and
other channels) and from various other heterogeneous data sources. Such data can be solicited
explicitly (e.g., via surveys) or tracked implicitly and may include histories of consumers’
purchasing and searching activities, as well as demographic and psychographic information. The
objective is to obtain the most comprehensive “picture” of a consumer. After the data is
constructing accurate and comprehensive consumer profiles based on the collected data. We
discuss the techniques for building consumer profiles in more detail in Section 3.
individual consumers. There are many matchmaking technologies including user-specified rule-
systems. However, in this paper we will focus on recommender systems technologies because of
the space limitation and because they represent the most developed matchmaking technologies
applicable to various types of personalized offerings. We will describe them further in Section 3.
[Fig. 2: personalization process diagram, not reproduced. Recoverable stage labels: Data Collection, Matchmaking, Deliver Personalized Offering.]
Delivery and presentation. E-companies deliver personalized information to consumers in
several ways. One classification of delivery methods is pull, push, and passive [SKR01]. Push
methods reach a consumer who is not currently interacting with the system, e.g., by sending an
email message. Pull methods notify consumers that personalized information is available but
display this information only when the consumer explicitly requests it. Passive delivery displays
personalized information as a by-product of other activities of the consumer. For example, while
looking at a product on a Web site, a consumer also sees recommendations for related products.
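
The three delivery methods can be distinguished by the consumer's state at delivery time. The sketch below is only one possible policy, not one prescribed by [SKR01]: the `Delivery` enum, the `choose_delivery` function, and its two boolean inputs are hypothetical names introduced here.

```python
from enum import Enum

class Delivery(Enum):
    PUSH = "push"        # reach a consumer who is not currently interacting
    PULL = "pull"        # notify, then show only on explicit request
    PASSIVE = "passive"  # show as a by-product of the consumer's activity

def choose_delivery(in_session: bool, viewing_related_item: bool) -> Delivery:
    """Pick a delivery method from the consumer's current state (assumed policy)."""
    if not in_session:
        return Delivery.PUSH      # e.g., send an email message
    if viewing_related_item:
        return Delivery.PASSIVE   # e.g., related products beside the viewed one
    return Delivery.PULL          # e.g., a notice the consumer can open
```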
The system can present personalized information in various forms: narrative, a list ordered by
Measuring personalization impact. As was explained before, various accuracy metrics as well
as consumer lifetime value, loyalty value, and purchasing and consumption experience can be
as feedback for possible improvements to each of the other five steps of the personalization
process. This feedback should be used to decide whether to collect additional data, build better
user profiles, develop better matchmaking algorithms, improve information delivery and
integrated in the personalization process, the quality of interactions with individual consumers,
as measured by the metrics discussed above, should grow over time, resulting in the virtuous
cycle of personalization. If the feedback is not properly integrated in the personalization process,
then the metrics can decrease over time, producing the effect of de-personalization, in which the
consumer becomes frustrated with the personalization system and stops using it. The de-
personalization effect is largely responsible for failures of some of the personalization projects
reported in the literature [PR01]. Therefore, one of the main challenges of personalization is the
ability to achieve the virtuous cycle of personalization and not fall into the de-personalization
trap. From the algorithmic sophistication perspective, the technologies that contribute the most
to this goal are the profiling and the matchmaking technologies. We describe them and their
One of the key issues in developing personalization applications is the problem of constructing
accurate and comprehensive profiles of individual consumers. Such profiles should provide the
most relevant information describing who the consumers are and how they behave.
Traditionally, consumer profiles consist of simple factual information. For example, this
information may include consumer’s demographics, such as name, gender, date of birth and
address, or be derived from the past transactions of a consumer, such as the largest purchase
value made at a Web site. This factual profile information is usually defined as a record of
In addition to the factual information, [AT01] considers profiles that capture more complex
behavioral information of consumers. These profiles are modeled using such techniques as
• Conjunctive rules. For example, the rule “John Doe prefers to see action movies on
“weekend”) can be a part of John Doe’s profile that describes his movie viewing habits
[AT01]. Such rules can be learned from the transactional history of the consumer using
various data mining techniques, including association and classification rule discovery
methods [HMS01].
• Sequences, such as sequences of Web browsing activities. For example, we may want to
store in John Doe’s profile his typical browsing sequence “when John Doe visits the book
Web site XYZ, he usually first accesses the home page, then goes to the
Home&Gardening section of the site, then browses the Gardening section and then
leaves the Web site” (i.e., XYZ: StartPage → Home&Gardening → Gardening → Exit).
Such sequences can be learned from the transactional histories of consumers using
• Signatures, i.e., the data structures that are used to capture the evolving behavior learned
from large data streams of simple transactions [CFP+00]. For example, “top 5 most
frequently browsed product categories over the last 30 days” would be an example of a
signature that could be stored in individual consumer profiles in a Web store application.
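
The signature example above can be computed directly from a consumer's browsing events. This is a minimal sketch that recomputes the signature from stored events; the signature methods of [CFP+00] are designed for large data streams and are not reproduced here. The function name `top_categories`, the `(timestamp, category)` event format, and the sample data are assumptions.

```python
from collections import Counter
from datetime import datetime, timedelta

def top_categories(events, now, days=30, k=5):
    """Return the k most frequently browsed categories in the last `days` days."""
    cutoff = now - timedelta(days=days)
    counts = Counter(category for ts, category in events if ts >= cutoff)
    return [category for category, _ in counts.most_common(k)]

# Hypothetical browsing history as (timestamp, product category) pairs.
now = datetime(2024, 1, 31)
events = [
    (now - timedelta(days=1), "garden"),
    (now - timedelta(days=2), "garden"),
    (now - timedelta(days=3), "books"),
    (now - timedelta(days=40), "toys"),   # outside the 30-day window
    (now - timedelta(days=41), "toys"),
    (now - timedelta(days=42), "toys"),
]
signature = top_categories(events, now)
```

Because the window slides with `now`, the same consumer's signature changes as older events age out, which is what lets it capture evolving behavior.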
In summary, all profiling approaches can be classified into simple approaches, which support unstructured
factual information about the customers (e.g., demographic information), and advanced approaches, which
support the behavioral information about consumers expressed in the form of rules, sequences,
Besides profiling, delivering targeted content and services to consumers is another crucial
matchmaking technologies. There has been much research done on this subject, including rule-
However, as was explained in Section 2, in this paper we will focus on recommender systems-
In the context of recommender systems, matchmaking technologies are often classified into
services, products) similar to the ones the consumer preferred in the past. In other words,
content-based methods analyze the commonalities among the items the consumer has rated
highly in the past. Then, only the items that have high similarity with the consumer’s past
items that people with similar tastes and preferences liked in the past. Collaborative methods
first find the closest peers for each consumer, i.e., the ones with the most similar tastes and
preferences. Then, only the items that are most liked by the peers would get recommended.
• Hybrid approaches: these methods combine collaborative and content-based methods. This
combination can be done in many different ways, e.g., separate content-based and
collaborative systems are implemented and their results are combined to produce the final
previous transactions made by the consumers. An example of such a heuristic for a movie
recommender system could be to find the person whose taste in movies is the closest to mine,
and recommend to me everything that this person liked and that I have not yet seen.
• Model-based techniques use the previous transactions to learn a model (usually using some
recommendations. For example, based on the movies that I have seen, a probabilistic model
is built to estimate the probability that I would like each of the unseen movies.
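
The movie heuristic described above (find the closest person, recommend what that person liked and I have not seen) can be sketched as follows. The choice of cosine similarity, the rating threshold of 4, the function names, and the sample ratings are all assumptions made for illustration; the source does not prescribe a particular similarity measure.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two {movie: rating} dicts (missing = 0)."""
    dot = sum(a[m] * b[m] for m in set(a) & set(b))
    norm_a = sqrt(sum(r * r for r in a.values()))
    norm_b = sqrt(sum(r * r for r in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(ratings, me, liked=4):
    """Recommend unseen movies that the single most similar user rated >= liked."""
    peers = [user for user in ratings if user != me]
    nearest = max(peers, key=lambda user: cosine(ratings[me], ratings[user]))
    return sorted(movie for movie, rating in ratings[nearest].items()
                  if rating >= liked and movie not in ratings[me])

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "me":  {"Alien": 5, "Heat": 4, "Up": 1},
    "ann": {"Alien": 5, "Heat": 5, "Up": 1, "Ronin": 5, "Cars": 2},
    "bob": {"Alien": 1, "Heat": 2, "Up": 5, "Coco": 5},
}
suggestions = recommend(ratings, "me")
```

Here "ann" is the closest peer, so only her highly rated unseen movie is suggested. A model-based technique would replace the nearest-peer lookup with a learned model that scores each unseen movie.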
As we have done with profiling techniques, we classify various matchmaking methods into
simple and advanced. Existing empirical research suggests that hybrid approaches outperform
the pure content-based and pure collaborative approaches [BS97, Paz99] and that model-based
Based on these results, we classify content-based and collaborative approaches that use heuristic
algorithms as “simple,” and hybrid heuristics and model-based matchmaking techniques as
“advanced.”
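
As an illustration of a "simple" content-based heuristic, the sketch below represents each item by a set of keywords and recommends the unseen items most similar to what the consumer liked. Jaccard similarity, the names `jaccard` and `content_based`, and the sample items are assumptions; other keyword weighting schemes would fit the same outline.

```python
def jaccard(a, b):
    """Jaccard similarity between two keyword sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def content_based(items, liked_ids, top_n=2):
    """Rank items not yet liked by keyword overlap with the liked items."""
    # The consumer's taste is the union of keywords of the liked items.
    profile = set().union(*(items[i] for i in liked_ids))
    candidates = [i for i in items if i not in liked_ids]
    ranked = sorted(candidates, key=lambda i: jaccard(profile, items[i]),
                    reverse=True)
    return ranked[:top_n]

# Hypothetical news items described by keywords.
items = {
    "n1": {"politics", "election"},
    "n2": {"politics", "senate", "election"},
    "n3": {"sports", "soccer"},
    "n4": {"election", "poll"},
}
picks = content_based(items, ["n1"])
```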
Combining the profiling and the matchmaking classifications, personalization technologies can
recommendation techniques often use different types of consumer profiles for recommendation
purposes. In particular, many collaborative techniques only use the ratings of items that were
take into account simple demographic attributes (e.g., age, gender) in the recommendation
process. Content-based techniques commonly use keywords to represent tastes and preferences
“factual” profiling information. Even more advanced approaches to recommender systems, such
as the current generation of hybrid recommendation heuristics and model-based techniques, still
use the same limited profiling information as simple heuristics (i.e., ratings, keywords,
classified as having “simple” profiling components. Thus, heuristic techniques for content-based
and collaborative recommendations were placed in the upper left quadrant of Table 1. Similarly,
the more advanced recommendation approaches, such as hybrid heuristics and model-based
techniques, also utilize simple profiling methods, as discussed above. Therefore, they fall into
the lower left quadrant of Table 1.
Table 1

                                         PROFILING
                      Simple                                Advanced
MATCHMAKING
  Simple              Content-based heuristics              Rule-based matching
                      Collaborative filtering heuristics
  Advanced            Hybrid heuristics                     Future work?
                      Model-based approaches
Furthermore, the upper right quadrant of Table 1 corresponds to the advanced profiling methods,
such as rule-based consumer profiling techniques, described above. However, the advanced
profiling methods are underutilized in the modern recommender systems, and there has been
relatively little work done on developing matchmaking technologies that fit into this quadrant.
Exceptions include fairly straightforward rule matching techniques, such as the
example, if there are rules in the consumer’s profile indicating that this consumer reads politics-
related news in the morning (i.e., rule morning → politics) and sports-related news in the
evening (i.e., evening → sports), then in the morning the news Web site will present more
politics-related stories and in the evening more sports-related stories to this consumer. Since
such systems use advanced profiling and simple recommendation methods, they fall into the
upper right quadrant of Table 1.
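
The news example can be sketched as a simple rule match: rules of the form context → topic are stored in the profile, and stories whose topic matches a rule for the current context are moved to the front. The rule representation as `(condition, topic)` pairs, the function name `rank_stories`, and the sample stories are assumptions made for illustration.

```python
def rank_stories(rules, context, stories):
    """Put stories whose topic matches a rule for the current context first."""
    preferred = {topic for condition, topic in rules if condition == context}
    # sorted() is stable, so the original order is kept within each group;
    # False (topic is preferred) sorts before True.
    return sorted(stories, key=lambda story: story["topic"] not in preferred)

# Profile rules from the example: morning -> politics, evening -> sports.
rules = [("morning", "politics"), ("evening", "sports")]
stories = [
    {"id": 1, "topic": "sports"},
    {"id": 2, "topic": "politics"},
    {"id": 3, "topic": "weather"},
]
morning_ids = [s["id"] for s in rank_stories(rules, "morning", stories)]
evening_ids = [s["id"] for s in rank_stories(rules, "evening", stories)]
```

The profiling side (discovering the rules) is the advanced part; the matching itself stays a direct lookup, which is why such systems pair advanced profiling with simple matchmaking.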
Finally, there has been very little prior work done for the lower right quadrant of Table 1 because
most of the personalization research has focused on the development of matchmaking techniques
behavior. Similarly, the advanced profiling methods have only been applied for simple
Some of the outstanding issues in personalization have been previously pointed out by several
researchers and discussed extensively in the literature. Such issues include the degree of
profiling and matchmaking techniques also constitutes an important research problem that we
discussed in Section 3.
In the context of the process-oriented view of personalization, one of the most important
problems lies in understanding the dynamics between various stages of the personalization
process and in their proper integration using the feedback loop resulting in the virtuous cycle of
consisting of several stages, and it is crucial to understand how much each stage contributes
towards the success (or failure) of the overall personalization strategy, as measured by various e-
business and other metrics. In other words, if these metrics suggest that our personalization
strategy is not performing well, is it because of poor data collection, inaccurate consumer
profiles, or poorly chosen techniques for matchmaking or content delivery? Alternatively, the
selected metrics may not be well suited for the application at hand and may not be giving us
accurate measurements. We call this a feedback integration problem, since the main issue is
how to improve the personalization process by integrating the feedback into the overall process.
Note that the feedback integration problem is a recursive one, i.e., if we are able to identify the
underperforming stages of the personalization process, we may still face similar challenges when
deciding on the specific adjustments within each stage. For example, if we need to improve the
data collection phase of the personalization process, should we collect more data, collect
The problem of feedback integration in the personalization process has not been extensively
studied before. Therefore, more research is needed in order to achieve a more comprehensive
various stages of the personalization process. Much of personalization research has been
personalization described in this paper suggests the importance and the need for vertical
personalization research, i.e., research that integrates all the stages of the personalization process.
References
[AT02] G. Adomavicius and A. Tuzhilin. e-Butler: An Architecture of a Customer-
for extracting signatures from data streams. Proceedings of the 6th ACM SIGKDD
[CS00] M. Cutler and J. Sterne. E-Metrics: Business Metrics for the New Economy,
[HMS01] D. Hand, H. Mannila, and P. Smyth. Principles of Data Mining, MIT Press, 2001.
[PR01] D. Peppers and M. Rogers. Why CRM Initiatives Fail and What You Can Do
Modeling and Its Use for Customer Retention Planning. Proceedings of the 8th
ACM SIGKDD International Conference on Knowledge Discovery and Data
Mining, 2002.