How to Develop Online Recommendation Systems that Deliver Superior Business Performance

Supported by artificial intelligence, intelligent recommendation systems are a boon for e-commerce and other business activities. This paper examines how clustering, data classification, and collaborative filtering as well as a number of algorithms are harnessed to make recommendation systems powerful tools.

Published by: Cognizant Technology Solutions on Mar 22, 2012
Copyright:Traditional Copyright: All rights reserved




How to Develop Online Recommendation Systems that Deliver Superior Business Performance
Cognizant 20-20 Insights
Executive Summary
Over the past two decades, the Internet has emerged as the mainstream medium for online shopping, social networking, e-mail and more. Corporations also view the Web as a potential business accelerator. They see the huge volume of transactional and interaction data generated by the Internet as R&D that informs the creation of new and more competitive services and products.
Several “e-movement” crusaders have discovered that customers spend significant amounts of time researching products before purchasing. In a bid to assist customers in these efforts, and conserve precious time, these organizations offer users suggestions of products they may be interested in. This serves the dual purpose of not just attracting browsers but also converting them into buyers. For instance, an online bookstore may know that a customer has an interest in mobile technology based on previous site visits and suggest relevant titles to purchase. An uninitiated user may be impressed by such suggestions. Suggestions (or “recommendations,” as they are popularly known) predict the likes and dislikes of users. To offer meaningful recommendations to site visitors, these companies need to store huge amounts of data pertaining to different user profiles and their corresponding interests. This eventually culminates in information overload, or difficulty in understanding and making informed decisions. One solution for combating this issue is what is known as a recommendation system.
Many major e-commerce Websites are already using recommendation systems to provide relevant suggestions to their customers. The recommendations could be based on various parameters, such as items popular on the company’s Website; user characteristics such as geographical location or other demographic information; or past buying behavior of top customers.
This white paper presents an overview of how we are helping a Fortune 500 organization to implement a recommendation system. Moreover, this paper also sheds light on key challenges that may be encountered during implementation of a recommendation system built on the open source Apache Mahout, a large-data library of statistical and analytical algorithms.
Recommendation Systems
Recommendation systems can be considered a valuable extension of traditional information systems used in industries such as travel and hospitality. However, recommendation systems have mathematical roots and are more akin to artificial intelligence (AI) than to any other IT discipline. A recommendation system learns from a customer’s behavior and recommends products in which that customer may be interested. At the heart of recommendation systems are machine-learning constructs.
Cognizant 20-20 Insights | January 2012
Leading e-commerce players use recommendation engines that sift users’ past purchase histories to recommend products such as magazine articles, books, goods, etc. Here is how major e-commerce companies use recommendation engines to improve their sales and their customers’ shopping experience.
Depending on past purchases and user activity, the site recommends products likely to interest the user.
Recommends DVDs in which a user may be interested by category, such as drama, comedy, action, etc. Netflix went so far as to offer a $1 million prize to researchers who could improve its recommendation engine.
Collects user feedback about its products, which is then used to recommend products to users who have exhibited similar behaviors.
Online companies that leverage recommendation systems can increase sales by 8% to 12%. Companies that succeed with recommendation engines are those that can quickly and efficiently turn vast amounts of data into actionable information.
Anatomy of a Recommendation Engine
The key component of a recommendation system is data. This data may be garnered by a variety of means such as customer ratings of products, feedback/reviews from purchasers, etc. This data will serve as the basis for recommendations to users. After data collection, recommendation systems use machine-learning algorithms to find similarities and affinities between products and users. Recommender logic programs are then used to build suggestions for specific user profiles. This technique of filtering the input data and giving recommendations to users is also known as “collaborative filtering.”
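To make the similarity step concrete, the following is a minimal sketch in plain Python. It is not Mahout's API and not the client implementation. The ratings, the user and item names, and the choice of Pearson correlation as the similarity measure are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical rating data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"book_a": 5.0, "book_b": 3.0, "book_c": 4.0},
    "bob": {"book_a": 4.0, "book_b": 2.0, "book_c": 5.0, "book_d": 4.0},
    "carol": {"book_a": 1.0, "book_b": 5.0, "book_d": 2.0},
}


def pearson_similarity(a, b):
    """Pearson correlation computed over the items both users have rated."""
    shared = sorted(set(a) & set(b))
    if len(shared) < 2:
        return 0.0  # not enough overlap to say anything
    mean_a = sum(a[i] for i in shared) / len(shared)
    mean_b = sum(b[i] for i in shared) / len(shared)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in shared)
    var_a = sum((a[i] - mean_a) ** 2 for i in shared)
    var_b = sum((b[i] - mean_b) ** 2 for i in shared)
    if var_a == 0 or var_b == 0:
        return 0.0  # a user who rates everything the same carries no signal
    return cov / sqrt(var_a * var_b)


# Alice and Bob rate the shared books in a similar pattern (positive score);
# Alice and Carol rate them in opposite patterns (negative score).
sim_alice_bob = pearson_similarity(ratings["alice"], ratings["bob"])
sim_alice_carol = pearson_similarity(ratings["alice"], ratings["carol"])
```

A recommender logic program would then use scores like these to decide whose ratings should influence the suggestions made to a given user.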
Along with collaborative filtering, recommendation systems also use other machine-learning techniques such as clustering and classification of data. Clustering is a technique which is used to bundle large amounts of data together into similar categories. It is also used to see data patterns and render huge amounts of data simpler to manage. For instance, Google News creates clusters of similar news information when grouping diverse arrays of news articles. Many other search engines use clustering to group results for similar search terms.
Classification is a technique used to decide whether new input or a search term matches a previously observed pattern. It is also used to detect suspicious network activity. Yahoo! Mail uses classification to decide if an incoming message is spam. Image sharing sites like Picasa use classification techniques to determine whether photos contain human faces. They then offer recommendations of people that are identified in the user contacts list.
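As a rough illustration of clustering, the sketch below groups two-dimensional points with k-means. The paper does not name a specific clustering algorithm, so k-means is an assumption here, as are the sample points and the naive choice of the first k points as starting centroids.

```python
def kmeans(points, k, iterations=10):
    """Group points into k clusters by repeatedly assigning each point to its
    nearest centroid and then moving each centroid to the mean of its cluster."""
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster
            else centroids[i]  # keep the old centroid if a cluster is empty
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical feature vectors, e.g. two numeric features per news article.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
```

With these sample points, the three points near the origin end up in one cluster and the three points near (9, 9) in the other. Real inputs would need a more careful initialization and a feature representation suited to the data.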
A Robust System to Counter Information Overload
We are working with a leading multinational manufacturing company that has numerous product research labs, with many scientists and researchers in numerous countries working on different technologies. To help facilitate scientific research, and to buy the latest technology information, this client partnered with information providers such as Scopus, Knovel, etc. But despite these data sources, scientists and researchers were often unable to find the right information to improve their research. Also, scientists across the globe were unable to collaborate and share technical information with each other. This situation is typical of companies dealing with information overload.
To increase the informational awareness of scientists and other employees, the client wanted to create a system to recommend resources like patents, articles and journals from paid content providers. A successful system needed to learn from user searches and be intelligent enough to recommend popular resources similar to the ones that a user is currently working from. The system was also expected to provide scientists with useful insights on information other scientists across the globe are using. Finally, the system was to serve as a platform to connect scientists working on similar technologies.
We helped to design and develop the system, which was dubbed the “intelligent recommendation system” (IRS). Many of the problems the organization faced originated from the multiple preferences and needs of users pertaining to their individual research topics. To make the system adaptive to specific user requirements, the solution proposed was to use a recommendation system. As a first step towards the solution, the large information base possessed by the client was categorized/grouped by specific criteria. After much contemplation of the data and the size of the user base, we decided to implement the system using the Apache Mahout framework. The Mahout framework is highly flexible and lets developers customize outcomes according to their ad hoc requirements. We then developed a customized algorithm to recommend relevant resources to scientists and researchers.
Building an Intelligent Recommendation System
The Solution
The most important purpose of an intelligent recommendation system (IRS) is to increase awareness among scientists about the areas they are exploring, the technologies on which their colleagues are working, and information about experts and their views on that particular discipline. Despite the possession of huge amounts of data, there was very little insight into the information scientists were seeking. There are fundamental differences between designing a recommendation system and traditional software design. The overall design depends heavily on the choice of algorithms and the system architecture employed. By using Apache Mahout, the team selected a conventional open source framework that implements machine-learning algorithms.
Apache Mahout is a new Apache Software Foundation (ASF) project whose primary goal is to create scalable machine-learning algorithms that are free to use under the Apache license. The term “Mahout” is derived from the Hindi word that means elephant driver. The Mahout project started in 2008 as a subproject of Apache’s Lucene project, a popular open source search library. Given the amount of data that the client possessed, it was imperative that the IRS be highly scalable. Mahout is considered a superior way of building recommendation systems because it implements all three machine-learning techniques: collaborative filtering, clustering and classification. Collaborative filtering is the primary technique used by Apache Mahout to provide recommendations. Given rating data along with a set of users and items, collaborative filtering generates recommendations in one of the following four ways.
Recommendations are made based on users with similar characteristics.
Recommendations are based on similar items.
A fast technique that offers recommendations based on previous user-rated items.
This approach compares the profile of an active user to aggregate user clusters, rather than the concrete profiles.
There are many algorithms that are used to calculate similarities between two entities. The choice of algorithm plays a vital role in determining the quality of the recommendations and which approach is best suited to a given scenario. Since the IRS for this client needed to offer recommendations to scientists based on their multiple preferences, we adopted user-based collaborative filtering (see Figure 1).
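The user-based approach can be sketched in a few lines of plain Python. This is an illustration only: it is not the customized algorithm built for the client and it does not use Mahout's classes. The ratings, the resource names, the Pearson similarity measure and the rule of ignoring users with non-positive similarity are all assumptions.

```python
from math import sqrt

# Hypothetical ratings of research resources: user -> {resource: rating}.
ratings = {
    "u1": {"patent_a": 5.0, "article_b": 4.0, "journal_c": 1.0},
    "u2": {"patent_a": 5.0, "article_b": 5.0, "journal_c": 1.0, "patent_d": 5.0},
    "u3": {"patent_a": 1.0, "article_b": 2.0, "journal_c": 5.0, "article_e": 5.0},
}


def pearson(a, b):
    """Pearson correlation over the resources both users have rated."""
    shared = sorted(set(a) & set(b))
    if len(shared) < 2:
        return 0.0
    mean_a = sum(a[i] for i in shared) / len(shared)
    mean_b = sum(b[i] for i in shared) / len(shared)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in shared)
    var_a = sum((a[i] - mean_a) ** 2 for i in shared)
    var_b = sum((b[i] - mean_b) ** 2 for i in shared)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def recommend(user, ratings, top_n=3):
    """Predict ratings for resources the user has not seen, using a
    similarity-weighted average over positively correlated users."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = pearson(ratings[user], other_ratings)
        if sim <= 0:
            continue  # assumption: dissimilar users are ignored entirely
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only suggest resources the user has not rated
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


recommendations = recommend("u1", ratings)
```

In this sample data, u2 rates the shared resources much like u1, so u2's highly rated patent_d is suggested to u1. u3 rates them in the opposite pattern, so article_e is not suggested.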
Figure 1: Recommendation System Execution Flow (components shown: Data Model, Similarity Algorithms, Recommender Logic Program, Evaluator Program)
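The four components named in Figure 1 can be pictured as stages of a small pipeline. The sketch below is a plain Python illustration under stated assumptions, not Mahout code: the function names, the sample ratings, the cosine similarity measure and the use of mean absolute error on held-out ratings as the evaluation metric are all hypothetical choices.

```python
from math import sqrt


def build_data_model(triples):
    """Stage 1, data model: turn (user, item, rating) triples into
    a user -> {item: rating} mapping."""
    model = {}
    for user, item, rating in triples:
        model.setdefault(user, {})[item] = float(rating)
    return model


def cosine_similarity(a, b):
    """Stage 2, similarity: cosine of the two rating vectors,
    restricted to the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(model, user, item):
    """Stage 3, recommender logic: similarity-weighted average of the
    ratings other users gave to the item."""
    numerator = denominator = 0.0
    for other, other_ratings in model.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(model[user], other_ratings)
        numerator += sim * other_ratings[item]
        denominator += sim
    return numerator / denominator if denominator > 0 else None


def evaluate(train_triples, test_triples):
    """Stage 4, evaluator: mean absolute error between predicted and
    held-out ratings; returns None if nothing could be predicted."""
    model = build_data_model(train_triples)
    errors = []
    for user, item, actual in test_triples:
        estimate = predict(model, user, item) if user in model else None
        if estimate is not None:
            errors.append(abs(estimate - actual))
    return sum(errors) / len(errors) if errors else None


# Hypothetical ratings; one known rating is held out for evaluation.
train = [
    ("u1", "a", 5), ("u1", "b", 4),
    ("u2", "a", 5), ("u2", "b", 4), ("u2", "c", 2),
    ("u3", "a", 1), ("u3", "b", 2), ("u3", "c", 5),
]
test = [("u1", "c", 2)]
mae = evaluate(train, test)
```

A lower error suggests that the chosen similarity measure and recommender logic fit the data better, which is one way an evaluator stage can guide the choice of algorithm.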
