You are on page 1of 7

Using Item Descriptors in Recommender Systems

Eliseo Reategui , John A. Campbell , Roberto Torres
1 Departamento de Informática Centro de Ciências Exatas e Tecnologia Universidade de Caxias do Sul Rua Francisco Getúlio Vargas, 1130 95070-560 Caxias do Sul, RS, Brazil ebreateg@ucs.br 2 Department of Computer Science University College London London WC1E 6BT, UK jac@cs.ucl.ac.uk 3 Instituto de Informática Universidade Federal do Rio Grande do Sul 91501-970 Porto Alegre, RS, Brazil rtorres@inf.ufrgs.br 1 2 3

Abstract
One of the earliest and most successful technologies used in recommender systems is known as collaborative filtering, a technique that predicts the preferences of one user based on the preferences of other similar users. We present here a different approach that uses a simple learning algorithm to identify and store patterns about items, and a noisy-OR function in order to find recommendations. The technique represents knowledge in item descriptors, which are recordlike structures that store knowledge on when to recommend each item. A recommender system keeps several item descriptors that compete when a recommendation is requested. Besides showing a good performance, the item descriptors have the advantage of making it easy to understand and monitor the system’s knowledge. This paper details the item descriptors as well as the way they are used to identify users’ preferences. Preliminary results are presented, and directions for future work are indicated.

Introduction
Collaborative Filtering is one of the most popular technologies in recommender systems (Herlocker et al., 2000). The technique has been used successfully in several research projects, such as Tapestry (Goldberg et al., 1992) and GroupLens (Sarwar et al., 1998), as well as in commercial websites: e.g. Amazon.com Book Matcher, CDNow.com (Schafer et al., 1999). The algorithm behind collaborative filtering is based on the idea that the active user is more likely to prefer items that like-minded people prefer. To support this, a similarity score between the active user and every other user is calculated. Predictions are generated by selecting items rated by the users with the highest degrees of similarity.
Copyright © 2002, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

One of the interesting features of this technique is that by exploiting information about other users’ ranked items, one may receive recommendations of items that have dissimilar content to those seen in the past. Although collaborative filtering has been used effectively in a wide range of applications, it has a scalability problem: as for other techniques that rely on actual cases or records to arrive at a solution to a problem (e.g. case-based reasoning), the more users there are in the database, the longer it may take to find similar users and items to recommend. The technique also tends to fail when little is known about a user (Mooney, 2000). Another drawback of collaborative filtering is that it is difficult to add business rules to it, or to modify manually the way the system recommends items. Content-based methods are another approach to recommender systems, where one tries to suggest items that are similar to those a given user has ranked positively in the past (Balabanovic and Shoham, 1997). This method has its roots in the field of information retrieval, and is often based on a search for certain terms and keywords (Popescul et al., 2001). One of the weaknesses of this approach is that it is susceptible to over-specialization, i.e. the more the system recommends items scoring highly against a user’s profile, the more the user is restricted to seeing items similar to those already rated. Another disadvantage of this approach is that generally only a very shallow analysis of contents can be supplied for the recommendations. Another scheme common in recommender systems is association rules (Lin et al., 2000, Mombasher, 2001), which can predict users’ preferences based on general rules extracted from a database. In the domain of e-commerce, association rules are relationships between items that indicate a connection between the purchase of one item and the purchase of another item. Although being successfully applied to predict customers’ preferences, association rules are hard to modify while keeping the rule

1997) and inferring user’s preferences ( Buchner et al.92 0. For example.. Demographic material is represented here in attribute-value pairs. These characteristics can be classified as: • demographic: data describing an individual. Furthermore.g. called item descriptors. Keeping track of and trying to understand the large number of generated rules for each item is another difficulty of this approach. The last section of the paper offers conclusions and directions for future work. Business Shopping cart . The first section below gives the general structure of the descriptors. Each term belongs to a certain class. Descriptor: dn Correlated terms ta te tc td tb Classes Business Marketing Demographic Demographic Business Class: Business Confidence 0. such as age. For example. each class has its own set of attributes. For instance. the purchase of a business book. for example.e. 2001). letting the user navigate through it in different levels of granularity.. preliminary results are discussed.. referring to a characteristic that is frequent in users who have rated the target item represented by dn. Institutional Figure 2: Hierarchy of demographic and behavioral classes The hierarchy is useful to give a clearer idea of the existing classes and features. where behavioral data and demographic features are considered to be relevant for the recommendation of a certain item (represented by descriptor dn). The method proposed is similar to content-based methods in that the system keeps descriptors of items. detailing their knowledge representation mechanism and showing how it is used to identify users’ preferences. rather than listing terms and keywords in the descriptors. to represent knowledge on how to make recommendations. gender. For instance. a university lecturer could have the demographic features "occupation = lecturer" and "gender = male". While attributes used to define demographic features are typically single-valued.. Figure 2 gives an example of such a hierarchy for the virtual bookstore. Page-views Marketing . i. instead of storing information about users’ preferences.84 0. such as: . user’s features and item relationships are exploited. Nevertheless. behavioral data is usually multivalued.. We present in this paper a different approach which uses record-like structures. Each term’s class and confidence (the strength with which the term is correlated with the target item) is displayed next to its identification.87 0. A separate structure is used to keep the complete hierarchy of classes for both demographic and behavioral features. Then. adding new rules without contradicting existing ones). representing. Behavioral material is represented by actions that show appreciation or dislike for facts or items.base consistent (e.85 0.77 Figure 1: Item descriptor in the domain of a virtual bookstore The item descriptors An item descriptor represents knowledge about when to recommend a particular item by listing the characteristics that users likely to be interested in the item should have. 1998. Features Demographic Gender .. Behavioral Occupation Books . as illustrated in figure 1. This paper presents the item descriptors. term ta belongs to the behavioral class Business. a person can only belong to one age group (demographic). address • behavioral: data describing purchases or preferences of an individual It has been shown that both types of data are important when building a user profile (Krulwich et al. demographic data about users is read from a database and kept unchanged (unless more information about users becomes available).. the learning algorithm and recommendation mechanism. The method proposed shows a good performance with respect to processing time and accuracy. occupation. both types of information are represented in our model in a similar way. Term te belongs to the behavioral class Marketing.. Term tc belongs to the class Demographic. Claypool et al. Let us examine an example of an item descriptor in the domain of a virtual bookstore. and has an advantage over other techniques when it comes to understanding the knowledge used to recommend items and letting users modify it.... While behavioral features are learned over time. which we call features. an item that may be recommended in the presence of some of its correlated terms. the purchase of an item (or its simple selection for visualization) demonstrates the interest of the user in the article chosen. The descriptor has a target. as well as conceptual advantages and drawbacks of the approach. representing the purchase of an item of the corresponding category. but he/she may like both jazz and rock 'n roll (behavioral). However.

Descriptor: de Correlate Class Behavioral Behavioral Behavioral Class: Behavioral P(de | ti) 0. They are a flexible mechanism to let the user manipulate the way the system recommends items.0 1. 1973). for instance. Therefore.(0. This is analogous to having an ontology and using it to retrieve the relative importance of relationships. ts and tu.60 Importance 1...0 1.. The system computes a score for the descriptor that ranges from not similar (0) to very similar (1). um} and a list of descriptors D={d1. For each item for which we want to define a recommendation strategy. in terms of processing time and . Figure 3: Descriptor of item Id Validation and Discussion The first tests made in our validation experiments compared the performance of our approach with collaborative filtering. making this a flexible mechanism to exercise some control over the way the system recommends items. d2.97 This method is based on the assumption that any term matching the user's terms should increase the confidence that the descriptor holds the most appropriate recommendation..25 * 0. The situation here is the same as in numerical taxonomy (Sneath and Sokal.4) = 0. One extra column has been added to the descriptor to represent the importance of each term. i.t2. In the work of Middleton et al. The figure below gives an example of a descriptor de.tk} which match the rated items and demographic features of the user ui. our learning algorithm treats them in the same way when determining the correlation among features and items. this is circumstantial evidence for the soundness of the underlying design choices. We use confidence as a correlation factor in order to determine how relevant a piece of information is to the recommendation of a given item. u2.. In a real-life example. the descriptors can be learned through the analysis of actual users’ records. which may be useful if we want to differentiate the relevance of different types of attribute. a descriptor is created with the item defined as its target. All demographic features and behavioral data gathered are used to fill-in a user descriptor.produce taxonomic clusters C (analogous to sets of items offered by a recommender system once a user has selected one member of C) that satisfy the users. Then. the relations of document authorship and project membership can be selected in order to identify communities based on publications and project work. This is the same as computing the conditional probability P(dj|e). and consequently by the actual descriptors...3 * 0. These attributes are inherited by the classes’ subclasses. a competitive mechanism starts in which the system computes a similarity score for each descriptor dj by comparing it with the user descriptor and finding the list of terms T={t1.e. where distances between items id in a multidimensional space of attributes are given by metric functions where the choice of distinct dimensions should obviously aim to avoid terms that have mutual dependences. That expression contains an assumption of independence of the various tp . Given a list of users U={u1. • default value: a feature considered to be present in the absence of all features belonging to the class. the recommendation process starts with the gathering of information about a given user ui to whom we want to make recommendations. the probability that the item represented by descriptor dj is rated positively by a user given evidence e.. Next.75 0. 1994) and computed as 1 – P(dj | tp). The score computed for this user for descriptor de would be: Score(de) = 1 . with three terms matching the user’s rated items and demographic data: tr. the metric cannot . subject to not exceeding the maximum value of 1.which the designer of a practical system should be trying to achieve in the choice of terms. (2002)..dn}. Therefore. This process continues until all descriptors have been created... let us suppose that we have a certain degree of confidence that a customer who buys a nail file will also want to buy nail polish.0 tr ts tu The Recommendation Process The goal of the recommendation process is to find one or more items that match the user’s preferences.• importance: how pertinent the features belonging to the class are in the search for appropriate recommendations. The Learning Process Behavioral data and demographic features are represented in our model in the same fashion. a concept used in noisy-OR probability models (Pradhan et al. according to the formula: Score (dj) = 1− Π (Noise (tp)) k j where Score(dj) is the final score of the descriptor dj.. Knowing that this customer is a woman should increase the total confidence..except occasionally by accident . the confidence between the target and other existing demographic features and behavioral data is computed.70 0. If the aim fails. Noise(tp) is the value of thei noise parameter of term tp. This attribute’s value may be inherited by the term’s class. Ultimately the test of the assumption is in users’ perception of the quality of a system’s recommendations: if the perception is that the outputs are fully satisfactory.

for larger values of k (or simply larger numbers of users) the performance of the nearest-neighbor algorithm degrades. A few hours was needed for the system to make the whole set of recommendations. the nearest-neighbor approach may become impractical. The MSWeb database. to perform the tests. Table 4 contrasts the item descriptors with other approaches that are frequent in recommender systems. We selected 5 items randomly from each test user and tried to identify the remaining items rated. Table 1: Scoring results for the MovieLens data set 1 Table 2: Performance results for the MovieLens data set Method Time spent in secs. At present the system is also being applied to a large business-to-business website. though showing a lower rate of accuracy. We are carrying out an additional set of experiments using public databases from Blake et al. was also used in our validation experiments. the nearest-neighbor approach needed less time than the item descriptors to perform the tests.3 37. Sarwar et al. the item descriptors showed an accuracy rate matching that of the Bayesian networks. Table 2 summarizes the results of the experiment.7 We carried out the experiments considering neighborhoods with sizes 1. storing anonymous ratings of 3900 movies assigned by 6040 users in the year 2000.3 54.2 43. available from Blake et al.8% (Bayesian network) for this same problem. The results obtained are presented below. However. 32 14 43 86 Item Descriptors k-nearest-neighbor (k=1) k-nearest-neighbor (k=20) k-nearest-neighbor (k=40) Method Item Descriptors k-nearest-neighbor (k=1) k-nearest-neighbor (k=20) k-nearest-neighbor (k=40) Scoring 65. For the MovieLens database.8 53. In order to evaluate the system’s performance.7 39. (1998) in order to compare our method further with other recommendation techniques. we monitored how much time was spent by the system in 2 order to recommend the 2114 items in the test data set . Breese et al. and we expect to be able to use some of the data from that site in order to evaluate the use of demographical and behavioral data in other research experiments. In this experiment. 1 MovieLens is a project developed in the Department of Computer Science and Engineering at the University of Minnesota (http://movielens. employing the Mean Absolute Error (MAE) method to measure the accuracy of item-based recommendation algorithms. we tested the nearest-neighbor through access to an actual database.edu). Table 3: Scoring results for the MSWeb data set Method Item Descriptors k-nearest-neighbor (k=1) k-nearest-neighbor (k=20) k-nearest-neighbor (k=40) Scoring 59.accuracy. . while that of the item descriptors remains stable.umn. 2 The tests were performed on a PIII 500MHZ PC with 128Mb of RAM. This database contains web data from 38000 anonymous users who visited Microsoft’s web site over a period of one week. 20 and 40 (we did not observe any significant improvement in accuracy for the nearest-neighbor algorithm with neighborhoods larger than 40). The item descriptors performed better than the k-nearest-neighbor algorithm. In our experiment. For k=1. (2001) have carried out a series of experiments with the MovieLens data set. (1998). The results obtained are presented below. In more realistic situations where the nearest-neighbor algorithm may have to access a database containing actual users’ transactions. we only took into account whether a user rated (1) or did not rate (0) an item.9 59. according to different criteria.8% (clustering method) to 59.9 The item descriptors were more accurate than the k-nearest neighbor algorithm regardless of the value chosen for k. using k=10. we selected 10 films randomly from each test user and tried to identify the remaining films the user rated. The results reported cannot be compared directly with our own as the authors computed their system’s accuracy using the MAE and considering integer ratings ranging from 1 to 5 (reaching values around 75%). (1998) reported results showing that the accuracy of other predictive algorithms varied from 54. For the same experiment described above. We used the MovieLens database . no matter what size of the neighborhoods was chosen.

Recommending Regarding the process for making recommendations. Content-based strategies normally have their knowledge bases built manually or through some text-mining technique. collaborative filtering may go through actual purchase records in the database in the search for users who are similar to the current user. The learning mechanism used on the item descriptors also exploits well-known methods to compute correlation factors and define the strength of the relationships among features and items. This is particularly important when the user wants to make the system respond in a certain way in given circumstances. Association rules use well-known inductive learning algorithms. in that both methods search for a problem solution in the history of actual user cases.. As stated earlier.Table 4: Comparing recommender systems’ approaches Collaborative Filtering Representing knowledge Learning Actual records of rated items Finding neighborhoods Contentbased methods Related terms and keywords Manual / Text mining AssociItem ation Rules Descriptors Association Correlated rules items and terms Inductive learning Computing confidence and other correlation factors Looking for items with the highest correlation factors Learning Although collaborative filtering does not have any standard learning algorithm. any such technique may present performance problems (a poor level of scalability).g.. For large databases with hundreds of thousands of users. descriptors are interesting because they make it easy for users to understand as well as modify the knowledge represented. The information used to describe items in content-based methods may be defined manually or through some text mining technique (Mooney. The item descriptor approach is different in that it represents knowledge in the form of descriptors and correlation factors. 2001. 1997). At present. Such an approach elicits information about the items. or to include business rules in its knowledge base. Although the system learns and updates its descriptors in an offline process (therefore not critical for the application to recommend items in real time). the technique usually employs some clustering method to find neighborhoods and use them in order to accelerate the search for similar users in the database. Balabanovic and Shoham. 1997). The recommendation process used by item descriptors selects the descriptor showing the highest correlation score when . Above all. Recommending Finding similar users and recommending items they have rated Searching for items with matching terms and keywords Triggering rules and keeping the outcome with the highest value Representing knowledge The collaborative filtering approach represents knowledge through actual records of users’ purchases. Content-based methods make recommendations by retrieving items with similar descriptions. The main advantage of using such learning methods relies on the robustness and stability of the algorithms available. The option to use term confidence instead of conditional probability to describe the model comes from the fact that other correlation factors that are not supported by probability theory are computed by the system. However. scanning texts and associating terms and keywords with items. with the help of information-retrieval techniques. 2000). Alternative methods which combine contentbased methods and collaborative filtering have been proposed in order to minimize this weakness (Popescul et al. 1994). it finds recommendations by triggering rules that match information known about the user. This is somewhat analogous to case-based reasoning (Watson. When compared with the other approaches in this respect. to extract knowledge from databases. such as interest and conviction (Brin et al. The challenge for knowledge-based personalization is always to find ways of adding detailed improvements under the strong constraint that they do not change the practical computational feasibility of the application. Concerning the recommendation process used by association rules. at present these are provided only to let the user analyze and validate the knowledge extracted from the database. our learning algorithm is fairly simple and fast. it is faster than algorithms that group evidence and try to compute the relevance of each item and then of each group of evidence. the main disadvantage of this approach when used on its own is that it relies solely on the item’s characteristics without taking account of other information about users’ preferences. such as a priori (Agrawal and Srikant. but not the users’ preferences. logic-based) extended automated reasoning might improve the quality of the personalization but would also be computationally too expensive to be worth considering in anything larger than a toy demonstration. the type of information represented in the descriptors does not contain any forms of knowledge on which more complex (e. Association rules store knowledge rules which are extracted from a database using some inductive learning algorithm. We are currently testing different variations on the combination of these factors in the reasoning process. 1997).

The recommendation method we use has the peculiarity of computing the correlation of individual terms initially. 1998.. while keeping their representation unchanged. Initial results have shown that the approach is very effective in large-scale practice for purposes of personalization. 1988). Issue 2.. M. San Antonio. and then combining them in real time.compared to the user’s information. Both models are based on the assumption that an output is statistically independent of previous outputs. Current research on the identification of implicit user's interests also shows that recommender systems will have to manipulate different sorts of data in order to infer users’ preferences ( Claypool et al.J. as we have remarked above in our comments on indepencence). Fast Algorithms for Mining Association Rules. Ullman. Morgan Kaufmann Publisher. Le. Texas. Both of these processes are less susceptible to scalability problems. Vol. Brazil.e. and Tsur. R. The model used to represent different types of information (demographic or behavioral) in a similar way is another relevant contribution of this project. Motwani. and Srikant. USA.. An educational system to assist students in learning algorithms is being implemented at the Department of Computer Science of the University of Caxias do Sul. followed at run time by finding associations between the rules. and Wallace. J. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence. Collaborative Recommendation. and Merz. 1984. This assumption may be limiting in given circumstances. Allison. Gallant. 1984). as well as according to his/her more general learning profile. important advantage of the model. Y. following the idea that courses that work well for one student will also work well for other similar students (Schank. . Our model may also be compared with Hidden Markov Models (HMM). B before A.. Georgeff. M. 26(2):255. C. J. 1997). 1998. I. Proceedings of Intelligent Tutoring Systems ‘98. we consider the product of noise parameters. P. Vol 40. UCI Repository of machine learning databases [http://www. Heckerman. 1452. We are also starting to investigate the use of our model in making recommendations in educational systems.g. S. Brown. 2000. This is a good technique to avoid computing the relevance of all possible associations among terms in the learning phase. 2000). SIGMOD Record (ACM Special Interest Group on Management of Data). and Waseda. and Shoham.edu/~mlearn/MLRepository. Communications of the ACM. Edgoose. A similar use of the function can be found in research on expert systems (Gallant.. the actual methods used to compute probabilities of events are different: while HMM considers the product of the probabilities of individual events.. Communications of the ACM. No 3. To take one practical example. P. M. July. Buchner A. We intend the system to be able to recommend exercises. 1988.html]. CA: University of California. C. IEEE Internet Computing. Irvine. 5(6):3239. From a practical point of view.uci. M. 1999). European Conclusions One important contribution of this work has been the use of the method for calculating the relevance of terms individually. Computers and Chemistry 24(1):43-55. Chile. item recommendation. However. S. Balabanovic. R. Mulvenna. 1994. However. Blake. L. Madison. we do not believe this to be a serious problem (e. and Kadie. S. Lecture Notes in Computer Science. and then combining them at recommendation time through the use of the noisy-OR function. References Agrawal. The two models are similar in that both use probability theory to determine the likelihood that a given event takes place.. Content-based. Dynamic itemset counting and implication rules for market basket data. Stern. Discovering Behavioural Patterns in internet files.. Breese. R. In Proceedings of the 20th International Conference on Very Large Databases. 31. C. 2001). L.. Sequence Complexity for Biological Sequence Analysis.. but not in applications for recommender systems.. i. Santiago. Empirical Analysis of Predictive Algorithms for Collaborative Filtering. or A before B). WI. A General Selection Criterion for Inductive Inference. D. This is analogous to finding first a set of rules with only one left-side term. the probability that a user purchases item C is very rarely dependent on the order in which users have bought other items (e. or the discovery of patterns in DNA sequences (Allison et al. Department of Information and Computer Science.. M. 1997. as association rules and item descriptors keep generalized knowledge about when to recommend each item (instead of having to deal with actual records in the reasoning process).ics. D. 1997. D. Claypool. employed in tasks such as the inference of grammars of simple language (Georgeff and Wallace. S. C. 16-19. further research is needed to refine the reasoning process so as to let it differentiate the way single and multi-valued attributes are used. but for the type of application we have chosen. Previous work in the field has shown the importance of dealing with and combining such types of knowledge in recommender systems (Pazzani.g. dealing with single and multi-valued attributes in a uniform manner is another . Inferring User Interest. T. and Dix. 2001. T. Connectionist Expert Systems. Brin. texts and other contents according to the student’s own characteristics.L.

. Popescul. 1999. In Proceedings of the 10th International World Wide Web Conference .. M. Seattle..WWW2002. Proceedings of the Conference on Computer Supported Cooperative Work.Conference on Artificial Intelligence – ECAI 84. 13(5-6): 393-408. Sarwar.. and Lawrence. Hawaii. J. Schafer. Morgan Kaufmann. N. R. Pennock. Ungar. Amazon. Alani. J. M. ACM Conference on Electronic Commerce. A. D. J... H. and Henrion. D. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence. Philadelphia.. Karypis.. and Reidl. B. Collaborative Recommendation via Adaptive Association Rule Mining. 1992.com Recommendations – Item-to-item Collaborative Filtering. 1997. USA. Alvarez. 37-45. M. G.. G. 1973. Borchers. Virtual Learning: A Revolutionary Approach to Building a Highly Skilled Workforce. H. D. J. A. and York. Provan. and Nakagawa. Content-based book recommending using learning for text categorization. M.C. S. KDD-2000 Workshop on Web Mining for E-Commerce. D. S. Mooney. J. Explaining Collaborative Filtering Recommendations. B. M. USA.. J.. Sementic Web Workshop 2002. A Framework for Collaborative. Hong Kong. San Antonio. Konstan. P. and Ruiz. 1999. S. B.. Recommender Systems in E-Commerce. Communications of the ACM. L. M. M. A. In Proceedings of Uncertainty in Artificial Intelligence.. In Proceedings of ACM Conference on Computer Supported Cooperative Work. and Riedl. Pisa.E. and Riedl. San Francisco. Artificial Intelligence Magazine 18(2). Proceedings of the Fifth ACM Conference on Digital Libraries. B. CA: Freeman. and Riedl. Konstan. C. 2001. Georgia.. MA. T. Washington. Seattle. Boston.. Lin.. D.. Oki. Vol 35. NY: ACM Press. USA. J. Herlocker. G. ACM. 2000. J. Exploiting synergy between ontologies and recommender systems. R. Schank. Content-Based and Demographic Filtering.R. New York. Miller. 2001. Colorado. New York. NY: McGraw-Hill. D. Morgan Kaufmann. New York. A. USA. 2002. J. Effective Personalization Based on Association Rule Discovery from Web Usage Data. 1994. 7(1):76-80. Sarwar.WWW10. Num 12.. Atlanta. Konstan. Using Filtering Agents to Improve Prediction Quality in the GroupLens Research Collaborative Filtering System. Konstan. Smith. US. B. Goldberg. Pazzani. Washington. Pennsylvania.. J.. Krulwich. and Terry.. Using collaborative filtering to weave an information tapestry. Nichols. W. . In: Proceedings of The Eleventh International World Wide Web Conference .. 473482. B. B. USA. Linden. J. Pradhan. Washington. Shadbolt. Denver. Artificial Intelligence Review. Middleton. and Sokal.... Mobasher. 1997. IEEE Internet Computing. and Roure. Item-based collaborative filtering recommendation algorithms. Herlocker. 2000. Numerical Taxonomy: The Theory and Practice of Numerical Classification. 2001. LIFESTYLE FINDER: Intelligent User Profiling Using Large-Scale Demographic Data. Middleton.. A. USA. Proceedings of the ACM Workshop on Web Information and Data Management. Sneath. J. Dai.. R. 1998. M. H. 2003. J. Seattle. 2000. B. R. B. Knowledge engineering for large belief networks. Luo.. Probabilistic models for unified collaborative and content-based recommendation in sparse-data environments.