
Short dissertation

TV programme recommendation
Marco Lisi 188601

Introduction
Digital television (DTV) technology improves audio and video quality, offers new services and applications, and increases the amount of TV content on offer. Viewers therefore face a great variety of content, but also an overload of information. Users do not necessarily benefit from this overabundance; instead, they find it difficult to locate the program they would like to watch. Electronic Program Guides (EPGs) are available to help viewers look for their favourite programs. These, however, usually only list all the available programs and therefore do not solve the problem. What users really need is a personalized program guide that recommends only those programs they are likely to be interested in. Research has therefore focused on developing content recommendation systems that can propose TV content matching the viewers' interests. This dissertation discusses two papers [1] [2] that deal with this topic. A short review of each paper is followed by a comparison of the two.

Personalization for Digital Television using Recommendation System strategy: brief review
This paper describes a recommendation system that offers personalization in multi-user environments. The recommendations are aimed not at each individual viewer but at the group as a whole. For this purpose the authors present RePTVD (Personalized Recommendation for Digital Television). The system offers personalized content recommendations for groups of viewers who share the same place, and it takes into account the preferences discovered from the viewing history. User preferences are collected entirely implicitly; from this information, the system identifies different behaviour patterns in the group and determines the best recommendations.

A data mining technique was used to discover the behaviour patterns in the group viewing history. After analysing three candidate data mining algorithms, the authors chose association rules as implemented by the Apriori algorithm, since it adapts better to the DTV domain: the quantity and variety of the rules it produces were relevant for this context. The data remain in the set-top box (STB); no connection to any server is needed to apply the mining technique.
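As a rough illustration of association-rule mining over a group viewing history, the following is a minimal Apriori-style sketch. The transaction format (one set of attribute strings per viewing event), the attribute names, and the thresholds `min_support` and `min_confidence` are assumptions made for this example; they are not taken from [1].

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    n = len(transactions)
    # Level 1 candidates: every single item seen in the history.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep only the candidates whose support reaches the threshold.
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join step: build (k+1)-item candidates from frequent k-itemsets.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-item subset must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Derive (antecedent, consequent, confidence) rules from frequent itemsets."""
    rules = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # Every subset of a frequent itemset is frequent, so the lookup is safe.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Hypothetical viewing history: each event is a set of (time slot, genre) attributes.
history = [
    {"evening", "news"},
    {"evening", "news"},
    {"evening", "film"},
    {"morning", "cartoon"},
    {"evening", "news"},
]
frequent = frequent_itemsets(history, min_support=0.4)
rules = association_rules(frequent, min_confidence=0.7)
for antecedent, consequent, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent), round(confidence, 2))
```

Rules of this kind could then be matched against the next day's program guide to build a recommendation list; how [1] performs that filtering in detail is not specified here.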
The analysis measured how accurately the system presents content that is relevant to the viewers' expectations. The efficiency of the system is defined as efficiency = viewed recommendations / performed recommendations, that is, the number of viewed recommendations divided by the number of performed recommendations; it ranges from 0 to 1. The data were drawn from a sample of thirteen viewers divided into four groups. Viewing behaviour was collected against the real program guide of open TV. Over four weeks, the group viewing history was submitted to the recommendation system. Each day the system collected the viewing behaviour and performed the mining at the end of that day. The data were then filtered according to the rules obtained, and the result was compared with the program guide of the following day in order to create a recommendation list. The system efficiency was thus measured through implicit feedback, i.e. whether the viewer chose to watch the recommendation offered.
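The efficiency measure described above, the ratio of viewed to performed recommendations, can be written as a small helper. The function name and the sample counts are hypothetical and serve only to show the arithmetic.

```python
def efficiency(viewed, performed):
    """Fraction of performed recommendations that were actually viewed (0 to 1)."""
    if performed == 0:
        raise ValueError("no recommendations were performed")
    if not 0 <= viewed <= performed:
        raise ValueError("viewed must lie between 0 and performed")
    return viewed / performed


# Hypothetical counts for one group on one day.
print(efficiency(viewed=3, performed=5))  # 0.6
```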

Figure 1 shows the accuracy obtained for all the groups over the four weeks. These results are discussed later, in comparison with the results of the other paper.

Figure 1

Towards TV Recommender System: Experiments with User Modeling: Brief review

This paper discusses how to provide the right recommendations in the digital TV environment. It treats in detail the question of proper user modelling, which makes it possible to learn the patterns of a user's preferences. It shows that, by applying some results from the field of information retrieval, it is possible to build a simple yet robust system that provides helpful recommendations to its users. The method is explained below.

First of all, it is useful for the features to be expressed in numeric form. Quantified features can be ordered in a vector structure that then serves as a unique descriptor of the observed item; the idea is to obtain a set of TV program features from the metadata of the MPEG-7 standard. While some metadata are already numerical, others, like title, genre and cast, are textual and need to be converted to numerical equivalents. The solution comes from information retrieval: it is possible to define a set of relevant features that applies to all available items. This leads to high-dimensional vector spaces, where each coordinate (i.e. basis vector) represents one program feature. In order to find the program(s) of expected interest, the user modelling and program retrieval algorithm shown in Figure 2 is applied.

Figure 2

TV program p_i is represented by an n-dimensional vector (p_i1, p_i2, ..., p_in) whose coordinates are, as said before, program features. The user model is an n-dimensional vector in the same space and can be regarded as the target program in the observed vector space. The correspondence of a program p_i to the user model is given by a similarity function.

A recommendation agent, stored within the user's TV receiver (or in the set-top box), keeps a record of the m programs p_1, ..., p_m that the user has previously watched, p_1 being the most recent one. It uses these data to estimate the optimal user model; after estimating the user model, the agent searches the available programs to find the few with the greatest similarity to it. These programs are finally recommended to the user. The coordinates of the optimal user model are computed from the watched programs using optional weighting coefficients w_i, which assign unequal importance to the watched programs.

Should the user accept any of the recommended programs, this is considered positive feedback and the current user model is updated to include that program. Otherwise, if the user is not satisfied with the program he is currently watching, he presses a Dislike button on the remote control and that program is not included in his profile; negative feedback is therefore stated explicitly.

To evaluate performance, a dataset of movie recommendations consisting of ratings given by users is used. From these users, 80 were chosen at random; together they generated 22402 interactions with the system. The distribution of these ratings is given in Table I. The average rating was 3.6.
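The user modelling and retrieval steps described above (estimate a model from the watched programs, then rank the available programs by similarity) can be sketched as follows. The exact similarity function and estimation formula of [2] are not reproduced in this text, so the sketch assumes cosine similarity and a weighted average of the watched-program vectors; the function names, feature layout and sample data are likewise hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def estimate_user_model(watched, weights=None):
    """Weighted average of the watched-program vectors (assumed estimator).

    watched[0] is the most recent program, matching p_1 in the text.
    """
    if weights is None:
        weights = [1.0] * len(watched)  # no weighting
    total = sum(weights)
    n = len(watched[0])
    return [
        sum(w * p[j] for w, p in zip(weights, watched)) / total
        for j in range(n)
    ]


def recommend(user_model, available, k=3):
    """Return the names of the k available programs most similar to the model."""
    ranked = sorted(
        available.items(),
        key=lambda item: cosine_similarity(user_model, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical 3-feature vectors: (drama, sport, news).
watched = [[1, 0, 0], [1, 0, 1], [1, 0, 0]]
available = {"match": [0, 1, 0], "series": [1, 0, 0], "bulletin": [0, 0, 1]}
model = estimate_user_model(watched)
print(recommend(model, available, k=2))
```

Under these assumptions, positive feedback would amount to adding the accepted program's vector to `watched` and re-estimating the model, while a disliked program would simply be left out.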

For each movie, a genre description based on data from the Internet Movie Database is provided. This description is used to generate feature vectors as follows. Each movie is represented as a vector in an 18-dimensional space, where each coordinate stands for the presence of a particular genre. Should a genre appear in the description of the observed movie, a 1 is assigned to the matching coordinate; otherwise its value is 0. Finally, the feature vectors are normalized to unit magnitude. The user interactions were sorted in chronological order and a series of Monte Carlo simulations was then conducted. These included user modelling, making a list of three recommendations and observing the user's reaction. The reaction is considered positive if the user's rating for any recommended item is at least 3; otherwise the recommendation is considered unsuccessful.

Four weighting schemes were considered:
1. w_i = 1, which is equivalent to no weighting;
2. the users' ratings (1-5) were used as weights;
3. w_i = 0.8^(m-i) (with m the size of the history window), so that older interactions were favoured;
4. w_i = 0.8^(i-1), so that newer interactions were favoured.

To evaluate the results, the recommendation success rate is used, defined as the percentage of interactions that finished successfully (the user's rating was at least 3). Figure 3 shows how the success rate depends on the window size m. The results are discussed in the section below.
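The feature encoding, the four weighting schemes and the success criterion described above can be illustrated with a short sketch. The genre list is a hypothetical four-genre stand-in for the 18 genres used in [2], the scheme names are invented labels, and the ratings are assumed to be ordered from most recent to oldest.

```python
import math

# Hypothetical genre list; the evaluation in [2] uses 18 genres.
GENRES = ["action", "comedy", "drama", "thriller"]


def genre_vector(genres):
    """Binary genre indicators, normalized to unit magnitude."""
    raw = [1.0 if g in genres else 0.0 for g in GENRES]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw] if norm else raw


def scheme_weights(scheme, m, ratings=None):
    """Weights w_1..w_m for a history window of size m (index 1 = most recent)."""
    if scheme == "none":
        return [1.0] * m
    if scheme == "ratings":
        # Assumes ratings[0] belongs to the most recent interaction.
        return [float(r) for r in ratings[:m]]
    if scheme == "older":
        return [0.8 ** (m - i) for i in range(1, m + 1)]
    if scheme == "newer":
        return [0.8 ** (i - 1) for i in range(1, m + 1)]
    raise ValueError(f"unknown scheme: {scheme}")


def is_success(ratings_of_recommended, threshold=3):
    """Success if any recommended item is rated at least the threshold."""
    return any(r >= threshold for r in ratings_of_recommended)


print(genre_vector({"comedy", "drama"}))
print(scheme_weights("newer", m=3))
print(is_success([1, 2, 4]))
```

With `m = 3`, the "older" scheme gives the largest weight (1.0) to the third, oldest program and the "newer" scheme gives it to the first, most recent one, which matches the description of the two exponential schemes.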

The first thing to notice is that the two papers propose different methods for reaching the same goal: the first works with data mining algorithms, while the second uses the vector space model from information retrieval. Apart from this, several topics are addressed by both sets of authors. In [2] we find a definition of content-based and collaborative systems: content-based systems recommend items that resemble the ones the user liked in the past, while collaborative systems recommend items that other users with similar tastes liked in the past. Clearly a collaborative system can be used in [1], since that research is developed for a multi-user environment; but [2] also considers the possibility of working on viewing groups using a collaborative approach. Both [1] and [2] agree that implicit methods are preferable when possible (i.e. the system works automatically, without requiring input from the user). In [1] an entirely implicit method is adopted, while [2] allows for relevance feedback from the user through Like/Dislike buttons on the remote control. In this respect, the results obtained in [2] show that the success rates with no weighting and with weighting by users' ratings are almost the same. This supports our opinion that there is no need for users to express their feedback by rating the recommendations, so no explicit feedback is needed to develop a good recommendation system.

Analysing the results of both papers, it is possible to notice the cold start, or new user, problem: the system can hardly recommend an item to a user who has not yet generated enough interactions for his preferences to be learned. In fact, the results improve with time [1] or with the growth of the history window [2] (in which all previously watched programs are stored), clearly because the system has more data to work on. Both methods are designed to work on existing hardware (TV receiver or set-top box) without sending data to any server to apply the recommendation techniques: the personalized Electronic Program Guide should rely on techniques that can be applied locally in the user's TV receiver or STB.

Considerations and conclusion

Both papers show valid methods for implementing a recommendation system for DTV: both the data mining association algorithm and the information retrieval approach were successful. Looking at the results, however, it is quite evident that the method implemented in [1] does not achieve high efficiency (0.56 overall across all groups). This is probably because accurately identifying a member of a group is a challenge, and it can be difficult to fit users whose preferences fall outside the existing niches into a defined group. In short, working with collaborative systems can be tough and does not always lead to the desired results, while content-based techniques should be dominant for digital TV recommender systems. It is possible to imagine that in the near future technology will allow this kind of system to work online, communicating with a server, so that a recommendation system can access a huge database of information (with the computation also running on servers) rather than being limited to the preferences evaluated on a single device (STB). Future development and work should take these possibilities into consideration.