Exploring Implicit Hierarchical Structures for Recommender Systems

Suhang Wang, Jiliang Tang, Yilin Wang and Huan Liu
School of Computing, Informatics, and Decision Systems Engineering
Arizona State University, USA
{suhang.wang, jiliang.tang, yilin.wang.1, huan.liu}@asu.edu

Abstract

Items in real-world recommender systems exhibit certain hierarchical structures. Similarly, user preferences also present hierarchical structures. Recent studies show that incorporating the explicit hierarchical structures of items or user preferences can improve the performance of recommender systems. However, explicit hierarchical structures are usually unavailable, especially those of user preferences. Thus, there is a gap between the importance of hierarchical structures and their availability. In this paper, we investigate the problem of exploring the implicit hierarchical structures for recommender systems when they are not explicitly available. We propose a novel recommendation framework HSR to bridge the gap, which enables us to capture the implicit hierarchical structures of users and items simultaneously. Experimental results on two real-world datasets demonstrate the effectiveness of the proposed framework.

1 Introduction

Recommender systems [Resnick and Varian, 1997] intend to provide users with information of potential interest based on their demographic profiles and historical data. Collaborative Filtering (CF), which only requires past user ratings to predict unknown ratings, has attracted more and more attention [Hofmann, 2004; Zhang et al., 2006; Koren, 2010]. Collaborative Filtering can be roughly categorized into memory-based [Herlocker et al., 1999; Yu et al., 2004; Wang et al., 2006] and model-based methods [Hofmann, 2004; Mnih and Salakhutdinov, 2007; Koren et al., 2009]. Memory-based methods mainly use the neighborhood information of users or items in the user-item rating matrix, while model-based methods usually assume that an underlying model governs the way users rate and, in general, perform better than memory-based methods. Despite the success of various model-based methods [Si and Jin, 2003; Hofmann, 2004], matrix factorization (MF) based models have become one of the most popular methods due to their good performance and efficiency in handling large datasets [Srebro et al., 2004; Mnih and Salakhutdinov, 2007; Koren et al., 2009; Gu et al., 2010; Tang et al., 2013; Gao et al., 2013].

Figure 1: Netflix Movie Hierarchical Structure and half.com Book Hierarchical Structure. Panels: (a) Faith & Spirituality, (b) Music & Musicals, (c) half.com.

Items in real-world recommender systems could exhibit certain hierarchical structures. For example, Figure 1(a) and 1(b) are two snapshots from the Netflix DVD rental page¹. In the figure, movies are classified into a hierarchical structure as genre→subgenre→detailed-category. For example, the movie Schindler's List first falls into the genre Faith & Spirituality, under which it belongs to the sub-genre Faith & Spirituality Feature Films and is further categorized as Inspirational Stories (see the hierarchical structure shown in Fig. 1(a)). Similarly, Fig. 1(c) shows an Antiques & Collectibles category from half.com². We can also observe hierarchical structures, i.e., category→sub-category. For example, the book Make Your Own Working Paper Clock belongs to Clocks & Watches, which is a sub-category of Antiques & Collectibles. In addition to hierarchical structures of items, users' preferences also present hierarchical structures, which have been widely used in the research of decision making [Moreno-Jimenez and Vargas, 1993]. For example, a user may generally prefer movies in Faith & Spirituality, and more specifically, he/she watches movies under the sub-category of Inspirational Stories. Similarly, an antique clock collector may be interested in the Clocks & Watches sub-category under the Antiques & Collectibles category.

¹ Snapshots are from http://dvd.netflix.com/AllGenresList
² Snapshot is from http://books.products.half.ebay.com/antiques-collectibles W0QQcZ4QQcatZ218176

Items in the same hierarchical layer are likely to share similar properties, hence they are likely to receive similar rating scores. Similarly, users in the same hierarchical layer are likely to share similar preferences, thus they are likely to rate certain items similarly [Lu et al., 2012; Maleszka et al., 2013]. Therefore, recently, there are recommender systems exploiting explicit hierarchical structures of items or users to improve recommendation performance [Lu et al., 2012; Maleszka et al., 2013]. However, explicit hierarchical structures are usually unavailable, especially those of users.

The gap between the importance of hierarchical structures and their unavailability motivates us to study implicit hierarchical structures of users and items for recommendation. In particular, we investigate the following two challenges: (1) how to capture implicit hierarchical structures of users and items simultaneously when these structures are explicitly unavailable? and (2) how to model them mathematically for recommendation? In our attempt to address these two challenges, we propose a novel recommendation framework HSR, which captures implicit hierarchical structures of users and items based on the user-item matrix and integrates them into a coherent model. The major contributions of this paper are summarized next:

• We provide a principled approach to model implicit hierarchical structures of users and items simultaneously based on the user-item matrix;
• We propose a novel recommendation framework HSR, which enables us to capture implicit hierarchical structures of users and items when these structures are not explicitly available; and
• We conduct experiments on two real-world recommendation datasets to demonstrate the effectiveness of the proposed framework.

The rest of the paper is organized as follows. In Section 2, we introduce the proposed framework HSR with the details of how to capture implicit hierarchical structures of users and items. In Section 3, we present a method to solve the optimization problem of HSR along with the convergence and time complexity analysis. In Section 4, we show empirical evaluation with discussion. In Section 5, we present the conclusion and future work.

2 The Proposed Framework

Throughout this paper, matrices are written as boldface capital letters such as $\mathbf{A}$ and $\mathbf{B}_i$. For an arbitrary matrix $\mathbf{M}$, $\mathbf{M}(i,j)$ denotes the $(i,j)$-th entry of $\mathbf{M}$, $\|\mathbf{M}\|_F$ is the Frobenius norm of $\mathbf{M}$, and $Tr(\mathbf{M})$ is the trace of $\mathbf{M}$ if $\mathbf{M}$ is a square matrix. Let $\mathcal{U} = \{u_1, u_2, \dots, u_n\}$ be the set of $n$ users and $\mathcal{V} = \{v_1, v_2, \dots, v_m\}$ be the set of $m$ items. We use $\mathbf{X} \in \mathbb{R}^{n \times m}$ to denote the user-item rating matrix, where $\mathbf{X}(i,j)$ is the rating score from $u_i$ to $v_j$ if $u_i$ rates $v_j$, otherwise $\mathbf{X}(i,j) = 0$. We do not assume the availability of hierarchical structures of users and items, hence the input of the studied problem is only the user-item rating matrix $\mathbf{X}$, which is the same as that of traditional recommender systems. Before going into details about how to model implicit hierarchical structures of users and items, we would like to first introduce the basic model of the proposed framework.

2.1 The Basic Model

In this work, we choose weighted nonnegative matrix factorization (WNMF) as the basic model of the proposed framework, which is one of the most popular models to build recommender systems and has been proven to be effective in handling large and sparse datasets [Zhang et al., 2006]. WNMF decomposes the rating matrix into two nonnegative low rank matrices $\mathbf{U} \in \mathbb{R}^{n \times d}$ and $\mathbf{V} \in \mathbb{R}^{d \times m}$, where $\mathbf{U}$ is the user preference matrix with $\mathbf{U}(i,:)$ being the preference vector of $u_i$, and $\mathbf{V}$ is the item characteristic matrix with $\mathbf{V}(:,j)$ being the characteristic vector of $v_j$. Then a rating score from $u_i$ to $v_j$ is modeled as $\mathbf{X}(i,j) = \mathbf{U}(i,:)\mathbf{V}(:,j)$ by WNMF. $\mathbf{U}$ and $\mathbf{V}$ can be learned by solving the following optimization problem:

$$\min_{\mathbf{U} \ge 0,\, \mathbf{V} \ge 0} \|\mathbf{W} \odot (\mathbf{X} - \mathbf{U}\mathbf{V})\|_F^2 + \beta\left(\|\mathbf{U}\|_F^2 + \|\mathbf{V}\|_F^2\right) \tag{1}$$

where $\odot$ denotes the Hadamard product and $\mathbf{W}(i,j)$ controls the contribution of $\mathbf{X}(i,j)$ to the learning process. A popular choice of $\mathbf{W}$ is $\mathbf{W}(i,j) = 1$ if $u_i$ rates $v_j$, and $\mathbf{W}(i,j) = 0$ otherwise.

2.2 Modeling Implicit Hierarchical Structures

In weighted nonnegative matrix factorization, the user preference matrix $\mathbf{U}$ and the item characteristic matrix $\mathbf{V}$ can indicate implicit flat structures of users and items respectively, which have been widely used to identify communities of users [Wang et al., 2011] and clusters of items [Xu et al., 2003]. Since both $\mathbf{U}$ and $\mathbf{V}$ are nonnegative, we can further perform nonnegative matrix factorization on them, which may pave the way to model implicit hierarchical structures of users and items for recommendation. In this subsection, we first give details about how to model implicit hierarchical structures based on weighted nonnegative matrix factorization, and then introduce the proposed framework HSR.

The item characteristic matrix $\mathbf{V} \in \mathbb{R}^{d \times m}$ indicates the affiliation of $m$ items to $d$ latent categories. Since $\mathbf{V}$ is nonnegative, we can further decompose $\mathbf{V}$ into two nonnegative matrices $\mathbf{V}_1 \in \mathbb{R}^{m_1 \times m}$ and $\tilde{\mathbf{V}}_2 \in \mathbb{R}^{d \times m_1}$ to get a 2-layer implicit hierarchical structure of items as shown in Figure 2(a):

$$\mathbf{V} \approx \tilde{\mathbf{V}}_2 \mathbf{V}_1 \tag{2}$$

where $m_1$ is the number of latent sub-categories in the 2nd layer and $\mathbf{V}_1$ indicates the affiliation of $m$ items to $m_1$ latent sub-categories. We name $\tilde{\mathbf{V}}_2$ the latent category affiliation matrix for the 2-layer implicit hierarchical structure because it indicates the affiliation relation between $d$ latent categories in the 1st layer and $m_1$ latent sub-categories in the 2nd layer. Since $\tilde{\mathbf{V}}_2$ is nonnegative, we can further decompose the latent category affiliation matrix $\tilde{\mathbf{V}}_2$ into $\mathbf{V}_2 \in \mathbb{R}^{m_2 \times m_1}$ and $\tilde{\mathbf{V}}_3 \in \mathbb{R}^{d \times m_2}$ to get a 3-layer implicit hierarchical structure of items as shown in Figure 2(b):

$$\mathbf{V} \approx \tilde{\mathbf{V}}_3 \mathbf{V}_2 \mathbf{V}_1 \tag{3}$$
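As a small illustration of Eq. (1) and of the layered item factorization in Eq. (3), the following sketch evaluates the WNMF objective on a toy rating matrix and composes a 3-layer item hierarchy. It is a minimal sketch rather than the authors' implementation: the function names `wnmf_objective` and `compose`, the list-of-rows matrix representation, and all numeric values are assumptions made for this example.

```python
def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def wnmf_objective(x, w, u, v, beta):
    """Value of Eq. (1) for given factors u (n x d) and v (d x m)."""
    uv = matmul(u, v)
    fit = sum(
        (w[i][j] * (x[i][j] - uv[i][j])) ** 2
        for i in range(len(x))
        for j in range(len(x[0]))
    )
    reg = sum(e * e for row in u for e in row) + sum(e * e for row in v for e in row)
    return fit + beta * reg


def compose(layers):
    """Multiply a chain of layer matrices left to right, e.g. [V3_tilde, V2, V1]."""
    out = layers[0]
    for layer in layers[1:]:
        out = matmul(out, layer)
    return out


# Toy data (made up): 2 users, 3 items; 0 marks a missing rating.
X = [[1, 0, 3], [0, 4, 0]]
W = [[1 if r != 0 else 0 for r in row] for row in X]  # W(i, j) = 1 iff u_i rated v_j
U = [[1], [2]]   # n x d with d = 1
V = [[1, 2, 3]]  # d x m

# A 3-layer item hierarchy in the shape of Eq. (3): d = 1, m2 = 2, m1 = 2, m = 3.
V3_tilde = [[1, 1]]          # d  x m2
V2 = [[1, 0], [0, 1]]        # m2 x m1
V1 = [[1, 0, 1], [0, 2, 2]]  # m1 x m
V_deep = compose([V3_tilde, V2, V1])  # has the same d x m shape as V

print(wnmf_objective(X, W, U, V, beta=0.0))
print(wnmf_objective(X, W, U, V, beta=0.5))
print(V_deep)
```

Only the observed entries contribute to the fit term, so the missing ratings (stored as 0) do not pull the reconstruction toward zero.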

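Let $\tilde{\mathbf{V}}_{q-1}$ be the latent category affiliation matrix for the $(q-1)$-layer implicit hierarchical structure. The aforementioned process can be generalized to get the $q$-layer implicit hierarchical structure from the $(q-1)$-layer implicit hierarchical structure by further factorizing $\tilde{\mathbf{V}}_{q-1}$ into two nonnegative matrices as shown in Figure 2(c):

$$\mathbf{V} \approx \mathbf{V}_q \mathbf{V}_{q-1} \cdots \mathbf{V}_2 \mathbf{V}_1 \tag{4}$$

Figure 2: Implicit Hierarchical Structures of Items via Deeply Factorizing the Item Characteristic Matrix. Panels: (a) 2 Layers, (b) 3 Layers, (c) q Layers.

Similarly, to model a $p$-layer user implicit hierarchical structure, we can perform a deep factorization on $\mathbf{U}$ as

$$\mathbf{U} \approx \mathbf{U}_1 \mathbf{U}_2 \cdots \mathbf{U}_{p-1} \mathbf{U}_p \tag{5}$$

where $\mathbf{U}_1$ is an $n \times n_1$ matrix, $\mathbf{U}_i$ $(1 < i < p)$ is an $n_{i-1} \times n_i$ matrix and $\mathbf{U}_p$ is an $n_{p-1} \times d$ matrix.

With model components to model implicit hierarchical structures of items and users, the framework HSR is proposed to solve the following optimization problem

$$\min_{\mathbf{U}_1,\dots,\mathbf{U}_p,\mathbf{V}_1,\dots,\mathbf{V}_q} \|\mathbf{W} \odot (\mathbf{X} - \mathbf{U}_1 \cdots \mathbf{U}_p \mathbf{V}_q \cdots \mathbf{V}_1)\|_F^2 + \lambda\Big(\sum_{i=1}^{p} \|\mathbf{U}_i\|_F^2 + \sum_{j=1}^{q} \|\mathbf{V}_j\|_F^2\Big)$$
$$\text{s.t. } \mathbf{U}_i \ge 0,\ i \in \{1, 2, \dots, p\}, \quad \mathbf{V}_j \ge 0,\ j \in \{1, 2, \dots, q\} \tag{6}$$

An illustration of the proposed framework HSR is demonstrated in Figure 3. The proposed framework HSR performs deep factorizations on the user preference matrix $\mathbf{U}$ and the item characteristic matrix $\mathbf{V}$ to model implicit hierarchical structures of users and items, respectively, while the original WNMF based recommender system only models flat structures, as shown in the inner dashed box in Figure 3.

Figure 3: An Illustration of The Proposed Framework HSR.

3 An Optimization Method for HSR

The objective function in Eq.(6) is not convex if we update all the variables jointly, but it is convex if we update the variables alternately. We will first introduce our optimization method for HSR based on an alternating scheme in [Trigeorgis et al., 2014] and then we will give convergence analysis and complexity analysis of the optimization method.

3.1 Inferring Parameters of HSR

Update Rule of $\mathbf{U}_i$

To update $\mathbf{U}_i$, we fix the other variables except $\mathbf{U}_i$. By removing terms that are irrelevant to $\mathbf{U}_i$, Eq.(6) can be rewritten as:

$$\min_{\mathbf{U}_i \ge 0} \|\mathbf{W} \odot (\mathbf{X} - \mathbf{A}_i \mathbf{U}_i \mathbf{H}_i)\|_F^2 + \lambda \|\mathbf{U}_i\|_F^2 \tag{7}$$

where $\mathbf{A}_i$ and $\mathbf{H}_i$, $1 \le i \le p$, are defined as:

$$\mathbf{A}_i = \begin{cases} \mathbf{U}_1 \mathbf{U}_2 \cdots \mathbf{U}_{i-1} & \text{if } i \ne 1 \\ \mathbf{I} & \text{if } i = 1 \end{cases} \tag{8}$$

and

$$\mathbf{H}_i = \begin{cases} \mathbf{U}_{i+1} \cdots \mathbf{U}_p \mathbf{V}_q \cdots \mathbf{V}_1 & \text{if } i \ne p \\ \mathbf{V}_q \cdots \mathbf{V}_1 & \text{if } i = p \end{cases} \tag{9}$$

The Lagrangian function of Eq.(7) is

$$\mathcal{L}(\mathbf{U}_i) = \|\mathbf{W} \odot (\mathbf{X} - \mathbf{A}_i \mathbf{U}_i \mathbf{H}_i)\|_F^2 + \lambda \|\mathbf{U}_i\|_F^2 - Tr(\mathbf{P}^T \mathbf{U}_i) \tag{10}$$

where $\mathbf{P}$ is the Lagrangian multiplier. The derivative of $\mathcal{L}(\mathbf{U}_i)$ with respect to $\mathbf{U}_i$ is

$$\frac{\partial \mathcal{L}(\mathbf{U}_i)}{\partial \mathbf{U}_i} = 2\mathbf{A}_i^T \left[\mathbf{W} \odot (\mathbf{A}_i \mathbf{U}_i \mathbf{H}_i - \mathbf{X})\right] \mathbf{H}_i^T + 2\lambda \mathbf{U}_i - \mathbf{P} \tag{11}$$

By setting the derivative to zero and using the Karush-Kuhn-Tucker complementary condition [Boyd and Vandenberghe, 2004], i.e., $\mathbf{P}(s,t)\mathbf{U}_i(s,t) = 0$, we get:

$$\left[\mathbf{A}_i^T \left[\mathbf{W} \odot (\mathbf{A}_i \mathbf{U}_i \mathbf{H}_i - \mathbf{X})\right] \mathbf{H}_i^T + \lambda \mathbf{U}_i\right](s,t)\, \mathbf{U}_i(s,t) = 0 \tag{12}$$

Eq.(12) leads to the following update rule of $\mathbf{U}_i$:

$$\mathbf{U}_i(s,t) \leftarrow \mathbf{U}_i(s,t) \sqrt{\frac{\left[\mathbf{A}_i^T (\mathbf{W} \odot \mathbf{X}) \mathbf{H}_i^T\right](s,t)}{\left[\mathbf{A}_i^T (\mathbf{W} \odot (\mathbf{A}_i \mathbf{U}_i \mathbf{H}_i)) \mathbf{H}_i^T + \lambda \mathbf{U}_i\right](s,t)}} \tag{13}$$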
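The update in Eq. (13) can be sketched in a few lines. The block below builds $\mathbf{A}_i$ and $\mathbf{H}_i$ as in Eqs. (8) and (9), applies one multiplicative step to a chosen $\mathbf{U}_i$, and compares the objective of Eq. (7) before and after. It is a minimal sketch under assumptions, not the authors' code: the helper names (`chain`, `update_ui`, `objective_ui`), the small constant `eps` guarding against division by zero, the value `lam = 0.1`, and the toy matrices are all hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def hadamard(a, b):
    """Element-wise product of two equally shaped matrices."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def chain(mats, size):
    """Left-to-right product of mats; the size x size identity if mats is empty."""
    out = [[1.0 if r == c else 0.0 for c in range(size)] for r in range(size)]
    for m in mats:
        out = matmul(out, m)
    return out


def objective_ui(x, w, a, ui, h, lam):
    """Eq. (7): ||W * (X - A_i U_i H_i)||_F^2 + lam * ||U_i||_F^2."""
    pred = matmul(matmul(a, ui), h)
    diff = [[xv - pv for xv, pv in zip(xr, pr)] for xr, pr in zip(x, pred)]
    fit = sum(e * e for row in hadamard(w, diff) for e in row)
    return fit + lam * sum(e * e for row in ui for e in row)


def update_ui(x, w, us, vs, i, lam, eps=1e-12):
    """One multiplicative step of Eq. (13) for us[i] (0-based); vs = [V_1, ..., V_q]."""
    ui = us[i]
    a = chain(us[:i], len(x))                     # A_i, Eq. (8)
    h = chain(us[i + 1:] + vs[::-1], len(ui[0]))  # H_i, Eq. (9)
    at, ht = transpose(a), transpose(h)
    num = matmul(matmul(at, hadamard(w, x)), ht)
    den = matmul(matmul(at, hadamard(w, matmul(matmul(a, ui), h))), ht)
    new_ui = [
        [ui[s][t] * math.sqrt(num[s][t] / (den[s][t] + lam * ui[s][t] + eps))
         for t in range(len(ui[0]))]
        for s in range(len(ui))
    ]
    return new_ui, a, h


# Toy problem (made-up values) with p = 2 user layers and q = 2 item layers.
X = [[5, 3, 0, 1], [4, 0, 0, 1], [1, 1, 0, 5]]
W = [[1 if r != 0 else 0 for r in row] for row in X]
U1 = [[0.9, 0.2], [0.8, 0.3], [0.1, 0.9]]                          # n  x n1
U2 = [[1.0, 0.5], [0.4, 1.2]]                                      # n1 x d
V2 = [[0.7, 0.2, 0.6], [0.3, 0.9, 0.5]]                            # d  x m1
V1 = [[1.0, 0.5, 0.2, 0.1], [0.2, 0.4, 0.3, 1.0], [0.6, 0.6, 0.1, 0.4]]  # m1 x m

new_U2, A2, H2 = update_ui(X, W, [U1, U2], [V1, V2], i=1, lam=0.1)
before = objective_ui(X, W, A2, U2, H2, 0.1)
after = objective_ui(X, W, A2, new_U2, H2, 0.1)
print(before, after)
```

Because the step multiplies each entry by a nonnegative factor, a nonnegative starting point stays nonnegative, so the constraint in Eq. (7) is maintained without an explicit projection.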

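Update Rule of $\mathbf{V}_i$

Similarly, to update $\mathbf{V}_i$, we fix the other variables except $\mathbf{V}_i$. By removing terms that are irrelevant to $\mathbf{V}_i$, the optimization problem for $\mathbf{V}_i$ is:

$$\min_{\mathbf{V}_i \ge 0} \|\mathbf{W} \odot (\mathbf{X} - \mathbf{B}_i \mathbf{V}_i \mathbf{M}_i)\|_F^2 + \lambda \|\mathbf{V}_i\|_F^2 \tag{14}$$

where $\mathbf{B}_i$ and $\mathbf{M}_i$, $1 \le i \le q$, are defined as

$$\mathbf{B}_i = \begin{cases} \mathbf{U}_1 \cdots \mathbf{U}_p \mathbf{V}_q \cdots \mathbf{V}_{i+1} & \text{if } i \ne q \\ \mathbf{U}_1 \cdots \mathbf{U}_p & \text{if } i = q \end{cases} \tag{15}$$

and

$$\mathbf{M}_i = \begin{cases} \mathbf{V}_{i-1} \cdots \mathbf{V}_1 & \text{if } i \ne 1 \\ \mathbf{I} & \text{if } i = 1 \end{cases} \tag{16}$$

We can follow a similar way as for $\mathbf{U}_i$ to derive the update rule for $\mathbf{V}_i$:

$$\mathbf{V}_i(s,t) \leftarrow \mathbf{V}_i(s,t) \sqrt{\frac{\left[\mathbf{B}_i^T (\mathbf{W} \odot \mathbf{X}) \mathbf{M}_i^T\right](s,t)}{\left[\mathbf{B}_i^T (\mathbf{W} \odot (\mathbf{B}_i \mathbf{V}_i \mathbf{M}_i)) \mathbf{M}_i^T + \lambda \mathbf{V}_i\right](s,t)}} \tag{17}$$

With the update rules for $\mathbf{U}_i$ and $\mathbf{V}_j$, the optimization algorithm for HSR is shown in Algorithm 1. Next we briefly review Algorithm 1. In order to expedite the approximation of the factors in HSR, we pre-train each layer to have an initial approximation of the matrices $\mathbf{U}_i$ and $\mathbf{V}_i$. To perform pretraining, we first use WNMF [Zhang et al., 2006] to decompose the user-item rating matrix into $\tilde{\mathbf{U}}_1 \tilde{\mathbf{V}}_1$ by solving Eq.(1). After that, we further decompose $\tilde{\mathbf{U}}_1$ into $\tilde{\mathbf{U}}_1 \approx \mathbf{U}_1 \tilde{\mathbf{U}}_2$ and $\tilde{\mathbf{V}}_1 \approx \tilde{\mathbf{V}}_2 \mathbf{V}_1$ using nonnegative matrix factorization. We keep the decomposition process until we have $p$ user layers and $q$ item layers. This initializing process is summarized in Algorithm 1 from line 1 to line 9. After initialization, we will do fine-tuning by updating the $\mathbf{U}_i$ and $\mathbf{V}_i$ using the updating rules in Eq.(13) and Eq.(17) separately. The procedure is to first update the $\mathbf{V}_i$ in sequence and then the $\mathbf{U}_i$ in sequence, alternately, which is summarized in Algorithm 1 from line 10 to line 20. In line 21, we reconstruct the user-item matrix as $\mathbf{X}_{pred} = \mathbf{U}_1 \cdots \mathbf{U}_p \mathbf{V}_q \cdots \mathbf{V}_1$. A missing rating from $u_i$ to $v_j$ will be predicted as $\mathbf{X}_{pred}(i,j)$³.

³ The code can be downloaded from http://www.public.asu.edu/∼swang187/

Algorithm 1 The Optimization Algorithm for the Proposed Framework HSR
Input: $\mathbf{X} \in \mathbb{R}^{n \times m}$, $\lambda$, $p$, $q$, $d$ and dimensions of each layer
Output: $\mathbf{X}_{pred}$
 1: Initialize $\{\mathbf{U}_i\}_{i=1}^{p}$ and $\{\mathbf{V}_i\}_{i=1}^{q}$
 2: $\tilde{\mathbf{U}}_1, \tilde{\mathbf{V}}_1 \leftarrow$ WNMF($\mathbf{X}$, $d$)
 3: for $i = 1$ to $p-1$ do
 4:     $\mathbf{U}_i, \tilde{\mathbf{U}}_{i+1} \leftarrow$ NMF($\tilde{\mathbf{U}}_i$, $n_i$)
 5: end for
 6: for $i = 1$ to $q-1$ do
 7:     $\tilde{\mathbf{V}}_{i+1}, \mathbf{V}_i \leftarrow$ NMF($\tilde{\mathbf{V}}_i$, $m_i$)
 8: end for
 9: $\mathbf{U}_p = \tilde{\mathbf{U}}_p$, $\mathbf{V}_q = \tilde{\mathbf{V}}_q$
10: repeat
11:     for $i = 1$ to $q$ do
12:         update $\mathbf{B}_i$ and $\mathbf{M}_i$ using Eq.(15) and Eq.(16)
13:         update $\mathbf{V}_i$ by Eq.(17)
14:     end for
15:
16:     for $i = p$ to $1$ do
17:         update $\mathbf{A}_i$ and $\mathbf{H}_i$ using Eq.(8) and Eq.(9)
18:         update $\mathbf{U}_i$ by Eq.(13)
19:     end for
20: until Stopping criterion is reached
21: predict rating matrix $\mathbf{X}_{pred} = \mathbf{U}_1 \cdots \mathbf{U}_p \mathbf{V}_q \cdots \mathbf{V}_1$

3.2 Convergence Analysis

In this subsection, we will investigate the convergence of Algorithm 1. Following [Lee and Seung, 2001], we will use the auxiliary function approach to prove the convergence of the algorithm.

Definition [Lee and Seung, 2001] $G(h, h')$ is an auxiliary function for $F(h)$ if the conditions

$$G(h, h') \ge F(h), \quad G(h, h) = F(h) \tag{18}$$

are satisfied.

Lemma 3.1 [Lee and Seung, 2001] If $G$ is an auxiliary function for $F$, then $F$ is non-increasing under the update

$$h^{(t+1)} = \arg\min_{h} G(h, h^{(t)}) \tag{19}$$

Proof $F(h^{(t+1)}) \le G(h^{(t+1)}, h^{(t)}) \le G(h^{(t)}, h^{(t)}) = F(h^{(t)})$.

Lemma 3.2 [Ding et al., 2006] For any matrices $\mathbf{A} \in \mathbb{R}_+^{n \times n}$, $\mathbf{B} \in \mathbb{R}_+^{k \times k}$, $\mathbf{S} \in \mathbb{R}_+^{n \times k}$, $\mathbf{S}' \in \mathbb{R}_+^{n \times k}$, where $\mathbf{A}$ and $\mathbf{B}$ are symmetric, the following inequality holds

$$\sum_{s=1}^{n} \sum_{t=1}^{k} \frac{(\mathbf{A}\mathbf{S}'\mathbf{B})(s,t)\, \mathbf{S}^2(s,t)}{\mathbf{S}'(s,t)} \ge Tr(\mathbf{S}^T \mathbf{A} \mathbf{S} \mathbf{B}) \tag{20}$$

Now consider the objective function in Eq.(7). It can be written in the following form by expanding the quadratic terms and removing terms that are irrelevant to $\mathbf{U}_i$:

$$\mathcal{J}(\mathbf{U}_i) = Tr\left(-2\mathbf{A}_i^T (\mathbf{W} \odot \mathbf{X}) \mathbf{H}_i^T \mathbf{U}_i^T\right) + Tr\left(\mathbf{A}_i^T \left(\mathbf{W} \odot (\mathbf{A}_i \mathbf{U}_i \mathbf{H}_i)\right) \mathbf{H}_i^T \mathbf{U}_i^T\right) + Tr(\lambda \mathbf{U}_i \mathbf{U}_i^T) \tag{21}$$

Theorem 3.3 The following function

$$G(\mathbf{U}, \mathbf{U}') = -2\sum_{s,t} \left(\mathbf{A}_i^T (\mathbf{W} \odot \mathbf{X}) \mathbf{H}_i^T\right)(s,t)\, \mathbf{U}_i'(s,t) \left(1 + \log\frac{\mathbf{U}_i(s,t)}{\mathbf{U}_i'(s,t)}\right) + \sum_{s,t} \frac{\left(\mathbf{A}_i^T \left(\mathbf{W} \odot (\mathbf{A}_i \mathbf{U}_i' \mathbf{H}_i)\right) \mathbf{H}_i^T\right)(s,t)\, \mathbf{U}_i^2(s,t)}{\mathbf{U}_i'(s,t)} + Tr(\lambda \mathbf{U}_i \mathbf{U}_i^T) \tag{22}$$

is an auxiliary function for $\mathcal{J}(\mathbf{U}_i)$. Furthermore, it is a convex function in $\mathbf{U}_i$ and its global minimum is

$$\mathbf{U}_i(s,t) \leftarrow \mathbf{U}_i(s,t) \sqrt{\frac{\left[\mathbf{A}_i^T (\mathbf{W} \odot \mathbf{X}) \mathbf{H}_i^T\right](s,t)}{\left[\mathbf{A}_i^T (\mathbf{W} \odot (\mathbf{A}_i \mathbf{U}_i \mathbf{H}_i)) \mathbf{H}_i^T + \lambda \mathbf{U}_i\right](s,t)}} \tag{23}$$

Proof The proof is similar to that in [Gu et al., 2010] and thus we omit the details.

Theorem 3.4 Updating $\mathbf{U}_i$ with Eq.(13) will monotonically decrease the value of the objective in Eq.(6).

Proof With Lemma 3.1 and Theorem 3.3, we have $\mathcal{J}(\mathbf{U}_i^{(0)}) = G(\mathbf{U}_i^{(0)}, \mathbf{U}_i^{(0)}) \ge G(\mathbf{U}_i^{(1)}, \mathbf{U}_i^{(0)}) \ge \mathcal{J}(\mathbf{U}_i^{(1)}) \ge \dots$. That is, $\mathcal{J}(\mathbf{U}_i)$ decreases monotonically.

Similarly, the update rule for $\mathbf{V}_i$ will also monotonically decrease the value of the objective in Eq.(6). Since the value of the objective in Eq.(6) is at least bounded by zero, we can conclude that the optimization method in Algorithm 1 converges.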
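The fine-tuning loop of Algorithm 1 (lines 10 to 21) and the monotonic decrease stated in Theorem 3.4 can be checked numerically on a toy problem. The sketch below stores all factors in one list, `[U_1, ..., U_p, V_q, ..., V_1]`, so that Eq. (13) and Eq. (17) become the same update with a "left" product ($\mathbf{A}_i$ or $\mathbf{B}_i$) and a "right" product ($\mathbf{H}_i$ or $\mathbf{M}_i$). It is an illustration under assumptions, not the authors' released code: the WNMF/NMF pre-training of lines 1 to 9 is replaced by a random nonnegative initialization to keep the example short, a fixed iteration count stands in for the stopping criterion, and the names `fine_tune` and `update_factor`, the `eps` guard, and the values of `lam`, `iters` and `dims` are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def masked(w, m):
    """Hadamard product W * M."""
    return [[wv * mv for wv, mv in zip(wr, mr)] for wr, mr in zip(w, m)]


def product(mats, size):
    """Left-to-right product of mats; the size x size identity if mats is empty."""
    out = [[float(r == c) for c in range(size)] for r in range(size)]
    for m in mats:
        out = matmul(out, m)
    return out


def objective(x, w, factors, lam):
    """Eq. (6) for factors = [U_1, ..., U_p, V_q, ..., V_1]."""
    pred = product(factors, len(x))
    fit = sum(
        (wv * (xv - pv)) ** 2
        for xr, wr, pr in zip(x, w, pred)
        for xv, wv, pv in zip(xr, wr, pr)
    )
    return fit + lam * sum(e * e for f in factors for row in f for e in row)


def update_factor(x, w, factors, k, lam, eps=1e-12):
    """Multiplicative update of factors[k]: Eq. (13) for a U_i, Eq. (17) for a V_i."""
    f = factors[k]
    left = product(factors[:k], len(x))          # A_i or B_i
    right = product(factors[k + 1:], len(f[0]))  # H_i or M_i
    lt, rt = transpose(left), transpose(right)
    num = matmul(matmul(lt, masked(w, x)), rt)
    den = matmul(matmul(lt, masked(w, matmul(matmul(left, f), right))), rt)
    factors[k] = [
        [f[s][t] * math.sqrt(num[s][t] / (den[s][t] + lam * f[s][t] + eps))
         for t in range(len(f[0]))]
        for s in range(len(f))
    ]


def fine_tune(x, w, dims, lam=0.01, iters=30, seed=0):
    """dims = [n, n_1, ..., d, ..., m_1, m]; returns (X_pred, objective history)."""
    rng = random.Random(seed)
    factors = [
        [[rng.uniform(0.1, 1.0) for _ in range(c)] for _ in range(r)]
        for r, c in zip(dims[:-1], dims[1:])
    ]
    history = [objective(x, w, factors, lam)]
    for _ in range(iters):
        # Walking the list backwards visits V_1, ..., V_q and then U_p, ..., U_1,
        # which is the order of lines 11 to 19 in Algorithm 1.
        for k in reversed(range(len(factors))):
            update_factor(x, w, factors, k, lam)
        history.append(objective(x, w, factors, lam))
    return product(factors, len(x)), history


# Toy ratings (made up): 4 users, 5 items, 0 marks a missing rating; p = q = 2.
X = [[5, 3, 0, 1, 4], [4, 0, 0, 1, 5], [1, 1, 0, 5, 0], [0, 2, 4, 4, 1]]
W = [[1 if r != 0 else 0 for r in row] for row in X]
X_pred, history = fine_tune(X, W, dims=[4, 3, 2, 3, 5])
print(history[0], history[-1])
```

Treating every layer as "one factor between a left and a right product" is only a convenience of this sketch; it yields the same quantities as Eqs. (8), (9), (15) and (16).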

3.3 Complexity Analysis

Initialization and fine-tuning are the two most expensive operations in Algorithm 1. For lines 3 to 5, the time complexity of factorizing $\tilde{\mathbf{U}}_i \in \mathbb{R}^{n_{i-1} \times d}$ into $\mathbf{U}_i \in \mathbb{R}^{n_{i-1} \times n_i}$ and $\tilde{\mathbf{U}}_{i+1} \in \mathbb{R}^{n_i \times d}$ is $O(t n_{i-1} n_i d)$ for $1 < i < p$, and $O(t n n_1 d)$ for $i = 1$, where $t$ is the number of iterations the decomposition takes. Thus the cost of initializing the $\mathbf{U}_i$'s is $O(td(n n_1 + n_1 n_2 + \dots + n_{p-2} n_{p-1}))$. Similarly, the cost of initializing the $\mathbf{V}_i$'s is $O(td(m m_1 + m_1 m_2 + \dots + m_{q-2} m_{q-1}))$ (lines 6 to 8). The computational cost of fine-tuning $\mathbf{U}_i$ in each iteration is $O(n n_{i-1} n_i + n n_i m + n_{i-1} n_i m)$. Similarly, the computational cost of fine-tuning $\mathbf{V}_i$ in each iteration is $O(m m_{i-1} m_i + m m_i n + m_{i-1} m_i n)$. Let $n_0 = n$, $m_0 = m$, $n_p = m_q = d$; then the time complexity of fine-tuning is $O\big(t_f\big[(n + m)\big(\sum_{i=1}^{p} n_{i-1} n_i + \sum_{j=1}^{q} m_{j-1} m_j\big) + nm\big(\sum_{i=1}^{p} n_i + \sum_{j=1}^{q} m_j\big)\big]\big)$, where $t_f$ is the number of iterations fine-tuning takes. The overall time complexity is the sum of the costs of initialization and fine-tuning.

4 Experimental Analysis

In this section, we conduct experiments to evaluate the effectiveness of the proposed framework HSR and factors that could affect the performance of HSR. We begin by introducing datasets and experimental settings, then we compare HSR with the state-of-the-art recommendation systems. Further experiments are conducted to investigate the effects of dimensions of layers on HSR.

4.1 Datasets

The experiments are conducted on two publicly available benchmark datasets, i.e., MovieLens100K⁴ and Douban⁵. MovieLens100K consists of 100,000 movie ratings of 943 users for 1682 movies. We filter out users who rated fewer than 20 movies and movies that are rated by fewer than 10 users from the Douban dataset and get a dataset consisting of 149,623 movie ratings of 1371 users and 1967 movies. For both datasets, users can rate movies with scores from 1 to 5. The statistics of the two datasets are summarized in Table 1.

⁴ http://grouplens.org/datasets/movielens/
⁵ http://dl.dropbox.com/u/17517913/Douban.zip

Table 1: Statistics of the Datasets

Dataset          # of users   # of items   # of ratings
MovieLens100K    943          1682         100,000
Douban           1371         1967         149,623

4.2 Evaluation Settings

Two widely used evaluation metrics, i.e., mean absolute error (MAE) and root mean square error (RMSE), are adopted to evaluate the rating prediction performance. Specifically, MAE is defined as

$$MAE = \frac{\sum_{(i,j) \in \mathcal{T}} |\mathbf{X}(i,j) - \tilde{\mathbf{X}}(i,j)|}{|\mathcal{T}|} \tag{24}$$

and RMSE is defined as

$$RMSE = \sqrt{\frac{\sum_{(i,j) \in \mathcal{T}} \left(\mathbf{X}(i,j) - \tilde{\mathbf{X}}(i,j)\right)^2}{|\mathcal{T}|}} \tag{25}$$

where in both metrics, $\mathcal{T}$ denotes the set of ratings we want to predict, $\mathbf{X}(i,j)$ denotes the rating user $i$ gave to item $j$ and $\tilde{\mathbf{X}}(i,j)$ denotes the predicted rating from $u_i$ to $v_j$. We randomly select $x\%$ of the ratings as the training set and the remaining $(100 - x)\%$ as the testing set, where $x$ is varied as $\{40, 60\}$ in this work. The random selection is carried out 10 times independently, and the average MAE and RMSE are reported. A smaller RMSE or MAE value means better performance. Note that previous work demonstrated that small improvement in RMSE or MAE terms can have a significant impact on the quality of the top-few recommendation [Koren, 2008].

4.3 Performance Comparison of Recommender Systems

The comparison results are summarized in Tables 2 and 3 for MAE and RMSE, respectively. The baseline methods in the tables are defined as:

• UCF: UCF is the user-oriented collaborative filtering where the rating from $u_i$ to $v_j$ is predicted as an aggregation of ratings of the $K$ most similar users of $u_i$ to $v_j$. We use the cosine similarity measure to calculate user-user similarity.
• MF: matrix factorization based collaborative filtering tries to decompose the user-item rating matrix into two matrices such that the reconstruction error is minimized [Koren et al., 2009].
• WNMF: weighted nonnegative matrix factorization tries to decompose the weighted rating matrix into two nonnegative matrices to minimize the reconstruction error [Zhang et al., 2006]. In this work, we choose WNMF as the basic model of the proposed framework HSR.
• HSR-Item: HSR-Item is a variant of the proposed framework HSR. HSR-Item only considers the implicit hierarchical structure of items by setting $p = 1$ in HSR.
• HSR-User: HSR-User is a variant of the proposed framework HSR. HSR-User only considers the implicit hierarchical structure of users by setting $q = 1$ in HSR.
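The evaluation protocol of Section 4.2 can be illustrated with a short sketch: Eqs. (24) and (25) computed over a set of held-out (user, item) pairs, plus a random split of the observed ratings. The function names (`mae`, `rmse`, `split_ratings`), the seeding scheme, and the toy numbers are assumptions made for this example and are not taken from the paper's code.

```python
import math
import random


def mae(truth, pred, test_pairs):
    """Eq. (24): mean absolute error over the (i, j) pairs in test_pairs."""
    return sum(abs(truth[i][j] - pred[i][j]) for i, j in test_pairs) / len(test_pairs)


def rmse(truth, pred, test_pairs):
    """Eq. (25): root mean square error over the (i, j) pairs in test_pairs."""
    squared = sum((truth[i][j] - pred[i][j]) ** 2 for i, j in test_pairs)
    return math.sqrt(squared / len(test_pairs))


def split_ratings(x, train_fraction, seed=0):
    """Randomly split the observed (non-zero) entries of x into train and test pairs."""
    observed = [(i, j) for i, row in enumerate(x) for j, r in enumerate(row) if r != 0]
    random.Random(seed).shuffle(observed)
    cut = int(round(train_fraction * len(observed)))
    return observed[:cut], observed[cut:]


# Toy example (made-up numbers): true ratings, predictions, all four pairs held out.
truth = [[5, 3], [4, 1]]
pred = [[4, 3], [4, 3]]
pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(mae(truth, pred, pairs), rmse(truth, pred, pairs))

# A 40% / 60% split of ten observed ratings; repeating with different seeds and
# averaging the metrics mirrors the ten independent runs described above.
X = [[5, 3, 0, 1, 4], [4, 0, 0, 1, 5], [1, 1, 0, 5, 0]]
train, test = split_ratings(X, 0.4, seed=1)
print(len(train), len(test))
```

Both metrics are computed only over the held-out pairs, so unobserved entries (stored as 0) never enter the error.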

Table 2: MAE comparison on MovieLens100K and Douban. Columns: UCF, MF, WNMF, HSR-User, HSR-Item, HSR; rows: 40% and 60% training data for each dataset. Only the HSR column can be placed reliably from the source: 0.7469 (MovieLens100K, 40%), 0.7286 (MovieLens100K, 60%), 0.5767 (Douban, 40%), 0.5685 (Douban, 60%).

Table 3: RMSE comparison on MovieLens100K and Douban. Columns: UCF, MF, WNMF, HSR-User, HSR-Item, HSR; rows: 40% and 60% training data for each dataset. Only the HSR column can be placed reliably from the source: 0.9578 (MovieLens100K, 40%), 0.9325 (MovieLens100K, 60%), 0.7284 (Douban, 40%), 0.7179 (Douban, 60%).

Note that parameters of all methods are determined via cross validation. Based on the results, we make the following observations:

• In general, matrix factorization based recommender systems outperform the user-oriented CF method and this observation is consistent with that in [Koren et al., 2009].
• Both HSR-Item and HSR-User obtain better results than WNMF. We perform a t-test on these results, which suggests that the improvement is significant. These results indicate that the implicit hierarchical structures of users and items can improve the recommendation performance.
• HSR consistently outperforms both HSR-Item and HSR-User. These results suggest that implicit hierarchical structures of users and items contain complementary information and capturing them simultaneously can further improve the recommendation performance.

4.4 Parameter Analysis

In this subsection, we investigate the impact of dimensions of implicit layers on the performance of the proposed framework HSR. We only show results with $p = 2$ and $q = 2$, i.e., $\mathbf{W} \odot \mathbf{X} \approx \mathbf{W} \odot (\mathbf{U}_1 \mathbf{U}_2 \mathbf{V}_2 \mathbf{V}_1)$ with $\mathbf{U}_1 \in \mathbb{R}^{n \times n_1}$, $\mathbf{U}_2 \in \mathbb{R}^{n_1 \times d}$, $\mathbf{V}_2 \in \mathbb{R}^{d \times m_1}$, and $\mathbf{V}_1 \in \mathbb{R}^{m_1 \times m}$, since we have similar observations with other settings of $p$ and $q$. We fix $d$ to be 20 and vary the value of $n_1$ as $\{100, 200, 300, 400, 500\}$ and the value of $m_1$ as $\{200, 400, 600, 800, 1000\}$. We only show results with 60% of the datasets as training sets due to the page limitation and the results are shown in Figure 4. In general, when we increase the numbers of dimensions, the performance tends to first increase and then decrease. Among $n_1$ and $m_1$, the performance is relatively sensitive to $m_1$.

Figure 4: Parameter Analysis for HSR. Panels: (a) RMSE for Douban 60%, (b) MAE for Douban 60%, (c) RMSE for MovieLens100K 60%, (d) MAE for MovieLens100K 60%; each panel plots the metric against $n_1$ and $m_1$.

5 Conclusion

In this paper, we study the problem of exploiting the implicit hierarchical structures of items and users for recommendation when they are not explicitly available and propose a novel recommendation framework HSR, which captures the implicit hierarchical structures of items and users in a coherent model. Experimental results on two real-world datasets demonstrate the importance of the implicit hierarchical structures of items and those of users in the recommendation performance improvement.

There are several interesting directions needing further investigation. First, in this work, we choose the weighted nonnegative matrix factorization as our basic model to capture the implicit hierarchical structures of items and users and we would like to investigate other basic models. Second, since social networks are pervasively available in social media and provide independent sources for recommendation, we will investigate how to incorporate social network information into the proposed framework.

6 Acknowledgements

Suhang Wang, Jiliang Tang and Huan Liu are supported by, or in part by, the National Science Foundation (NSF) under grant number IIS-1217466 and the U.S. Army Research

Office (ARO) under contract/grant number 025071. Yilin Wang is supported by the NSF under contract/grant number 1135656. Any opinions expressed in this material are those of the authors and do not necessarily reflect the views of the NSF and ARO.

References

[Boyd and Vandenberghe, 2004] Stephen Boyd and Lieven Vandenberghe. Convex optimization. Cambridge university press, 2004.

[Ding et al., 2006] Chris Ding, Tao Li, Wei Peng, and Haesun Park. Orthogonal nonnegative matrix t-factorizations for clustering. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 126–135. ACM, 2006.

[Gao et al., 2013] Huiji Gao, Jiliang Tang, Xia Hu, and Huan Liu. Exploring temporal effects for location recommendation on location-based social networks. In Proceedings of the 7th ACM conference on Recommender systems, pages 93–100. ACM, 2013.

[Gu et al., 2010] Quanquan Gu, Jie Zhou, and Chris HQ Ding. Collaborative filtering: Weighted nonnegative matrix factorization incorporating user and item graphs. In SDM, pages 199–210. SIAM, 2010.

[Herlocker et al., 1999] Jonathan L Herlocker, Joseph A Konstan, Al Borchers, and John Riedl. An algorithmic framework for performing collaborative filtering. In Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, pages 230–237. ACM, 1999.

[Hofmann, 2004] Thomas Hofmann. Latent semantic models for collaborative filtering. ACM Transactions on Information Systems (TOIS), 22(1):89–115, 2004.

[Koren et al., 2009] Yehuda Koren, Robert Bell, and Chris Volinsky. Matrix factorization techniques for recommender systems. Computer, 42(8):30–37, 2009.

[Koren, 2008] Yehuda Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 426–434. ACM, 2008.

[Koren, 2010] Yehuda Koren. Collaborative filtering with temporal dynamics. Communications of the ACM, 53(4):89–97, 2010.

[Lee and Seung, 2001] Daniel D Lee and H Sebastian Seung. Algorithms for non-negative matrix factorization. In Advances in neural information processing systems, pages 556–562, 2001.

[Lu et al., 2012] Kai Lu, Guanyuan Zhang, Rui Li, Shuai Zhang, and Bin Wang. Exploiting and exploring hierarchical structure in music recommendation. In Information Retrieval Technology, pages 211–225. Springer, 2012.

[Maleszka et al., 2013] Marcin Maleszka, Bernadetta Mianowska, and Ngoc Thanh Nguyen. A method for collaborative recommendation using knowledge integration tools and hierarchical structure of user profiles. Knowledge-Based Systems, 47:1–13, 2013.

[Mnih and Salakhutdinov, 2007] Andriy Mnih and Ruslan Salakhutdinov. Probabilistic matrix factorization. In Advances in neural information processing systems, pages 1257–1264, 2007.

[Moreno-Jimenez and Vargas, 1993] Jose Maria Moreno-Jimenez and Luis G Vargas. A probabilistic study of preference structures in the analytic hierarchy process with interval judgments. Mathematical and Computer Modelling, 17(4):73–81, 1993.

[Resnick and Varian, 1997] Paul Resnick and Hal R Varian. Recommender systems. Communications of the ACM, 40(3):56–58, 1997.

[Si and Jin, 2003] Luo Si and Rong Jin. Flexible mixture model for collaborative filtering. In ICML, volume 3, pages 704–711, 2003.

[Srebro et al., 2004] Nathan Srebro, Jason Rennie, and Tommi S Jaakkola. Maximum-margin matrix factorization. In Advances in neural information processing systems, pages 1329–1336, 2004.

[Tang et al., 2013] Jiliang Tang, Xia Hu, Huiji Gao, and Huan Liu. Exploiting local and global social context for recommendation. In Proceedings of the Twenty-Third international joint conference on Artificial Intelligence, pages 2712–2718. AAAI Press, 2013.

[Trigeorgis et al., 2014] George Trigeorgis, Konstantinos Bousmalis, Stefanos Zafeiriou, and Bjoern Schuller. A deep semi-nmf model for learning hidden representations. In Proceedings of the 31st International Conference on Machine Learning (ICML-14), pages 1692–1700, 2014.

[Wang et al., 2006] Jun Wang, Arjen P De Vries, and Marcel JT Reinders. Unifying user-based and item-based collaborative filtering approaches by similarity fusion. In Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 501–508. ACM, 2006.

[Wang et al., 2011] Fei Wang, Tao Li, Xin Wang, Shenghuo Zhu, and Chris Ding. Community discovery using nonnegative matrix factorization. Data Mining and Knowledge Discovery, 22(3):493–521, 2011.

[Xu et al., 2003] Wei Xu, Xin Liu, and Yihong Gong. Document clustering based on non-negative matrix factorization. In Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, pages 267–273. ACM, 2003.

[Yu et al., 2004] Kai Yu, Anton Schwaighofer, Volker Tresp, Xiaowei Xu, and H-P Kriegel. Probabilistic memory-based collaborative filtering. Knowledge and Data Engineering, IEEE Transactions on, 16(1):56–69, 2004.

[Zhang et al., 2006] Sheng Zhang, Weihong Wang, James Ford, and Fillia Makedon. Learning from incomplete ratings using non-negative matrix factorization. In SDM, pages 549–553. SIAM, 2006.
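
The two-layer setting used in the parameter analysis (four factors $U_1$, $U_2$, $V_2$, $V_1$, with $d = 20$ and a grid over $n_1$ and $m_1$) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `hsr_predict` and `masked_errors` are hypothetical, the factors are random rather than learned, the toy sizes are chosen only for speed, and $W$ is assumed to be a binary indicator of observed ratings. Shapes are chosen so that the product $U_1 U_2 V_2 V_1$ is well defined ($V_2$ is $d \times m_1$, $V_1$ is $m_1 \times m$).

```python
import numpy as np


def hsr_predict(U1, U2, V2, V1):
    """Reconstruct the rating matrix as the product U1 @ U2 @ V2 @ V1."""
    return U1 @ U2 @ V2 @ V1


def masked_errors(X, W, X_hat):
    """Return (RMSE, MAE) computed only over entries where W is 1."""
    diff = W * (X - X_hat)          # zero out unobserved entries
    n_obs = W.sum()                 # number of observed ratings
    rmse = np.sqrt((diff ** 2).sum() / n_obs)
    mae = np.abs(diff).sum() / n_obs
    return rmse, mae


# Grid from the parameter analysis: d fixed at 20, n1 and m1 varied.
d_paper = 20
param_grid = [(n1, m1)
              for n1 in (100, 200, 300, 400, 500)
              for m1 in (200, 400, 600, 800, 1000)]

# Toy sizes (assumed for illustration only, far smaller than the grid above).
n, m, d = 30, 40, 5
n1, m1 = 10, 12

rng = np.random.default_rng(0)
U1 = rng.random((n, n1))    # users -> first-layer user groups
U2 = rng.random((n1, d))    # user groups -> latent dimensions
V2 = rng.random((d, m1))    # latent dimensions -> item groups
V1 = rng.random((m1, m))    # item groups -> items

X_hat = hsr_predict(U1, U2, V2, V1)

# Synthetic "observed" ratings: identical to the reconstruction here,
# so both masked errors are exactly zero.
X = X_hat.copy()
W = (rng.random((n, m)) < 0.3).astype(float)

rmse, mae = masked_errors(X, W, X_hat)
print(X_hat.shape, rmse, mae)
```

In an actual experiment, each `(n1, m1)` pair in `param_grid` would be used to train the factors on the training split, and `masked_errors` would be evaluated on held-out ratings; the training procedure itself is outside the scope of this sketch.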