**and the Choice of the Outlinks**

Laure Ninove

Joint work with Cristobald de Kerchove and Paul Van Dooren

CESAME

Université catholique de Louvain, Belgium

CESAME Seminar

February 27, 2007

Laure Ninove (CESAME) Outlinks and PR 1 / 27

Google’s power

- Google’s search engine guides websurfers in their visits.
- A good ranking is vital for a webpage to be read.

How to improve your Google rank?

Outline

1. Preliminaries: What is under Google’s PageRank?
   - A brief history
   - A story of links
   - PageRank equations
2. How to improve your PageRank?
   - Add inlinks
   - Choose outlinks
3. Optimal outlink structure
   - For a single node
   - For a set of nodes

A brief history of the Web search engine Google

- 1996: a research project by L. Page and S. Brin
- 1998: Google Inc. founded, 25 million webpages indexed
- 2005: 8 billion webpages indexed
- 2006: "to google" added to the Oxford English Dictionary

“The primary goal is to provide high quality search results over a rapidly growing World Wide Web. Google employs a number of techniques to improve search quality including page rank, anchor text, and proximity information.”

Brin & Page, 1998, The anatomy of a large-scale hypertextual web search engine

Google’s PageRank: a story of links

A hyperlink from $i$ to $j$ ≡ $i$’s vote of confidence in $j$.

A page $j$ has a high PageRank $\pi_j$ if it is pointed to by many pages with

- a high PageRank,
- few outlinks.

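Votes of confidence

Example

[Figure: four-node graph (nodes 1 to 4) with node and edge labels 2/11, 2/11, ?, 1/11, 1/11, 2/11; the unknown "?" is the PageRank of node 1.]

$$\pi_1 = \tfrac{1}{2}\, \pi_2 + 1 \cdot \pi_4 = \tfrac{3}{11}$$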

PageRank equations

Vote of confidence

$$\pi_j = c \sum_{i \to j} \frac{\pi_i}{d_i} + (1 - c)\, z_j, \qquad \sum_j \pi_j = 1$$

- $\sum_{i \to j} \pi_i / d_i$: sum of parents’ weighted scores
- $\sum_j \pi_j = 1$: normalization of the PageRanks
- $(1 - c)\, z_j$: damping with personalization score

In matrix form:

$$\pi^T = c\, \pi^T D^{-1} A + (1 - c)\, z^T, \qquad \pi^T e = 1$$

- $A \in \{0, 1\}^{n \times n}$: webgraph’s adjacency matrix (zero diagonal, no zero row)
- $D = \mathrm{diag}(Ae)$: outdegrees matrix
- $c \in\, ]0, 1[$: damping factor
- $z > 0$, $z^T e = 1$: personalization vector
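As an illustration of the componentwise equation, here is a minimal sketch that iterates it to a fixed point. The four-page graph, the value $c = 0.85$, the uniform $z$ and the function name `pagerank` are assumptions chosen for the example; they do not come from the slides.

```python
def pagerank(children, c=0.85, z=None, iters=200):
    """Iterate pi_j = c * sum_{i->j} pi_i / d_i + (1 - c) * z_j."""
    n = len(children)
    z = z or [1.0 / n] * n          # uniform personalization by default (assumption)
    pi = list(z)                    # any probability vector works as a start
    for _ in range(iters):
        new = [(1.0 - c) * zj for zj in z]
        for i, outs in enumerate(children):
            share = c * pi[i] / len(outs)   # page i splits its vote over d_i outlinks
            for j in outs:
                new[j] += share
        pi = new
    return pi

# Hypothetical 4-page web: children[i] lists the pages that page i links to.
children = [[1], [0, 2], [3], [0]]
pi = pagerank(children)
print(pi, sum(pi))
```

Each sweep keeps the entries summing to 1, so the normalization $\pi^T e = 1$ does not need to be enforced separately in this sketch.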

PageRank equations

Random walk

Google matrix:

$$G = c\, D^{-1} A + (1 - c)\, e z^T$$

Irreducible, stochastic matrix → transition probability matrix

Random walk on the webgraph: $P(i \to j) = G_{ij}$, with

- $P(\text{follow hyperlinks}) = c$
- $P(\text{zap according to } z) = 1 - c$

PageRank vector $\pi$: stationary distribution of this Markov chain

$$\pi^T G = \pi^T, \qquad \pi^T e = 1$$
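The random-walk view can be checked numerically. The sketch below builds $G$ explicitly for a small hypothetical adjacency matrix (the matrix, $c = 0.85$ and the uniform $z$ are assumptions for illustration) and approximates the stationary distribution by left power iteration.

```python
def google_matrix(A, c, z):
    """G = c * D^{-1} A + (1 - c) * e z^T for a 0/1 adjacency matrix A."""
    n = len(A)
    G = []
    for i in range(n):
        d_i = sum(A[i])  # outdegree of page i; assumed nonzero (no zero row)
        G.append([c * A[i][j] / d_i + (1.0 - c) * z[j] for j in range(n)])
    return G

def stationary(G, iters=200):
    """Left power iteration: pi^T <- pi^T G."""
    n = len(G)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * G[i][j] for i in range(n)) for j in range(n)]
    return pi

# Hypothetical webgraph: zero diagonal, no zero row.
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 0, 0, 1],
     [1, 0, 0, 0]]
z = [0.25] * 4
G = google_matrix(A, 0.85, z)
pi = stationary(G)
print(pi)
```

Because $z > 0$, every entry of $G$ is positive, which is one simple way to see why the chain is irreducible here.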

Damping with a personalization score

Example

[Figure: four-node graph (nodes 1 to 4) with labels 0.19, ?, c*0.095, c*0.19, z, (1−c)*0.25, 0.19; the unknown "?" is the PageRank of node 1.]

$$\pi_1 = c \left( \tfrac{1}{2}\, \pi_2 + \pi_4 \right) + (1 - c)\, z_1$$
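As a worked instance, suppose the figure labels are read as $\pi_2 = \pi_4 = 0.19$ and $z_1 = 0.25$. This reading is an assumption, since the figure itself did not survive extraction. The damped equation then gives, for any damping factor $c$:

$$\pi_1 = c\,(0.095 + 0.19) + (1 - c)\, 0.25 = 0.25 + 0.035\, c$$

The link contribution $0.285\,c$ and the zapping contribution $0.25\,(1 - c)$ correspond to the edge labels `c*0.095`, `c*0.19` and `(1−c)*0.25`.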

How to improve your PageRank?

Add inlinks

Add inlinks?

$$\pi_j = c \sum_{i \to j} \frac{\pi_i}{d_i} + (1 - c)\, z_j$$

- Adding an inlink always increases your PR.
- But you have no control over your inlinks.

Example

[Figure: a graph before and after an inlink to node 1 is added.]

$$\pi_1 = 0.196 < \pi_1^{(\text{inlink})} = 0.245$$

Ipsen & Wills, 2006, Mathematical properties and analysis of Google’s PageRank
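The effect of an extra inlink can be reproduced on a toy graph. The sketch below is not the graph from the slide's example (which is lost); it uses a hypothetical 4-cycle, $c = 0.85$ and a uniform personalization vector, and compares the PageRank of page 0 before and after page 2 adds a link to it.

```python
def pagerank(children, c=0.85, iters=200):
    """Power iteration with a uniform personalization vector z (assumption)."""
    n = len(children)
    pi = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - c) / n] * n
        for i, outs in enumerate(children):
            for j in outs:
                new[j] += c * pi[i] / len(outs)
        pi = new
    return pi

before = pagerank([[1], [2], [3], [0]])     # a 4-cycle: every page scores 0.25
after = pagerank([[1], [2], [3, 0], [0]])   # page 2 additionally links to page 0
print(before[0], after[0])
```

In this sketch the gain of page 0 comes at the expense of page 3, which now receives only half of page 2's vote.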

How to improve your PageRank?

Choose outlinks

Choose outlinks?

- You control them.
- Constraints: at least one outlink, no self-loop.
- Impact not obvious: adding outlinks can increase or decrease your PR.

Sydow, 2005, Can one out-link change your PageRank?

How to improve your PageRank?

Choose outlinks

Example

[Figure: three variants of a graph that differ only in the outlinks of node 1.]

$$\pi_1^{(\text{outlink } a)} = 0.182 < \pi_1 = 0.196 < \pi_1^{(\text{outlink } b)} = 0.211$$
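Both directions can be observed on a small hypothetical graph (again not the slide's own example). Pages 1, 2 and 3 form a chain 3 → 2 → 1 → 0 leading to page 0, which initially links to page 2 only. The names `with_a` and `with_b` loosely mirror the labels "outlink a" and "outlink b"; $c = 0.85$ and the uniform $z$ are assumptions.

```python
def pagerank(children, c=0.85, iters=200):
    """Power iteration with a uniform personalization vector z (assumption)."""
    n = len(children)
    pi = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - c) / n] * n
        for i, outs in enumerate(children):
            for j in outs:
                new[j] += c * pi[i] / len(outs)
        pi = new
    return pi

rest = [[0], [1], [2]]                  # outlinks of pages 1, 2 and 3
base = pagerank([[2]] + rest)[0]        # page 0 links to page 2 only
with_a = pagerank([[2, 3]] + rest)[0]   # add an outlink to the far page 3
with_b = pagerank([[1, 2]] + rest)[0]   # add an outlink to the parent page 1
print(with_a, base, with_b)
```

Linking to a page from which the random walk returns quickly helps page 0, while linking to a page that is far away hurts it.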

Notation

Let $\mathcal{I}$ be the considered set of nodes.

Up to a permutation of the indices,

$$A = \begin{pmatrix} A_{\mathcal{I}} & A_{\mathrm{out}(\mathcal{I})} \\ A_{\mathrm{in}(\mathcal{I})} & A_{\bar{\mathcal{I}}} \end{pmatrix}.$$

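Optimal outlink structure for a single node

Suppose $\mathcal{I} = \{1\}$.

We want to maximize $\pi_1(A_{\mathrm{out}(\{1\})})$, with $A_{\mathrm{out}(\{1\})} = e_{\mathcal{L}}^T$, where $\mathcal{L} = \{\text{children of } 1\} \neq \emptyset$.

Proposition

$$\pi_1(e_{\mathcal{L}}^T) \text{ is maximal} \iff \emptyset \neq \mathcal{L} \subseteq \mathcal{L}^* = \arg\min_i\, e_i^T (I - G_{\bar{\mathcal{I}}})^{-1} e.$$

Proof.

$$\pi_1(e_{\mathcal{L}}^T) = \frac{1}{c\, \dfrac{\sum_{i \in \mathcal{L}} e_i^T (I - G_{\bar{\mathcal{I}}})^{-1} e}{|\mathcal{L}|} + \text{constant}}.$$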
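The formula in the proof can be checked numerically. In the sketch below, page 0 plays the role of node 1, and `m[i]` approximates $e_i^T (I - G_{\bar{\mathcal{I}}})^{-1} e$, which can be read as the expected number of steps the walk needs to reach page 0 from page $i$. The constant is taken to be $1 + (1 - c) \sum_{j \neq 0} z_j m_j$; that is my reading of the derivation, not something stated on the slide. The graph, $c = 0.85$ and the uniform $z$ are hypothetical.

```python
def google_matrix(children, c, z):
    n = len(children)
    return [[(c / len(children[i]) if j in children[i] else 0.0) + (1.0 - c) * z[j]
             for j in range(n)] for i in range(n)]

def stationary(G, iters=300):
    """Left power iteration: pi^T <- pi^T G."""
    n = len(G)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * G[i][j] for i in range(n)) for j in range(n)]
    return pi

def hitting_times(G, target, iters=3000):
    """m = (I - G_rest)^{-1} e, computed via the fixed point m = e + G_rest m."""
    others = [k for k in range(len(G)) if k != target]
    m = {k: 0.0 for k in others}
    for _ in range(iters):
        m = {k: 1.0 + sum(G[k][l] * m[l] for l in others) for k in others}
    return m

c, z = 0.85, [0.25] * 4
children = [[2], [0], [1], [2]]   # page 0 is the page whose outlinks we choose
G = google_matrix(children, c, z)
pi = stationary(G)
m = hitting_times(G, target=0)

L = children[0]
constant = 1.0 + (1.0 - c) * sum(z[j] * m[j] for j in m)   # assumed form of "constant"
predicted = 1.0 / (c * sum(m[i] for i in L) / len(L) + constant)
best = min(m, key=m.get)          # candidate for L*: the page with the smallest m
print(pi[0], predicted, best)
```

Since `m` does not depend on the outlinks of page 0, the only way page 0 can raise its PageRank is to lower the average of `m` over its children, which is why linking only to minimizers is optimal.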

Proposition

Suppose that node 1 has some parents. Then

$$\pi_1(e_{\mathcal{L}}^T) \text{ is maximal} \implies \mathcal{L} \subseteq \{\text{parents of } 1\}.$$

Optimal outlink structure for a single node

Example

[Figure: graph with nodes 1, 2 and 3.]

In order to maximize its PageRank, node 1 should link to some node(s) among its parents. But it is better for node 1 to link to node 3 (grandparent) rather than to node 2 (parent).

Optimal outlink structure for a set of nodes

Consider now a set $\mathcal{I}$ of nodes.

- Internal link structure $A_{\mathcal{I}}$ given, where $A_{\mathcal{I}}$ has no zero row.
- External outlink structure $A_{\mathrm{out}(\mathcal{I})}$ to be determined.

Goal: to maximize the sum of PageRanks:

$$\max_{A_{\mathrm{out}(\mathcal{I})}} \sum_{i \in \mathcal{I}} \pi_i(A_{\mathrm{out}(\mathcal{I})}).$$

Proposition

Under the assumption that $\mathcal{I}$ has at least $m$ external outlinks,

$$\sum_{i \in \mathcal{I}} \pi_i(A_{\mathrm{out}(\mathcal{I})}) \text{ is maximal} \implies \mathcal{I} \text{ has exactly } m \text{ external outlinks.}$$
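For a tiny instance the proposition can be explored by brute force. The sketch below takes $m = 1$, a hypothetical set $\mathcal{I} = \{0, 1\}$ whose two pages link to each other, and two outside pages 2 and 3 with fixed outlinks (2 → 0, 3 → 2). It enumerates every nonempty set of external outlinks and keeps the one with the largest PageRank sum. The graph, $c = 0.85$ and the uniform $z$ are assumptions for illustration.

```python
from itertools import combinations

def pagerank(children, c=0.85, iters=200):
    """Power iteration with a uniform personalization vector z (assumption)."""
    n = len(children)
    pi = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - c) / n] * n
        for i, outs in enumerate(children):
            for j in outs:
                new[j] += c * pi[i] / len(outs)
        pi = new
    return pi

inside = [0, 1]
internal = {0: [1], 1: [0]}            # A_I: fixed, no zero row
outside_children = {2: [0], 3: [2]}    # fixed outlinks of the outside pages
candidates = [(i, j) for i in inside for j in (2, 3)]   # possible external outlinks

def pagerank_sum(ext_links):
    """Sum of PageRanks over `inside` for a given set of external outlinks."""
    children = [None] * 4
    for i in inside:
        children[i] = internal[i] + [j for (k, j) in ext_links if k == i]
    for k, outs in outside_children.items():
        children[k] = outs
    pi = pagerank(children)
    return sum(pi[i] for i in inside)

structures = [s for r in range(1, len(candidates) + 1)
              for s in combinations(candidates, r)]
best = max(structures, key=pagerank_sum)
print(best, pagerank_sum(best))
```

In this instance the best structure uses a single external outlink, leaving the page that has no outside parent and pointing at the outside page closest to the set.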

Optimal outlink structure for a set of nodes

Proof.

Removing a link $i \to j$ from the graph ⟺ perturbation:

$$\tilde{G}^{(i,j)} = c\, \left( D^{-1} A + e_i\, \delta^{(i,j)T} \right) + (1 - c)\, e z^T.$$

Difference between new and old PageRank sums:

$$\sum_{s \in \mathcal{I}} \pi_s^{(i,j)} - \sum_{s \in \mathcal{I}} \pi_s = c\, \pi_i\, \frac{\delta^{(i,j)T} (I - c D^{-1} A)^{-1} e_{\mathcal{I}}}{1 - c\, \delta^{(i,j)T} (I - c D^{-1} A)^{-1} e_i}.$$

For every link $i \to j$, $c\, \delta^{(i,j)T} (I - c D^{-1} A)^{-1} e_i < 1$.

There exists an external outlink $k \to \ell$ with $k \in \mathcal{I}$, $\ell \notin \mathcal{I}$, such that

$$\delta^{(k,\ell)T} (I - c D^{-1} A)^{-1} e_{\mathcal{I}} > 0.$$
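The difference formula can be verified on a small example. In the sketch below, $\delta^{(i,j)}$ is taken to be the new row $i$ of $D^{-1}A$ minus the old one, which is what the perturbation formula for $\tilde{G}^{(i,j)}$ implies but is not spelled out on the slide. Products with $(I - cD^{-1}A)^{-1}$ are approximated by a fixed-point iteration. The graph, the set $\mathcal{I} = \{0, 1\}$, $c = 0.85$ and the uniform $z$ are hypothetical.

```python
def transition(children):
    """Row-stochastic matrix D^{-1} A from lists of children."""
    n = len(children)
    return [[1.0 / len(children[i]) if j in children[i] else 0.0 for j in range(n)]
            for i in range(n)]

def solve(P, b, c, iters=400):
    """x = (I - cP)^{-1} b through the fixed point x = b + c P x."""
    n = len(P)
    x = list(b)
    for _ in range(iters):
        x = [b[k] + c * sum(P[k][l] * x[l] for l in range(n)) for k in range(n)]
    return x

def pagerank(P, c, z, iters=400):
    """Iterate pi^T <- c pi^T P + (1 - c) z^T."""
    n = len(P)
    pi = list(z)
    for _ in range(iters):
        pi = [c * sum(pi[i] * P[i][j] for i in range(n)) + (1.0 - c) * z[j]
              for j in range(n)]
    return pi

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

c, z, inside = 0.85, [0.25] * 4, [0, 1]
old = [[1, 2], [0, 2], [0], [2]]
new = [[1], [0, 2], [0], [2]]          # external outlink 0 -> 2 removed
P_old, P_new = transition(old), transition(new)
i = 0
delta = [P_new[i][j] - P_old[i][j] for j in range(4)]   # assumed definition of delta
e_I = [1.0 if k in inside else 0.0 for k in range(4)]
e_i = [1.0 if k == i else 0.0 for k in range(4)]
v, w = solve(P_old, e_I, c), solve(P_old, e_i, c)

pi_old, pi_new = pagerank(P_old, c, z), pagerank(P_new, c, z)
denominator = 1.0 - c * dot(delta, w)
formula = c * pi_old[i] * dot(delta, v) / denominator
actual = sum(pi_new[k] for k in inside) - sum(pi_old[k] for k in inside)
print(actual, formula)
```

Since the denominator is positive, the sign of the change is the sign of $\delta^{(i,j)T} (I - cD^{-1}A)^{-1} e_{\mathcal{I}}$, which is the quantity the last step of the proof looks at.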

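Optimal outlink structure for a set of nodes

Example

Sometimes, removing an outlink from $\mathcal{I}$ may decrease the PageRank sum of $\mathcal{I}$.

[Figure: three five-node graphs (nodes 1 to 5) that differ only in the outlinks of node 1: $1 \to 3$; $1 \to 3$ and $1 \to 5$; $1 \to 5$.]

$$\sum_{i \in \mathcal{I}} \pi_i(1 \to 3) < \sum_{i \in \mathcal{I}} \pi_i(1 \to 3\ \&\ 5) < \sum_{i \in \mathcal{I}} \pi_i(1 \to 5)$$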

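Optimal outlink structure for a set of nodes

Special case

Proposition

Let $\mathcal{I}$ be a set of nodes organized in a clique.

Let $\mathcal{T} \subset \mathcal{I}$ be the set of nodes $f$

- without any external parent ($A_{\mathrm{in}(\mathcal{I})}\, e_f = 0$),
- with a minimal zapping for $\mathcal{I}$ ($z_f = \min_{i \in \mathcal{I}} z_i$).

Suppose that $\mathcal{I}$ must have at least one external outlink. If $\mathcal{T} \neq \emptyset$, then

$$\sum_{i \in \mathcal{I}} \pi_i(A_{\mathrm{out}(\mathcal{I})}) \text{ is maximal} \iff A_{\mathrm{out}(\mathcal{I})} = e_f\, e_\ell^T \text{ with } f \in \mathcal{T} \text{ and } \ell \in \mathcal{L}^* = \arg\min_i\, e_i^T (I - G_{\bar{\mathcal{I}}})^{-1} e.$$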

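Summary

A single webpage 1:

$$\pi_1(A_{\mathrm{out}(\mathcal{I})}) \text{ maximal} \iff A_{\mathrm{out}(\mathcal{I})} = e_{\mathcal{L}}^T \text{ with } \emptyset \neq \mathcal{L} \subseteq \mathcal{L}^*,$$

where $\mathcal{L}^* = \arg\min_i\, e_i^T (I - G_{\bar{\mathcal{I}}})^{-1} e$. Moreover $\mathcal{L}^* \subseteq \{\text{parents of } 1\}$.

A set $\mathcal{I}$ of at least 2 webpages:

$$\sum_{i \in \mathcal{I}} \pi_i(A_{\mathrm{out}(\mathcal{I})}) \text{ maximal} \implies \mathcal{I} \text{ has a unique external outlink,}$$

if we suppose that $\mathcal{I}$ must have at least one external outlink.

Under some assumptions: external outlink $k \to \ell$ with $\ell \in \mathcal{L}^*$.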

Related questions

Modify internal link structure $A_{\mathcal{I}}$? Impact not obvious:

- **Adding a link between two pages of $\mathcal{I}$** can decrease the PageRank of one of these pages, or even decrease the sum of their PageRanks!
- **The clique is not always the optimal internal link structure.**

Add servant pages: link spam farms

[Figure: a link farm and its target page.]
