
Advanced Algorithms Design

Professor: Dr. Yangjun Chen
Student Name: Ajay Ramganesh
Student Number: 3064395

Assignment 1


Question-1:
1. (25) Please prove the following equalities and inequalities:
$10n \neq \Theta(n^2)$.
$100n = \Theta(n)$.
$2^{2n} \neq \Theta(2^n)$.

Answer:
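$10n \neq \Theta(n^2)$
Definition of $\Theta$-notation:

$\Theta(g(n)) = \{ f(n) : \exists$ positive constants $c_1$, $c_2$, and $n_0$ such that $\forall n \ge n_0$, we have $0 \le c_1 g(n) \le f(n) \le c_2 g(n) \}$.

Assume, for contradiction, that $10n = \Theta(n^2)$. Then there exist positive constants $c_1$, $c_2$, and $n_0$ such that, for all $n \ge n_0$:
$c_1 n^2 \le 10n \le c_2 n^2$

Consider the left-hand side (LHS) of the inequality:

$c_1 n^2 \le 10n$
$c_1 n \le 10$
$n \le 10/c_1$.
The inequality holds only when $n \le 10/c_1$ and fails for every $n > 10/c_1$, so it cannot hold for all $n \ge n_0$. This contradicts the assumption.
Therefore $10n \neq \Theta(n^2)$. Hence proved.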

$100n = \Theta(n)$
By the definition of $\Theta$-notation, we need positive constants $c_1$, $c_2$, and $n_0$ such that, for all $n \ge n_0$:
$c_1 n \le 100n \le c_2 n$


Left-hand side of the inequality:

$c_1 n \le 100n$
$c_1 \le 100$

Right-hand side of the inequality:

$100n \le c_2 n$
$100 \le c_2$

Any constants with $0 < c_1 \le 100$ and $c_2 \ge 100$ satisfy both inequalities for every $n \ge 1$, so we can take $n_0 = 1$.
E.g., $99n \le 100n \le 101n$ holds for every $n \ge 1$ (with $c_1 = 99$ and $c_2 = 101$).
Hence proved.

$2^{2n} \neq \Theta(2^n)$
Assume, for contradiction, that $2^{2n} = \Theta(2^n)$. Then, by the definition of $\Theta$-notation, there exist positive constants $c_1$, $c_2$, and $n_0$ such that, for all $n \ge n_0$:
$c_1 2^n \le 2^{2n} \le c_2 2^n$
Consider the right-hand side of the inequality:
$2^{2n} \le c_2 2^n$
$2^n \le c_2$
$n \le \log_2 c_2$
The inequality holds only when $n \le \log_2 c_2$ and fails for every $n > \log_2 c_2$, so it cannot hold for all $n \ge n_0$. This contradicts the assumption.
Therefore $2^{2n} \neq \Theta(2^n)$. Hence proved.
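
As a quick numerical sanity check of the three claims (not a proof), one can tabulate the ratio f(n)/g(n) for growing n: for a Theta relation to hold, the ratio has to stay between two positive constants. The helper name `ratios` and the sample values of n below are illustrative choices, not part of the assignment.

```python
def ratios(f, g, ns):
    """Return f(n) / g(n) for each n in ns."""
    return [f(n) / g(n) for n in ns]

# 10n versus n^2: the ratio shrinks toward 0, so no positive lower constant c1 exists.
r1 = ratios(lambda n: 10 * n, lambda n: n * n, [10, 100, 1000])

# 100n versus n: the ratio is the constant 100, so c1 = 99 and c2 = 101 both work.
r2 = ratios(lambda n: 100 * n, lambda n: n, [10, 100, 1000])

# 2^(2n) versus 2^n: the ratio is 2^n and grows without bound, so no upper constant c2 exists.
r3 = ratios(lambda n: 2 ** (2 * n), lambda n: 2 ** n, [1, 5, 10])

print(r1, r2, r3)
```

A shrinking or exploding ratio only suggests the result; the algebraic arguments above are what actually establish it.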

Question-2:
(15) In Fig. 2, we show a network, in which each node stands for a page and each
arc for a link from a page to another. Please give the transition matrix for the
network. Also, explain why the solution to the equation
$A = MA$ can be used as the estimation of page importance, where $A$ is a vector of $n$
variables and $M$ is an $n \times n$ transition matrix.


Answer:
Transition Matrix:

P1: 0
P2: 0 0 0 0 0
P3: 0 1 0 0 0
P4: 0 0
P5: 0 0 0 1 0

Web navigation over the above transition matrix can be modelled as the moves of a random walker.
Let M have entry $s_{xy}$ in row x and column y, where:
1. $s_{xy} = 1/r$ if page y has a link to page x, and there are a total of $r \ge 1$ pages that y links to.
2. $s_{xy} = 0$ otherwise.

After a large number of moves, the walker's distribution over possible locations is the
same at each step. This limiting distribution $A$ satisfies $A = MA$, which is why the
solution of $A = MA$ can be used as the estimation of page importance, where $A$ is a
vector of $n$ variables and $M$ is an $n \times n$ transition matrix.
So, the fraction of time that the random walker spends at a page is used as the measurement
of importance.
After 50 to 100 iterations of this process, the estimated fraction of time spent on each
page is very close to the limiting distribution.

So, the equation $A = MA$ gives the fraction of time the user spends on each page, and
this can be used as the estimation of page importance.
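
The iteration described above can be sketched as repeated multiplication by M. The 3-page graph below is hypothetical (it is not the network of Fig. 2), and the function name `power_iterate` and the 100-iteration default are assumptions made for illustration.

```python
def power_iterate(M, iterations=100):
    """Approximate a solution of A = MA by repeated multiplication.

    M is column-stochastic: M[x][y] is the probability of moving from page y to page x.
    """
    n = len(M)
    A = [1.0 / n] * n  # start with the walker equally likely to be on any page
    for _ in range(iterations):
        A = [sum(M[x][y] * A[y] for y in range(n)) for x in range(n)]
    return A

# Hypothetical web: page 0 links to 1 and 2, page 1 links to 0, page 2 links to 0 and 1.
M = [
    [0.0, 1.0, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.0, 0.0],
]
A = power_iterate(M)
print(A)  # close to [4/9, 3/9, 2/9]
```

For this small graph the fixed point can be checked by hand: page 0 receives the most link weight and ends up with the largest share of the walker's time.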

Question-3:
(10) Explain why the following equation (for estimating the importance of pages)
works in the presence of spider traps and dead ends.

$P_{new} = \beta M P_{old} + (1 - \beta) T$
Answer:

When a user enters a set of pages from which no link leads outside the set, it is
called a Spider Trap.
When a user enters a page that has no outgoing links, it is
called a Dead End.

In both the above scenarios, the user gets stuck and the walk ends.
If we apply the relaxation to the Web matrix with spider traps, it can result in a
limiting distribution where all probabilities outside a spider trap are 0.
To avoid this, we limit the time a random walker is allowed to wander at random. At each step, the walker
follows a random out-link with probability $\beta$ (normally, $0.8 \le \beta \le 0.9$), and with
probability $1 - \beta$ (called the taxation rate) we remove that walker and deposit a new
walker at a randomly chosen Web page.
Using the above strategy:
i. If the walker gets stuck in a spider trap, after a few time steps it will disappear and be replaced by a new walker.
ii. If the walker reaches a dead end and disappears, a new walker takes over shortly.
Let $P_{new}$ and $P_{old}$ be the new and old distributions of the location of the walker. After
one iteration, we can express the relationship between them as follows (with $\beta = 0.8$ and $1 - \beta = 0.2$):

$P_{new} = 0.8 \, M P_{old} + 0.2 \, T$

where $M$ is the transition matrix and $T$ is the fraction-of-time term.

Based on the above equation, with probability 0.8 the walker follows the transition
matrix to its new location, and with probability 0.2 the walker is restarted from a
random page, which lets it escape from a dead end or spider trap.
This is why $P_{new} = \beta M P_{old} + (1 - \beta) T$ works in the presence of dead ends and
spider traps: it moves the walker out of these situations.
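
A minimal sketch of this taxed iteration follows, comparing it with the untaxed walk on a hypothetical 3-page graph in which page 2 is a spider trap. It assumes that T is the uniform vector with every entry equal to 1/n (the answer does not spell T out), and the name `taxed_iterate` is illustrative.

```python
def taxed_iterate(M, beta, iterations=100):
    """Iterate P_new = beta * M * P_old + (1 - beta) * T."""
    n = len(M)
    T = [1.0 / n] * n  # assumption: a new walker lands on a uniformly random page
    P = [1.0 / n] * n
    for _ in range(iterations):
        P = [beta * sum(M[x][y] * P[y] for y in range(n)) + (1 - beta) * T[x]
             for x in range(n)]
    return P

# Hypothetical graph: page 0 links to 1 and 2, page 1 links to 0 and 2,
# page 2 links only to itself (a one-page spider trap).
M_trap = [
    [0.0, 0.5, 0.0],
    [0.5, 0.0, 0.0],
    [0.5, 0.5, 1.0],
]
untaxed = taxed_iterate(M_trap, beta=1.0)  # plain A = MA iteration
taxed = taxed_iterate(M_trap, beta=0.8)
print(untaxed)  # nearly all probability ends up on page 2
print(taxed)    # pages 0 and 1 keep a nonzero share
```

With beta = 1 the trap absorbs essentially all of the probability, while with beta = 0.8 the restart term keeps pages 0 and 1 at about 1/9 each.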

Question-4
(20) Fig. 3 shows a tree encoding. The quadruples can be stored as a sequence
sorted by LeftPos values by using the depth-first search. Design an algorithm to
transform it into another sequence sorted by RightPos values.

T:

Answer:
Algorithm:
Let X(i) be the data streams sorted by LeftPos.
Let R(i) be the new data streams sorted by RightPos.
Begin
repeat until each X(i) becomes empty
{
    identify i such that the first element v of X(i) has the minimal LeftPos value;
    remove v from X(i);
    while Stack is not empty and Stack.top() is not v's ancestor
    do
    {
        d ← Stack.pop();
        let d = (j, u);
        put u at the end of R(j);
    }
    Stack.push(i, v);
}
while Stack is not empty
do
{
    d ← Stack.pop();
    let d = (j, u);
    put u at the end of R(j);
}
End
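
The algorithm above can be sketched in Python as follows. For brevity each element is reduced to a (LeftPos, RightPos) pair rather than a full quadruple, the ancestor test uses interval containment, and the names `is_ancestor` and `resort_by_right_pos` as well as the sample streams are assumptions made for illustration (Fig. 3 is not reproduced here).

```python
from collections import deque

def is_ancestor(a, v):
    """a is an ancestor of v when a's (LeftPos, RightPos) interval encloses v's."""
    return a[0] < v[0] and v[1] < a[1]

def resort_by_right_pos(streams):
    """streams: dict tag -> list of (LeftPos, RightPos) sorted by LeftPos.

    Returns a dict with the same tags whose lists are sorted by RightPos.
    """
    queues = {tag: deque(nodes) for tag, nodes in streams.items()}
    result = {tag: [] for tag in streams}
    stack = []  # holds (tag, node) pairs forming a chain of ancestors
    while any(queues.values()):
        # Pick the stream whose first element has the minimal LeftPos.
        tag = min((t for t, q in queues.items() if q),
                  key=lambda t: queues[t][0][0])
        v = queues[tag].popleft()
        # Every stacked node that does not enclose v is finished: emit it.
        while stack and not is_ancestor(stack[-1][1], v):
            j, u = stack.pop()
            result[j].append(u)
        stack.append((tag, v))
    while stack:  # flush the remaining ancestor chain
        j, u = stack.pop()
        result[j].append(u)
    return result

# Hypothetical tree: a(1,8) has children b(2,3) and a(4,7); a(4,7) has child b(5,6).
streams = {"a": [(1, 8), (4, 7)], "b": [(2, 3), (5, 6)]}
resorted = resort_by_right_pos(streams)
print(resorted)
```

Each node is pushed and popped once, so apart from choosing the next stream the work is linear in the number of elements.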

Question-5:
(15) In the following table, we show the key words of five documents, as well as
the key word sequences sorted by frequencies. Please construct a trie for the sorted
sequences and a header table for all the key words to speed up the evaluation of
conjunctive queries of form $word_1 \wedge word_2 \wedge \dots \wedge word_i$. Also, show how a
conjunctive query is evaluated by using the trie.

[Table: columns DocID, Items, and Sorted item sequence, for documents 1 to 5.]

Answer:
The appearance frequency of each word is computed as follows:
$af(w) = \dfrac{\text{No. of docs containing } w}{\text{No. of docs}}$

Frequency of each word:

af(f) = 4/5
af(c) = 4/5
af(a) = 3/5
af(b) = 3/5
af(i) = 3/5
af(m) = 2/5
af(p) = 2/5
af(h) = 1/5
af(j) = 1/5

[Figure: trie for the sorted sequences, rooted at Root, with a header table (columns: Items, Links) linking each key word to its trie nodes. Trie nodes are annotated with document-ID sets such as {1,2,4,5}, {1,2,3,5}, {1,2,5}, {2,3,4}, {1,3,4}, {1,5}, {2}, and {1}.]

Evaluation of query in trie


The following steps are used to evaluate the query in the trie:
1. Let $Q = word_1 \wedge word_2 \wedge \dots \wedge word_k$ be a query.
2. Sort the words in Q in increasing order of appearance frequency: $word_{i_1}, \dots, word_{i_k}$.
3. Using the header table, find a node in the trie that is labeled with $word_{i_1}$.
4. If the path from the root to $word_{i_1}$ contains all $word_{i_j}$ (j = 1, ..., k), return the document identifiers associated with $word_{i_1}$.
5. The check can be done by searching the path bottom-up, starting from $word_{i_1}$. In this process, we first try to find $word_{i_2}$, then $word_{i_3}$, and so on.
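
The steps above can be sketched as a small trie with a header table. This is a minimal illustration under stated assumptions: the four documents are hypothetical (not the table from the question), each trie node stores the identifiers of all documents whose sorted sequence passes through it, frequency ties are broken alphabetically, and the names `TrieNode`, `build_index`, and `evaluate` are invented for the example.

```python
from collections import Counter

class TrieNode:
    def __init__(self, word=None, parent=None):
        self.word = word
        self.parent = parent
        self.children = {}
        self.doc_ids = set()  # assumption: docs whose sequence passes through this node

def build_index(docs):
    """docs: dict doc_id -> set of key words. Returns (root, header, rank)."""
    freq = Counter(w for words in docs.values() for w in words)
    # Global order: most frequent word first, ties broken alphabetically.
    rank = {w: r for r, w in enumerate(sorted(freq, key=lambda w: (-freq[w], w)))}
    root, header = TrieNode(), {}
    for doc_id, words in docs.items():
        node = root
        for w in sorted(words, key=rank.get):
            if w not in node.children:
                node.children[w] = TrieNode(w, node)
                header.setdefault(w, []).append(node.children[w])
            node = node.children[w]
            node.doc_ids.add(doc_id)
    return root, header, rank

def evaluate(query, header, rank):
    """Return the ids of documents that contain every word in query."""
    if any(w not in rank for w in query):
        return set()
    words = sorted(query, key=rank.get)  # most frequent first
    least_frequent, others = words[-1], set(words[:-1])
    result = set()
    for node in header[least_frequent]:  # every trie node labeled with that word
        seen, cur = set(), node.parent
        while cur is not None and cur.word is not None:  # walk bottom-up to the root
            seen.add(cur.word)
            cur = cur.parent
        if others <= seen:
            result |= node.doc_ids
    return result

# Hypothetical documents, used only to exercise the sketch.
docs = {1: {"f", "c", "a", "m"}, 2: {"f", "c", "b"}, 3: {"f", "b"}, 4: {"c", "b"}}
root, header, rank = build_index(docs)
print(evaluate({"c", "f"}, header, rank))
print(evaluate({"b", "c"}, header, rank))
```

Starting from the least frequent word keeps the number of candidate nodes small, and the bottom-up walk only touches the words on a single root-to-node path.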

Example:
We have a query, say: $c \wedge b \wedge f$
The frequency of each query word:
af(c) = 4/5
af(b) = 3/5
af(f) = 4/5
After sorting by frequency in increasing order, we have:

b, f, c
[Figure: the trie rooted at Root and its header table (columns: Items, Links), as used for the example query.]

Question-6:
(20) The following is a directed graph G. Please find a spanning tree of it and then
label the nodes in the spanning tree by intervals. Also, construct an interval
sequence for each node, which can be used to check the reachability queries with
respect to G.

Answer:
Spanning Tree:
a[0,13)
    b[1,6)
        c[2,5)
            p[3,5)
                k[4,5)
        d[5,6)
    r[6,10)
        e[7,10)
            f[8,9)
            g[9,10)
    h[10,13)
        i[11,12)
        j[12,13)
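
The interval labels can be produced by a preorder traversal: a node's interval starts at its preorder number and ends at that number plus the size of its subtree. The sketch below assumes the parent-child structure implied by the nesting of the intervals listed above (the original figure is not reproduced here); the name `label_intervals` is illustrative.

```python
def label_intervals(children, root):
    """Assign each node a half-open interval [start, end) by preorder DFS."""
    intervals = {}
    counter = 0

    def visit(node):
        nonlocal counter
        start = counter
        counter += 1
        for child in children.get(node, []):
            visit(child)
        # counter now equals start plus the number of nodes in the subtree
        intervals[node] = (start, counter)

    visit(root)
    return intervals

# Spanning-tree structure assumed from the nesting of the intervals above.
children = {
    "a": ["b", "r", "h"],
    "b": ["c", "d"],
    "c": ["p"],
    "p": ["k"],
    "r": ["e"],
    "e": ["f", "g"],
    "h": ["i", "j"],
}
intervals = label_intervals(children, "a")
print(intervals)
```

With these labels, a node w lies in the subtree of v exactly when the start of w's interval falls inside v's interval.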


Topological order: a, b, h, j, r, e, i, f, g, c, p, d, k
Reverse topological order: k, d, p, c, g, f, i, e, r, j, h, b, a
L(k) = [4,5)
L(d) = [4,5) [5,6)
L(p) = [3,5)
L(c) = [2,5)
L(g) = [4,5) [5,6) [9,10)
L(f) = [4,5) [5,6) [8,9)
L(i) = [4,5) [5,6) [8,9) [11,12)
L(e) = [7,10)
L(r) = [2,5) [6,10)
L(j) = [2,5) [6,10) [12,13)
L(h) = [4,5) [5,6) [7,10) [10,13)
L(b) = [1,6)
L(a) = [0,13)
Reachability Query Check:
Let u and v be two nodes of G.
u is a descendant of v if and only if there exists an interval $[\alpha, \beta)$ in L(v) such that
$\alpha_u \in [\alpha, \beta)$, where $[\alpha_u, \beta_u)$ is the interval labeling u in the spanning tree.
Example:
L(f) = [4,5) [5,6) [8,9), where [8,9) is f's own interval in the spanning tree.
L(h) = [4,5) [5,6) [7,10) [10,13)
Since $8 \in [7,10)$ and [7,10) is in L(h), node f is a descendant of node h.
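
The check can be sketched directly from the data given in this answer. The tree intervals and interval sequences below are copied from the lists above, and the function name `is_descendant` is an illustrative choice; a linear scan is used for simplicity, although a binary search over the sorted sequence would also work.

```python
# Spanning-tree interval of each node, as listed in the answer.
TREE_INTERVAL = {
    "a": (0, 13), "b": (1, 6), "c": (2, 5), "p": (3, 5), "k": (4, 5),
    "d": (5, 6), "r": (6, 10), "e": (7, 10), "f": (8, 9), "g": (9, 10),
    "h": (10, 13), "i": (11, 12), "j": (12, 13),
}

# Interval sequence L(v) of each node, as listed in the answer.
L = {
    "k": [(4, 5)],
    "d": [(4, 5), (5, 6)],
    "p": [(3, 5)],
    "c": [(2, 5)],
    "g": [(4, 5), (5, 6), (9, 10)],
    "f": [(4, 5), (5, 6), (8, 9)],
    "i": [(4, 5), (5, 6), (8, 9), (11, 12)],
    "e": [(7, 10)],
    "r": [(2, 5), (6, 10)],
    "j": [(2, 5), (6, 10), (12, 13)],
    "h": [(4, 5), (5, 6), (7, 10), (10, 13)],
    "b": [(1, 6)],
    "a": [(0, 13)],
}

def is_descendant(u, v):
    """True if the start of u's tree interval lies in some interval of L(v)."""
    start_u = TREE_INTERVAL[u][0]
    return any(lo <= start_u < hi for lo, hi in L[v])

print(is_descendant("f", "h"))  # the worked example: 8 lies in [7, 10)
print(is_descendant("a", "k"))
```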

END
