PageRank Flashcards

(14 cards)

1
Q

What is the PageRank algorithm and what is it used for?

A

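A link-analysis algorithm, developed by Google's founders Larry Page and Sergey Brin, that scores the popularity (importance) of a webpage from the links pointing to it. It is used to rank web search results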

2
Q

What is PageRank intended to simulate?

A

A web surfer who clicks links at random: a page's PageRank is the probability that, in the long run, the surfer is on that page at any given moment

3
Q

How does PageRank calculate the popularity of a webpage?

A

As the sum, over every page that links to it, of that page's rank divided by that page's number of outbound links
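The sum in the answer above can be sketched in a few lines of Python. This is only an illustration: the function name `rank_of`, the dictionary layout, and the three-page web are assumptions made for the example, not part of the original card.

```python
def rank_of(page, ranks, out_links):
    """Sum rank(q) / outdegree(q) over every page q that links to `page`."""
    return sum(
        ranks[q] / len(targets)
        for q, targets in out_links.items()
        if page in targets
    )


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
out_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3}

# A passes half of its rank to C (it has two outbound links),
# while B passes all of its rank to C (it has one outbound link).
c_rank = rank_of("C", ranks, out_links)
```

Here C ends up with more rank than B because it collects a share from two pages, one of which has no other outbound link to split its rank with.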

4
Q

What is the score that PageRank calculates called?

A

Authority

5
Q

What kind of webpage will have the highest authority in PageRank?

A

A page that has many high-ranking pages linking to it

6
Q

How can we perform PageRank in a step-by-step process?

A

Initialise a rank vector P with every entry set to the same starting value (for example 1/N), and build a transition matrix H. At each step, set P to H multiplied by the value of P from the previous step. Repeat until P converges, that is, until a further iteration no longer changes it appreciably
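The steps above can be sketched as a plain-Python power iteration. The function name `power_iterate`, the tolerance `tol`, the cap `max_iter`, and the example matrix are assumptions chosen for illustration; the sketch also assumes H is column-stochastic, so that column j holds the shares page j passes to each page it links to.

```python
def power_iterate(H, tol=1e-10, max_iter=1000):
    """Repeat P <- H * P from a uniform start until P stops changing."""
    n = len(H)
    P = [1.0 / n] * n  # every page starts with the same rank
    for _ in range(max_iter):
        new_P = [sum(H[i][j] * P[j] for j in range(n)) for i in range(n)]
        # Stop once no entry moved by more than the tolerance.
        if max(abs(a - b) for a, b in zip(new_P, P)) < tol:
            return new_P
        P = new_P
    return P


# Hypothetical web: A links to B and C, B links to C, C links to A.
# Rows and columns are ordered A, B, C.
H = [
    [0.0, 0.0, 1.0],
    [0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0],
]
P = power_iterate(H)
```

For this small web the iteration settles with A and C sharing the highest rank and B at half of that, which matches solving P = H P by hand.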

7
Q

What is a sink page?

A

A page that has no outgoing links to any other page

8
Q

What is the problem caused by sink pages?

A

Rank that flows into a sink page is never passed on, so rank leaks out of the system at every iteration and PageRank approaches 0 even for important pages
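A small sketch can make the leak visible. It assumes the basic, undamped update with no sink correction; the helper `step` and the three-page web are hypothetical names and data chosen for the example.

```python
def step(ranks, out_links):
    """One undamped PageRank update with no special handling of sinks."""
    new_ranks = {p: 0.0 for p in ranks}
    for q, targets in out_links.items():
        for t in targets:
            new_ranks[t] += ranks[q] / len(targets)
    return new_ranks


# Hypothetical web in which C is a sink: it has no outbound links.
out_links = {"A": ["B", "C"], "B": ["C"], "C": []}
ranks = {p: 1 / 3 for p in out_links}

totals = []  # total rank left in the system after each iteration
for _ in range(3):
    ranks = step(ranks, out_links)
    totals.append(sum(ranks.values()))
```

Each pass hands rank to C, and C hands nothing back, so the entries of `totals` shrink from one iteration to the next.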

9
Q

How do we solve the sink page problem?

A

Distribute the rank of the sink page evenly over all N pages of the web, so that each page receives 1/N of it. Equivalently, treat the sink page as if it linked to every page
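One way to apply this fix is while building the transition matrix: a sink page's column is filled with 1/N instead of being left as zeros. The sketch below assumes a column-stochastic layout (column j describes page j's outbound links); the function name `transition_matrix` and the sample web are hypothetical.

```python
def transition_matrix(out_links, pages):
    """Build H so that H[i][j] is the share of page j's rank sent to page i."""
    n = len(pages)
    index = {p: i for i, p in enumerate(pages)}
    H = [[0.0] * n for _ in range(n)]
    for p in pages:
        j = index[p]
        targets = out_links.get(p, [])
        if not targets:
            # Sink page: spread its rank evenly, 1/N to every page.
            for i in range(n):
                H[i][j] = 1.0 / n
        else:
            for t in targets:
                H[index[t]][j] += 1.0 / len(targets)
    return H


pages = ["A", "B", "C"]
out_links = {"A": ["B", "C"], "B": ["C"], "C": []}  # C is a sink
H = transition_matrix(out_links, pages)
column_sums = [sum(H[i][j] for i in range(len(pages))) for j in range(len(pages))]
```

With the sink column filled in, every column sums to 1, so multiplying by H no longer loses rank.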

10
Q

What are cycle pages?

A

Webpages that are linked together in a closed cycle: they link only to one another, so no link leads out of the group

11
Q

What is the problem caused by cycle pages?

A

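Rank that enters the cycle never leaves it, so the cycle pages keep accumulating authority with every iteration, at the expense of the rest of the web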

12
Q

How do we solve the cycle page problem?

A

The random surfer model

13
Q

What is the random surfer model?

A

A model in which, at each step, a web surfer either follows a link on the current page or gets bored and starts a new session at a randomly chosen page

14
Q

How do we incorporate the random surfer model into PageRank?

A

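We fix a probability d (the damping factor) that the user follows a link on the current page, and multiply the original PageRank sum by it. We then add the probability that the user gets bored, 1 - d, divided by the number of pages N: PR(p) = d * (sum over pages q linking to p of PR(q) / L(q)) + (1 - d) / N, where L(q) is the number of outbound links on q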
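Putting the pieces together, a minimal sketch of the damped update might look like the following. The function name `pagerank`, the default d = 0.85 (a commonly quoted value, not one given in this deck), the tolerance, and the sample web are all assumptions. The sketch also assumes every linked page appears as a key in `out_links`, and it spreads a sink page's rank evenly over all pages.

```python
def pagerank(out_links, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate PR(p) = d * sum(PR(q) / L(q)) + (1 - d) / N until it settles."""
    pages = list(out_links)
    n = len(pages)
    ranks = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Every page starts with the "bored surfer" share (1 - d) / N.
        new_ranks = {p: (1.0 - d) / n for p in pages}
        for q, targets in out_links.items():
            if targets:
                share = d * ranks[q] / len(targets)
                for t in targets:
                    new_ranks[t] += share
            else:
                # Sink page: spread its damped rank evenly over all pages.
                for t in pages:
                    new_ranks[t] += d * ranks[q] / n
        if max(abs(new_ranks[p] - ranks[p]) for p in pages) < tol:
            return new_ranks
        ranks = new_ranks
    return ranks


# Hypothetical web: A links to B and C, B links to C, C links to A.
scores = pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]})
```

The scores sum to 1, and C comes out highest because it receives links from both A and B.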
