Mod 10 - Search Engines Flashcards

1
Q

What is a regular (undirected) graph (social media)?

A

It uses nodes and edges to connect things together. The edges have no direction, so every connection goes both ways.

ex. Friends on Facebook (mutual)

2
Q

What is a directed graph (social media)?

A

It uses nodes and edges too, but each edge has a direction (drawn as an arrow), so relationships can be one-way.
ex. You follow someone on Twitter, but they don't follow you back (or vice versa).
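
The difference between the two graph types can be sketched with adjacency sets. This is only an illustration: the function name `build_graph` and the users `ana` and `ben` are made up for the example.

```python
from collections import defaultdict

def build_graph(edges, directed):
    """Return an adjacency map; undirected edges are stored in both directions."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        if not directed:
            graph[b].add(a)  # mutual connection, e.g. a Facebook friendship
    return graph

# Hypothetical users: the same edge, read two different ways.
friends = build_graph([("ana", "ben")], directed=False)  # ana and ben are friends
follows = build_graph([("ana", "ben")], directed=True)   # ana follows ben only
```

In `friends`, each user appears in the other's set; in `follows`, only `ana` has `ben` in her set.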

3
Q

How do links work?

A

Web pages have links to other pages, which allow you to “travel” around the web. Hence the name World Wide Web.

4
Q

What are spiders?

A

They are programs that start at one web page and follow its links to explore other pages, gathering info about each page they visit.
(ex. how Google gathers info on pages)
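
A minimal sketch of that exploration, assuming a breadth-first visit order and a tiny in-memory "web" instead of real HTTP requests. `TOY_WEB` and `crawl` are hypothetical names, and real spiders are far more involved.

```python
from collections import deque

def crawl(start, links, max_pages=100):
    """Visit pages breadth-first by following links; return the visit order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue and len(order) < max_pages:
        page = queue.popleft()
        order.append(page)                  # a real spider would gather page info here
        for target in links.get(page, []):
            if target not in seen:          # never queue the same page twice
                seen.add(target)
                queue.append(target)
    return order

# Hypothetical link structure: page -> pages it links to.
TOY_WEB = {
    "home": ["about", "blog"],
    "blog": ["post1", "home"],
    "about": [],
    "post1": ["home"],
}
visited = crawl("home", TOY_WEB)
```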

5
Q

What is a focused spider?

A

It is a spider that targets a specific topic: it only looks at pages related to that topic and gathers information from them.

6
Q

What is a polite spider?

A

It cooperates with the websites it visits and follows each website's instructions for spiders.
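
On most sites these instructions are published in a `robots.txt` file (the card does not name the mechanism, so this is an assumption about what "website instructions" means). A sketch using Python's standard `urllib.robotparser`; the rules, the agent name `ExampleSpider`, and the URLs are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real spider would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def may_fetch(url, agent="ExampleSpider"):
    """Return True if the site's rules allow this agent to fetch the URL."""
    return parser.can_fetch(agent, url)
```

A polite spider would call `may_fetch` before every request and skip any URL for which it returns `False`.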

7
Q

What is spider revisit frequency?

A

It is how often a spider returns to a page, decided by how often that page changes.

8
Q

What is the issue with paywalls for spiders?

A

A subscription is required to view the content, so how can spiders access it so that it shows up in search results? Websites open backdoors for the spiders to enter through.

9
Q

What is the issue with dynamic content & query strings for spiders?

A
  1. Dynamic content varies for the different users viewing it, so it is unclear which version a spider sees.
  2. Query strings add extra information about the content to a URL, and it is unclear whether spiders should take them into account.
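
One way a spider might handle the query-string question is to reduce each URL to a canonical form before deciding whether it has seen the page. This is a sketch of that idea, not something the card describes; the ignored parameter names and the URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameters that do not change what the page shows.
IGNORED_PARAMS = {"sessionid", "utm_source"}

def canonical_url(url):
    """Drop ignored query parameters and sort the rest, so equivalent URLs match."""
    parts = urlsplit(url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k not in IGNORED_PARAMS)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

With this, `https://shop.example/item?utm_source=ad&id=7` and `https://shop.example/item?id=7&sessionid=abc` both reduce to `https://shop.example/item?id=7`.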
10
Q

What do search engines need to worry about when indexing?

A
  • list occurrences: where, when & how many times a word appears on a website (makes a list)
  • punctuation: email vs. e-mail (treated as the same word)
  • accents: Beyonce vs. Beyoncé (treated as the same word)
  • “stop” words: the, it, is (not indexed)
  • word variants: sell, sells, selling, etc. (indexed together)
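
The concerns in this list can be sketched as a tiny inverted index. Everything here is a simplification: the stop-word set is just the three words from the card, the suffix stripping is a crude stand-in for real stemming, and the names `normalize` and `build_index` are made up.

```python
import re
import unicodedata
from collections import defaultdict

STOP_WORDS = {"the", "it", "is"}  # only the examples from the card

def normalize(word):
    """Lowercase, strip accents and hyphens, and crudely strip 'ing'/'s' endings."""
    decomposed = unicodedata.normalize("NFKD", word.lower())
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    term = no_accents.replace("-", "")
    for suffix in ("ing", "s"):
        # keep at least three letters so short words such as "is" stay intact
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term

def build_index(pages):
    """Map each term to {page: [word positions]}, skipping stop words."""
    index = defaultdict(lambda: defaultdict(list))
    for page, text in pages.items():
        for position, token in enumerate(re.findall(r"[\w-]+", text)):
            term = normalize(token)
            if term and term not in STOP_WORDS:
                index[term][page].append(position)
    return index

# Hypothetical pages.
index = build_index({"a": "The shop sells e-mail lists", "b": "Selling email is easy"})
```

Here "sells" and "Selling" land under the same term `sell`, and "e-mail" and "email" under `email`, each with the positions where they occur.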
11
Q

What are examples of advanced indexing?

A
  • synonymy: synonyms treated similarly (big & large)
  • polysemy: create separate indexes for words spelled the same with different meanings (ex. river bank and money bank)

12
Q

What can evil spiders do?

A
  1. They can steal content and claim it as their own
  2. They can harvest email addresses to send spam emails

13
Q

How do search engines search for phrases with more than 1 word?

A

The search engine finds all the web pages containing each word separately, then looks for the pages where the words occur in combination, and returns those pages.
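
A sketch of that process, assuming the index stores word positions and that "combination" means the words appear next to each other in order. The names `positional_index` and `phrase_search` and the sample pages are hypothetical.

```python
from collections import defaultdict

def positional_index(pages):
    """Map each word to {page: set of positions where the word occurs}."""
    index = defaultdict(lambda: defaultdict(set))
    for page, text in pages.items():
        for pos, word in enumerate(text.lower().split()):
            index[word][page].add(pos)
    return index

def phrase_search(index, phrase):
    """Return the pages where the words of the phrase appear consecutively."""
    words = phrase.lower().split()
    if not words or any(w not in index for w in words):
        return set()
    # Step 1: keep only the pages that contain every word somewhere.
    candidates = set.intersection(*(set(index[w]) for w in words))
    hits = set()
    for page in candidates:
        # Step 2: look for a start position where the words line up in order.
        for start in index[words[0]][page]:
            if all(start + i in index[w][page] for i, w in enumerate(words)):
                hits.add(page)
                break
    return hits

# Hypothetical pages: both contain both words, only p1 has them in order.
idx = positional_index({"p1": "potato soup recipe", "p2": "soup with potato"})
matches = phrase_search(idx, "potato soup")
```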

14
Q

How does page ranking work?

A

Pages that have lots of other pages linking to them are often more authoritative sources and thus more important, so they receive a higher ranking (meaning they show up higher in search results). The search engine also looks for HTML elements that relate to the search (ex. href, title, h1).
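
The card only says that inbound links raise a page's rank. One well-known way to turn that idea into numbers is a PageRank-style iteration; the version below is a simplified sketch with an assumed damping factor of 0.85 and a made-up three-page web, not a description of how any real search engine ranks pages.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: each page repeatedly shares its rank with the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            # a page with no outgoing links shares its rank with every page
            targets = links.get(page) or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical web: a and b both link to c, and c links back to a.
rank = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
```

Page `c` has two inbound links and ends up with the highest score, `a` comes second because it is linked from the important page `c`, and `b`, which nobody links to, comes last.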

15
Q

How does penalization/rewarding work in page ranking?

A

Pages are penalized for having excessive ads or for being aggregators.
Pages are rewarded for content quality, reputability, and authoritative sources.

16
Q

How can search be manipulated?

A
  • hidden text
  • aggregators
  • link farms
  • website hijacking
  • Google bombs
17
Q

How does hidden text manipulate search?

A

Text that is irrelevant to the actual content is added and made hidden (ex. white text on a white background). It tricks spiders, so the page shows up in searches for that text.

18
Q

How do aggregators manipulate search?

A

An aggregator takes lots of content from a bunch of different sites and puts it together on one page, so the search engine picks up the site even though it is not all that relevant.

19
Q

How do link farms manipulate search?

A

A bunch of fake web pages are created that all link to each other, so they seem more important/relevant than they really are.

20
Q

How does website hijacking manipulate search?

A

A website obtains links from very reputable websites to make itself look more authoritative. (ex. putting your website link in the comments of a news page, like CNN)

21
Q

How do Google bombs manipulate search?

A

A phrase is linked to a page that is unrelated to the phrase, usually for humorous or controversial reasons, so that the page shows up in searches for the phrase (ex. the Bush bomb: “miserable failure”)

22
Q

How can the minus symbol (-) be used to search Google effectively?

A

It excludes a word from a search, so no sites with that word will show up in the results. The minus sign goes directly in front of the word, with no space.
(ex. “potato soup -celery”: no results with the word celery will show up)
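
A toy version of this exclusion, assuming the minus sign is attached directly to the excluded word and that a page matches when it contains every other word. The function `search` and the sample pages are hypothetical and much simpler than a real search engine.

```python
def search(pages, query):
    """Return pages containing every plain word and none of the -excluded words."""
    include, exclude = [], []
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            exclude.append(token[1:])   # "-celery" means: exclude "celery"
        else:
            include.append(token)
    results = []
    for name, text in pages.items():
        words = set(text.lower().split())
        if all(w in words for w in include) and not any(w in words for w in exclude):
            results.append(name)
    return results

# Hypothetical pages.
RECIPES = {"r1": "potato soup with celery", "r2": "creamy potato soup"}
found = search(RECIPES, "potato soup -celery")
```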

23
Q

How can quotation marks “ ” be used to search Google effectively?

A

Surrounding a phrase in quotation marks means those words will be searched for in that exact order only.

24
Q

What are examples of other search operators that can be used to help refine a search?

A
  • + means search for this AND this (cat + dog)
  • @ searches for a word in social media (@twitter)
  • $ searches for a price ($400)
  • # searches for trending topics (#uwaterloo)
  • * is used as a placeholder in a search (largest * in the world)
  • .. searches for a range ($20..$50 or 10kg..30kg)
  • OR searches for either word; results don't necessarily need to have both (marathon OR race)