04 Architecture of retrieval system 2 Flashcards

1
Q

indexing

A

create a bag-of-words representation of each document's text and store it in a fast lookup structure
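
A minimal sketch of a bag-of-words representation, assuming whitespace tokenisation and lowercasing (the card does not specify either). The function name `bag_of_words` and the two sample documents are hypothetical.

```python
from collections import Counter

def bag_of_words(text):
    """Return a term -> count mapping for one document; word order is discarded."""
    return Counter(text.lower().split())

# Hypothetical two-document collection, keyed by document id.
docs = {1: "the cat sat on the mat", 2: "the dog chased the cat"}

# A dict gives constant-time lookup by document id, and each Counter by term.
bags = {doc_id: bag_of_words(text) for doc_id, text in docs.items()}
```

Looking up `bags[1]["the"]` gives the number of times "the" occurs in document 1; a term that is absent from a document counts as 0.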

2
Q

inverted index

A

primary data structure generated by the indexing process

build a dictionary (vocabulary) of all words in the collection
for each word, list all the documents it occurs in (its posting list)
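
The two steps above can be sketched as follows. This is an illustration only: it assumes whitespace tokenisation, and `build_inverted_index` and the sample documents are hypothetical names and data.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each word to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)  # a set avoids duplicate ids per word
    return {word: sorted(ids) for word, ids in index.items()}

# Hypothetical collection, keyed by document id.
docs = {
    1: "the cat sat on the mat",
    2: "the dog chased the cat",
    3: "a dog barked",
}
index = build_inverted_index(docs)
```

The keys of `index` form the dictionary of words, and each value is the list of documents for that word.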

3
Q

indexing steps

A
  1. lexical analysis (tokenisation)
  2. stop word removal
  3. stemming
  4. index structure creation
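
The four steps above can be chained into a small pipeline. The sketch below makes several assumptions that the card does not state: the stop word list is a tiny illustrative set, and `stem` is a crude suffix stripper standing in for a real stemming algorithm. All function names are hypothetical.

```python
import re
from collections import defaultdict

# Illustrative stop word list; real systems use a much longer one.
STOP_WORDS = {"a", "an", "and", "the", "of", "on", "in", "is", "to"}

def tokenise(text):
    """Step 1: lexical analysis. Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def remove_stop_words(tokens):
    """Step 2: drop very common words that carry little meaning."""
    return [t for t in tokens if t not in STOP_WORDS]

def stem(token):
    """Step 3: crude suffix stripping (an assumption, not a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def build_index(docs):
    """Step 4: build an inverted index over the processed terms."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in remove_stop_words(tokenise(text)):
            index[stem(token)].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

docs = {1: "The cats chased the dog.", 2: "A dog is chasing cats!"}
index = build_index(docs)
```

With this toy stemmer, "chased" and "chasing" both reduce to the same term, so both documents end up in one posting list for it.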
4
Q

relevance estimation

A

compute the relevance of a document to a query
- term weighting scheme: allocates a numeric weight to each term, reflecting its importance
- similarity coefficient: uses the term weights to compute an overall degree of similarity between document and query
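
One possible instance of these two components is sketched below. The card names no specific scheme, so the choice of TF-IDF as the term weighting and cosine as the similarity coefficient is an assumption, as are all function names and the sample data.

```python
import math
from collections import Counter

def tf_idf_weights(tokens, doc_freq, n_docs):
    """Term weighting (assumed TF-IDF): term count times log(N / document frequency)."""
    counts = Counter(tokens)
    return {
        term: tf * math.log(n_docs / doc_freq[term])
        for term, tf in counts.items()
        if term in doc_freq  # terms unseen in the collection get no weight
    }

def cosine(u, v):
    """Similarity coefficient (assumed cosine) between two sparse weight vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical, already tokenised collection.
docs = {
    1: ["cat", "sat", "mat"],
    2: ["dog", "chased", "cat"],
    3: ["dog", "barked"],
}
n_docs = len(docs)
# Document frequency: in how many documents each term occurs.
doc_freq = Counter(t for tokens in docs.values() for t in set(tokens))

doc_vectors = {d: tf_idf_weights(toks, doc_freq, n_docs) for d, toks in docs.items()}
query_vector = tf_idf_weights(["dog", "barked"], doc_freq, n_docs)
scores = {d: cosine(query_vector, vec) for d, vec in doc_vectors.items()}
```

For the query "dog barked", document 3 matches exactly and scores highest, document 2 shares only "dog" and scores lower, and document 1 shares no terms and scores zero.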

5
Q

search algorithm

A

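binary AND search: return only the documents that contain every query term

best match algorithm
- initialise score = 0 for every document
- for each query term, look it up in the vocabulary list and pull out its posting list
- for each document in that posting list, score += 1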
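
Both search strategies can be sketched over a small inverted index. The function names and the sample index are hypothetical, and sorting the best match results by descending score is an assumption, since the card stops at the scoring step.

```python
from collections import defaultdict

def binary_and_search(index, query_terms):
    """Return the ids of documents that contain every query term."""
    postings = [set(index.get(term, ())) for term in query_terms]
    if not postings:
        return []
    return sorted(set.intersection(*postings))

def best_match_search(index, query_terms):
    """Score each document by the number of query terms it contains."""
    # Scores start at 0 implicitly; only documents that match at least
    # one term are stored, which is equivalent for ranking purposes.
    scores = defaultdict(int)
    for term in query_terms:
        for doc_id in index.get(term, ()):  # posting list for this term
            scores[doc_id] += 1
    # Assumed ranking: highest score first, ties broken by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical inverted index: term -> posting list of document ids.
index = {"cat": [1, 2], "dog": [2, 3], "mat": [1]}

and_hits = binary_and_search(index, ["cat", "dog"])
ranked = best_match_search(index, ["cat", "dog"])
```

Here `and_hits` contains only document 2, the one document with both terms, while `ranked` also lists documents 1 and 3 with a lower score, which shows how best match degrades gracefully when a document matches only part of the query.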
