W9 Neural IR 2 Flashcards

1
Q

why would we want dense retrieval?

A

sometimes we need exact matching, but often we also want inexact matching of documents to queries: if we only use exact (term-based) matching in the 1st stage, we might miss relevant documents

2
Q

what is dense retrieval?

A

neural first-stage retrieval, i.e. retrieval using embeddings
- bi-encoder architecture: encode the query and document independently, then compute their relevance

3
Q

bi-encoder architecture: 3 steps

A

1. generate a representation of the query that captures the information need
2. generate a representation of the document that captures the information contained
3. match the query and the document representations to estimate their mutual relevance
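
The three steps above can be sketched with a toy encoder (a hashed bag-of-words projection stands in for a real neural encoder; all names here are illustrative, not from any library):

```python
import numpy as np

DIM = 64

def encode(text: str) -> np.ndarray:
    """Toy 'encoder': hashed bag-of-words -> L2-normalized dense vector.
    A real bi-encoder would use a neural network (e.g. BERT)."""
    vec = np.zeros(DIM)
    for token in text.lower().split():
        vec[hash(token) % DIM] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

# 1. representation of the query (information need)
q_vec = encode("neural dense retrieval")
# 2. representation of the document (information contained)
d_vec = encode("dense retrieval with neural bi-encoders")
# 3. match the two representations to estimate relevance
score = float(np.dot(q_vec, d_vec))
print(round(score, 3))
```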

4
Q

how do we measure the relevance between query and document?

A

use a function to compute the similarity between the query and the document representation vectors
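
Two similarity functions commonly used for this matching step, in a minimal numpy sketch:

```python
import numpy as np

def dot_score(q: np.ndarray, d: np.ndarray) -> float:
    """Dot product: sensitive to vector magnitudes as well as direction."""
    return float(np.dot(q, d))

def cosine_score(q: np.ndarray, d: np.ndarray) -> float:
    """Cosine similarity: dot product of L2-normalized vectors, in [-1, 1]."""
    return float(np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d)))

q = np.array([1.0, 2.0, 0.0])
d = np.array([2.0, 4.0, 0.0])
print(dot_score(q, d), cosine_score(q, d))  # parallel vectors -> cosine 1.0
```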

5
Q

what are the 4 differences between cross-encoders and bi-encoders?

A

cross: one encoder for q and d
bi: separate encoders for q and d

cross: full interaction between words in q and d
bi: no interaction between words in q and d

cross: higher quality ranker than bi

cross: only possible in re-ranking
bi: highly efficient (also in 1st stage)

6
Q

Sentence-BERT

A

commonly used bi-encoder, originally designed for sentence similarity but can also be used for (q, d) pairs

it is a pointwise model, because we only take one d into account per learning item. At inference, we measure the similarity between q and each d and then sort the docs by this similarity
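
The inference procedure described here (score each document independently, then sort) might look like this; the embeddings are made-up toy vectors, not actual Sentence-BERT output:

```python
import numpy as np

# toy pre-computed embeddings (in practice: Sentence-BERT vectors)
q_vec = np.array([0.9, 0.1, 0.0])
doc_vecs = {
    "doc_a": np.array([0.8, 0.2, 0.1]),
    "doc_b": np.array([0.0, 1.0, 0.0]),
    "doc_c": np.array([0.5, 0.5, 0.5]),
}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# pointwise: each document is scored on its own against q
scores = {doc_id: cosine(q_vec, vec) for doc_id, vec in doc_vecs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```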

7
Q

what is the goal of training bi-encoders?

A

given the similarity function, the similarity between the two vectors is maximized for docs relevant to q and minimized for docs non-relevant to q

8
Q

why are bi-encoders less effective than cross-encoders?

A

cross-encoders can learn relevance signals from attention between the query and candidate texts at each transformer encoder layer

9
Q

ColBERT

A

proposed as a model that has the effectiveness of cross-encoders and the efficiency of bi-encoders
- compatible with nearest neighbour search techniques

10
Q

what is nearest-neighbour search?

A

finding which document embedding vectors are most similar to the query embedding vector

computing the similarity for each d,q pair is not scalable => approximate nearest-neighbour (ANN) search
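
A brute-force exact nearest-neighbour search is easy to write but scans every document; ANN libraries (e.g. FAISS, or HNSW-based indexes) trade a little accuracy for large speedups. A minimal exact top-k sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
doc_matrix = rng.normal(size=(10_000, 64))  # one embedding per document
q_vec = rng.normal(size=64)

def topk_exact(q: np.ndarray, docs: np.ndarray, k: int) -> np.ndarray:
    """Exact NN: score every document, keep the k best (O(n) per query)."""
    scores = docs @ q                       # dot-product similarity
    top = np.argpartition(-scores, k)[:k]   # unordered top-k
    return top[np.argsort(-scores[top])]    # sorted by score, best first

print(topk_exact(q_vec, doc_matrix, 5))
```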

11
Q

similarity in ColBERT

A

similarity between d and q is the sum of maximum cosine similarities between each query term and the best matching term in d
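
This MaxSim operator can be sketched directly: with one (assumed L2-normalized) embedding per term, sum over query terms the maximum dot product against any document term:

```python
import numpy as np

def maxsim(q_terms: np.ndarray, d_terms: np.ndarray) -> float:
    """ColBERT late interaction: for each query term embedding, take the
    best-matching document term (max cosine, here via normalized dot
    products), then sum over the query terms."""
    sim = q_terms @ d_terms.T            # (n_q, n_d) term-by-term similarities
    return float(sim.max(axis=1).sum())  # max over doc terms, sum over query terms

def normalize(m: np.ndarray) -> np.ndarray:
    return m / np.linalg.norm(m, axis=1, keepdims=True)

rng = np.random.default_rng(1)
q = normalize(rng.normal(size=(3, 8)))   # 3 query term embeddings
d = normalize(rng.normal(size=(10, 8)))  # 10 document term embeddings
print(round(maxsim(q, d), 3))
```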

12
Q

ColBERT: training loss

A

L(q, d+, d-) = -log( e^(s_{q,d+}) / ( e^(s_{q,d+}) + e^(s_{q,d-}) ) )
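
This pairwise softmax cross-entropy loss can be computed directly (a sketch; s_pos and s_neg stand for the scores s_{q,d+} and s_{q,d-}):

```python
import math

def pairwise_ce_loss(s_pos: float, s_neg: float) -> float:
    """L(q, d+, d-) = -log( e^{s+} / (e^{s+} + e^{s-}) ).
    Algebraically equal to log(1 + e^{s- - s+}), which is numerically
    stabler to evaluate via log1p."""
    return math.log1p(math.exp(s_neg - s_pos))

print(pairwise_ce_loss(2.0, 0.5))  # positive scored higher -> small loss
print(pairwise_ce_loss(0.5, 2.0))  # negative scored higher -> large loss
```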

13
Q

challenges of long documents

A

memory burden of reading the whole document in the encoder

mixture of many topics, query matches may be spread

neural model must aggregate the relevant matches from different parts
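
A common way to handle these challenges is to split the document into passages, score each passage separately, and aggregate the passage scores into one document score (max-passage shown here; the scores are illustrative):

```python
def passage_scores_to_doc_score(scores: list[float], strategy: str = "max") -> float:
    """Aggregate per-passage relevance scores into one document score.
    'max' (best passage wins) and 'mean' are the simplest strategies."""
    if strategy == "max":
        return max(scores)
    if strategy == "mean":
        return sum(scores) / len(scores)
    raise ValueError(strategy)

# e.g. a long, multi-topic document where only one passage matches the query
scores = [0.1, 0.05, 0.92, 0.2]
print(passage_scores_to_doc_score(scores))          # 0.92
print(passage_scores_to_doc_score(scores, "mean"))
```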

14
Q

challenges of short documents

A

fewer query matches

but neural model is more robust towards the vocabulary mismatch problem than term-based matching models

15
Q

what is the long-tail problem?

A

a good IR method must be able to retrieve infrequently searched-for documents and perform reasonably well on queries with rare terms

How well did you know this?
1
Not at all
2
3
4
5
Perfectly