L15 - TF-IDF Approach Flashcards

1
Q
  1. What does TF-IDF stand for?
A
  1. Term Frequency - Inverse Document Frequency
2
Q
  1. How do we calculate the TF-IDF score?
A
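  1. Intuitively: (word counts in the target text) / (word counts in the other texts). More precisely: TF × IDF, where TF = (occurrences of the term in the document) / (number of terms in the document) and IDF = log((number of documents in the corpus) / (number of documents containing the term))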
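Written out as a formula, a sketch under the common textbook definition. The log base and any smoothing terms are assumptions here and differ between libraries; the symbols t (term), d (document), N (number of documents in the corpus) and df(t) (number of documents containing t) are introduced only for illustration.

```latex
\mathrm{tf}(t, d) = \frac{\text{occurrences of } t \text{ in } d}{\text{total terms in } d},
\qquad
\mathrm{idf}(t) = \log \frac{N}{\mathrm{df}(t)},
\qquad
\text{tf-idf}(t, d) = \mathrm{tf}(t, d) \cdot \mathrm{idf}(t)
```

For example, a term that occurs 3 times in a 100-term document has tf = 0.03. If it appears in 10 of 1000 documents, then with a base-10 log idf = log10(100) = 2, giving a TF-IDF score of 0.06.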
3
Q
  1. What is the purpose of TF-IDF?
A
  1. Establishes the importance of terms in a document relative to a corpus of other documents
4
Q
  1. In general, how does TF-IDF work?
A
  1. Calculate the frequency of every term across a corpus, e.g. a corpus of metal lyrics
  2. For each document, calculate the TF-IDF score of each term
5
Q
  1. Explain the steps of the TF-IDF process…
A
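  1. Tokenise the text and normalise the tokens, e.g. by stemming or lemmatisation
  2. For each term in the document, calculate its term frequency (TF): (number of term occurrences) / (number of terms in the document)
  3. For each term, calculate its inverse document frequency (IDF): log((number of documents in the corpus) / (number of documents containing the term))
  4. Multiply the TF and IDF values to get the TF-IDF score for each term
  5. Represent the TF-IDF scores in a document-term matrix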
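The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names `tokenise` and `tf_idf_matrix` and the three sample documents are hypothetical, stemming and lemmatisation are left out to keep the example dependency-free, and the IDF uses an unsmoothed natural log, which is an assumption (libraries often add smoothing).

```python
import math
import re
from collections import Counter


def tokenise(text):
    # Lowercase and keep runs of letters; a fuller pipeline would also
    # stem or lemmatise each token (omitted here, see the lead-in).
    return re.findall(r"[a-z]+", text.lower())


def tf_idf_matrix(documents):
    tokenised = [tokenise(doc) for doc in documents]
    vocabulary = sorted({term for tokens in tokenised for term in tokens})
    n_docs = len(documents)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    # Unsmoothed IDF with a natural log (an assumption, not the only variant).
    idf = {term: math.log(n_docs / doc_freq[term]) for term in vocabulary}

    matrix = []
    for tokens in tokenised:
        counts = Counter(tokens)
        total = len(tokens)
        # One row per document, one column per vocabulary term: TF * IDF.
        row = [
            (counts[term] / total) * idf[term] if total else 0.0
            for term in vocabulary
        ]
        matrix.append(row)
    return vocabulary, matrix


# Hypothetical toy corpus, for illustration only.
docs = [
    "metal riffs and metal drums",
    "soft piano and strings",
    "metal vocals and drums",
]
vocabulary, matrix = tf_idf_matrix(docs)
```

Each row of `matrix` corresponds to one document and each column to one term in `vocabulary`, which is the document-term matrix from the final step. A term that appears in every document, such as "and" in this toy corpus, gets an IDF of log(1) = 0 and therefore a TF-IDF score of 0 everywhere.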