class 9 Flashcards
(35 cards)
what are 3 reasons for computers to do NLP?
- to communicate with humans
- to learn
- to have a better scientific understanding of language and language use
what is a language model?
a probability distribution describing the likelihood of any string
what is a grammar’s purpose?
to define the syntax of legal sentences
what is the purpose of semantic rules?
to define the meaning of the legal sentences
what is the bag-of-words model?
the application of Naive Bayes to a string of words, treating the words as an unordered collection (word order is ignored)
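A minimal sketch of the idea, assuming whitespace tokenization, add-one smoothing, and a made-up four-document training set. The function names `train` and `classify` and the labels "spam" and "ham" are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs is a list of (text, label) pairs; returns class and word counts."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    for text, label in docs:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return class_counts, word_counts

def classify(text, class_counts, word_counts):
    """Pick the label maximizing log P(label) + sum of log P(word | label)."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing (an assumption here) keeps unseen words
            # from driving the probability to zero.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training data, invented for this example.
docs = [
    ("win cash now", "spam"),
    ("cheap cash prize", "spam"),
    ("meeting at noon", "ham"),
    ("lunch meeting tomorrow", "ham"),
]
class_counts, word_counts = train(docs)
print(classify("cash prize now", class_counts, word_counts))
print(classify("now prize cash", class_counts, word_counts))  # same bag, same label
```

Because each word contributes independently to the score, reordering the words of the input does not change the predicted label.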
what is tokenization?
the process of dividing a text into a sequence of words
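One simple way to tokenize, sketched here with a regular expression. The helper name `tokenize` and the pattern are assumptions for illustration; real tokenizers handle contractions, hyphens, and numbers more carefully.

```python
import re

def tokenize(text):
    """Split text into runs of word characters and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("Hello, world! NLP is fun."))
# ['Hello', ',', 'world', '!', 'NLP', 'is', 'fun', '.']
```

Note that this pattern splits a contraction such as "don't" into three tokens, which may or may not be what a given application wants.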
what is an n-gram model?
a Markov chain model that considers only the dependence between n adjacent words
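For n = 2 (a bigram model), the probability of a word given the previous word can be estimated from counts. This is a rough sketch with no smoothing; `train_bigrams` and `bigram_prob` are hypothetical names and the sentence is a toy example.

```python
from collections import Counter, defaultdict

def train_bigrams(tokens):
    """Count how often each word follows each previous word."""
    counts = defaultdict(Counter)
    for prev, word in zip(tokens, tokens[1:]):
        counts[prev][word] += 1
    return counts

def bigram_prob(counts, prev, word):
    """Maximum-likelihood estimate of P(word | prev); 0.0 if prev is unseen."""
    total = sum(counts[prev].values())
    return counts[prev][word] / total if total else 0.0

tokens = "the cat sat on the mat".split()
counts = train_bigrams(tokens)
print(bigram_prob(counts, "the", "cat"))  # 0.5 ("the" is followed by "cat" and "mat")
```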
in what cases would you use n-gram models?
in spam detection, author attribution, and sentiment analysis
what are other alternatives to n-gram models?
character-level models or skip-gram models
what is a structured model that is usually constructed through manual labor?
a dictionary
what’s a common model for POS tagging?
the hidden Markov model (HMM)
an HMM (hidden Markov model) combined with what algorithm can reach an accuracy of ~97%?
Viterbi algorithm
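A compact sketch of Viterbi decoding over a toy HMM. The two tags, the three probability tables, and the function name `viterbi` are invented for illustration; a practical tagger would estimate the tables from a tagged corpus and work with log probabilities to avoid underflow.

```python
def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most likely tag sequence for words under a toy HMM."""
    # best[i][t] = (probability of the best path ending in tag t at word i,
    #               the previous tag on that path)
    best = [{t: (start_p[t] * emit_p[t].get(words[0], 0.0), None) for t in tags}]
    for word in words[1:]:
        column = {}
        for t in tags:
            column[t] = max(
                (best[-1][p][0] * trans_p[p][t] * emit_p[t].get(word, 0.0), p)
                for p in tags
            )
        best.append(column)
    # Trace back from the most likely final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]

# Hypothetical parameters: N = noun, V = verb.
tags = ["N", "V"]
start_p = {"N": 0.8, "V": 0.2}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
emit_p = {
    "N": {"dogs": 0.5, "cats": 0.4, "bark": 0.1},
    "V": {"bark": 0.8, "dogs": 0.1, "chase": 0.1},
}
print(viterbi(["dogs", "bark"], tags, start_p, trans_p, emit_p))  # ['N', 'V']
```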
what is the task of assigning a part of speech to each word in a sentence?
part of speech tagging
what is the corpus of over 3M words of text annotated with POS tags?
the Penn Treebank
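what are some approaches to POS tagging?
logistic regression: a discriminative model, typically run with a greedy search that picks the most likely tag for each word in turn
Viterbi algorithm: finds the most likely tag sequence overall, but is slower
beam search: a compromise between greedy search and Viterbi; keeps only the most likely candidate tags at each step and drops the less likely ones, retaining most of the accuracy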
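A rough sketch of beam search for tagging, assuming a toy HMM as the scoring model (the cards do not say which model the beam search scores with). The tables, the tag names, and the function `beam_search_tag` are hypothetical.

```python
def beam_search_tag(words, tags, start_p, trans_p, emit_p, beam_width=2):
    """Tag words left to right, keeping only the beam_width best partial taggings."""
    # Each beam entry is (probability, list of tags chosen so far).
    beam = [(start_p[t] * emit_p[t].get(words[0], 0.0), [t]) for t in tags]
    beam = sorted(beam, key=lambda e: e[0], reverse=True)[:beam_width]
    for word in words[1:]:
        candidates = [
            (prob * trans_p[path[-1]][t] * emit_p[t].get(word, 0.0), path + [t])
            for prob, path in beam
            for t in tags
        ]
        # Drop the less likely taggings; beam_width=1 reduces to greedy search.
        beam = sorted(candidates, key=lambda e: e[0], reverse=True)[:beam_width]
    return beam[0][1]

# Hypothetical parameters: N = noun, V = verb.
tags = ["N", "V"]
start_p = {"N": 0.8, "V": 0.2}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
emit_p = {
    "N": {"dogs": 0.5, "cats": 0.4, "bark": 0.1},
    "V": {"bark": 0.8, "dogs": 0.1, "chase": 0.1},
}
print(beam_search_tag(["dogs", "bark"], tags, start_p, trans_p, emit_p))  # ['N', 'V']
```

A wider beam explores more candidate taggings at more cost; a beam as wide as the full search space would never discard the best sequence.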
what are examples of generative POS tagging models?
Naive Bayes and HMM
what is the name for a list of allowable words?
lexicon
what are open classes?
nouns, names, verbs, adjectives, and adverbs
change rapidly
what are closed classes?
pronouns, articles, prepositions, etc.
change relatively slowly
how can dynamic programming be used for parsing?
it stores the result of analyzing each substring so the substring doesn’t need to be reanalyzed later.
the results are stored in a data structure called a chart, and an algorithm that works this way is called a chart parser
what is a chart parser algorithm that uses Chomsky Normal Form grammar?
CYK algorithm
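A small sketch of a CYK recognizer, which fills a chart bottom-up from single words to the whole sentence. The grammar, the rule encoding (`lexical` and `binary` dictionaries), and the function name `cyk_recognize` are assumptions made for this example.

```python
def cyk_recognize(words, lexical, binary, start="S"):
    """Return True if words can be derived from start under a CNF grammar."""
    n = len(words)
    # chart[i][j] holds the nonterminals that can span words[i..j] inclusive.
    chart = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        chart[i][i] = set(lexical.get(word, ()))
    for length in range(2, n + 1):           # span length
        for i in range(n - length + 1):      # span start
            j = i + length - 1               # span end
            for k in range(i, j):            # split point
                for left in chart[i][k]:
                    for right in chart[k + 1][j]:
                        # Reuse stored results for the two shorter spans.
                        chart[i][j] |= binary.get((left, right), set())
    return n > 0 and start in chart[0][n - 1]

# Toy grammar in Chomsky Normal Form (hypothetical).
lexical = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "chased": {"V"}}
binary = {("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}, ("NP", "VP"): {"S"}}

print(cyk_recognize("the dog chased the cat".split(), lexical, binary))  # True
print(cyk_recognize("dog the chased".split(), lexical, binary))          # False
```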
can natural language be fully described by a context-free grammar?
no; it depends heavily on contextual evidence
what are some tasks of NLP?
text-to-speech
machine translation
speech recognition
question answering
what are some complications of real natural language?
when a word has more than one meaning: lexical ambiguity
when a phrase has multiple parses: syntactic ambiguity