Language Flashcards
(33 cards)
spans all tasks where the AI gets human language as input
Natural Language Processing
examples of tasks in Natural Language Processing
• automatic summarization
• information extraction
• language identification
• machine translation
• named entity recognition
• speech recognition
• text classification
• word sense disambiguation
sentence structure
Syntax
meaning of words or sentences
Semantics
system of rules for generating sentences in a language
Formal Grammar
text is abstracted from its meaning to represent the structure of the sentence using formal grammar
Context-Free Grammar
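A minimal sketch of how a context-free grammar can generate sentences, assuming a toy rule set invented for illustration (the symbols S, NP, VP, D, N, V and the vocabulary are not from the cards). Each non-terminal is rewritten with one of its rules until only words remain.

```python
import random

# Hypothetical toy grammar: each non-terminal maps to its possible rewrites.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["D", "N"], ["N"]],
    "VP": [["V"], ["V", "NP"]],
    "D": [["the"], ["a"]],
    "N": [["she"], ["city"], ["car"]],
    "V": [["saw"], ["walked"]],
}

def expand(symbol, rng):
    """Rewrite a symbol recursively until only terminal words are left."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal symbol: an actual word
    production = rng.choice(GRAMMAR[symbol])
    words = []
    for part in production:
        words.extend(expand(part, rng))
    return words

sentence = expand("S", random.Random(0))
print(" ".join(sentence))
```

The rules describe structure only, so the output is always grammatical under this grammar but not necessarily meaningful.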
a contiguous sequence of n items from a sample of text
n-gram
a contiguous sequence of n characters from a sample of text
character n-gram
a contiguous sequence of n words from a sample of text
word n-gram
a contiguous sequence of 1 item from a sample of text
unigram
a contiguous sequence of 2 items from a sample of text
bigram
a contiguous sequence of 3 items from a sample of text
trigram
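The n-gram cards above can be illustrated with one small helper. This is a sketch; the function name `ngrams` and the sample text are assumptions made for the example. Passing a list of words gives word n-grams, and passing a string gives character n-grams.

```python
def ngrams(items, n):
    """Return every contiguous run of n items as a tuple."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]

words = "the cat sat on the mat".split()

unigrams = ngrams(words, 1)           # 6 word unigrams
word_bigrams = ngrams(words, 2)       # first one is ("the", "cat")
char_trigrams = ngrams("cat sat", 3)  # first one is ("c", "a", "t")
```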
task of splitting a sequence of characters into pieces (tokens)
Tokenization
the task of splitting a sequence of characters into words
word tokenization
the task of splitting a sequence of characters into sentences
sentence tokenization
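A rough sketch of both kinds of tokenization using only regular expressions. The helper names and the splitting rules are assumptions chosen for simplicity, not a standard tokenizer.

```python
import re

def word_tokenize(text):
    # Assumed rule: a word is a run of letters, digits, or apostrophes.
    return re.findall(r"[A-Za-z0-9']+", text)

def sentence_tokenize(text):
    # Assumed rule: a sentence ends at ., !, or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

text = "The cat sat. Did it purr? Yes!"
sentences = sentence_tokenize(text)  # ["The cat sat.", "Did it purr?", "Yes!"]
tokens = word_tokenize(text)         # ["The", "cat", "sat", "Did", "it", "purr", "Yes"]
```

Rules this simple break on abbreviations: "Dr. Smith" would be split into two sentences, which is why practical tokenizers need more careful handling.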
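models that predict the next item in a sequence from a fixed number of preceding items; text is generated by repeatedly sampling the next word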
Markov Models
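One way to generate text with a Markov model, sketched under the assumption of a first-order chain over words (each word depends only on the word before it). The function names, corpus, and seed are illustrative.

```python
import random
from collections import defaultdict

def build_chain(words):
    """Map each word to the list of words observed right after it."""
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length, seed=None):
    """Walk the chain, sampling each next word from the observed followers."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break  # dead end: this word was never followed by anything
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)

corpus = "the cat sat on the mat and the cat slept".split()
chain = build_chain(corpus)
sample = generate(chain, "the", 6, seed=0)
print(sample)
```

Followers are stored with repeats, so sampling uniformly from the list picks each next word in proportion to how often it followed the current one.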
a model that represents text as an unordered collection of words.
Bag-of-words Model
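A small sketch of the bag-of-words idea using `collections.Counter`. The helper name and the lowercase, whitespace-split tokenization are simplifying assumptions.

```python
from collections import Counter

def bag_of_words(text):
    # Keep only which words occur and how often; word order is discarded.
    return Counter(text.lower().split())

a = bag_of_words("the dog chased the cat")
b = bag_of_words("the cat chased the dog")

print(a == b)  # True: same words and counts, even though the meanings differ
```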
adding a value α to each value in our distribution to smooth the data
additive smoothing
adds 1 to each value in our distribution, pretending that all values have been observed at least once.
Laplace Smoothing
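A sketch of additive smoothing, assuming the common formulation in which α is added to every count and the total is adjusted to match: (count + α) / (total + α × number of values). The function name and the counts are made up for illustration. Laplace smoothing is the case α = 1.

```python
def additive_smoothing(counts, alpha=1.0):
    """Turn raw counts into probabilities after adding alpha to each count."""
    total = sum(counts.values()) + alpha * len(counts)
    return {value: (count + alpha) / total for value, count in counts.items()}

counts = {"good": 3, "bad": 1, "okay": 0}

laplace = additive_smoothing(counts, alpha=1)
# good: 4/7, bad: 2/7, okay: 1/7
```

Without smoothing, "okay" would get probability 0, and any product of probabilities that includes it would collapse to 0.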
task of finding relevant documents in response to a user query.
Information retrieval
models for discovering the topics for a set of documents
topic modeling
counting how many times a term appears in a document.
term frequency
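Term frequency can be sketched as a per-document word count. The document names, texts, and the whitespace tokenization are assumptions for the example.

```python
from collections import Counter

def term_frequency(document):
    """Count how many times each term appears in one document."""
    return Counter(document.lower().split())

documents = {
    "doc1": "the cat sat on the mat",
    "doc2": "the dog barked at the cat and the bird",
}

tf = {name: term_frequency(text) for name, text in documents.items()}

print(tf["doc1"]["the"])  # 2
print(tf["doc2"]["the"])  # 3
print(tf["doc1"]["dog"])  # 0: the term does not appear in doc1
```

In both documents the highest count belongs to "the", which shows how raw term frequency tends to be dominated by common words.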
words that have little meaning on their own, but are used to grammatically connect other words
function words
am, by, do, is, which, with, yet
function words