NLP Flashcards
(25 cards)
What is NLP?
NLP (Natural Language Processing) is a field of AI that helps computers understand, interpret, and generate human language.
How do you install the spaCy library?
!pip install spacy
How do you download the English language model for spaCy?
!python -m spacy download en_core_web_sm
How do you load the English spaCy model?
import spacy; nlp = spacy.load('en_core_web_sm')
How do you create a spaCy document object from text?
doc = nlp(text)
How do you perform sentence boundary detection in spaCy?
Use doc.sents and loop through it: for sent in doc.sents: print(sent.text)
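As a minimal sketch of sentence boundary detection, the example below uses a blank English pipeline with spaCy's rule-based sentencizer component, so it runs without a downloaded model. This is a swapped-in technique: a trained pipeline such as en_core_web_sm sets sentence boundaries with its own components, and its splits may differ. The sample text is an assumption for illustration.

```python
import spacy

# A blank English pipeline has a tokenizer but no trained components.
nlp = spacy.blank("en")
# The rule-based sentencizer splits on sentence-final punctuation.
nlp.add_pipe("sentencizer")

doc = nlp("spaCy is an NLP library. It splits text into sentences.")

# doc.sents is a generator of Span objects; collect the text of each one.
sentences = [sent.text for sent in doc.sents]
for sentence in sentences:
    print(sentence)
```

Because doc.sents is a generator, wrap it in list(...) before indexing into it.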
What is tokenization?
Breaking text into individual units such as words or punctuation.
How do you tokenize a document in spaCy?
for token in doc: print(token.text)
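Tokenization needs only the tokenizer, so it can be tried on a blank English pipeline. The sketch below assumes a made-up sample sentence and collects the token texts into a list.

```python
import spacy

# The tokenizer is available even in a blank pipeline (no model download).
nlp = spacy.blank("en")
doc = nlp("Hello, world! spaCy splits text.")

# Each Token exposes its original text via token.text.
tokens = [token.text for token in doc]
print(tokens)
```

Punctuation marks come out as separate tokens rather than staying attached to the neighboring word.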
What is lemmatization?
Reducing a word to its base or dictionary form.
How do you perform lemmatization in spaCy?
token.lemma_
What is Part of Speech (POS) tagging?
Labeling each word with its grammatical category, such as noun, verb, or adjective.
How do you perform POS tagging in spaCy?
token.pos_
What is Named Entity Recognition (NER)?
Identifying and classifying real-world entities in text, such as people, places, organizations, and dates.
How do you perform Named Entity Recognition in spaCy?
for ent in doc.ents: print(ent.text, ent.label_)
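To show the shape of doc.ents without a trained model, the sketch below swaps in spaCy's rule-based entity_ruler component with two hand-written patterns. This is not statistical NER: a trained pipeline predicts entities on its own. The company name, the patterns, and the sentence are hypothetical.

```python
import spacy

nlp = spacy.blank("en")

# Rule-based stand-in for a trained NER component (assumption for this sketch).
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "ORG", "pattern": "Acme Corp"},  # hypothetical organization
    {"label": "GPE", "pattern": "Paris"},
])

doc = nlp("Acme Corp opened an office in Paris.")

# Each entity is a Span with its text and a string label.
entities = [(ent.text, ent.label_) for ent in doc.ents]
for text, label in entities:
    print(text, label)
```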
How do you calculate semantic similarity between two documents in spaCy?
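doc1.similarity(doc2) (results are more meaningful with a model that includes word vectors, such as en_core_web_md; en_core_web_sm ships without them)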
What does token.like_url do?
Returns True if the token looks like a URL.
What does token.like_email do?
Returns True if the token looks like an email address.
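Both flags are lexical attributes, so they can be checked on a blank pipeline. A minimal sketch, assuming a made-up sentence with a placeholder address and URL:

```python
import spacy

nlp = spacy.blank("en")
doc = nlp("Write to hello@example.com or visit https://example.com today")

# Filter tokens by the boolean lexical flags.
urls = [token.text for token in doc if token.like_url]
emails = [token.text for token in doc if token.like_email]

print("URLs:", urls)
print("Emails:", emails)
```

Both checks are pattern-based: they test whether the token looks like a URL or an email address, not whether it actually resolves.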
How do you get the syntactic dependency of a token?
token.dep_
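The lemma, POS, and dependency attributes are normally filled in by a trained pipeline. So that the sketch runs without a model, the Doc below is annotated by hand, which is an assumption made purely for illustration. The helper name token_table is hypothetical.

```python
from spacy.tokens import Doc
from spacy.vocab import Vocab


def token_table(doc):
    """Return (text, lemma, POS, dependency) for each token in a Doc."""
    return [(t.text, t.lemma_, t.pos_, t.dep_) for t in doc]


# Hand-annotated Doc: heads are absolute token indices, and the root
# token points at itself.
doc = Doc(
    Vocab(),
    words=["Cats", "sleep"],
    lemmas=["cat", "sleep"],
    pos=["NOUN", "VERB"],
    heads=[1, 1],
    deps=["nsubj", "ROOT"],
)

rows = token_table(doc)
for row in rows:
    print(row)
```

With en_core_web_sm installed, the same helper can be applied to a processed text, for example token_table(nlp("Cats sleep")). The predicted labels then come from the model.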
How do you access the language of the token?
token.lang_ (the language code of the pipeline's vocabulary, for example 'en')
How do you access the named entities in a spaCy document?
Use doc.ents
How do you convert a sentence into a list of tokens?
[token.text for token in doc]
How do you access a specific sentence from a doc?
sentences = list(doc.sents); sentences[index]
How do you get the lemma of all tokens in a doc?
[token.lemma_ for token in doc]
How do you identify if a token is a punctuation mark?
token.is_punct
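As a short sketch, token.is_punct can be used to separate punctuation from words. The sentence below is an assumed example, and a blank pipeline is enough because the flag is a lexical attribute.

```python
import spacy

nlp = spacy.blank("en")
doc = nlp("Wait, what? Yes!")

# Split the tokens into punctuation marks and everything else.
punctuation = [token.text for token in doc if token.is_punct]
words = [token.text for token in doc if not token.is_punct]

print(punctuation)
print(words)
```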