Topic 3 - Natural Language Services Flashcards
(27 cards)
Phonology
The branch of linguistics that studies the systematic arrangement of sounds in a language.
Morphology
The study of the internal structure of words, which are built from morphemes, the smallest units of meaning.
Free/Base Morphemes
A type of morpheme where the word cannot be divided and has meaning by itself (e.g. table, phone).
Bound Morphemes
A type of morpheme that cannot stand alone and occurs only as part of a word, typically as a prefix or suffix (e.g. un- in unhappy, -s in cats).
Inflectional Morphology
Changes a word's grammatical function (e.g. tense or number) without creating a new word (e.g. run, running, ran).
Derivational Morphology
Creates a new word from a base word by adding an affix (e.g. re + act = react, act + or = actor).
Lexical Analysis
The interpretation of the meaning of individual words, assigning part-of-speech (PoS) tags, and using techniques such as stemming and lemmatization.
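The difference between stemming and lemmatization can be sketched with a toy example. This is an illustration only: the suffix list and the lemma table below are hypothetical and deliberately tiny, whereas real systems use a full stemming algorithm and a dictionary-backed lemmatizer.

```python
# Hypothetical, deliberately tiny resources for illustration only.
SUFFIXES = ("ing", "ed", "s")
LEMMAS = {"ran": "run", "running": "run", "better": "good"}


def stem(word: str) -> str:
    """Crude stemming: strip the first matching suffix, keeping at least 3 characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def lemmatize(word: str) -> str:
    """Dictionary lookup: return the base form if known, otherwise the word itself."""
    return LEMMAS.get(word, word)


for w in ("running", "ran", "cats"):
    print(w, "-> stem:", stem(w), "| lemma:", lemmatize(w))
```

Stemming chops characters and may produce non-words ("runn"), while lemmatization maps a word to its dictionary form ("run") and can handle irregular forms such as "ran".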
Syntax
The study of the structure of phrases and sentences; the process of applying it to text is known as Parsing. It examines word order, stop words, morphology, and PoS to uncover phrases that convey more meaning than individual words.
Parsing
Also known as syntactic analysis, the process of uncovering phrases that convey more meaning than individual words by examining word order, stop words, morphology, and PoS.
Semantic Analysis
Determines the proper meaning of a sentence by identifying the most relevant words to derive concepts, which involves disambiguating words with multiple meanings. Focuses on the literal meaning of words.
Pragmatic Analysis
Focuses on knowledge that comes from outside the content of the document (i.e. what the speaker implies or the listener infers): the inferred meaning.
Pragmatic Ambiguity
Arises when different people derive different interpretations of the same text based on inferred meaning.
Google Cloud Natural Language API
A service that exposes pre-trained models for performing various natural language analyses on documents.
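As a rough sketch of what a call to the API looks like, the snippet below only builds the JSON request body for a plain-text document; it does not send anything. The helper name build_request is hypothetical, and authentication is left out entirely.

```python
import json


def build_request(text: str) -> str:
    """Build the JSON body shared by the documents:analyze* REST methods.

    The body would be POSTed to an endpoint such as
    https://language.googleapis.com/v1/documents:analyzeSyntax
    (sending and authentication are omitted in this sketch).
    """
    body = {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }
    return json.dumps(body)


print(build_request("The cat sat on the mat."))
```

The same document structure is reused across the syntax, entity, and sentiment methods, so only the method name in the URL changes.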
analyzeSyntax
A Google Cloud Natural Language API method that performs Sentence Extraction and Tokenization.
Sentence Extraction
An operation performed by the analyzeSyntax method to identify sentences within a document.
Tokenization
An operation performed by the analyzeSyntax method to break down text into individual tokens (words, punctuation, etc.).
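To make the two operations concrete, here is a naive regex-based sketch of sentence extraction and tokenization. It is not how the API works internally; it is an assumption-laden approximation that ignores abbreviations, quotes, and other edge cases.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Naive sentence extraction: split after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence: str) -> List[str]:
    """Naive tokenization: words (with an optional apostrophe part) or single punctuation marks."""
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", sentence)


for sentence in split_sentences("The cat sat. It purred!"):
    print(sentence, "->", tokenize(sentence))
```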
Token
An individual unit of text identified during Tokenization, containing information such as text, partOfSpeech, dependencyEdge, and lemma.
Dependency Tree/Parsing
A representation of the grammatical relationships between words in a sentence, built using the information from tokens during syntax analysis.
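The sketch below shows how a dependency tree could be assembled from token data. The token list is hand-written to mimic the fields named above (text, partOfSpeech, dependencyEdge, lemma) and is not real API output; it assumes each dependencyEdge carries a headTokenIndex and a label, and that the root token points to its own index.

```python
from collections import defaultdict

# Hand-written sample tokens for "The cat sat" (not real API output).
tokens = [
    {"text": {"content": "The"}, "partOfSpeech": {"tag": "DET"},
     "dependencyEdge": {"headTokenIndex": 1, "label": "DET"}, "lemma": "The"},
    {"text": {"content": "cat"}, "partOfSpeech": {"tag": "NOUN"},
     "dependencyEdge": {"headTokenIndex": 2, "label": "NSUBJ"}, "lemma": "cat"},
    {"text": {"content": "sat"}, "partOfSpeech": {"tag": "VERB"},
     "dependencyEdge": {"headTokenIndex": 2, "label": "ROOT"}, "lemma": "sit"},
]


def build_tree(tokens):
    """Return (root_index, children) where children maps a head index to its dependents."""
    children = defaultdict(list)
    root = None
    for i, tok in enumerate(tokens):
        head = tok["dependencyEdge"]["headTokenIndex"]
        if head == i:          # assumed convention: the root is its own head
            root = i
        else:
            children[head].append(i)
    return root, dict(children)


def render(tokens, children, index, depth=0):
    """Return the subtree under `index` as indented lines of 'word (LABEL)'."""
    tok = tokens[index]
    lines = ["  " * depth + f'{tok["text"]["content"]} ({tok["dependencyEdge"]["label"]})']
    for child in children.get(index, []):
        lines.extend(render(tokens, children, child, depth + 1))
    return lines


root, children = build_tree(tokens)
print("\n".join(render(tokens, children, root)))
```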
analyzeEntities
A Google Cloud Natural Language API method that identifies “Named entities” within a document.
Named Entity
A “real-world object” that is assigned a name, such as a person, country, product, or organization.
Salience
A field in the entity analysis response indicating the importance or relevance of an entity to the entire document text. It ranges from 0.0 to 1.0, with scores closer to 1.0 indicating higher importance.
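One common use of salience is to rank entities and keep only the most central ones. The sketch below uses invented entities and salience values purely for illustration; the helper rank_by_salience and its min_salience cutoff are hypothetical.

```python
# Invented sample entities (names, types, and salience values are made up).
entities = [
    {"name": "Paris", "type": "LOCATION", "salience": 0.27},
    {"name": "Acme", "type": "ORGANIZATION", "salience": 0.62},
    {"name": "Alice", "type": "PERSON", "salience": 0.11},
]


def rank_by_salience(entities, min_salience=0.0):
    """Drop entities below the cutoff and sort the rest, most salient first."""
    kept = [e for e in entities if e["salience"] >= min_salience]
    return sorted(kept, key=lambda e: e["salience"], reverse=True)


for entity in rank_by_salience(entities, min_salience=0.2):
    print(f'{entity["name"]} ({entity["type"]}): {entity["salience"]:.2f}')
```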
analyzeSentiment
A Google Cloud Natural Language API method that provides sentiment analysis at the sentence and document levels.
Score (Sentiment)
Indicates the overall emotion of text, ranging from -1.0 (negative) to 1.0 (positive).
Magnitude (Sentiment)
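Indicates how much emotional content is present in text, regardless of whether it is positive or negative. It ranges from 0.0 to infinity and is not normalized, so longer texts tend to have higher magnitudes.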
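Score and magnitude are usually read together: a score near zero with low magnitude suggests neutral text, while a score near zero with high magnitude suggests mixed emotions that cancel out. The sketch below encodes that reading; the function name and both thresholds are assumptions chosen for illustration and should be tuned for a real use case.

```python
def describe_sentiment(score: float, magnitude: float,
                       score_threshold: float = 0.25,
                       magnitude_threshold: float = 1.0) -> str:
    """Map a (score, magnitude) pair to a coarse label.

    Both thresholds are illustrative assumptions, not values defined by the API.
    """
    if score >= score_threshold:
        return "positive"
    if score <= -score_threshold:
        return "negative"
    # Score is close to zero: magnitude separates mixed from neutral text.
    if magnitude >= magnitude_threshold:
        return "mixed"
    return "neutral"


for score, magnitude in [(0.8, 3.0), (-0.6, 4.0), (0.0, 4.0), (0.1, 0.0)]:
    print(score, magnitude, "->", describe_sentiment(score, magnitude))
```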