Natural Language Processing Flashcards
(33 cards)
What is Natural Language Processing (NLP)?
A core area of AI that focuses on enabling computers to understand, interpret, and generate human language, both written and spoken
What is the ultimate goal of NLP?
To make sense of text in a way similar to humans, allowing machines to process human languages in relevant contexts
What is tokenization in NLP?
A preprocessing step where text is broken down into distinct words, parts of words, punctuation marks, or emojis, each represented by a unique numeric identifier
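As a rough illustration of the card above, the sketch below splits text into word, punctuation, and symbol tokens and gives each distinct token an integer ID. It is a toy under stated assumptions: the function names `tokenize` and `build_vocab` are hypothetical, and production tokenizers usually split text into subword pieces with a learned vocabulary rather than using a regular expression.

```python
import re

def tokenize(text):
    # Lowercase, then split into runs of word characters or single
    # non-space symbols (punctuation marks, emojis).
    return re.findall(r"\w+|[^\w\s]", text.lower())

def build_vocab(tokens):
    # Give each distinct token the next unused integer ID,
    # in order of first appearance.
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

tokens = tokenize("I heard a dog bark, a dog!")
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]
print(tokens)
print(ids)
```

Repeated tokens such as "a" and "dog" map to the same ID each time they appear, which is the property the card describes.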
What are embeddings in NLP?
Multi-valued numeric vectors that represent tokens and capture semantic relationships between them, so that words with similar meanings have similar embedding vectors
Give an example of how embeddings work
‘Dog’ and ‘puppy’ have vectors pointing in almost identical directions, ‘cat’ points in a similar direction, and ‘skateboard’ points in a clearly different one
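One common way to compare the directions of two embedding vectors is cosine similarity. The sketch below uses made-up three-dimensional vectors chosen only for illustration (real embeddings are learned and have many more dimensions), and the helper name `cosine_similarity` is an assumption, not part of any particular library.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means same direction,
    # 0.0 means unrelated (orthogonal), negative means opposing directions.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Illustrative, hand-picked vectors (not learned embeddings).
embeddings = {
    "dog": [10.0, 3.0, 2.0],
    "puppy": [5.0, 2.0, 1.0],
    "cat": [10.0, 3.0, 1.0],
    "skateboard": [-3.0, 3.0, 2.0],
}

for word in ("puppy", "cat", "skateboard"):
    score = cosine_similarity(embeddings["dog"], embeddings[word])
    print(f"dog vs {word}: {score:.3f}")
```

With these toy values, "dog" scores close to 1 against "puppy" and "cat" and below 0 against "skateboard".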
What is positional encoding?
A technique that ensures the model doesn’t lose word order when processing natural language by adding a positional vector to each word’s embedding
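The card does not say which positional scheme is meant. As one possibility, the sketch below uses the sinusoidal encoding from the original Transformer design and adds it element-wise to each embedding. The names `positional_vector` and `add_position` are hypothetical, and the four-dimensional embeddings are placeholders.

```python
import math

def positional_vector(pos, dim):
    # Sinusoidal encoding: even indices use sine, odd indices use cosine,
    # and each sine/cosine pair shares one frequency.
    vec = []
    for i in range(dim):
        angle = pos / (10000 ** ((i - i % 2) / dim))
        vec.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return vec

def add_position(embeddings):
    # Add the positional vector for each position to that token's embedding.
    dim = len(embeddings[0])
    return [
        [e + p for e, p in zip(emb, positional_vector(pos, dim))]
        for pos, emb in enumerate(embeddings)
    ]

# The same placeholder embedding appearing at positions 0 and 1.
same_word_twice = [[1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
encoded = add_position(same_word_twice)
print(encoded[0])
print(encoded[1])
```

Although both inputs are identical, the two encoded vectors differ, so the model can tell the first occurrence from the second.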
What are attention layers?
Layers critical for determining how important each word or token is to the meaning of a sentence, especially in relation to other words
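A minimal sketch of how such importance weights can be computed, assuming the scaled dot-product form of attention used in Transformer models. The two-dimensional vectors and the function names `softmax` and `attention` are illustrative only, and real attention layers also apply learned projection matrices that are omitted here.

```python
import math

def softmax(scores):
    # Turn raw scores into positive weights that sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    # Score each key against the query, scale by sqrt(dimension),
    # then return the weights and the weighted average of the values.
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return weights, output

# Placeholder token vectors, used as both keys and values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, output = attention([1.0, 0.0], tokens, tokens)
print(weights)
print(output)
```

Tokens whose vectors align with the query receive larger weights, so they contribute more to the output vector.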
How do attention layers work in encoder blocks?
They help determine an appropriate vector embedding for each word based on its context and its relationship to the other words that frequently appear with it
How do attention layers work in decoder blocks?
They predict the next token by considering the sequence of tokens so far, identifying the most influential tokens to decide what comes next
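"The sequence of tokens so far" is typically enforced with a causal mask: each position may attend only to itself and earlier positions. The sketch below shows that masking on placeholder vectors. The name `causal_attention_weights` is hypothetical, and learned projections and the final next-token prediction step are left out.

```python
import math

def causal_attention_weights(vectors):
    # For each position i, compute attention weights over positions 0..i only;
    # later positions get weight 0.0 because they are not visible yet.
    dim = len(vectors[0])
    all_weights = []
    for i, query in enumerate(vectors):
        visible = vectors[: i + 1]
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in visible
        ]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps] + [0.0] * (len(vectors) - i - 1)
        all_weights.append(row)
    return all_weights

causal = causal_attention_weights([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
for row in causal:
    print(row)
```

The first row places all of its weight on the first token, since nothing else has been seen at that point.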
What are Transformer models?
Models that are especially good at understanding and generating language, forming the basis for today’s Large Language Models (LLMs)
What is sentiment analysis?
Calculating scores that indicate the emotional tone of text (positive, negative, or neutral), returning labels and confidence scores at both the sentence and document level
What works better for sentiment analysis - small or large amounts of text?
Small amounts of text work better for sentiment analysis
What is opinion mining?
Aspect-based sentiment analysis that provides granular information about opinions related to specific aspects within text
What is key phrase extraction?
Identifying main points or concepts in a document, which works best with larger amounts of text
What is Named Entity Recognition (NER)?
Identifying and categorizing entities in text such as people, places, objects, quantities, and events
What is a subset of NER?
Personally Identifiable Information (PII) detection
What is machine translation?
Automating the translation of written and spoken language between different languages, including text, documents, and custom domain-specific terms
What is conversational AI?
AI agents that engage in dialogue with human users (bots/chatbots), requiring language models to interpret requests, determine intent, and formulate responses
What is question answering in NLP?
Defining a knowledge base of question-and-answer pairs from FAQ documents or custom entries, used by client applications to respond to user input
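The idea of a question-and-answer knowledge base can be sketched with a simple word-overlap lookup. This is a toy, not how any particular service matches questions: the `answer` function, the sample entries, and the Jaccard overlap score are all assumptions made for illustration.

```python
import re

def words(text):
    # Lowercased set of word tokens in the text.
    return set(re.findall(r"\w+", text.lower()))

def answer(user_input, knowledge_base, default="Sorry, I don't know."):
    # Return the reply whose stored question shares the most words
    # with the user input (Jaccard overlap), or a default if none overlap.
    query = words(user_input)
    best_answer, best_score = default, 0.0
    for question, reply in knowledge_base:
        q_words = words(question)
        union = query | q_words
        score = len(query & q_words) / len(union) if union else 0.0
        if score > best_score:
            best_answer, best_score = reply, score
    return best_answer

# Hypothetical FAQ entries.
knowledge_base = [
    ("What are your opening hours?", "We are open 9am to 5pm, Monday to Friday."),
    ("How do I reset my password?", "Use the 'Forgot password' link on the sign-in page."),
]

print(answer("how can I reset my password", knowledge_base))
```

A client application would pass each user message to a lookup like this and display the returned reply.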
What does CLU stand for?
Conversational Language Understanding
What does CLU do?
Detects the intent of a user’s utterance (what the user wants to do) and identifies entities (the specific items referenced in the utterance)
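To make the intent/entity split concrete, here is a toy keyword matcher. It is only an illustration of the output shape: a real CLU model is trained on labeled utterances rather than using fixed keyword lists, and the intent names, device list, and `understand` function are hypothetical.

```python
# Hypothetical intents and the phrases that signal them.
INTENT_KEYWORDS = {
    "TurnOn": ["turn on", "switch on"],
    "TurnOff": ["turn off", "switch off"],
}
# Hypothetical entity values the application cares about.
DEVICES = ["light", "fan", "heater"]

def understand(utterance):
    # Pick the first intent whose phrase occurs in the utterance,
    # and collect any known devices mentioned as entities.
    text = utterance.lower()
    intent = "None"
    for name, phrases in INTENT_KEYWORDS.items():
        if any(p in text for p in phrases):
            intent = name
            break
    entities = [d for d in DEVICES if d in text]
    return {"intent": intent, "entities": entities}

print(understand("Turn on the light"))
print(understand("Switch off the fan please"))
```

In the first utterance the intent is what the user wants done (turn something on) and the entity is the item it applies to (the light).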
What is Azure AI Language Service?
Provides pre-trained and customizable deep learning models for text analysis including sentiment analysis, key phrase extraction, language detection, NER, and summarization
How do you access Azure AI Language Service?
Through Language Studio, a web-based portal; the service can also be called from code through its REST API and client SDKs
What are the three main capabilities of Azure AI Speech Service?
Speech-to-Text (transcribes speech to text), Text-to-Speech (synthesizes speech from text), and Speech Translation (translates spoken language)