Identify features of common NLP Workload Scenarios Flashcards

1
Q

Natural Language Processing (NLP)

A

For computer systems to interpret the subject of a text in a way similar to how humans do, they use natural language processing (NLP), an area within AI that deals with understanding written or spoken language and responding in kind.

-Text analysis describes NLP processes that extract information from unstructured text.

Natural language processing might be used to create:

-A social media feed analyzer that detects sentiment for a product marketing campaign.
-A document search application that summarizes documents in a catalog.
-An application that extracts brands and company names from text.

2
Q

Text Analysis - Techniques

A
3
Q

Tokenization

A

The first step in analyzing a corpus is to break it down into tokens. For the sake of simplicity, you can think of each distinct word in the training text as a token.

The phrase “we choose to go to the moon” can be represented by the tokens [1,2,3,4,3,5,6].
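
A minimal sketch of this idea, assuming the simplest possible scheme: split on whitespace and give each distinct word the next unused ID in order of first appearance. The function name `tokenize` is hypothetical, and real tokenizers are considerably more sophisticated.

```python
def tokenize(text):
    """Lowercase the text, split on whitespace, and map each distinct word to an ID."""
    vocabulary = {}  # word -> token ID, assigned in order of first appearance
    token_ids = []
    for word in text.lower().split():
        if word not in vocabulary:
            vocabulary[word] = len(vocabulary) + 1  # IDs start at 1
        token_ids.append(vocabulary[word])
    return token_ids, vocabulary


ids, vocab = tokenize("we choose to go to the moon")
print(ids)    # [1, 2, 3, 4, 3, 5, 6]
print(vocab)  # the word "to" appears twice but has a single ID (3)
```

Note that the repeated word "to" reuses token 3, which is why the sequence has seven entries but only six distinct tokens.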

4
Q

Frequency Analysis

A

After tokenizing the words, you can perform some analysis to count the number of occurrences of each token. The most commonly used words (other than stop words such as “a”, “the”, and so on) can often provide a clue as to the main subject of a text corpus.

For example, if the most frequent remaining words in a speech are terms such as “moon” and “space”, we can reasonably surmise that the text is primarily concerned with space travel and going to the moon.
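
A rough sketch of frequency analysis, assuming a tiny hand-picked stop word list and a made-up sample sentence (neither comes from the original material). The helper name `top_terms` is hypothetical.

```python
from collections import Counter

# Illustrative stop word list only; real lists are much longer.
STOP_WORDS = {"a", "an", "the", "to", "we", "and", "of", "in", "is", "it"}


def top_terms(text, n=3):
    """Count non-stop-word tokens and return the n most frequent ones."""
    words = [w.strip(".,;:!?\"'").lower() for w in text.split()]
    counts = Counter(w for w in words if w and w not in STOP_WORDS)
    return counts.most_common(n)


# Made-up sample text for illustration.
sample = "We choose to go to the moon. The moon is far, and we go to the moon in a rocket."
print(top_terms(sample, 2))  # [('moon', 3), ('go', 2)]
```

Even on this toy input, the most frequent remaining word hints at the subject of the text.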

5
Q

Machine learning for Text Classification

A

Another useful text analysis technique is to use a classification algorithm, such as logistic regression, to train a machine learning model that classifies text based on a known set of categorizations.

A common application of this technique is to train a model that classifies text as positive or negative in order to perform sentiment analysis or opinion mining.

With enough labeled reviews, you can train a classification model using the tokenized text as features and the sentiment (0 or 1) as the label.
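
The sketch below illustrates the idea with a hand-written logistic regression over bag-of-words counts, so that it runs without any third-party library. The four reviews are invented, the function names are hypothetical, and the training loop is plain stochastic gradient descent; in practice you would use far more data and an established machine learning library.

```python
import math


def featurize(text, vocabulary):
    """Bag-of-words vector: how often each vocabulary word occurs in the text."""
    words = text.lower().split()
    return [words.count(term) for term in vocabulary]


def train(samples, labels, epochs=200, lr=0.5):
    """Fit logistic regression weights with stochastic gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of label 1
            error = p - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(text, vocabulary, weights, bias):
    """Return 1 (positive) or 0 (negative) for a new piece of text."""
    x = featurize(text, vocabulary)
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if z >= 0 else 0


# Invented labeled reviews: 1 = positive, 0 = negative.
reviews = [
    ("great food and friendly service", 1),
    ("loved the great atmosphere", 1),
    ("terrible food and slow service", 0),
    ("slow and terrible experience", 0),
]
vocabulary = sorted({w for text, _ in reviews for w in text.split()})
X = [featurize(text, vocabulary) for text, _ in reviews]
y = [label for _, label in reviews]
weights, bias = train(X, y)

print(predict("great friendly atmosphere", vocabulary, weights, bias))
print(predict("slow terrible experience", vocabulary, weights, bias))
```

Words that appear only in positive reviews end up with positive weights and words that appear only in negative reviews with negative weights, which is what lets the model score unseen text.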

6
Q

Semantic Language model

A

A semantic language model is a technique that utilizes the semantic structure of an utterance to better rank the likelihood of the words composing the sentence.

7
Q

Common NLP tasks supported by language models

A

-Text analysis, such as extracting key terms or identifying named entities in text.
-Sentiment analysis and opinion mining to categorize text as positive or negative.
-Machine translation, in which text is automatically translated from one language to another.
-Summarization, in which the main points of a large body of text are summarized.
-Conversational AI solutions such as bots or digital assistants can interpret natural language input and return an appropriate response.

8
Q

Azure AI Language

A

Azure AI Language is a cloud-based service that includes features for understanding and analyzing text. You can use a language resource for authoring and prediction.

-Named entity recognition identifies people, places, events, and more.
-Entity linking identifies known entities together with a link to Wikipedia.
-Personally identifiable information (PII) detection identifies personally sensitive information.
-Language detection
-Sentiment analysis and opinion mining
-Summarization
-Key phrase extraction
-Conversational language understanding
-Question Answering

Language Studio - a web-based interface for creating and managing Conversational Language Understanding applications.

9
Q

You can easily create a user support bot solution on Microsoft Azure using a combination of two core services:

A

-Azure AI Language: includes a custom question answering feature that enables you to create a knowledge base of question and answer pairs that can be queried using natural language input.

-Azure AI Bot Service: provides a framework for developing, publishing, and managing bots on Azure.

The automatic bot creation functionality enables you to create a bot for your deployed knowledge base and publish it as an Azure AI Bot Service application with just a few clicks.

  1. You can use Azure AI Language Studio to create, train, publish, and manage question answering projects.
  2. To create a project, you must first provision a Language resource in your Azure subscription.
  3. After creating a set of question-and-answer pairs, you must save it.
  4. After you’ve created and deployed a knowledge base, you can deliver it to users through a bot.
  5. When your bot is ready to be delivered to users, you can connect it to multiple channels.

You can import question and answer pairs from an existing FAQ document into a question answering knowledge base.

10
Q

Conversational Language

A

To work with conversational language understanding, you need to take into account three core concepts:

-An utterance is something a user might say, which your application must interpret. For example, when using a home automation system, a user might use the following utterance: “Switch the fan on.”

-An entity is an item to which an utterance refers. For example, “fan” in the utterance above.

-An intent represents the purpose, or goal, expressed in a user’s utterance. In the example above, the intent is to turn a device on: the intent encapsulates the task (turning a device on) and the entity specifies the item to which the intent is applied (the fan).

The None intent is considered a fallback, and is typically used to provide a generic response to users when their requests don’t match any other intent.

You have published your conversational language understanding application. What information does a client application developer need to get predictions from it?
-The endpoint and key for the application’s prediction resource
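
To make the three concepts concrete, here is a toy sketch that maps an utterance to an intent and entities, falling back to the None intent when nothing matches. This is not how the Azure service works internally (it uses a trained model, not keyword rules); the intent names, keyword sets, and `interpret` function are all hypothetical.

```python
# Hypothetical intents and entity values for a home automation app.
INTENT_KEYWORDS = {
    "TurnOn": {"on"},
    "TurnOff": {"off"},
}
KNOWN_DEVICES = {"fan", "light"}


def interpret(utterance):
    """Return the matched intent (or the None fallback) and any device entities."""
    words = {w.strip(".,!?").lower() for w in utterance.split()}
    intent = "None"  # fallback when no other intent matches
    for name, keywords in INTENT_KEYWORDS.items():
        if words & keywords:
            intent = name
            break
    entities = sorted(words & KNOWN_DEVICES)
    return {"utterance": utterance, "intent": intent, "entities": entities}


print(interpret("Switch the fan on."))
print(interpret("What is the weather like?"))
```

The first utterance resolves to the TurnOn intent with the entity "fan"; the second matches no intent, so the None fallback applies.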

11
Q

AI Speech capabilities

A

AI speech capabilities enable us to manage home and auto systems with voice instructions, get answers from computers for spoken questions, generate captions from audio, and much more.

To enable this kind of interaction, the AI system must support two capabilities:

Speech recognition - the ability to detect and interpret spoken input. It generally relies on multiple models, including:

-An acoustic model that converts the audio signal into phonemes (representations of specific sounds).
-A language model that maps phonemes to words, usually using a statistical algorithm that predicts the most probable sequence of words based on the phonemes.

Speech synthesis - the ability to generate spoken output.

-Generating spoken responses to user input

12
Q

Azure AI Speech

A

Azure AI Speech provides speech to text and text to speech capabilities through speech recognition and synthesis.

You can use prebuilt and custom Speech service models for a variety of tasks, from transcribing audio to text with high accuracy, to identifying speakers in conversations, creating custom voices, and more.

Azure offers both speech recognition and speech synthesis capabilities through Azure AI Speech service, which includes the following:

-The Speech to text API: You can use Azure AI Speech to text API to perform real-time or batch transcription of audio into a text format.

-The Text to speech API: Enables you to convert text input to audible speech, which can either be played directly through a computer speaker or written to an audio file.

13
Q

Document Intelligence

A

Document intelligence describes AI capabilities that support processing text and making sense of information in text.

-It automates the process of extracting, understanding, and saving the data in text.
-Relies on machine learning models that are trained to recognize data in text.
-The ability to extract text, layout, and key-value pairs is known as document analysis.

14
Q

Azure AI Document Intelligence

A

Azure AI Document Intelligence supports features that can analyze documents and forms with prebuilt and custom models.

Azure AI Document Intelligence consists of features grouped by model type:

-Prebuilt models - pretrained models that have been built to process common document types such as invoices, business cards, ID documents, and more. These models are designed to recognize and extract specific fields that are important for each document type.

-Custom models - can be trained to identify specific fields that are not included in the existing pretrained models.

-Document analysis - general document analysis that returns structured data representations, including regions of interest and their inter-relationships.

-The merchant name and address can be identified using the receipt model.
-The receipt analyzer model is available as a service when you create an Azure AI Document Intelligence resource.

15
Q

Azure AI Search

A

Azure AI Search provides the infrastructure and tools to create search solutions that extract data from various structured, semi-structured, and unstructured documents.

-Uses image processing, content extraction, and natural language processing to perform knowledge mining of documents.

-Provides a programmable search engine built on Apache Lucene
-99.9% uptime SLA

Azure AI Search comes with the following features:

-Data from any source
-Supporting both simple query and full Lucene query syntax
-AI powered search
-Linguistic analysis for 56 languages
-Geo-search filtering based on proximity
-Configurable user experience

16
Q

Identify elements of a search solution

A

-A typical Azure AI Search solution starts with a data source that contains the data artifacts you want to search. This could be Azure Storage, Azure SQL Database, or Azure Cosmos DB.

-If your data resides in a supported data source, you can use an indexer to automate data ingestion. (An indexer connects to a data source, serializes the data, and passes it to the search engine for indexing.)

-Besides automating data ingestion, indexers also support AI enrichment.

-The fields containing your content are persisted in an index, which can be searched by client applications.

17
Q

Use a skillset to define an enrichment pipeline

A

AI processing is achieved by adding and combining skills in a skillset. A skillset defines the operations that extract and enrich data to make it searchable. These AI skills can be either built-in skills, such as text translation or Optical Character Recognition (OCR), or custom skills that you provide.

-Built-in skills are based on pretrained models from Microsoft, which means you can’t train the model using your own training data.

Built-in skills fall into these categories:

-Natural language processing skills: with these skills, unstructured text is mapped as searchable and filterable fields in an index. (Key Phrase Extraction, Text Translation Skill)

-Image processing skills: these skills create text representations of image content, making it searchable using the query capabilities of Azure AI Search. (Image Analysis Skill, Optical Character Recognition Skill)

18
Q

Indexes

A

An Azure AI Search index can be thought of as a container of searchable documents. Conceptually you can think of an index as a table and each row in the table represents a document. Tables have columns, and the columns can be thought of as equivalent to the fields in a document. Columns have data types, just as the fields do on the documents.

Schema
-In Azure AI Search, an index is a persistent collection of JSON documents and other content used to enable search functionality.
-The index includes a definition of the structure of the data in these documents, called its schema.

Attributes
-Azure AI Search needs to know how you would like to search and display the fields in the documents. You specify that by assigning attributes, or behaviors, to these fields.
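
As a rough illustration of a schema with field attributes, the sketch below builds an index definition as a Python dictionary and serializes it to JSON. The index name and field names are invented for the example; the attribute names used (`key`, `searchable`, `filterable`, `sortable`, `facetable`, `retrievable`) are standard Azure AI Search field attributes, but treat the overall shape as a sketch rather than a complete definition.

```python
import json

# Hypothetical index: the name and the three fields are made up for illustration.
index_definition = {
    "name": "sample-index",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True, "retrievable": True},
        {"name": "description", "type": "Edm.String", "searchable": True, "retrievable": True},
        {
            "name": "category",
            "type": "Edm.String",
            "filterable": True,
            "facetable": True,
            "sortable": True,
        },
    ],
}


def fields_with(attribute):
    """List the names of fields that have the given attribute enabled."""
    return [f["name"] for f in index_definition["fields"] if f.get(attribute)]


print(json.dumps(index_definition, indent=2))
print(fields_with("searchable"))  # ['description']
```

Here only `description` takes part in full text search, while `category` can be used for filtering, faceting, and sorting.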

19
Q

Use an indexer to build an index

A

In order to index the documents in Azure Storage, they need to be exported from their original file type to JSON. In order to export data in any format to JSON, and load it into an index, we use an indexer.

Azure AI Search lets you create and load JSON documents into an index with two approaches:

-Push method: JSON data is pushed into a search index via either the REST API or the .NET SDK.
-Pull method: Search service indexers can pull data from popular Azure data sources and, if necessary, export that data into JSON.

-Use the pull method to load data with an indexer

-The Search explorer can perform quick searches to check the contents of an index, and ensure that you are getting expected search results.

You have to drop and recreate indexes if you need to make changes to field definitions.

-An indexer converts documents into JSON and forwards them to a search engine for indexing.
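
For the push method, a minimal sketch of preparing the JSON payload might look like this. It only builds the request body and does not call any service; the helper name `build_upload_batch` and the sample documents are hypothetical, and the batch shape (a `value` array whose items carry an `@search.action`) is an assumption based on the REST API's document upload format.

```python
import json


def build_upload_batch(documents):
    """Wrap each document with an upload action, ready to be pushed to an index."""
    return {"value": [{"@search.action": "upload", **doc} for doc in documents]}


# Made-up documents whose fields would need to match the index schema.
docs = [
    {"id": "1", "description": "Speech about going to the moon"},
    {"id": "2", "description": "Notes on rocket design"},
]
batch = build_upload_batch(docs)
print(json.dumps(batch, indent=2))
```

With the pull method, by contrast, the indexer produces the JSON documents for you from the data source.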

If you set up a search index without including a skillset, which would you still be able to query?
Text content

20
Q

Knowledge Store

A

A knowledge store is persistent storage of enriched content. The purpose is to store the data generated from AI enrichment in a container.

While the focus of an Azure AI Search solution is usually to create a searchable index, you can also take advantage of its data extraction and enrichment capabilities to persist the enriched data in a knowledge store for further analysis or processing.

A knowledge store can contain one or more of three types of projection of the extracted data:

-Table projections are used to structure the extracted data in a relational schema for querying and visualization
-Object projections are JSON documents that represent each data entity
-File projections are used to store extracted images in JPG format

21
Q

Using the Azure portal’s Import data wizard

A

The Import data wizard, found within the Azure AI Search service in the Azure portal, automates the creation of the various objects needed for the search engine. It can create the following objects:

-Data Source: Persists connection information to source data, including credentials. A data source object is used exclusively with indexers.
-Index: Physical data structure used for full text search and other queries.
-Indexer: A configuration object specifying a data source, target index, an optional AI skillset, optional schedule, and optional configuration settings for error handling and base-64 encoding.
-Skillset: A complete set of instructions for manipulating, transforming, and shaping content, including analyzing and extracting information from image files
-Knowledge store: Stores output from an AI enrichment pipeline in tables and blobs

22
Q

Query data in an Azure AI Search index

A

Azure AI Search queries can be submitted as an HTTP or REST API request, with the response coming back as JSON.

Azure AI Search supports two types of query syntax:
-Simple syntax, which covers all of the common query scenarios.
-Full Lucene syntax, which is useful for advanced scenarios.
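
The sketch below shows how a client might build the JSON body of a query request for each syntax. It only constructs the payload; the helper name `build_query` and the example search strings are hypothetical, and the `search`, `queryType`, and `top` parameters are assumed from the REST API's search request format.

```python
import json


def build_query(search_text, full_lucene=False, top=10):
    """Build a query body using either simple or full Lucene syntax."""
    return {
        "search": search_text,
        "queryType": "full" if full_lucene else "simple",
        "top": top,
    }


# Simple syntax: plain keyword search.
simple_query = build_query("moon rocket")

# Full Lucene syntax: a fielded search with a fuzzy term, an advanced scenario.
lucene_query = build_query("description:moon~", full_lucene=True, top=5)

print(json.dumps(simple_query))
print(json.dumps(lucene_query))
```

The response to such a request comes back as JSON, which the client application then parses.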