Dialogue Systems Flashcards

1
Q

Dialogue, speech acts

A

A dialogue is a sequence of turns. Each turn is a single contribution from one speaker.

Each utterance in a dialogue is a kind of action performed by the speaker. These actions are called speech acts.

A speech act belongs to one of four main types:

  • constative: answering, claiming, confirming, denying, disagreeing, stating
  • directive: advising, asking, forbidding, inviting, ordering, requesting
  • commissive: promising, planning, vowing, betting
  • acknowledgment: apologizing, greeting, thanking, accepting an acknowledgment
2
Q

Dialogue systems: grounding

A

A dialogue is a collective act where participants exchange information. To this end, participants need to establish a common ground.

In this process, very often the hearer sends acknowledgments that she has understood the speaker. This is called grounding.

3
Q

Chatbots: main classes

A

Three main classes of chatbots:

  • rule-based systems: use hand-written regular expressions
  • corpus-based systems
  • hybrid systems
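
To make the rule-based class concrete, here is a minimal sketch of an ELIZA-style responder built on hand-written regular expressions. The rules, the `respond` function, and the fallback sentence are all hypothetical and only illustrate the idea; they are not taken from any real system.

```python
import re

# Hypothetical ELIZA-style rules: (pattern, response template).
# The patterns and wording are illustrative assumptions.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\b(hello|hi)\b", re.IGNORECASE), "Hello. How can I help you?"),
]
FALLBACK = "Please tell me more."


def respond(user_turn: str) -> str:
    """Return the response of the first rule whose pattern matches."""
    for pattern, template in RULES:
        match = pattern.search(user_turn)
        if match:
            # Strip trailing punctuation from captured text before reusing it.
            groups = [g.rstrip(".!? ") for g in match.groups()]
            return template.format(*groups)
    return FALLBACK
```

Rules are tried in order, so more specific patterns should come first; a turn that matches no rule gets the fallback response.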
4
Q

Corpus-based chatbots and techniques

A

Corpus-based systems mine large datasets of human-human conversations.

Once a chatbot has been deployed, the human turns it collects can be used as additional data for fine-tuning.

Two main neural techniques to provide a response to a user turn:

  • response by retrieval: BERT [CLS]
  • response by generation: encoder-decoder
5
Q

Corpus-based chatbots: response by retrieval

A

Let C be a corpus of conversations. The main idea is to view a user turn as a query q, and to retrieve from C the response r* that is most similar to q.

We use a bi-encoder model, in which we train two separate encoders to encode the user query and the candidate response.

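We can implement this through BERT’s [CLS] token: each encoder’s [CLS] output vector represents its input, and each candidate response is scored by the dot product of the two vectors (be able to write the formulas).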
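
In the usual bi-encoder formulation, h_q = BERT_Q(q)[CLS], h_r = BERT_R(r)[CLS], and the response is the r in C that maximizes the dot product h_q · h_r. The sketch below shows only the scoring and argmax step. As a loudly stated assumption, the two encoders are replaced by bag-of-words counters so the example runs without a trained model; `encode_query`, `encode_response`, and `retrieve` are hypothetical names.

```python
import re
from collections import Counter


def encode_query(text: str) -> Counter:
    """Stand-in for the query encoder (would be BERT_Q(q)[CLS])."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def encode_response(text: str) -> Counter:
    """Stand-in for the response encoder (would be BERT_R(r)[CLS])."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def dot(u: Counter, v: Counter) -> int:
    """Dot product of two sparse count vectors."""
    return sum(count * v[token] for token, count in u.items())


def retrieve(query: str, corpus: list[str]) -> str:
    """Return the candidate r in corpus that maximizes h_q . h_r."""
    h_q = encode_query(query)
    return max(corpus, key=lambda r: dot(h_q, encode_response(r)))
```

With real encoders the response vectors would typically be computed once, offline, so that only the query has to be encoded at run time.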

6
Q

Corpus-based systems: response by generation

A

The main idea is to think of response production as an encoder-decoder task, transducing from the user’s prior turn to the system’s turn.

7
Q

Digital assistants, architectures

A

Digital assistants have the goal of helping a user to solve specific tasks.

We distinguish two main architectures for digital assistants:

  • frame-based architecture: one of the very early architectures, still in use in medium-scale systems
  • dialogue-state architecture: more advanced, used in modern, large-scale industrial systems
8
Q

Frame-based dialogue systems, tasks

A

A frame is a knowledge structure representing the information and intentions that the system needs to extract from the user’s sentences. It is defined as a collection of (slot, value) pairs.

The system goal is to fill the slots in the frames with the appropriate values.

Three general tasks:

  • Domain classification: in case of multi-domain dialogue systems, detect the appropriate domain.
  • Intent determination: given the domain, which goal is the user trying to accomplish?
  • Slot filling: extract the particular slots and fillers needed to carry out the user intent.
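
A frame can be sketched as a plain mapping from slots to values. The flight-booking slots and questions below are hypothetical and chosen only for illustration. Asking a question for the first unfilled slot is one simple way a system could drive the dialogue, not the only one.

```python
from typing import Optional

# Hypothetical flight-booking frame: each slot has a question the system
# can ask while the slot is still empty.
QUESTIONS = {
    "origin": "What city are you leaving from?",
    "destination": "Where are you going?",
    "date": "What day would you like to travel?",
}


def new_frame() -> dict[str, Optional[str]]:
    """Create a frame with every slot unfilled."""
    return {slot: None for slot in QUESTIONS}


def fill(frame: dict[str, Optional[str]], slot: str, value: str) -> None:
    """Store a value extracted from the user's sentence in a known slot."""
    if slot not in frame:
        raise KeyError(f"unknown slot: {slot}")
    frame[slot] = value


def next_question(frame: dict[str, Optional[str]]) -> Optional[str]:
    """Return the question for the first empty slot, or None if all are filled."""
    for slot, value in frame.items():
        if value is None:
            return QUESTIONS[slot]
    return None
```

Once `next_question` returns `None`, the frame is complete and the system can carry out the user's intent.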

example of frame at slide 28 pdf 14…

9
Q

Dialogue-state architecture, components

A

Dialogue-state architecture is a more advanced version of the frame-based architecture.

A typical dialogue-state system is based on six components:

  • Automatic speech recognition: converts the user’s speech into text
  • Natural language understanding: extracts slot fillers from the user’s utterances using machine learning
  • Dialogue state tracker: maintains the current state of the dialogue
  • Dialogue policy: decides what to do next (answer a question, ask for clarification, make a suggestion, and so on)
  • Natural language generation: can condition on the exact dialogue context to produce turns that seem much more natural
  • Text to speech: converts the system’s text response into speech
10
Q

Dialogue-state architecture: natural language understanding component

A

This component exploits sequence labeling and sentence classification techniques from previous lectures to solve the following three tasks:

  • domain classification
  • intent extraction
  • slot filling

We need a training set that associates each sentence with the correct domain, intent, and set of slots.
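
One training instance might look like the sketch below. It assumes slot filling is cast as BIO sequence labeling (one tag per token), which is a common choice but an assumption here. The sentence, the label names, and the `slots_from_bio` helper are hypothetical.

```python
# One hypothetical training instance: a domain label, an intent label,
# and one BIO tag per token for slot filling.
EXAMPLE = {
    "tokens": ["book", "a", "flight", "to", "San", "Francisco", "on", "Tuesday"],
    "domain": "air-travel",
    "intent": "book-flight",
    "tags": ["O", "O", "O", "O", "B-destination", "I-destination", "O", "B-date"],
}


def slots_from_bio(tokens: list[str], tags: list[str]) -> dict[str, str]:
    """Collect slot fillers from a BIO-tagged token sequence."""
    slots: dict[str, str] = {}
    current = None  # name of the slot currently being extended, if any
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            current = tag[2:]
            slots[current] = token
        elif tag.startswith("I-") and current == tag[2:]:
            slots[current] += " " + token
        else:
            current = None
    return slots
```

The domain and intent labels are targets for sentence classification, while the tag sequence is the target for sequence labeling.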

11
Q

Dialogue acts in dialogue-state systems

A

Dialogue-state systems make use of dialogue acts, which:

  • implement the conversation turn
  • carry out the function of speech act and of grounding

Depending on the domain, each system uses a specific set of dialogue act categories to classify its dialogue acts.

12
Q

Dialogue-state architecture: dialogue state tracker component, correction acts

A

A dialogue state consists of:

  • the entire frame at a given point of the dialogue
  • the user’s most recent dialogue act

The dialogue state tracker computes the current dialogue state, that is:

  • updates the running frame
  • classifies the user’s most recent dialogue act

If a dialogue system misunderstands an utterance, the user will generally correct the error by reformulating the utterance. This is called a correction act.

The dialogue state tracker is also in charge of detecting correction acts, and interacts with slot filling to decide which slot value is being changed.
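
The tracker's bookkeeping can be sketched as follows. This is a deliberately simplified illustration: it assumes the user's dialogue act and slot fillers have already been produced by upstream components, and it flags a correction whenever a new value overwrites a different existing one. Real systems would typically use a trained classifier for this. `DialogueState` and `track` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueState:
    """The running frame plus the user's most recent dialogue act."""
    frame: dict = field(default_factory=dict)
    last_user_act: str = ""


def track(state: DialogueState, user_act: str, slots: dict) -> list:
    """Update the state in place and return the slots that were corrected.

    Assumption: a slot counts as corrected when the new value differs
    from the value already stored in the frame.
    """
    corrected = [s for s, v in slots.items()
                 if s in state.frame and state.frame[s] != v]
    state.frame.update(slots)
    state.last_user_act = "correction" if corrected else user_act
    return corrected
```

The returned list tells the rest of the system which slot values the user has just changed.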

13
Q

Dialogue-state architecture: dialogue policy component

A

This component decides what dialogue act the system should generate at step i, based on the entire dialogue state.

$$\hat{A}_i = \operatorname*{argmax}_{A_i \in A} P(A_i \mid \mathrm{Frame}_{i-1}, A_{i-1}, U_{i-1})$$

Here $A_j$ is the act from the system and $U_j$ is the act from the user at step $j$.

Probabilities can be estimated by a neural classifier, using neural representations of the slot fillers and the utterances.
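
The selection step itself is just an argmax over candidate system acts. The sketch below assumes a classifier has already produced one raw score per candidate act for the current frame and the latest system and user acts. The scores and act names are hypothetical, and the softmax only turns scores into probabilities.

```python
import math


def softmax(scores: dict[str, float]) -> dict[str, float]:
    """Turn raw classifier scores into a probability distribution."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {act: math.exp(s - top) for act, s in scores.items()}
    total = sum(exps.values())
    return {act: e / total for act, e in exps.items()}


def choose_act(scores: dict[str, float]) -> str:
    """Return the candidate system act with the highest probability."""
    probs = softmax(scores)
    return max(probs, key=probs.get)
```

Since softmax preserves the ordering of the scores, the chosen act is also the one with the highest raw score; the probabilities matter when the system needs a confidence estimate.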

14
Q

Dialogue-state architecture: natural language generation component

A

This component generates the text of a response to the user, once the policy has decided which dialogue act to generate.

This task is modelled in two stages:

  • content planning: what to say?
  • sentence realization: how to say it? translates from the dialogue act and its arguments to text sentences.

The sentence realizer is trained on representation/sentence pairs from a large corpus of labeled dialogues.
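
To show the input and output of sentence realization, the sketch below maps a dialogue act and its arguments to a sentence. As a stated assumption, it swaps the trained realizer for a hand-written template lookup, so it illustrates only the interface, not the learning. The act names and templates are hypothetical.

```python
# Hypothetical templates standing in for a trained sentence realizer.
TEMPLATES = {
    "request": "What {slot} would you like?",
    "confirm": "You want to go to {destination} on {date}, is that right?",
    "inform": "I found a flight to {destination} on {date}.",
}


def realize(act: str, **arguments: str) -> str:
    """Translate a dialogue act and its arguments into a text sentence."""
    return TEMPLATES[act].format(**arguments)
```

For example, `realize("confirm", destination="Boston", date="Tuesday")` fills the confirm template with the two slot values.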

15
Q

Dialogue systems: evaluation

A

Chatbots are evaluated by humans, who assign a score.

  • Participant evaluation: evaluation is carried out by the human who talked to the chatbot
  • Observer evaluation: evaluation is carried out by a third party who reads a transcript of a human/chatbot conversation