Lecture 6 Flashcards

Pragmatics and Discourse Analysis, spaCy / Neural Networks for NLP

1
Q

Summary Of Discourse-Level NLP Tasks

A

Uncovering discourse structure (discourse segmentation, discourse relations, text coherence)

Uncovering document structure
* Recognizing known structure, for example, abstracts
* Organizing documents according to known structure

Conducting named entity resolution across discourse elements

2
Q

Discourse Segmentation

A

Documents are automatically separated into passages (sometimes called fragments), each constituting a different discourse segment
* Discourse segments can inform the semantic interpretation of a document

3
Q

Discourse Segmentation - Techniques

A

Techniques to separate documents into passages include
* Rule-based systems built on clue words and phrases
* Probabilistic techniques to separate fragments and identify discourse segments
* Lexical cohesion to identify fragments (TextTiling); see the sketch below
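
A minimal TextTiling sketch using NLTK's implementation, assuming nltk is installed and the stopwords corpus can be downloaded; the input file name is hypothetical, and TextTiling needs a reasonably long, paragraph-delimited text to produce segments.

```python
# Lexical-cohesion segmentation with NLTK's TextTiling implementation.
import nltk
from nltk.tokenize import TextTilingTokenizer

nltk.download("stopwords", quiet=True)   # TextTiling uses the English stopword list

# TextTiling expects a longer text with blank lines between paragraphs;
# "long_article.txt" is a hypothetical input file.
text = open("long_article.txt", encoding="utf-8").read()

tt = TextTilingTokenizer()
segments = tt.tokenize(text)             # list of multi-paragraph discourse segments
for i, seg in enumerate(segments, 1):
    print(f"--- segment {i} ---")
    print(seg[:200], "...")
```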

4
Q

Lexical Chains: Semantically Related Words

A

Words that refer back to the original item:
the “book” was taken. “It” was valuable.

5
Q

Relatedness == Cohesion != Coherence

A

Any document can be viewed as a set of lexical chains: clusters of words based on semantic similarity. But chains by themselves do not guarantee coherence.
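
A minimal sketch of greedy lexical chaining over the nouns of a document, using WordNet path similarity via nltk; the similarity threshold and the toy word list are illustrative assumptions, not part of the lecture.

```python
# Greedy lexical chaining: add each noun to the first chain containing a
# semantically related word, otherwise start a new chain.
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)

def related(w1, w2, threshold=0.2):
    """Treat two nouns as related if any pair of their senses is close in WordNet."""
    for s1 in wn.synsets(w1, pos=wn.NOUN):
        for s2 in wn.synsets(w2, pos=wn.NOUN):
            sim = s1.path_similarity(s2)
            if sim is not None and sim >= threshold:
                return True
    return False

def build_chains(nouns):
    chains = []
    for w in nouns:
        for chain in chains:
            if any(related(w, member) for member in chain):
                chain.append(w)
                break
        else:
            chains.append([w])   # no related chain found: start a new one
    return chains

print(build_chains(["book", "novel", "rain", "puddle", "author"]))
```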

6
Q

Relatedness == Cohesion != Coherence

A

A multi-sentence sequence becomes more than a random set of independent utterances:
* to the extent that semantically similar noun phrases are used, or that coreference connects noun phrases across sentences (cohesion)
* and to the extent that dissimilar noun phrases are “pragmatically” connected through actions (coherence)

7
Q

Relatedness == Cohesion != Coherence

A

Three contrasting cases of example passages: Coherent, Locally Incoherent, Topically Incoherent

8
Q

Discourse Structure

A

Human discourse often exhibits structures that are intended to indicate common experiences and respond to them

9
Q

Discourse Relations

A

How adjacent text segments are logically connected to each other: the rhetorical structure of the text

10
Q

Rhetorical Structure Theory

A

A theory of text organization created in the 1980s
* Text units as nuclei and satellites
* Three categories of relations: subject matter relations, presentational relations, multinuclear relations

11
Q

Discourse Markers

A

Many rhetorical relations can be signaled by particular words or phrases (from Biran and Rambow, 2012)
* But many of these words are ambiguous, as they can also serve other functions in text.

12
Q

Entity Resolution

A

The ability of a system to recognize and unify variant references to a single entity.

13
Q

Coreference: A Critical Discourse Level Task

A

Anaphora - references (he, his, there) to previous text:
* “Doctor Foster went to Gloucester in a shower of rain. He stepped in a puddle right up to his middle and never went there again.”

14
Q

Coreference: A Critical Discourse Level Task

A

Cataphora - references to future text:
* If you want them, there are cookies in the kitchen

15
Q

Coreference: A Critical Discourse Level Task

A

Substitution - a more general word serves same function as the item for which it is substituted.
* These biscuits are stale. Get some fresh ones.

16
Q

Coreference: A Critical Discourse Level Task

A

Ellipsis - something left out, but implied
* Joan brought some roses and Kate _ some sweet peas.

17
Q

Coreference: A Critical Discourse Level Task

A

The referent for a referring phrase is found by the resolution algorithm from among the candidates (previous noun phrases). Clues involve people vs. objects and singular vs. plural.

18
Q

Generic Algorithmic Approach to Coreference

A
  1. Naively identify all referring phrases for resolution (a spaCy sketch follows):
    * all pronouns
    * all definite NPs
    * all proper nouns
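
A hedged sketch of this step with spaCy (assuming the en_core_web_sm model is installed); the heuristic for spotting definite NPs is an illustrative simplification.

```python
# Step 1: naively collect pronouns, definite NPs, and proper nouns as referring phrases.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Doctor Foster went to Gloucester. He stepped in a puddle. "
          "The doctor never went there again.")

pronouns     = [t for t in doc if t.pos_ == "PRON"]
proper_nouns = [t for t in doc if t.pos_ == "PROPN"]
definite_nps = [np for np in doc.noun_chunks
                if np[0].lower_ in ("the", "this", "that", "these", "those")]

print("pronouns:    ", pronouns)
print("proper nouns:", proper_nouns)
print("definite NPs:", definite_nps)
```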
19
Q

Generic Algorithmic Approach to Coreference

A
  2. Filter out things that look referential but are not (a rough rule-based filter is sketched below)
    * geographic names, e.g. the United States
    * pronouns without actual meaning:
      * pleonastic “it”, e.g. it’s 3:45 p.m., it was cold
      * non-referential “it” and “there”, e.g. it was essential to understand, there seems to be a mistake
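
A rough rule-based filter for the non-referential cases above; the regular-expression patterns are illustrative heuristics only, and filtering of geographic names (e.g. via NER or a gazetteer) is omitted.

```python
# Step 2: drop mentions of "it"/"there" that look referential but are not.
import re

NON_REFERENTIAL_PATTERNS = [
    r"\bit\s+(is|was|'s)\s+\d{1,2}[:.]\d{2}\b",             # pleonastic: it's 3:45 p.m.
    r"\bit\s+(is|was)\s+(cold|raining|essential|clear)\b",   # weather / extraposed "it"
    r"\bthere\s+(is|was|are|were|seems?|appears?)\b",        # existential "there"
]

def keep_mention(mention_text, sentence):
    """Return False for pleonastic 'it' and existential 'there' mentions."""
    if mention_text.lower() in ("it", "there"):
        return not any(re.search(p, sentence, re.IGNORECASE)
                       for p in NON_REFERENTIAL_PATTERNS)
    return True

print(keep_mention("it", "It was cold outside."))          # False: pleonastic
print(keep_mention("it", "The book fell; it was heavy."))  # True: refers to the book
```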
20
Q

Generic Algorithmic Approach to Coreference

A
  3. Identify referent candidates
    * All noun phrases are considered potential referent candidates.
    * A referring phrase can also be a referent for a subsequent referring phrase. Example (the sentence naming the suspect is omitted): He had 300 grams of plutonium 239 in his baggage. The suspected smuggler denied that the materials were his. (a chain of 4 referring phrases)
    * All potential candidates are collected in a table recording feature information on each candidate (see the sketch below).
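
A minimal sketch of the candidate table, assuming spaCy's en_core_web_sm model; the particular features recorded per candidate are illustrative.

```python
# Step 3: collect every noun phrase as a potential referent candidate,
# recording basic feature information in a table (here, a list of dicts).
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The suspected smuggler had 300 grams of plutonium in his baggage. "
          "He denied that the materials were his.")

candidate_table = []
for np in doc.noun_chunks:
    head = np.root
    candidate_table.append({
        "text":      np.text,
        "start":     np.start,                      # token offset, used later for recency
        "number":    head.morph.get("Number"),      # e.g. ['Sing'] or ['Plur']
        "role":      head.dep_,                     # nsubj, dobj, pobj, ...
        "is_person": any(e.label_ == "PERSON" for e in np.ents),
    })

for row in candidate_table:
    print(row)
```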
21
Q

Generic Algorithmic Approach to Coreference

A
  4. Collect information on the features used to evaluate the link between a referring phrase and each candidate (a feature-extraction sketch follows)
    * Number agreement: plural/singular/neutral. Exception: some plural or group nouns can be referred to by either it or they, e.g. IBM announced a new product. They have been working on it …
    * Gender agreement: animate referents as she/he, inanimate referents as it
    * Person agreement: first and second person pronouns (“I”, “you”) vs. third person pronouns
    * Grammatical role: candidates in subject position vs. referents in object position
    * Recency/closeness: the closer a candidate is to the referring phrase, the more likely the link
    * Reflexiveness (e.g., Ted paid himself)
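
A sketch of feature extraction for one (referring phrase, candidate) pair, relying on spaCy morphological features; the feature names and the toy sentence are illustrative, not the lecture's official list.

```python
# Step 4: agreement, role, recency, and reflexiveness features for a pronoun-candidate pair.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("IBM announced a new product. They have been working on it.")

def pair_features(pronoun_tok, candidate_np):
    cand_head = candidate_np.root
    return {
        "number_agree":    pronoun_tok.morph.get("Number") == cand_head.morph.get("Number"),
        "gender_agree":    (not pronoun_tok.morph.get("Gender")
                            or pronoun_tok.morph.get("Gender") == cand_head.morph.get("Gender")),
        "person":          pronoun_tok.morph.get("Person"),     # 1st/2nd vs. 3rd person
        "cand_is_subject": cand_head.dep_ in ("nsubj", "nsubjpass"),
        "distance":        pronoun_tok.i - candidate_np.start,  # recency: smaller is more likely
        "reflexive":       pronoun_tok.lower_.endswith("self"),
    }

they = next(t for t in doc if t.lower_ == "they")
ibm  = next(np for np in doc.noun_chunks if np.text == "IBM")
print(pair_features(they, ibm))
```

Note that for the IBM … They pair the strict number check fails, which is exactly the group-noun exception mentioned in the first bullet.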
22
Q

Generic Algorithmic Approach to Coreference

A

5. Train a classifier over an annotated corpus to identify which candidates and referring phrases belong to the same coreference group (see the sketch below)
* Typical evaluation results are on the order of an F-measure of 70 for overall coreference, with generally higher precision than recall
* Pronoun coreference resolution by itself scores much higher, usually over 90%.
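
A minimal sketch of the classifier step with scikit-learn; load_mention_pairs() is a hypothetical loader standing in for feature vectors and labels extracted from a gold-annotated coreference corpus.

```python
# Train a mention-pair classifier: each row is a (referring phrase, candidate) pair,
# labeled 1 if the two belong to the same coreference chain.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

X, y = load_mention_pairs()   # hypothetical: features/labels from the annotated corpus

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

print("pairwise F1:", f1_score(y_test, clf.predict(X_test)))
```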

23
Q

Topic Modeling

A

Examination of patterns in a given text (or corpus) at the semantic level by extracting topics from the text.

Topic: a list of words that occur in statistically meaningful ways (for the computer)

Text: unstructured text, i.e. no computer-readable annotations are available to indicate the semantic meaning of the words in the text

24
Q

Topic Models

A

Topic Modeling takes a corpus and looks for sets of similar words across the collection (the topics) in such a way that each document can be represented by one or more topics

25
Q

Topic Modelling - LDA
(Latent Dirichlet Allocation)

A

An unsupervised method that models documents and topics based on the Dirichlet distribution:
* each document is modeled as a distribution over topics, and each topic as a distribution over words (see the sketch below).
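
A minimal LDA sketch with scikit-learn's LatentDirichletAllocation; the tiny corpus and the number of topics are made up for illustration.

```python
# Each document becomes a distribution over topics; each topic is a distribution over words.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the team won the football match after a late goal",
    "the central bank raised interest rates again",
    "the striker scored twice in the cup final",
    "inflation and interest rates worry investors",
]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):       # topic-word weights
    top = topic.argsort()[-5:][::-1]
    print(f"topic {k}:", [terms[i] for i in top])

print(lda.transform(X))                           # document-topic distributions
```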

26
Q

Domain knowledge

A

α:
* a low α value: more weight on each document being composed of only a few dominant topics
* a high α value: more weight on each document being composed of a relatively larger set of topics

β:
* a low β value: more weight on each topic being composed of only a few dominant words
* a high β value: more weight on each topic being composed of a relatively larger set of words
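
A sketch of how α and β map onto library parameters: in scikit-learn's LatentDirichletAllocation they correspond to doc_topic_prior (α) and topic_word_prior (β), and in gensim's LdaModel to alpha and eta. The specific values below are illustrative.

```python
from sklearn.decomposition import LatentDirichletAllocation

# Low priors: sparse mixtures (few dominant topics per document, few dominant words per topic).
sparse_lda = LatentDirichletAllocation(
    n_components=10,
    doc_topic_prior=0.1,    # low α
    topic_word_prior=0.01,  # low β
    random_state=0,
)

# High priors: diffuse mixtures (documents and topics spread over many components).
diffuse_lda = LatentDirichletAllocation(
    n_components=10,
    doc_topic_prior=1.0,    # high α
    topic_word_prior=1.0,   # high β
    random_state=0,
)
```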

27
Q

Pragmatics

A

Pragmatics is considered the highest level of natural language understanding and as such is the most challenging for computational approaches. Pragmatics is concerned with the actual meaning of words in the situational context. The implication is that world knowledge
plays a keystone role in enabling an accurate interpretation of an utterance.

28
Q

Overview of Pragmatics

A
  • Pragmatics takes a functional perspective
    – how we use language to achieve things in the real world
  • One goal is to explain how extra meaning is read into utterances without actually being encoded in them, usually through the construction of world knowledge from another source (i.e., previously encountered and modeled text, ontologies, thesauri, etc.)
29
Q

Overview of Pragmatics

A

Of prime interest for question answering,
dialog engines, natural language
generation, and human-computer
interactions through conversational agents

30
Q

Overview of Pragmatics

A
  • Many perspectives have been
    advanced to describe and predict
    properties of human conversations
  • Gricean Maxims
  • Dialogue Act Theory
  • Speech Act Theory
  • Conversational Structure Analysis
31
Q

Overview of Pragmatics

A
  • From a computational perspective,
    we want to be able to recognize
    and segment each dialog act and
    then make sense of it by
    recognizing speaker intent
32
Q

Four Gricean Maxims

A

The Maxim of Quality - make your contribution one that is true
* Do not say what you believe to be false.
* Do not say that for which you lack adequate evidence.

The Maxim of Quantity - make your contribution as informative as is required for the current purpose of the conversation, but no more informative than that (i.e., no excess information)

The Maxim of Relevance - utterances should be pertinent to the conversational purpose

The Maxim of Manner - be clear, avoid obscurity, avoid ambiguity, be brief, be orderly

33
Q

Conversational Implicatures

A

When the speaker is observing the Gricean maxims, she will rely on the listener to amplify what she is saying by some straightforward inferences, called implicatures.
* Example:
* A: makes a statement / asks a question
* B: responds, but on the surface fails to answer the question
* A: assumes B is being co-operative; makes inferences in order to maintain the assumption that B is being co-operative

These inferences are “conversational implicatures”

34
Q

Word Meanings

A
  • Usually given by an online lexicon such as WordNet, which lists each word with its senses (see the sketch below)
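
A quick look at word senses in WordNet via nltk, assuming the wordnet corpus can be downloaded; the word "bank" is just a standard polysemy example.

```python
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)

# Each synset is one sense of the word, with a gloss (definition).
for syn in wn.synsets("bank"):
    print(syn.name(), "-", syn.definition())
```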
35
Q

Dialogue Act Theory

A

In computational linguistics, more detailed systems attempt to explain not only the informative aspects of conversations but also the dialogue-control aspects of an utterance
* Dialogue Act (DA): “the combination of a communicative function and a semantic content”, where the former is “the way in which dialogue participants use information to change the context” (Bunt, 2000)
* DA classification: the task of classifying an utterance with respect to the function it serves in a dialogue, i.e. the act the speaker is performing (a sketch follows).
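
A hedged sketch of DA classification framed as plain utterance-level text classification with scikit-learn; the toy utterances and labels are illustrative stand-ins for a real dialogue-act corpus (e.g. Switchboard-style annotations).

```python
# Classify each utterance by the act it performs (question, statement, request, ...).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

utterances = [
    "what time does the talk start",
    "the talk starts at noon",
    "please send me the slides",
    "thanks, that helps a lot",
]
labels = ["question", "statement", "request", "thanking"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(utterances, labels)

# Toy prediction; with such a tiny training set the output is only illustrative.
print(clf.predict(["could you please send the recording"]))
```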

36
Q

Speech Act Theory

A

Three levels of speech acts affecting the social reality of the speaker and listener:

Locutionary – the proposition of the speech act
* The literal meaning of the sentence (what we’ve been working on in NLP)

Illocutionary – the intention of the speech act
* The act of asking, answering, promising, etc. in uttering a sentence

Perlocutionary – the consequences of the speech act
* The (often intentional) production of certain effects upon the feelings, thoughts, or actions of the addressee
37
Q

Taxonomy of Illocutionary Acts’ Intentions

A
  • Assertives – commit the speaker to
    something’s being the case – suggest, swear, boast, conclude
  • Directives – attempts by speaker to get
    listener to do something – ask, order, request, invite, advise
  • Commissives – obligate oneself to future
    course of action – promise, plan, vow, oppose
  • Expressives – share psychological state of
    speaker about something – apologize,
    deplore, thank
  • Declarations – bring about a different state of the world as a result of the utterance – resign, marry