Lecture 9: Dialogue and Visual World Flashcards
(16 cards)
Cooperation in Conversation?
Grice (1975):
- In conversations, speakers and listeners (interlocutors) cooperate to make the conversation meaningful and purposeful.
- Speakers and listeners follow a set of rules (‘conversational maxims’) to make sense of conversation.
Conversational Maxims
Grice’s Conversational Maxims:
- Maxim of Quantity: Make your contribution as informative as is required, but no more.
- Maxim of Quality: Make your contribution true. Don’t say anything that you believe to be false, or for which you lack sufficient evidence.
- Maxim of Relevance: Make your contribution relevant to the aim of the conversation.
- Maxim of Manner: Be clear: Avoid obscurity, ambiguity, wordiness, and disorder in your language.
Conversational Implicatures
-2 examples
Conversational Implicatures (Grice, 1975): Inferences we make in conversations to maintain the sense and relevance of the conversation
Types of Conversational Implicatures:
1. Caused by the speaker flouting (violating) a maxim deliberately to convey an additional meaning:
A: Do you like my new shoes?
B: God, it’s hot here.
- B is violating the Maxim of Relevance.
- Therefore, we (or A) assume that B doesn’t like A’s new shoes.
2. Caused by the speaker violating one maxim in order to follow another:
A: What did Yuki have at the new Japanese restaurant last night?
B: She had either sushi or noodles.
The Maxim of Quality (be truthful) and the Maxim of Quantity (be informative) conflict here.
B is violating Quantity to follow Quality.
We (or A) assume that B doesn’t know exactly what Yuki had.
More on cooperation
-Audience design
Audience Design:
Hypothesis: Speakers and listeners achieve success in communication because they maintain detailed models of what the other person knows, and speak and understand against these models. (Clark & Murphy, 1982).
- Do speakers design their utterances for their listeners?
- Do listeners understand utterances using speakers’ perspectives?
Common Ground and Audience Design
-explanation and experiment
Common Ground:
- A special kind of mutual knowledge between speakers and listeners
- Common ground provides the critical background against which speakers produce utterances and listeners comprehend them. (Clark and colleagues)
Establishing Common Ground:
- Common ground is partly determined on the basis of what the speakers and listeners already know or can reasonably infer about each other before the conversation begins. (Clark, 1996)
- Do speakers control their utterances to fit with common ground?
Isaacs & Clark (1987):
Task: to describe pictures of New York landmarks either to other New Yorkers or to ‘out-of-towners’.
-Speakers quickly figured out whether their listener was a New Yorker or not, and modified descriptions optimally:
‘Citicorp building’ to a New Yorker
‘a skyscraper with a slanted roof’ to a novice
Similar types of speech adaptations were found for listeners who were:
adults vs. children (Shatz & Gelman, 1973)
native vs. non-native speakers (Bortfeld & Brennan, 1997)
Speakers and Audience Design
-Experiment
Wardlow Lane, Groisman & Ferreira (2006)
Method:
Task for Speaker: To describe the target object to the addressee.
Design (2x2):
Object Type:
Target object with an occluded contrast (the target object can be called ‘the small heart’.)
-(But the addressee can’t see the hidden heart, so ‘the heart’ alone would suffice for them.)
Target object without a contrast (the target object can be called ‘the heart’.)
Instruction Type:
Conceal Block: the speaker was instructed to conceal the identity of the occluded object from the addressee (run with and without contrast).
Baseline Block: no concealing instruction.
Results: % of descriptions using a modifier (‘the small heart’), i.e., % of privileged-information leakage:
- with contrast > without contrast
- with contrast: Conceal instruction > Baseline (no instruction)
- Surprisingly, under the Conceal instruction speakers used the modifier on ~15% of trials, leaking the very information they were told to hide.
- So speakers can engage in audience design overall (leakage is low in the baseline), but this engagement is not absolute.
- When the ‘privileged’ information is made salient (Conceal block), speakers’ utterances become more ‘egocentric’.
Learning when to deploy audience design
-experiment
Horton & Gerrig (2002):
- Task: In a referential communication task, Directors (speakers: subjects) had to describe pictures and tangrams to two different Matchers, over several rounds:
- Director: boats, rockets & people in all 9 rounds
- Rounds 1-3: Matcher A (boats, rockets) and Matcher B (boats, people)
- Rounds 4-6: Matcher A (boats, rockets, people)
- Rounds 7-9: Matcher B (boats, rockets, people)
- The speakers did not take the Matchers’ existing knowledge into account in Round 4 (egocentric descriptions), but did in Round 7 (adjusted their utterances).
- The difference is presumably due to the feedback the speakers received from the matchers in Rounds 4-6.
- Speakers do not deploy AD in an absolute manner, but can learn to detect cases in which it is necessary.
Can speakers avoid syntactic ambiguity?
-experiment
Ferreira & Dell (2000):
Task: Subjects (speakers) were presented with sentences and later had to recall and say them to addressees.
Conditions:
(1) The coach knew she missed practice.
(‘she’ can only be a subject, so the structure is disambiguated immediately – unambiguous)
(2) The coach knew you missed practice.
(at ‘you’, the sentence is still ambiguous – ‘you’ could be the object of ‘knew’ – and only ‘missed’ disambiguates the structure)
- Upon recall, subjects did not include the optional disambiguating complementiser, ‘that’, after ‘knew’ any more often in (2) than in (1).
- Speakers do not try to avoid syntactic ambiguity even when they could.
Audience Design and Common Ground in Comprehension?
Do listeners interpret utterances from speakers’ perspectives?
In particular, how could common ground affect understanding utterances?
-experiment
Clark, Schreuder, & Buttrick (1983):
Task: Subjects (listeners) had to solve a referential ambiguity.
Conditions:
- (1) The speaker and listener talked about George’s obsession with weight loss,
- or (2) they talked about George’s eating binge.
- Then, the speaker pointed at a picture showing a thin man and a fat man, and said ‘George will look like that man very soon.’
Subjects interpreted ‘that man’ according to the prior topic.
Listeners use common ground in interpreting references.
Time Course of Common Ground in Comprehension
-experiment
Keysar, Barr, Balin, & Brauner (2000):
- Visual-world eye-tracking experiment
- Director (speaker) – confederate; Addressee (listener) – subject wearing an eye-tracker
- Some objects were mutually visible.
- However, some objects were not shown to the director, and the subject knew it.
- Director: ‘Move the small candle’.
Conditions:
Test: a contrast object of the same kind (another candle) is hidden in Slot 3.
Control: no contrast is hidden (Slot 3 contains an unrelated object).
Probability of looks to Slot 3:
Test (hidden candle) > Control (unrelated object)
-Listeners considered the occluded candle as a candidate referent even though they knew it was not visible to the speaker.
-However, on the majority of trials (83%), listeners ultimately moved the correct, mutually visible candle.
-Listeners don’t necessarily use the common ground immediately to identify a referent.
-The common ground is used at a later stage of processing.
Alignment in Dialogue
-based on experiment results
Pickering & Garrod (2004):
- Interlocutors (speakers and listeners) do not use language to encode and decode messages.
- Instead, they use it as a means by which they can ‘align’ their mental states, so that they come to have the same ideas about the topic under discussion.
- For a dialogue to be successful, the language representations of the interlocutors must be aligned at many levels (mainly through priming):
- lexical level
- syntactic level
- overall mental/situation model
- Alignment occurs as a direct, automatic process.
- However, at an early stage processing is mainly ‘egocentric’, i.e., based on the interlocutor’s own knowledge rather than the partner’s.
- Modelling the interlocutor’s mind (audience design) is costly, so it usually happens not at an early stage but later.
- Pickering & Garrod (2007) also claim that listeners make predictions in dialogue.
- The prediction processes are driven by the production system in dialogue.
(summary of dialogue in notes)
Eye movements and language processing
-written and spoken language
Tracking eye movements in processing written language (reading):
-The link between eye movements and processing in the mind is somewhat intuitive: we move our eyes to read written language.
-Observed behaviours (eye movements) are a straightforward reflection of the processor’s response to the stimuli (language).
This methodology has been used for decades for psycholinguistic research.
Tracking eye movements in processing spoken language (listening)?
- The link between eye movements and processing in the mind is not so transparent: do we move our eyes while listening? (What have eyes got to do with ears?)
- Is there a technique that uses eye movements to study spoken language processing?
Visual-world Paradigm:
-Subjects process both visual and auditory (linguistic) stimuli.
-The visual stimulus is usually related to the linguistic stimulus.
-Their eye movements are tracked.
(Less straightforward than reading, because subjects are looking at a picture rather than the language itself.)
Linking Hypothesis: Eye movements reflect both visual and auditory (linguistic) processing at a given time.
Visual-world paradigm
Generally…
- A class of eye-tracking technique that enables us to investigate the process of integrating information extracted from visual scene and auditory language (originated by Cooper, 1974).
- Eye-movements can be time-locked against certain points of the auditory stimuli.
- Perhaps it emulates auditory language processing in ‘natural’ environments: the ‘visual world’ serves as context.
- Can be used for populations that have difficulties with reading (e.g., small children).
- Can be applied to a variety of research topics: lexical, semantic, syntactic, discourse, dialogue, etc.
- Can be combined with a variety of tasks: no-task (look & listen), comprehension, moving-object, memory, problem-solving, etc.
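The time-locking idea above can be sketched as a small analysis script. Everything here is hypothetical for illustration (the fixation records, bin width, and object names are invented, not data from any of the studies on these cards): the point is simply how fixation proportions are computed in bins aligned to a spoken-word onset.

```python
# Sketch: time-locking fixation proportions to a spoken-word onset.
# All fixation data and names below are invented for illustration.
from collections import Counter

# Each fixation record: (start_ms, end_ms, object_fixated), one list per trial.
trials = [
    [(0, 300, "distractor"), (300, 700, "target"), (700, 1000, "target")],
    [(0, 400, "competitor"), (400, 1000, "target")],
]

word_onset_ms = 200   # onset of the critical word in the audio (hypothetical)
bin_ms = 100          # analysis bin width (hypothetical)

def looks_at(trial, t):
    """Return the object fixated at time t (ms), or None if no fixation covers t."""
    for start, end, obj in trial:
        if start <= t < end:
            return obj
    return None

# Proportion of trials fixating each object at each bin after word onset.
for b in range(5):
    t = word_onset_ms + b * bin_ms
    counts = Counter(looks_at(tr, t) for tr in trials)
    props = {obj: counts[obj] / len(trials)
             for obj in ("target", "competitor", "distractor")}
    print(t, props)
```

Plotting these per-bin proportions over time gives the familiar visual-world fixation curves.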
Example set-ups:
(1) Visual stimuli: real objects.
Linguistic stimuli: given as ‘instructions’ (‘Move the frog onto the pot.’).
Task: to manipulate objects following the instructions.
(2) Visual stimuli: semi-realistic ‘scenes’ with several clip-art drawings.
Linguistic stimuli: presented over loudspeakers.
Task: to look and listen (passive listening).
Tanenhaus et al. (1995)
Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy (1995)
- Subjects were presented with real objects in front of them, and asked to follow auditory instructions by acting on the objects.
- During the experiment, subjects’ eye movements were recorded by a head-mounted eye-tracker.
- The visual display emulates a linguistic context employed in reading studies on referential context effects (e.g., Crain & Steedman, 1985).
Referential Theory and VP/NP attachment ambiguity:
‘Put the apple on the towel…’
(‘on the towel’ could be the Goal/location of putting – VP attachment – or a modifier of ‘the apple’ – NP attachment)
Could the visual context affect the interpretation of the ambiguity?
one-referent condition (1 apple):
-no need to specify which apple:
‘on the towel’ is interpreted as the Goal;
the empty towel should be looked at often (as the destination).
two-referent condition (2 apples):
-need to specify which apple:
‘on the towel’ is interpreted as a modifier of ‘the apple’;
the empty towel should NOT be looked at that often (it is irrelevant).
Design: 2x2
Visual display: one-referent / two-referent
Sentence:
ambiguous: Put the apple on the towel in the box.
unambiguous: Put the apple that’s on the towel in the box.
Results:
Proportion (%) of looks to the empty towel (the false Goal, i.e., not the intended destination) after hearing ‘towel’ in ‘Put the apple (that’s) on the towel in the box’:
-Ambiguous: one-ref > two-ref (with one apple, listeners more often misinterpreted ‘on the towel’ as the Goal – VP attachment)
-Unambiguous: few looks to the empty towel in either display.
Prediction in Sentence Processing
Do people predict (anticipate) what will be mentioned later in the sentence, using the verb’s semantic constraints?
-experiment
Altmann & Kamide (1999): tested whether listeners could predict what would be mentioned next.
Conditions: (1) The boy will eat the cake. (2) The boy will move the cake.
(1) – selective: the verb’s semantic constraints on its arguments (selectional restrictions) allow only the cake to be the direct object.
(2) – non-selective: other objects in the scene also meet ‘move’’s selectional restrictions.
- Semantic constraint of ‘eat’: the object must be edible.
- People looked at the cake more during ‘eat’.
% of looks to cake during ‘eat/move’:
(1) (eat) > (2) (move)
(=anticipatory eye movements)
- Listeners can use verbs’ semantic constraints to predict the most likely forthcoming object in context.
- This anticipation takes place at the earliest point possible (during the verb itself).
- Listeners can integrate visual contexts and linguistic input rapidly.
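The comparison behind the result above boils down to simple proportions per condition. A minimal sketch, with invented trial counts (not Altmann & Kamide’s actual data):

```python
# Sketch: % of trials with anticipatory looks to the target object during the verb.
# Counts are hypothetical, for illustration only.
n_trials = 20
looks_during_eat = 11    # trials with a look to the cake during 'eat' (hypothetical)
looks_during_move = 4    # trials with a look to the cake during 'move' (hypothetical)

pct_eat = 100 * looks_during_eat / n_trials    # 55.0
pct_move = 100 * looks_during_move / n_trials  # 20.0

# Predicted anticipatory effect: more looks to the only edible object during 'eat'.
print(f"eat: {pct_eat:.0f}%  move: {pct_move:.0f}%")
```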
Semantic competition in word recognition
How do semantically similar words compete in understanding spoken words?
-experiment
Huettig & Altmann (2005):
-Subjects heard single words rather than sentences, while their looks to a visual display were tracked.
-Display: target (a piano), semantic competitor (a trumpet), and unrelated distractors (a goat and a hammer).
% of looks:
target > competitor > mean of distractors
- The semantic features shared by target and competitor (‘musical instrument’) are activated during the target word.
- Visual attention can be rapidly guided towards a semantically related object even if it is visually very different from the target object.
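The predicted ordering (target > competitor > mean of distractors) can be checked with a couple of lines of arithmetic. The proportions below are invented for illustration, not Huettig & Altmann’s data:

```python
# Sketch: comparing fixation proportions for target, semantic competitor,
# and unrelated distractors. Values are hypothetical.
looks = {"piano": 0.38, "trumpet": 0.22, "goat": 0.11, "hammer": 0.09}

target = looks["piano"]
competitor = looks["trumpet"]
distractor_mean = (looks["goat"] + looks["hammer"]) / 2  # average of unrelated objects

# Predicted ordering: target > competitor > mean of distractors
assert target > competitor > distractor_mean
print(target, competitor, distractor_mean)
```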
(summary in notes for visual-world paradigm)