lecture 8 Flashcards
constituency tree vs dependency tree
- The constituency tree depicts the hierarchical syntactic structure by breaking the sentence down into nested sub-phrases.
–> Focuses on hierarchical structure and phrase groupings.
- The dependency tree emphasizes the grammatical relationships between words. It shows which words depend on others.
–> Focuses on direct word-to-word relationships and dependencies.
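A minimal way to see the contrast from this card in code (a sketch assuming NLTK is installed; the dependency arcs are hand-annotated for illustration, not produced by a parser):

```python
from nltk import Tree

# Constituency view: the sentence as nested sub-phrases
const = Tree.fromstring(
    "(S (NP (DT the) (NN walrus)) (VP (VBD ate) (NP (DT a) (NN cucumber))))")
const.pretty_print()

# Dependency view: direct head -> dependent arcs for the same sentence
arcs = [("ate", "nsubj", "walrus"), ("walrus", "det", "the"),
        ("ate", "obj", "cucumber"), ("cucumber", "det", "a")]
for head, rel, dep in arcs:
    print(f"{head} --{rel}--> {dep}")
```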
dependency grammar
directed binary grammatical relations between words
direct encoding of the relationship between predicates (verbs) and their arguments (nouns)
more popular than constituency grammar because of its emphasis on predicate-argument structure
predicates
functions that take different numbers and kinds of arguments
dependency grammar: arcs
arcs go from heads to dependents.
–> heads are often predicates, while dependents are often arguments
every token has exactly one incoming edge, i.e., exactly one head (see the sketch after this card)
root is the head of the entire structure
especially useful for languages with free word order
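A minimal sketch of these structural constraints in code (the list encoding, with index 0 as padding and head 0 standing for ROOT, is my own convention; acyclicity is not checked here):

```python
def is_valid_dependency_structure(heads):
    # every token has exactly one head by construction of the list;
    # exactly one token must attach directly to ROOT
    return sum(1 for h in heads[1:] if h == 0) == 1

# "the walrus ate a sea cucumber": "ate" (token 3) is the root
#        [pad, the, walrus, ate, a, sea, cucumber]
heads = [None,   2,      3,   0,  6,   6,        3]
print(is_valid_dependency_structure(heads))  # True
```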
why does dependency parsing matter for meaning
resolves attachment ambiguities that can matter for meaning
- grammatical structure of a sentence based on the relationships between words
- syntactic dependencies can be close to semantic relations
- applicable across languages
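As a concrete illustration of an attachment ambiguity that a parse resolves (a sketch assuming spaCy and its en_core_web_sm model are installed; which attachment the parser chooses can vary):

```python
import spacy

nlp = spacy.load("en_core_web_sm")

# "with a telescope" can attach to "saw" (instrument of the seeing) or to
# "man" (the man who has a telescope); the dependency head of "with"
# disambiguates between the two readings.
doc = nlp("The walrus saw the man with a telescope")
for token in doc:
    print(f"{token.text:10} --{token.dep_}--> {token.head.text}")
```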
for what types of tasks might dependency parsing be useful
- information extraction
- machine translation
syntax generally:
- allows generalization from specific units to abstract categories that are often cross-linguistically valid
- allows for understanding of constituency (how words group together into linguistic units)
- shows how constituents relate to each other structurally and functionally
what information does syntax give us about meaning
- abstract categories can tell us something about the meaning of a constituent
–> e.g., nouns
- abstract categories can tell us something about the meaning of a sentence
–> closed-class/function words
syntactic structure is the skeleton on which we build our mental representations: it both enables us to generalize and constrains possible meanings
is syntax enough to enable a model to understand meaning: the Chinese room experiment
- suppose AI research has succeeded in building a computer that behaves as if it understands Chinese: given Chinese input, it produces coherent Chinese output (and so passes the Turing test)
- does the machine literally understand Chinese (strong AI), or does it merely simulate understanding of Chinese (weak AI)?
why is syntax insufficient for understanding language
- syntax alone does not ground words in actions, objects, etc. in the world
- we cannot evaluate the grammaticality, truth, or naturalness of an utterance with only syntax
- language is flexible:
–> often we want to know who did what to whom
–> the same event and participants can have different syntactic relations
–> ex: [the walrus ate a sea cucumber] vs [a sea cucumber was eaten by the walrus]
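The example above, made concrete as data (hand-annotated for illustration; the dependency labels follow Universal Dependencies v2, the roles a PropBank-style scheme):

```python
# Dependency relations differ between the two sentences...
deps_active  = {"the walrus": "nsubj",      "a sea cucumber": "obj"}
deps_passive = {"a sea cucumber": "nsubj:pass", "the walrus": "obl:agent"}

# ...but the event's semantic roles stay the same in both:
roles = {"the walrus": "ARG0 (eater)", "a sea cucumber": "ARG1 (thing eaten)"}
```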
the task of semantic role labeling (SRL)
task of identifying which constituents play which roles in an event
typically framed as a supervised classification task
semantic role labeling vs dependency parsing
Focus:
- SRL: Focuses on the roles that words or phrases play in the context of an event. It answers questions like who did what to whom, when, where, and how.
- Dependency Parsing: Focuses on the grammatical relationships between words, identifying syntactic structures and dependencies.
Output:
- SRL: Produces semantic labels that describe the roles of words in relation to the main verb or action. For example, identifying the agent, theme, instrument, etc.
- Dependency Parsing: Produces a dependency tree showing syntactic dependencies between words, such as subject, object, modifier, etc.
argument structure
- the lexical representation of items (predicates) that take arguments
- indicates how many participants (arguments) an item has, what their semantic relation (role) is to the item, and what their syntactic expression (e.g., nsubj, dobj) is.
uniformity of theta-assignment hypothesis (UTAH)
- states that identical semantic relations between items are represented by identical structural (syntactic) relationships between items
- if semantic roles/relations determine structural/syntactic relations, we can use our knowledge about syntax to help determine semantic roles
selectional restrictions of semantic roles
allow predicates to constrain the semantics of their arguments
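A toy sketch of the idea (the edible-word list and the check are invented for illustration; real systems draw such constraints from lexical resources):

```python
# Hypothetical, hand-built restriction: the ARG1 of "eat" should be edible
EDIBLE = {"sea cucumber", "apple", "sandwich"}

def satisfies_restriction(predicate: str, arg1: str) -> bool:
    if predicate == "eat":
        return arg1 in EDIBLE
    return True  # no restriction recorded for other predicates

print(satisfies_restriction("eat", "sea cucumber"))  # True
print(satisfies_restriction("eat", "sincerity"))     # False: violates it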
semantic or thematic roles
abstract models of the role an argument plays in the event described by the predicate
resources for SRL
- propbank: verb-oriented –> simpler, more data
- framenet: frame-oriented –> richer, less data
the proposition bank (propbank)
- predicate-argument lexicon with frame files for each verb
–> detailing the various senses and roles associated with the verb
- coarse categorical labels (ARG0, ARG1) that capture some syntactic variation and shallow semantics
- includes annotations on constituents as found in the penn treebank
framenet
- semantic frames: conceptual structures that describe an event and its participants
- core and non-core frame elements to add rich semantic information
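To make the contrast between the two resources concrete, here are simplified, hand-written sketches of one entry each, as plain dictionaries rather than the real file formats (abridged; see the actual frame files for the full inventories):

```python
# PropBank-style roleset: coarse, verb-specific numbered arguments
propbank_eat = {
    "roleset": "eat.01",
    "roles": {"ARG0": "consumer, eater", "ARG1": "meal"},
}

# FrameNet-style frame: richer structure shared across many lexical units
framenet_ingestion = {
    "frame": "Ingestion",
    "core_elements": ["Ingestor", "Ingestibles"],
    "non_core_elements": ["Instrument", "Place", "Time"],
    "lexical_units": ["eat.v", "devour.v", "gobble.v"],
}
```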
SRL traditional pipeline
- assume or compute syntactic parse and predicate senses
–> use broad-coverage parser
–> traverse parse to find all predicates
- argument identification
–> select the predicate's argument phrases by traversing the parse tree
- argument classification
–> select a role for each argument using supervised classification (with respect to the frame roles for the predicate's sense), as in the sketch below
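A runnable sketch of this pipeline (all five callables are placeholders standing in for a real parser and classifiers, not a library API):

```python
def srl_pipeline(sentence, parse, find_predicates, identify_args, classify_role):
    tree = parse(sentence)                         # 1. syntactic parse
    labeled = []
    for pred in find_predicates(tree):             # predicates (with senses)
        for arg in identify_args(tree, pred):      # 2. argument identification
            role = classify_role(tree, pred, arg)  # 3. argument classification
            if role is not None:
                labeled.append((pred, arg, role))
    return labeled

# toy demo with trivial stand-ins for the real components
print(srl_pipeline(
    "the walrus ate a sea cucumber",
    parse=lambda s: s.split(),
    find_predicates=lambda t: ["ate"],
    identify_args=lambda t, p: ["the walrus", "a sea cucumber"],
    classify_role=lambda t, p, a: "ARG0" if a == "the walrus" else "ARG1",
))
```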
feature-based algorithm for SRL
- assign syntactic parse to input string
- traverse parse to find all predicates
- for each predicate, examine each node in the parse tree and use supervised classification to decide the semantic role it plays for the predicate (if any)
features for SRL
given a labeled training set, a feature vector is extracted for each node
common feature templates:
- governing predicate
- phrase type
- headword POS
this information comes from syntax
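A runnable toy of these feature templates (the Node stand-in and its fields are invented for illustration, not a real parser's API):

```python
from dataclasses import dataclass

@dataclass
class Node:
    """Minimal stand-in for a parse-tree constituent."""
    label: str         # phrase type, e.g., "NP"
    headword: str      # e.g., "walrus"
    headword_pos: str  # e.g., "NN"

def extract_features(node: Node, predicate_lemma: str) -> dict:
    # the three feature templates from this card, as a feature dict that
    # a supervised classifier (e.g., logistic regression) could consume
    return {
        "governing_predicate": predicate_lemma,
        "phrase_type": node.label,
        "headword_pos": node.headword_pos,
    }

print(extract_features(Node("NP", "walrus", "NN"), "eat"))
```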
evaluation of SRL
- goal: compute the highest-probability tag sequence given an input sequence of words
- evaluation on unseen test sentences
- each argument label must be assigned to the correct word sequence or parse constituent
- compute precision, recall, F1
- common evaluation datasets are CoNLL and OntoNotes
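A minimal sketch of the scoring step (encoding gold and predicted arguments as sets of (predicate, span, role) triples is one convenient convention, not the official CoNLL scorer):

```python
def precision_recall_f1(gold: set, predicted: set):
    # an argument counts as correct only if its span AND its label match
    tp = len(gold & predicted)
    p = tp / len(predicted) if predicted else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

gold = {("ate", "the walrus", "ARG0"), ("ate", "a sea cucumber", "ARG1")}
pred = {("ate", "the walrus", "ARG0"), ("ate", "a sea cucumber", "ARG2")}
print(precision_recall_f1(gold, pred))  # (0.5, 0.5, 0.5)
```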
example system architecture SRL
main idea: treat SRL as neural sequence labeling task, similar to NER
- sentence is passed through encoder, which generates contextualized embeddings for each word
- concatenation of word embeddings with predicate information
- FFN extracts relevant features of each word
- decoder: CRF + biLSTM layer
- output: distribution over the SRL labels for each word, which indicates the likelihood of each label
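A minimal PyTorch sketch of this architecture (all sizes are arbitrary; the CRF layer would come from a third-party package such as torchcrf, so it is replaced here by a per-token softmax to stay self-contained):

```python
import torch
import torch.nn as nn

class SRLTagger(nn.Module):
    """Contextual word embeddings + predicate indicator -> FFN -> BiLSTM ->
    per-token distribution over SRL labels (CRF decoding omitted)."""

    def __init__(self, emb_dim=768, hidden=256, n_labels=40):
        super().__init__()
        self.ffn = nn.Sequential(nn.Linear(emb_dim + 1, hidden), nn.ReLU())
        self.bilstm = nn.LSTM(hidden, hidden, batch_first=True,
                              bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_labels)

    def forward(self, word_embs, predicate_mask):
        # word_embs: (batch, seq, emb_dim) from a pretrained encoder
        # predicate_mask: (batch, seq), 1 at the predicate token, else 0
        x = torch.cat([word_embs, predicate_mask.unsqueeze(-1).float()], dim=-1)
        x, _ = self.bilstm(self.ffn(x))
        return self.out(x).log_softmax(dim=-1)  # per-word label distribution

# toy usage with random "contextualized embeddings" for a 6-token sentence
model = SRLTagger()
scores = model(torch.randn(1, 6, 768), torch.tensor([[0, 0, 1, 0, 0, 0]]))
print(scores.shape)  # torch.Size([1, 6, 40])
```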