lecture 2 Flashcards
(41 cards)
areas of linguistics
- phonetics
- phonology
- morphology
- syntax
- semantics
- pragmatics
phonetics
sounds of human language
phonology
sound systems in human languages
morphology
formation and internal structure of words
syntax
formation and internal structure of sentences
semantics
meaning of words and sentences
pragmatics
study of the way sentences with their semantic meanings are used for particular communicative goals
NLP and text
much of NLP focuses on text only, leaving out many layers of natural language
e.g., phonetics/phonology
natural language is
- compositional
- arbitrary
- creative
- displaced
compositional
the meaning of a sentence is determined by the meanings of its individual words (semantics) and how they are combined (syntax)
[set of rules that define grammaticality] + [lexicon of words that relate to the world we want to talk about]
meaning of an expression = semantics + syntax
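A toy sketch of compositionality, assuming a made-up four-word lexicon and a single subject-verb-object rule (LEXICON and interpret_svo are hypothetical names, not from the lecture). The same words produce different meanings depending on how the syntax combines them:

```python
# Toy sketch: meaning = lexicon (word meanings) + syntax (how words combine).
# LEXICON and interpret_svo are hypothetical names used only for illustration.

LEXICON = {
    "dog": "DOG",
    "cat": "CAT",
    "chases": "CHASE",
    "sees": "SEE",
}

def interpret_svo(sentence: str) -> tuple:
    """Map a three-word subject-verb-object sentence to (predicate, agent, patient)."""
    words = sentence.lower().split()
    if len(words) != 3:
        raise ValueError("this sketch only handles three-word SVO sentences")
    subject, verb, obj = (LEXICON[w] for w in words)
    # The syntactic rule (word order) decides which word fills which role.
    return (verb, subject, obj)

print(interpret_svo("dog chases cat"))  # ('CHASE', 'DOG', 'CAT')
print(interpret_svo("cat chases dog"))  # ('CHASE', 'CAT', 'DOG')
```

Both sentences use the same lexicon entries; only the combination differs, and so does the resulting meaning.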
arbitrary
the link between form and meaning is arbitrary
creative
every language can create an infinite number of possible new words and sentences
displaced
we can talk about things that are not immediately present
human natural language
- there is a critical period for acquiring language
- children need to receive real input to acquire language
- language is interconnected with other cognitive abilities
structure & grammar
- structure dictates how we can use language
- we implicitly know complex rules about structure
- a community of speakers shares a rough consensus on its implicit rules; a grammar attempts to describe these rules
descriptive linguistics
how language is studied
focuses on describing how language is used in practice, without making judgments about correctness.
aims to objectively analyze and document rules that speakers naturally follow
prescriptive linguistics
how language is taught
prescribes rules about how language should be used
often involves enforcing traditional rules and norms, which may not reflect actual usage
language rules in education (grammar)
the rules taught as part of language education often serve purposes beyond describing the language
they often reflect social, cultural, and political influences
grammaticality
a community of speakers shares a rough consensus on its implicit rules.
- all utterances we can generate from these rules are grammatical
- if we cannot produce an utterance using these rules, it is ungrammatical
- SVO order
- subject & object pronouns
- sentences can be grammatically correct without any meaning
- idiolects
grammaticality rules can accept useless (meaningless) utterances and reject utterances that would still communicate.
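A minimal sketch of a grammaticality check, assuming a tiny invented grammar that covers only SVO order and subject vs. object pronoun forms (the word sets and is_grammatical are hypothetical, chosen for illustration):

```python
# Toy grammar: a sentence is grammatical only if it is
# subject pronoun + verb + object pronoun, in that order (SVO).
# The word lists are illustrative assumptions, not a real grammar of English.
SUBJECT_PRONOUNS = {"i", "he", "she", "we", "they"}
OBJECT_PRONOUNS = {"me", "him", "her", "us", "them"}
VERBS = {"saw", "helped", "painted"}

def is_grammatical(sentence: str) -> bool:
    """Return True if the sentence can be produced by the toy rules."""
    words = sentence.lower().split()
    if len(words) != 3:
        return False
    subject, verb, obj = words
    return (
        subject in SUBJECT_PRONOUNS
        and verb in VERBS
        and obj in OBJECT_PRONOUNS
    )

print(is_grammatical("she saw him"))  # True
print(is_grammatical("her saw he"))   # False: wrong pronoun forms
print(is_grammatical("saw she him"))  # False: wrong word order
```

A listener could probably guess what "her saw he" was meant to say, yet the rules reject it, while they accept any subject-verb-object combination regardless of whether it makes sense.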
why do we need rules?
- if we ignore rules because we know what was probably intended, we actually limit possibilities
- rules give us expressivity
NLP before self-supervised learning
the way to approach NLP was through understanding the human language system, and trying to imitate it (rule-based)
- probing
- reverse engineering
probing
small supervised models (probes) that are trained to extract linguistic information from another model’s internal representations
this helps understand how well different layers of an LLM capture various linguistic features
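A rough sketch of the probing idea, using purely synthetic data in place of real model activations. Everything here is an assumption for illustration: the two "layers" are random vectors, the noun/verb labels are made up, and the probe is a simple nearest-centroid classifier rather than any particular published setup. The point is only that a probe fit on activations paired with linguistic labels scores well when a layer encodes the feature and near chance when it does not:

```python
import random

def make_layer(labels, signal, rng):
    """Synthetic 4-dim 'activations'; only dimension 0 carries the label, scaled by `signal`."""
    vectors = []
    for y in labels:
        v = [rng.gauss(0.0, 0.3) for _ in range(4)]
        v[0] += signal * (1.0 if y == 1 else -1.0)
        vectors.append(v)
    return vectors

def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def probe_accuracy(train_x, train_y, test_x, test_y):
    """Fit a nearest-centroid probe on labeled activations and score it on held-out ones."""
    c0 = centroid([x for x, y in zip(train_x, train_y) if y == 0])
    c1 = centroid([x for x, y in zip(train_x, train_y) if y == 1])
    correct = sum(
        (1 if sq_dist(x, c1) < sq_dist(x, c0) else 0) == y
        for x, y in zip(test_x, test_y)
    )
    return correct / len(test_y)

rng = random.Random(0)
labels = [i % 2 for i in range(300)]  # made-up task: 0 = noun, 1 = verb
layers = {
    "layer_with_signal": make_layer(labels, 1.0, rng),
    "layer_without_signal": make_layer(labels, 0.0, rng),
}
accuracies = {}
for name, acts in layers.items():
    # Train on the first 200 examples, evaluate on the remaining 100.
    accuracies[name] = probe_accuracy(acts[:200], labels[:200], acts[200:], labels[200:])
    print(name, accuracies[name])
```

With real models, the synthetic vectors would be replaced by hidden states taken from each layer, and the per-layer accuracies compared in the same way.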
reverse engineering language
- syntax: parse the input to understand its grammatical structure
- semantics: interpret meaning of the parsed input
- discourse: understand the broader context and relationships between sentences
the process involves using theories to inform the design of NLP models, ensuring they can parse, understand, and generate human language
testing an LLM's understanding of syntax
- jabberwocky sentences
- learning to apply grammatical rules from vast amounts of text data
- word order
- lexical generalization
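A small sketch of how a jabberwocky sentence might be built, assuming a hypothetical mapping from content words to invented nonce words (NONCE and to_jabberwocky are illustrative names). Function words and word order are kept, so the syntactic skeleton survives while lexical meaning is removed:

```python
# Hypothetical content-word -> nonce-word mapping, for illustration only.
NONCE = {"dog": "blick", "cat": "wug", "chased": "dorped", "lazy": "tovish"}

def to_jabberwocky(sentence: str) -> str:
    """Replace known content words with nonce words; leave everything else untouched."""
    return " ".join(NONCE.get(word, word) for word in sentence.lower().split())

print(to_jabberwocky("the lazy dog chased the cat"))
# the tovish blick dorped the wug
```

If a model still treats "blick" as the subject and "wug" as the object of "dorped", it is relying on structure (word order, function words, the "-ed" ending) rather than on memorized word meanings.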