AI communication Flashcards
(13 cards)
Language Processing Challenges: Humans
Words never sound the same: different speakers, accents, fast speakers,
“lazy” speakers, etc.
Words are rarely produced without any background noise
Words are rarely produced fully (e.g., we do not say “probably” entirely)
What are some of the ways / cues our minds use to split speech into words and recognize those words?
- we look for pauses and segmentations in speech
- we adapt to these segmentations for the speaker and the words the speaker tends to use
- we use context
- we use parallel processing
What is automatic speech recognition (ASR)?
a technology that converts spoken language into written text. It allows machines (like smartphones, computers, or virtual assistants) to understand and transcribe human speech in real time or from recordings.
What is Natural Language Processing (NLP)?
a field of artificial intelligence (AI) that focuses on enabling computers to understand, interpret, generate, and respond to human language in a way that’s both meaningful and useful
What does NLP do?
- processes spoken or written language to get its meaning
- transforms the message into something the machine can understand
- makes decisions based on the meaning
How are NLP systems created?
Historically, expert knowledge was used to create models of sounds → words
Now, systems are trained to learn relationships between speech / text and
the desired response / output
What does AI struggle with (that humans manage)?
- Less familiar with different accents?
- Worse at filling in noisy speech?
- Less familiar with individual people?
- Non-literal speech, e.g., slang?
What cues / information might AI not use?
Language context? Shared references / knowledge?
Physical / visual cues? Diversity of real everyday language?
How does AI use segmentation cues to understand language?
- Pauses
- Stress patterns
- Syllable length (longer when part of longer word)
- Phonotactics (allowable sound combinations)
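As a rough illustration of the phonotactics cue, the sketch below proposes a word boundary wherever two adjacent sounds form a combination treated as unlikely inside a word. The cluster set, the function names, and the use of letters as stand-ins for sounds are all assumptions made for this example; real systems learn such probabilities from data rather than using a fixed list.

```python
# Hypothetical toy list: sound pairs treated here as unlikely inside one word.
# Letters stand in for sounds; a real system would use learned probabilities.
UNLIKELY_WITHIN_WORD = {"dn", "vg", "mr"}


def phonotactic_boundaries(sounds: str) -> list[int]:
    """Return each index i where a boundary is proposed before sounds[i]."""
    boundaries = []
    for i in range(1, len(sounds)):
        # Look at the pair made of the previous sound and the current one.
        if sounds[i - 1:i + 1] in UNLIKELY_WITHIN_WORD:
            boundaries.append(i)
    return boundaries


def split_at(sounds: str, boundaries: list[int]) -> list[str]:
    """Cut the sound string at the proposed boundary positions."""
    pieces, start = [], 0
    for b in boundaries:
        pieces.append(sounds[start:b])
        start = b
    pieces.append(sounds[start:])
    return pieces


stream = "sadnews"  # unsegmented input, no pauses marked
print(split_at(stream, phonotactic_boundaries(stream)))
```

Here the pair "dn" triggers a boundary, so the stream is cut into "sad" and "news". In practice this cue would be combined with pauses, stress, and syllable length rather than used alone.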
How does AI use flexibility?
AI does adapt to segmentation cues depending on the speaker,
but it doesn’t adapt to the words the speaker tends to use
How does AI use context?
It can use the surrounding words,
but not visual cues or the situation
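One simple way to picture the use of surrounding words is a word-pair (bigram) count: when two candidates sound the same, the system prefers the one that more often follows the previous word. The counts and the function name below are made up for illustration and are not taken from any real system.

```python
# Hypothetical counts of how often the second word follows the first.
BIGRAM_COUNTS = {
    ("over", "there"): 50,
    ("over", "their"): 2,
    ("lost", "their"): 40,
    ("lost", "there"): 1,
}


def pick_by_context(previous_word: str, candidates: list[str]) -> str:
    """Choose the candidate seen most often after previous_word."""
    return max(
        candidates,
        key=lambda word: BIGRAM_COUNTS.get((previous_word, word), 0),
    )


print(pick_by_context("over", ["their", "there"]))
print(pick_by_context("lost", ["their", "there"]))
```

The sketch only looks at the neighbouring word, which mirrors the limitation on the card: nothing in it represents visual cues or the situation.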
Does AI have parallel processing?
yes - activates multiple candidates for a word
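Activating multiple candidates can be sketched as keeping every word in a small lexicon that is still consistent with the input heard so far, then narrowing the set as more input arrives. The lexicon, the function names, and the use of letters in place of sounds are assumptions for this example only.

```python
# Hypothetical mini-lexicon; letters stand in for speech sounds.
LEXICON = ["cap", "captain", "capital", "cat", "dog"]


def active_candidates(heard_so_far: str, lexicon: list[str] = LEXICON) -> list[str]:
    """Return every word that is still compatible with the input so far."""
    return [word for word in lexicon if word.startswith(heard_so_far)]


def trace(word: str) -> list[tuple[str, list[str]]]:
    """Show the candidate set after each additional sound of the word."""
    return [
        (word[:i], active_candidates(word[:i]))
        for i in range(1, len(word) + 1)
    ]


for heard, candidates in trace("captain"):
    print(heard, candidates)
```

Early in the word several candidates stay active at once ("cap", "captain", "capital", "cat"), and only later input rules the competitors out.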
What should future AI take into account?
- Non-literal speech (e.g., sarcasm)
- Different types of words (e.g., slang words) that are perhaps not part of the “standard training”
- Use context better to differentiate between different meanings of an ambiguous word
- More diverse input to respond better to different speakers (not just trained on white RP speakers)
- Input from different languages, including switches between languages
- Non-linguistic cues