Test 2 Flashcards

Chapters 8, 16, 17, 21 (142 cards)

1
Q

Natural Language

A

Unfettered spoken or written language

-Primary means of human communication

2
Q

Natural Language Processing (NLP)

A

Enabling the use of automated methods that represent the relevant information in the text with high validity and reliability.

3
Q

Patrick Suppes

A

-Pioneer in computerized learning

“…the challenge to psychological theory made by linguists to provide an adequate theory of language learning may well be regarded as the most significant intellectual challenge to theoretical psychology in this century.”

4
Q

Bag-of-Words

A

A language model where text is represented as a collection of words, independent of each other and disregarding word order.
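As a minimal sketch of this idea, the Python snippet below counts words with `collections.Counter`. The function name `bag_of_words` and the sample sentences are illustrative assumptions, not part of the card. Two sentences with the same words in a different order produce the same bag.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Return word counts for a text, ignoring case and word order."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


a = bag_of_words("The patient denies chest pain")
b = bag_of_words("Chest pain the patient denies")
# Word order is discarded, so both sentences map to the same bag.
print(a == b)
```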

5
Q

Keyword

A

A word or phrase that conveys special meaning or that refers to information relevant to such a meaning.

6
Q

Machine Learning

A

A computer technique in which information learned from data is used to improve system performance.

7
Q

NLP Text Processing

A
  • Lexical: Tokenization, part of speech, head, lemma
  • Parsing and Chunking
  • Semantic Tagging: Semantic role, word sense
  • Certain Expressions: Named entities
  • Discourse: coreference, discourse segments
8
Q

NLP Speech Processing

A
  • Phonetic transcription
  • Segmentation (Punctuation)
  • Prosody
9
Q

Types of NLP: Information Extraction

A

Methods that process text to capture and organize specific information in the text and also to capture and organize specific relations between the pieces of information.

-Most common form in biomedicine.

10
Q

Biosurveillance

A

A public health activity that monitors a population for occurrence of a rare disease or increased occurrence of a common one.

11
Q

Named-entity Recognition

A

In language processing, a sub-task of information extraction that seeks to locate and classify atomic elements in text into predefined categories

12
Q

Named-entity Normalization

A

The natural language processing method, after finding a named entity in a document, for linking (normalizing) that mention with appropriate database identifiers.
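One simple way to picture recognition followed by normalization is a dictionary lookup, sketched below in Python. The lexicon, the `DEMO:` identifiers, and the function name `normalize_entities` are all hypothetical; real systems link mentions to identifiers from an actual terminology or database and use far more robust matching.

```python
import re

# Hypothetical lexicon: surface mentions mapped to made-up concept identifiers.
LEXICON = {
    "heart attack": "DEMO:0001",
    "myocardial infarction": "DEMO:0001",
    "high blood pressure": "DEMO:0002",
    "hypertension": "DEMO:0002",
}


def normalize_entities(text):
    """Find known mentions and return (start offset, mention, identifier) tuples."""
    lowered = text.lower()
    found = []
    for mention, concept_id in LEXICON.items():
        pattern = r"\b" + re.escape(mention) + r"\b"
        for match in re.finditer(pattern, lowered):
            found.append((match.start(), mention, concept_id))
    return sorted(found)


hits = normalize_entities("History of hypertension and a prior heart attack.")
print(hits)
```

Synonyms such as "heart attack" and "myocardial infarction" resolve to the same identifier, which is the point of normalization.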

13
Q

Modifiers of Interest

A

In NLP, a term used to describe or otherwise modify a named-entity that has been recognized.

14
Q

Relations Among Named Entities

A

A characterization of two entities in NLP with respect to the semantic nature of the relationship between them.

15
Q

Reference Resolution

A

In NLP, recognizing that two mentions in two different textual locations refer to the same entity.

16
Q

Question Answering (QA)

A

A computer-based process whereby a user submits a natural language question that is then automatically answered by returning a specific response.

17
Q

Text Summarization

A

Takes one or several documents as input and produces a single, coherent text that synthesizes the main points of the input documents.

18
Q

Text Generation

A

Methods that create coherent natural language text from structured data or from textual documents in order to satisfy a communication goal.

19
Q

Machine Translation

A

Automatic mapping of text written in one natural language into text of another language.

20
Q

Text Readability Assessment and Simplification

A

An application of NLP in which computational methods are used to assess the clarity of writing for a certain audience or to revise the exposition using simpler terminology and sentence construction.

21
Q

Linguistic Steps in NLP: Morphology

A

The way words are built up from smaller, meaning-bearing units; the structure of words

  • Various forms of basic words
  • Make more words from fewer.
22
Q

Linguistic Steps in NLP: Syntax

A

How words are put together to form correct sentences and what structural role each word has.

-Syntax tree assigned by grammar or lexicon.

23
Q

Linguistic Steps in NLP: Semantics

A

What words mean and how these meanings combine in sentences to form sentence meanings.

24
Q

Linguistic Steps in NLP: Pragmatics

A

How sentences are used in different situations and how use affects the interpretation of the sentence.

25
Linguistic Steps in NLP: Discourse
How the immediately preceding sentences affect the interpretation of the next sentence.
26
Natural Language Understanding (NLU)
Subtopic of NLP in Artificial Intelligence that deals with machine reading comprehension.
27
Applications of NLP
  • Intelligent computer systems
  • NLU interfaces to databases
  • Computer-aided instruction
  • Information Retrieval
  • Intelligent web searching
  • Data mining
  • Machine translation
  • Speech Recognition
  • Natural Language Generation
  • Question Answering
28
Difficulties of NLP
  • Different ways of parsing a sentence
  • Word category ambiguity
  • Word sense ambiguity
  • Words can mean more than the sum of their parts
  • Imparting world knowledge is difficult
  • Fictitious worlds
  • Defining scope
  • Language is changing and evolving
  • Complex ways of interaction between the kinds of knowledge
  • Exponential complexity at each point in using the knowledge
29
Ambiguity
The fundamental problem of computational linguistics
30
Morpheme
The smallest unit in grammar that has a meaning or linguistic function. -Generally a root of a word, a prefix, or a suffix
31
Free Morpheme
A morpheme that is a word and does not contain another morpheme
32
Bound Morpheme
A morpheme that creates a different form of a word but must always occur with another morpheme.
33
Inflectional Morpheme
A morpheme that creates a different form of a word without changing the meaning or part of speech.
34
Derivational Morpheme
A morpheme that changes the meaning or part of speech of a morpheme.
35
Regular Expression
A mathematical model of a set of strings, defined using characters of an alphabet and the operators concatenation, union, and closure (zero or more occurrences of an expression).
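The three operators can be seen in a small pattern written with Python's `re` module. This is only a sketch: the dose-like strings and the name `is_dose` are assumptions chosen for illustration.

```python
import re

# Concatenation: a digit, then more digits, then a unit.
# Closure:       [0-9]* matches zero or more further digits.
# Union:         (mg|ml) accepts either unit.
DOSE = re.compile(r"[0-9][0-9]*(mg|ml)")


def is_dose(text):
    """Return True if the whole string belongs to the set described by DOSE."""
    return DOSE.fullmatch(text) is not None


print(is_dose("250mg"), is_dose("5ml"), is_dose("mg"))
```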
36
Lexicon
A catalogue of words in a language, usually containing syntactic information such as parts of speech, pluralization rules, etc.
37
Finite State Automaton
An abstract, computer-based representation of the state of some entity together with a set of actions that can transform the state. -Collections of finite state automata can be used to model complex systems.
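A minimal sketch of a finite state automaton in Python follows. The states, the input classes, and the accepted language (one or more digits followed by "mg" or "ml") are assumptions made up for this example.

```python
# Transition table: (current state, input class) -> next state.
TRANSITIONS = {
    ("start", "digit"): "number",
    ("number", "digit"): "number",
    ("number", "m"): "unit_m",
    ("unit_m", "g"): "accept",
    ("unit_m", "l"): "accept",
}


def classify(ch):
    """Map a character to its input class; all digits share one class."""
    return "digit" if ch.isdigit() else ch


def accepts(text):
    """Run the automaton over the text and report whether it ends in 'accept'."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, classify(ch)))
        if state is None:  # no transition defined: reject
            return False
    return state == "accept"


print(accepts("250mg"), accepts("250"), accepts("mg"))
```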
38
Tokens (NLP)
The composite entities constructed from individual characters, typically words, numbers, dates, or punctuation.
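As a rough sketch, a regular-expression tokenizer in Python can build such tokens from characters. The pattern below is an assumption that covers only simple dates, numbers, words, and punctuation; production tokenizers handle many more cases.

```python
import re

# Alternatives are tried in order: date, number, word, single punctuation mark.
TOKEN_PATTERN = re.compile(
    r"\d{1,2}/\d{1,2}/\d{4}"   # dates such as 3/14/2024
    r"|\d+(?:\.\d+)?"          # integers and decimals
    r"|\w+"                    # words
    r"|[^\w\s]"                # any single punctuation character
)


def tokenize(text):
    """Split text into date, number, word, and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


print(tokenize("BP 120/80 on 3/14/2024, temp 98.6."))
```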
39
Markov Process
A mathematical model of a set of strings in which the probability of a given symbol occurring depends on the identity of the immediately preceding symbol or the two immediately preceding symbols.
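The first-order case (dependence on one preceding symbol) can be sketched as a bigram model in Python. The function name and the toy token sequence are assumptions; probabilities are plain relative frequencies with no smoothing.

```python
from collections import Counter, defaultdict


def bigram_probabilities(tokens):
    """Estimate P(next | previous) from adjacent token pairs."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return {
        prev: {word: n / sum(following.values()) for word, n in following.items()}
        for prev, following in counts.items()
    }


tokens = "no chest pain no fever no chest tightness".split()
probs = bigram_probabilities(tokens)
# "no" is followed by "chest" twice and "fever" once.
print(probs["no"])
```

The nested dictionary plays the role of a transition table: each row is a current symbol and each entry is the probability of the next one.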
40
Lexemes
A minimal lexical unit in a language that represents different forms of the same word.
41
Telegraphic (NLP)
Language that does not follow the usual rules of grammar but is compact and efficient.
42
Grammar (NLP)
A mathematical model of a potentially infinite set of strings.
43
Nested Structures (NLP)
A phrase or phrases that are used in place of simpler words within other phrases.
44
Probabilistic Context-Free Grammar
CFG in which the possible ways to expand a given symbol have varying probabilities rather than equal weight.
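To illustrate, the Python sketch below attaches a probability to each expansion of a symbol and multiplies the probabilities of the rules used in one derivation. The grammar and its numbers are invented for the example.

```python
from math import prod

# Hypothetical rule probabilities; expansions of each left-hand side sum to 1.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Det", "N")): 0.6,
    ("NP", ("N",)): 0.4,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("V",)): 0.3,
}


def derivation_probability(rules):
    """Probability of a derivation = product of the probabilities of its rules."""
    return prod(RULE_PROB[rule] for rule in rules)


derivation = [
    ("S", ("NP", "VP")),
    ("NP", ("N",)),
    ("VP", ("V", "NP")),
    ("NP", ("Det", "N")),
]
print(derivation_probability(derivation))
```

When a sentence has several parses, a parser can prefer the derivation with the highest product.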
45
Dependency Grammar (NLP)
A linguistic theory of syntax that is based on dependency relations between words, where one word in the sentence is independent and other words are dependent on that word.
46
Logic-based Semantics
A knowledge representation method based on the use of predicates.
47
Conceptual Graph (Semantics)
A formal notation in which knowledge is represented through explicit relationships between concepts.
48
Word Senses
Possible meanings of a word
49
Semantic Types
The categorization of words into semantic classes according to meaning.
50
Semantic Patterns
The study of the patterns formed by the co-occurrence of individual words in a phrase or the co-occurrence of the associated semantic types of words.
51
Semantic Relations
A classification of the meaning of a linguistic relationship.
52
Referential Expression
A sequence of one or more words that refer to a particular person, object, or event.
53
Coreference Chains
Provide a compact representation for encoding the words and phrases in a text that all refer to the same entity.
54
Parse Tree
The representation of structural relationships that results when using a grammar to analyze a given sentence.
55
Transition Matrix
A table of numbers giving the probability of moving from one state in a Markov model to another state, or the state that is reached in a finite-state machine, depending on the current character of the alphabet.
56
Chunking (NLP)
A processing method for determining non-recursive phrases where each phrase corresponds to a specific part of speech.
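A small sketch of noun-phrase chunking in Python: part-of-speech tags are mapped to one-letter codes and a flat pattern (optional determiner, any adjectives, one or more nouns) is matched over them. The tag set, the pattern, and the pre-tagged input are assumptions for illustration.

```python
import re

# One-letter codes for the tags the pattern cares about; everything else is "x".
TAG_CODE = {"DT": "D", "JJ": "J", "NN": "N"}
NP_PATTERN = re.compile(r"D?J*N+")  # flat, non-recursive noun phrase


def noun_phrase_chunks(tagged):
    """Return the words of each noun-phrase chunk in a list of (word, tag) pairs."""
    codes = "".join(TAG_CODE.get(tag, "x") for _, tag in tagged)
    return [
        [word for word, _ in tagged[m.start():m.end()]]
        for m in NP_PATTERN.finditer(codes)
    ]


tagged = [
    ("the", "DT"), ("patient", "NN"), ("reports", "VBZ"),
    ("severe", "JJ"), ("chest", "NN"), ("pain", "NN"),
]
print(noun_phrase_chunks(tagged))
```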
57
Chart Parsing
A dynamic programming algorithm for structuring a sentence according to grammar by saving and reusing segments of the sentence that have been parsed.
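One well-known chart-based method is the CYK algorithm, sketched below in Python as a recognizer. The toy grammar (in Chomsky normal form) and the lexicon are assumptions; the chart stores, for every span of the sentence, the symbols already found so they are reused rather than re-parsed.

```python
from itertools import product

# Hypothetical toy grammar: (left child, right child) -> parent symbol.
BINARY = {("NP", "VP"): "S", ("Det", "N"): "NP", ("V", "NP"): "VP"}
LEXICON = {"the": "Det", "nurse": "N", "chart": "N", "reads": "V"}


def cyk_recognize(words):
    """Return True if the grammar derives the word sequence from S."""
    n = len(words)
    # chart[i][j] holds every symbol that spans words[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        if word in LEXICON:
            chart[i][i + 1].add(LEXICON[word])
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between two saved spans
                for left, right in product(chart[i][k], chart[k][j]):
                    if (left, right) in BINARY:
                        chart[i][j].add(BINARY[(left, right)])
    return "S" in chart[0][n]


print(cyk_recognize("the nurse reads the chart".split()))
```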
58
Predicate
The part of a sentence or clause containing a verb and stating something about the subject.
59
Argument
A word or phrase that helps complete a predicate.
60
Cascading Finite State Automata (FSA)
A tagging method in NLP in which a series of FSA are employed such that the output of one FSA becomes the input of another
61
Centering Theory
A theory that attempts to explain what entities are indicated by referential expressions by noting how the center of each sentence changes across the text.
62
Extrinsic Evaluation
An evaluation of a component of a system based on an evaluation of the performance of the entire system.
63
Intrinsic Evaluation
An evaluation of a component of a system that focuses only on the performance of the component.
64
Recall
The percentage of results that should have been obtained according to the test set that actually were obtained.
65
Precision
The percent of results that the system obtained that were actually correct according to the test set.
66
F Measure
A measure of overall accuracy that is a combination of Recall and Precision.
67
Harmonic Mean
The reciprocal of the arithmetic mean of the reciprocals of a set of values; in the weighted form, each value's weight reflects the relative importance of its contribution to the average.
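The evaluation measures on the preceding cards can be computed together, as in this Python sketch. It assumes the balanced F measure (F1), which is the harmonic mean of precision and recall; the function name and the document IDs are illustrative.

```python
def precision_recall_f1(retrieved, relevant):
    """Compute precision, recall, and their harmonic mean (F1) from two ID sets."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# The system returned 4 documents; 2 of the 3 relevant ones are among them.
p, r, f = precision_recall_f1({"d1", "d2", "d3", "d4"}, {"d1", "d2", "d5"})
print(p, r, f)
```

Here precision is 2/4, recall is 2/3, and F1 works out to 4/7, which sits closer to the lower of the two values than an arithmetic mean would.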
68
Information Retrieval (IR)
Finding material of an unstructured nature that satisfies an information need from within large collections. -aka: SEARCH
69
Challenges in Biomedical IR
- Transition from little information to information overload - Multiple expressions for search topics - Multiple meanings for each expression - Balancing open access vs. providing for cost of production and maintenance.
70
3 Major Uses of the Web
- Informational: Seeking information - Navigational: Looking for a specific page - Transactional: Exchanges of goods and services
71
IR's Relevance to Biomedicine and Health
  • Growth of knowledge surpassed human memory capabilities
  • Clinicians have frequent and unmet information needs
  • Researchers must frequently update their knowledge in new areas quickly
  • Primary literature can be scattered and hard to synthesize
  • Non-primary literature sources are neither comprehensive nor systematic
  • Web is increasingly used as a source of health and biomedical information
72
Classification of Knowledge-Based Scientific Information: Primary
Original Research: - Mainly published in journals but also conference proceedings, technical reports, books, etc. - Can include re-analysis, meta-analysis, and systematic reviews
73
Classification of Knowledge-Based Scientific Information: Secondary
Reviews, Condensations, Synopses of primary literature: - Textbooks and handbooks are staples of clinicians, researchers, and others - Guidelines are important for normalizing care and measuring quality
74
Classification of Knowledge-Based Content: Bibliographic
Contains databases of collections involving citations or pointers to published literature. -Mainly primary sources -Rich in metadata -One of the oldest mainstays of IR sources.
75
Classification of Knowledge-Based Content: Full-Text
Involves the complete textual information contained in a bibliographic source. -Everything online
76
Classification of Knowledge-Based Content: Annotated
Content that has been annotated to describe its type, subject matter, and other attributes.
  • Non-text
  • Structured text annotated with text
Includes:
  • Image collections
  • Citation databases
  • Evidence-based Medicine databases
  • Clinician Decision Support
  • Genomic Databases
77
Classification of Knowledge-Based Content: Aggregations
Collections of content from a variety of types.
78
Types of Bibliographic Content
  • Old databases have been revised
  • New databases have emerged
  • Web Catalogs
  • Really Simple Syndication/Rich Site Summary (RSS)
79
MEDLINE
- Contains references to biomedical journal literature - Launched in 1971; database dates back to 1966 (MEDLARS) - Free for use as of 1997 via PubMed
80
National Guideline Clearinghouse
- Produced by Agency for Healthcare Research and Quality (AHRQ) - Contains detailed information about guidelines
81
Web Catalogs
Aim to provide quality-filtered web sites to specific audiences -Some geared towards physicians while others are for patients.
82
Really Simple Syndication/Rich Site Summary (RSS)
Feeds providing short summaries, typically of news, journal articles, or other recent web postings. -Can be filtered by user with an aggregation tool.
83
Full-Text Content
Contains complete texts as well as tables, figures, images, etc.
-Usually identical to the print version
Includes:
  • Periodicals
  • Books
  • Web sites
84
Full-Text Content: Books
  • Textbooks: Most well-known clinical textbooks now available online in e-text, accessible on mobile devices
  • Compendia of drugs, diseases, evidence, etc.
  • Handbooks: Popular among clinicians
85
Value of E-Books
- Added multimedia - Bundling of multiple books - Can be updated between editions - Links to related information
86
Full-Text Content: Websites
- Defined more narrowly to refer to coherent collections of information on the Web - Includes added links and multimedia - Increasingly integrated with other resources and available in different platforms.
87
Annotated Content: Image Collections
  • Most prominent in visual medical specialties
  • Many have associated text to support indexing and retrieval
88
Indexing
Assignment of metadata to content to facilitate retrieval.
Two major types:
  • Manual
  • Automated
89
Human Indexing
- Usually performed by a professional with some background in biomedicine - Follows protocol to scan resource and select terms from a controlled vocabulary - Most vocabularies are hierarchical and have specific definitions for when term is to be assigned
90
Medical Subject Heading (MeSH)
- Over 26,000 terms with many synonyms - Hierarchical based on 16 trees - Contains 83 subheadings, for specificity - MeSH browser allows exploration
91
MEDLINE Indexing
  • Done by professionals who follow protocols first derived by Bachrach (1978)
      • Read: Title, Intro, and Conclusion
      • Scan: Methods, Results, Figures, Tables, and lastly Abstract
  • Ignore publisher's "key words"
  • Assign 2-4 headings as central concepts and 5-10 as minor headings
  • Use most specific heading in assigned hierarchy
  • Publication Type is an important secondary tag
  • Modern tools have been created to assist in the task
92
Metadata in Indexing
Indexing covers more than content, such as: - Author(s) - Source - Publication/Resource Type - Relationship to Information
93
Automated Indexing
Indexing of all words that appear in the content.
  • Often uses a stop-word list to exclude common words
  • Some systems stem words to their root form
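A minimal sketch of word indexing in Python: every non-stop word is stemmed and mapped to the documents that contain it. The stop list and the crude suffix-stripping `stem` function are assumptions for illustration, not a real stemming algorithm.

```python
import re
from collections import defaultdict

STOP_WORDS = {"the", "of", "and", "in", "a", "for", "with"}  # tiny sample list


def stem(word):
    """Strip a few common suffixes, keeping at least three characters."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Map each stemmed, non-stop word to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOP_WORDS:
                index[stem(word)].add(doc_id)
    return index


index = build_index({
    1: "Treatment of infections",
    2: "Treating the infection with antibiotics",
})
print(sorted(index["infection"]))
```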
94
Weighted Indexing
  • Usually used with Automated Indexing
  • Gives weight to words that are frequent but discriminating
  • Most common approach is for the weight to equal the product TF*IDF
      • Inverse Document Frequency (IDF)
      • Term Frequency (TF)
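The TF*IDF product can be sketched in Python as follows. This assumes raw counts for TF and `log(N / df)` for IDF, which is one common variant among several; the function name and the toy documents are illustrative.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return {doc_id: {term: TF * IDF}} for tokenized documents."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    return {
        doc_id: {
            term: tf * math.log(n_docs / df[term])
            for term, tf in Counter(tokens).items()
        }
        for doc_id, tokens in docs.items()
    }


weights = tf_idf({
    "d1": ["aspirin", "pain", "pain"],
    "d2": ["pain", "fever"],
    "d3": ["pain", "cough"],
})
print(weights["d1"])
```

"pain" is frequent in d1 but appears in every document, so its weight is zero; "aspirin" appears once but only in d1, so it receives the higher weight.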
95
Citation Indexing
Citation databases list all other articles that cite a specific article in journals.
  • Index articles that cite other articles
  • Performed at content item level
  • Goal is to designate related or important content items
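Conceptually, a citation index inverts each article's reference list, as in this short Python sketch. The article labels and the function name `cited_by` are placeholders.

```python
from collections import defaultdict


def cited_by(references):
    """Invert {citing article: [cited articles]} into {cited article: {citing articles}}."""
    index = defaultdict(set)
    for citing, cited_list in references.items():
        for cited in cited_list:
            index[cited].add(citing)
    return index


# Hypothetical reference lists: A cites C; B cites C and A.
citations = cited_by({"A": ["C"], "B": ["C", "A"]})
print(sorted(citations["C"]))
```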
96
Limitations of Human Indexing
  • Inconsistency
      • Frequent Duplications
  • Inadequate Indexing Vocabulary
      • Up to 25% of all concepts NOT in MeSH
      • Ambiguities and other naming problems
97
Limitations of Word Indexing: Synonymy
Different words have the same meaning.
98
Limitations of Word Indexing: Polysemy
The same word may have different meanings or senses.
99
Limitations of Word Indexing: Content
Words in a document might not reflect its focus.
100
Limitations of Word Indexing: Context
Words take on meaning based on the words around them
101
Limitations of Word Indexing: Morphology
Words can have suffixes that do not change the underlying meaning
102
Limitations of Word Indexing: Granularity
Queries and documents may describe concepts at different levels of a hierarchy.
103
IR System Evaluation
  • Is the system used?
  • Are users satisfied?
  • Do they find relevant information?
  • Do they complete their desired task?
Physicians are the most studied group.
104
The Impact of IR on Physicians (4 Themes)
1) Recall - of forgotten information
2) Learning - of new information
3) Confirmation - of existing knowledge
4) Frustration - that the system used was not successful
Also:
1) Reassurance - that the system is available
2) Practice Improvement - of patient-physician relationship
105
Future challenges of IR Evaluation
- Must understand the tasks of the user and focus evaluation accordingly - Ultimate measure may be a health outcome
106
Personal Health Records (PHR)
An electronic application through which individuals can access, manage, and share their health information, and that of others for whom they are authorized, in a private, secure, and confidential environment.
107
Common Areas Included in PHR
- Allergies - Medications - Personal Medical History - Past and Future Doctor's visits - Vaccinations - Surgeries/Procedures - Past Diagnoses
108
Origin of PHR Data
- Doctors - Self-report - Health plans/Government insurance plans
109
Consumer Benefits of PHR
  • Insight into Medical Record
      • Help uncover errors
      • Help develop ownership of one's own health
      • Improves care when changing providers
  • Improves Convenience
      • Referrals
      • Appointments
  • Health education personalized to patient
110
Proprietary State of PHR Today
- Storage repositories for medical information - Not well-defined across the industry - Many systems that are institution-specific - Information does not transfer well to other institutions
111
Future of Connectivity: Patient
- Secure communication between patient and provider - Encompasses medical records from multiple providers - Direct connectivity to Biomonitors - Responsive to varying levels of health literacy, self-efficacy, and tech fluency - High level of individually customizable security
112
Future of Connectivity: Provider
- Increased frequency of data collection from patients - Understanding notes/results from other providers - Reduced cost for chronic disease management
113
Personally Controlled Health Records (PCHR)
  • Subset of PHR
  • Enables a patient to assemble, maintain, and manage a secure copy of their medical data
  • Designed on the principle that patients should be allowed to own and manage copies of their own health records
114
Personal Internetworked Notary and Guardian (PING)
A system designed as a fully distributed electronic medical record in which patients have control over who can read, write, or modify components - Developed in 1998 - Renamed Indivo in 2006
115
Consumer Health Informatics (CHI)
Examines patient information from points of view such as health literacy, consumer knowledge, and education.
  • Intended to empower patients while giving them the knowledge they need to make their own health decisions
  • Couples the consumer's needs for information with their healthcare preferences to create a tailor-made medical experience
116
CHI defined by The National Center for Biomedicine
- Any tool or system primarily responsible for interacting with health information users or health information consumers. - Any tool into which a patient inputs their health information and receives a body of health information - Tool or system where information or other benefits may be used with the assistance of a healthcare professional, but not dependent on one.
117
Consumer Application
- Apps facilitating knowledge and understanding of disease management - Apps facilitating the knowledge of observations of daily living (ODL's) - Apps facilitating and promoting lifestyle management assistance - Apps facilitating patient health, preventative care and self-care/assisted care.
118
Self-Management Systems
  • Highly varied and usable on multiple platforms
  • Best when providing a timely response with information regarding the user's current state of health
119
Electronic PHR and Patient Portals
  • Contain an individual's health information, conforming to nationally recognized standards
  • Info can be pulled from multiple sources while managed and controlled by the user
  • Information stored can be:
      • Identifiers
      • Contact Info
      • Medication History
      • Allergies
      • Immunizations
120
Peer Interaction Systems
  • Can operate alone or as part of a set of applications
  • Use online forums or discussion groups to help patients communicate with others who have similar conditions
121
Public Health Informatics
The systematic application of information and computer science and technology to public health practice, research, and learning.
122
Work of Informatics in Public Health
- Formulate models for acquiring, representing, processing, displaying, or transmitting health information or knowledge. - Develop computer systems that use the models to deliver the information/knowledge - Install systems to support the models - Assess Outcomes regarding the effects to the overall health care system.
123
Ten Essential Services of Public Health (1-5)
1) Monitor the health of individuals in the community to identify community health problems
2) Diagnose and investigate community health problems and hazards
3) Inform, educate, and empower the community with respect to health issues
4) Mobilize community partnerships in identifying and solving community health problems
5) Develop policies and plans that support individual and community health efforts
124
Ten Essential Services of Public Health (6-10)
6) Enforce laws and rules that protect public health and ensure safety in accordance with these laws
7) Link individuals who have a need for community and personal health services to appropriate providers
8) Ensure a competent workforce for the provision of essential health services
9) Research new insights and innovate solutions to community health problems
10) Evaluate the effectiveness, accessibility, and quality of personal and population-based health services in a community
125
Epidemiology
The study of the prevalence and determinants of disability and disease in populations.
126
Core Functions of Public Health: Assessment
Tracking and monitoring the health status of populations; identifying and controlling disease outbreaks and epidemics.
127
Core Functions of Public Health: Policy Development
Utilizes the results of assessment activities and etiologic research in concert with local values and culture to recommend interventions and public policies that improve health status
128
Core Functions of Public Health: Assurance
The duty of public health agencies to assure their populations that services necessary to achieve agreed upon goals are met.
129
Public Health Surveillance
The ongoing collection, analysis, interpretation, and dissemination of data on health conditions and threats to health. - Represents one of the fundamental means by which priorities for public health action are set. - Data collected for the purpose of action.
130
Coordinated Function of Public Health Informatics: Detection and Monitoring
Support of disease and threat surveillance, national health status indicators
131
Coordinated Function of Public Health Informatics: Analysis
Facilitating real-time evaluation of live data feeds, turning data into information for people at all levels of public health.
132
Coordinated Function of Public Health Informatics: Information Resources/Knowledge Management
Reference information, distance learning, decision support
133
Coordinated Function of Public Health Informatics: Alerting and Communications
Transmission of emergency alerts, routine professional discussions, collaborative activities
134
Coordinated Function of Public Health Informatics: Response
Management support of recommendations, prophylaxis, vaccinations, etc
135
National Electronic Disease Surveillance System (NEDSS)
Major CDC initiative that addresses public health issues by promoting the use of data and information system standards to advance the development of efficient, integrated, and interoperable surveillance systems at federal, state, and local levels. -Designed to facilitate electronic transfer of appropriate information from clinical systems to public health systems, reduce provider burden in the provision of information, and enhance both the timeliness and quality of info.
136
Components of NEDSS
- Browser-based data entry - Person-centric - Case investigation capabilities - ELR messages can be received - Security to meet HIPAA standards
137
Geographic Information Systems (GIS)
- Support data warehouse capabilities - Optimized for retrieval from very large record databases - Can quickly cross-tabulate - Study seasonal and secular trends - Look for patterns by person, place, and time.
138
CDC's National Health and Nutrition Examination Survey (NHANES)
Used to assess the health and nutritional status of children and adults in the U.S.
  • Combines a home interview and a health exam in a mobile clinic
  • 50 years of experience conducting surveys using direct physical measures
139
Sample for NHANES
- Civilian, non-institutionalized household population - Residents of all states and D.C. - All ages - 5,000 individuals annually
140
Immunization Information Systems (IIS)
Confidential, population-based, computerized databases that record all immunization doses administered by participating providers to persons residing within a given geopolitical area. -Assists in designing and sustaining effective immunization strategies at the provider and program levels
141
Point-of-Care IIS
Can provide consolidated immunization histories for use by a vaccination provider in determining appropriate client vaccines.
142
Population Level IIS
Provides aggregate data on vaccinations for use in surveillance and program operations, and in guiding public health action with the goals of improving vaccination rates and reducing vaccine-preventable disease.