Information Extraction Flashcards
What is Information Extraction?
It is turning unstructured text into structured data, such as a relational database or set of extracted tuples
What is Relation Extraction (RE)?
Relation extraction finds semantic relations among entities in text, such as parent-child, part-whole, or geospatial relations
What can be used to encode relational information?
Knowledge graphs
What is Event Extraction?
It finds events in which entities participate
What is Temporal Extraction?
It is finding times, dates and durations
What is Knowledge Base Population (KBP)?
It is the task of populating knowledge bases from unstructured text using extracted information
What datasets exist to perform relational extraction?
The ACE relation extraction dataset
Wikipedia info boxes (DBpedia and Wikidata)
WordNet
TACRED dataset
SemEval
What are some relations in RE?
Examples include family relations (e.g. parent-child), part-whole relations, geospatial relations (e.g. located-in), and employment or affiliation relations
What is pattern-based RE?
It is relation extraction using hand-crafted lexico-syntactic patterns. The patterns are tailored to a specific domain's lexicon, so they only work in that domain. They achieve high precision but low recall, and are expensive to create
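As a minimal sketch, a hand-crafted lexico-syntactic pattern might look like the following. The pattern, relation name, and function name are all illustrative, not from any particular system:

```python
import re

# A hand-crafted lexico-syntactic pattern for a hypothetical
# "located-in" relation: "<CITY>, the capital of <COUNTRY>".
PATTERN = re.compile(r"(\w+), the capital of (\w+)")

def extract_located_in(sentence):
    """Return (relation, e1, e2) tuples matched by the pattern."""
    return [("located-in", m.group(1), m.group(2))
            for m in PATTERN.finditer(sentence)]
```

Note the trade-off the flashcard describes: when the pattern fires it is almost always right (high precision), but it misses every other phrasing of the same relation (low recall).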
What is Supervised RE?
It is supervised relation extraction trained on an annotated corpus: the model learns the patterns in the corpus so it can perform extraction on new text. It works by finding pairs of named entities in a sentence and classifying the relation between each pair
What is the input and output for a supervised RE model?
The input X is a feature set for the entity pair, and the output Y is a prediction of the relation for the provided pair.
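A sketch of what one (X, Y) training example could look like; the feature names are illustrative and the label follows the TACRED naming convention:

```python
# Hypothetical feature set X for one entity pair in the sentence
# "Tim Cook is the CEO of Apple." -- feature names are illustrative.
X = {
    "subj_ner": "PERSON",        # NER tag of the subject entity
    "obj_ner": "ORG",            # NER tag of the object entity
    "words_between": ["is", "the", "CEO", "of"],
    "subj_before_obj": True,     # word order of the pair
}
Y = "per:employee_of"            # relation label (TACRED-style)
```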
What type of classifier can be used in a RE model?
It can be logistic regression, random forests, or an RNN, but in this course we use a Transformer model
Why do we use a transformer model with RE?
Self-attention works well for this type of problem because it can learn which parts of the sentence to focus on for a given entity pair
What are some techniques to improve RE models?
Replace SUBJ and OBJ with NER tags to avoid overfitting to lexical terms
Use RoBERTa or SpanBERT pre-trained embeddings instead of vanilla BERT, as their pre-training uses single contiguous spans of text rather than sentence pairs joined by a separator token
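The entity-masking trick from the first technique above can be sketched as a small helper; the tag format and function name are illustrative:

```python
def mask_entities(tokens, subj_span, obj_span, subj_ner, obj_ner):
    """Replace the subject/object tokens with their NER tags so the
    model cannot overfit to specific entity strings.
    Spans are half-open (start, end) token index ranges."""
    out = list(tokens)
    # Replace the later span first so the earlier indices stay valid.
    spans = sorted(
        [(subj_span, f"[SUBJ-{subj_ner}]"), (obj_span, f"[OBJ-{obj_ner}]")],
        key=lambda p: p[0][0], reverse=True)
    for (start, end), tag in spans:
        out[start:end] = [tag]
    return out
```

For example, "Tim Cook works at Apple" with a PERSON subject and ORG object becomes ["[SUBJ-PERSON]", "works", "at", "[OBJ-ORG]"].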
Why do NEs tend to overfit in deep learning RE models?
RE labelled datasets are small, so they cannot contain examples of every possible NE phrase; the model is therefore likely to overfit to the specific entities seen in the training set
What is a semi-supervised RE method?
We can use a semi-supervised RE approach with bootstrapping
What is bootstrapping?
It is where we have a small, high-quality, hand-crafted set of seed tuples for the relations, in the form (relation, e1, e2)
The algorithm finds sentences containing instances of the seed tuples, extracts patterns from them, and uses those patterns to find new seed tuples
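The loop above can be sketched as follows. This is a deliberately naive toy (string matching instead of real parsing; all names illustrative), which also shows why unfiltered patterns invite semantic drift:

```python
def bootstrap(corpus, seeds, n_iters=2):
    """Minimal bootstrapping sketch. seeds is a set of
    (relation, e1, e2) tuples. Each iteration finds sentences containing
    a seed pair, takes the text between the two entities as a new
    pattern, then applies every pattern to harvest new tuples. Real
    systems score and filter patterns; this sketch does not, which is
    exactly how semantic drift creeps in."""
    seeds = set(seeds)
    patterns = set()
    for _ in range(n_iters):
        # Step 1: extract patterns from sentences matching known seeds.
        for rel, e1, e2 in list(seeds):
            for sent in corpus:
                if e1 in sent and e2 in sent and sent.index(e1) < sent.index(e2):
                    between = sent.split(e1, 1)[1].split(e2, 1)[0]
                    patterns.add((rel, between))
        # Step 2: apply patterns to the corpus to find new seed tuples.
        for rel, pat in patterns:
            if not pat.strip():
                continue
            for sent in corpus:
                left, hit, right = sent.partition(pat)
                if hit and left.strip() and right.strip():
                    e1 = left.strip().split()[-1]
                    e2 = right.strip().split()[0].rstrip(".,")
                    seeds.add((rel, e1, e2))
    return seeds, patterns
```

Given the seed ("capital-of", "Paris", "France") and a sentence about Berlin and Germany with the same phrasing, the loop learns the pattern " is the capital of " and harvests the new tuple.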
What happens when we run many iterations of bootstrapping?
We get semantic drift: the extracted patterns gradually start matching other kinds of relations, so the tuple set drifts away from the original target relation
What are some methods to reduce semantic drift?
Apply a confidence threshold to the extraction patterns to improve quality of tuples
Limit the dependency graph walk for new tuples
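The confidence-threshold idea can be sketched as a simple filter; the confidence measure (fraction of a pattern's matches that agree with known seed tuples) and the names here are illustrative:

```python
def filter_patterns(pattern_stats, threshold=0.8):
    """Keep only extraction patterns whose confidence exceeds a
    threshold. pattern_stats maps each pattern string to a pair
    (matches_agreeing_with_seeds, total_matches)."""
    return {pat for pat, (hits, total) in pattern_stats.items()
            if total and hits / total >= threshold}
```

A pattern that matches seed tuples 9 times out of 10 survives the filter; one that agrees only 1 time in 10 is dropped, which keeps low-quality patterns from feeding the next bootstrapping iteration.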
What is the distant supervision method for RE?
It is the use of a knowledge-base such as DBpedia as a source of seed tuples (r, e1, e2)
What does using a knowledge base avoid?
It avoids the semantic drift problems of the bootstrapping approach
What are the steps taken with Distant Supervision?
Start with a text corpus, run a NER tagger over it, match the tagged entities against the knowledge base, and look up the relation between them to form a seed tuple. Matches can then be added to the training set as a feature set together with their occurrence frequency
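The matching step can be sketched as follows (plain substring matching stands in for a real NER tagger; all names are illustrative):

```python
from collections import Counter

def distant_supervision(corpus, kb):
    """Distant-supervision labeling sketch. kb is a set of
    (relation, e1, e2) tuples from a knowledge base such as DBpedia.
    Any sentence containing both entities of a KB tuple is (noisily)
    labeled with that relation; the occurrence count can serve as a
    feature. The noise source is visible here: co-occurrence of e1 and
    e2 does not guarantee the sentence expresses the relation."""
    training = Counter()
    for rel, e1, e2 in kb:
        for sent in corpus:
            if e1 in sent and e2 in sent:
                training[(rel, e1, e2)] += 1
    return training
```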
What is an issue with using distant supervision?
The automatically generated training set is very large but very noisy, so the resulting model has low precision: a sentence containing both entities does not necessarily express the relation
What can be done to reduce noise during distant supervision RE?
GAN-based denoising or incremental training approaches can be used