week 4 - history Flashcards

(44 cards)

1
Q

What is Digital History?

A

Using digital tools to study, analyze, and present the past (e.g., text mining, OCR, network analysis).

2
Q

What was Cliometrics (1960s)?

A

Early quantitative history using computers for statistical analysis.

3
Q

What are OCR and HTR used for?

A

OCR (Optical Character Recognition) → Reads printed texts.

HTR (Handwritten Text Recognition) → Reads handwritten documents using AI.

4
Q

What does NER (Named Entity Recognition) do?

A

Finds and labels names, dates, and places in historical texts.

5
Q

What is Topic Modeling?

A

A technique that reveals hidden themes in large text datasets.

6
Q

What is the Semantic Web in Digital History?

A

Connects data across sources using shared metadata standards.

7
Q

What is Symbolic AI?

A

AI based on rules and logic, used for knowledge graphs and formal reasoning.

8
Q

What are the risks of AI in historical research?

A

Bias in digitized sources

Overreliance on searchable text

Loss of context and nuance

9
Q

What did Romein et al. (2020) argue about Digital History?

A

Digital tools are powerful, but need to be used with critical interpretation and historical awareness.

10
Q

What does the DAS (2022) chapter highlight?

A

AI can now extract and structure data from historical documents, but may reproduce bias if not trained carefully.

11
Q

What was Yann Ryan’s key message?

A

AI helps manage massive digitized archives, but we must avoid assuming AI = objective truth.

12
Q

What are knowledge graphs?

A

Networks of historical entities and concepts connected through relationships, often shown as a visual web.

13
Q

What is the main goal of Digital History?
A) Create fictional historical narratives using AI
B) Replace historians with machines
C) Use digital tools to study, analyze, and present the past
D) Digitize newspapers for modern journalism

A

C

14
Q

What does OCR stand for and what is it used for?
A) Optical Caption Reader – summarizes videos
B) Optical Character Recognition – converts printed text into digital text
C) Object Content Recognition – labels images
D) Original Content Retention – preserves manuscripts

A

B

15
Q

What innovation did the Annales School introduce to historical research?
A) Use of symbolic AI for interpretation
B) A focus on short-term events
C) Emphasis on long-term structural trends (Longue Durée)
D) AI-generated storytelling

A

C

16
Q

What was an early example of digital history work in the 1990s?
A) ImageNet
B) Valley of the Shadow
C) Rosetta Stone AI
D) The Asunder Proje

A

B

17
Q

What does Named Entity Recognition (NER) help identify in historical texts?
A) Moral lessons and themes
B) Emotions of historical figures
C) Names, places, and dates
D) Font styles and document layouts

A

C

18
Q

Which of the following is a key challenge in Digital History?
A) Lack of primary sources
B) Overreliance on digitized sources without critical context
C) Inability to access the internet
D) Shortage of historical scholars

19
Q

What does Topic Modeling do in historical text analysis?
A) Identifies errors in historical data
B) Predicts the future using past data
C) Uncovers hidden themes in large text datasets
D) Summarizes historical documents into one sentence

20
Q

What is a Knowledge Graph?
A) A visual family tree
B) A timeline of historical events
C) A database that connects concepts through relationships
D) A software for drawing maps

21
Q

What does the Semantic Web enable historians to do?
A) Compress large image datasets
B) Link historical data across sources using shared standards
C) Translate documents into emoji
D) Create fictional reconstructions of past events

22
Q

What is the risk of relying solely on keyword search in digital archives?
A) It increases accuracy
B) It ignores visual sources
C) It may miss important context or biases
D) It is too time-consuming

23
Q

Symbolic AI

A

Uses explicit rules and logic systems.

Also called “classical AI”.

24
Q

Sub-symbolic AI

A

Uses training data instead of explicit rules.

Learns patterns from data.

25
Pre-AI OCR
Uses a pre-programmed set of rules, e.g., "a vertical line with two diagonals" to describe a K.
26
AI OCR
The computer is given many examples ("training data") in different fonts and styles so that it learns to recognize a "K".
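A minimal sketch of the "learn from examples" idea, assuming tiny 3x3 black-and-white bitmaps stand in for scanned letters. The bitmaps, labels, and function names are invented for illustration, and the nearest-example comparison is a simple stand-in for the neural networks real OCR systems use.

```python
# Toy illustration: recognise a letter by comparing it with labelled examples
# instead of hand-writing a rule for each shape.
# All bitmaps and names here are hypothetical.

TRAINING_DATA = [
    # (3x3 bitmap flattened row by row, label)
    ((1, 0, 1,
      1, 1, 0,
      1, 0, 1), "K"),
    ((1, 0, 0,
      1, 0, 0,
      1, 1, 1), "L"),
    ((1, 1, 1,
      0, 1, 0,
      0, 1, 0), "T"),
]


def distance(a, b):
    """Count the pixels where two bitmaps differ."""
    return sum(x != y for x, y in zip(a, b))


def recognise(bitmap):
    """Return the label of the training example closest to the bitmap."""
    _, label = min(TRAINING_DATA, key=lambda ex: distance(ex[0], bitmap))
    return label


# A slightly "smudged" K: one pixel differs from the training example.
noisy_k = (1, 0, 1,
           1, 1, 0,
           1, 0, 0)
print(recognise(noisy_k))
```

Adding more examples per letter (other fonts, other smudges) improves the match without anyone writing new rules, which is the point of the training-data approach.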
27
OCR with neural networks
Neural network OCR looks at pictures of letters, finds patterns with CNNs, and reads the letters in order using RNNs to turn the image into text.
28
CNN (Convolutional Neural Network):
A type of neural network that looks at images and finds important shapes and patterns, like edges or letters.
29
RNN (Recurrent Neural Network)
A neural network that reads data in order, like letters in a word, to understand the sequence and meaning.
30
What is training data also called?
Ground truth (the correct text used to train the AI).
31
Handwritten Text Recognition (HTR)
HTR uses neural networks to read handwritten text.
32
What possibilities does HTR open up?
It gives easy access to old handwritten archives. It helps quickly explore and analyze large amounts of text. It supports research across different fields. It allows using text data for machine learning and data mining.
33
What are some risks of HTR?
We focus more on easy-to-get digital texts and ignore harder-to-find ones, which can keep one-sided narratives dominant. OCR/HTR is not always accurate and can introduce mistakes. People rely too much on keyword searches alone. But AI can help fix these problems.
34
What else can AI do beyond specific keyword searches?
Topic modelling, named entity recognition (NER), and word embeddings.
35
Topic modelling
Looks through lots of text to find groups of related ideas or themes that might not be obvious, like finding hidden stories or topics. It groups letters by subject, like “family stuff” or “travel,” so you don’t have to read everything.
36
NER
A tool that finds the names of people, places, or things in a text, even if they are not famous. It can help spot people who are usually overlooked or ignored.
37
Word embeddings
Ways for AI to capture the meaning and style of words by comparing them. This helps show how language is used differently or similarly, such as spotting different opinions or ways of thinking.
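A minimal sketch of how comparing word embeddings works, assuming each word is represented as a short list of numbers (a vector). The three-number vectors below are made up for illustration; real embeddings are learned from large text collections and have many more dimensions. Cosine similarity is one common way to compare them.

```python
import math

# Hypothetical vectors, invented for this example.
EMBEDDINGS = {
    "king":   (0.9, 0.8, 0.1),
    "queen":  (0.9, 0.7, 0.2),
    "turnip": (0.1, 0.0, 0.9),
}


def cosine_similarity(a, b):
    """Return a score near 1.0 for vectors pointing the same way, near 0.0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Words used in similar ways get a higher score than unrelated words.
print(cosine_similarity(EMBEDDINGS["king"], EMBEDDINGS["queen"]))
print(cosine_similarity(EMBEDDINGS["king"], EMBEDDINGS["turnip"]))
```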
38
Why are these tools helpful and important?
They help us understand big collections of texts faster and better. They find hidden stories, important people, and how language changes over time, things we might miss if we just search for words. This makes research smarter and more complete.
39
From searchable to researchable
Moving beyond just being able to find words in old documents (searchable) to being able to really study and analyze them with AI tools (researchable).
40
From searchable to researchable: steps
1. Historical documents
2. Ground truth = correct text used to train the AI to read handwriting
3. Loghi HTR = tool that uses AI to read and transcribe handwritten documents
4. PageXML files = store both the image of the page and the text layout
41
Entity Recognition and Linking?
Finding names of people, places, or things in text (entity recognition) and connecting them to real-world info or databases (linking).
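A minimal sketch of the two steps, assuming a small lookup list (a gazetteer) of known names. The person, the sentence, and the `example:` identifiers are all hypothetical, and real tools recognise names with trained models rather than a fixed list.

```python
# Hypothetical lookup list: name -> (entity type, identifier in some database).
GAZETTEER = {
    "Anna de Vries": ("PERSON", "example:person/1"),
    "Batavia": ("PLACE", "example:place/1"),
    "Amsterdam": ("PLACE", "example:place/2"),
}


def recognise_and_link(text):
    """Find known names in the text (recognition) and attach their identifiers (linking)."""
    found = []
    for name, (entity_type, identifier) in GAZETTEER.items():
        position = text.find(name)
        if position != -1:
            found.append((position, name, entity_type, identifier))
    found.sort()  # report entities in the order they appear in the text
    return [(name, etype, ident) for _, name, etype, ident in found]


sentence = "Anna de Vries wrote from Batavia to her brother in Amsterdam."
for entity in recognise_and_link(sentence):
    print(entity)
```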
42
Semantic Contextualisation
Understanding the deeper meaning of words in their context — not just reading them, but knowing what they really refer to or imply.
43
Semantic Web
A web of linked data: a way of making the internet smarter by helping computers understand the meaning of information, not just read the words.
44
Triples
Subject – Predicate – Object. They help computers understand relationships between things, like building blocks of meaning.
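A minimal sketch of triples as data, assuming plain Python tuples stand in for a real linked-data store. The people, predicates, and the `query` helper are hypothetical and only illustrate the subject, predicate, object pattern.

```python
# Each triple is (subject, predicate, object). All facts here are invented.
TRIPLES = [
    ("Anna de Vries", "wrote_letter_from", "Batavia"),
    ("Anna de Vries", "sibling_of", "Jan de Vries"),
    ("Batavia", "is_a", "Place"),
]


def query(subject=None, predicate=None, obj=None):
    """Return every triple that matches the parts given; None means 'anything'."""
    return [
        t for t in TRIPLES
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]


# Everything recorded about one person:
print(query(subject="Anna de Vries"))
# Everything that is classified as something:
print(query(predicate="is_a"))
```

Because every fact has the same three-part shape, facts from different sources can be combined into one list and queried together.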