week 11 - visual digital turn Flashcards
(30 cards)
Quantification & The
Humanities
* Mainly limited by:
▪ Compute
▪ Digitization
Gold and Klein define DH as the following
digital archives, quantitative analysis, and tool-building
projects
AI
Consists of systems that can handle given tasks on their
own
- e.g. search algorithms, rule-based systems
ML
Computer algorithms that automate actions without
explicit programming. They can learn and improve.
- e.g. clustering algorithms, K-Means, Gradient Boosting, PCA
DEEP LEARNING
Deep Learning is part of ML and makes use of ‘Deep’
Neural Networks
- e.g. Convolutional Neural Networks, Recurrent Neural Networks,
LLMs (such as ChatGPT)
Supervised Machine Learning
Model learns by example
* Human-labelled datasets
* Model learns based on examples
* Measures how ‘wrong’ it is and parameters get adjusted.
* Parameters start off randomly.
Unsupervised Machine
Learning
Does not rely on labelled data
* Aims to identify structures or patterns in data.
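A minimal sketch of finding structure without labels, using a simplified one-dimensional K-Means. The function name `kmeans_1d`, the sample points, and the starting centers are illustrative assumptions, not part of the course material.

```python
def kmeans_1d(points, centers, iterations=10):
    """Group 1-D points around the given starting centers (no labels used)."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its old center).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Illustrative data: two obvious clumps, but nobody told the algorithm that.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, groups = kmeans_1d(points, centers=[0.0, 5.0])
print(centers)
print(groups)
```

The algorithm ends with one group around 1 and one around 9.5, even though no point was ever labelled.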
Training a Network - STEPS
- Data goes through the model.
- Calculate how ‘wrong’ the model is from the labels.
- Calculate what parts in a network need to be adjusted to
perform slightly better.
- Change the parameters.
- Repeat until model no longer improves.
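The steps above can be sketched with a tiny model that has only two parameters, `w` and `b`. This is an illustration under assumptions: the function `train`, the toy examples, the learning rate, and the fixed number of repeats are all made up here, and a real network has far more parameters.

```python
import random


def train(examples, learning_rate=0.05, epochs=1000, seed=0):
    """Fit y = w * x + b to labelled (x, y) examples."""
    rng = random.Random(seed)
    # Parameters start off randomly.
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    n = len(examples)
    # Simplification: repeat a fixed number of times instead of
    # checking whether the model still improves.
    for _ in range(epochs):
        # 1. Data goes through the model.
        predictions = [w * x + b for x, _ in examples]
        # 2. Calculate how 'wrong' the model is compared with the labels.
        errors = [pred - y for pred, (_, y) in zip(predictions, examples)]
        # 3. Calculate how each parameter should be adjusted
        #    (gradient of the mean squared error).
        grad_w = 2 * sum(e * x for e, (x, _) in zip(errors, examples)) / n
        grad_b = 2 * sum(errors) / n
        # 4. Change the parameters slightly.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy labelled data generated from y = 2x + 1.
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train(examples)
print(round(w, 3), round(b, 3))
```

After training, `w` ends up close to 2 and `b` close to 1, the values that were used to create the labels.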
Gradient Descent
Finds the ‘direction’ in which to change the parameters to reduce ‘wrongness’.
It helps models learn by improving in small steps.
It’s used in training almost all machine learning models (like ChatGPT, image recognition, etc.).
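One common way to write a single gradient descent step is shown below. The symbols are notation assumed here, not taken from the cards: $\theta_t$ stands for the parameters at step $t$, $L$ for the ‘wrongness’ (loss), $\nabla_{\theta} L$ for the direction in which the loss increases fastest, and $\eta$ for a small step size (learning rate).

```latex
\theta_{t+1} = \theta_t - \eta \, \nabla_{\theta} L(\theta_t)
```

The minus sign is what makes the parameters move in the direction that reduces the loss.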
CNNs (Convolutional Neural Networks):
Specialized for image processing
Identify features like edges, shapes, textures
Use layers of convolutions to detect increasingly complex patterns
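A minimal sketch of what one convolution does, using a hand-written kernel that responds to vertical edges. The function `convolve2d`, the toy image, and the kernel values are illustrative assumptions; in a real CNN the kernel values are learned, and, as in most deep learning libraries, the kernel is not flipped (strictly speaking a cross-correlation).

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Dark left half (0) and bright right half (1): a vertical edge in the middle.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Responds where brightness increases from left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The output is high only in the middle column, exactly where the edge sits. Later layers combine many such feature maps to detect more complex patterns.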
Wevers and Smits
Focus on the potential of Computer Vision in the
Humanities.
* Show three different ways to use CNNs
Wevers and Smits highlight three simple uses of CNNs in the humanities
Detecting medium (1)
Clustering with CNNs (2)
Your own classifier (3)
- Detecting Medium
They used CNNs to identify the type of medium in an image, such as a photo, a drawing, or a printed ad. This helps researchers sort and study different kinds of visual content in archives.
- Clustering with CNNs
They grouped images that look visually similar (like sorting ads with the same layout or style) by using the features from a pre-trained CNN. This helps spot visual trends and design patterns in things like advertisements.
* Used a pre-trained CNN
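A minimal sketch of the idea behind grouping by CNN features: each image is turned into a list of numbers, and images whose lists point in a similar direction are treated as visually similar. This shows a nearest-neighbour similarity lookup, a building block rather than a full clustering method and not necessarily the authors' procedure. The feature values and image names are made up; real CNN features contain hundreds or thousands of numbers.

```python
import math


def cosine_similarity(a, b):
    """Return 1.0 for vectors pointing the same way, near 0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, features):
    """Find the image whose feature vector is closest to the query image's."""
    others = [name for name in features if name != query]
    return max(others, key=lambda name: cosine_similarity(features[query], features[name]))


# Hypothetical feature vectors, standing in for the output of a pre-trained CNN.
features = {
    "car_ad_1": [0.9, 0.1, 0.0],
    "car_ad_2": [0.8, 0.2, 0.1],
    "soap_ad_1": [0.1, 0.9, 0.3],
}
print(most_similar("car_ad_1", features))  # car_ad_2
```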
- Your Own Classifier
They trained a custom CNN to recognize recurring image types in newspapers (like weather icons or political sketches). With a relatively small number of labelled examples (about 500), the model got good at picking out these common image categories.
What has traditionally been the main focus of Digital Humanities?
A) Visual content analysis
B) Text-based analysis using OCR and computational tools
C) Audio and video processing
D) Network security
b
What is meant by the “visual turn” in Digital Humanities?
A) Moving from text analysis to also include visual data
B) Focusing only on photographs in archives
C) Using VR technology for humanities research
D) Ignoring textual data completely
a
Which type of machine learning uses labeled data to train models?
A) Supervised Learning
B) Unsupervised Learning
C) Reinforcement Learning
D) Deep Reinforcement Learning
a
What is a Convolutional Neural Network (CNN) primarily used for?
A) Text translation
B) Image processing and pattern recognition
C) Audio synthesis
D) Database management
b
In Wevers & Smits’ study, what shift did the CNN detect in newspaper images after 1900?
A) More illustrations than photos
B) A surge in photo usage, overtaking illustrations by 1925
C) No change in image types
D) A decline in image usage altogether
b
What does clustering with CNNs involve?
A) Grouping visually similar images based on extracted features
B) Labeling images with text metadata
C) Translating images into text descriptions
D) Removing noise from images
a
Which dataset did Wevers & Smits use to analyze advertisements?
A) CHRONIC
B) SIAMESET
C) ImageNet
D) MNIST
b
What is a limitation of clustering CNNs mentioned in the study?
A) It can’t process color images
B) It struggles with abstract or text-heavy ads
C) It requires millions of labeled images
D) It only works on handwritten documents
b
How many labeled images did Wevers & Smits use to retrain their custom classifier?
A) About 100
B) About 500
C) About 10,000
D) Over 100,000
b