Week 5: Training Neural Networks Flashcards
(28 cards)
What is required to train on a dataset?
Input (x) and ground truth (y)
Explain how input and ground truth are used in training.
- Ground truth is provided by humans
- Provide corrupted input
- Compare the prediction from the neural network (y hat) to the ground truth
- Based on the difference between the prediction and the ground truth, you get an error
- Based on the error, the weights and biases are adjusted to bring the prediction closer to the ground truth
Ground truth
verified, true data used for training
Why do we need a model if you already have the ground truth?
The dataset you input is only a fraction of the data in the real world, where you don’t know the ground truth. We train the neural network on the data we do have so we can apply it to similar scenarios where we don’t know the answer.
You can’t use neural network training for one task and apply it to a different task
E.g. you can’t train a network to predict the cost of a house in Los Angeles and apply it to New York
Is the ground truth supervised or unsupervised?
Supervised - you label each object in an image
Supervised data
annotated data (done by humans)
One hot encoding
Turns every label into a unique binary vector
The length is equal to the number of unique categories in the data
Each vector contains a single 1 at the position corresponding to the category
0s everywhere else
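The definition above can be sketched in a few lines of Python (the category names here are made up for illustration):

```python
# Minimal sketch of one-hot encoding (illustrative category names)
categories = ["cat", "dog", "bird"]

def one_hot(label, categories):
    """Return a binary vector: a single 1 at the label's position, 0s elsewhere."""
    vec = [0] * len(categories)       # length = number of unique categories
    vec[categories.index(label)] = 1  # 1 at the position of the category
    return vec

print(one_hot("dog", categories))  # [0, 1, 0]
```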
Explain how one-hot encoding is used for training.
You compare the one-hot ground truth to the predicted activation outputs
If the one-hot ground truth doesn’t match the activation outputs, there is an error
What is a softmax function?
It turns all of the output numbers into probabilities
The largest activation in the output becomes the highest probability
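A minimal softmax sketch in Python (the activation values below are made up):

```python
import math

def softmax(outputs):
    """Turn raw output activations into probabilities that sum to 1."""
    exps = [math.exp(o) for o in outputs]  # exponentiate each activation
    total = sum(exps)
    return [e / total for e in exps]

# The largest activation (2.0) gets the highest probability
probs = softmax([2.0, 1.0, 0.1])
```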
Why is the error squared when making a prediction?
You don’t want the error to be a negative number
What does the loss number tell you?
an estimation of how your neural network is currently performing
How do you calculate the loss?
- (Predicted output - ground truth output)^2
- Add up all of the individual training values
- Multiply by (1/n)
*n = number of examples in the training dataset
refer to notes on page 19
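The steps above amount to mean squared error; a minimal sketch (the example values are made up):

```python
def mse_loss(predictions, ground_truth):
    """Loss = (1/n) * sum of (predicted - ground truth)^2."""
    n = len(predictions)  # n = number of examples in the training dataset
    return sum((p - t) ** 2 for p, t in zip(predictions, ground_truth)) / n

# ((0.9 - 1.0)^2 + (0.2 - 0.0)^2) / 2 = (0.01 + 0.04) / 2 = 0.025
loss = mse_loss([0.9, 0.2], [1.0, 0.0])
```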
Explain training loss when you first begin training a neural network
At the beginning, the training loss will be very high
Begins by plugging in random weights, but the loss gradually decreases over time
On a plot, x represents how many times you change the weights (i.e. steps), y is the training loss (i.e. the error)
Training loss will never reach 0
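A toy training loop illustrating this curve, assuming a single weight fit by gradient descent on the squared-error loss (the data and learning rate are invented for illustration):

```python
import random

# Toy data with true relationship y = 2x
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

random.seed(0)
w = random.uniform(-5, 5)  # start from a random weight -> high initial loss
lr = 0.02
losses = []
for step in range(100):
    preds = [w * x for x in xs]
    loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)
    losses.append(loss)
    # gradient of the mean squared error with respect to w
    grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
    w -= lr * grad  # adjust the weight based on the error

# losses[0] is high; losses[-1] is much smaller, but never exactly 0
```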
According to historian Lisa Gitelman…
every field has its own definition of what qualifies as data
Explain the CU Colorado Springs AI controversy
- A professor installed a camera to collect images of students and faculty for a training dataset
- Did so without their permission
- Front-facing images are too easy; he wanted people who were looking away
- The state senate banned facial recognition technology in Colorado
Explain how the Duke research used facial recognition.
- Duke research used multiple cameras to track the same person across campus
- A number is associated with each person, making it possible to track an individual
- You don’t need to add a neuron for every person. Instead, two neurons identify whether it is the same person or not; it becomes a yes-or-no question
OpenAI lawsuit with the NYT
OpenAI used articles from the NYT to train its models, despite IP laws
With AI, what qualifies as fair use is unsettled; sometimes, if a use contributes to society, it is allowed
How is DeepSeek trained?
From data generated by other AI models
Explain CAPTCHA
A Turing test used to prove that you are human by clicking on the objects it tells you to (usually cars and traffic lights, because they want to collect data to train self-driving cars)
Labeling is expensive, but this provides a cheap and efficient way to label
What are some ethical issues with collecting data to train AI?
- Consent (e.g. using artists’ work for training)
- Cheap labor (supervised learning requires cheap labor)
e.g. OpenAI paid Kenyan workers less than $2 per hour to make ChatGPT less toxic
Components of ImageNet
One component: not possible without iPhone data
Second component: the labeling aspect
(Paying students to label was too expensive, so the job was posted on an Amazon platform that enabled people all around the world to sign up; outsourcing the labor made it cheaper)
Limitation of loss minimization
This model seems good in theory but can cause a lot of problems
Google’s racist mistake of labeling two Black people as “gorillas”
How does the example of the gorilla miscategorization highlight an issue with training?
The issue with this model is that all errors have the same weight (either 0 or 1)
It treats all miscategorizations the same
(e.g. it treats mislabeling a dog breed and labeling a human as a gorilla in the same way)
Instead, we should change what we place the most importance on when evaluating an error (e.g. prioritizing accurately categorizing humans); the model will then prioritize reducing the error that matters most (e.g. labeling a human as a gorilla)
But the model will be less motivated to correct errors on things that are less important
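One way to act on this idea is a class-weighted loss, sketched below (the weight values are hypothetical, not from the lecture):

```python
def weighted_loss(predictions, ground_truth, weights):
    """Squared error where each example is scaled by how costly its mistake is."""
    n = len(predictions)
    return sum(w * (p - t) ** 2
               for p, t, w in zip(predictions, ground_truth, weights)) / n

# Hypothetical weights: mislabeling a human (10.0) costs far more than a dog breed (1.0)
loss = weighted_loss([0.0, 0.0], [1.0, 1.0], [1.0, 10.0])  # (1 + 10) / 2 = 5.5
```

With these weights, the same raw error contributes ten times as much loss for the human-labeling mistake, so training pressure concentrates there.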
Loss meaning
the error - to what degree does the prediction differ from the ground truth