Intro to ML Flashcards
(21 cards)
POC to Production Gap
Proof-of-concept to production
“ML model code is 5-10% of ML project code”
refer to the diagram in [D. Sculley et al., NIPS 2015: Hidden Technical Debt in Machine Learning Systems]
ML project lifecycle
“SDMD”
scoping (X->Y) -> data -> modeling -> deployment
Scoping:
* define project [X->Y]
Data:
* define data and establish baseline
* label and organize data
Modeling:
* select and train model
* perform error analysis
Deployment:
* deploy in production
* monitor & maintain system
How do research/academia and production teams differ in how they refine an ML model?
- code (algorithm/model)
- hyperparameters
- data
research/academia:
tend to hold data the same
optimize code and hyperparameters
production team:
tend to hold code the same
optimize data and hyperparameters
Edge devices [definition?]
Edge devices are pieces of equipment that transmit data between the local network and the cloud.
They translate between the protocols, or languages, used by local devices and the protocols used by the cloud, where the data is further processed.
What does MLOps stand for?
MLOps (Machine Learning Operations) is an emerging discipline comprising a set of tools and principles to support progress through the ML project lifecycle.
Concept drift vs. data drift
data drift
[X changes]
e.g. a politician suddenly becomes famous, so the input distribution shifts (a speech system hears that name far more often)
concept drift
[X -> Y] mapping changes
e.g. house sizes stay the same, but prices change (the same X now maps to a different Y)
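A minimal Python sketch of spotting data drift by comparing the live input distribution to a reference sample (feature name and numbers are illustrative, not from the course):

import numpy as np
from scipy.stats import ks_2samp

def input_drift_detected(reference_sample, live_sample, alpha=0.01):
    # flag drift when the live inputs differ significantly from the
    # training-time (reference) distribution
    statistic, p_value = ks_2samp(reference_sample, live_sample)
    return p_value < alpha

# e.g. avg image brightness at training time vs. in production
reference = np.random.normal(loc=0.55, scale=0.05, size=5000)
live = np.random.normal(loc=0.40, scale=0.05, size=500)  # darker images
print(input_drift_detected(reference, live))  # likely True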
real-time vs. batch
speech recognition -> real-time
hospital patient records -> batch
cloud vs. Edge/Browser
edge/browser -> good to have as a fallback, in case the internet is inaccessible or the service is shut down
checklist of things to consider when creating ML software
- real-time or batch
- cloud vs. edge/browser
- compute resources (CPU/GPU/memory)
- latency, throughput (QPS)
- logging
- security and privacy
Throughput (QPS)
Throughput (QPS) - queries per second: the number of requests that are successfully executed/serviced per unit of time. For example, if the throughput is 50/minute, the server successfully executes 50 requests per minute (accepted, processed, and responded to properly).
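A rough sketch of measuring throughput and average latency for a prediction function (predict is a placeholder for your model's inference call; names are illustrative):

import time

def measure_qps(predict, requests, duration_s=10.0):
    start = time.time()
    served, latencies = 0, []
    for request in requests:
        t0 = time.time()
        predict(request)
        latencies.append(time.time() - t0)
        served += 1
        if time.time() - start >= duration_s:
            break
    elapsed = time.time() - start
    return served / elapsed, sum(latencies) / len(latencies)

qps, avg_latency = measure_qps(lambda x: x * 2, range(100_000), duration_s=1.0)
print(f"throughput ~ {qps:.0f} QPS, avg latency ~ {avg_latency * 1000:.3f} ms")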
Common ML deployment cases
- New product/capability
- automate/assist with manual task
- replace previous ML system
Key ideas:
* Gradual ramp up with monitoring
* Rollback
rollback
if the new model doesn't work, go back to the previous working model
gradual ramp-up with monitoring
don't shift all traffic to the new model at once
start with a small fraction of traffic and then ramp up
shadow mode (deployment)
ML system shadows the human and runs in parallel.
ML system’s output not used for any decisions during this phase.
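A minimal sketch of shadow mode, assuming hypothetical ml_model and log_prediction helpers: the model runs in parallel, its output is only logged for later comparison, and the human decision is what gets acted on.

def handle_request(request, human_decision, ml_model, log_prediction):
    shadow_prediction = ml_model(request)
    # log both outputs so they can be compared offline later
    log_prediction(request=request, human=human_decision, shadow=shadow_prediction)
    # only the human decision drives downstream actions in this phase
    return human_decision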
canary deployment
- roll out to small fraction (say 5%) of traffic initially
- monitor system and ramp up traffic gradually
origin:
canary in a coal mine
which refers to how coal miners used canaries to detect gas leaks
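A toy sketch of canary routing (names are illustrative; real systems usually split traffic at the load balancer):

import random

CANARY_FRACTION = 0.05  # start with ~5% of traffic

def route_request(request, old_model, new_model):
    if random.random() < CANARY_FRACTION:
        return new_model(request)  # canary traffic, monitored closely
    return old_model(request)

# if monitoring looks healthy, raise CANARY_FRACTION gradually;
# if problems appear, set it back to 0 (rollback)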
Blue green deployment
blue version = old version
green version = new version
the router switches all traffic from the old (blue) version to the new (green) version at once
benefit:
* easy way to enable rollback
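A toy sketch of the blue/green switch (names are illustrative): both versions stay deployed, one router flag decides which serves traffic, so rollback is just flipping the flag back.

blue_green = {"active": "blue"}  # blue = old version, green = new version

def route(request, blue_model, green_model):
    model = green_model if blue_green["active"] == "green" else blue_model
    return model(request)

def switch_to_green():
    blue_green["active"] = "green"

def rollback_to_blue():
    blue_green["active"] = "blue"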
degrees of automation
human only -> shadow mode -> AI assistance -> partial automation (send to human if algorithm is not sure) -> full automation (only AI)
both AI assistance and partial automation are “human in the loop” deployments
human-in-the-loop deployments are common in factory settings
consumer internet software -> full automation is usually necessary
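A minimal sketch of partial automation (the threshold and helper names are assumptions): the model's answer is used only when it is confident enough, otherwise the case goes to a human.

CONFIDENCE_THRESHOLD = 0.90

def decide(request, model, send_to_human):
    label, confidence = model(request)  # assume model returns (label, confidence)
    if confidence >= CONFIDENCE_THRESHOLD:
        return label  # automate the easy, high-confidence cases
    return send_to_human(request)  # defer uncertain cases to a person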
monitoring: how to think about it
- brainstorm the things that could go wrong
- brainstorm a few statistics/metrics that will detect the problem
- it is ok to use many metrics initially and gradually remove the ones you find not useful
e.g.
software metrics | memory, compute, latency, throughput, server load
input metrics [x] | avg input length, avg input volume, num missing values, avg image brightness
output metrics [y] | # times “” (null) is returned, # times user redoes search, # times user switches to typing (gives up on your speech system), CTR
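A small sketch of computing a few of these input/output metrics from logged speech-search requests (field names are illustrative assumptions):

def compute_metrics(logged_requests):
    n = len(logged_requests)
    return {
        "avg_input_length": sum(r["audio_seconds"] for r in logged_requests) / n,
        "frac_null_output": sum(r["transcript"] == "" for r in logged_requests) / n,
        "frac_switched_to_typing": sum(r["user_typed_instead"] for r in logged_requests) / n,
        "click_through_rate": sum(r["clicked_result"] for r in logged_requests) / n,
    }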
iterative process for ML model deployment
ML model iteration:
ML model/data -> experiment -> error analysis -> [go back]
deployment iteration:
deployment/monitoring -> traffic -> performance analysis -> [go back]
techniques for monitoring
* set thresholds for alarms
* adapt metrics and thresholds over time
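A tiny sketch of threshold-based alarms over such metrics (threshold values are illustrative and should be adapted over time):

THRESHOLDS = {
    "frac_null_output": 0.05,         # alarm if >5% of queries return ""
    "frac_switched_to_typing": 0.10,  # alarm if >10% of users give up on speech
    "p95_latency_ms": 500,
}

def check_alarms(metrics):
    return [name for name, limit in THRESHOLDS.items() if metrics.get(name, 0) > limit]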
VAD?
Voice activity detection