Data Science & Math Flashcards

1
Q

What is data science?

A

The extraction of knowledge from data

2
Q

Data analytics vs. data science

A
  • Data analytics is about analyzing the data to draw insights (past data)
  • Data science is about data plus math and statistics to create predictions
3
Q

What can Machine Learning determine?

A
  1. Is this A or B? -> Classification
  2. Is this weird? -> Anomaly detection
  3. How much/how many? -> Regression analysis
  4. How is it organized? -> Unsupervised learning (e.g. clustering)
  5. What should I do next? -> Reinforcement learning
4
Q

Data scientist vs. data analyst vs. data engineer

A
  • A data scientist focuses on analyzing and interpreting data to find insights and patterns and to make predictions
  • A data analyst focuses on exploring and visualizing data to gain insights into past data
  • A data engineer focuses on managing and organizing data and on maintaining databases and data pipelines
5
Q

Can you prove a hypothesis?

A

No. You can never prove that a hypothesis is true; you can only reject it or fail to reject it.
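
To make "fail to reject" concrete, here is a minimal sketch of a one-sided test of whether a coin is fair. The function name `binomial_p_value`, the example counts, and the 0.05 significance level are illustrative assumptions, not part of the original card.

```python
from math import comb

def binomial_p_value(heads, flips, p_null=0.5):
    """One-sided p-value: P(X >= heads) if the null hypothesis is true."""
    return sum(
        comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)
        for k in range(heads, flips + 1)
    )

ALPHA = 0.05  # assumed significance level, chosen for illustration

for heads in (55, 60):
    p = binomial_p_value(heads, 100)
    # A large p-value never proves the coin is fair; it only means the
    # data are not surprising enough to reject that hypothesis.
    verdict = "reject" if p < ALPHA else "fail to reject"
    print(heads, round(p, 4), verdict)
```

Neither outcome proves anything: rejecting means the data would be unlikely if the null hypothesis were true, and failing to reject means they would not be.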

6
Q

What is PCA?

A

Principal Component Analysis: an unsupervised learning method that uses patterns present in high-dimensional data (data with many variables) to reduce the complexity of the data, by projecting it onto fewer dimensions, while retaining most of the information.
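
The idea can be sketched in plain Python by finding the single direction of largest variance with power iteration and projecting the data onto it. This is only an illustration under assumptions: the helper names `first_principal_component` and `project` are made up here, and real projects would normally use a linear algebra library instead.

```python
import math

def first_principal_component(rows, n_iter=200):
    """Return (means, direction) for the leading principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly applying cov pulls v toward the
    # eigenvector with the largest eigenvalue (the largest variance).
    v = [1.0 / (j + 1) for j in range(d)]  # arbitrary nonzero start vector
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance along the start vector; stop early
            break
        v = [x / norm for x in w]
    return means, v

def project(rows, means, direction):
    """Reduce each row to one number: its coordinate along the direction."""
    return [sum((r[j] - means[j]) * direction[j] for j in range(len(direction)))
            for r in rows]

# Two perfectly correlated variables: one dimension keeps all the information.
data = [(x, 2.0 * x) for x in range(10)]
means, direction = first_principal_component(data)
scores = project(data, means, direction)
print(direction)
```

In this toy data set the second variable is always twice the first, so the two columns collapse to one score per row without losing anything.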

7
Q

In a decision tree, what are the end nodes called?

A

Leaves.

8
Q

History of ML/AI

A
  • 1st generation: The backend - large datasets (fraud detection, search algorithms, SCM)
  • 2nd generation: The human side - data about humans (recommender systems, social media, commerce + ads)
  • 3rd generation: Modern machine learning - pattern recognition (speech recognition, computer vision, translation)
9
Q

What is pruning? (to prune = to trim back)

A

Pruning removes unnecessary splits. It turns strict, rigid decision boundaries in part of the tree into smoother ones that generalise better, and it reduces tree complexity, which is measured as the number of splits in the tree.

A simple yet highly effective pruning method is to go through each node in the tree and evaluate the effect of removing it on the cost function. If it doesn’t change much, then prune away!
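
The node-by-node check described above can be sketched as follows. Everything specific here is an assumption made for illustration: the tree is a nested dict, each node stores a fallback `value`, the cost function is the misclassification count on a held-out validation set (this variant is usually called reduced-error pruning), and `tolerance` decides what "doesn't change much" means.

```python
def predict(node, x):
    """Walk down the tree until a leaf (a node without a split) is reached."""
    while "feature" in node:
        go_left = x[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["value"]

def error_count(tree, X, y):
    """Cost function assumed here: number of misclassified samples."""
    return sum(predict(tree, x) != label for x, label in zip(X, y))

def prune(tree, node, X_val, y_val, tolerance=0):
    """Bottom-up: turn a split into a leaf unless the cost rises too much."""
    if "feature" not in node:
        return  # already a leaf
    prune(tree, node["left"], X_val, y_val, tolerance)
    prune(tree, node["right"], X_val, y_val, tolerance)
    before = error_count(tree, X_val, y_val)
    # Temporarily remove the split so the node acts as a leaf.
    saved = {k: node.pop(k) for k in ("feature", "threshold", "left", "right")}
    after = error_count(tree, X_val, y_val)
    if after - before > tolerance:
        node.update(saved)  # pruning hurt too much: restore the split

# Hypothetical tree: the split at 8 fits noise, the split at 5 is real.
tree = {
    "feature": 0, "threshold": 5, "value": 0,
    "left": {"value": 0},
    "right": {
        "feature": 0, "threshold": 8, "value": 1,
        "left": {"value": 1},
        "right": {"value": 0},
    },
}
X_val = [[1], [3], [6], [7], [9], [10]]
y_val = [0, 0, 1, 1, 1, 1]
prune(tree, tree, X_val, y_val)
print(tree)
```

In this example the inner split is removed because dropping it does not increase the validation error, while the root split is kept because removing it would misclassify every sample above 5.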

10
Q

Sum of arithmetic sequence

A

S_n = n/2 · (2a + (n-1)d), where a = first term, d = common difference, n = number of terms
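
As a quick check, the closed form can be compared with adding the terms one by one. The function name `arithmetic_sum` and the sample values are illustrative choices.

```python
def arithmetic_sum(a, d, n):
    """Sum of the first n terms a, a+d, a+2d, ... via n/2 * (2a + (n-1)d)."""
    return n * (2 * a + (n - 1) * d) / 2

# Cross-check the formula against a brute-force sum of the terms.
a, d, n = 3, 4, 10
terms = [a + k * d for k in range(n)]  # 3, 7, 11, ..., 39
print(arithmetic_sum(a, d, n), sum(terms))
```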

11
Q

A

∏ (capital pi): this symbol represents the product operator, analogous to how ∑ represents the summation operator. It indicates that you multiply a sequence of terms together.
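
In code, the product operator is a loop that multiplies instead of adds. The helper `product` below is written out for illustration; Python 3.8+ offers the same operation as `math.prod`.

```python
import math

def product(terms):
    """Multiply a sequence of terms together; the empty product is 1."""
    result = 1
    for t in terms:
        result *= t
    return result

factors = [1, 2, 3, 4, 5]
print(product(factors))    # the product of 1..5, i.e. 5!
print(math.prod(factors))  # standard-library equivalent
```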

12
Q

cumulative return compounding vs. cumulative return rebalancing

A
13
Q

Exponential Function

A

The function f(x) = e^x, which eventually grows faster than any polynomial function. Its base, e, is approximately equal to 2.71828 and is a fundamental constant in mathematics, similar to π.
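
Both statements can be checked numerically. The sketch below relies on two things the card does not spell out: the standard limit definition e = lim (1 + 1/n)^n, and x^10 as an arbitrary polynomial to compare against.

```python
import math

def compound_limit(n):
    """(1 + 1/n)^n, which approaches e as n grows."""
    return (1 + 1 / n) ** n

for n in (1, 10, 1000, 1_000_000):
    print(n, compound_limit(n))

# e^x versus the polynomial x^10: the polynomial leads for a while,
# but the exponential overtakes it and stays ahead.
for x in (10, 50, 100):
    print(x, math.exp(x) > x ** 10)
```

The comparison is False at x = 10 and True from x = 50 onward, which is why "faster than any polynomial" is a statement about large x.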

14
Q

Taylor series

A

Taylor series or Taylor expansion: the Taylor series of a function is an infinite sum of terms that are expressed in terms of the function’s derivatives at a single point. For most common functions, the function and the sum of its Taylor series are equal near this point. Taylor series are named after Brook Taylor, who introduced them in 1715. A Taylor series is also called a Maclaurin series when 0 is the point where the derivatives are considered, after Colin Maclaurin, who made extensive use of this special case of Taylor series in the 18th century.
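
A small worked example: every derivative of e^x at 0 equals 1, so its Maclaurin series is the sum of x^k / k!. The function name `exp_maclaurin` is an illustrative choice, and the sketch simply truncates the infinite sum after `n_terms` terms.

```python
import math

def exp_maclaurin(x, n_terms):
    """Partial sum of x^k / k! for k = 0 .. n_terms - 1."""
    total = 0.0
    term = 1.0  # x^0 / 0!
    for k in range(n_terms):
        total += term
        term *= x / (k + 1)  # turn x^k / k! into x^(k+1) / (k+1)!
    return total

# More terms bring the partial sum closer to the true value.
for n_terms in (1, 2, 5, 10, 20):
    print(n_terms, exp_maclaurin(1.0, n_terms), math.exp(1.0))
```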

15
Q

Stochastic process

A

A stochastic or random process is a mathematical object usually defined as a sequence of random variables in a probability space, where the index of the sequence often has the interpretation of time. Stochastic processes are widely used as mathematical models of systems and phenomena that appear to vary in a random manner.
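
One of the simplest examples is a symmetric random walk: the position after each step is a random variable, and the step number plays the role of time. The function name `random_walk` and the fair ±1 step are assumptions chosen for illustration.

```python
import random

def random_walk(n_steps, seed=None):
    """Return positions X_0, X_1, ..., X_n of a walk with fair +/-1 steps."""
    rng = random.Random(seed)  # a seed makes one realisation reproducible
    position = 0
    path = [position]
    for _ in range(n_steps):
        position += rng.choice((-1, 1))
        path.append(position)
    return path

# Two seeds give two different realisations of the same process.
print(random_walk(10, seed=1))
print(random_walk(10, seed=2))
```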

16
Q

Brownian motion

A

Named after the botanist Robert Brown, who first described the phenomenon in 1827 while looking through a microscope at pollen immersed in water. In 1900, the French mathematician Louis Bachelier modeled the stochastic process now called Brownian motion in his doctoral thesis, The Theory of Speculation.

17
Q

geometric Brownian motion (GBM)

A

Also known as exponential Brownian motion, GBM is a continuous-time stochastic process in which the logarithm of the randomly varying quantity follows a Brownian motion with drift.

It is an important example of stochastic processes satisfying a stochastic differential equation (SDE); in particular, it is used in mathematical finance to model stock prices in the Black–Scholes model.
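
Because the logarithm follows a Brownian motion with drift, a path can be simulated by adding normally distributed increments to the log and exponentiating. The sketch below assumes the usual parametrisation with drift `mu` and volatility `sigma`; the function name `simulate_gbm` and the parameter values are hypothetical.

```python
import math
import random

def simulate_gbm(s0, mu, sigma, horizon, n_steps, seed=None):
    """Simulate one GBM path on an evenly spaced time grid."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    path = [s0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)  # standard normal shock
        # Increment of log(S): drift part plus Brownian part.
        log_step = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(log_step))
    return path

# Hypothetical stock: start at 100, 5% drift, 20% volatility, 252 daily steps.
prices = simulate_gbm(100.0, 0.05, 0.2, 1.0, 252, seed=42)
print(prices[0], prices[-1], min(prices) > 0)
```

Working in log space keeps every simulated price strictly positive, which is one reason the model is popular for stock prices.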

18
Q

stochastic differential equation (SDE)

A

A stochastic differential equation (SDE) is a differential equation in which one or more of the terms is a stochastic process, so that its solution is also a stochastic process.
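
A common way to approximate a solution numerically is the Euler-Maruyama scheme. The sketch below assumes the SDE has the form dX = a(X, t) dt + b(X, t) dW, which the card does not state; the names `euler_maruyama`, `drift`, and `diffusion` and the example coefficients are illustrative.

```python
import math
import random

def euler_maruyama(drift, diffusion, x0, horizon, n_steps, seed=None):
    """Approximate one path of dX = drift(X, t) dt + diffusion(X, t) dW."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    x, t = x0, 0.0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        x = x + drift(x, t) * dt + diffusion(x, t) * dw
        t += dt
        path.append(x)
    return path

# Example: pull toward zero plus constant noise (a mean-reverting process).
noisy = euler_maruyama(lambda x, t: -x, lambda x, t: 0.3, 1.0, 1.0, 1000, seed=1)
# With the noise switched off, the scheme reduces to Euler's method for dx/dt = -x.
smooth = euler_maruyama(lambda x, t: -x, lambda x, t: 0.0, 1.0, 1.0, 1000, seed=1)
print(noisy[-1], smooth[-1], math.exp(-1))
```

Each seed produces a different path, which reflects the point of the definition: the solution is itself a stochastic process rather than a single curve.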