Data Science & Math Flashcards
What is data science?
The extraction of knowledge from data
Data analytics vs. data science
- Data analytics analyzes existing (past) data to draw insights
- Data science combines data with math and statistics to create predictions
What can Machine Learning determine?
- Is this A or B? —> Classification
- Is this weird? —> Anomaly detection
- How much/how many? —> Regression analysis
- How is it organized? —> Unsupervised learning (e.g. clustering)
- What should I do next? —> Reinforcement learning
Data scientist vs. data analyst vs. data engineer
- A data scientist analyzes and interprets data to find insights and patterns and to make predictions
- A data analyst explores data and builds visualizations, with a focus on insights into past data
- A data engineer manages and organizes data and maintains databases and data pipelines
Can you prove a hypothesis?
No. You can never prove that a hypothesis is true; you can only fail to reject it.
What is PCA?
Principal Component Analysis: Unsupervised learning method that uses patterns present in high-dimensional data (data with lots of independent variables) to reduce the complexity of the data while retaining most of the information.
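A minimal sketch of the idea, assuming a small made-up 2-D data set and using power iteration on the covariance matrix to find only the first principal component (libraries normally use a full eigendecomposition or SVD instead). The function name `first_principal_component` and the sample points are illustrative assumptions, not part of the card.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (unit direction, share of total variance) of the first component."""
    n = len(rows)
    dim = len(rows[0])
    # Center every column so the covariance is measured around the mean.
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - means[j] for j in range(dim)] for r in rows]
    # Sample covariance matrix (dim x dim).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    # Power iteration: repeatedly multiplying by cov converges to the
    # eigenvector with the largest eigenvalue (the direction of most variance).
    v = [1.0] * dim
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(dim))
                     for i in range(dim))
    total_variance = sum(cov[i][i] for i in range(dim))
    return v, eigenvalue / total_variance

# Hypothetical data lying close to the line y = 2x.
points = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
direction, explained = first_principal_component(points)
print(direction, explained)
```

Because the points lie almost on a line, one component captures nearly all of the variance, so the two columns could be replaced by a single coordinate along `direction` with little loss of information.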
In a decision tree, what are the ends called?
Leaves.
History of ML/AI
- 1st generation: The Backend - Large Datasets (Fraud detection, search algos, SCM)
- 2nd generation: The human side - data about humans (rec. systems, social media, commerce + ads)
- 3rd generation: Modern Machine Learning - pattern recognition (speech recog., computer vision, translation)
What is pruning? (to prune = to trim back)
Pruning removes unnecessary splits > compresses part of the tree from strict and rigid decision boundaries into ones that are smoother and generalise better > reduces tree complexity > tree complexity = number of splits in the tree
A simple yet highly effective pruning method is to go through each node in the tree and evaluate the effect of removing it on the cost function. If it doesn’t change much, then prune away!
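The method described above can be sketched as follows. This is one possible reading (often called reduced-error pruning), under these assumptions: the tree is a nested dict, the cost function is the number of misclassified validation examples, and "doesn't change much" means the cost rises by at most `tolerance`. The dict layout, the function names, and the sample data are all hypothetical.

```python
def predict(node, x):
    # Walk down until a leaf (a node without a "feature" key) is reached.
    while "feature" in node:
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["prediction"]

def errors(node, data):
    # Cost function: number of misclassified examples.
    return sum(1 for x, y in data if predict(node, x) != y)

def count_splits(node):
    if "feature" not in node:
        return 0
    return 1 + count_splits(node["left"]) + count_splits(node["right"])

def prune(node, data, tolerance=0):
    """Return a pruned copy of the tree; children are visited before parents."""
    if "feature" not in node:
        return node
    left_data = [(x, y) for x, y in data if x[node["feature"]] <= node["threshold"]]
    right_data = [(x, y) for x, y in data if x[node["feature"]] > node["threshold"]]
    node = dict(node,
                left=prune(node["left"], left_data, tolerance),
                right=prune(node["right"], right_data, tolerance))
    # Candidate replacement: a leaf predicting the node's majority class.
    leaf = {"prediction": node["prediction"]}
    if errors(leaf, data) <= errors(node, data) + tolerance:
        return leaf
    return node

# Hypothetical tree whose second split only fits noise.
tree = {
    "feature": 0, "threshold": 5.0, "prediction": "A",
    "left": {"prediction": "A"},
    "right": {
        "feature": 1, "threshold": 0.5, "prediction": "B",
        "left": {"prediction": "B"},
        "right": {"prediction": "A"},
    },
}
validation = [((1.0, 0.2), "A"), ((2.0, 0.9), "A"), ((3.0, 0.4), "A"),
              ((7.0, 0.3), "B"), ((8.0, 0.8), "B"), ((9.0, 0.6), "B")]

pruned = prune(tree, validation)
print(count_splits(tree), count_splits(pruned))
```

Evaluating on held-out validation data rather than the training data is a design choice in this sketch: a split that only memorised training noise cannot look useful there.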
Sum of arithmetic sequence
S_n = (n/2) * (2a + (n-1)d), where a = first term, d = common difference between consecutive terms, n = number of terms
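As a quick check of the formula, the sketch below compares it with a brute-force sum. The function name `arithmetic_sum` and the example sequence 3, 7, 11, ... are illustrative assumptions.

```python
def arithmetic_sum(a, d, n):
    """Sum of the first n terms: a, a + d, a + 2d, ..."""
    return n * (2 * a + (n - 1) * d) / 2

# Ten terms starting at 3 with common difference 4: 3, 7, ..., 39.
terms = [3 + 4 * k for k in range(10)]
print(arithmetic_sum(3, 4, 10), sum(terms))  # both equal 210
```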
∏
This symbol represents the product operator, analogous to how ∑ represents the summation operator. It indicates that you multiply a sequence of terms together.
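A small illustration, assuming Python 3.8 or later, where `math.prod` plays the role of ∏ in the same way the built-in `sum` plays the role of ∑. The factorial example is chosen here for illustration only.

```python
import math

# ∏ over k = 1..5 of k, which is 5! = 120.
factors = range(1, 6)
product = math.prod(factors)

# ∑ over the same terms, for comparison.
total = sum(factors)

print(product, total)  # 120 15
```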
cumulative return compounding vs. cumulative return rebalancing
Exponential Function
A mathematical function of the form f(x) = e^x that eventually grows faster than any polynomial function. Its base, e, is approximately equal to 2.71828, and it’s a fundamental constant in mathematics, similar to π.
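Both claims can be checked numerically. The sketch below assumes the well-known limit definition e = lim (1 + 1/n)^n and uses x^10 as an arbitrary example polynomial; `compound_limit` is a hypothetical helper name.

```python
import math

def compound_limit(n):
    """(1 + 1/n)^n, which approaches e as n grows."""
    return (1 + 1 / n) ** n

approx_e = compound_limit(1_000_000)
print(approx_e, math.e)

# e^x starts out smaller than x^10 but overtakes it for large enough x.
degree = 10
for x in (10, 50, 100):
    print(x, math.exp(x) > x ** degree)
```

The polynomial is still ahead at x = 10, which is why "grows faster" is a statement about sufficiently large x rather than about every x.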
Taylor series
Taylor series or Taylor expansion: the Taylor series of a function is an infinite sum of terms that are expressed in terms of the function’s derivatives at a single point. For most common functions, the function and the sum of its Taylor series are equal near this point. Taylor series are named after Brook Taylor, who introduced them in 1715. A Taylor series is also called a Maclaurin series when 0 is the point where the derivatives are considered, after Colin Maclaurin, who made extensive use of this special case of Taylor series in the 18th century.
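As an example, the Maclaurin series of e^x is the sum of x^k / k! for k = 0, 1, 2, ... (every derivative of e^x at 0 equals 1). The sketch below truncates that sum after a chosen number of terms; `exp_maclaurin` is a hypothetical helper name.

```python
import math

def exp_maclaurin(x, terms):
    """Approximate e^x with the first `terms` terms of its Maclaurin series."""
    total = 0.0
    term = 1.0  # x^0 / 0!
    for k in range(terms):
        total += term
        term *= x / (k + 1)  # turn x^k / k! into x^(k+1) / (k+1)!
    return total

for n in (3, 6, 15):
    print(n, exp_maclaurin(1.0, n), math.exp(1.0))
```

Adding terms shrinks the error quickly near the expansion point; farther from 0, more terms are needed for the same accuracy.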
Stochastic process
A stochastic or random process is a mathematical object usually defined as a sequence of random variables in a probability space, where the index of the sequence often has the interpretation of time. Stochastic processes are widely used as mathematical models of systems and phenomena that appear to vary in a random manner.
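A simple symmetric random walk is one of the smallest examples of such a sequence: the position after step t is a random variable indexed by time. The sketch below is an illustration chosen here, not something the card prescribes; the function name and the fixed seed are assumptions made for reproducibility.

```python
import random

def random_walk(steps, seed=0):
    """Positions X_0, X_1, ..., X_steps of a walk moving +1 or -1 each step."""
    rng = random.Random(seed)
    position = 0
    path = [position]
    for _ in range(steps):
        position += rng.choice((-1, 1))
        path.append(position)
    return path

walk = random_walk(10)
print(walk)
```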
Brownian motion
Named after the botanist Robert Brown, who first described the phenomenon in 1827 while looking through a microscope at pollen immersed in water. In 1900, the French mathematician Louis Bachelier modeled the stochastic process now called Brownian motion in his doctoral thesis, The Theory of Speculation.
geometric Brownian motion (GBM)
Also known as exponential Brownian motion, it is a continuous-time stochastic process in which the logarithm of the randomly varying quantity follows a Brownian motion with drift.
It is an important example of stochastic processes satisfying a stochastic differential equation (SDE); in particular, it is used in mathematical finance to model stock prices in the Black–Scholes model.
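A minimal simulation sketch, assuming the standard parameterisation dS = mu * S dt + sigma * S dW. Each step updates the logarithm of the price by a normally distributed increment with drift (mu - sigma^2 / 2) * dt, which is what "the logarithm follows a Brownian motion with drift" means in practice. The parameter values (start price 100, 5% drift, 20% volatility, 252 steps) are made up for illustration.

```python
import math
import random

def simulate_gbm(s0, mu, sigma, horizon, steps, seed=0):
    """Simulate one GBM path on an even time grid using the exact log-step."""
    rng = random.Random(seed)
    dt = horizon / steps
    path = [s0]
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)  # standard normal shock
        log_step = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(log_step))
    return path

path = simulate_gbm(s0=100.0, mu=0.05, sigma=0.2, horizon=1.0, steps=252)
print(path[0], path[-1])
```

Because every step multiplies by an exponential, the simulated price can never become negative, one reason this process is a common choice for modelling stock prices.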
stochastic differential equation (SDE)
A stochastic differential equation (SDE) is a differential equation in which one or more of the terms is a stochastic process, resulting in a solution which is also a stochastic process.
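One common way to approximate a solution numerically is the Euler-Maruyama scheme, sketched below under the assumption that the SDE has the form dX = drift(X) dt + diffusion(X) dW. The function names and parameters are hypothetical, and the usage lines plug in the GBM coefficients (mu * x and sigma * x) purely as an example.

```python
import math
import random

def euler_maruyama(x0, drift, diffusion, horizon, steps, seed=0):
    """Approximate one path of dX = drift(X) dt + diffusion(X) dW."""
    rng = random.Random(seed)
    dt = horizon / steps
    x = x0
    path = [x]
    for _ in range(steps):
        # Brownian increment over dt: normal with mean 0 and std sqrt(dt).
        dw = rng.gauss(0.0, math.sqrt(dt))
        x = x + drift(x) * dt + diffusion(x) * dw
        path.append(x)
    return path

mu, sigma = 0.05, 0.2  # illustrative values only
path = euler_maruyama(100.0, lambda x: mu * x, lambda x: sigma * x,
                      horizon=1.0, steps=252)
print(path[-1])
```

With the diffusion term set to zero, the scheme reduces to the ordinary Euler method for a deterministic differential equation, which makes the role of the random term easy to see.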