Concepts and stuff (M1) Flashcards

1
Q

Data science

A

Data science is a multidisciplinary field that uses scientific methods, processes, algorithms, and systems to extract insights and knowledge from structured and unstructured data. It combines techniques from statistics, mathematics, computer science, and domain-specific knowledge to analyze and interpret complex data sets. The primary goal of data science is to uncover hidden patterns, make predictions, and generate actionable insights to support decision-making.

2
Q

Numeric data

A

Discrete data, continuous data

3
Q

SQL

A

SQL, or Structured Query Language, is a domain-specific programming language used for managing and manipulating relational databases. It provides a standardized way to interact with relational database management systems (RDBMS), allowing users to define, query, update, and manage data.
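
As a rough illustration of the define, query, and update operations mentioned above, here is a minimal sketch using Python's built-in sqlite3 module. The students table, its columns, and the sample rows are invented for this example and are not part of the card.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Define: create a table.
cur.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, grade REAL)")

# Manage: insert a few rows.
cur.executemany(
    "INSERT INTO students (name, grade) VALUES (?, ?)",
    [("Ana", 8.5), ("Luis", 6.0), ("Marta", 9.2)],
)

# Update: change one existing row.
cur.execute("UPDATE students SET grade = ? WHERE name = ?", (7.0, "Luis"))

# Query: read rows back, filtered and sorted.
cur.execute(
    "SELECT name, grade FROM students WHERE grade >= ? ORDER BY grade DESC",
    (7.0,),
)
rows = cur.fetchall()
print(rows)

conn.close()
```

The ? placeholders pass values separately from the SQL text, which is the usual way to avoid building queries by string concatenation.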

4
Q

The 5 V’s of big data

A

Volume, Velocity, Variety, Value, Veracity

5
Q

Statistics

A

The discipline of collecting, analyzing, and interpreting data. It provides methods for making inferences about the characteristics and behavior of populations based on samples taken from them.

6
Q

Central tendency measures

A

Mean, Median, Mode
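
A small sketch of the three measures, using Python's standard statistics module. The sample values are made up for illustration.

```python
import statistics

# Hypothetical sample; any list of numbers works.
data = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(data)      # sum of values divided by their count
median_value = statistics.median(data)  # middle value (average of the two middle values here)
mode_value = statistics.mode(data)      # most frequent value

print(mean_value, median_value, mode_value)
```

For this sample the mean is 5, the median is 4.0, and the mode is 3, which shows that the three measures need not agree.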

7
Q

Measures of dispersion

A

Range, Variance, Standard Deviation
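
A minimal sketch of the three measures, again with Python's statistics module. The data is invented, and the population versions (pvariance, pstdev) are an assumption made here; the card does not say whether population or sample formulas are intended.

```python
import statistics

# Hypothetical dataset treated as a whole population.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)     # largest value minus smallest value
variance = statistics.pvariance(data)  # mean of squared deviations from the mean
std_dev = statistics.pstdev(data)      # square root of the variance

print(data_range, variance, std_dev)
```

Here the range is 7, the variance is 4, and the standard deviation is 2.0.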

8
Q

Variance

A

The average of the squared differences between each data point and the mean of the dataset. A higher variance indicates greater variability, while a lower variance means the data points lie closer to the mean.
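
To make the definition concrete, the sketch below computes the variance step by step in plain Python. The data is made up, and both common conventions are shown: dividing by n (population variance) and by n - 1 (sample variance).

```python
# Hypothetical dataset.
data = [1, 2, 3, 4, 5]

n = len(data)
mean = sum(data) / n

# Squared distance of each point from the mean.
squared_deviations = [(x - mean) ** 2 for x in data]

population_variance = sum(squared_deviations) / n    # divide by n
sample_variance = sum(squared_deviations) / (n - 1)  # divide by n - 1

print(population_variance, sample_variance)
```

For this data the population variance is 2.0 and the sample variance is 2.5.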

9
Q

Standard deviation

A

The square root of the variance. Because it is expressed in the original units of the data, it is a more interpretable measure of spread than the variance.

10
Q

Skewness

A

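It measures the asymmetry of a distribution around its mean: positive skewness indicates a longer right tail, negative skewness a longer left tail, and a symmetric distribution such as the normal distribution has zero skewness.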
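
One common way to quantify skewness is the moment coefficient, the third central moment divided by the variance raised to the power 3/2. The sketch below assumes that definition (the card does not name a formula) and uses invented data; the skewness helper is hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]      # one large value stretches the right tail
left_skewed = [-10, -3, -2, -2, -1, -1]  # mirror image: long left tail

print(skewness(symmetric), skewness(right_skewed), skewness(left_skewed))
```

The symmetric list gives zero, the right-skewed list a positive value, and the left-skewed list a negative value.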

11
Q

Machine learning

A

The primary goal of machine learning is to create systems that can automatically learn and improve from experience without being explicitly programmed for a specific task.

12
Q

Predictive Models

A

Neural Network, Support Vector Machine, Random Forest, Bayesian Models, K-Nearest Neighbors
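
As an illustration of one model from the list, here is a minimal K-Nearest Neighbors classifier in plain Python. It is a sketch under simple assumptions (Euclidean distance, majority vote, toy 2D points); the knn_predict function and the data are hypothetical, not taken from any library.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy training set: two well-separated clusters.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (2, 2)))  # query near the first cluster
print(knn_predict(points, labels, (7, 9)))  # query near the second cluster
```

The prediction depends on k and on the distance measure, so both are worth treating as tunable choices rather than fixed parts of the method.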

13
Q

Data Warehouse

A

Centralized repository for storing and managing large volumes of data from various sources within an organization. It is designed to support business intelligence (BI) and analytical reporting activities.

14
Q

Data Lake

A

A centralized repository that stores raw data in its native format. Unlike traditional data warehouses, which follow a structured approach, data lakes can store both structured and unstructured data. The concept of a data lake is often associated with big data and the need to handle large-scale, diverse datasets.
