# Statistics Flashcards

Descriptive Statistics?

Descriptive statistics is what we can say about a sample by observing the sample itself. This is somewhat limited and mostly consists of summaries of the data, much like aggregates on a column in a database table.

Inferential Statistics

Inferential statistics is what we can say about a population based on what we know about a sample. That means that we can infer (deduce or conclude from evidence rather than from explicit statements) properties of the population from a smaller sample.

In statistics what is ‘Probability’?

Probability is what we can generally say about samples from a population.

So if we know 10 % of the population are left-handed, we can expect about 10 % of a randomly drawn sample to be left-handed.
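The left-handedness example can be tried out with a small simulation. This is only a sketch: the population size, the sample size and the seed are made-up values, not part of the card.

```python
import random

random.seed(42)  # fixed seed so the run is repeatable (arbitrary choice)

POPULATION_SIZE = 100_000  # hypothetical population size
SAMPLE_SIZE = 1_000        # hypothetical sample size

# Build a population in which exactly 10 % are left-handed (True).
n_left = POPULATION_SIZE // 10
population = [True] * n_left + [False] * (POPULATION_SIZE - n_left)

# Draw a random sample without replacement and measure its proportion.
sample = random.sample(population, SAMPLE_SIZE)
sample_rate = sum(sample) / len(sample)

print(f"left-handed in sample: {sample_rate:.1%}")
```

The sample proportion will usually land near 10 %, but rarely on it exactly; the gap tends to shrink as the sample grows.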

In Probability Theory

What does the experiment yield?

One possible outcome from the sample space.

The sample space for tossing a coin is {head, tail}

In Probability Theory

What is a ‘Sample Space S’?

The set of all possible outcomes of an experiment.

The sample space for tossing a coin is {head, tail}

In Probability Theory

What is an ‘Event E’?

An event is a specified set of outcomes of an experiment, i.e. a subset of the sample space, e.g. the event head when we toss a coin.

In Probability Theory

What is the ‘Probability of Outcome P(s)’?

The probability of an outcome s is a number between 0 and 1 inclusive, and the probabilities of all possible outcomes in the sample space sum to 1.
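The two properties of P(s) can be checked on a concrete case. The sketch below assumes a fair six-sided die, which is an illustrative choice and not part of the card; exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

# Sample space of one roll of a fair six-sided die (illustrative example).
sample_space = [1, 2, 3, 4, 5, 6]

# Each outcome s gets the same probability P(s) = 1/6.
p = {s: Fraction(1, len(sample_space)) for s in sample_space}

# Every P(s) lies between 0 and 1 ...
all_in_range = all(0 <= prob <= 1 for prob in p.values())

# ... and the probabilities over the whole sample space sum to exactly 1.
total = sum(p.values())

print(all_in_range, total)  # prints: True 1
```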

Descriptive Statistics

In Descriptive Statistics, what are the two main areas?

Centrality and variability

Centrality: mean, median, mode

Descriptive Statistics

What is the Mean, or average, and what kind of data is it most useful for?

The mean / average is the sum of the values divided by the number of values.

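Most useful with homogeneous numerical data, i.e. values of one type measured on the same scale, without extreme outliers. It is not meaningful for categorical labels; for binary data coded as 0/1, the mean is the proportion of ones.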
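As a quick illustration of the mean, here is a minimal sketch on made-up numbers. It also shows how a single extreme value pulls the mean away from the bulk of the data.

```python
from statistics import mean

values = [2, 4, 4, 5, 10]  # made-up numeric data

# Mean by the definition: sum of the values divided by the number of values.
manual_mean = sum(values) / len(values)

# The standard library computes the same quantity.
library_mean = mean(values)

# One extreme value shifts the mean far from the typical values.
values_with_outlier = values + [1000]
mean_with_outlier = sum(values_with_outlier) / len(values_with_outlier)

print(manual_mean, library_mean, round(mean_with_outlier, 2))
```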

In Descriptive Statistics

What is the Median?

What is the median in an evenly numbered data set?

The exact middle value of the sorted data set.

If n is even, the median is the mean value of the two middle elements.
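The odd and even cases can be sketched in a few lines. The function name `median_by_hand` and the data are made up for illustration; the standard library's `statistics.median` is used as a cross-check. Empty input is not handled here.

```python
from statistics import median

def median_by_hand(data):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_data = [7, 1, 3]      # sorted: 1, 3, 7 -> the middle value is 3
even_data = [7, 1, 3, 5]  # sorted: 1, 3, 5, 7 -> (3 + 5) / 2 = 4.0

print(median_by_hand(odd_data), median_by_hand(even_data))
print(median(odd_data), median(even_data))
```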

In Descriptive Statistics

What is the Mode?

The mode is the most frequent element.

Example: in 1, 1, 2, 3, 4 the mode is 1.
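A minimal sketch of finding the mode, using the data set from this card. The tied data set in the last line is an added example: when several values share the highest frequency, `statistics.multimode` (Python 3.8+) returns all of them.

```python
from collections import Counter
from statistics import mode, multimode

data = [1, 1, 2, 3, 4]  # the data set from the card

# Count how often each value occurs and take the most frequent one.
counts = Counter(data)
most_common_value, frequency = counts.most_common(1)[0]
print(most_common_value, frequency)  # prints: 1 2

# The standard library offers the same directly.
print(mode(data))                # prints: 1
print(multimode([1, 1, 2, 2]))   # prints: [1, 2]
```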

Standard Deviation

A measure of the amount of variation in a set of values.

A low standard deviation indicates that the values lie close to the mean, so the distribution is narrow.

A high standard deviation indicates that the values are spread out over a wider range.
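To make the low versus high contrast concrete, the sketch below compares two made-up data sets with the same mean. It assumes the population form of the standard deviation (dividing by n); `statistics.stdev` would give the sample form, which divides by n - 1.

```python
from statistics import mean, pstdev

narrow = [9, 10, 10, 11]  # values close to the mean of 10
wide = [0, 5, 15, 20]     # same mean of 10, values spread out

def population_std(data):
    """Square root of the average squared distance from the mean."""
    mu = mean(data)
    return (sum((x - mu) ** 2 for x in data) / len(data)) ** 0.5

std_narrow = population_std(narrow)
std_wide = population_std(wide)

print(round(std_narrow, 3), round(std_wide, 3))
print(round(pstdev(narrow), 3), round(pstdev(wide), 3))  # library cross-check
```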

In Descriptive Statistics

Is Standard Deviation describing variability or centrality?

Variability: it describes the dispersion of the data.

Centrality, by contrast, describes where the data is centred (mean, median, mode).

What is Correlation Analysis concerned with?

Correlation analysis is concerned with relations between variables, e.g. if one goes up, what happens to the other?

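What is a Correlation Coefficient?

A correlation coefficient is a statistic that measures the degree to which one variable Y is a function of another variable X. Common coefficients such as Pearson's range from -1 (Y falls as X rises) to 1 (Y rises as X rises).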
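The card does not say which coefficient it has in mind. As one common choice, the sketch below computes the Pearson correlation coefficient by hand on made-up data; the function name `pearson_r` is hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of X and Y scaled by their spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]    # rises in step with x
y_down = [10, 8, 6, 4, 2]  # falls as x rises

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
print(r_up, r_down)
```

A value near 1 means Y tends to rise with X, a value near -1 means Y tends to fall as X rises, and a value near 0 means there is little linear relationship. The function divides by zero if either variable is constant.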