Theory Questions, Mathematical Statistics (Matstat) Test 2: Flashcards
(23 cards)
An estimator, i.e. a function of the data which estimates an unknown parameter of the distribution, is called
unbiased if
- it equals the parameter it estimates
- it tends to the value of the parameter it estimates when the sample size grows
- its expectation as a random variable equals the parameter it estimates
- its variance is 0
- its variance tends to 0 when the sample size grows
- none of the above
- its expectation as a random variable equals the parameter it estimates
An estimator, i.e. a function of the data which estimates an unknown parameter of the distribution, is called
consistent if
- it equals the parameter it estimates
- it tends to the value of the parameter it estimates when the sample size grows
- its expectation as a random variable equals the parameter it estimates
- its variance is 0
- its variance tends to 0 when the sample size grows
- none of the above
- it tends to the value of the parameter it estimates when the sample size grows
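The two properties above can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the cards: the estimator is the sample mean, the data are Normal with hypothetical parameters mu = 10 and sigma = 2, and the helper name `sample_mean_estimates` is made up for the example.

```python
import random
import statistics

def sample_mean_estimates(n, repeats, mu=10.0, sigma=2.0, seed=0):
    """Return `repeats` sample means, each computed from n Normal(mu, sigma) draws."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(repeats)
    ]

small = sample_mean_estimates(n=5, repeats=1000)
large = sample_mean_estimates(n=200, repeats=1000)

# Unbiasedness: the average of many estimates is close to mu for either n.
avg_small = statistics.fmean(small)
avg_large = statistics.fmean(large)

# Consistency: the estimates concentrate around mu as n grows,
# which shows up here as a shrinking spread.
spread_small = statistics.pstdev(small)
spread_large = statistics.pstdev(large)

print(avg_small, avg_large)
print(spread_small, spread_large)
```

Both averages should land near 10 regardless of n, while the spread for n = 200 should be much smaller than for n = 5.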
To estimate the distribution of a certain characteristic X in a large population (X could be age, for example), we may take a relatively small sample and draw inferences about the distribution of X in the whole population from the distribution of X in the sample.
What does it mean that the sample is representative (with respect to X)?
- The sample consists of respected population representatives
- The sample is large
- The sample mean is the same as the mean of X in the whole population
- The distribution of X in the sample is the same as in the whole population
- None of the above
- The distribution of X in the sample is the same as in the whole population
To estimate the distribution of a certain characteristic X in a large population (X could be age, for example), we may take a relatively small sample and draw inferences about the distribution of X in the whole population from the distribution of X in the sample.
When is a sample representative?
- When the sample is large
- When the sample is small
- When the sample is randomly drawn so that everyone has the same chance to be selected
- It is always that
- It cannot be that
- When the sample is randomly drawn so that everyone has the same chance to be selected
What happens to the width of the interval estimate of the mean, estimated from a sample with known standard deviation, when
the confidence level decreases?
- The width of the confidence interval increases
- The width of the confidence interval decreases
- The width of the confidence interval decreases
What happens to the width of the interval estimate of the mean, estimated from a sample with known standard deviation, when
the standard deviation increases?
- The width of the confidence interval increases
- The width of the confidence interval decreases
- The width of the confidence interval increases
What happens to the width of the interval estimate of the mean, estimated from a sample with known standard deviation, when
the sample size increases?
- The width of the confidence interval increases
- The width of the confidence interval decreases
- The width of the confidence interval decreases
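The three answers above all follow from the width of the two-sided interval with known standard deviation, 2 · z · σ / √n, where z is the standard Normal quantile for the chosen confidence level. A minimal sketch of that formula follows; the function name `ci_width` and the numbers are illustrative assumptions, not part of the cards.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(confidence, sigma, n):
    """Width of the two-sided z-interval for the mean when sigma is known."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # standard Normal quantile
    return 2.0 * z * sigma / sqrt(n)

base = ci_width(confidence=0.95, sigma=2.0, n=25)
lower_confidence = ci_width(confidence=0.90, sigma=2.0, n=25)  # narrower
larger_sigma = ci_width(confidence=0.95, sigma=4.0, n=25)      # wider
larger_sample = ci_width(confidence=0.95, sigma=2.0, n=100)    # narrower

print(base, lower_confidence, larger_sigma, larger_sample)
```

Doubling σ doubles the width, while quadrupling n halves it, because n enters only through √n.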
To estimate the distribution of a certain characteristic X of the population (e.g. the age), a sample of size n was drawn. Let X̄ be the sample mean.
In which cases, when computing the confidence interval based on X̄,
do we use quantiles of the Normal distribution?
- always
- when the characteristic X is Normally distributed in the population
- when X is t-distributed
- when X is Normally distributed with known variance or X is not necessarily Normal, but the sample is large (at least 50)
- when X is Normally distributed, the sample is small (less than 50) and we estimate its variance from the sample
- we never use it
- when X is Normally distributed with known variance or X is not necessarily Normal, but the sample is large (at least 50)
To estimate the distribution of a certain characteristic X of the population (e.g. the age), a sample of size n was drawn. Let X̄ be the sample mean.
In which cases, when computing the confidence interval based on X̄, do we use quantiles of the t-distribution?
- always
- when the characteristic X is Normally distributed in the population
- when X is t-distributed
- when X is Normally distributed with known variance or X is not necessarily Normal, but the sample is large (at least 50)
- when X is Normally distributed, the sample is small (less than 50) and we estimate its variance from the sample
- we never use it
- when X is Normally distributed, the sample is small (less than 50) and we estimate its variance from the sample
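For reference, the two interval forms behind the last two cards can be written out as follows. The notation is an assumption of this sketch: α is the error level, σ the known standard deviation, s the sample standard deviation, and z and t denote quantiles of the standard Normal distribution and of the t-distribution with n − 1 degrees of freedom.

```latex
% Known variance (or a large sample, with s in place of sigma): Normal quantile
\[
  \bar{X} \pm z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}
\]
% X Normal, small sample, variance estimated from the sample: t quantile
\[
  \bar{X} \pm t_{1-\alpha/2,\;n-1}\,\frac{s}{\sqrt{n}}
\]
```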
The correlation coefficient
- is a measure of corrosion of materials
- is a measure of a linear relationship between variables
- is a measure of any relationship between variables
- None of the above
- is a measure of a linear relationship between variables
The correlation coefficient may assume
- any real value
- any non-negative value
- only a value between -1 and 1 inclusive
- none of the above
- only a value between -1 and 1 inclusive
A sample correlation coefficient equal to -1 means that
- the points in the scatter plot all lie on an increasing line
- the points in the scatter plot all lie on a decreasing line
- the points in the scatter plot all lie on a horizontal line
- none of the above
- the points in the scatter plot all lie on a decreasing line
A sample correlation coefficient equal to 0 means that
- the points in the scatter plot all lie on an increasing line
- the points in the scatter plot all lie on a decreasing line
- the points in the scatter plot all lie on a horizontal line
- none of the above
- none of the above
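The correlation cards above can be checked numerically. The sketch below computes the Pearson sample correlation by hand; the data sets are made up for illustration, and `sample_correlation` is a hypothetical helper name.

```python
from math import sqrt

def sample_correlation(xs, ys):
    """Pearson sample correlation coefficient of paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
decreasing_line = [3.0 - 2.0 * x for x in xs]  # all points on a decreasing line
parabola = [x * x for x in xs]                 # strong but non-linear dependence

r_line = sample_correlation(xs, decreasing_line)
r_parabola = sample_correlation(xs, parabola)

print(r_line, r_parabola)
```

The parabola gives a coefficient of 0 even though Y is a function of X, which is why the coefficient measures only linear relationships. For points on a horizontal line the denominator is 0, so the coefficient is undefined rather than 0, consistent with "none of the above".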
It is an established fact that the number of deaths by drowning is highly positively correlated with ice-cream consumption.
Choose the most reasonable explanation of this phenomenon.
- Eating an ice-cream prior to bathing is likely to cause drowning by cramps of cold muscles
- Relatives of the drowned people start eating more ice-cream as a consolation
- Both are affected by another latent factor or factors
- None of the above
- Both are affected by another latent factor or factors
Regression analysis in statistics aims to explain a variable Y (the response) as a function of an explanatory variable X (or, more generally, several explanatory variables).
Residuals in statistics are
- the difference between the observed values of Y and the values obtained by the regression function at the corresponding X
- the difference between Y and X values
- the difference between X and Y values
- what is left in the coffee cup of a statistician after completion of regression analysis
- none of the above
- the difference between the observed values of Y and the values obtained by the regression function at the corresponding X
Regression analysis in statistics aims to explain a variable Y (the response) as a function of an explanatory variable X (or, more generally, several explanatory variables).
Which quantity is minimised when the regression line (i.e. a linear regression function) is constructed (as in the course)?
- each of the residuals
- sum of the residuals
- sum of squares of the residuals
- the amount of coffee left in the cup
- none of the above
- sum of squares of the residuals
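The two regression cards above can be tied together in a short sketch: the least-squares line is the one whose sum of squared residuals is smallest. The closed-form formulas below are the standard ones for simple linear regression; the data and the helper names `fit_line` and `sum_squared_residuals` are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and gradient b for the line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def sum_squared_residuals(xs, ys, a, b):
    """Sum of squared residuals: observed y minus fitted a + b * x, squared."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

xs = [1.0, 2.0, 3.0, 4.0, 5.0]    # made-up data for illustration
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
sse_fit = sum_squared_residuals(xs, ys, a, b)
sse_tilted = sum_squared_residuals(xs, ys, a, b + 0.1)   # perturbed gradient
sse_shifted = sum_squared_residuals(xs, ys, a + 0.1, b)  # perturbed intercept

print(a, b, sse_fit, sse_tilted, sse_shifted)
```

Any perturbation of the fitted intercept or gradient increases the sum of squared residuals. The plain sum of the residuals is not a useful criterion, since it is zero (up to rounding) for the fitted line.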
(X1,Y1),…,(Xn,Yn) is a sample taken from the pair of random variables (X,Y). The estimated value of the gradient of the regression line of Y on X is b.
What does it mean that the regression of Y on X is significant at the 5% error level?
- There is a 95% chance that the true gradient is not 0
- There is a 5% chance that the true gradient is not 0
- The true gradient can be 0, but then there is at least a 5% chance of obtaining, from a sample of size n, a gradient as extreme as b (i.e. |b| or more in absolute value).
- The true gradient can be 0, but then there is less than a 5% chance of obtaining, from a sample of size n, a gradient as extreme as b (i.e. |b| or more in absolute value).
- None of the above
- The true gradient can be 0, but then there is less than a 5% chance of obtaining, from a sample of size n, a gradient as extreme as b (i.e. |b| or more in absolute value).
(X1,Y1),…,(Xn,Yn) is a sample taken from the pair of random variables (X,Y). The estimated value of the gradient of the regression line of Y on X is b.
What does it mean that the regression of Y on X is not significant at the 95% confidence level?
- There is a 95% chance that the true gradient is 0
- There is a 5% chance that the true gradient is 0
- If the true gradient is 0, there is at least a 5% chance of obtaining, from a sample of size n, a gradient as extreme as b (i.e. |b| or more in absolute value).
- If the true gradient is 0, there is less than a 5% chance of obtaining, from a sample of size n, a gradient as extreme as b (i.e. |b| or more in absolute value).
- None of the above
- If the true gradient is 0, there is at least a 5% chance of obtaining, from a sample of size n, a gradient as extreme as b (i.e. |b| or more in absolute value).
A bivariate distribution consists of two random variables A and B. A random sample of (A,B)-pairs is taken and a regression of B on A is calculated.
The sum of squares of residuals (of errors, SSE) turns out to be 3.18 and the sum of squares for regression (response, SSR) is 2.49.
Find R², the “coefficient of determination”, which represents the proportion of the “variation in the response variable” that is explained by the regression.
0.4392
A bivariate distribution consists of two random variables A and B. A random sample of (A,B)-pairs is taken and a regression of B on A is calculated.
The sum of squares of residuals (of errors, SSE) turns out to be 3.18 and the sum of squares for regression (response, SSR) is 2.49.
What is the coefficient of correlation of A with B?
0.6627
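The two numerical answers above follow from R² = SSR / (SSR + SSE) and, for simple linear regression, |r| = √R². A short check using the values from the cards (variable names are illustrative):

```python
from math import sqrt

sse = 3.18  # sum of squares of residuals, from the card
ssr = 2.49  # sum of squares for regression, from the card

sst = ssr + sse          # total sum of squares
r_squared = ssr / sst    # proportion of variation explained
abs_r = sqrt(r_squared)  # magnitude of the correlation coefficient

print(round(r_squared, 4), round(abs_r, 4))  # prints 0.4392 0.6627
```

The sign of the correlation coefficient matches the sign of the regression gradient, which the card does not state; the answer 0.6627 assumes it is positive.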
In statistical testing,
the test statistic is:
- a function of the data used to decide whether to accept the null hypothesis or not
- a function of the data used to decide whether to accept the alternative hypothesis or not
- all of the above
- none of the above
- all of the above
In statistical testing,
the critical region is:
- the set of such values of the test statistic where we reject the null hypothesis
- the set of such values of the test statistic where we accept the null hypothesis
- the set of such values of the test statistic which are crucial for the test to be informative
- none of the above
- the set of such values of the test statistic where we reject the null hypothesis
In statistical testing,
what happens to the critical region when we increase the error level?
- it increases
- it decreases
- it does not change
- no general relation exists
- it increases
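As one concrete case of the last card, consider a two-sided z-test (an assumption of this sketch; the card itself is general). The null hypothesis is rejected when |z| exceeds the critical value, so a smaller critical value means a larger critical region. The function name `critical_value` is hypothetical.

```python
from statistics import NormalDist

def critical_value(alpha):
    """Two-sided z-test: reject the null hypothesis when |z| > critical_value(alpha)."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

c_01 = critical_value(0.01)
c_05 = critical_value(0.05)
c_10 = critical_value(0.10)

# The threshold drops as the error level grows, so the rejection region widens.
for alpha, c in ((0.01, c_01), (0.05, c_05), (0.10, c_10)):
    print(alpha, round(c, 3))
```

The thresholds come out at roughly 2.58, 1.96 and 1.64, so raising the error level from 1% to 10% moves the rejection boundary inward and enlarges the critical region.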