Linear regression Flashcards

1
Q

Describe the different steps in a data science project.

A

formulate question
gather data
clean data
explore and visualize data
train algorithm
evaluate model

2
Q

Why is data cleaning required?

A

To deal with:
missing data
incomplete data
inaccurate data
inconsistently formatted data
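
A minimal sketch of these cleaning steps with pandas. The column names and values are made up for illustration, and dropping rows is only one possible way to handle missing values.

```python
import pandas as pd

# Hypothetical raw data: one missing budget, one revenue stored as unusable text.
raw = pd.DataFrame({
    "budget": [100.0, None, 250.0, 80.0],
    "revenue": ["300", "120", "n/a", "95"],
})

# Format: convert revenue to numbers; entries that cannot be parsed become NaN.
raw["revenue"] = pd.to_numeric(raw["revenue"], errors="coerce")

# Missing or incomplete: drop every row that lacks either value.
clean = raw.dropna().reset_index(drop=True)

print(clean)
```

Filling gaps with fillna() instead of dropping rows is an alternative when the data set is small.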

3
Q

What is the pandas module?

A

It is an open-source Python library widely used for data science and machine learning.
It mainly provides data structures (such as the DataFrame) and data analysis tools.
It is built on top of NumPy.

4
Q

Explain the following:
- describe()

A

describe() returns summary statistics for the data, such as count, mean, std, min, quartiles, and max.
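
A short illustration of describe() on a small DataFrame. The data and column names are hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration only.
data = pd.DataFrame({"budget": [10, 20, 30, 40], "revenue": [15, 45, 50, 90]})

# One row per statistic (count, mean, std, min, quartiles, max), one column per numeric column.
summary = data.describe()
print(summary)

# Individual statistics can be read by row label and column name.
print(summary.loc["mean", "budget"])
```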

5
Q

What is matplotlib?

A

It is a Python data visualization library for generating graphs, plots, histograms, and similar figures.

6
Q

What is pyplot in matplotlib?

A

pyplot is the matplotlib module that exposes its plotting functions; it is usually imported as plt.
Plot types available through pyplot include line plots, histograms, scatter plots, 3D plots, images, contour plots, and polar plots.

7
Q

Explain the use of the following in pyplot:

- plt.plot()
- plt.scatter(x, y, alpha=)
- plt.show()
- plt.title()
- plt.xlabel(), plt.ylabel()
- plt.figure(figsize=())
- plt.xlim(), plt.ylim()

A

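plt.plot() draws a line plot.
plt.scatter(x, y, alpha=) draws a scatter plot; alpha sets the transparency of the points (0 is fully transparent, 1 is opaque).
plt.show() displays the figure.
plt.title() sets the plot title.
plt.xlabel() and plt.ylabel() label the x-axis and y-axis.
plt.figure(figsize=()) creates a figure; figsize is the (width, height) in inches.
plt.xlim() and plt.ylim() set the visible range of the x-axis and y-axis.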
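
The calls listed in the question can be combined as in the sketch below. The data, title, and axis limits are arbitrary values chosen for illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

fig = plt.figure(figsize=(8, 5))  # width and height in inches
plt.scatter(x, y, alpha=0.5)      # alpha=0.5 draws half-transparent points
plt.plot(x, y)                    # line through the same points
plt.title("Example plot")
plt.xlabel("x")
plt.ylabel("y")
plt.xlim(0, 6)                    # visible range of the x-axis
plt.ylim(0, 8)                    # visible range of the y-axis
plt.show()                        # render the figure
```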

8
Q

How many zeros are there in a million?

A

6

9
Q

What is linear regression? Explain θ0 and θ1 in the hypothesis equation. Explain the intercept and the slope.
Explain actual and fitted values. What are residuals?

A

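Linear regression models the relationship between an input x and an outcome y with a straight line.
Hypothesis equation: h(x) = θ0 + θ1x
θ0 is the intercept (the value of h(x) at x = 0) and θ1 is the slope (the change in h(x) per unit increase in x).
The actual value is the observed y, the fitted value is the prediction h(x), and a residual is the difference between them (actual minus fitted).
The best-fit line is chosen so that the sum of the squared residuals is minimal.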
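
A minimal sketch of least-squares fitting for one input variable, using the standard closed-form estimates (θ1 is the covariance of x and y divided by the variance of x, and θ0 follows from the means). The data is hypothetical, and plain Python is used so that each step is visible.

```python
# Hypothetical data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)

# Closed-form least-squares estimates for h(x) = theta0 + theta1 * x.
theta1 = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sum(
    (xi - x_mean) ** 2 for xi in x
)
theta0 = y_mean - theta1 * x_mean

fitted = [theta0 + theta1 * xi for xi in x]          # fitted values h(x)
residuals = [yi - fi for yi, fi in zip(y, fitted)]   # actual minus fitted
rss = sum(r ** 2 for r in residuals)                 # sum of squared residuals

print(theta0, theta1, rss)
```

Any other choice of theta0 and theta1 would give a larger sum of squared residuals on this data.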

10
Q

Why is it called "regression"?

A

Outcomes for the most extreme inputs tend to move back ("regress") towards the average of the outcomes.

11
Q

What is the scikit-learn module?

A

scikit-learn is a machine learning library for Python.
It is used for data mining and data analysis and is built on top of NumPy, SciPy, and matplotlib.

12
Q

How do you implement a linear regression model in Python?

A

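import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

regression = LinearRegression()
regression.fit(x, y)                 # x must be 2-D: (n_samples, n_features)
prediction = regression.predict(x)   # fitted values for the training inputs
plt.plot(x, prediction)              # draw the fitted line
plt.show()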
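
The snippet above assumes that x and y already exist. A self-contained sketch with synthetic data (an assumption made here so that the example runs on its own) could look like this:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for this sketch): y = 3 + 2x exactly.
x = np.arange(10, dtype=float).reshape(-1, 1)   # shape (10, 1): one feature column
y = 3.0 + 2.0 * x.ravel()

regression = LinearRegression()
regression.fit(x, y)
prediction = regression.predict(x)

plt.scatter(x, y, alpha=0.5)            # actual values
plt.plot(x, prediction, color="red")    # fitted line
plt.show()
```

The reshape(-1, 1) call matters because fit() expects a two-dimensional feature array even when there is only one feature.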

13
Q

How do you interpret the slope and the intercept (θ1 and θ0)?

A

In the case of movie revenue prediction:
The slope (θ1) is the change in predicted revenue for each additional dollar of budget.
The intercept (θ0) is the predicted revenue for a movie with a zero-dollar budget.
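
In scikit-learn, a fitted LinearRegression exposes these two values as coef_ and intercept_. The sketch below uses made-up budget and revenue figures constructed so that the expected slope and intercept are known in advance.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical budgets and revenues in dollars, built so that revenue = 1000 + 3 * budget.
budget = np.array([[100.0], [200.0], [300.0], [400.0]])
revenue = 1000.0 + 3.0 * budget.ravel()

model = LinearRegression().fit(budget, revenue)

slope = model.coef_[0]         # theta1: change in predicted revenue per extra dollar of budget
intercept = model.intercept_   # theta0: predicted revenue at a budget of zero

print(slope, intercept)
```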

14
Q

What is goodness of fit in regression? How is it computed in Python?

A

Goodness of fit describes how well a model fits the given set of data.
A common measure is R-squared, which scikit-learn returns from regression.score(x, y).
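
One common way to quantify goodness of fit is R-squared, the share of the variance in y that the model explains. The sketch below, on hypothetical data, computes it with scikit-learn's score() method and again by hand as 1 - RSS / TSS.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data for illustration only.
x = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

model = LinearRegression().fit(x, y)

# score() returns R-squared for the given inputs and outcomes.
r_squared = model.score(x, y)

# The same value by hand: 1 - (sum of squared residuals) / (total sum of squares).
residuals = y - model.predict(x)
manual = 1 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)

print(r_squared, manual)
```

A value close to 1 means the fitted line explains most of the variation in the outcomes.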

15
Q

Why do we use %matplotlib inline?

A

In a Jupyter notebook, it displays figures inline, directly below the cell that produced them.
