Linear Regression 2 Flashcards
(24 cards)
What is Linear Regression?
A supervised learning algorithm used to model the relationship between input features and a continuous target variable.
What is the equation for a simple linear regression model?
y = mx + b, where m is the coefficient (slope), x is the input feature, and b is the intercept.
What is the general form of multivariate linear regression?
y = m1*x1 + m2*x2 + … + mn*xn + b
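To make the general form concrete, here is a minimal sketch in plain Python that evaluates the equation for one sample. The function name predict_linear and the numbers passed to it are hypothetical, chosen only for illustration.

```python
def predict_linear(coefs, intercept, features):
    """Evaluate y = m1*x1 + m2*x2 + ... + mn*xn + b for one sample."""
    if len(coefs) != len(features):
        raise ValueError("need exactly one coefficient per feature")
    return sum(m * x for m, x in zip(coefs, features)) + intercept


# Hypothetical coefficients and feature values (not from any real dataset).
y_hat = predict_linear(coefs=[2.0, 0.5], intercept=1.0, features=[3.0, 4.0])
print(y_hat)  # 2.0*3.0 + 0.5*4.0 + 1.0 = 9.0
```

With a single coefficient this reduces to the simple model with one slope and one intercept.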
Which library in Python provides LinearRegression?
scikit-learn (sklearn): from sklearn.linear_model import LinearRegression
How do you import train_test_split?
from sklearn.model_selection import train_test_split
How do you define X and y for simple linear regression?
X = df[['feature']]; y = df['target']
How do you create and train a simple linear regression model?
reg = LinearRegression(); reg.fit(X, y)
How do you make predictions with a simple model?
reg.predict([[value]])
How do you view the slope (m) of the regression line?
reg.coef_
How do you view the intercept (b) of the regression line?
reg.intercept_
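The simple-regression cards above can be put together in one short sketch. The DataFrame below is made-up data that follows target = 2 * feature + 1 exactly, so the fitted slope and intercept are easy to check; the column names 'feature' and 'target' are the placeholders used in the cards, not a real dataset.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical data that follows target = 2 * feature + 1 exactly.
df = pd.DataFrame({"feature": [1, 2, 3, 4, 5], "target": [3, 5, 7, 9, 11]})

X = df[["feature"]]  # double brackets keep X two-dimensional (a DataFrame)
y = df["target"]     # single brackets give a one-dimensional Series

reg = LinearRegression()
reg.fit(X, y)

slope = reg.coef_[0]        # m: one entry per feature
intercept = reg.intercept_  # b

# Predict for a new value. A one-row DataFrame keeps the column name used
# in fit; reg.predict([[6]]) also works but may warn about feature names.
pred = reg.predict(pd.DataFrame({"feature": [6]}))[0]
print(slope, intercept, pred)  # approximately 2.0, 1.0, 13.0
```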
How do you handle missing values in features like ‘bedrooms’?
df['bedrooms'] = df['bedrooms'].fillna(df['bedrooms'].median())
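A small sketch of the median fill, assuming a made-up 'bedrooms' column with one missing value. pandas skips NaN when computing the median, so the gap is filled with the median of the observed values.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing entry.
df = pd.DataFrame({"bedrooms": [3.0, np.nan, 4.0, 2.0]})

# median() ignores NaN by default: the median of 2, 3, 4 is 3.0.
median_bedrooms = df["bedrooms"].median()
df["bedrooms"] = df["bedrooms"].fillna(median_bedrooms)

filled = df["bedrooms"].tolist()
print(filled)  # [3.0, 3.0, 4.0, 2.0]
```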
How do you define X and y for multivariate regression?
X = df[['feature1', 'feature2', …]]; y = df['target']
How do you train a multivariable model?
reg.fit(X, y)
How do you make a multivariable prediction?
reg.predict([[val1, val2, val3]])
What do the values in reg.coef_ represent?
The weight of each feature: the change in the predicted output for a one-unit increase in that feature, holding the other features constant.
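The multivariate cards above can be sketched the same way. The data here is synthetic, built so that target = 3*feature1 + 2*feature2 + 5 exactly; the column names and values are assumptions for illustration only.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic data: target = 3*feature1 + 2*feature2 + 5 (no noise).
df = pd.DataFrame({
    "feature1": [1, 2, 3, 4, 5],
    "feature2": [2, 1, 4, 3, 6],
})
df["target"] = 3 * df["feature1"] + 2 * df["feature2"] + 5

X = df[["feature1", "feature2"]]
y = df["target"]

reg = LinearRegression()
reg.fit(X, y)

# coef_ holds one weight per column of X, in column order.
weights = dict(zip(X.columns, reg.coef_))

# Predict for one new row: 3*6 + 2*5 + 5 = 33.
new_row = pd.DataFrame({"feature1": [6], "feature2": [5]})
pred = reg.predict(new_row)[0]
print(weights, reg.intercept_, pred)
```

Because the data has no noise, the fitted weights recover the 3 and 2 used to build the target; on real data they are estimates.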
How do you split data into training and testing sets?
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=101)
What is the purpose of test_size in train_test_split?
To specify the proportion of data used for testing (e.g., 0.3 = 30%).
What does random_state do in train_test_split?
Fixes the seed used to shuffle the data, so the same split is reproduced every time you run the code.
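A quick sketch of how test_size and random_state behave, using ten made-up samples. Splitting twice with the same random_state should give identical test sets.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Ten hypothetical samples with two features each.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=101
)
# Same arguments, same seed: the split is repeated exactly.
X_train2, X_test2, y_train2, y_test2 = train_test_split(
    X, y, test_size=0.3, random_state=101
)

print(len(X_train), len(X_test))  # 7 3 (30% of 10 rows held out)
same_split = np.array_equal(X_test, X_test2) and np.array_equal(y_test, y_test2)
print(same_split)  # True
```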
How do you visualize actual vs predicted values?
plt.scatter(y_test, predictions); plt.xlabel('Actual'); plt.ylabel('Predicted')
How do you plot a regression line in seaborn?
sns.lmplot(x='feature', y='target', data=df)
Which metrics are used to evaluate regression models?
MAE (mean_absolute_error) and MSE (mean_squared_error) from sklearn.metrics; RMSE is the square root of MSE.
How do you compute RMSE?
np.sqrt(mean_squared_error(y_test, predictions))
What does a lower RMSE mean?
On average, the model's predictions are closer to the actual values (RMSE is in the same units as the target).
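The three metrics can be worked through on a tiny example. The arrays below are made-up values standing in for y_test and predictions; the errors are 1, 0, -1 and -2, so each metric can be checked by hand.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hypothetical actual and predicted values.
y_test = np.array([3.0, 5.0, 7.0, 9.0])
predictions = np.array([2.0, 5.0, 8.0, 11.0])

mae = mean_absolute_error(y_test, predictions)  # (1 + 0 + 1 + 2) / 4 = 1.0
mse = mean_squared_error(y_test, predictions)   # (1 + 0 + 1 + 4) / 4 = 1.5
rmse = np.sqrt(mse)                             # sqrt(1.5), about 1.2247

print(mae, mse, rmse)
```

RMSE is larger than MAE here because squaring gives the single error of 2 more weight than the smaller errors.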