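For the concepts behind linear regression and the derivation of the model, see "Machine Learning: A Summary of Linear Regression and Its Python Implementation".
Linear regression is used for prediction problems. This post uses the sklearn library to solve a classic linear regression problem: Boston house price prediction.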
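sklearn is a widely used machine learning library that ships with many practical datasets. Here the Boston house price data is loaded from datasets, split into training and test sets with the model_selection toolkit, fitted and used for prediction with LinearRegression from linear_model, evaluated with metrics, and visualized with matplotlib.pyplot. See the code for details.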
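# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this import requires an older release (scikit-learn < 1.2).
from sklearn.datasets import load_boston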
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn import metrics
import matplotlib.pyplot as plt
import numpy as np
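# Load the Boston house price dataset
boston = load_boston()
# Training data: X_train, y_train
# Test data: X_test, y_test
# test_size=0.3 means 30% of the data is held out for testing;
# random_state=0 makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(boston.data, boston.target, test_size=0.3, random_state=0)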
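# Fit an ordinary least squares model on the training data
model = LinearRegression()
model.fit(X_train, y_train)
# Predict prices for the held-out test data
y_pred = model.predict(X_test)
# Mean squared error between the true and predicted test prices
mse = metrics.mean_squared_error(y_test, y_pred)
print("MSE = ", mse)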
plt.scatter(y_test, y_pred)
plt.xlabel("Price: $Y_i$")
# Raw strings keep the backslash in \hat intact for matplotlib's mathtext
plt.ylabel(r"Predicted prices: $\hat{Y}_i$")
plt.title(r"Prices vs Predicted prices: $Y_i$ vs $\hat{Y}_i$")
plt.grid()
# Reference line y = x: a point on this line is a prediction equal to the true price
x = np.arange(0, 50)
y = x
plt.plot(x, y, color='red', lw=4)
plt.text(30, 40, "predict line")
plt.show()
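
Because load_boston is not available in scikit-learn 1.2 and later, the following is a minimal sketch of the same workflow on synthetic stand-in data. The make_regression dataset (506 samples and 13 features, chosen to match the shape of the Boston data), the noise level, and the helper name evaluate_linear_regression are assumptions made for illustration and are not part of the original script. The sketch also computes the MSE by hand to show what metrics.mean_squared_error measures, and reports RMSE and R^2 alongside it.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn import metrics


def evaluate_linear_regression(X, y, test_size=0.3, random_state=0):
    """Hypothetical helper: split the data, fit a linear model, score it."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    model = LinearRegression()
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    # MSE by hand: the mean of the squared residuals on the test set
    mse_manual = float(np.mean((y_test - y_pred) ** 2))
    # The same quantity as computed by sklearn.metrics
    mse = float(metrics.mean_squared_error(y_test, y_pred))
    return {
        "n_test": len(y_test),
        "mse_manual": mse_manual,
        "mse": mse,
        "rmse": float(np.sqrt(mse)),  # same units as the target
        "r2": float(metrics.r2_score(y_test, y_pred)),
    }


# Synthetic stand-in data (an assumption, not the Boston dataset itself)
X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=0)
scores = evaluate_linear_regression(X, y)
print("MSE  =", scores["mse"])
print("RMSE =", scores["rmse"])
print("R^2  =", scores["r2"])
```

To run this on real housing data, X and y could be replaced with, for example, the arrays returned by sklearn.datasets.fetch_california_housing, which downloads its data on first use.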