Overview

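Linear regression: an approach that uses a regression equation to describe the relationship between one or more independent variables and a dependent variable.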

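Simple linear regression (one independent variable): y = wx + b

Multiple linear regression (several independent variables): y = T(w) * x + b, where T(w) is the transpose of w

w: weight (a vector in the multiple case); in the one-variable case it is the slope

b: bias, i.e. the intercept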
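
As a minimal sketch of the multiple-regression formula above, the prediction is a dot product plus the intercept. The values of w, b, and x here are made up purely for illustration.

```python
import numpy as np

# Hypothetical weights, bias, and one sample with three features (illustrative values only).
w = np.array([2.0, -1.0, 0.5])
b = 1.0
x = np.array([1.0, 2.0, 4.0])

# y = T(w) * x + b: dot product of weights and features, plus the intercept.
y = w @ x + b
print(y)  # 3.0

# For many samples stacked as rows of X, the same formula becomes X @ w + b.
X = np.array([[1.0, 2.0, 4.0],
              [0.0, 0.0, 0.0]])
y_batch = X @ w + b
print(y_batch)  # [3. 1.]
```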

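Loss function: measures how far the model's predictions are from the true values; training looks for the w and b that make it as small as possible.

Error = true value - predicted value

Least squares: choose w and b to minimize the sum of squared errors.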
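
One way to see least squares in action is NumPy's `np.linalg.lstsq`, which returns the coefficients that minimize the sum of squared errors. This is only a sketch on made-up data generated from y = 2x + 1, so the recovered w and b should come out close to 2 and 1.

```python
import numpy as np

# Hypothetical data generated exactly from y = 2x + 1 (illustrative only).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# Append a column of ones so the intercept b is fitted as an extra coefficient.
A = np.column_stack([x, np.ones_like(x)])

# lstsq finds the coefficients that minimize the sum of squared errors.
coef, _, _, _ = np.linalg.lstsq(A, y, rcond=None)
w, b = coef

# Sum of squared errors for the fitted line.
sse = np.sum((y - (w * x + b)) ** 2)
print(w, b, sse)
```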

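MAE (mean absolute error): the mean of the absolute errors; robust to outliers.

MSE (mean squared error): the mean of the squared errors; amplifies large errors.

RMSE (root mean squared error): the square root of MSE; in the same units as the target variable.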
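
The three metrics can be computed by hand with NumPy. The true and predicted values below are invented for illustration; the single large error (-4) barely moves MAE but dominates MSE.

```python
import numpy as np

# Hypothetical true and predicted values (illustrative only).
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.0, 5.0, 8.0, 13.0])

# Error = true value - predicted value -> [1, 0, -1, -4]
error = y_true - y_pred

mae = np.mean(np.abs(error))   # (1 + 0 + 1 + 4) / 4 = 1.5
mse = np.mean(error ** 2)      # (1 + 0 + 1 + 16) / 4 = 4.5
rmse = np.sqrt(mse)            # about 2.12, same units as y

print(mae, mse, rmse)
```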

Code example

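from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
import pandas as pd


def wang_Linear_train_model():
    # Load the Boston housing data: the first 13 columns are features, column 14 is the target.
    df = pd.read_csv('data/波士顿房价xy.csv')
    x = df.iloc[:, :13]
    y = df.iloc[:, 13]
    # Hold out 20% of the rows for testing; random_state makes the split reproducible.
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=22)
    # Standardize features: fit the scaler on the training set only, then reuse it on the test set.
    scaler = StandardScaler()
    x_train = scaler.fit_transform(x_train)
    x_test = scaler.transform(x_test)
    # Fit a linear regression model and predict on the test set.
    model = LinearRegression()
    model.fit(x_train, y_train)
    y_predict = model.predict(x_test)
    # Report R^2 (coefficient of determination) on the test set.
    print(r2_score(y_test, y_predict))


if __name__ == '__main__':
    wang_Linear_train_model()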