[Artificial Intelligence] Machine Learning Day 3: Logistic Regression


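Outline first:

1. Data acquisition

2. Data processing (this includes: 2.1 handling missing values, 2.2 identifying the feature values and the target value, 2.3 splitting the data)

3. Feature engineering

4. Machine learning (*logistic regression*: a classification method whose output is a value between 0 and 1. Notably, the output of linear regression typically serves as the input to logistic regression: once the linear regression result is computed, the "sigmoid" function is applied to it to obtain the logistic regression output. A threshold that we set then divides the samples into two classes. After determining which of the two classes each sample falls into, we compare the predictions with the true labels and compute metrics such as accuracy and precision.)

5. Model evaluation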
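The sigmoid-plus-threshold idea from step 4 can be sketched in a few lines. This is only a minimal illustration: the `linear_output` values are made up, and the 0.5 threshold is an assumed choice, not something fixed by the method.

```python
import numpy as np


def sigmoid(z):
    """Map real-valued linear outputs into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical linear-regression outputs (w.x + b) for five samples.
linear_output = np.array([-3.0, -0.5, 0.0, 0.8, 4.0])

# Logistic regression output: a value between 0 and 1 for each sample.
probabilities = sigmoid(linear_output)

# Assumed threshold of 0.5: at or above it -> class 1, below it -> class 0.
threshold = 0.5
predicted_class = (probabilities >= threshold).astype(int)

print(probabilities)
print(predicted_class)
```

Raising or lowering `threshold` shifts where the line between the two classes is drawn, which in turn changes the evaluation metrics.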

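The API first:

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
# pandas, numpy, the data-splitting API and the standardization API are not explained again here; see day 2 for details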

The data set used here is the breast cancer data set below, which records whether each tumor is benign or malignant:

Index of /ml/machine-learning-databases/breast-cancer-wisconsin

names = ['Sample code number', 'Clump Thickness', 'Uniformity of Cell Size', 'Uniformity of Cell Shape','Marginal Adhesion', 'Single Epithelial Cell Size', 'Bare Nuclei', 'Bland Chromatin','Normal Nucleoli', 'Mitoses', 'Class']

data = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data",names=names)
# Data acquisition and tidying (not processing yet): pandas reads the data from the web and attaches the column labels in "names"

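# Data processing: "?" marks missing values in this data set
data = data.replace(to_replace="?", value=np.nan)
data = data.dropna()
x = data.iloc[:, 1:10]  # feature columns (the sample code number is skipped)
y = data["Class"]  # target column
# Split the data
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=22)
# Feature engineering: fit the scaler on the training set only, then reuse it on the test set
transfer = StandardScaler()
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)
# Machine learning
estimator = LogisticRegression()
estimator.fit(x_train, y_train)
# Model evaluation: score() returns the accuracy on the test set
print(estimator.score(x_test, y_test))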
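To make the evaluation step a little more concrete, here is a small sketch on made-up data (the arrays `x_demo` and `y_demo` and the name `demo_model` are hypothetical, not part of the breast cancer data). It shows that `score()` on a classifier is the same number as `accuracy_score` on its predictions, and that `predict_proba` exposes the 0-to-1 values behind each prediction.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Tiny made-up data set with one feature; labels reuse the 2 / 4 coding.
x_demo = np.array([[1.0], [2.0], [3.0], [7.0], [8.0], [9.0]])
y_demo = np.array([2, 2, 2, 4, 4, 4])

demo_model = LogisticRegression()
demo_model.fit(x_demo, y_demo)

y_pred = demo_model.predict(x_demo)

# One row per sample, one column per class, ordered as in demo_model.classes_.
proba = demo_model.predict_proba(x_demo)

print(demo_model.classes_)
print(proba)

# score() is the fraction of correctly classified samples (accuracy).
print(demo_model.score(x_demo, y_demo) == accuracy_score(y_demo, y_pred))
```

Accuracy alone does not say which kind of mistake the model makes, which is where the confusion matrix comes in.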

Confusion matrix:

Confusion matrix (Baidu Baike)
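A confusion matrix can be computed with `sklearn.metrics.confusion_matrix`. The sketch below uses made-up labels (`y_true` and `y_pred` are hypothetical, not model output) with the data set's coding of 2 for benign and 4 for malignant, and treats malignant as the positive class, which is an assumption made for this example.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Made-up labels using the data set's coding: 2 = benign, 4 = malignant.
y_true = [4, 4, 4, 2, 2, 2, 2, 4]
y_pred = [4, 4, 2, 2, 2, 4, 2, 4]

# labels=[4, 2] puts the malignant (positive) class in the first row and column.
cm = confusion_matrix(y_true, y_pred, labels=[4, 2])
tp, fn = cm[0]  # true malignant: predicted malignant / predicted benign
fp, tn = cm[1]  # true benign: predicted malignant / predicted benign

precision = precision_score(y_true, y_pred, pos_label=4)  # tp / (tp + fp)
recall = recall_score(y_true, y_pred, pos_label=4)  # tp / (tp + fn)

print(cm)
print(precision, recall)
```

For a cancer screening task, recall on the malignant class is usually the number to watch, since a missed malignant case (a false negative) is the costlier error.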