吴裕雄 python 机器学习——数据预处理流水线Pipeline模型

from sklearn.svm import LinearSVC

from sklearn.pipeline import Pipeline

from sklearn import neighbors, datasets

from sklearn.datasets import load_digits

from sklearn.linear_model import LogisticRegression

from sklearn.model_selection import train_test_split

def load_diabetes():

    #使用 scikit-learn 自带的一个糖尿病病人的数据集

    diabetes = datasets.load_diabetes()

    # 拆分成训练集和测试集，测试集大小为原始数据集大小的 1/4

    return train_test_split(diabetes.data,diabetes.target,test_size=0.25,random_state=0)  

#数据预处理流水线Pipeline模型

def test_Pipeline(X_train,X_test,y_train,y_test):

    steps=[("Linear_SVM",LinearSVC(C=1,penalty='l1',dual=False)),("LogisticRegression",LogisticRegression(C=1))]

    pipeline=Pipeline(steps)

    pipeline.fit(X_train,y_train)

    print("Named steps:",pipeline.named_steps)

    print("Pipeline Score:",pipeline.score(X_test,y_test))

# 获取分类数据

X_train,X_test,y_train,y_test=load_diabetes()

# 调用 test_Pipeline

test_Pipeline(X_train,X_test,y_train,y_test)

吴裕雄 python 机器学习——数据预处理流水线Pipeline模型的更多相关文章

吴裕雄 python 机器学习——数据预处理正则化Normalizer模型
from sklearn.preprocessing import Normalizer #数据预处理正则化Normalizer模型 def test_Normalizer(): X=[[1,2,3, ...
吴裕雄 python 机器学习——数据预处理标准化MaxAbsScaler模型
from sklearn.preprocessing import MaxAbsScaler #数据预处理标准化MaxAbsScaler模型 def test_MaxAbsScaler(): X=[[ ...
吴裕雄 python 机器学习——数据预处理标准化StandardScaler模型
from sklearn.preprocessing import StandardScaler #数据预处理标准化StandardScaler模型 def test_StandardScaler() ...
吴裕雄 python 机器学习——数据预处理标准化MinMaxScaler模型
from sklearn.preprocessing import MinMaxScaler #数据预处理标准化MinMaxScaler模型 def test_MinMaxScaler(): X=[[ ...
吴裕雄 python 机器学习——数据预处理字典学习模型
from sklearn.decomposition import DictionaryLearning #数据预处理字典学习DictionaryLearning模型 def test_Diction ...
吴裕雄 python 机器学习——数据预处理过滤式特征选取SelectPercentile模型
from sklearn.feature_selection import SelectPercentile,f_classif #数据预处理过滤式特征选取SelectPercentile模型 def ...
吴裕雄 python 机器学习——数据预处理过滤式特征选取VarianceThreshold模型
from sklearn.feature_selection import VarianceThreshold #数据预处理过滤式特征选取VarianceThreshold模型 def test_Va ...
吴裕雄 python 机器学习——数据预处理二元化OneHotEncoder模型
from sklearn.preprocessing import OneHotEncoder #数据预处理二元化OneHotEncoder模型 def test_OneHotEncoder(): X ...
吴裕雄 python 机器学习——数据预处理二元化Binarizer模型
from sklearn.preprocessing import Binarizer #数据预处理二元化Binarizer模型 def test_Binarizer(): X=[[1,2,3,4,5 ...

随机推荐

bat代码中判断 bat是否是以管理员权限运行，以及自动以管理员权限运行CMD BAT
· bat 代码中判断bat是否是以管理员权限运行,以及自动以管理员权限运行CMD BAT 一.判断bat是否是以管理员权限运行 @echo off net.exe session >NUL & ...
关于gets读入因为缓冲区出现的问题
今天被一个同学丢了代码求debug 然后发现bug挺有意思的,稍微记录一下首先我们读入的东西都会被丢进缓冲区等待接收,比如abc\n,如果你使用scanf读入的话,它在读入到\n的时候就会提取它需要 ...
MyEclipse 安装 emmet 插件
1.在线安装地址:http://download.emmet.io/eclipse/updates/ 安装完成后重新启动myeclipse 2.离线安装下载jar包:https://github. ...
HDU 2586 ( LCA/tarjan算法模板)
链接:http://acm.hdu.edu.cn/showproblem.php?pid=2586 题意:n个村庄构成一棵无根树,q次询问,求任意两个村庄之间的最短距离思路:求出两个村庄的LCA,d ...
线程同步器CountDownLatch
Java程序有的时候在主线程中会创建多个线程去执行任务,然后在主线程执行完毕之前,把所有线程的任务进行汇总,以前可以用线程的join方法,但是这个方法不够灵活,我们可以使用CountDownLatch ...
HTML列表标签
<ul>无序列表有2个属性 1.compact 属性: 规定列表呈现的效果比正常情况更小巧.没啥作用 2.type 属性 disc小圆点 square小方块 circle小圆圈(默认) ...
IntelliJ IDEA 2017.3-----导入jar包
1.选择jar包 2.右键选择 3.点击ok
Java-POJ1012-Joseph
打表啦约瑟夫环,处理时下表统一为从0开始更方便! import java.util.Scanner; public class poj1012 { public static boolean cal ...
C语言 strlen
C语言 strlen #include <string.h> size_t strlen(const char *s); 功能:计算指定指定字符串s的长度,不包含字符串结束符‘\0’ 参数 ...
.net core 2.2 使用imagemagick 将pdf转化为png
工作需要将PDF文件每一页拆分为一个一个的png文件测试环境:mac,visual studio for mac 2019 nuget:magick.net-Q16-AnyCPU 不能直接支持PDF ...

吴裕雄 python 机器学习——数据预处理流水线Pipeline模型

吴裕雄 python 机器学习——数据预处理流水线Pipeline模型的更多相关文章

随机推荐

热门专题