吴裕雄 Python Machine Learning: Wrapper Feature Selection Models for Data Preprocessing

from sklearn.svm import LinearSVC
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE, RFECV
from sklearn.model_selection import train_test_split

# Data preprocessing: wrapper feature selection with the RFE model
def test_RFE():
    iris=load_iris()
    X=iris.data
    y=iris.target
    estimator=LinearSVC()
    # Recursively drop the weakest feature until only 2 remain
    selector=RFE(estimator=estimator,n_features_to_select=2)
    selector.fit(X,y)
    print("N_features %s"%selector.n_features_)
    print("Support is %s"%selector.support_)
    print("Ranking %s"%selector.ranking_)

# Call test_RFE()
test_RFE()
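The boolean mask in `support_` is easier to read once it is mapped back to column names. The following is a minimal sketch of that step, not part of the original post: the helper name `selected_feature_names` is hypothetical, and `max_iter=10000` and `random_state=0` are assumptions added to reduce convergence warnings and make the run repeatable.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.svm import LinearSVC


def selected_feature_names(n_features_to_select=2):
    """Hypothetical helper: fit RFE on iris and return the kept feature names."""
    iris = load_iris()
    # max_iter and random_state are assumptions, not values from the original code
    estimator = LinearSVC(max_iter=10000, random_state=0)
    selector = RFE(estimator=estimator, n_features_to_select=n_features_to_select)
    selector.fit(iris.data, iris.target)
    # support_ is a boolean mask aligned with the columns of iris.data
    names = [name for name, keep in zip(iris.feature_names, selector.support_) if keep]
    return names, selector.ranking_


names, ranking = selected_feature_names()
print("Selected features:", names)
print("Ranking:", ranking)
```

Features with rank 1 are the ones kept; higher ranks mark features that were eliminated earlier.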

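# Data preprocessing: wrapper feature selection with the RFECV model
def test_RFECV():
    iris=load_iris()
    X=iris.data
    y=iris.target
    estimator=LinearSVC()
    # RFECV picks the number of features by 3-fold cross-validation
    selector=RFECV(estimator=estimator,cv=3)
    selector.fit(X,y)
    print("N_features %s"%selector.n_features_)
    print("Support is %s"%selector.support_)
    print("Ranking %s"%selector.ranking_)
    # grid_scores_ was removed in scikit-learn 1.2; cv_results_ replaces it
    print("Mean CV Scores %s"%selector.cv_results_["mean_test_score"])

# Call test_RFECV()
test_RFECV()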

def test_compare_with_no_feature_selection():
    '''
    Compare the predictive performance of LinearSVC on the dataset
    with and without feature selection.
    '''
    ### Load the data
    iris=load_iris()
    X,y=iris.data,iris.target
    ### Feature selection
    # Note: the selector is fitted on the full dataset before the split,
    # so the test rows also influence which features are kept.
    estimator=LinearSVC()
    selector=RFE(estimator=estimator,n_features_to_select=2)
    X_t=selector.fit_transform(X,y)
    ### Split into training and test sets
    X_train,X_test,y_train,y_test=train_test_split(X, y,test_size=0.25,random_state=0,stratify=y)
    X_train_t,X_test_t,y_train_t,y_test_t=train_test_split(X_t, y,test_size=0.25,random_state=0,stratify=y)
    ### Train and evaluate
    clf=LinearSVC()
    clf_t=LinearSVC()
    clf.fit(X_train,y_train)
    clf_t.fit(X_train_t,y_train_t)
    print("Original DataSet: test score=%s"%(clf.score(X_test,y_test)))
    print("Selected DataSet: test score=%s"%(clf_t.score(X_test_t,y_test_t)))

# Call test_compare_with_no_feature_selection()
test_compare_with_no_feature_selection()
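The comparison above runs RFE on all samples before splitting, so the held-out rows take part in choosing the features. One possible variant, sketched here rather than taken from the original post, wraps the selector and the classifier in a `Pipeline` so that selection is fitted on the training split only. The function name `compare_with_pipeline`, `max_iter=10000` and the `random_state=0` passed to `LinearSVC` are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC


def compare_with_pipeline():
    """Hypothetical variant: fit RFE on the training split only."""
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0, stratify=y)

    # Baseline: all four features
    baseline = LinearSVC(max_iter=10000, random_state=0)
    baseline.fit(X_train, y_train)

    # Selection + classifier; both steps only ever see the training rows in fit()
    pipe = Pipeline([
        ("select", RFE(estimator=LinearSVC(max_iter=10000, random_state=0),
                       n_features_to_select=2)),
        ("clf", LinearSVC(max_iter=10000, random_state=0)),
    ])
    pipe.fit(X_train, y_train)

    return baseline.score(X_test, y_test), pipe.score(X_test, y_test), pipe


score_all, score_selected, pipe = compare_with_pipeline()
print("Original DataSet: test score=%s" % score_all)
print("Selected DataSet (pipeline): test score=%s" % score_selected)
```

Calling `pipe.score` applies the already-fitted feature mask to the test rows before prediction, so no separate transformed copy of the data needs to be split.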
