Science Stands Up to the Test of Practice: Accurately Predicting Today's Market Trend with a Decision Tree in Python 3.6 (Code Included)

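Spring has its hundred flowers, autumn its moon; summer has cool breezes, winter has snow.

If no idle worries weigh on the mind, this is the best season of human life.

　　-- Wumen Huikai, Song dynasty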

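Without further ado: the model below is trained on recent real readings of the 极致800 (Jizhi 800) real-time index, an index of my own devising, and the predictions use today's actual data.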
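The four prediction inputs at the end of the script hint at how today's data is assembled: each vector holds four groups of four readings, one group per snapshot (9:41, 10:41, 13:41, 14:41), and groups for snapshots that have not happened yet repeat the latest one. The sketch below reproduces that layout. The padding rule is inferred from those four vectors rather than stated in the post, and `build_vector` is a hypothetical helper, not part of the original script.

```python
def build_vector(snapshots, slots=4):
    """Flatten the snapshots seen so far into one feature vector.

    Assumption (inferred from the post's test vectors): slots that have no
    snapshot yet are filled with a copy of the most recent snapshot.
    """
    if not snapshots or len(snapshots) > slots:
        raise ValueError("expected between 1 and %d snapshots" % slots)
    padded = list(snapshots) + [snapshots[-1]] * (slots - len(snapshots))
    return [reading for snapshot in padded for reading in snapshot]


# Readings copied from the post's four prediction calls.
today = [[1, 6, 156, 169], [4, 9, 129, 263], [5, 12, 123, 306], [6, 13, 99, 397]]

at_0941 = build_vector(today[:1])   # only the 9:41 snapshot is known
at_1041 = build_vector(today[:2])   # 9:41 and 10:41 are known
print(at_0941)
print(at_1041)
```

Under this reading, the 9:41 input is simply the first snapshot repeated four times, and each later input replaces one more padded group with a real snapshot.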

# coding=utf-8
__author__ = 'huangzhi'

import math
import operator

def calcShannonEnt(dataset):
    """Return the Shannon entropy (in bits) of the class labels in dataset."""
    numEntries = len(dataset)
    # Count how often each class label (the last column) occurs.
    labelCounts = {}
    for featVec in dataset:
        currentLabel = featVec[-1]
        if currentLabel not in labelCounts:
            labelCounts[currentLabel] = 0
        labelCounts[currentLabel] += 1
    # H = -sum(p * log2(p)) over all classes.
    shannonEnt = 0.0
    for key in labelCounts:
        prob = float(labelCounts[key]) / numEntries
        shannonEnt -= prob * math.log(prob, 2)
    return shannonEnt

def CreateDataSet():
    # dataset = [[1, 1, 'yes'],
    #            [1, 1, 'yes'],
    #            [1, 0, 'no'],
    #            [0, 1, 'no'],
    #            [0, 1, 'no']]

    # Each row: 16 numeric features followed by the class label.
    # Class labels: '跌' = the market falls, '涨' = the market rises.
    dataset = [[3, 4, 100, 85, 4, 6, 110, 120, 4, 6, 151, 122, 8, 12, 110, 185, '跌'],
               [5, 7, 88, 85, 6, 8, 100, 130, 6, 9, 131, 132, 8, 14, 100, 195, '跌'],
               [6, 2, 60, 20, 9, 3, 80, 22, 16, 4, 131, 32, 33, 5, 160, 45, '涨'],
               [3, 4, 100, 105, 4, 6, 110, 120, 4, 6, 151, 122, 8, 12, 110, 185, '跌'],
               [5, 3, 50, 30, 8, 4, 70, 28, 12, 6, 101, 42, 28, 7, 120, 35, '涨'],
               [2, 6, 60, 95, 4, 8, 90, 130, 6, 11, 101, 142, 9, 15, 99, 145, '跌'],
               [5, 3, 70, 30, 8, 4, 90, 32, 22, 6, 141, 42, 43, 8, 150, 65, '涨'],
               [2, 8, 30, 60, 9, 8, 80, 90, 9, 20, 140, 160, 12, 32, 101, 205, '跌']]
    # One name per feature column (the numbering skips 'l10').
    labels = ['l1', 'l2', 'l3', 'l4', 'l5', 'l6', 'l7', 'l8', 'l9', 'l11', 'l12', 'l13', 'l14', 'l15', 'l16', 'l17']
    return dataset, labels

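def splitDataSet(dataSet, axis, value):
    """Return the rows whose feature at index axis equals value, with that column removed."""
    retDataSet = []
    for featVec in dataSet:
        if featVec[axis] == value:
            # Copy the row without the column that was split on.
            reducedFeatVec = featVec[:axis]
            reducedFeatVec.extend(featVec[axis + 1:])
            retDataSet.append(reducedFeatVec)
    return retDataSet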

def chooseBestFeatureToSplit(dataSet):
    """Return the index of the feature with the highest information gain."""
    numberFeatures = len(dataSet[0]) - 1  # the last column is the class label
    baseEntropy = calcShannonEnt(dataSet)
    bestInfoGain = 0.0
    bestFeature = -1
    for i in range(numberFeatures):
        featList = [example[i] for example in dataSet]
        uniqueVals = set(featList)
        # Weighted entropy of the subsets produced by splitting on feature i.
        newEntropy = 0.0
        for value in uniqueVals:
            subDataSet = splitDataSet(dataSet, i, value)
            prob = len(subDataSet) / float(len(dataSet))
            newEntropy += prob * calcShannonEnt(subDataSet)
        infoGain = baseEntropy - newEntropy
        # Strict comparison: on a tie, the earlier feature is kept.
        if infoGain > bestInfoGain:
            bestInfoGain = infoGain
            bestFeature = i
    return bestFeature

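def majorityCnt(classList):
    """Return the class label that occurs most often in classList."""
    classCount = {}
    for vote in classList:
        if vote not in classCount:
            classCount[vote] = 0
        classCount[vote] += 1  # accumulate the count instead of resetting it to 1
    # dict.iteritems() does not exist in Python 3; use items().
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    return sortedClassCount[0][0]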

def createTree(dataSet, inputlabels):
    """Recursively build the decision tree as nested dicts."""
    labels = inputlabels[:]  # work on a copy so the caller's list is untouched
    classList = [example[-1] for example in dataSet]
    # All rows share one class: return it as a leaf.
    if classList.count(classList[0]) == len(classList):
        return classList[0]
    # No features left to split on: fall back to a majority vote.
    if len(dataSet[0]) == 1:
        return majorityCnt(classList)
    bestFeat = chooseBestFeatureToSplit(dataSet)
    bestFeatLabel = labels[bestFeat]
    myTree = {bestFeatLabel: {}}
    del labels[bestFeat]
    featValues = [example[bestFeat] for example in dataSet]
    uniqueVals = set(featValues)
    # One branch per distinct value of the chosen feature.
    for value in uniqueVals:
        subLabels = labels[:]
        myTree[bestFeatLabel][value] = createTree(splitDataSet(dataSet, bestFeat, value), subLabels)
    return myTree

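def classify(inputTree, featLabels, testVec):
    """Walk the tree for testVec; return None if no branch matches its value."""
    firstStr = list(inputTree.keys())[0]
    secondDict = inputTree[firstStr]
    featIndex = featLabels.index(firstStr)
    # Branches match on exact equality, so a value that never occurred in
    # training matches nothing; start from None to avoid an unbound variable.
    classLabel = None
    for key in secondDict.keys():
        if testVec[featIndex] == key:
            if isinstance(secondDict[key], dict):
                classLabel = classify(secondDict[key], featLabels, testVec)
            else:
                classLabel = secondDict[key]
    return classLabel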

myDat, labels = CreateDataSet()
# print(calcShannonEnt(myDat))
# print(splitDataSet(myDat, 1, 1))
# print(chooseBestFeatureToSplit(myDat))

myTree = createTree(myDat, labels)

# Predict from the actual data at 9:41 a.m.
print(classify(myTree, labels, [1, 6, 156, 169, 1, 6, 156, 169, 1, 6, 156, 169, 1, 6, 156, 169]))

# Predict from the actual data at 10:41 a.m.
print(classify(myTree, labels, [1, 6, 156, 169, 4, 9, 129, 263, 4, 9, 129, 263, 4, 9, 129, 263]))

# Predict from the actual data at 1:41 p.m.
print(classify(myTree, labels, [1, 6, 156, 169, 4, 9, 129, 263, 5, 12, 123, 306, 5, 12, 123, 306]))

# Predict from the actual data at 2:41 p.m.
print(classify(myTree, labels, [1, 6, 156, 169, 4, 9, 129, 263, 5, 12, 123, 306, 6, 13, 99, 397]))

The output is as follows ('跌' means the market falls):

D:\Programs\Python\Python36-64\python.exe D:/pyfenlei/决策树/jcs4.py
跌
跌
跌
跌
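The four answers agree because of how the tree comes out on this training set. Tracing the code by hand suggests that the second column (index 1, named 'l2') already separates the eight training rows perfectly, so the split stops after that single feature and classification reduces to looking up one value. The sketch below checks this using only that column and the class labels copied from `CreateDataSet()`; `branches` and `lookup` are illustrative names, not part of the original script.

```python
from collections import defaultdict

# Column index 1 and the class label of each training row, copied from the
# dataset in CreateDataSet(). '跌' = falls, '涨' = rises.
second_feature = [4, 7, 2, 4, 3, 6, 3, 8]
outcomes = ['跌', '跌', '涨', '跌', '涨', '跌', '涨', '跌']

# Group the class labels by feature value, as splitDataSet does for one axis.
branches = defaultdict(set)
for value, outcome in zip(second_feature, outcomes):
    branches[value].add(outcome)

# Every branch holds a single class, so this split leaves zero entropy.
assert all(len(classes) == 1 for classes in branches.values())


def lookup(value, default=None):
    """Return the class for a value, or `default` if it was never seen."""
    classes = branches.get(value)
    return next(iter(classes)) if classes else default


print(lookup(6))   # the value at index 1 in all four test vectors
print(lookup(5))   # a value absent from the training data: no branch matches
```

All four test vectors carry 6 at index 1, so they land on the same branch. A value that never occurred in training matches no branch at all, since `classify` compares values for exact equality.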
