Spark Naive Bayes
Training code (Scala)
import org.apache.spark.mllib.classification.{NaiveBayes, NaiveBayesModel}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.{SparkConf, SparkContext}

// Named NaiveBayesExample so the object does not clash with the imported MLlib NaiveBayes.
object NaiveBayesExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local")
      .setAppName("NaiveBayes")
    val sc = new SparkContext(conf)
    val path = "../data/sample_football_weather.txt"
    val data = sc.textFile(path)
    // Each line is parsed as: label, a comma, then space-separated feature values.
    val parsedData = data.map { line =>
      val parts = line.split(',')
      LabeledPoint(parts(0).toDouble, Vectors.dense(parts(1).split(' ').map(_.toDouble)))
    }
    // Split the samples: 60% for training, 40% for testing.
    val splits = parsedData.randomSplit(Array(0.6, 0.4), seed = 11L)
    val training = splits(0)
    val test = splits(1)
    // Train the model. The first argument is the data; the second is the
    // smoothing parameter lambda (default 1.0), which can be changed.
    val model = NaiveBayes.train(training, lambda = 1.0)
    // Predict on the test samples and measure the model's accuracy.
    val predictionAndLabel = test.map(p => (model.predict(p.features), p.label))
    val accuracy = 1.0 * predictionAndLabel.filter(x => x._1 == x._2).count() / test.count()
    // Print the accuracy.
    println("NaiveBayes accuracy -----> " + accuracy)
    // Print a single prediction: on a day that is sunny (0), cool (2), high (0), high (1), play football or not?
    println("Sunny (0), cool (2), high (0), high (1); play football or not: " + model.predict(Vectors.dense(0.0, 2.0, 0.0, 1.0)))
    // Save the model.
    val modelPath = "../model/NaiveBayes_model.obj"
    model.save(sc, modelPath)
    // val testModel = NaiveBayesModel.load(sc, modelPath)
    sc.stop()
  }
}
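The sample file itself is not shown. Judging from the parsing code, each line is presumably a label, then a comma, then space-separated numeric features. Here is a minimal Spark-free sketch of that parsing step. The sample line and the helper name parseLine are made up for illustration.

```scala
object ParseLineSketch {
  // Hypothetical helper: mirrors the split logic of the training code,
  // but returns plain Scala values instead of a LabeledPoint.
  def parseLine(line: String): (Double, Array[Double]) = {
    val parts = line.split(',')
    val label = parts(0).toDouble
    val features = parts(1).trim.split(' ').map(_.toDouble)
    (label, features)
  }

  def main(args: Array[String]): Unit = {
    // Assumed sample line: label 1, followed by four encoded weather features.
    val (label, features) = parseLine("1,0 2 0 1")
    println(s"label=$label features=${features.mkString("[", ", ", "]")}")
  }
}
```

A line that lacks the comma, or that contains a non-numeric token, would throw an exception here, just as it would inside the map in the Spark job.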
With smoothing parameter $\lambda$, the class prior estimate is adjusted to (standard additive-smoothing form):
$$P(c) = \frac{N_c + \lambda}{N + K\lambda}$$
where $N_c$ is the number of training samples in class $c$, $N$ is the total number of training samples, and $K$ is the number of classes.
Under the multinomial model, the parameter estimate is adjusted to:
$$P(x_i \mid c) = \frac{T_{ci} + \lambda}{T_c + n\lambda}$$
where $T_{ci}$ is the summed value of feature $i$ over the samples of class $c$, $T_c = \sum_i T_{ci}$, and $n$ is the number of features.
Under the Bernoulli model, the parameter estimate is adjusted to:
$$P(x_i = 1 \mid c) = \frac{N_{ci} + \lambda}{N_c + 2\lambda}$$
where $N_{ci}$ is the number of samples of class $c$ in which feature $i$ is present.
Laplace smoothing
This is the lambda argument in the code's call NaiveBayes.train(training, lambda = 1.0).
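To see what the smoothing parameter does, here is a minimal plain-Scala sketch of the usual additive (Laplace) smoothing estimates. The function names and toy counts are made up for illustration, and this is not Spark's internal code. It only shows why a positive lambda keeps an unseen feature from getting zero probability.

```scala
object LaplaceSmoothingSketch {
  // Smoothed class prior: (N_c + lambda) / (N + K * lambda),
  // with N_c samples in the class, N samples in total, and K classes.
  def prior(classCount: Long, total: Long, numClasses: Int, lambda: Double): Double =
    (classCount + lambda) / (total + numClasses * lambda)

  // Smoothed multinomial likelihood: (T_ci + lambda) / (T_c + n * lambda),
  // with T_ci the summed value of feature i in class c, T_c the sum over
  // all features in class c, and n the number of features.
  def multinomial(featureSum: Double, classFeatureSum: Double,
                  numFeatures: Int, lambda: Double): Double =
    (featureSum + lambda) / (classFeatureSum + numFeatures * lambda)

  def main(args: Array[String]): Unit = {
    // Toy numbers (assumed): 10 samples, 2 classes, 6 samples in the class.
    println(prior(6L, 10L, 2, 1.0))            // (6 + 1) / (10 + 2)
    // A feature never observed in the class, with and without smoothing.
    println(multinomial(0.0, 8.0, 4, 1.0))     // (0 + 1) / (8 + 4)
    println(multinomial(0.0, 8.0, 4, 0.0))     // 0 / 8
  }
}
```

With lambda = 0.0 the unseen feature gets probability zero, which would zero out the whole product for that class. With lambda = 1.0 it stays small but positive.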