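I wanted to learn about SVMs, so I found LIBSVM -- A Library for Support Vector Machines, and started by reading "A practical guide to SVM classification" provided on its website.

Below are what I personally consider the key takeaways.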

An SVM is: a technique for data classification.

The goal is: to produce a model (based on the training data) that predicts the target values of the test data given only the test data attributes.

Kernels: four basic kernels (linear, polynomial, RBF, and sigmoid).

Proposed Procedure:

1.transform data to the format of an SVM package

  Categorical attributes first have to be converted into numeric data. The guide recommends using m numbers to represent an m-category attribute, where only one of the m numbers is one and the others are zeros. For example, {red, green, blue} can be represented as (0,0,1), (0,1,0), and (1,0,0).
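  As a minimal sketch of this step, the snippet below one-hot encodes a categorical value and writes one instance in the sparse `label index:value` text format that LIBSVM reads. The helper names `one_hot` and `to_libsvm_line` and the sample values are made up for illustration; the category order is chosen so that red maps to (0,0,1) as in the example above.

```python
def one_hot(value, categories):
    """Return a list with a 1 at the position of `value` and 0 elsewhere."""
    return [1 if value == c else 0 for c in categories]


def to_libsvm_line(label, features):
    """Format one instance as 'label index:value ...'.

    Indices are 1-based and zero-valued features are omitted (sparse format).
    """
    parts = [str(label)]
    for index, value in enumerate(features, start=1):
        if value != 0:
            parts.append(f"{index}:{value:g}")
    return " ".join(parts)


# Order chosen so that red -> (0,0,1), green -> (0,1,0), blue -> (1,0,0).
colors = ["blue", "green", "red"]

# One numeric attribute followed by the encoded color (illustrative values).
row = [0.5] + one_hot("green", colors)
line = to_libsvm_line(1, row)
print(line)
```

  Any fixed category order works, as long as the same order is used for both the training and the testing data.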

2.conduct simple scaling on the data

  Note: It is important to use the same scaling factors for the training and testing sets.
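  The note above can be sketched as follows: the per-feature minimum and maximum are computed from the training set only and then reused unchanged on the test set. The function names `fit_scaling` and `apply_scaling`, the target range [-1, 1], and the sample numbers are assumptions for illustration, not part of LIBSVM (which ships its own `svm-scale` tool).

```python
def fit_scaling(rows, lower=-1.0, upper=1.0):
    """Compute per-feature (min, max) from the training rows only."""
    columns = list(zip(*rows))
    ranges = [(min(col), max(col)) for col in columns]
    return ranges, (lower, upper)


def apply_scaling(rows, params):
    """Linearly map each feature using the stored training ranges."""
    ranges, (lower, upper) = params
    scaled = []
    for row in rows:
        new_row = []
        for value, (lo, hi) in zip(row, ranges):
            if hi == lo:
                # Constant feature in training data: arbitrary choice here.
                new_row.append(lower)
            else:
                new_row.append(lower + (upper - lower) * (value - lo) / (hi - lo))
        scaled.append(new_row)
    return scaled


train = [[-10.0, 100.0], [10.0, 300.0]]
test = [[-11.0, 200.0], [8.0, 350.0]]

params = fit_scaling(train)                # factors come from training data
train_scaled = apply_scaling(train, params)
test_scaled = apply_scaling(test, params)  # same factors reused for testing
print(train_scaled, test_scaled)
```

  Test values may fall slightly outside [-1, 1] (here the first test feature, -11, maps to about -1.1). That is expected: rescaling the test set with its own minimum and maximum would distort it relative to the training set.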

3.consider the RBF kernel $K(x, y) = e^{-r\|x - y\|^2}$, where r > 0 is the kernel parameter (written γ in the guide)
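  To make the formula concrete, here is a small sketch that evaluates the RBF kernel for two vectors given as plain Python lists. The function name `rbf_kernel` is chosen for illustration; `r` is the kernel parameter from the formula above.

```python
import math


def rbf_kernel(x, y, r):
    """Return exp(-r * ||x - y||^2) for two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-r * squared_distance)


same = rbf_kernel([1.0, 2.0], [1.0, 2.0], r=0.5)  # identical points
apart = rbf_kernel([0.0, 0.0], [1.0, 1.0], r=0.5)  # squared distance is 2
print(same, apart)
```

  Identical points give a kernel value of exactly 1, and the value decays towards 0 as the points move apart; a larger r makes that decay faster.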

4.use cross-validation to find the best parameters C and r

  The cross-validation procedure can prevent the overfitting problem. The guide recommends a "grid-search" on C and r using cross-validation: various pairs of (C, r) values are tried, and the one with the best cross-validation accuracy is picked. First use a coarse grid to identify a promising region of the grid, then conduct a finer grid search on that region.

  For very large data sets, a feasible approach is to randomly choose a subset of the data set, conduct a grid-search on it, and then do a grid-search restricted to the promising region on the complete data set.
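  The coarse-then-fine search can be sketched as below. This is only an outline under stated assumptions: `cv_accuracy` is a hypothetical callable that would train an SVM with the given (C, r) and return its cross-validation accuracy, and `toy_cv_accuracy` is a made-up stand-in (not a real SVM) so the sketch runs on its own. The exponent ranges follow the exponentially growing sequences suggested in the guide (C = 2^-5, 2^-3, ..., 2^15 and r = 2^-15, 2^-13, ..., 2^3); the fine-grid width and step are illustrative choices.

```python
import math


def exp_grid(lo, hi, step):
    """Return the exponents lo, lo + step, ..., hi (inclusive)."""
    count = int(round((hi - lo) / step))
    return [lo + i * step for i in range(count + 1)]


def grid_search(cv_accuracy, log2_c_values, log2_r_values):
    """Try every (C, r) pair; return (best_accuracy, log2_C, log2_r)."""
    best = None
    for log2_c in log2_c_values:
        for log2_r in log2_r_values:
            accuracy = cv_accuracy(2.0 ** log2_c, 2.0 ** log2_r)
            if best is None or accuracy > best[0]:
                best = (accuracy, log2_c, log2_r)
    return best


def coarse_to_fine(cv_accuracy):
    """Run a coarse grid, then a finer grid around the coarse optimum."""
    _, log2_c, log2_r = grid_search(
        cv_accuracy, exp_grid(-5, 15, 2), exp_grid(-15, 3, 2)
    )
    return grid_search(
        cv_accuracy,
        exp_grid(log2_c - 2, log2_c + 2, 0.25),
        exp_grid(log2_r - 2, log2_r + 2, 0.25),
    )


def toy_cv_accuracy(c, r):
    """Stand-in score, NOT a real SVM: peaks at C = 2^3.5, r = 2^-6.25."""
    return 1.0 / (1.0 + (math.log2(c) - 3.5) ** 2 + (math.log2(r) + 6.25) ** 2)


accuracy, best_log2_c, best_log2_r = coarse_to_fine(toy_cv_accuracy)
print(best_log2_c, best_log2_r, accuracy)
```

  In real use, `toy_cv_accuracy` would be replaced by a function that runs v-fold cross-validation with the SVM package. For very large data sets, the coarse pass could be given a scorer built on a random subset and the fine pass a scorer built on the complete data.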

5.use the best parameters C and r to train on the whole training set

6.Test

When to use the linear kernel instead of the RBF kernel?

  If the number of features is large, one may not need to map the data to a higher dimensional space. That is, the nonlinear mapping does not improve the performance. Using the linear kernel is good enough, and one only searches for the parameter C.

  C.1 Number of instances ≪ number of features

    When the number of features is very large, one may not need to map the data.

  C.2 Both numbers of instances and features are large

    Such data often occur in document classification. LIBLINEAR is much faster than LIBSVM at obtaining a model with comparable accuracy, and it is efficient for large-scale document classification.

  C.3 Number of instances ≫ number of features

    As the number of features is small, one often maps the data to higher dimensional spaces (i.e., using nonlinear kernels).
