想学习一下SVM,所以找到了LIBSVM--A Library for Support Vector Machines,首先阅读了一下网站提供的A practical guide to SVM classification.

写一写个人认为主要的精华的东西。

SVMs is:a technique for data classification  

Goal is:to produce a model (based on training data) which predicts the target values of the test data given only the test data attributes.

Kernels:four basic kernels

Proposed Procedure:

1.transform data to the format of an SVM package

  first have to convert categorical attributes into numeric data.We recommend using m numbers to represent an m-category attribute and only one of the m numbers is one,and others are zeros. for example {red,green,blue} can be represented as (0,0,1),(0,1,0)and(1,0,0).

2.conduct simple scaling on the data

  Note:It's importance to use the same scaling factors for training and testing sets.

3.consider the RBF kernel K(x,y) = e-r||x-y||2

4.use cross-validation to find the best parameter C and r

  The cross-validation produce can prevent the overfitting problem.We recommend a "grid-search" on C and r using cross-validation.Various pairs of (C,r)values are tried and the one with the best cross-validation accuarcy is picked.Use a coarse grid to make a better region on the grid,a finer grid search on that region can be conducted.

  For very large data sets a feasible approach is to randomly choose a subset of the data set,conduct grid-search on them,and then do a better-region-only grid-search on the completly data set.

5.use the best parameter C and r to train the whole training set

6.Test

When to use Linear but not RBF Kernel ?

  If the number of features is large, one may not need to map data to a higher dimensional space. That is, the nonlinear mapping does not improve the performance.Using the linear kernel is good enough, and one only searches for the parameter C.

  C.1 Number of instances number of features  

    when the number of features is very large, one may not need to map the data.

  C.2 Both numbers of instances and features are large

    Such data often occur in document classication.LIBLINEAR is much faster than LIBSVM to obtain a model with comparable accuracy.LIBLINEAR is efficient for large-scale document classication.

  C.3 Number of instances number of features

    As the number of features is small, one often maps data to higher dimensional spaces(i.e., using nonlinear kernels).

Learn LIBSVM---a practical Guide to SVM classification的更多相关文章

  1. [笔记]A Practical Guide to Support Vector Classi cation

    <A Practical Guide to Support Vector Classication>是一篇libSVM使用入门教程以及一些实用技巧. 1. Basic Kernels: ( ...

  2. A Practical Guide to Support Vector Classi cation

    <A Practical Guide to Support Vector Classication>是一篇libSVM使用入门教程以及一些实用技巧. 1. Basic Kernels: ( ...

  3. A Practical Guide to Distributed Scrum - 分布式Scrum的实用指南 - 读书笔记

    最近读了这本IBM出的<A Practical Guide to Distributed Scrum>(分布式Scrum的实用指南),书中的章节结构比较清楚,是针对Scrum项目进行,一个 ...

  4. 信号处理的好书Digital Signal Processing - A Practical Guide for Engineers and Scientists

    诚心给大家推荐一本讲信号处理的好书<Digital Signal Processing - A Practical Guide for Engineers and Scientists>[ ...

  5. 【SVM】A Practical Guide to Support Vector Classi cation

    零.简介 一般认为,SVM比神经网络要简单. 优化目标:

  6. Putting Apache Kafka To Use: A Practical Guide to Building a Stream Data Platform-part 1

    转自: http://www.confluent.io/blog/stream-data-platform-1/ These days you hear a lot about "strea ...

  7. The Practical Guide to Empathy Maps: 10-Minute User Personas

    That’s where the empathy map comes in. When created correctly, empathy maps serve as the perfect lea ...

  8. Putting Apache Kafka To Use: A Practical Guide to Building a Stream Data Platform-part 2

    转自: http://confluent.io/blog/stream-data-platform-2          http://www.infoq.com/cn/news/2015/03/ap ...

  9. Parsing techniques: a practical guide下载

    轮子哥隆重推荐的书,一行代码.一句公式都没有,但是却什么都讲明白了的:<Parsing Techniques>.第一版官网免费下载,第二版多出来的东西你们用不上不用看了.全书只讲parsi ...

随机推荐

  1. javascrit字符串截取

    昨天遇见一个问题就是一个地址后面加参数第一次是需要添加参数,以后每次点击按钮的时候是替换如果不进行处理的话如果页面不刷新,地址会不断的添加越来越长,所以

  2. JS笔试题

    JS 引用相关题目 以下代码输出什么? 为什么? var a = {n:1}; var b = a; a = {n:2}; a.x = a ; console.log(a.x); console.lo ...

  3. C# 父子类_实例_静态成员变量_构造函数的执行顺序

    今天去面试的时候被一道题问得一点脾气都没有,今天特地来研究下. 子类成员变量,子类静态成员变量,子类构造函数,父类成员变量,父类静态成员变量,父类构造函数的执行顺序. 现在贴上从另外一个.net程序员 ...

  4. Android ListView中带有时间数据的排序

    下面是activity: public class MainActivity extends Activity { private ListView mListView = null; private ...

  5. Linux下查看Web服务器当前的并发连接数和TCP连接状态

    对于web服务器(Nginx.Apache等)来说,并发连接数是一个比较重要的参数,下面就通过netstat命令和awk来查看web服务器的并发连接数以及TCP连接状态. $ netstat -n | ...

  6. LeeCode-Sort Colors

    Given an array with n objects colored red, white or blue, sort them so that objects of the same colo ...

  7. for、foreach和MoveNext循环效率粗比较

    今天没事对for循环.foreach循环.MoveNext循环,执行效率进行了对比:粗略测试代码如下: static void Main(string[] args) { #region 三种方式循环 ...

  8. [Node.js]在windows下不得不防的小错误

    TypeError: Arguments to path.join must be strings at f (path.js:204:15) at Object.filter (native) at ...

  9. javascript 判断系统设备

    <script> function detectOS() { var sUserAgent = navigator.userAgent; var isWin = (navigator.pl ...

  10. Oracle SQL函数之转换函数

    chartorowid(c1) [功能]转换varchar2类型为rowid值 [参数]c1,字符串,长度为18的字符串,字符串必须符合rowid格式 [返回]返回rowid值 [示例] SQL> ...