整理摘自 https://datascience.stackexchange.com/questions/15989/micro-average-vs-macro-average-performance-in-a-multiclass-classification-settin/16001

Micro- and macro-averages (for whatever metric) will compute slightly different things, and thus their interpretation differs. A macro-average will compute the metric independently for each class and then take the average (hence treating all classes equally), whereas a micro-average will aggregate the contributions of all classes to compute the average metric. In a multi-class classification setup, micro-average is preferable if you suspect there might be class imbalance (i.e you may have many more examples of one class than of other classes).

To illustrate why, take for example precision Pr=TP / (TP+FP). Let's imagine you have a One-vs-All(there is only one correct class output per example) multi-class classification system with four classes and the following numbers when tested:

  • Class A: 1 TP and 1 FP
  • Class B: 10 TP and 90 FP
  • Class C: 1 TP and 1 FP
  • Class D: 1 TP and 1 FP

You can see easily that PrA=PrC=PrD=0.5 , whereas PrB=0.1.

  • A macro-average will then compute: Pr=0.5+0.1+0.5+0.54=0.4
  • A micro-average will compute: Pr=1+10+1+12+100+2+2=0.123

宏查准率:这些类别中是否有尽可能多的类别的查准率尽可能高。-- 侧重各个类别是否预测准确

微查准率:这多组实验中,预测准确的数据占总的预测数据的比例。-- 侧重预测准确的数据的比例

These are quite different values for precision. Intuitively, in the macro-average the "good" precision (0.5) of classes A, C and D is contributing to maintain a "decent" overall precision (0.4). While this is technically true (across classes, the average precision is 0.4), it is a bit misleading, since a large number of examples are not properly classified. These examples predominantly correspond to class B, so they only contribute 1/4 towards the average in spite of constituting 94.3% of your test data. The micro-average will adequately capture this class imbalance, and bring the overall precision average down to 0.123 (more in line with the precision of the dominating class B (0.1)).

当class-imblance已知,但仍要采用macro-average时,需要采取的措施:

1. 报告macro-average + standard deviation(标准差) (对于>=3的多分类任务)

2. 加权macro-average  (考虑样本数的影响)

For computational reasons, it may sometimes be more convenient to compute class averages and then macro-average them. If class imbalance is known to be an issue, there are several ways around it. One is to report not only the macro-average, but also its standard deviation (for 3 or more classes). Another is to compute a weighted macro-average, in which each class contribution to the average is weighted by the relative number of examples available for it. In the above scenario, we obtain:

1. Prmacro−mean=0.25·0.5+0.25·0.1+0.25·0.5+0.25·0.5=0.4

Prmacro−stdev=0.173

2. Prmacro−weighted= 2/106 * 0.5 + 100 / 106 * 0.1 + 2 / 106 * 0.5 + 2 / 106 * 0.5

= 0.0189·0.5+0.943·0.1+0.0189·0.5+0.0189·0.5=0.009+0.094+0.009+0.009=0.123

The large standard deviation (0.173) already tells us that the 0.4 average does not stem from a uniform precision among classes, but it might be just easier to compute the weighted macro-average, which in essence is another way of computing the micro-average.

Micro Average vs Macro average Performance in a Multiclass classification setting的更多相关文章

  1. 机器学习--Micro Average,Macro Average, Weighted Average

    根据前面几篇文章我们可以知道,当我们为模型泛化性能选择评估指标时,要根据问题本身以及数据集等因素来做选择.本篇博客主要是解释Micro Average,Macro Average,Weighted A ...

  2. Spark2.0机器学习系列之5:随机森林

    概述 随机森林是决策树的组合算法,基础是决策树,关于决策树和Spark2.0中的代码设计可以参考本人另外一篇博客: http://www.cnblogs.com/itboys/p/8312894.ht ...

  3. Spark2.0机器学习系列之3:决策树

    概述 分类决策树模型是一种描述对实例进行分类的树形结构. 决策树可以看为一个if-then规则集合,具有“互斥完备”性质 .决策树基本上都是 采用的是贪心(即非回溯)的算法,自顶向下递归分治构造. 生 ...

  4. Micro和Macro性能学习【转载】

    转自:https://datascience.stackexchange.com/questions/15989/micro-average-vs-macro-average-performance- ...

  5. Maximum Average Subarray

    Given an array with positive and negative numbers, find the maximum average subarray which length sh ...

  6. 性能分析_linux服务器CPU_Load Average

    CPU度量Load Average 1.  概念介绍 1.1  Linux系统进程状态 在linux中,process有以下状态: runnable (就绪状态):blocked waiting fo ...

  7. LINQ 学习路程 -- 查询操作 Average Count Max Sum

    IList<, , }; var avg = intList.Average(); Console.WriteLine("Average: {0}", avg); IList ...

  8. F1 score,micro F1score,macro F1score 的定义

    F1 score,micro F1score,macro F1score 的定义 2018年09月28日 19:30:08 wanglei_1996 阅读数 976   本篇博客可能会继续更新 最近在 ...

  9. [LeetCode] 805. Split Array With Same Average 用相同均值拆分数组

    In a given integer array A, we must move every element of A to either list B or list C. (B and C ini ...

随机推荐

  1. 【luogu P3390 矩阵快速幂】 模板

    题目链接:https://www.luogu.org/problemnew/show/P3390 首先要明白矩阵乘法是什么 对于矩阵A m*p  与  B p*n 的矩阵 得到C m*n 的矩阵 矩阵 ...

  2. 菜鸟崛起 DB Chapter 4 MySQL 5.6的数据库引擎

    数据库存储引擎是数据库底层的软件组件,我们平常看不到,但是却与我们操作数据库息息相关.DBMS使用数据引擎进行创建.查询.更新和删除数据操作.不同的存储引擎提供不同的存储机制.索引技巧.锁定水平等功能 ...

  3. 如何配置Java环境变量

    百度经验 | 百度知道 | 百度首页 | 登录 | 注册 新闻 网页 贴吧 知道 经验 音乐 图片 视频 地图 百科 文库 帮助   发布经验 首页 分类 任务 回享 商城 特色 知道 百度经验 &g ...

  4. [USACO1.5]数字三角形 Number Triangles

    题目描述 观察下面的数字金字塔. 写一个程序来查找从最高点到底部任意处结束的路径,使路径经过数字的和最大.每一步可以走到左下方的点也可以到达右下方的点. 7 3 8 8 1 0 2 7 4 4 4 5 ...

  5. 【MYSQL笔记3】MYSQL过程式数据库对象之存储过程的调用、删除和修改

    mysql从5.0版本开始支持存储过程.存储函数.触发器和事件功能的实现. 我们以一本书中的例题为例:创建xscj数据库的存储过程,判断两个输入的参数哪个更大.并调用该存储过程. (1)调用 首先,创 ...

  6. spring mvc中几种获取request对象的方式

    在使用spring进行web开发的时候,优势会用到request对象,用来获取访问ip.请求头信息等 这里收集几种获取request对象的方式 方法一:在controller里面的加参数 public ...

  7. python核心编程2 第七章 练习

    7-4. 建立字典. 给定两个长度相同的列表,比如说,列表[1, 2, 3,...]和['abc', 'def','ghi',...],用这两个列表里的所有数据组成一个字典,像这样:{1:'abc', ...

  8. bean工具类

    package com.zq.utils; import java.lang.reflect.Method;import java.util.Arrays;import java.util.Colle ...

  9. jquery如何获取对应表单元素?

    问题描述:我页面中有这样多个表单,我都是这个定义的,当我点击确定按钮时,此时能够获得相对应的表单对象,我该怎么获取到他的两个值呢? 解决方案: 页面元素 <form id="form1 ...

  10. No configuration found for this host:al

    想尝试多个agent来进行传输数据其中的一个配置文件如下: al.sources = r1al.sinks = k1al.channels = c1 al.sources.r1.type = avro ...