Three ways to detect outliers
Z-score
import numpy as np def outliers_z_score(ys):
threshold = 3 mean_y = np.mean(ys)
stdev_y = np.std(ys)
z_scores = [(y - mean_y) / stdev_y for y in ys]
return np.where(np.abs(z_scores) > threshold)
Modified Z-score
import numpy as np def outliers_modified_z_score(ys):
threshold = 3.5 median_y = np.median(ys)
median_absolute_deviation_y = np.median([np.abs(y - median_y) for y in ys])
modified_z_scores = [0.6745 * (y - median_y) / median_absolute_deviation_y
for y in ys]
return np.where(np.abs(modified_z_scores) > threshold)
IQR(interquartile range)
import numpy as np def outliers_iqr(ys):
quartile_1, quartile_3 = np.percentile(ys, [25, 75])
iqr = quartile_3 - quartile_1
lower_bound = quartile_1 - (iqr * 1.5)
upper_bound = quartile_3 + (iqr * 1.5)
return np.where((ys > upper_bound) | (ys < lower_bound))
Conclusion
It is important to reiterate that these methods should not be used mechanically.
They should be used to explore the data – they let you know which points might be worth a closer look.
What to do with this information depends heavily on the situation.
Sometimes it is appropriate to exclude outliers from a dataset to make a model trained on that dataset more predictive.
Sometimes, however,
the presence of outliers is a warning sign that the real-world process generating the data is more complicated than expected. As an astute commenter on CrossValidated put it:
“Sometimes outliers are bad data, and should be excluded, such as typos.
Sometimes they are Wayne Gretzky or Michael Jordan, and should be kept.” Domain knowledge and practical wisdom are the only ways to tell the difference.
摘自:http://colingorrie.github.io/outlier-detection.html
Three ways to detect outliers的更多相关文章
- MATLAB remove outliers.
Answer by Richard Willey on 9 Jan 2012 Hi Michael MATLAB doesn't provide a specific function to remo ...
- 异常值处理outlier
python信用评分卡(附代码,博主录制) https://study.163.com/course/introduction.htm?courseId=1005214003&utm_camp ...
- 深度学习图像配准 Image Registration: From SIFT to Deep Learning
Image Registration is a fundamental step in Computer Vision. In this article, we present OpenCV feat ...
- (转)利用libcurl和国内著名的两个物联网云端通讯的例程, ubuntu和openwrt下调试成功(四)
1. libcurl 的参考文档如下 CURLOPT_HEADERFUNCTION Pass a pointer to a function that matches the following pr ...
- 如何查找物理cpu,cpu核心和逻辑cpu的数量
环境 Red Hat Enterprise Linux 4 Red Hat Enterprise Linux 5 Red Hat Enterprise Linux 6 Red Hat Enterpri ...
- Gartner 2018 年WAF魔力象限报告:云WAF持续增长,Bot管理与API安全拥有未来
Gartner 2018 年WAF魔力象限报告:云WAF持续增长,Bot管理与API安全拥有未来 来源 https://www.freebuf.com/articles/paper/184903.ht ...
- A Gentle Guide to Machine Learning
A Gentle Guide to Machine Learning Machine Learning is a subfield within Artificial Intelligence tha ...
- [linux]linuxI/O测试的方法之dd
参考http://www.thomas-krenn.com/en/wiki/Linux_I/O_Performance_Tests_using_dd Measuring Write Performan ...
- WGCNA构建基因共表达网络详细教程
这篇文章更多的是对于混乱的中文资源的梳理,并补充了一些没有提到的重要参数,希望大家不会踩坑. 1. 简介 1.1 背景 WGCNA(weighted gene co-expression networ ...
随机推荐
- 一、Windows Server 2016 AD服务器搭建
简介: AD是Active Directory的简写,中文称活动目录.活动目录(Active Directory)主要提供以下功能: 1)服务器及客户端计算机管理 2)用户服务 3)资源管理 4)桌面 ...
- Extjs 在Grid单元中格添加Tooltip提示
Grid 中的单元格添加Tooltip 的效果 Ext.QuickTips.init(); //必须要… columns: [ { text: 'Name', dataIndex: 'name' }, ...
- 基本数据对象(int,float,str)
一.整型(int) # int对象初始化 x = 2 y = int(3) n = int("A3",12) # 运算符(+.-.*././/.%.**) ''' 相关的函数 '' ...
- Python爬虫【实战篇】百度翻译
先看代码 import requests headers = { "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS ...
- Linux-基础学习(四)-部署图书管理系统项目
部署图书管理项目需要以下软件 项目文件(django项目文件夹) 数据库文件(django项目对应的数据库文件) centos7(linux本体) nginx(反向代理以及静态文件收集) uWSGI( ...
- centos搭建ftp服务器
申请个京东云服务用着.上传文件想搭建个ftp服务.遇到个坑记录一下: 这里就简单的使用yum安装 ftp服务: vsftpd 全称 very secure ftp deamon (非常安全的ftp ...
- BZOJ3174 TJOI2013 拯救小矮人 贪心、DP
传送门 原问题等价于:先给\(n\)个人排好顺序.叠在一起,然后从顶往底能走即走,问最多能走多少人 注意到一个问题:如果存在两个人\(i,j\)满足\(a_i + b_i < a_j + b_j ...
- 在Bootstrap开发框架的前端视图中使用@RenderPage实现页面内容模块化的隔离,减少复杂度
在很多开发的场景中,很多情况下我们需要考虑抽象.以及模块化等方面的内容,其目的就是为了使得开发的时候关注的变化内容更加少一些,整体开发更加简单化,从而减少开发的复杂度,在Winform开发的时候,往往 ...
- 【重磅】FineUIPro基础版免费,是时候和ExtJS说再见了!
三石的新年礼物 9 年了,FineUI(开源版)终于迎来了她的继任者 - FineUIPro(基础版),并且完全免费! FineUIPro(基础版)作为三石奉献给社区的一个礼物,绝对让你心动: 拥 ...
- VsCode调试js
1.安装Debugger for Chrome 2.编辑launch.json { "type": "chrome", "request" ...