MATLAB remove outliers.
Hi Michael
MATLAB doesn't provide a specific function to remove outliers. In general you have a couple different options to deal with outliers.
1. You can create an index that flags potential outliers and either delete them from your data set or substitute more plausible values
2. You can use robust techniques like robust regression which are less sensitive to the presence of outliers.
Your choice of strategies will depend a lot on your knowledge about the data set. For example, if you have a lot of data points that are coded with a value like -9999 these are probably error codes of some kind rather than actual numeric information.
I'm including some simple example code which shows a standard technique to detect outliers.
=====================
% Create a vector of X values
clear all
clc
hold off
X = 1:100;
X = X';
% Create a noise vector
noise = randn(100,1);
% Create a second noise value where sigma is much larger
noise2 = 10*randn(100,1);
% Substitute noise2 for noise1 at obs# (11, 31, 51, 71, 91)
% Many of these points will have an undue influence on the model
noise(11:20:91) = noise2(11:20:91);
% Specify Y = F(X)
Y = 3*X + 2 + noise;
% Cook's Distance for a given data point measures the extent to
% which a regression model would change if this data point
% were excluded from the regression. Cook's Distance is
% sometimes used to suggest whether a given data point might be an outlier.
% Use regstats to calculate Cook's Distance
stats = regstats(Y,X,'linear');
% if Cook's Distance > n/4 is a typical treshold that is used to suggest
% the presence of an outlier
potential_outlier = stats.cookd > 4/length(X);
% Display the index of potential outliers and graph the results
X(potential_outlier)
scatter(X,Y, 'b.')
hold on
scatter(X(potential_outlier),Y(potential_outlier), 'r.')
MATLAB remove outliers.的更多相关文章
- matlab中的containers.Map()
matlab中的containers.Map() 标签: matlabcontainers.Map容器map 2015-10-27 12:45 1517人阅读 评论(1) 收藏 举报 分类: Mat ...
- Taxi Trip Time Winners' Interview: 3rd place, BlueTaxi
Taxi Trip Time Winners' Interview: 3rd place, BlueTaxi This spring, Kaggle hosted two competitions w ...
- 异常值处理outlier
python信用评分卡(附代码,博主录制) https://study.163.com/course/introduction.htm?courseId=1005214003&utm_camp ...
- 壁虎书1 The Machine Learning Landscape
属性与特征: attribute: e.g., 'Mileage' feature: an attribute plus its value, e.g., 'Mileage = 15000' Note ...
- 第三课 创建函数 - 从EXCEL读取 - 导出到EXCEL - 异常值 - Lambda函数 - 切片和骰子数据
第 3 课 获取数据 - 我们的数据集将包含一个Excel文件,其中包含每天的客户数量.我们将学习如何对 excel 文件进行处理.准备数据 - 数据是有重复日期的不规则时间序列.我们将挑战数 ...
- Learning Spark中文版--第六章--Spark高级编程(2)
Working on a Per-Partition Basis(基于分区的操作) 以每个分区为基础处理数据使我们可以避免为每个数据项重做配置工作.如打开数据库连接或者创建随机数生成器这样的操作,我们 ...
- Matlab的标记分水岭分割算法
1 综述 Separating touching objects in an image is one of the more difficult image processing operation ...
- Matlab编程基础
平台:Win7 64 bit,Matlab R2014a(8.3) “Matlab”是“Matrix Laboratory” 的缩写,中文“矩阵实验室”,是强大的数学工具.本文侧重于Matlab的编程 ...
- Matlab 进阶学习记录
最近在看 Faster RCNN的Matlab code,发现很多matlab技巧,在此记录: 1. conf_proposal = proposal_config('image_means', ...
随机推荐
- ListView 点击某一项换背景图片
1. layout_search_list_item.xml <LinearLayout xmlns:android="http://schemas.android.com/apk/r ...
- wdcp系统升级mysql5.7.11
1.下载解压 下载地址为:http://dev.mysql.com/get/Downloads/MySQL-5.7/mysql-5.7.12-linux-glibc2.5-x86_64.tar.gz ...
- BUFFER CACHE之调整buffer cache的大小
Buffer Cache存放真正数据的缓冲区,shared Pool里面存放的是sql指令(LC中一次编译,多次运行,加快处理性能,cache hit ratio要高),而buffer cache里面 ...
- WebView 中重写javascript 常用函数
常规函数 javascript 常规函数包括以下3个函数: (1)alert函数:显示一个警告对话框,包括一个OK按钮. 对应:http://www.dreamdu.com/javascript ...
- JavaScript基础篇最全
本章内容: 简介 定义 注释 引入文件 变量 运算符 算术运算符 比较运算符 逻辑运算符 数据类型 数字 字符串 布尔类型 数组 Math 语句 条件语句(if.switch) 循环语句(for.fo ...
- hdu 2457(ac自动机+dp)
题意:容易理解... 分析:这是一道比较简单的ac自动机+dp的题了,直接上代码. 代码实现: #include<stdio.h> #include<string.h> #in ...
- How to easily create popup menu for DevExpress treelist z
http://www.itjungles.com/how-to-easily-create-popup-menu-for-devexpress-treelist.html Adding popup m ...
- 【SQL server】安装和配置
(1)SQL sever 版本问题1: SQL sever 2000 .SQL sever 2005.SQL sever 2008 .SQL sever 2008 R2 安装的时候需要注意是SQL s ...
- 仿酷狗音乐播放器开发日志二十六 duilib在标题栏弹出菜单的方法
转载请说明原出处,谢谢~~ 上篇日志说明了怎么让自定义控件响应右键消息.之后我给主窗体的标题栏增加右键响应,观察原酷狗后可以发现,在整个标题栏都是可以响应右键并弹出菜单的.应该的效果如下: 本以为像上 ...
- CodeForce---Educational Codeforces Round 3 D. Gadgets for dollars and pounds 正题
对于这题笔者无解,只有手抄一份正解过来了: 基本思想就是 : 二分答案,对于第x天,计算它最少的花费f(x),<=s就是可行的,这是一个单调的函数,所以可以二分. 对于f(x)的计算,我用了nl ...