Dimensionality and high dimensional data: definition, examples, curse of..
Dimensionality in statistics refers to how many attributes a dataset has. For example, healthcare data is notorious for having a vast number of variables (e.g. blood pressure, weight, cholesterol level). In an ideal world, this data could be represented in a spreadsheet, with one column per dimension. In practice, this is difficult to do, in part because many variables are interrelated (like weight and blood pressure).
Note: Dimensionality means something slightly different in other areas of mathematics and science. For example, in physics, dimensionality can usually be expressed in terms of fundamental dimensions like mass, time, or length. In matrix algebra, two units of measure have the same dimensionality if both statements are true:
- A function exists that maps one variable onto the other variable.
- The inverse of that function does the reverse.
High Dimensional Data
High dimensional means that the number of dimensions is staggeringly high, so high that calculations become extremely difficult. With high dimensional data, the number of features can exceed the number of observations. For example, microarrays, which measure gene expression, can contain tens to hundreds of samples, and each sample can contain tens of thousands of genes.
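One concrete consequence of having more features than observations is that the data matrix cannot have rank larger than the number of observations, so a linear model with one coefficient per feature is underdetermined. The following is a minimal sketch of that point using only the standard library. The sizes (5 samples, 40 features) and the helper `matrix_rank` are hypothetical choices for illustration, far smaller than a real microarray.

```python
import random


def matrix_rank(rows, tol=1e-9):
    """Rank of a matrix (list of row lists) via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in rows]  # work on a copy so the input is not modified
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break  # every row already holds a pivot
        # choose the remaining row with the largest absolute entry in this column
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        # eliminate this column from the rows below the pivot row
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


random.seed(0)
n_samples, n_features = 5, 40  # hypothetical sizes: features outnumber observations
X = [[random.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]

rank = matrix_rank(X)
is_high_dimensional = n_features > n_samples
print(f"samples={n_samples}, features={n_features}, rank={rank}")
```

Because the rank is bounded by the number of samples, the 40 feature columns cannot all be linearly independent, which is one reason standard estimation procedures break down in this regime.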
1. What is the dimension of a time series?
Classification of time series is a somewhat tricky matter. Most classification algorithms have an implicit assumption that the data you are classifying are stationary, and they usually work in vector spaces.
So there are two "things" that can be multidimensional here: your original time series and the result of your preprocessing before feeding data to a classifier.
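The distinction between the two can be sketched with a sliding window, one common way (assumed here, not stated in the source) of turning a series into fixed-length vectors for a classifier. The function name `sliding_windows` and the sample values are made up for illustration.

```python
def sliding_windows(series, width, step=1):
    """Cut a 1-D sequence into overlapping fixed-length vectors."""
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    last_start = len(series) - width
    return [series[i:i + width] for i in range(0, last_start + 1, step)]


# A univariate series: one value per time step, so the original dimension is 1.
series = [0.1, 0.4, 0.9, 1.6, 2.5, 3.6, 4.9, 6.4]  # made-up values
original_dim = 1

# After preprocessing, each training example is a vector of `width` values.
vectors = sliding_windows(series, width=4, step=2)
feature_dim = len(vectors[0])

print(f"original dimension: {original_dim}")
print(f"number of vectors: {len(vectors)}, dimension of each: {feature_dim}")
```

Here the original series is one-dimensional, while the vectors handed to the classifier are four-dimensional. A wider window raises the dimension of the preprocessed data without changing the dimension of the series itself.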
Supplementary knowledge:
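1. Downsampling.
2. Curse of dimensionality.
As the number of dimensions increases, the volume of the space grows so quickly that the available data become sparse. Sparsity is a problem for any method that requires statistical significance: to obtain a result that is statistically sound and reliable, the amount of data needed to support it usually grows exponentially with the number of dimensions.
3. Abbreviation iid: independent and identically distributed random variables.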
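The curse of dimensionality in item 2 can be made concrete with two small calculations. This is a sketch under assumptions of my own choosing: the unit hypercube as the data space, 10 bins per axis as the grid resolution, and the function names `inscribed_ball_fraction` and `cells_needed`. The first function uses the standard formula for the volume of a d-dimensional ball, V = π^(d/2) / Γ(d/2 + 1) · r^d.

```python
import math


def inscribed_ball_fraction(d):
    """Fraction of the unit hypercube [0, 1]^d covered by its inscribed ball (radius 0.5)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * 0.5 ** d


def cells_needed(d, bins_per_axis=10):
    """Number of grid cells when every axis is split into `bins_per_axis` bins."""
    return bins_per_axis ** d


for d in (1, 2, 3, 10, 20):
    fraction = inscribed_ball_fraction(d)
    cells = cells_needed(d)
    print(f"d={d:2d}  ball/cube volume={fraction:.3e}  grid cells={cells:.0e}")
```

In two dimensions the inscribed disc covers π/4 of the square, but the fraction shrinks rapidly as d grows, so almost all of the volume ends up in the corners. At the same time, the number of cells that would each need at least one observation grows as 10^d, which is the exponential growth in required data described above.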
Reference:
2. What is meant by 'high dimensional' time series?
3. Everything is Embedding: from the classic word2vec to item2vec, a basic operation in deep learning