Introduction to Data Visualization – Theory, R & ggplot2

The topic of data visualization is very popular in the data science community. The market size for visualization products is valued at $4 Billion and is projected to reach $7 Billion by the end of 2022 according to Mordor Intelligence. While we have seen amazing advances in the technology to display information, the understanding of how, why, and when to use visualization techniques has not kept up. Unfortunately, people are often taught how to make a chart before even thinking about whether or not it’s appropriate.

In short, are you adding value to your work or are you simply adding this to make it seem less boring? Let’s take a look at some examples before going through the Stoltzmaniac Data Visualization Philosophy.


I have to give credit to Junk Charts – it inspired a lot of this post.

One author at Vox wanted to show the cause of death in all of Shakespeare

Is this not insane!?!?!

Using a legend instead of data callouts is the only thing that could have made this worse. The author could easily have used a number of other tools to get the point across. While wordles are not ideal for any work requiring exact proportions, it does make for a great visual in this article.Junk Charts Article.

To be clear, I’m not close to being perfect when it comes to visualizations in my blog. The sizes, shapes, font colors, etc. tend to get out of control and I don’t take the time in R to tinker with all of the details. However, when it comes to displaying things professionally, it has to be spot on! So, I’ll walk through my theory and not worry too much about aesthetics (save that for a time when you’re getting paid).


The Good, The Bad, The Ugly

“The Good” visualizations:

  • Clearly illustrate a point
  • Are tailored to the appropriate audience
    • Analysts may want detail
    • Executives may want a high-level view
  • Are tailored to the presentation medium
    • A piece in an academic journal can be analyzed slowly and carefully
    • A slide in front of 5,000 people in a conference will be glanced at quickly
  • Are memorable to those who care about the material
  • Make an impact which increases the understanding of the subject matter

“The Bad” visualizations:

  • Are difficult to interpret
  • Are unintentionally misleading
  • Contain redundant and boring information

“The Ugly” visualizations:

  • Are almost impossible to interpret
  • Are filled with completely worthless information
  • Are intentionally created to mislead the audience
  • Are inaccurate

Coming soon:

  • Introduction to the ggplot2 in R and how it works
  • Determining whether or not you need a visualization
  • Choosing the type of plot to use depending on the use case
  • Visualization beyond the standard charts and graphs

As always, the code used in this post is on my GitHub

转自:https://www.stoltzmaniac.com/data-visualization-part-1/

DATA VISUALIZATION – PART 1的更多相关文章

  1. 7 Tools for Data Visualization in R, Python, and Julia

    7 Tools for Data Visualization in R, Python, and Julia Last week, some examples of creating visualiz ...

  2. Data Visualization 课程 笔记1

    对数据可视化比较有兴趣,因此最近在看coursera上伊利诺伊大学香槟分校的数据可视化课程,做了一些笔记. 1. 定义 Data visualization is a high bandwidth c ...

  3. DATA VISUALIZATION – PART 2

    A Quick Overview of the ggplot2 Package in R While it will be important to focus on theory, I want t ...

  4. Data Visualization – Banking Case Study Example (Part 1-6)

    python信用评分卡(附代码,博主录制) https://study.163.com/course/introduction.htm?courseId=1005214003&utm_camp ...

  5. D3.js & Data Visualization & SVG

    D3.js & Data Visualization & SVG https://davidwalsh.name/learning-d3 // import {scaleLinear} ...

  6. charts & data visualization

    charts & data visualization https://www.sitepoint.com/15-best-javascript-charting-libraries/ Can ...

  7. 学习笔记之Bokeh Data Visualization | DataCamp

    Bokeh Data Visualization | DataCamp https://www.datacamp.com/courses/interactive-data-visualization- ...

  8. 学习笔记之Introduction to Data Visualization with Python | DataCamp

    Introduction to Data Visualization with Python | DataCamp https://www.datacamp.com/courses/introduct ...

  9. 学习笔记之Data Visualization

    Data visualization - Wikipedia https://en.wikipedia.org/wiki/Data_visualization Data visualization o ...

随机推荐

  1. 源码阅读之mongoengine(0)

    最近工作上用到了mongodb,之前只是草草了解了一下.对于NoSQL的了解也不是太多.所以想趁机多学习一下. 工作的项目直接用了pymongo来操作直接操作mongodb.对于用惯了Djongo O ...

  2. 通过一个小游戏开始接触Python!

    之前就一直嚷嚷着要找视频看学习Python,可是一直拖到今晚才开始....好好加油吧骚年,坚持不一定就能有好的结果,但是不坚持就一定是不好的!! 看着视频学习1: 首先,打开IDLE,在IDLE中新建 ...

  3. Mac OS平台下应用程序安装包制作工具Packages的使用介绍(补充)

    上一篇:Mac OS平台下应用程序安装包制作工具Packages的使用介绍 补充说明 上一篇文章中介绍了如何使用Packages如何创建mac下的安装包.但是这样制作出来的安装包只能安装到系统的文件路 ...

  4. NOIP2001T4car的旅行计划

    洛谷传送门 一看数据就是floyed(毕竟年代久远),然而建图不是那么好贱好建,只知道三个机场,需要判断斜边来求第4个机场坐标. 往后一些麻烦的建图. 最后floyed就好. --代码 #includ ...

  5. jdk8的新特性 Lambda表达式

    很多同学一开始接触Java8可能对Java8 Lambda表达式有点陌生. //这是一个普通的集合 List<Employee> list = em.selectEmployeeByLog ...

  6. 读书笔记 effective c++ Item 50 了解何时替换new和delete 是有意义的

    1. 自定义new和delete的三个常见原因 我们先回顾一下基本原理.为什么人们一开始就想去替换编译器提供的operator new和operator delete版本?有三个最常见的原因: 为了检 ...

  7. 在Oracle中添加用户登录名称

    第一步,打开Oracle客户端单击 “帮助”-->"支持信息"-->”TNS名“,加入红色部分.页面如下: 第二步,再次打开Oracle客户端时,就会显示数据库了,只需 ...

  8. Python中的支持向量机SVM的使用(有实例)

    除了在Matlab中使用PRTools工具箱中的svm算法,Python中一样可以使用支持向量机做分类.因为Python中的sklearn也集成了SVM算法. 一.简要介绍一下sklearn Scik ...

  9. spine动画融合与动画叠加

    spine动画融合与动画叠加 一.动画融合setMix 1.概述:两个动作之间的平滑过渡 参数duration为需要多少时间从fromAnimation过渡到toAnimation,过渡时间为动画重叠 ...

  10. linux 内核的rt_mutex (realtime互斥体)

    linux 内核有实时互斥体(锁),名为rt_mutex即realtime mutex.说到realtime一定离不开priority(优先级).所谓实时,就是根据优先级的不同对任务作出不同速度的响应 ...