Introduction to Data Visualization – Theory, R & ggplot2

The topic of data visualization is very popular in the data science community. The market size for visualization products is valued at $4 Billion and is projected to reach $7 Billion by the end of 2022 according to Mordor Intelligence. While we have seen amazing advances in the technology to display information, the understanding of how, why, and when to use visualization techniques has not kept up. Unfortunately, people are often taught how to make a chart before even thinking about whether or not it’s appropriate.

In short, are you adding value to your work or are you simply adding this to make it seem less boring? Let’s take a look at some examples before going through the Stoltzmaniac Data Visualization Philosophy.


I have to give credit to Junk Charts – it inspired a lot of this post.

One author at Vox wanted to show the cause of death in all of Shakespeare

Is this not insane!?!?!

Using a legend instead of data callouts is the only thing that could have made this worse. The author could easily have used a number of other tools to get the point across. While wordles are not ideal for any work requiring exact proportions, it does make for a great visual in this article.Junk Charts Article.

To be clear, I’m not close to being perfect when it comes to visualizations in my blog. The sizes, shapes, font colors, etc. tend to get out of control and I don’t take the time in R to tinker with all of the details. However, when it comes to displaying things professionally, it has to be spot on! So, I’ll walk through my theory and not worry too much about aesthetics (save that for a time when you’re getting paid).


The Good, The Bad, The Ugly

“The Good” visualizations:

  • Clearly illustrate a point
  • Are tailored to the appropriate audience
    • Analysts may want detail
    • Executives may want a high-level view
  • Are tailored to the presentation medium
    • A piece in an academic journal can be analyzed slowly and carefully
    • A slide in front of 5,000 people in a conference will be glanced at quickly
  • Are memorable to those who care about the material
  • Make an impact which increases the understanding of the subject matter

“The Bad” visualizations:

  • Are difficult to interpret
  • Are unintentionally misleading
  • Contain redundant and boring information

“The Ugly” visualizations:

  • Are almost impossible to interpret
  • Are filled with completely worthless information
  • Are intentionally created to mislead the audience
  • Are inaccurate

Coming soon:

  • Introduction to the ggplot2 in R and how it works
  • Determining whether or not you need a visualization
  • Choosing the type of plot to use depending on the use case
  • Visualization beyond the standard charts and graphs

As always, the code used in this post is on my GitHub

转自:https://www.stoltzmaniac.com/data-visualization-part-1/

DATA VISUALIZATION – PART 1的更多相关文章

  1. 7 Tools for Data Visualization in R, Python, and Julia

    7 Tools for Data Visualization in R, Python, and Julia Last week, some examples of creating visualiz ...

  2. Data Visualization 课程 笔记1

    对数据可视化比较有兴趣,因此最近在看coursera上伊利诺伊大学香槟分校的数据可视化课程,做了一些笔记. 1. 定义 Data visualization is a high bandwidth c ...

  3. DATA VISUALIZATION – PART 2

    A Quick Overview of the ggplot2 Package in R While it will be important to focus on theory, I want t ...

  4. Data Visualization – Banking Case Study Example (Part 1-6)

    python信用评分卡(附代码,博主录制) https://study.163.com/course/introduction.htm?courseId=1005214003&utm_camp ...

  5. D3.js & Data Visualization & SVG

    D3.js & Data Visualization & SVG https://davidwalsh.name/learning-d3 // import {scaleLinear} ...

  6. charts & data visualization

    charts & data visualization https://www.sitepoint.com/15-best-javascript-charting-libraries/ Can ...

  7. 学习笔记之Bokeh Data Visualization | DataCamp

    Bokeh Data Visualization | DataCamp https://www.datacamp.com/courses/interactive-data-visualization- ...

  8. 学习笔记之Introduction to Data Visualization with Python | DataCamp

    Introduction to Data Visualization with Python | DataCamp https://www.datacamp.com/courses/introduct ...

  9. 学习笔记之Data Visualization

    Data visualization - Wikipedia https://en.wikipedia.org/wiki/Data_visualization Data visualization o ...

随机推荐

  1. IC卡读卡器web开发,支持IE,Chrome,Firefox,Safari,Opera等主流浏览 器

    IC卡读卡器在web端的应用越来越多,但是早期发布的ocx技术只支持IE浏览器,使用受到了很多的限制.IC卡读卡器云服务的推 出,彻底解决了以上的局限,使得IC卡读卡器不仅可以应用在IE浏览器上,还可 ...

  2. setTimeout 和 setInteval 的区别。

    学习前端的可能都知道js有2个定时器setTimeOut和setinteval.用的时候可能不是很在意,但是2者还是有区别的 setTimeout方法是定时程序,也就是在什么时间以后干什么.干完就完了 ...

  3. USACO Section 1.1-1 Your Ride Is Here

    USACO 1.1-1 Your Ride Is Here 你的飞碟在这儿 众所周知,在每一个彗星后都有一只UFO.这些UFO时常来收集地球上的忠诚支持者.不幸的是,他们的飞碟每次出行都只能带上一组支 ...

  4. 使用Java注解来简化你的代码

         注解(Annotation)就是一种标签,可以插入到源代码中,我们的编译器可以对他们进行逻辑判断,或者我们可以自己写一个工具方法来读取我们源代码中的注解信息,从而实现某种操作.需要申明一点, ...

  5. 栈实现getMin

    题目 实现一个特殊的栈,在实现栈的基本功能的基础上,在实现返回栈中最小元素的操作. 要求 pop.push.getMin操作的时间复杂度都是O(1). 设计的栈类型可以使用现成的栈结构. 解答 在设计 ...

  6. kNN算法个人理解

    新手,有问题的地方请大家指教 训练集的数据有属性和标签 同类即同标签的数据在属性值方面一定具有某种相似的地方,用距离来描述这种相似的程度 k=1或则较小值的话,分类对于特殊数据或者是噪点就会异常敏感, ...

  7. python——面向对象相关

    其他相关 一.isinstance(obj, cls) 检查是否obj是否是类 cls 的对象 1 2 3 4 5 6 class Foo(object):     pass   obj = Foo( ...

  8. 2017-4-26 winform 菜单和工具栏

    如何让radiobutton进行分组: 用Panel    相当于div 菜单和工具栏: MenuStrip(菜单条) ShortcutKeys-------------------------与菜单 ...

  9. bzoj2884 albus就是要第一个出场

    Description 已知一个长度为n的正整数序列A(下标从1开始), 令 S = { x | 1 <= x <= n }, S 的幂集2^S定义为S 所有子集构成的集合.定义映射 f ...

  10. EMMC与RAND的区别

    作者:Younger Liu, 本作品采用知识共享署名-非商业性使用-相同方式共享 3.0 未本地化版本许可协议进行许可. EMMC与RAND的区别 说到两者的区别,必须从flash的发展历程说起,因 ...