The Great DOM Fuzz-off of 2017

Posted by Ivan Fratric, Project Zero
https://googleprojectzero.blogspot.nl/2017/09/the-great-dom-fuzz-off-of-2017.html

Introduction

Historically, DOM engines have been one of the largest sources of web browser bugs(以前DOM是浏览器最大漏洞来源). And while in the recent years the popularity of those kinds of bugs in targeted attacks has somewhat fallen in favor of Flash (which allows for cross-browser exploits) and JavaScript engine bugs (which often result in very powerful exploitation primitives,利用更强), they are far from gone(远非如此). For example, CVE-2016-9079 (a bug that was used in November 2016 against Tor Browser users) was a bug in Firefox’s DOM implementation, specifically the part that handles SVG elements in a web page. It is also a rare case that a vendor will publish a security update that doesn’t contain fixes for at least several DOM engine bugs.//DOM漏洞还有空间?
An interesting property of many of those bugs is that they are more or less easy to find by fuzzing. This is why a lot of security researchers as well as browser vendors who care about security invest into building DOM fuzzers and associated infrastructure.//通过fuzz可以较容易找到。
As a result, after joining Project Zero, one of my first projects was to test the current state of resilience of major web browsers against DOM fuzzing.

The fuzzer

For this project I wanted to write a new fuzzer which takes some of the ideas from my previous DOM fuzzing projects, but also improves on them and implements new features. Starting from scratch also allowed me to end up with cleaner code that I’m open-sourcing(开源!) together with this blog post. The goal was not to create anything groundbreaking - as already noted by security researchers, many DOM fuzzers have begun to look like each other over time. Instead the goal was to create a fuzzer that has decent initial coverage, is easily understandable and extendible and can be reused by myself as well as other researchers for fuzzing other targets besides just DOM fuzzing.//更像一个框架
We named this new fuzzer Domato (credits to Tavis for suggesting the name). Like most DOM fuzzers, Domato is generative, meaning that the fuzzer generates a sample from scratch given a set of grammars that describes HTML/CSS structure as well as various JavaScript objects, properties and functions.
The fuzzer consists of several parts:

  • The base engine that can generate a sample given an input grammar. This part is intentionally fairly generic and can be applied to other problems besides just DOM fuzzing.//通用并且适合DOMfuzz之外的
  • The main script that parses the arguments and uses the base engine to create samples. Most logic that is DOM specific is captured in this part.//解析参数并创建例子
  • A set of grammars for generating HTML, CSS and JavaScript code.//语法
One of the most difficult aspects in the generation-based fuzzing is creating a grammar or another structure that describes the samples that are going to be created. In the past I experimented with manually created grammars as well as grammars extracted automatically from web browser code. Each of these approaches has advantages and drawbacks, so for this fuzzer I decided to use a hybrid approach:

  1. I initially extracted DOM API declarations from .idl files in Google Chrome Source. Similarly, I parsed Chrome’s layout tests to extract common (and not so common) names and values of various HTML and CSS properties.
  2. Afterwards, this automatically extracted data was heavily manually edited to make the generated samples more likely to trigger interesting behavior. One example of this are functions and properties that take strings as input: Just because a DOM property takes a string as an input does not mean that any string would have a meaning in the context of that property.
Otherwise, Domato supports features that you’d expect from a DOM fuzzer such as:

  • Generating multiple JavaScript functions that can be used as targets for various DOM callbacks and event handlers
  • Implicit (through grammar definitions) support for “interesting” APIs (e.g. the Range API) that have historically been prone to bugs.
Instead of going into much technical details here, the reader is referred to the fuzzer code and documentation at https://github.com/google/domato. It is my hope that by open-sourcing the fuzzer I would invite community contributions that would cover the areas I might have missed in the fuzzer or grammar creation.//语法部分

Setup

We tested 5 browsers with the highest market share: Google Chrome, Mozilla Firefox, Internet Explorer, Microsoft Edge and Apple Safari. We gave each browser approximately 100.000.000 iterations with the fuzzer and recorded the crashes. (If we fuzzed some browsers for longer than 100.000.000 iterations, only the bugs found within this number of iterations were counted in the results.) Running this number of iterations would take too long on a single machine and thus requires fuzzing at scale, but it is still well within the pay range of a determined attacker. For reference, it can be done for about $1k on Google Compute Engine given the smallest possible VM size, preemptable VMs (which I think work well for fuzzing jobs as they don’t need to be up all the time) and 10 seconds per run.
Here are additional details of the fuzzing setup for each browser:
  • Google Chrome was fuzzed on an internal Chrome Security fuzzing cluster called ClusterFuzz. To fuzz Google Chrome on ClusterFuzz we simply needed to upload the fuzzer and it was run automatically against various Chrome builds.
  • Mozilla Firefox was fuzzed on internal Google infrastructure (linux based). Since Mozilla already offers Firefox ASAN builds for download, we used that as a fuzzing target. Each crash was additionally verified against a release build.
  • Internet Explorer 11 was fuzzed on Google Compute Engine running Windows Server 2012 R2 64-bit. Given the lack of ASAN build, page heap was applied to iexplore.exe process to make it easier to catch some types of issues.
  • Microsoft Edge was the only browser we couldn’t easily fuzz on Google infrastructure since Google Compute Engine doesn’t support Windows 10 at this time and Windows Server 2016 does not include Microsoft Edge. That’s why for fuzzing it we created a virtual cluster of Windows 10 VMs on Microsoft Azure. Same as with Internet Explorer, page heap was applied to MicrosoftEdgeCP.exe process before fuzzing.
  • Instead of fuzzing Safari directly, which would require Apple hardware, we instead used WebKitGTK+ which we could run on internal (Linux-based) infrastructure. We created an ASAN build of the release version of WebKitGTK+. Additionally, each crash was verified against a nightly ASAN WebKit build running on a Mac.//利用WebKitGTK+而不需Apple硬件

Results

Without further ado, the number of security bugs found in each browsers are captured in the table below.
Only security bugs were counted in the results (doing anything else is tricky as some browser vendors fix non-security crashes while some don’t) and only bugs affecting the currently released version of the browser at the time of fuzzing were counted (as we don’t know if bugs in development version would be caught by internal review and fuzzing process before release).
Vendor
Browser
Engine
Number of Bugs
Project Zero Bug IDs
Google
Chrome
Blink
2
994, 1024
Mozilla
Firefox
Gecko
4**
1130, 1155, 1160, 1185
Microsoft
Internet Explorer
Trident
4
1011, 1076, 1118, 1233
Microsoft
Edge
EdgeHtml
6
1011, 1254, 1255, 1264, 1301, 1309
Apple
Safari
WebKit
17
999, 1038, 1044, 1080, 1082, 1087, 1090, 1097, 1105, 1114, 1241, 1242, 1243, 1244, 1246, 1249, 1250
Total
31*
*While adding the number of bugs results in 33, 2 of the bugs affected multiple browsers
**The root cause of one of the bugs found in Mozilla Firefox was in the Skia graphics library and not in Mozilla source. However, since the relevant code was contributed by Mozilla engineers, I consider it fair to count here.
All of the bugs listed here have been fixed in the current shipping versions of the browsers. As can be seen in the table most browsers did relatively well in the experiment with only a couple of security relevant crashes found. Since using the same methodology used to result in significantly higher number of issues just several years ago, this shows clear progress for most of the web browsers. For most of the browsers the differences are not sufficiently statistically significant to justify saying that one browser’s DOM engine is better or worse than another.//在实验阶段已经足够好。
However, Apple Safari is a clear outlier in the experiment with significantly higher number of bugs found. This is especially worrying given attackers’ interest in the platform as evidenced by the exploit prices and recent targeted attacks. It is also interesting to compare Safari’s results to Chrome’s, as until a couple of years ago, they were using the same DOM engine (WebKit). It appears that after the Blink/Webkit split either the number of bugs in Blink got significantly reduced or a significant number of bugs got introduced in the new WebKit code (or both). To attempt to address this discrepancy, I reached out to Apple Security proposing to share the tools and methodology. When one of the Project Zero members decided to transfer to Apple, he contacted me and asked if the offer was still valid. So Apple received a copy of the fuzzer and will hopefully use it to improve WebKit.
It is also interesting to observe the effect of MemGC, a use-after-free mitigation in Internet Explorer and Microsoft Edge. When this mitigation is disabled using the registry flag OverrideMemoryProtectionSetting, a lot more bugs appear. However, Microsoft considers these bugs strongly mitigated by MemGC and I agree with that assessment. Given that IE used to be plagued with use-after-free issues, MemGC is an example of a useful mitigation that results in a clear positive real-world impact. Kudos to Microsoft’s team behind it!//MemGC基本是的IE/Edge上的DOM UAF失效,但并不代表其它浏览器也是这样。
When interpreting the results, it is very important to note that they don’t necessarily reflect the security of the whole browser and instead focus on just a single component (DOM engine), but one that has historically been a source of many security issues. This experiment does not take into account other aspects such as presence and security of a sandbox, bugs in other components such as scripting engines etc. I can also not disregard忽视 the possibility that, within DOM, my fuzzer is more capable at finding certain types of issues than other, which might have an effect on the overall stats.

Experimenting with coverage-guided DOM fuzzing

Since coverage-guided fuzzing seems to produce very good results in other areas we wanted to combine it with the DOM fuzzing. We built an experimental coverage-guided DOM fuzzer and ran it against Internet Explorer. IE was selected as a target both because of the author's familiarity with it and because it is very easy to limit coverage collection to just the DOM component (mshtml.dll,仅计算DOM组件覆盖率). The experimental fuzzer used a modified Domato engine to generate mutations and used a modified WinAFL's DynamoRIO client to measure coverage. The fuzzing flow worked roughly as follows:
  1. The fuzzer generates a new set of samples by mutating existing samples in the corpus.//变异
  2. The fuzzer spawns IE process which opens a harness HTML page.
  3. The harness HTML page instructs the fuzzer to start measuring coverage and loads one of the samples in an iframe//执行
  4. After the sample executes, it notifies the harness which notifies the fuzzer to stop collecting coverage.//计算代码覆盖率
  5. Coverage map is examined and if it contains unseen coverage, the corresponding sample is added to the corpus.
  6. Go to step 3 until all samples are executed or the IE process crashes
  7. Periodically minimize the corpus using the AFL’s cmin algorithm.//自动精简
  8. Go to step 1.
The following set of mutations was used to produce new samples from the existing ones://与上面的相比
  • Adding new CSS rules
  • Adding new properties to the existing CSS rules
  • Adding new HTML elements
  • Adding new properties to the existing HTML elements
  • Adding new JavaScript lines. The new lines would be aware of the existing JavaScript variables and could thus reuse them.
Unfortunately, while we did see a steady increase in the collected coverage over time while running the fuzzer, it did not result in any new crashes (i.e. crashes that would not be discovered using dumb fuzzing.盲目fuzzing). It would appear more investigation is required in order to combine coverage information with DOM fuzzing in a meaningful way.//并没有看到代码覆盖率的稳步增长,也没有新增crash(这些crash不能通过盲目fuzzing发现)。需要进一步思考,如何将DOMfuzzing与代码覆盖率结合!

Conclusion

As stated before, DOM engines have been one of the largest sources of web browser bugs. While this type of bug are far from gone, most browsers show clear progress in this area. The results also highlight the importance of doing continuous security testing as bugs get introduced with new code and a relatively short period of development can significantly deteriorate a product’s security posture.
The big question at the end is: Are we now at a stage where it is more worthwhile to look for security bugs manually than via fuzzing? Or do more targeted fuzzers need to be created instead of using generic DOM fuzzers to achieve better results? And if we are not there yet - will we be there soon (hopefully)? The answer certainly depends on the browser and the person in question. Instead of attempting to answer these questions myself, I would like to invite the security community to let us know their thoughts.

转:The Great DOM Fuzz-off of 2017的更多相关文章

  1. JQuery基本知识、选择器、事件、DOM操作、动画--2017年2月10日

    $(对象)可以将JS对象转换为JQuery对象  .get(0)可以将JQuery对象转换为JS对象 并无太大区别,灵活点出即可

  2. 像VUE一样写微信小程序-深入研究wepy框架

    像VUE一样写微信小程序-深入研究wepy框架 微信小程序自发布到如今已经有半年多的时间了,凭借微信平台的强大影响力,越来越多企业加入小程序开发. 小程序于M页比相比,有以下优势: 1.小程序拥有更多 ...

  3. JS DOM(2017.12.28)

    一.获得元素节点的方法 document.getElementById()    根据Id获取元素节点 document.getElementsByName()    根据name获取元素节点   遍 ...

  4. 2017 年值得一瞥的 JavaScript 相关技术趋势

    跨年前两天,Dan Abramov在Twitter上提了一个问题: JS社区毫不犹豫的抛出了它们对于新技术的预期与期待,本文内容也是总结自Twitter的回复,按照流行度降序排列.有一个尚未确定的小点 ...

  5. DOM的相关优化

    为什么要进行DOM优化? DOM对象本身也是一个js对象,所以严格来说,并不是操作这个对象慢,而是说操作了这个对象后,会触发一些浏览器行为,比如布局(layout)和绘制(paint). 首先先说一些 ...

  6. python运维开发(十六)----Dom&&jQuery

    内容目录: Dom 查找 操作 事件 jQuery 查找 筛选 操作 事件 扩展 Dom 文档对象模型(Document Object Model,DOM)是一种用于HTML和XML文档的编程接口.它 ...

  7. 【2017年新篇章】 .NET 面试题汇总(二)

    本次给大家介绍的是我收集以及自己个人保存一些.NET面试题第二篇 第一篇文章请到这里:[2017年新篇章] .NET 面试题汇总(一) 简介 此次包含的不止是.NET知识,也包含少许前端知识以及.ne ...

  8. X-NUCA 2017 web专题赛训练题 阳光总在风雨后和default wp

     0X0.前言 X-NUCA 2017来了,想起2016 web专题赛,题目都打不开,希望这次主办方能够搞好点吧!还没开赛,依照惯例会有赛前指导,放一些训练题让CTFer们好感受一下题目. 题目有一大 ...

  9. HTML DOM (文档对象模型)

    当网页被加载时,浏览器会创建页面的文档对象模型(Document Object Model). HTML DOM 模型被构造为对象的树. HTML DOM 树 通过可编程的对象模型,JavaScrip ...

随机推荐

  1. 前端PHP入门-027-数组常用函数-掌握级别

    下面的函数一定要到熟悉甚至到掌握级别. 这些函数,也是面试中基础面试中最爱问到的问题. 函数名 功能 array_combine() 生成一个数组,用一个数组的值作为键名,另一个数组值作为值 rang ...

  2. 前端PHP入门-010-内部函数

    内部函数,是指在函数内部又声明了一个函数. 注意事项: 内部函数名,不能是已存在的函数名 假设在函数a里面定义了一个内部函数,不能调用两次函数a. <?php function foo() { ...

  3. php实现多继承-trait语法

    自 PHP 5.4.0 起,PHP 实现了一种代码复用的方法,称为 trait. Trait 是为类似 PHP 的单继承语言而准备的一种代码复用机制.Trait 为了减少单继承语言的限制,使开发人员能 ...

  4. Image Scaling using Deep Convolutional Neural Networks

    Image Scaling using Deep Convolutional Neural Networks This past summer I interned at Flipboard in P ...

  5. chmod及chown命令详解

    1,chmod 指令名称 : chmod 使用权限 : 所有使用者 使用方式 : chmod [-cfvR] [--help] [--version] mode file... 说明 : Linux/ ...

  6. MongoDB - MongoDB CRUD Operations, Insert Documents

    MongoDB provides the following methods for inserting documents into a collection: db.collection.inse ...

  7. web上下文监听器ServletContextListener

    1 package com.liveyc.common.listener; import javax.servlet.ServletContextEvent; import javax.servlet ...

  8. 浅谈卡特兰数(Catalan number)的原理和相关应用

    一.卡特兰数(Catalan number) 1.定义 组合数学中一个常出现在各种计数问题中出现的数列(用c表示).以比利时的数学家欧仁·查理·卡特兰的名字来命名: 2.计算公式 (1)递推公式 c[ ...

  9. matlab前景分割

    用最简单的差分法实现了一下前景分割.使用的mall数据集. 思路是这样的:首先设定一个队列的长度,若读取的图片张数少于队列长度则以当前读取到的图片做平均.否则则以队列中的图片做平均. 这样之后和当前图 ...

  10. Django之Form组件验证

    今天来谈谈Django的Form组件操作 Django中的Form一般有两种功能: ·输入html ·验证用户输入 Form验证流程 ·定义规则(是一个类)    ·前端把数据提交过来 ·匹配规则 · ...