The major web browsers load web pages in basically the same way. This process is known as parsing and is described by the HTML5 specification. A high-level understanding of this process is critical to writing web pages that load efficiently.

Parsing overview

As chunks of the HTML source become available from the network (or cache, filesystem, etc), they are streamed to the HTML parser. Next, in a process known as tokenization, the parser iterates through the source generating a token for (most notably) each start tag, end tag and character outside of a tag.

For example the input source <b>hello</b> yields 7 tokens:

start-tag { name: b }
character { data: h }
character { data: e }
character { data: l }
character { data: l }
character { data: o }
end-tag { name: b }

After each token is generated it is serially passed to the next major subsystem: the tree builder. The tree builder dynamically modifies the Document's DOM tree to reflect the new token.

The 7 input tokens above yield the following DOM tree:

<html>
<head>
<body>
<b>
"hello"

Fetching subresources

A frequent operation performed by the tree builder is creating a new HTML element and inserting it into the Document. It is at the point of insertion that HTML elements which load subresources begin fetching the subresource.

Running scripts

This parsing algorithm seems to translate HTML source into a DOM tree as efficiently as possible. That is, except for one wrinkle: scripts. When the tree builder encounters an end-tag token for a script, it must serially execute the script before parsing can continue (unless the associated script start-tag has the defer or async attribute).

There are two significant preconditions which must be fulfilled before a script can execute:

  1. If the script is external its source must be fully downloaded.
  2. For any script, all stylesheets in the document must be fully downloaded.

This means often the parser must idly wait while scripts and stylesheets are downloaded.

Why must parsing halt?

Well, a script may document.write something which affects further parsing or it may query something about the DOM which would yield incorrect results if parsing had continued (for instance the number of image elements in the DOM).

Why wait for stylesheets?

A script may expect to access the CSSOM directly or it may query an attribute of a DOM node which depends on the stylesheet (for example, how wide is a certain <table>).

Is it inefficient to block parsing?

Yes. Subresource download times often have a large constant factor limited by round trip time. This means it is faster to download two resources in parallel than to download the same two in serial. More obviously, the browser is also free to do CPU work while waiting on the network. For these reasons it is critical to efficient loading of a web page that subresource fetches are initiated as soon as possible. When parsing is blocked, the tree builder is not able to insert subsequent elements into the DOM, and thus subsequent subresource downloads are not initiated even if the HTML source which includes them is already available to the parser.

Mitigating blocking

As I've blogged previously, when the parser becomes blocked WebKit will run a lightweight parser known as the preload scanner. It mitigates the blocking problem by scanning ahead and fetching certain subresource that may be required. Other browsers employ similar techniques.

It is important to note that even with preload scanning, parsing is still blocked. Nodes cannot be added to the DOM tree. Although I haven't covered how a DOM tree becomes a render tree, layout or painting, it should be obvious that before a node is in the DOM tree it cannot be painted to the screen.

Finishing parsing

After the entire source has been parsed, first all deferred scripts will be executed (waiting for their source and all pending stylesheets to download). Their completion triggers the DOMContentLoaded event to be fired. Next, the parser will wait for any pending async scripts to finish loading and executing. Finally, once all subresources have finished downloading, the window's load event will be fired and parsing is complete.

Takeaway

With this understanding, it becomes clear how important it is to carefully consider where and how stylesheets and scripts are included in the document. Those decisions can have a significant impact on the efficiency of the page load.

How a web page loads的更多相关文章

  1. [转]How WebKit Loads a Web Page

    ref:https://www.webkit.org/blog/1188/how-webkit-loads-a-web-page/ Before WebKit can render a web pag ...

  2. 解读Web Page Diagnostics网页细分图

    解读Web Page Diagnostics网页细分图 http://blog.sina.com.cn/s/blog_62b8fc330100red5.html Web Page Diagnostic ...

  3. 网页细分图结果分析(Web Page Diagnostics)

    Discuz开源论坛网页细分图结果分析(Web Page Diagnostics) 续LR实战之Discuz开源论坛项目,之前一直是创建虚拟用户脚本(Virtual User Generator)和场 ...

  4. Atitit.web三大编程模型 Web Page Web Forms 和 MVC

    Atitit.web三大编程模型 Web Page    Web Forms 和 MVC 1. 编程模型是 Web Forms 和 MVC (Model, View, Controller). 2.  ...

  5. [转]Calling Web Service Functions Asynchronously from a Web Page 异步调用WebServices

    本文转自:http://www.codeproject.com/Articles/70441/Calling-Web-Service-Functions-Asynchronously-from Ove ...

  6. Tutorial: Importing and analyzing data from a Web Page using Power BI Desktop

    In this tutorial, you will learn how to import a table of data from a Web page and create a report t ...

  7. Android WebView常见问题的解决方案总结----例如Web page not available

    之前android虚拟机一直都可以直接联网,今天写了一个WebView之后,突然报出了Web page not available的错误,但是查看虚拟机自带的浏览器,是可以上网的,所以检查还是代码的问 ...

  8. LR实战之Discuz开源论坛——网页细分图结果分析(Web Page Diagnostics)

    续LR实战之Discuz开源论坛项目,之前一直是创建虚拟用户脚本(Virtual User Generator)和场景(Controller),现在,终于到了LoadRunner性能测试结果分析(An ...

  9. Home | eMine: Web Page Transcoding Based on Eye Tracking Project Page

    Home | eMine: Web Page Transcoding Based on Eye Tracking Project Page The World Wide Web (web) has m ...

随机推荐

  1. 【redis】redis的 key的命名规则

    key的命名规则 定义为 MS-TEN:SESSION_KEY_IN_LOGIN_NAME:fqh 使用:进行分割,这样存入redis的是有层次结构的,如下

  2. Azure上采用Powershell从已有的VHD创建VM

    刚刚的一篇Blog采用Json Template的方式从已有的VHD创建了一台新的VM.由于Json Template封装的比较好,可以改的内容不多. 下面将介绍通过用Powershell来从已有的V ...

  3. Java 字符串和时间互相转化 +时间戳

    一:字符串转换成date String datatime="2015-09-22 15:16:48"; SimpleDateFormat form = new SimpleDate ...

  4. [C++] 动态规划之矩阵连乘、最长公共子序列、最大子段和、最长单调递增子序列、0-1背包

    一.动态规划的基本思想 动态规划算法通常用于求解具有某种最优性质的问题.在这类问题中,可能会有许多可行解.每一个解都对应于一个值,我们希望找到具有最优值的解. 将待求解问题分解成若干个子问题,先求解子 ...

  5. Oracle 12c 搭建学习

    Oracle 12c 搭建学习 Vm workstaton10 安装linux 6.4 安装oracle12c Oracle 12c只支持64位系统 1 环境检查 [root@rac1 ~]# gre ...

  6. 深入VR之前 你应该知道VR头显透镜原理

    转自:http://www.gamelook.com.cn/2016/03/246817 要理解虚拟现实头显透镜的工作原理,首先要搞懂眼睛是如何看到事物的. 眼睛瞳孔后有晶状体,也就是眼珠子.眼睛的背 ...

  7. UTF-8, Unicode, GB2312格式串转换之C语言版

    原住址:http://www.cnitblog.com/wujian-IT/archive/2007/12/13/37671.html           /*      author:   wu.j ...

  8. c语言-单链表(一)

    定义节点: typedef struct Node { int data; Node* pNext; }NODE, *PNODE; 细节说明,PNode 就代表struct Node* ,上面的表单是 ...

  9. 2016.7.27 VS搜索正则表达式,在UltraEdit中可选用Perl正则引擎,按C#语法搜索

    表达式 语法 说明 任一字符 . 匹配除换行符外的任何一个字符. 最多 0 项或更多 * 匹配前面表达式的 0 个或更多搜索项. 最多一项或更多 + 匹配前面表达式的至少一个搜索项. 最少 0 项或更 ...

  10. pa14-30条职场经验

    可以说是很多本厚厚的职场经验书籍的精华部分,掌握了这30条可以说是天下无敌了,但真要掌握这30条经验可不是什么容易的事情,他们都是环环相 扣的,一条做不好可能有些能做好的项目就会落空,耐下性子,看看你 ...