How Browsers Work: Behind the scenes of modern web browsers
http://www.html5rocks.com/en/tutorials/internals/howbrowserswork/#Parser_Lexer_combination
Grammars
Parsing is based on the syntax rules the document obeys: the language or format it was written in. Every format you can parse must have deterministic grammar consisting of vocabulary and syntax rules. It is called a context free grammar. Human languages are not such languages and therefore cannot be parsed with conventional parsing techniques.
Parser–Lexer combination
Parsing can be separated into two sub processes: lexical analysis and syntax analysis.
Lexical analysis is the process of breaking the input into tokens. Tokens are the language vocabulary: the collection of valid building blocks. In human language it will consist of all the words that appear in the dictionary for that language.
Syntax analysis is the applying of the language syntax rules.
Parsers usually divide the work between two components: the lexer (sometimes called tokenizer) that is responsible for breaking the input into valid tokens, and the parser that is responsible for constructing the parse tree by analyzing the document structure according to the language syntax rules.
The lexer knows how to strip irrelevant characters like white spaces and line breaks.
Figure : from source document to parse trees
The parsing process is iterative. The parser will usually ask the lexer for a new token and try to match the token with one of the syntax rules. If a rule is matched, a node corresponding to the token will be added to the parse tree and the parser will ask for another token.
If no rule matches, the parser will store the token internally, and keep asking for tokens until a rule matching all the internally stored tokens is found. If no rule is found then the parser will raise an exception. This means the document was not valid and contained syntax errors.
How Browsers Work: Behind the scenes of modern web browsers的更多相关文章
- (转载)How browsers work--Behind the scenes of modern web browsers (前端必读)
浏览器可以被认为是使用最广泛的软件,本文将介绍浏览器的工 作原理,我们将看到,从你在地址栏输入google.com到你看到google主页过程中都发生了什么. 将讨论的浏览器 今天,有五种主流浏览器— ...
- 【转载】How browsers work--Behind the scenes of modern web browsers (前端必读)
浏览器可以被认为是使用最广泛的软件,本文将介绍浏览器的工 作原理,我们将看到,从你在地址栏输入google.com到你看到google主页过程中都发生了什么. 将讨论的浏览器 今天,有五种主流浏览器- ...
- Building Modern Web Apps-构建现代的 Web 应用程序
Building Modern Web Apps-构建现代的 Web 应用程序 视频长度:1 小时左右 视频作者:Scott Hunter 和 Scott Hanselman 视频背景:Visual ...
- a buzzword to refer to modern Web technologies
https://html.spec.whatwg.org/multipage/introduction.html#is-this-html5? HTML Living Standard — Last ...
- Building Modern Web Apps-构建现代的 Web 应用程序(一些感想)
<iframe src="http://channel9.msdn.com/Series/MVA-China/Web20140611A01/player?h=540&w=960 ...
- modern web application
http://www.codeproject.com/Reference/597538/Modern-Web-Development http://www.west-wind.com/presenta ...
- lineman 的理念与 modern web app
无意中翻到javascript 有个 lineman工具, 提供了一些脚手架 以及 默认的app目录结构,同时还附带了诸多前端的性能优化工具,在他的主页还发现其理念与我之前关于web app的开发模型 ...
- Cheatsheet: 2016 04.01 ~ 04.30
.NET String format Setting up Ubuntu for .NET Development ASP.NET Core and Angular2 - Part 1 - Upda ...
- 深入理解requestAnimationFrame
前言 本文主要参考w3c资料,从底层实现原理的角度介绍了requestAnimationFrame.cancelAnimationFrame,给出了相关的示例代码以及我对实现原理的理解和讨论. 先来看 ...
随机推荐
- TCP和UDP Socket
1.tcp协议的编程 * 1:客户端.步骤 * 1:创建Socket对象,构造方法里需要指定服务端的ip地址和端口. * Socket socket = new S ...
- jquery优化02
缓存变量:DOM遍历是昂贵的,所以尽量将会重用的元素缓存. $element = $('#element'); h = $element.height(); //缓存 $element.css('he ...
- .net 日期格式转换
DateTime dt = DateTime.Now; dt.ToString();//2005-11-5 13:21:25dt.ToFileTime().ToString();//127756416 ...
- 2016 Multi-University Training Contest 10
solved 7/11 2016 Multi-University Training Contest 10 题解链接 分类讨论 1001 Median(BH) 题意: 有长度为n排好序的序列,给两段子 ...
- 转盘游戏[XDU1006]
Problem 1006 - 转盘游戏 Time Limit: 1000MS Memory Limit: 65536KB Difficulty: Total Submit: 87 Accep ...
- extjs tips
<%@ page language="java" import="java.util.*" pageEncoding="UTF-8"% ...
- COJ968 WZJ的数据结构(负三十二)
WZJ的数据结构(负三十二) 难度级别:D: 运行时间限制:5000ms: 运行空间限制:262144KB: 代码长度限制:2000000B 试题描述 给你一棵N个点的无根树,边上均有权值,每个点上有 ...
- COJ976 WZJ的数据结构(负二十四)
试题描述 输入一个字符串S,回答Q次问题,给你l,r,输出从Sl--Sr组成的串在S中出现了多少次. 输入 第一行为一个字符串S.第二行为一个正整数Q.接下来Q行每行为l,r. 输出 对于每个询问,输 ...
- 使 SortList 实现重复键排序
SortList 默认对按Key来排序,且Key值不能重复,但有时可能需要用有重复值的Key来排序,以下是实现方式: 1.对强类型:以float为例 #region 使SortList能对重复键排序 ...
- Android--学习记录
最近天天被兔子激励,所以开始找工作,发现Android和iOS都会更有竞争力,所以就想学一下Android Android比iOS更开放,学习难度可能会更大,我已经做好了吃苦的准备 计划是三个月搞定, ...