How Browsers Work: Behind the scenes of modern web browsers
http://www.html5rocks.com/en/tutorials/internals/howbrowserswork/#Parser_Lexer_combination
Grammars
Parsing is based on the syntax rules the document obeys: the language or format it was written in. Every format you can parse must have deterministic grammar consisting of vocabulary and syntax rules. It is called a context free grammar. Human languages are not such languages and therefore cannot be parsed with conventional parsing techniques.
Parser–Lexer combination
Parsing can be separated into two sub processes: lexical analysis and syntax analysis.
Lexical analysis is the process of breaking the input into tokens. Tokens are the language vocabulary: the collection of valid building blocks. In human language it will consist of all the words that appear in the dictionary for that language.
Syntax analysis is the applying of the language syntax rules.
Parsers usually divide the work between two components: the lexer (sometimes called tokenizer) that is responsible for breaking the input into valid tokens, and the parser that is responsible for constructing the parse tree by analyzing the document structure according to the language syntax rules.
The lexer knows how to strip irrelevant characters like white spaces and line breaks.
Figure : from source document to parse trees
The parsing process is iterative. The parser will usually ask the lexer for a new token and try to match the token with one of the syntax rules. If a rule is matched, a node corresponding to the token will be added to the parse tree and the parser will ask for another token.
If no rule matches, the parser will store the token internally, and keep asking for tokens until a rule matching all the internally stored tokens is found. If no rule is found then the parser will raise an exception. This means the document was not valid and contained syntax errors.
How Browsers Work: Behind the scenes of modern web browsers的更多相关文章
- (转载)How browsers work--Behind the scenes of modern web browsers (前端必读)
浏览器可以被认为是使用最广泛的软件,本文将介绍浏览器的工 作原理,我们将看到,从你在地址栏输入google.com到你看到google主页过程中都发生了什么. 将讨论的浏览器 今天,有五种主流浏览器— ...
- 【转载】How browsers work--Behind the scenes of modern web browsers (前端必读)
浏览器可以被认为是使用最广泛的软件,本文将介绍浏览器的工 作原理,我们将看到,从你在地址栏输入google.com到你看到google主页过程中都发生了什么. 将讨论的浏览器 今天,有五种主流浏览器- ...
- Building Modern Web Apps-构建现代的 Web 应用程序
Building Modern Web Apps-构建现代的 Web 应用程序 视频长度:1 小时左右 视频作者:Scott Hunter 和 Scott Hanselman 视频背景:Visual ...
- a buzzword to refer to modern Web technologies
https://html.spec.whatwg.org/multipage/introduction.html#is-this-html5? HTML Living Standard — Last ...
- Building Modern Web Apps-构建现代的 Web 应用程序(一些感想)
<iframe src="http://channel9.msdn.com/Series/MVA-China/Web20140611A01/player?h=540&w=960 ...
- modern web application
http://www.codeproject.com/Reference/597538/Modern-Web-Development http://www.west-wind.com/presenta ...
- lineman 的理念与 modern web app
无意中翻到javascript 有个 lineman工具, 提供了一些脚手架 以及 默认的app目录结构,同时还附带了诸多前端的性能优化工具,在他的主页还发现其理念与我之前关于web app的开发模型 ...
- Cheatsheet: 2016 04.01 ~ 04.30
.NET String format Setting up Ubuntu for .NET Development ASP.NET Core and Angular2 - Part 1 - Upda ...
- 深入理解requestAnimationFrame
前言 本文主要参考w3c资料,从底层实现原理的角度介绍了requestAnimationFrame.cancelAnimationFrame,给出了相关的示例代码以及我对实现原理的理解和讨论. 先来看 ...
随机推荐
- JNDI 是什么
转自:http://blog.csdn.net/zhaosg198312/article/details/3979435 JNDI是 Java 命名与目录接口(Java Naming and Dire ...
- 由文章缩略图读出banner图
<div class="banner"> {dede:sql sql="select litpic FROM #@__archives where typei ...
- Wireshark分析器分析数据流过程
Wireshark分析器分析数据流过程 分析包是Wireshark最强大的功能之一.分析数据流过程就是将数据转换为可以理解的请求.应答.拒绝和重发等.帧包括了从捕获引擎或监听库到核心引擎的信息.Wir ...
- 水题 Codeforces Round #300 A Cutting Banner
题目传送门 /* 水题:一开始看错题意,以为是任意切割,DFS来做:结果只是在中间切出一段来 判断是否余下的是 "CODEFORCES" :) */ #include <cs ...
- Sql不区分大小写查询
select a.* from Pair_User a where 1=1 and UPPER(a.UserID) like 'EMH1001%' collate Chinese_PRC_C ...
- 设置完在Canvas的位置后,控件HitTest不响应的问题
have a Canvas with a couple of elements on it like Line, Path and Text Box. In the MouseOver event o ...
- Partran,Nastran和ANSYS的区别
Partran .Nastran是MSC公司的产品.Patran是前处理器,用于建模.划分网格.设定载荷和边界条件等等:Nastran只是MSC公司提供的求解器之一,主要用于结构分析和热分析,应用的是 ...
- quick 关于触摸的问题
以前遇到一个问题就是,如果触摸层不在最后,会导致触摸失效.这是由于下面添加的层挡住了触摸层,而后添加的层会位于上面,默认是不可点击,点击不可穿透的.所以我们必须将触摸层放置到最上面. Logic.lu ...
- BestCoder Round #74
身败名裂啊...... T1WA了半天,30min才A. T2又WA了一发,然后Hack刚2min就被别人叉了. T3做完后最后40min不知所措. 去叉别人,看到一个人写D题判m=0很奇怪,随手把他 ...
- Nginx 伪静态教程
1.将多个域名指向同一web目录: server_name www.php100.com php100.com; rewrite ^/$ / redirect; 2.将不带www的域名301转向到带w ...