http://www.html5rocks.com/en/tutorials/internals/howbrowserswork/#Parser_Lexer_combination

Grammars

Parsing is based on the syntax rules the document obeys: the language or format it was written in. Every format you can parse must have deterministic grammar consisting of vocabulary and syntax rules. It is called a context free grammar. Human languages are not such languages and therefore cannot be parsed with conventional parsing techniques.

Parser–Lexer combination

Parsing can be separated into two sub processes: lexical analysis and syntax analysis.

Lexical analysis is the process of breaking the input into tokens. Tokens are the language vocabulary: the collection of valid building blocks. In human language it will consist of all the words that appear in the dictionary for that language.

Syntax analysis is the applying of the language syntax rules.

Parsers usually divide the work between two components: the lexer (sometimes called tokenizer) that is responsible for breaking the input into valid tokens, and the parser that is responsible for constructing the parse tree by analyzing the document structure according to the language syntax rules.
The lexer knows how to strip irrelevant characters like white spaces and line breaks.

   Figure : from source document to parse trees

The parsing process is iterative. The parser will usually ask the lexer for a new token and try to match the token with one of the syntax rules.   If a rule is matched, a node corresponding to the token will be added to the parse tree and the parser will ask for another token.

If no rule matches, the parser will store the token internally, and keep asking for tokens until a rule matching all the internally stored tokens is found. If no rule is found then the parser will raise an exception.  This means the document was not valid and contained syntax errors.

How Browsers Work: Behind the scenes of modern web browsers的更多相关文章

  1. (转载)How browsers work--Behind the scenes of modern web browsers (前端必读)

    浏览器可以被认为是使用最广泛的软件,本文将介绍浏览器的工 作原理,我们将看到,从你在地址栏输入google.com到你看到google主页过程中都发生了什么. 将讨论的浏览器 今天,有五种主流浏览器— ...

  2. 【转载】How browsers work--Behind the scenes of modern web browsers (前端必读)

    浏览器可以被认为是使用最广泛的软件,本文将介绍浏览器的工 作原理,我们将看到,从你在地址栏输入google.com到你看到google主页过程中都发生了什么. 将讨论的浏览器 今天,有五种主流浏览器- ...

  3. Building Modern Web Apps-构建现代的 Web 应用程序

    Building Modern Web Apps-构建现代的 Web 应用程序 视频长度:1 小时左右 视频作者:Scott Hunter 和 Scott Hanselman 视频背景:Visual ...

  4. a buzzword to refer to modern Web technologies

    https://html.spec.whatwg.org/multipage/introduction.html#is-this-html5? HTML Living Standard — Last ...

  5. Building Modern Web Apps-构建现代的 Web 应用程序(一些感想)

    <iframe src="http://channel9.msdn.com/Series/MVA-China/Web20140611A01/player?h=540&w=960 ...

  6. modern web application

    http://www.codeproject.com/Reference/597538/Modern-Web-Development http://www.west-wind.com/presenta ...

  7. lineman 的理念与 modern web app

    无意中翻到javascript 有个 lineman工具, 提供了一些脚手架 以及 默认的app目录结构,同时还附带了诸多前端的性能优化工具,在他的主页还发现其理念与我之前关于web app的开发模型 ...

  8. Cheatsheet: 2016 04.01 ~ 04.30

    .NET String format Setting up Ubuntu for .NET Development ASP.​NET Core and Angular2 - Part 1 - Upda ...

  9. 深入理解requestAnimationFrame

    前言 本文主要参考w3c资料,从底层实现原理的角度介绍了requestAnimationFrame.cancelAnimationFrame,给出了相关的示例代码以及我对实现原理的理解和讨论. 先来看 ...

随机推荐

  1. servlet中cookie的使用

    ---恢复内容开始--- Cookie是存储在客户端计算机上的文本文件,并保留了它们的各种信息跟踪的目的. Java Servlet透明支持HTTP Cookie. 涉及标识返回用户有三个步骤: 服务 ...

  2. 分享Kali Linux 2016.2第36周镜像虚拟机

    分享Kali Linux 2016.2第36周镜像虚拟机   9月9日,Kali Linux官方发布Kali Linux 2016.2周更新镜像.今天以64位镜像安装了一个虚拟机,分享给大家.该虚拟机 ...

  3. HealthKit开发教程之HealthKit的主要类型数据

    HealthKit开发教程之HealthKit的主要类型数据 在HealthKit中,我们将最常用到的数据称之为主要数据.主要数据基本上有三种:长度类型的数据.质量类型的数据.能量类型的数据.本节将主 ...

  4. ARP缓存记录种类动态条目和静态条目

    ARP缓存记录种类动态条目和静态条目 为使广播量最小,ARP维护IP地址到MAC地址映射的缓存以便将来使用.根据缓存的有效期时间,ARP缓存中包含动态和静态条目本文选自ARP协议全面实战手册. 这里首 ...

  5. MVC ActionResult视图结果

    摘要 MVC框架针对HttpResponse进行抽象与多态,使HttpResponse均可表示为ActionResult.那么,抽象和多态表现在哪里呢? //封装一个Action的结果. public ...

  6. VC创建预编译文件

    Building a simple "hello world" Ogre application can take several seconds on a modern mach ...

  7. java多次替换(replace不行)

    import java.util.regex.Matcher; import java.util.regex.Pattern; public class test { public static vo ...

  8. BZOJ3251 : 树上三角形

    BZOJ AC1000题纪念~~~ 将x到y路径上的点权从小到大排序 如果不存在b[i]使得b[i]+b[i+1]>b[i+2]则无解 此时b数列增长速度快于斐波那契数列,当达到50项时就会超过 ...

  9. IOS 项目 小说 1

    架构: logo: logo标识(在image文件夹中修改某图片名称为icon) default: 默认页面的启动效果(在image文件夹中修改某图片名称为Default) image:存放图片(根目 ...

  10. IntelliJ IDEA 14 SVN无法正常使用问题

    通过SVN导入项目 SVN checkout时候会出现如下错误: Cannot run program "svn" (in directory "E:\Projects& ...