HTTP Cache
 
Overview
 
The HTTP Cache is the module that receives HTTP(S) requests and decides when and how to fetch data from the Disk Cache or from the network. The cache lives in the browser process, as part of the network stack. It should not be confused with Blink's in-memory cache, which lives in the renderer process and is tightly coupled with the resource loader.
 
Logically the cache sits between the content-encoding logic and the transfer-encoding logic, which means that it deals with transfer-encoding properties and stores resources with the content-encoding set by the server.
 
The cache implements the HttpTransactionFactory interface, so an HttpCache::Transaction (which is an implementation of HttpTransaction) will be the transaction associated with the URLRequestJob used to fetch most URLRequests.
 
There's an instance of an HttpCache for every profile (and for every isolated app). In fact, a profile may contain two instances of the cache: one for regular requests and another one for media requests.
 
Note that because the HttpCache is the one in charge of serving requests either from disk or from the network, it actually owns the HttpTransactionFactory that creates network transactions, and the disk_cache::Backend that is used to serve requests from disk. When the HttpCache is destroyed (usually when the profile data goes away), both the disk backend and the network layer (HttpTransactionFactory) go away.
 
There may be code outside of the cache that keeps a copy of the pointer to the disk cache backend. In that case, it is a requirement that the real ownership is maintained at all times, which means that such code has to be owned transitively by the cache (so that backend destruction happens synchronously with the destruction of the code that kept the pointer).
 

Operation

 
The cache is responsible for:
 
  • Creating and managing the disk cache backend.
 

This is mostly an initialization problem. The cache is created without a backend (but with a backend factory), and the backend is created on-demand by the first request that needs one. The HttpCache has all the logic to queue requests until the backend is created.

 
  • Creating HttpCache::Transactions.
 
  • Creating and managing ActiveEntries that are used by HttpCache::Transactions to interact with the disk backend.
 
An ActiveEntry is a small object that represents a disk cache entry and all the transactions that have access to it. The Writer, the list of Readers and the list of pending transactions (waiting to become Writer or Readers) are part of the ActiveEntry.
 
The cache has the code to create or open disk cache entries and place them on an ActiveEntry. It also has all the logic to attach a transaction to an ActiveEntry and to remove it again.
 
  • Enforcing the cache lock.
 
The cache implements a single-writer, multiple-reader lock so that only one network request for the same resource is in flight at any given time.
 
Note that the existence of the cache lock means that no bandwidth is wasted re-fetching the same resource simultaneously. On the other hand, it forces requests to wait until a previous request (the Writer) finishes downloading a resource before they can start reading from it, which is particularly troublesome for long-lived requests. Simply bypassing the cache for subsequent requests is not a viable solution, as it would introduce consistency problems: a renderer could experience the effect of going back in time, that is, receive a version of the resource that is older than a version that it already received (but which skipped the browser cache).
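
The ActiveEntry bookkeeping and the cache lock described above can be illustrated with a small model. This is only a sketch under simplifying assumptions, not the actual Chromium code: SketchTransaction, SketchActiveEntry, AddTransaction and DoneWithEntry are hypothetical names, and the policy shown (everything waits while a Writer is attached, a would-be Writer waits for Readers, arrival order is preserved) is a plain single-writer, multiple-reader lock.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for HttpCache::Transaction; only identity matters here.
struct SketchTransaction {
  int id;
};

enum class Access { kRead, kWrite };

// A transaction waiting for its turn, together with the access it asked for.
struct Pending {
  SketchTransaction* transaction;
  Access access;
};

// Simplified model of an ActiveEntry: one disk cache entry plus every
// transaction that has, or is waiting for, access to it.
struct SketchActiveEntry {
  std::string key;                          // e.g. the resource URL
  SketchTransaction* writer = nullptr;      // at most one Writer
  std::vector<SketchTransaction*> readers;  // any number of Readers
  std::deque<Pending> pending;              // waiting to become Writer/Reader
};

// Lock rule: nobody joins while a Writer is attached, and a new Writer
// cannot join while Readers are attached.
bool CanAttach(const SketchActiveEntry& entry, Access access) {
  if (entry.writer != nullptr) return false;
  if (access == Access::kWrite && !entry.readers.empty()) return false;
  return true;
}

void Attach(SketchActiveEntry& entry, SketchTransaction* t, Access access) {
  assert(CanAttach(entry, access));
  if (access == Access::kWrite) {
    entry.writer = t;
  } else {
    entry.readers.push_back(t);
  }
}

// Returns true if the transaction was attached right away, false if queued.
bool AddTransaction(SketchActiveEntry& entry, SketchTransaction* t,
                    Access access) {
  // Never jump ahead of transactions that are already waiting.
  if (!entry.pending.empty() || !CanAttach(entry, access)) {
    entry.pending.push_back({t, access});
    return false;
  }
  Attach(entry, t, access);
  return true;
}

// Detaches a transaction and promotes as many waiters as the lock allows.
void DoneWithEntry(SketchActiveEntry& entry, SketchTransaction* t) {
  if (entry.writer == t) {
    entry.writer = nullptr;
  } else {
    entry.readers.erase(
        std::remove(entry.readers.begin(), entry.readers.end(), t),
        entry.readers.end());
  }
  while (!entry.pending.empty() &&
         CanAttach(entry, entry.pending.front().access)) {
    Pending next = entry.pending.front();
    entry.pending.pop_front();
    Attach(entry, next.transaction, next.access);
  }
}
```

In this model, two read requests that arrive while a Writer is downloading the resource are queued, and both become Readers as soon as the Writer is done, which mirrors the waiting behavior discussed above.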
 

The bulk of the logic of the HTTP cache is actually implemented by the cache transaction.
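
Going back to the first responsibility listed in this section, the on-demand creation of the backend (with requests queued until it exists) can be sketched as follows. The class and method names here (SketchCache, RunWhenBackendReady, OnBackendCreated) are hypothetical, and the asynchronous completion of backend creation is modeled as an explicit call so that the example stays self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for disk_cache::Backend.
struct SketchBackend {};

class SketchCache {
 public:
  using BackendFactory = std::function<std::unique_ptr<SketchBackend>()>;
  using Request = std::function<void(SketchBackend&)>;

  // The cache starts with a factory but without a backend.
  explicit SketchCache(BackendFactory factory)
      : factory_(std::move(factory)) {}

  // Runs the request right away if the backend exists; otherwise queues it
  // and remembers that backend creation has been requested.
  void RunWhenBackendReady(Request request) {
    if (backend_) {
      request(*backend_);
      return;
    }
    queued_.push_back(std::move(request));
    building_backend_ = true;
  }

  // Models the completion of backend creation: build the backend once and
  // then drain every request that was waiting for it, in arrival order.
  void OnBackendCreated() {
    assert(building_backend_ && !backend_);
    backend_ = factory_();
    assert(backend_);
    building_backend_ = false;
    std::vector<Request> ready;
    ready.swap(queued_);
    for (Request& request : ready) request(*backend_);
  }

  bool has_backend() const { return backend_ != nullptr; }
  std::size_t queued_count() const { return queued_.size(); }

 private:
  BackendFactory factory_;
  std::unique_ptr<SketchBackend> backend_;
  std::vector<Request> queued_;
  bool building_backend_ = false;
};
```

The point of the design is that callers never need to know whether the backend already exists: the first request triggers its creation, and later requests either wait in the same queue or run immediately.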

 

Sparse Entries

 
The HTTP Cache supports using sparse entries for any resource. Sparse entries are generally used by media resources (think large video or audio files), and the general idea is to be able to store only some parts of the resource and to serve those parts back from disk.
 
The caller tells the cache that it should create a sparse entry instead of a regular entry by issuing a byte-range request. That tells the cache that the caller is prepared to deal with byte ranges, so the cache may store byte ranges. Note that if the cache already has a resource stored for the requested URL, issuing a byte-range request will not "upgrade" that resource to be a sparse entry; in fact, in general there is no way to transform a regular entry into a sparse entry or vice versa.
 
Once the HttpCache creates a sparse entry, the disk cache backend will be in charge of storing the byte ranges in an efficient way, and it will be able to evict part of a resource without throwing the whole entry away. For example, when watching a long video, the backend can discard the first part of the movie while still storing the part that is currently being received (and presented to the user). If the user goes back a few minutes, content can be served from the cache. If the user seeks to a portion that was already evicted, that part of the video can be fetched again.
 
At any given time, it is possible for the cache to have stored a set of sections of a resource (which don't necessarily match any actual byte-range requested by the user) interspersed with missing data. In order to fulfill a given request, the HttpCache may have to issue a series of byte-range network requests for the missing parts, while returning data as needed either from disk or from the network. In other words, when dealing with sparse entries, the HttpCache::Transaction will synthesize network byte-range requests as needed.
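
To make the idea of synthesized byte-range requests concrete, the sketch below computes which parts of a requested range are missing from a set of stored sections and formats each gap as an HTTP Range header value. ByteRange, MissingRanges and ToRangeHeader are hypothetical helpers written for this illustration; the real cache asks the disk backend which data is available rather than holding a list of ranges in memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Half-open byte interval [begin, end).
struct ByteRange {
  std::int64_t begin;
  std::int64_t end;
};

// Given the sections already on disk (assumed sorted and non-overlapping)
// and the range the caller asked for, returns the gaps that would have to
// be fetched from the network, in order.
std::vector<ByteRange> MissingRanges(const std::vector<ByteRange>& stored,
                                     ByteRange requested) {
  assert(requested.begin <= requested.end);
  std::vector<ByteRange> missing;
  std::int64_t cursor = requested.begin;  // first byte not yet accounted for
  for (const ByteRange& section : stored) {
    if (section.end <= cursor) continue;         // entirely before the cursor
    if (section.begin >= requested.end) break;   // entirely after the request
    if (section.begin > cursor) missing.push_back({cursor, section.begin});
    cursor = std::max(cursor, section.end);
    if (cursor >= requested.end) break;
  }
  if (cursor < requested.end) missing.push_back({cursor, requested.end});
  return missing;
}

// Formats a gap as a Range header value. HTTP byte ranges are inclusive on
// both ends, hence the "- 1" on the half-open end.
std::string ToRangeHeader(ByteRange range) {
  assert(range.begin < range.end);
  return "bytes=" + std::to_string(range.begin) + "-" +
         std::to_string(range.end - 1);
}
```

For example, with bytes [0, 100) and [300, 400) on disk and a request for [50, 500), the gaps are [100, 300) and [400, 500), which correspond to two network requests with "bytes=100-299" and "bytes=400-499"; the rest is served from disk.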
 

Truncated Entries

 
A second scenario where the cache will generate byte-range requests is when a regular entry (not sparse) was not completely received before the connection was lost (or the caller cancelled the request). In that case, the cache will attempt to serve the first part of the resource from disk, and issue a byte-range request for the remainder of the resource. A large part of the logic to handle truncated entries is the same logic needed to support sparse entries.
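
As an illustration of resuming a truncated entry, the sketch below builds the request headers for the remainder of a resource. It assumes that the stored entry kept an ETag, and it uses the standard HTTP If-Range mechanism so that the server sends the whole resource again if it has changed. TruncatedEntry and ResumeHeaders are hypothetical names, and falling back to a full fetch when no validator is stored is an assumption of this sketch rather than a statement about the actual implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical summary of what is known about a truncated entry.
struct TruncatedEntry {
  std::int64_t bytes_on_disk;  // length of the prefix that was stored
  std::string etag;            // validator from the stored response, if any
};

// Returns the extra request headers needed to fetch only the remainder.
// An empty result means "do not resume, request the whole resource".
std::vector<std::string> ResumeHeaders(const TruncatedEntry& entry) {
  std::vector<std::string> headers;
  if (entry.etag.empty() || entry.bytes_on_disk <= 0) return headers;
  // Open-ended range: everything from the first missing byte onwards.
  headers.push_back("Range: bytes=" + std::to_string(entry.bytes_on_disk) +
                    "-");
  // Only honor the range if the resource still matches the stored prefix.
  headers.push_back("If-Range: " + entry.etag);
  return headers;
}
```

With 4096 bytes on disk, the first 4096 bytes are served from the cache and the network request carries "Range: bytes=4096-", so the two pieces join into the complete resource.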
 

Byte-Range Requests

 
As explained above, byte-range requests are used to trigger the creation of sparse entries (if the resource was not previously stored). From the user's point of view, the cache will transparently fulfill any combination of byte-range requests and regular requests from sparse, truncated or normal entries. Needless to say, a client that uses byte-range requests should be prepared to deal with the implications of those requests, such as having to determine when requests can be combined and what a range applies to (the over-the-wire bytes).
 

HttpCache::Transaction

 
The bulk of the cache logic is implemented by the cache transaction. At the center of the implementation there is a very large state machine (probably the most common pattern in the network stack, given the asynchronous nature of the problem). Note that there's a block of comments that document the most common flow patterns for the state machine, just before the main switch implementation.
 
This is a general (not exhaustive) diagram of the state machine:
 
 
This diagram is not meant to track the latest version of the code, but rather to provide a rough overview of what the state machine transitions look like. The flow is relatively straightforward for regular entries, but the fact that the cache can generate a number of network requests to fulfill a single request that involves sparse entries means that there is a big loop going back to START_PARTIAL_CACHE_VALIDATION. Remember that each individual network request can fail, or the server may have a more recent version of the resource, although in general that kind of server behavior while we are working with a request will result in an error condition.
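
The loop back to START_PARTIAL_CACHE_VALIDATION can be pictured with a toy state machine. This is a deliberately reduced sketch: apart from START_PARTIAL_CACHE_VALIDATION, the state names and the LoopTransaction class are made up for the example, everything runs synchronously, and error handling is left out, whereas the real transaction has many more states and suspends itself while I/O is pending.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Reduced, hypothetical state set for the illustration.
enum class State {
  kStartPartialCacheValidation,
  kSendRequest,
  kDone,
};

class LoopTransaction {
 public:
  // `missing_ranges` is how many sections still have to come from the network.
  explicit LoopTransaction(int missing_ranges) : remaining_(missing_ranges) {}

  // Drives the state machine to completion and returns the visited states.
  std::vector<std::string> Run() {
    std::vector<std::string> trace;
    State state = State::kStartPartialCacheValidation;
    while (state != State::kDone) {
      switch (state) {
        case State::kStartPartialCacheValidation:
          trace.push_back("START_PARTIAL_CACHE_VALIDATION");
          // Decide whether another network byte-range request is needed.
          state = remaining_ > 0 ? State::kSendRequest : State::kDone;
          break;
        case State::kSendRequest:
          trace.push_back("SEND_REQUEST");
          --remaining_;  // one missing section has now been fetched
          // Loop back to see what the next section requires.
          state = State::kStartPartialCacheValidation;
          break;
        case State::kDone:
          break;
      }
    }
    return trace;
  }

 private:
  int remaining_;
};
```

A transaction with two missing sections visits START_PARTIAL_CACHE_VALIDATION three times: once before each network request and once more to find that nothing is left to fetch.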
