Facebook Architecture
Facebook Architecture
Quora article
a relatively old presentation on facebook architecture
another InfoQ presentation on Facebook architecture / scale
Web frontend
- PHP
- HipHop
- HipHop Virtual Machine (HHVM)
- BigPipe to pipeline page rendering, by dividing the page into pagelet and pipeline.
- Vanish Cache for web caching
Business Logic
- service-oriented, exposed as service
- Thrift API
- multiple language bindings
- no need to worry about serialization / connection handling / threading
- support different server type: non-blocking, async, single-thread, multi-thread
- Java service uses a custom application server (not Tomcat or Jetty etc.)
Persistence
- MySQL, Memcached, Hadoop's HBase
- MySQL/Innodb used as key-value store, distributed / load-balanced to many instances
- global ID is assigned to user data (user info, wall posts, comments etc.)
- Blob data e.g. photos and videos, are handled separately
Logging
- Scribe, one instance on each host
- Scribe-HDFS for analytics
Photo
- first version is NFS-backed storage, served via HTTP
- Haystack, Facebook's object store for photos
- Haystack slides
- Massive CDN to cache/delivery data
- previously NFS-backed, but traditional POSIX file system incurs too much overhead which is not necessary: directory resolution, file metadata, inode etc.
- Haystack Store: 1 server's 10 TB storage is split into 100 "physical volumes"; physical volumes on different hosts are organized into "logical volumes", data are replicated within logical volume
- physical volume is simply a very large file (100 GB) mounted at /hay/haystack_/
- Haystack Cache: internal cache
- example of an image's URL:
http://<CDN>/<Cache>/<Machine id>/<Logical volume, Photo> - Haystack Directory: metadata / mapping
- mapping and URL construction
- load balance among logical volumes for write, and load balance among physical volumes (within a specific logical volume) for read.
- XFS works best with Haystack
News Feed
- the system is called multifeed in FB
- Facebook News Feed: Social Data at Scale, and slides
- recent (2015) redesign to News Feed
- What is News Feed
- fetch recent activity from all your friends
- gather it in a central place
- group into stories
- rank stories by relevance etc.
- send back results
- Scale
- 10 billion / day
- 60ms average latency
- Fan-out-on-write vs. Fan-out-on-read
- fan-out-on-write i.e. push writes to your friend
- can cause so called write amplification
- what Twitter originally does (with some optimization later on users with many followers, Justin Bieber Problem..)
- fan-out-on-read i.e. fetch and aggregate at read time - what Facebook does
- flexibility on read-time aggregation (like what content to generate, bound the data volume)
- How it works
- incoming requests is sent from PHP layer to an "aggregator", which figures out users to query (e.g. a request from me will query for all my friends)
- a server named leaf node holds all activities of a number of users
- there're many many leaf nodes for such purpose, with partitioning / possibly replication
- data is then loaded from the corresponding leaf node, then rank/aggregate the data, and finally send the stories back.
- PHP layer gets back a list of "action ids", and queries memcached/MySQL to load content of the action (like a video, a post)
- a "tailer": input data pipelines user actions and feedbacks to a leaf node in realtime (e.g. when a user posts a new video)
Facebook Chat
- Chat Stability and Scalability
- channel server: receive a user's message, and send to the user's browser, written in Erlang
- presence server: whether a user is online or not - channel server pushes active users to presence server - written in C++
- lexical_cast causes memory allocation, when heap is fragmented, new malloc() will spend quite some CPU time on finding memory
Facebook Search
- Intro to facebook search
- Role: find a specific name/page in Facebook, e.g. a guy named "Bob", a band named "Johny"
- Ranking (relevance indicators)
- personal context;
- social context;
- query itself;
- global popularity
- challenges
- no query cache can be used;
- no locality in index (i.e. no hot index)
- Life of a Typeahead Query
- initial try: preload user's friends, pages, groups, applications, upcoming events into browser cache - and try to serve the search here
- request sent to aggregator (similar to News Feed's aggregator), which delegates to several leaf services
- Graph Search on people
- Graph Search on objects
- global objects - an index on all pages and applications on Facebook, no personalization - could be cached
- each leaf service returns some data, aggregator merges and ranks the result, and send to web tier
- result from aggregator are ids to resources, web-tier will load the data and send back to user's browser
Graph Search
- Unicorn: A System for Searching the Social Graph
- Under the Hood: Building out the infrastructure for Graph Search
- Under the Hood: Indexing and ranking in Graph Search
- Under the Hood: The natural language interface of Graph Search
- Under the Hood: Building posts search
- hisotry of facebook search
- keyword based search
- typeahead search, prefix-matching
- Unicorn is an inverted index system for many-to-many mapping. Difference with typical inverted index is that it not only indexes "documents" or entities like users/pages/groups/applications, but also search based on the edges (edge types) between nodes
- graph search natural language interface example: employers of my friends who live in New York
- input node: ME
ME --[friend-edge]--> my friends (who live in NY)- load list of nodes connected by a specific edge-type to the input nodes, here edge-type is "friend-edge"[MY FRIENDS FROM NY]--[works-at-edge]--> employers- "apply operator" i.e. "work-at" edge
- Indexing: performed as a combination of map-reduce jobs that collect data from Hive tables, process them and convert into inverted index data structures
- live udpates are streamed into the index via a separate live udpate pipeline.
- Graph Search components (Unicorn) - essentially an in-memory database with a query language interface
- Vertical - an unicorn instance - different entity types are kept in separate Unicorn verticals, e.g. USER Vertical, PAGES Vertical
- index server - part of a vertical, holds some of the index given the index is too large to fit into one single host
- Vertical Aggregator - broadcasts query to all verticals, and rank them
- because there're multiple Unicorn instances (Verticals), there's a TOP AGGREGATOR to on top of all vertical aggregators - which runs blending algorithm to blend result from each vertical
- Query Rewriting: parse the query into a structured Unicorn retrivial query, correct spelling, synonyms / segmentation etc.
- example: "restaurants liked by Facebook employees" gets converted to
273819889375819/places/20531316728/employees/places-liked/intersect - Scoring to rank result (static ranking); then "Result set scoring" to score the result as a whole, and only return a subset (e.g. "photos of facebook employees" may contain too many photos from Mark Zuckerberg)
- Nested Queries: the structured query may be nested and need to be JOINed, e.g. "restaurants liked by Facebook employees"
- Query Suggestion: relies on a NLP module to identify what kinds of entity that may be (sri as in name vs. sri as in "people who live in Sri.."
- Machine Learning is used to adjust the "scoring function"
- How to evaluate Search algorithm changes
- CTR - click through rate
- DCG (discounted cumulative gain) - measures the usefulness (gain) of a result set, by considering the gain of each result in the set and the position of the result
- Natural Language Interface to Graph Search
- keywords as an interface is not good: nouns only, while connections in Facebook Graph data are verbs
- quite intensive content, see article
- Building Posts Search
- more than 1 billion posts added everyday
- Wormhole to listen on posts from MySQL store of posts
- much larger than other index types - stored in SSD instead of RAM
- trillions of posts, nobody can read all result - dynamically add optional clauses to bias the result towards what we think are more valuable to the user
Facebook Messages
- presentation in Hadoop Summit 2011
- Scaling the Messages Application Back End
- Inside Facebook Messages' Application Server
- The Underlying Technology of Messages
- HBase as main storage
- Database Layer: Master / Backup Master / Region Server [1..n]
- Storage Layer: Name node / secondary name node / Data node [1..n]
- Coordination Service: Zookeeper peers
- A user is sticky to an application server
- Cell: application server + HBase node
- 5 or more racks per cell, 20 servers per rack => more than 100 machine for a cell
- controllers (master nodes, zookeeper, name nodes) spread across racks
- User Directory Service: find cell for a given user
- A separate backup system - quick and dirty to me
- Use Scribe
- double logging to reduce loss - merge and dedup
- ability to restore
- quite some effort to make HBase more reliable, fail safe, and support real-time workload.
- action log - any updates to a user's mailbox is recorded into the action log - can be replayed for various purposes
- full text search - use Lucene to extract data and add to HBase, each keyword has its own column
- Testing via Dark Launch - mirror live traffic from Chat and Inbox into a test Messages cluster for about 10% of the users.
Configuration Management
- an 2015 paper on this topic
Facebook Architecture的更多相关文章
- facebook architecture 2 【转】
At the scale that Facebook operates, a lot of traditional approaches to serving web content breaks d ...
- 【转发】揭秘Facebook 的系统架构
揭底Facebook 的系统架构 www.MyException.Cn 发布于:2012-08-28 12:37:01 浏览:0次 0 揭秘Facebook 的系统架构 www.MyExcep ...
- Facebook的体系结构分析---外文转载
Facebook的体系结构分析---外文转载 From various readings and conversations I had, my understanding of Facebook's ...
- 【转】为什么很多看起来不是很复杂的网站,比如 Facebook、淘宝,都需要大量顶尖高手来开发?
先说你看到的页面上,最重要的几个:[搜索商品]——这个功能,如果你有几千条商品,完全可以用select * from tableXX where title like %XX%这样的操作来搞定.但是— ...
- Facebook MyRocks at MariaDB
Recently my colleague Rasmus Johansson announced that MariaDB is adding support for the Facebook MyR ...
- Facebook技术架构
Facebook MySQL,Multifeed (a custom distributed system which takes the tens of thousands of updates f ...
- Analyzing The Papers Behind Facebook's Computer Vision Approach
Analyzing The Papers Behind Facebook's Computer Vision Approach Introduction You know that company c ...
- 100 open source Big Data architecture papers for data professionals
zhuan :https://www.linkedin.com/pulse/100-open-source-big-data-architecture-papers-anil-madan Big Da ...
- Facebook 的系统架构(转)
来源:http://www.quora.com/What-is-Facebooks-architecture(由Micha?l Figuière回答) 根据我现有的阅读和谈话,我所理解的今天Faceb ...
随机推荐
- Spring静态属性注入
今天遇到一个工具类,需要静态注入一个属性,方法如下: 第一步:属性的set和get方法不要加static package cn.com.chinalife.ebusiness.common.util; ...
- The Primo ScholarRank Technology: Bringing the Most Relevant Results to the Top of the List
By Tamar Sadeh, Director of Marketing In today’s world, users’ expectations for a quick and easy sea ...
- jquery mobile将页面内容当成弹框进行显示
注:必须使用相对应版本的jquery mobile css.不然无法正常显示 <div data-role="page" id="pageone"> ...
- JQUERY1.9学习笔记 之基本过滤器(一) 动态选择器
动态选择器:animated Selector 描述:当选择器运行时,选择动态进程中的所有元素.(对动态进程起作用) jQuery( ":animated" ) 注释::anima ...
- phpcms v9教程 联动搜索在房地产网站开发中的应用
开发简述:使用phpcms v9系统,修改源文件5个,创建模型:楼盘.出售.出租.中介.小区,增加联动菜单:楼盘,增加用户组:房产中介.实现功能:游客发布信息.会员申请中介.楼盘全方位展示.报名团购. ...
- sersync做实时同步(第二步)
配置文件一般都在sersync2的根目录下.为.xml文件 下面做逐行的进行解释说明: <host hostip="localhost" port="8008&qu ...
- php异步调用方法实现示例
php 异步调用方法 客户端与服务器端是通过HTTP协议进行连接通讯,客户端发起请求,服务器端接收到请求后执行处理,并返回处理结果. 有时服务器需要执行很耗时的操作,这个操作的结果并不需要返回 ...
- C程序设计语言练习题1-20
练习1-20 编写程序detab,将输入中的制表符替换成适当数目的空格,使空格充满到下一个制表符终止位的地方.假设制表符终止位的位置是固定的,比如每隔n列就会出现一个制表符终止位.n应该是变量还是符号 ...
- (摘)Zebra打印机异常处理
一.一般条码打印设备按图指示方向,虚线为碳带安装路径,实线为标签路径.回卷后废碳带不易剥落,则在装入前用废标签的光滑底纸卷在回卷轴上,然后再上碳带.安装标签时,根据不同标签宽度调整限纸器.压头弹簧均匀 ...
- 【转】Microsoft visio 2013 pro 图文激活破解教程
原文网址:http://jingyan.baidu.com/article/db55b609ab0e1f4ba30a2ff0.html Microsoft visio 2013 pro 假如您使用Mi ...