TCAM and CAM memory usage inside networking devices

Valter Popeskic

As this is a networking blog, I will focus mostly on the use of CAM and TCAM memory in routers and switches. I will explain the role of TCAM in the router prefix lookup process and the role of CAM in the switch MAC address table lookup.

However, when we talk about this specific topic, most of you will ask: how is this memory built, from an architectural point of view?

How is it made so that it can perform lookups faster than other hardware or software solutions? That is the reason for the second part of the article, where I will try to explain briefly how the most common TCAM memory is built to have the capabilities it has.

CAM AND TCAM MEMORY

Inside routers, TCAM (Ternary Content Addressable Memory) is used for faster address lookup, which enables fast routing.

In switches, CAM (Content Addressable Memory) is used for building and searching the MAC address table that drives L2 forwarding decisions. By implementing router prefix lookup in TCAM, we move the Forwarding Information Base (FIB) lookup from software to hardware.

When we implement TCAM, the address search no longer depends on the number of prefix entries, because the main characteristic of TCAM is that it searches all of its entries in parallel. It means that no matter how many address prefixes are stored in TCAM, the router will find the longest prefix match in one iteration. It’s magic, right?

CEF Lookup

Image 1 shows how the FIB lookup works and how it points to an entry in the adjacency table. The search goes through all entries in the TCAM table in one iteration.

ROUTER

In routers, like the high-end Cisco ones, TCAM is used to enable CEF (Cisco Express Forwarding) in hardware. CEF builds the FIB table from the RIB (the routing table) and the adjacency table from the ARP table, so that a pre-prepared L2 header exists for every next-hop neighbour.

TCAM finds, in one try, the match for any destination prefix inside the FIB. Every prefix in the FIB points to an adjacency table entry that holds the pre-prepared L2 header and the outgoing interface for its next hop. The router glues the header to the packet in question and sends it out that interface. It seems fast to do it that way? It is fast!
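To make the relation between the RIB, the ARP table, the FIB and the adjacency table a bit more concrete, here is a minimal Python sketch. It is only an illustration and not how CEF is actually coded: the prefixes, next hops, MAC addresses, interface names and the function names `mac_bytes` and `build_tables` are all invented for the example.

```python
# Hypothetical input tables, invented for illustration only.
RIB = {  # prefix -> (next hop, outgoing interface)
    "10.1.0.0/16": ("192.0.2.1", "Gi0/1"),
    "10.2.0.0/16": ("192.0.2.1", "Gi0/1"),
    "0.0.0.0/0": ("198.51.100.1", "Gi0/2"),
}
ARP = {  # next-hop IP -> next-hop MAC
    "192.0.2.1": "00:11:22:33:44:55",
    "198.51.100.1": "66:77:88:99:aa:bb",
}
INTERFACE_MAC = {"Gi0/1": "02:00:00:00:00:01", "Gi0/2": "02:00:00:00:00:02"}
ETHERTYPE_IPV4 = bytes.fromhex("0800")


def mac_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_tables(rib, arp, interface_mac):
    """Build a FIB (prefix -> adjacency index) and an adjacency table."""
    adjacency = []   # list of (interface, pre-built Ethernet header)
    index_of = {}    # next hop -> position in the adjacency table
    fib = {}
    for prefix, (next_hop, interface) in rib.items():
        if next_hop not in index_of:
            # Destination MAC + source MAC + EtherType, prepared once.
            header = (mac_bytes(arp[next_hop])
                      + mac_bytes(interface_mac[interface])
                      + ETHERTYPE_IPV4)
            index_of[next_hop] = len(adjacency)
            adjacency.append((interface, header))
        fib[prefix] = index_of[next_hop]
    return fib, adjacency


fib, adjacency = build_tables(RIB, ARP, INTERFACE_MAC)
interface, header = adjacency[fib["0.0.0.0/0"]]
frame = header + b"<ip packet bytes>"  # "gluing" the header to the packet
```

Note that the two 10.x prefixes share one adjacency entry because they share a next hop, which is why the L2 header only has to be prepared once per neighbour.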

SWITCH

In the Layer 2 world of switches, CAM is used the most, as it enables the switch to build and search MAC address tables. A MAC address is meant to be unique, so the CAM architecture, with its ability to search only for exact matches, is perfect for MAC address lookup. That gives the switch the ability to go over the MAC addresses of all hosts connected to all ports in one iteration and resolve where to send received frames.

CAM is so perfect here because its architecture provides only two kinds of result: 0 or 1. So when we make a lookup in the CAM table, it will give us a true (1) result only if we searched for exactly the same bits. L2 forwarding decisions are the ones using this fast magical electronics!
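The exact-match behaviour of CAM can be imitated in software with a plain dictionary. This is only an analogy under my own assumptions: a Python dict is a hash table, not a parallel comparator, and the class name `MacTable`, the MAC addresses and the port names are made up for the example.

```python
class MacTable:
    """Toy MAC address table: exact match only, like a CAM lookup."""

    def __init__(self):
        self._entries = {}  # MAC address -> port

    def learn(self, src_mac, port):
        # Remember on which port a source MAC address was seen.
        self._entries[src_mac.lower()] = port

    def lookup(self, dst_mac):
        # Returns the port on an exact match, or None on a miss.
        return self._entries.get(dst_mac.lower())


table = MacTable()
table.learn("00:11:22:33:44:55", "Fa0/1")
table.learn("00:11:22:33:44:66", "Fa0/2")

hit = table.lookup("00:11:22:33:44:55")   # every bit is the same: match
miss = table.lookup("00:11:22:33:44:56")  # one different bit: no match
```

There is no "almost matched" result here, which is exactly the 0 or 1 behaviour described above.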

MORE THAN PLAIN ROUTING AND SWITCHING

Besides Longest-Prefix Matching, TCAM in today’s routers and multilayer switches is used to store ACLs, QoS policies and other entries from upper-layer processing. The TCAM architecture and its fast lookup enable us to implement access lists without an impact on router/switch performance.

Devices with this ability mostly have several TCAM memory modules, so that access lists in both directions and QoS can be applied at the same time on the same port without any performance impact. The lookups for all those different functions are made in parallel.
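As a rough illustration of how an access list can be expressed as TCAM-style entries, the sketch below packs a source address and a destination port into one key and stores each rule as a value/mask pair. The rule set, the 48-bit key layout and the function names `compile_acl` and `check` are my own assumptions for the example; real platforms use their own, much wider key formats.

```python
import ipaddress

# Hypothetical ACL: (source prefix, destination port or None for any, action).
ACL = [
    ("10.1.1.0/24", 22, "deny"),
    ("10.1.0.0/16", None, "permit"),
    ("0.0.0.0/0", None, "deny"),
]


def compile_acl(acl):
    """Turn each rule into a (value, mask, action) row.

    Key layout (an assumption): 32 bits of source IP, then 16 bits of port.
    A mask bit of 0 plays the role of the TCAM "don't care" state.
    """
    rows = []
    for prefix, port, action in acl:
        net = ipaddress.ip_network(prefix)
        port_value, port_mask = (0, 0) if port is None else (port, 0xFFFF)
        value = (int(net.network_address) << 16) | port_value
        mask = (int(net.netmask) << 16) | port_mask
        rows.append((value, mask, action))
    return rows


def check(rows, src_ip, dst_port):
    """Return the action of the first matching row (rules keep ACL order)."""
    key = (int(ipaddress.ip_address(src_ip)) << 16) | dst_port
    for value, mask, action in rows:
        if (key & mask) == value:
            return action
    return "deny"  # implicit deny when nothing matches


rows = compile_acl(ACL)
ssh_result = check(rows, "10.1.1.5", 22)  # hits the first rule
web_result = check(rows, "10.1.1.5", 80)  # falls through to the second rule
```

Unlike routing prefixes, ACL rows keep the order in which the rules were written, because the first match decides the action.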

MORE ON TCAM

TCAM is basically a special version of CAM constructed for rapid table lookups. One thing not mentioned before is that each bit stored in TCAM can be in one of three states: 0, 1, and X (the “don’t care” state).

With this strange third state, TCAM is perfect for storing IP routing table prefixes and searching them for the longest match.

There is just one condition: IP prefixes need to be sorted before they are stored in TCAM, so that the longest prefixes sit in the upper positions of the table, with higher priority (lower address location). This lets us always select the longest prefix from the matching results and thus enables Longest-Prefix Matching.
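The sorting condition can be sketched in a few lines of Python using the standard `ipaddress` module. This is a software imitation under my own assumptions: the prefixes are invented, the function names `build_tcam` and `lookup` are hypothetical, and the sequential loop stands in for what the hardware does in parallel.

```python
import ipaddress


def build_tcam(prefixes):
    """Return (value, mask, prefix) rows sorted longest prefix first."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    networks.sort(key=lambda n: n.prefixlen, reverse=True)
    return [(int(n.network_address), int(n.netmask), str(n)) for n in networks]


def lookup(tcam, address):
    """Return the first matching row, which is the longest match."""
    key = int(ipaddress.ip_address(address))
    for value, mask, prefix in tcam:
        # Mask bits set to 0 behave like the "don't care" state.
        if (key & mask) == value:
            return prefix
    return None


# Prefixes are given in arbitrary order; build_tcam sorts them.
tcam = build_tcam(["10.0.0.0/8", "0.0.0.0/0", "10.1.2.0/24", "10.1.0.0/16"])

best = lookup(tcam, "10.1.2.3")   # matches all four rows, /24 is on top
other = lookup(tcam, "10.9.9.9")  # matches only /8 and the default route
```

If the rows were left unsorted with `0.0.0.0/0` on top, every lookup would stop at the default route, which is why the order in the table matters so much.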

TCAM ARCHITECTURE

In Image 2 below I have drawn (please disregard my style) one of the simplest CEF explanations I could find in scientific articles. It basically shows the FIB on the left and the adjacency table on the right, with the FIB stored in TCAM and the adjacency table stored in RAM. Great, it shows without words what we spoke about before in the “ROUTER” section.

TCAM FIB

Image 2 FIB implemented in TCAM, adjacency table implemented in RAM

OK, here you must know that the IP addresses in the example are shorter than real ones. Here we have addresses of 5 bits, not 32 like IPv4; everything else is the same as the real stuff.

We are now looking at the left side, the CAM part, which here is explained for TCAM.

So in the TCAM world, in order to get the longest match like in Image 2 above, we need to sort the entries before populating the TCAM so that longer prefixes always sit in higher-priority places. As priority goes from the top downwards, higher priority means higher in the table, closer to the top. OK, now that we have solved this, it is easy to see that TCAM here compares all four address entries in parallel, bit by bit from left to right.

Entries here in TCAM are numbered 00, 01, 02, 03 from top to bottom, not like in the routing table above, where they are numbered 1, 2, 3, 4. Don’t let that confuse you.

The second and third entries (entries 01 and 02) are the same as the address we search for in the first three bits. When it comes to the fourth bit, it is “X” for entry 02.

X means “don’t care”: it is the third possible state of a TCAM bit. In the situation above, if we look at the second and third lines of the TCAM table, this search will produce a match for both entries. For “01”, the fourth bit matches and the fifth bit is don’t care. For “02”, the encoder input will also show a true value, as the fourth and fifth positions are don’t care!

Based on the priority order from above, line “01” is the longest-prefix match, so the encoder selects it and links that entry to an adjacency table entry for making the packet L2 ready. Remember, in this image, “01” is sent to the adjacency table as a pointer. It points to adjacency table entry 01, which will then be used for creating the L2 header of this packet.

The L2 header will be added to that packet and the packet will be sent out on port B to the neighbour.
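The walk through above can be replayed with a small Python sketch. Since the image is not reproduced here, the actual bit patterns, the search key and the ports other than port B are hypothetical; I only kept the structure from the text: four 5-bit entries numbered 00 to 03, entry 01 with one X, entry 02 with two X bits, and entry 01 pointing to port B.

```python
# Hypothetical 5-bit TCAM content, longest prefix on top (entry 00..03).
TCAM = ["01101", "1011X", "101XX", "11XXX"]
# Hypothetical adjacency table; only "port B" for entry 01 comes from the text.
ADJACENCY = ["port A", "port B", "port C", "port D"]


def matches(entry, key):
    """An entry matches when every bit is equal to the key bit or is X."""
    return all(cell in ("X", bit) for cell, bit in zip(entry, key))


def match_lines(tcam, key):
    """One true/false result per entry, as seen at the encoder input."""
    return [matches(entry, key) for entry in tcam]


def priority_encode(lines):
    """Pick the matching entry closest to the top of the table."""
    for index, hit in enumerate(lines):
        if hit:
            return index
    return None


key = "10110"
lines = match_lines(TCAM, key)   # [False, True, True, False]
winner = priority_encode(lines)  # 1, so entry 01 wins over entry 02
out_port = ADJACENCY[winner]     # "port B"
```

Entries 01 and 02 both report a match, and only the priority order decides that 01, the longer prefix, is the one used as the pointer into the adjacency table.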

TCAM PARALLEL SEARCH PROCESS INSIDE CIRCUITRY

Actually, with CAM and TCAM chips the logic is slightly different than you might think.

For all entries that match the searched one, the encoder input will get a “true” signal, and all non-matching entries will show a “false” output; no problems there. The catch is at the beginning of the process. Before the search begins, every entry stored in TCAM closes the circuit on its TCAM word and shows “true” at the encoder side. All entries are temporarily in the match state. When the parallel search is done, it breaks the match for every entry that has at least one bit that does not match the searched entry.

Here is the explanation of the “don’t care” bit: when the search gets to an X bit, it will not change the state of that matchline. That’s why lines No. 2 and No. 3 made a match, and that’s why TCAM is perfect for longest-prefix lookup.

This also explains why TCAM memory is so power hungry. It needs to power all circuits to make a search, not only the matching ones. Limited memory space and the power consumption associated with a large amount of parallel active circuitry are the main issues with TCAM.
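The "everything matches first, then mismatches break the line" behaviour can be modelled like this. It is a simplified software model under my own assumptions: the entries and the key are hypothetical 5-bit values, the function name `parallel_search` is made up, and the nested loops only simulate what the circuitry does for all cells at the same moment.

```python
def parallel_search(entries, key):
    """Simulate matchlines that start in the match state.

    Returns the final matchline states and the number of cells that
    took part in the comparison (every cell of every entry).
    """
    matchlines = [True] * len(entries)  # all lines start as "match"
    cells_compared = 0
    for row, entry in enumerate(entries):
        for cell, key_bit in zip(entry, key):
            cells_compared += 1
            # An X cell never touches the line; a mismatch breaks it.
            if cell != "X" and cell != key_bit:
                matchlines[row] = False
    return matchlines, cells_compared


# Hypothetical 5-bit entries, longest prefix on top.
ENTRIES = ["01101", "1011X", "101XX", "11XXX"]
lines, cells = parallel_search(ENTRIES, "10110")
# lines == [False, True, True, False], cells == 20
```

The cell counter is there to hint at the power issue: all 20 cells are active for a single search, even though only two entries end up matching.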

If we now look at the right side of Image 2, we see that the adjacency table is built in RAM. The adjacency table uses ARP table and routing table data to build pre-prepared L2 headers for every next-hop neighbour. As described before in the “ROUTER” section, it prepares the packet to be sent to Layer 1 and out the interface in a flash. Entries need to keep L2 data, and this data does not change often. RAM is consequently a perfect fit for the adjacency table: quick, not expensive, not space limited and not so power hungry.
