References

https://gcc.gnu.org/onlinedocs/gcc-4.3.2/gcc/Other-Builtins.html#Other-Builtins

https://en.wikipedia.org/wiki/Find_first_set#CTZ

| Tool/library | Name | Type | Input type(s) | Notes | Result for zero input |
|---|---|---|---|---|---|
| POSIX.1 compliant libc, 4.3BSD libc, OS X 10.3 libc | ffs | Library function | int | Includes glibc. POSIX does not supply the complementary log base 2 / clz. | 0 |
| FreeBSD 5.3 libc, OS X 10.4 libc | ffsl, fls, flsl | Library function | int, long | fls ("find last set") computes (log base 2) + 1. | 0 |
| FreeBSD 7.1 libc | ffsll, flsll | Library function | long long | | 0 |
| GCC | __builtin_ffs[l,ll,imax] | Built-in functions | unsigned int, unsigned long, unsigned long long, uintmax_t | | 0 |
| GCC 3.4.0, Clang 5.x | __builtin_clz[l,ll,imax], __builtin_ctz[l,ll,imax] | Built-in functions | unsigned int, unsigned long, unsigned long long, uintmax_t | | undefined |
| Visual Studio 2005 | _BitScanForward, _BitScanReverse | Compiler intrinsics | unsigned long, unsigned __int64 | Separate return value to indicate zero input | 0 |
| Visual Studio 2008 | __lzcnt | Compiler intrinsic | unsigned short, unsigned int, unsigned __int64 | Relies on x64-only lzcnt instruction | Input size in bits |
| Intel C++ Compiler | _bit_scan_forward, _bit_scan_reverse | Compiler intrinsics | int | | undefined |
| NVIDIA CUDA | __clz | Functions | 32-bit, 64-bit | Compiles to fewer instructions on the GeForce 400 Series | 32 |
| NVIDIA CUDA | __ffs | Functions | 32-bit, 64-bit | | 0 |
| LLVM | llvm.ctlz.*, llvm.cttz.* | Intrinsic | 8, 16, 32, 64, 256 | LLVM assembly language | Input size if arg 2 is 0, else undefined |
| GHC 7.10 (base 4.8), in Data.Bits | countLeadingZeros, countTrailingZeros | Library function | FiniteBits b => b | Haskell programming language | Input size in bits |

— Built-in Function: int __builtin_ffs (unsigned int x)
  Returns one plus the index of the least significant 1-bit of x, or if x is zero, returns zero.
  Position of the first 1 bit, counting from the right (1-based).
— Built-in Function: int __builtin_clz (unsigned int x)
  Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined.
  Number of 0 bits counting from the left.
— Built-in Function: int __builtin_ctz (unsigned int x)
  Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined.
  Number of 0 bits counting from the right.
— Built-in Function: int __builtin_popcount (unsigned int x)
  Returns the number of 1-bits in x.

  Number of 1 bits in x.
— Built-in Function: int __builtin_parity (unsigned int x)
  Returns the parity of x, i.e. the number of 1-bits in x modulo 2.

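  Parity of the number of 1 bits in x (1 if odd, 0 if even).
All of the functions above also have long and long long versions (suffixes l and ll). The table above also lists an imax suffix (for example, __builtin_clzimax), which takes a uintmax_t argument rather than a fixed-width type such as uint32_t.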
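Because __builtin_clz and __builtin_ctz are undefined for a zero argument, callers usually guard that case themselves. The following is a minimal sketch, assuming GCC or Clang; the helper names safe_clz, safe_ctz and floor_log2 are hypothetical and not part of GCC, and returning the bit width for zero is just one possible convention.

```c
#include <limits.h>

/* Hypothetical wrapper: __builtin_clz(0) is undefined, so return the
   bit width of unsigned int for a zero input instead. */
static int safe_clz(unsigned int x)
{
    return x ? __builtin_clz(x) : (int)(sizeof x * CHAR_BIT);
}

/* Hypothetical wrapper: same convention for the trailing-zero count. */
static int safe_ctz(unsigned int x)
{
    return x ? __builtin_ctz(x) : (int)(sizeof x * CHAR_BIT);
}

/* floor(log2(x)) derived from the leading-zero count.
   The caller must ensure x > 0. */
static int floor_log2(unsigned int x)
{
    return (int)(sizeof x * CHAR_BIT) - 1 - __builtin_clz(x);
}
```

For x = 0x28 (binary 101000), __builtin_ffs gives 4, __builtin_ctz gives 3, __builtin_popcount gives 2 and __builtin_parity gives 0; with a 32-bit unsigned int, __builtin_clz gives 26 and floor_log2 gives 5.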
 
— Built-in Function: int32_t __builtin_bswap32 (int32_t x)

Reverses the byte order of a 32-bit value.

Returns x with the order of the bytes reversed; for example,
0xaabbccdd becomes 0xddccbbaa. Byte here always means
exactly 8 bits.

— Built-in Function: int64_t __builtin_bswap64 (int64_t x)

Reverses the byte order of a 64-bit value.

Similar to __builtin_bswap32, except the argument and return types
are 64-bit.
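A common use of the byte-swap built-ins is converting between big-endian data and host byte order. The sketch below assumes GCC or Clang and relies on the predefined __BYTE_ORDER__ macro; the helper names swap32, swap64 and load_be32 are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: thin wrappers over the built-ins. */
static uint32_t swap32(uint32_t x) { return __builtin_bswap32(x); }
static uint64_t swap64(uint64_t x) { return __builtin_bswap64(x); }

/* Hypothetical helper: read a big-endian 32-bit value from a byte buffer
   and return it in host byte order. */
static uint32_t load_be32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);          /* avoids unaligned access */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    v = __builtin_bswap32(v);         /* swap only on little-endian hosts */
#endif
    return v;
}
```

With these helpers, swap32(0xaabbccdd) yields 0xddccbbaa, and load_be32 on the bytes 0x12 0x34 0x56 0x78 yields 0x12345678 on either kind of host.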

 
— Built-in Function: long __builtin_expect (long exp, long c)

Branch prediction

You may use __builtin_expect to provide the compiler with
branch prediction information. In general, you should prefer to
use actual profile feedback for this (-fprofile-arcs), as
programmers are notoriously bad at predicting how their programs
actually perform. However, there are applications in which this
data is hard to collect.

The return value is the value of exp, which should be an integral
expression. The semantics of the built-in are that it is expected that
exp == c. For example:

          if (__builtin_expect (x, 0))
            foo ();

would indicate that we do not expect to call foo, since we expect x to be zero. Since you are limited to integral expressions for exp, you should use constructions such as

          if (__builtin_expect (ptr != NULL, 1))
            foo (*ptr);

when testing pointer or floating-point values.
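In practice __builtin_expect is often wrapped in a pair of macros so that the hint reads naturally at the call site. This is a minimal sketch of that pattern; the macro names likely and unlikely and the function sum_array are illustrative assumptions, not something GCC defines. The !! normalizes any non-zero value to 1 so that it can be compared against the expected constant.

```c
#include <stddef.h>

/* Illustrative macros (not provided by GCC itself). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical example: sum an array, treating a NULL pointer as the
   rare error path. Returns -1 on error. */
static long sum_array(const int *a, size_t n)
{
    if (unlikely(a == NULL))
        return -1;

    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}
```

The hint only affects how the compiler lays out the code; the result of the condition is unchanged, so a wrong hint costs speed rather than correctness.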

— Built-in Function: void __builtin_prefetch (const void *addr, ...)

Prefetching

This function is used to minimize cache-miss latency by moving data into
a cache before it is accessed.
You can insert calls to __builtin_prefetch into code for which
you know addresses of data in memory that is likely to be accessed soon.
If the target supports them, data prefetch instructions will be generated.
If the prefetch is done early enough before the access then the data will
be in the cache by the time it is accessed.

The value of addr is the address of the memory to prefetch.
There are two optional arguments, rw and locality.
The value of rw is a compile-time constant one or zero; one
means that the prefetch is preparing for a write to the memory address
and zero, the default, means that the prefetch is preparing for a read.
The value locality must be a compile-time constant integer between
zero and three. A value of zero means that the data has no temporal
locality, so it need not be left in the cache after the access. A value
of three means that the data has a high degree of temporal locality and
should be left in all levels of cache possible. Values of one and two
mean, respectively, a low or moderate degree of temporal locality. The
default is three.

          for (i = 0; i < n; i++)
            {
              a[i] = a[i] + b[i];
              __builtin_prefetch (&a[i+j], 1, 1);
              __builtin_prefetch (&b[i+j], 0, 1);
              /* ... */
            }

Data prefetch does not generate faults if addr is invalid, but the address expression itself must be valid. For example, a prefetch of p->next will not fault if p->next is not a valid address, but evaluation will fault if p is not a valid address.

If the target does not support data prefetch, the address expression is evaluated if it includes side effects but no other code is generated and GCC does not issue a warning.
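The loop from the manual can be turned into a complete function along these lines. This is only a sketch: the name add_arrays and the constant PREFETCH_DISTANCE are assumptions, and a useful lookahead distance depends on the target and has to be measured. The bounds check keeps the address expression inside the arrays, in line with the requirement that the expression itself be valid.

```c
#include <stddef.h>

/* Hypothetical lookahead distance, in elements; tune per target. */
#define PREFETCH_DISTANCE 8

/* Adds b[i] to a[i] for i in [0, n), prefetching a few elements ahead. */
static void add_arrays(int *a, const int *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* rw = 1: a[] will be written; locality = 1: low reuse */
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 1, 1);
            /* rw = 0: b[] is only read */
            __builtin_prefetch(&b[i + PREFETCH_DISTANCE], 0, 1);
        }
        a[i] += b[i];
    }
}
```

Prefetching never changes the computed values, so the function behaves the same with or without the hints; only the timing may differ.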

__builtin_return_address(LEVEL)

This function returns the return address of the current function, or of one of its callers. The LEVEL argument is the number of frames to scan up the call stack. A value of 0 yields the return address of the current function, a value of 1 yields the return address of the caller of the current function, and so forth.
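As a small illustration, the function below reports the address it will return to. It is a sketch assuming GCC or Clang; the name who_called_me is hypothetical, and the noinline attribute is there so that the function keeps a return address of its own. Only level 0 is used here, since scanning further up the stack is less reliable on some targets.

```c
#include <stddef.h>

/* Hypothetical helper: returns the address in the caller at which
   execution resumes after this call. */
__attribute__((noinline))
static void *who_called_me(void)
{
    return __builtin_return_address(0);
}
```

The returned pointer can be printed with printf("%p", ...) and mapped back to a source line with a debugger or a tool such as addr2line.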

__builtin_alloca (https://linux.die.net/man/3/alloca)

alloca - allocate memory that is automatically freed

Synopsis

#include <alloca.h>

void *alloca(size_t size);

Description

The alloca() function allocates size bytes of space in the stack frame of the caller. This temporary space is automatically freed when the function that called alloca() returns to its caller.

Return Value

 

The alloca() function returns a pointer to the beginning of the allocated space. If the allocation causes stack overflow, program behavior is undefined.

Conforming to

This function is not in POSIX.1-2001.

There is evidence that the alloca() function appeared in 32V, PWB, PWB.2, 3BSD, and 4BSD. There is a man page for it in 4.3BSD. Linux uses the GNU version.

Notes

The alloca() function is machine- and compiler-dependent. For certain applications, its use can improve efficiency compared to the use of malloc(3) plus free(3). In certain cases, it can also simplify memory deallocation in applications that use longjmp(3) or siglongjmp(3). Otherwise, its use is discouraged.

Because the space allocated by alloca() is allocated within the stack frame, that space is automatically freed if the function return is jumped over by a call to longjmp(3) or siglongjmp(3).

Do not attempt to free(3) space allocated by alloca()!

Notes on the GNU version

Normally, gcc(1) translates calls to alloca() with inlined code. This is not done when either the -ansi, -std=c89, -std=c99, or the -fno-builtin option is given (and the header <alloca.h> is not included). But beware! By default the glibc version of <stdlib.h> includes <alloca.h> and that contains the line:

#define alloca(size)   __builtin_alloca (size)

with messy consequences if one has a private version of this function.

The fact that the code is inlined means that it is impossible to take the address of this function, or to change its behavior by linking with a different library.

The inlined code often consists of a single instruction adjusting the stack pointer, and does not check for stack overflow. Thus, there is no NULL error return.
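The notes above suggest keeping alloca() requests small and never freeing the result. A minimal sketch of that usage follows, assuming a glibc-style <alloca.h>; the function is_palindrome and the limit MAX_STACK_COPY are hypothetical names chosen for this example.

```c
#include <alloca.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cap on stack allocations: alloca() cannot report failure,
   so large requests are rejected up front. */
#define MAX_STACK_COPY 1024

/* Returns 1 if s reads the same in both directions, 0 if not,
   and -1 if s is too long for a stack copy. */
static int is_palindrome(const char *s)
{
    size_t len = strlen(s);
    if (len > MAX_STACK_COPY)
        return -1;

    /* Temporary buffer on the stack; released automatically when
       this function returns, so there is no free() call. */
    char *rev = alloca(len + 1);
    for (size_t i = 0; i < len; i++)
        rev[i] = s[len - 1 - i];
    rev[len] = '\0';

    return strcmp(s, rev) == 0;
}
```

Because the buffer lives in this function's stack frame, a pointer to it must not be returned or stored anywhere that outlives the call.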

 
