Skip to content

UB21⚓︎

Author: Yikai Cui.

Definition⚓︎

An lvalue having array type is converted to a pointer to the initial element of the array, and the array object has register storage class (6.3.2.1)

当一个具有数组类型的左值被转换成了指向数组首个元素的指针,而该数组具有register存储类型时。

Description⚓︎

Except when it is the operand of the sizeof operator, or typeof operators, or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined. (from ISO/IEC 9899:2023(E) 6.3.2.1, 3)

一个类型为“某个类的数组”的表达式会被转换成类型为“指向这个类”的指针的表达式,具体指向数组的首个对象,并且不是左值,除非作为sizeof操作符、typeof操作、单元运算符&的操作数,或者一个字符串常量被用于初始化数组。如果数组对象具有register存储类型,那么行为是未定义的。

As stated in Undefined Behavior # 20, the address of any part of a register object cannot be computed. By converting an array expression to a pointer expression, the address of the first element of the array will taken during the evaluation of the converted expression. Thus we cannot determine the behavior of such conversion of the register array expression.

第20条未定义行为中有介绍:一个具有“register”存储类型的对象的任何一部分的地址都不能被以直接或间接方式计算。通过上述的这种从数组类型到指针类型的转换,将会在接下来的“evaluation”过程中计算数组首个元素的地址。因此无法确定这种操作的行为。

Code⚓︎

UB21.1.c
#include<stdio.h>

int main() {
    register int arr[10] = {0};
    printf("%d", arr);      // undefined behavior! (1)
}
  1. Undefined behavior! Here arr is an array of int, which will be converted to a pointer to int, of whom the evaluation would cause an undefined behavior.

View source


UB21.2.c
#include<stdio.h>

int main() {
    register int arr[3] = {0};
    printf("%d", arr[2]);   // undefined behavior! (1)
}
  1. Undefined behavior! Here arr[2] is seen as * (arr + 2) (see array subscripting (6.5.2.1)), which also includes array-to-pointer conversion.

View source

Configurations⚓︎

OS: Microsoft Windows 11 22H2

gcc -v : gcc version 11.2.0 (GCC), x86_64-w64-mingw32

compile and run commands: gcc UB21.1.c -o UB21.1.exe && ./UB21.1.exe

OS: Microsoft Windows 11 22H2

cl.exe Version Code: 19.35.32215, Arcitecture: amd64, Version: 8.1 or Windows 10 SDK version

compile and run commands: cl.exe /FeUB21.1 UB21.1.c && ./UB21.1.exe

OS: Ubuntu 20.04 x86_64 GNU/Linux

clang -v : 17.0.0, Target: x86_64-pc-linux-gnu

compile and run commands: clang UB21.1.c -o UB21.1.exe && ./UB21.1.exe

OS and gcc -v: same as UB21.1.c MinGW

compile and run commands: gcc UB21.2.c -o UB21.2.exe && ./UB21.2.exe

OS and cl.exe version: same as UB21.1.c MSVC

compile and run commands: cl.exe /FeUB21.2 UB21.2.c && ./UB21.2.exe

OS and clang -v: same as UB21.1.c Clang

compile and run commands: clang UB21.2.c -o UB21.2.exe && ./UB21.2.exe

Behaviors⚓︎

  • Compilation unsuccessful
  • Compiler messages: "error: address of register variable 'arr' requested"
  • Compilation unsuccessful
  • Compiler messages: "error C2103: '&' on register variable"
  • Compilation unsuccessful
  • Compiler messages: "error: address of register variable requested"
  • Compilation successful
  • Output: 0
  • Note: If we take a look at the assembly script the compiler generates, we find out that the array arr is actually in the registers. MinGW takes a different approach to translate arr[2], directly reading from the compiler instead of array subscripting.
  • Compilation unsuccessful
  • Compiler messages: "error C2103: '&' on register variable"
  • Compilation unsuccessful
  • Compiler messages: "error: address of register variable requested"

Advice⚓︎

As we see in the behaviors part, the use of an element of a register array will cause compatibility issue (e.g. UB21.2.c, which uses arr[1], can be compiled in MinGW, while fails with MSCV and Clang). And given that array subscripting is one of the most frequently used operation on an array. Therefore programmers should avoid using register arrays when coding.

In author's opinion, the register array is actually quite awkward. If a register array is too long, compiler cannot put it into register file for faster accessing latency. If a register array is short enough to fit into register file, then using a bunch of register variables will be a better solution, avoiding array subscripting problem.