UB43⚓︎
Author: Yikai Cui.
Definition⚓︎
Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that does not point into, or just beyond, the same array object (6.5.6).
一个指向数组对象,或刚好超过数组末尾的指针加减一个整数,得到的结果不指向同一个数组对象或者数组对象的边界。
Description⚓︎
When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i + n-th and i − n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If the pointer operand and the result do not point to elements of the same array object or one past the last element of the array object, the behavior is undefined. If the addition or subtraction produces an overflow, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
当一个具有整数类型的表达式与一个指针相加或相减时,结果具有指针类型。如果指针指向一个数组对象的一个元素,并且数组足够大,那么结果指向一个相对于原始元素的元素,使得结果元素的下标与原始元素的下标之差等于该整数表达式。换句话说,如果表达式 P 指向数组对象的第 i 个元素,表达式 (P)+N (等价于 N+(P)) 和 (P)-N (其中 N 的值为 n) 分别指向数组对象的第 i + n 个和第 i − n 个元素,如果它们存在的话。此外,如果表达式 P 指向数组对象的最后一个元素,表达式 (P)+1 指向数组对象的最后一个元素之后的一个元素,如果表达式 Q 指向数组对象的最后一个元素之后的一个元素,表达式 (Q)-1 指向数组对象的最后一个元素。如果指针操作数和结果不指向同一个数组对象或数组对象的最后一个元素之后的一个元素,行为是未定义的。如果加法或减法产生溢出,行为是未定义的。如果结果指向数组对象的最后一个元素之后的一个元素,它不应该作为一个一元 * 运算符的操作数。
Code⚓︎
#include<stdio.h>
int main() {
char arr[10] = {0};
char *p1 = arr;
char *p2 = p1 + 0xffffff00; // Undefined Behavior! (1)
printf("sizeof pointer in this environment is %d\n", sizeof(p1));
printf("p1 = %016llx or %08x or %x\n", p1, p1);
printf("p2 = %016llx or %08x or %x\n", p2, p2);
if (p2 > p1) {
printf("p2 > p1");
} else if (p2 < p1) {
printf("p2 < p1");
} else if (p1 == p2) {
printf("p1 == p2");
}
}
- Undefined Behaviour! Here, p2 does not point to one past the final element of the array
arr[10]
This code snippet aims to demonstrate the undefined behavior of UB43. The code snippet is compiled and run on two different platforms (with different architectures), MSVC x86 and MinGW x86_64.
Configurations⚓︎
OS: Microsoft Windows 11 22H2
cl.exe Version Code: 19.35.32215
, Arcitecture: x86
, Version: 8.1 or Windows 10 SDK version
compile and run commands: cl.exe /FeUB43 UB43.c && ./UB43.exe
OS: Microsoft Windows 11 22H2
gcc -v
: gcc version 11.2.0 (GCC)
, x86_64-w64-mingw32
compile and run commands: gcc UB43.c -o UB43.exe && ./UB43.exe
Behaviors⚓︎
- Compilation successful
- Output
actually this is quite strange why p2 < p1 is printed.
Advice⚓︎
Programmers should be careful about the platform the code runs on. As is demonstrated in the behaviors part, Some code that seems to be functioning well in one platform might not be functional on another.