C How to check if an address is available for access?

Colleagues, I've been wanting to ask this question for many years.

If I access an address that is not part of an allocated memory area, for example by running past the end of an allocated block, all sorts of nightmares begin: the program dumps core, stops working, and so on.

Is it possible, before accessing a certain pointer, to find out whether its value is associated with an allocated memory area? For example, with a function like this:
    int isalloc(void *ptr);

I appreciate the answers,
Ogogon.
 
Don't even try. C can't give you that, it's your responsibility to keep track. The malloc(3) function doesn't map directly to operating system calls. If you use it to allocate a few bytes, either there's already room left in your mapped address space, or it would trigger a call to mmap(2) (which is a syscall), mapping you at least one entire page. Consequently, you could access any address in that page, which is just "undefined behavior" as far as C is concerned, and the OS won't have any objections because the page is mapped.
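A minimal sketch of what "keeping track yourself" could look like. The names tracked_malloc, tracked_free and the bookkeeping list are hypothetical, not part of any library; isalloc is just the name from the question. It only knows about blocks handed out by the wrapper, so it says nothing about the stack, globals, or plain malloc(3) calls.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: one record per live allocation. */
struct alloc_rec {
    void *base;
    size_t size;
    struct alloc_rec *next;
};

static struct alloc_rec *live_allocs = NULL;

/* malloc() wrapper that remembers the block it hands out. */
void *tracked_malloc(size_t size)
{
    struct alloc_rec *rec = malloc(sizeof *rec);
    if (rec == NULL)
        return NULL;
    rec->base = malloc(size);
    if (rec->base == NULL) {
        free(rec);
        return NULL;
    }
    rec->size = size;
    rec->next = live_allocs;
    live_allocs = rec;
    return rec->base;
}

/* free() wrapper that forgets the block; unknown pointers are ignored. */
void tracked_free(void *ptr)
{
    for (struct alloc_rec **pp = &live_allocs; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->base == ptr) {
            struct alloc_rec *dead = *pp;
            *pp = dead->next;
            free(dead->base);
            free(dead);
            return;
        }
    }
}

/* Returns 1 if ptr lies inside a live block from tracked_malloc(), else 0. */
int isalloc(const void *ptr)
{
    uintptr_t p = (uintptr_t)ptr;
    for (const struct alloc_rec *rec = live_allocs; rec != NULL; rec = rec->next) {
        uintptr_t base = (uintptr_t)rec->base;
        if (p >= base && p - base < rec->size)
            return 1;
    }
    return 0;
}
```

The linear list keeps the sketch short; with many allocations a tree or hash keyed on the base address would probably be a better fit.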
 
How to check if an address is available for access?
The standard technique is called "tombstoning".

When you free memory, you keep some sort of metadata that records the memory's state and can be checked before the next access. Classic Mac OS used this for its memory management strategy.
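One way to read the tombstone idea, as a rough sketch: every reference goes through a small intermediate cell, and freeing the object clears the cell instead of leaving dangling pointers. The names ts_alloc, ts_free and ts_deref are made up for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* A tombstone sits between every reference and the real object. */
struct tombstone {
    void *object;   /* NULL once the object has been freed */
};

/* Allocate an object of `size` bytes (size > 0) and return its tombstone. */
struct tombstone *ts_alloc(size_t size)
{
    struct tombstone *ts = malloc(sizeof *ts);
    if (ts == NULL)
        return NULL;
    ts->object = calloc(1, size);
    if (ts->object == NULL) {
        free(ts);
        return NULL;
    }
    return ts;
}

/* Free the object but keep the tombstone, so stale references can tell. */
void ts_free(struct tombstone *ts)
{
    free(ts->object);   /* free(NULL) is a no-op, so a second call is harmless */
    ts->object = NULL;
}

/* Returns the object, or NULL if it has already been freed. */
void *ts_deref(const struct tombstone *ts)
{
    return ts->object;
}
```

In this sketch the tombstone itself is never reclaimed by ts_free, which is the usual cost of the technique; a real implementation would need a policy (reference counts, a sweep) for retiring tombstones.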

Another approach could be to mprotect(2) the freed data. But there are some limitations with this too.
 
You need to use Valgrind. Maybe sanitizers can do this as well.

You have two choices. The easiest requires no code instrumentation, just gdb.

  1. Open a terminal and run 'valgrind --vgdb-error=0 {your exe}'. Valgrind will print instructions for attaching gdb using vgdb.
  2. Open a second terminal and run gdb and run the command printed in the first terminal.
  3. Set a break point and continue to it.
  4. If you get a gdb message about your auto load safe path do what it says - modify your ~/.gdbinit adding add-auto-load-safe-path. Kill the gdb inferior and redo steps 1 to 3.
  5. Now you can use the "mc" (memcheck) command. The subcommands that you are likely to need are
    check_memory addressable addr [len]
    check_memory defined addr [len]
    There is also "examine bits"
    xb addr [len]
Alternatively you can do the same sort of thing using annotation.
  • Include <valgrind/memcheck.h> in your source
  • Use VALGRIND_CHECK_MEM_IS_ADDRESSABLE(addr, len) to check addressability
  • Use VALGRIND_CHECK_MEM_IS_DEFINED(addr, len) to check definedness
There's no easy way to have the same functionality as the gdb "xb" command with annotation. You will need to use VALGRIND_GET_VBITS(addr, vbits_array, nbytes) which also means you need to create the vbits array somewhere. See this article for an example of using the VBITS annotation https://accu.org/journals/overload/20/110/floyd_1905/
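A small sketch of the annotation route. The wrapper name range_is_addressable is made up; the macros are the Memcheck client requests named above. As far as I know the check macros return 0 when the range is fine and always return 0 when the program is not running under Valgrind, so the wrapper reports "don't know" in that case. The fallback stubs are only there so the file still builds on a machine without the Valgrind headers.

```c
#include <stddef.h>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_MEMCHECK 1
#  endif
#endif

#ifndef HAVE_MEMCHECK
/* Assumption: no Valgrind headers installed, so stub the requests out. */
#  define RUNNING_ON_VALGRIND 0
#  define VALGRIND_CHECK_MEM_IS_ADDRESSABLE(addr, len) ((void)(addr), (void)(len), 0)
#endif

/*
 * Returns 1 if Memcheck considers [addr, addr+len) addressable,
 * 0 if it does not (Memcheck also prints an error report),
 * -1 if the program is not running under Valgrind and cannot tell.
 */
int range_is_addressable(const void *addr, size_t len)
{
    if (!RUNNING_ON_VALGRIND)
        return -1;
    return VALGRIND_CHECK_MEM_IS_ADDRESSABLE(addr, len) == 0;
}
```

This only answers the question while the program runs under Memcheck, so it suits test builds rather than production logic.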
 
Why and when do you need this?
The observer pattern / C++'s std::weak_ptr<T> are an example of why this might be useful.

For example, imagine a game in which one enemy is targeting another. If the second enemy gets killed (e.g. by falling off a cliff), the first enemy's target variable automatically gets set to NULL as a flag that it should target something else, rather than risk accessing invalid memory (or having to handle a "target_has_died" event manually).

Whether it gets set to NULL or some kind of mechanism passively returns NULL during the check, the result is similar.
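A rough C sketch of the "automatically set to NULL" variant, loosely in the spirit of the observer pattern. All names here (struct enemy, enemy_set_target, the fixed-size watcher list) are hypothetical: each enemy remembers which target fields point at it and clears them when it is destroyed.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_WATCHERS 8   /* arbitrary limit for the sketch */

struct enemy {
    struct enemy *target;                   /* whom this enemy is attacking */
    struct enemy **watchers[MAX_WATCHERS];  /* target fields that point at us */
    size_t nwatchers;
};

struct enemy *enemy_new(void)
{
    struct enemy *e = malloc(sizeof *e);
    if (e != NULL) {
        e->target = NULL;
        e->nwatchers = 0;
    }
    return e;
}

/* Remove one registered back-reference from victim's watcher list. */
static void unwatch(struct enemy *victim, struct enemy **slot)
{
    for (size_t i = 0; i < victim->nwatchers; i++) {
        if (victim->watchers[i] == slot) {
            victim->watchers[i] = victim->watchers[--victim->nwatchers];
            return;
        }
    }
}

/* Point self->target at victim (or NULL). Returns 0, or -1 if victim's list is full. */
int enemy_set_target(struct enemy *self, struct enemy *victim)
{
    if (victim != NULL && victim->nwatchers == MAX_WATCHERS)
        return -1;
    if (self->target != NULL)
        unwatch(self->target, &self->target);
    self->target = victim;
    if (victim != NULL)
        victim->watchers[victim->nwatchers++] = &self->target;
    return 0;
}

/* Destroy e and reset every target pointer that referenced it to NULL. */
void enemy_destroy(struct enemy *e)
{
    if (e->target != NULL)
        unwatch(e->target, &e->target);
    for (size_t i = 0; i < e->nwatchers; i++)
        *e->watchers[i] = NULL;
    free(e);
}
```

The attacker then only needs the NULL check it would have had anyway for "no target acquired yet".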
 
For example, imagine a game in which one enemy is targeting another. If the second enemy gets killed (e.g. by falling off a cliff), the first enemy's target variable automatically gets set to NULL as a flag that it should target something else, rather than risk accessing invalid memory (or having to handle a "target_has_died" event manually).
If I implemented this game, I would keep the killed enemy's object alive, marked with a killed/dead flag, so the pointer stays valid. The enemy's corpse exists for at least a few seconds anyway.
 
This is actually quite tricky: once the SIGSEGV handler returns, the faulting instruction is executed again, so it has to be skipped somehow. That is quite complex on x64. Perhaps long-jumps could help.

You can just establish a mapping under the failing pointer in the signal handler :D
 
Some options:
  • Walk /proc/<my-pid>/map to find your address (or not)
  • Catch the SIGSEGV signal
  • Fork and see whether the child segfaults
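The fork option from the list above could look roughly like this (child_can_read_byte is a hypothetical name). The child shares a snapshot of the parent's address space, tries the read, and the parent looks at how it died. It is slow, and it has the same caveat as the other probes: it detects mapped pages, not valid objects.

```c
#include <errno.h>
#include <signal.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if a forked child could read one byte at addr,
 * 0 if the child did not exit cleanly (typically killed by SIGSEGV/SIGBUS),
 * -1 if fork() or waitpid() failed.
 */
int child_can_read_byte(const void *addr)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: no core file, default signal actions, then try the read. */
        struct rlimit no_core = { 0, 0 };
        setrlimit(RLIMIT_CORE, &no_core);
        signal(SIGSEGV, SIG_DFL);
        signal(SIGBUS, SIG_DFL);
        unsigned char c = *(const volatile unsigned char *)addr;  /* may fault */
        (void)c;
        _exit(0);
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```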

OP isn't clear whether the need is only for accessible memory or if it is also for valid memory.

Memory that has been allocated then deallocated is addressable but not valid. Similarly memory between the bottom of the stack and the stack guard page is accessible but not valid.

You won't get a segfault accessing invalid memory.
 
If I access an address that is not part of an allocated memory area, for example by running past the end of an allocated block, all sorts of nightmares begin: the program dumps core, stops working, and so on.
If you access an address that is outside of your process container, then you will get a segmentation fault; you aren't supposed to BE (or look!) there. The OS is protecting you because your code is obviously confused (what did it EXPECT to be there?). Like walking into an ice cream parlor and wondering why there aren't any SHOES for sale! ("Sorry, the shoe store is two doors down...")

If you access an address IN your process container that doesn't contain anything specific, then the results are indeterminate. Like opening a shoebox in a shoe store and not knowing (ahead of time) if it even has SHOES in it!

This is typically only possible in languages where you have raw access to pointers. They can be manipulated to refer to locations that don't exist or that don't "contain" anything. More modern languages force you to refer to specific constructs in the language (variables, points of control, etc.).

The process container's default size (and bits thereof) can be altered as a configuration parameter. But, it is already probably larger than you need (in most cases).

The contents of a "random" memory location within it can be altered by PUTTING something in it.
 
If I implemented this game, I would keep the killed enemy's object alive, marked with a killed/dead flag, so the pointer stays valid. The enemy's corpse exists for at least a few seconds anyway.
There are many solutions around the issue (including unique IDs).

A slight flaw I foresee in your approach (admittedly I am exaggerating the issue a little bit): when do you know it is safe to reuse the data pointed to by the target variable?
What might happen is that the memory gets reused for an enemy on the same team as the one still targeting the original memory location. You would then end up with the weird situation of an enemy attacking its own teammate. In short, you can't rely on the other enemy having seen the killed "flag" within those few arbitrary seconds of the death animation. There might have been a lag spike (delta time cranked up to a massive value), or the other enemy might have been "frozen", with its usual onTick events not being processed for the duration.

Yes, you can add lots of ad-hoc "validity" checks, but frankly it is much cleaner if the pointer can simply be reset to NULL (you would be checking for NULL anyway since the enemy might not have yet acquired a target in the first place).
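For the unique-ID route mentioned at the top of this post, a sketch with generation counters is one common shape. Everything here (the pool, enemy_id, enemy_get) is hypothetical. The handle records which generation of a slot it was issued for, so a reused slot no longer matches and the lookup passively returns NULL.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ENEMIES 64   /* arbitrary pool size for the sketch */

struct enemy {
    int hp;
};

/* A handle names a pool slot plus the generation it was issued for. */
struct enemy_id {
    uint32_t slot;
    uint32_t generation;
};

static struct {
    struct enemy data;
    uint32_t generation;   /* bumped every time the slot is released */
    int in_use;
} pool[MAX_ENEMIES];

/* Sentinel handle: its slot is out of range, so it never resolves. */
static const struct enemy_id NO_ENEMY = { UINT32_MAX, 0 };

/* Take the first free slot; returns NO_ENEMY if the pool is full. */
struct enemy_id enemy_spawn(int hp)
{
    for (uint32_t i = 0; i < MAX_ENEMIES; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            pool[i].data.hp = hp;
            return (struct enemy_id){ i, pool[i].generation };
        }
    }
    return NO_ENEMY;
}

/* Returns the enemy, or NULL if the handle is stale or invalid. */
struct enemy *enemy_get(struct enemy_id id)
{
    if (id.slot >= MAX_ENEMIES || !pool[id.slot].in_use ||
        pool[id.slot].generation != id.generation)
        return NULL;
    return &pool[id.slot].data;
}

/* Release the slot; every handle issued for it becomes stale. */
void enemy_kill(struct enemy_id id)
{
    if (enemy_get(id) != NULL) {
        pool[id.slot].in_use = 0;
        pool[id.slot].generation++;
    }
}
```

With this layout a teammate spawned into the same slot gets a new generation, so the old attacker's handle resolves to NULL instead of to the teammate.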
 
I see no problems with this carefully laid out plan.
One does not just walk into Mordor ...

Anecdote: Many decades ago, I used the Watcom 32-bit environment on Windows 3.1. There, you could READ memory in many places, including the NULL pointer and similar small addresses. You could even write to the NULL pointer, because the x86 architecture stores the interrupt vector table at address zero, and the first few interrupts are actually not used frequently. I was working on a half-million-line system in C, and many pieces of it used NULL pointers by mistake. For individual variables and small structs, writing and reading them worked great. The acid test was to run the DOS "print" command to a networked printer (Novell Network) after running our program: it used the most interrupts, and if the print command hung, you had messed up by writing to NULL and had to reboot. Ah, the bad old days.
 
You can just establish a mapping under the failing pointer in the signal handler :D

You can also
  • map some memory WRITE and EXEC
  • copy some opcodes into the mapped memory (possibly based on the code around the faulting opcode) that will allow you to recover
  • modify the context in the signal handler to return to your patched code
  • let the signal handler finish
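To do this you need to ensure that the sysctl kern.elf64.allow_wx (or kern.elf32.allow_wx for 32-bit binaries) permits mappings that are both writable and executable, i.e. that W^X enforcement is off.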

That's more useful with SIGFPE from x87 where the x87 stack has more state than just the exception flags. It isn't enough to just clear the flags and resume execution.
 