Detecting Stack Overflows on an Embedded System

Stack overflows can be some of the most difficult bugs to squash. An overflow is often hidden in a cryptic, non-static error state. Sometimes the application runs without issue. Sometimes it may hard fault and reset the hardware without warning. It may be overwriting data unbeknownst to the application designer. The worst error state is the dreaded Heisenbug: an error that seems to vanish when you go looking for the cause.

The GNU Compiler Collection (GCC) has a few features to help detect stack overflows. But before we get into that, it’s time to review how a typical stack works. Stack layout is implementation-specific, so the typical stack described here may not match the one on your system.

Stack Basics

A stack can be thought of as memory space an application is free to use. Along with the heap, the stack lives in the RAM of your device. Given a block of RAM, the heap is typically found at one end, growing toward the middle as the application requests heap space. The stack is found at the opposite end, growing back toward the heap. On most architectures that means the heap grows toward higher addresses and the stack grows toward lower ones. In a multi-threaded environment, there may be several tasks, each with its own stack, sharing that end of the memory space. If your application runs out of stack space, or a task runs out of its allotted stack space, a stack overflow occurs.
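
Which way the stack grows depends on the architecture and ABI. One quick way to check a given target is to compare the address of a local variable in a caller with one in its callee. The sketch below does that. The function names are made up for illustration, and the noinline attribute is a GCC extension used here so the two frames stay separate.

```c
#include <stdint.h>

/* Hypothetical helper: runs one call level deeper than its caller. */
__attribute__((noinline))
static int callee_is_lower(volatile char *caller_local)
{
    volatile char callee_local = 0;

    /* Compare as integers; relational operators on unrelated pointers
     * are not defined by the C standard. */
    return (uintptr_t)&callee_local < (uintptr_t)caller_local;
}

/* Returns 1 if deeper frames sit at lower addresses (the stack grows
 * down on this target), otherwise 0. */
int stack_grows_down(void)
{
    volatile char caller_local = 0;

    return callee_is_lower(&caller_local);
}
```

Treat the result as a hint rather than a guarantee: the compiler is free to lay out frames as it sees fit.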

A Stack Overflow Occurred. Why?

There are several ways a stack overflow can occur. The basic scenario is that the stack runs out of space. The most common example is declaring a local array that is larger than the stack: char inputBuffer[5][10000]; may not be the best idea on a small embedded system. Another failure case is recursion. Every time a function is entered, a stack frame is created and placed on the stack. Recursively enter a function too many times, and the stack frames may fill the task stack to the point of overflow. An easily missed cause of stack overflow is an interrupt. On many systems, the interrupt’s saved context and the handler’s local variables are placed on the current task’s stack. Even if the stack is sized to fit the task’s own local variables, an interrupt may arrive at the wrong moment and overflow it.
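
The cases above share the same arithmetic: the deepest call chain plus the largest interrupt frame must fit in the task's stack. A rough budget check might look like the following sketch. The structure, its field names, and the numbers in the note after it are illustrative assumptions, not measurements from any particular system.

```c
#include <stddef.h>

/* Hypothetical description of one task's worst-case stack needs. */
struct stack_budget {
    size_t stack_bytes;      /* size of the task's stack */
    size_t locals_bytes;     /* largest local-variable footprint outside recursion */
    size_t frame_bytes;      /* stack used by one level of the recursive function */
    size_t max_depth;        /* deepest recursion expected */
    size_t interrupt_bytes;  /* context saved when an interrupt lands on this stack */
};

/* Worst case: locals, plus every recursive frame, plus one interrupt. */
size_t stack_worst_case(const struct stack_budget *b)
{
    return b->locals_bytes + b->frame_bytes * b->max_depth + b->interrupt_bytes;
}

/* Returns 1 if the worst case fits in the task's stack, otherwise 0. */
int stack_budget_fits(const struct stack_budget *b)
{
    return stack_worst_case(b) <= b->stack_bytes;
}
```

With a 2048-byte stack, 512 bytes of locals, 48-byte frames recursing 20 deep, and a 128-byte interrupt frame, the worst case is 512 + 960 + 128 = 1600 bytes, which fits. The 50,000-byte inputBuffer from the example above would not.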

So, you said GCC can help?

GCC provides a few compiler flags to help diagnose and identify stack problems.

-fstack-usage

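A great first step in heading off stack overflows is to use -fstack-usage. This compiler flag reports the stack usage of each function, not counting the functions it calls. The information is generated at compile time and written to .su files, one per compiled source file. Each line gives the function’s location and name, its stack usage in bytes, and a qualifier: static means the frame size is fixed, dynamic means the function also adjusts the stack at run time, and bounded means that adjustment has a known upper limit. Functions with a high stack requirement should be examined to determine whether they really need that much.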

ip.cpp:190:6:BOOL IsBroadCast(IPADDR4, int)           8 dynamic,bounded
ip.cpp:130:5:int GetMultiHomeInterface4(IPADDR4, int) 40 dynamic,bounded
ip.cpp:702:6:void DoIPPacket(PoolPtr, PEFRAME, WORD)  32 dynamic,bounded
ip.cpp:489:7:BYTE* GetData(PIPPKT)                    4 static
ip.cpp:515:6:void FixHeaderAndSend(PoolPtr, PIPPKT)   56 dynamic,bounded
ip.cpp:858:6:void KillStack()                         0 static
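
Scanning .su files by eye gets tedious on a large project. As a minimal sketch, the helper below pulls the byte count out of one line in the layout shown above so a script or build step can flag heavy functions. The function names are hypothetical, and the parser assumes the size and qualifier are the last two whitespace-separated fields on the line.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Extract the byte count from one line of a .su file.
 * Assumed layout: location and function name, then the size in bytes,
 * then a qualifier such as "static" or "dynamic,bounded", separated by
 * spaces or tabs.
 * Returns 0 and stores the size in *bytes, or -1 if the line does not match.
 */
int su_line_bytes(const char *line, unsigned long *bytes)
{
    size_t end = strlen(line);

    /* Skip trailing whitespace, including a newline. */
    while (end > 0 && isspace((unsigned char)line[end - 1]))
        end--;
    /* Skip the qualifier field. */
    while (end > 0 && !isspace((unsigned char)line[end - 1]))
        end--;
    /* Skip the gap between the size and the qualifier. */
    while (end > 0 && isspace((unsigned char)line[end - 1]))
        end--;

    /* Walk back over the digits of the size field. */
    size_t start = end;
    while (start > 0 && isdigit((unsigned char)line[start - 1]))
        start--;

    if (start == end)
        return -1;  /* no digits where the size should be */
    if (start == 0 || !isspace((unsigned char)line[start - 1]))
        return -1;  /* the size must be a field of its own */

    *bytes = strtoul(line + start, NULL, 10);
    return 0;
}

/* Returns 1 if the function on this line uses more than limit bytes. */
int su_line_exceeds(const char *line, unsigned long limit)
{
    unsigned long bytes;

    return su_line_bytes(line, &bytes) == 0 && bytes > limit;
}
```

Called with a limit of 32 on the sample above, su_line_exceeds would flag GetMultiHomeInterface4 and FixHeaderAndSend and pass the rest.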

-fstack-check

A method of detecting stack overflows is to create a canary space at the end of each task’s stack. This space is filled with known data. If the data is ever modified, the application has written past the end of the stack. A common problem with this approach is that a large, untouched array followed by a second array can carry the stack pointer right over the canary without writing to it. Any data written to the second array then corrupts memory outside of the stack. -fstack-check helps prevent this. The flag forces the compiler to write “0” every 2^N bytes when stack space is allocated, such as when declaring an array, where N is chosen when GCC itself is built. The default of 4096-byte intervals is suitable for many systems, but I needed to lower this to 256-byte intervals for smaller embedded stacks. Once the flag is enabled, the application needs to create a canary space of 2^N bytes. This closes the jump-over flaw described above, because at least one of the writes must land in the canary. Protecting the canary space can be accomplished in a number of ways.
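
Setting up the canary is mostly a matter of sizing it to the probe interval and filling it with a known pattern. The following is a minimal sketch under a few assumptions: the stack grows toward lower addresses, so the canary occupies the lowest 2^N bytes of the task's stack, and the fill byte and function names are arbitrary choices made for this example.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CANARY_FILL 0xA5u  /* assumed fill pattern; any known nonzero byte works */

/* Canary size for a probe interval of 2^n bytes; 0 if n is out of range. */
size_t canary_bytes(unsigned n)
{
    return n < sizeof(size_t) * CHAR_BIT ? (size_t)1 << n : 0;
}

/*
 * Fill the canary at the overflow end of a task stack.
 * stack_lowest is the lowest address of the stack memory.
 * Returns 0 on success, -1 if the canary would not leave any usable stack.
 */
int canary_init(uint8_t *stack_lowest, size_t stack_bytes, unsigned n)
{
    size_t len = canary_bytes(n);

    if (len == 0 || len >= stack_bytes)
        return -1;

    memset(stack_lowest, CANARY_FILL, len);
    return 0;
}
```

With 256-byte intervals (N = 8), a 1024-byte task stack gives up its lowest 256 bytes to the canary and keeps 768 bytes for real use.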

Canary Protection with CPU Watchpoints with -fstack-check

One of the fastest ways to protect the canary is with CPU watchpoints. Many embedded processors include a way to watch a single address or a range of addresses for read/write access. Enable the watchpoint and watch the canary space for write access. When too large an array is declared on the stack, the “0” write lands in the canary space and the processor takes an interrupt. Use this interrupt to display pertinent information, such as the program counter, to find the line where the stack overflow occurred. Because the interrupt fires at the moment of the write, it can quickly zero in on stack overflows.

What happens when the application is running in a multi-threaded environment? The CPU watchpoint must be moved to the canary of whichever task is running. Add code to the scheduler so that the watchpoint is updated whenever a task switch occurs.
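
The scheduler hook can be very small. The sketch below shows one possible shape. Everything in it is hypothetical: the task record, the hook name, and the watch_arm_fn callback, which stands in for whatever register writes your processor needs to arm a write watchpoint. A recording stub is included so the logic can be exercised on a host machine.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-task record kept by the scheduler. */
struct task_info {
    const char *name;
    uintptr_t   canary_start;  /* lowest address of the task's canary space */
    size_t      canary_bytes;  /* 2^N, matching the probe interval */
};

/* Hypothetical driver hook: arm a write watchpoint on [start, start + len).
 * A real port would program the CPU's debug registers here. */
typedef void (*watch_arm_fn)(uintptr_t start, size_t len);

/* Call from the scheduler just before 'next' starts running. */
void watch_on_task_switch(const struct task_info *next, watch_arm_fn arm)
{
    arm(next->canary_start, next->canary_bytes);
}

/* Stand-in for the hardware: records the last request for host testing. */
uintptr_t last_watch_start;
size_t    last_watch_len;

void watch_arm_record(uintptr_t start, size_t len)
{
    last_watch_start = start;
    last_watch_len = len;
}
```

Passing the arming function as a parameter keeps the scheduler code independent of any one processor's debug hardware.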

Canary Software Protection with -fstack-check

The application may not have access to the CPU watchpoints described above. If that is the case, then a software implementation could check the canary space at a predefined interval. Depending on how often the canary checks occur, significant overhead may be added to the application when utilizing software protection. Another potential pitfall when relying on this method is that the stack overflow may have corrupted memory before the application catches on that an overflow has occurred.
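
A software check only has to scan the canary for the first byte that no longer matches the fill pattern. Here is a minimal sketch, assuming the canary was filled with a known byte when the stack was created. The pattern value and the function name are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define CANARY_FILL 0xA5u  /* assumed fill pattern, written when the stack is created */

/* Returns the offset of the first modified canary byte, or -1 if the
 * canary is intact. Cost is one pass over len bytes per call. */
long canary_first_damage(const volatile uint8_t *canary, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (canary[i] != CANARY_FILL)
            return (long)i;
    }
    return -1;
}
```

Calling this from a timer tick or the scheduler trades detection latency against overhead: the scan touches every canary byte each time it runs, which is the cost the paragraph above warns about.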

In a multi-threaded application, the canary space could be checked whenever a task switch occurs. This allows the error state to print the culprit task, but not the program counter, as the application has likely moved past the actual stack overflow. However, the overflow can be narrowed down to a function by checking the canary space with -finstrument-functions.

-finstrument-functions allows the application to inject code when entering or exiting a function. Profiler functions that match the following declarations must be added to the application:

void __cyg_profile_func_enter(void *this_fn, void *call_site) __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site) __attribute__((no_instrument_function));

The functions are passed two parameters: the address of the function being entered or exited, and the address from which it was called. Adding a canary check here allows the application to print the address of the current function in an error state.
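
A canary check inside the entry hook might look like the sketch below. It is simplified to a single guard word, and the guard value, the active_canary pointer (which the scheduler would have to keep pointed at the running task's canary), and the overflow_suspect variable are all assumptions made for this example. In C++ sources the two hooks also need C linkage.

```c
#include <stddef.h>
#include <stdint.h>

#define CANARY_WORD 0xA5A5A5A5u  /* assumed guard pattern */

/* Hypothetical: points at a guard word in the running task's canary space.
 * The scheduler would update it on every task switch. */
volatile uint32_t *active_canary;

/* Address of the function being entered when damage was first noticed. */
void *overflow_suspect;

void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));

void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)call_site;

    /* Record only the first offender; a real handler might print and halt. */
    if (active_canary != NULL && *active_canary != CANARY_WORD &&
        overflow_suspect == NULL)
        overflow_suspect = this_fn;
}

void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
}
```

The recorded address marks the overflow as happening at or before that function's entry, and it can be mapped back to a name with the linker map file or a symbol lookup tool.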

-fstack-protector

Sometimes, stack overflows may be malicious. Accepting user input without bounds checking opens up an application to a buffer overflow attack. Using -fstack-protector can help harden the code and prevent these types of attacks.

Stack Protector works by placing a known integer on the stack between a function’s local variables and its return address. Immediately before the function returns, this location is checked to verify that the known integer is unchanged. If it has been modified, the application calls the handler function void __stack_chk_fail(void). Define this function to halt and display a relevant error message indicating that a buffer overflow has occurred.
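
A handler along these lines is one way to do that. This is a sketch, not a drop-in implementation: current_task_name and stack_fail_message are hypothetical names introduced for the example, and the use of stdio and abort() assumes a C library is available. On a bare-metal target, a reset or an infinite loop may be the better way to halt.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical: name of the running task, kept up to date by the scheduler. */
const char *current_task_name = "main";

/* Build the error text separately so it can be tested without halting. */
int stack_fail_message(char *buf, size_t len, const char *task)
{
    return snprintf(buf, len, "stack guard overwritten in task '%s'", task);
}

/* Called by compiler-generated code when the guard value has changed.
 * It must not return to the damaged function. */
void __stack_chk_fail(void)
{
    static char msg[64];  /* static, to avoid leaning on a damaged stack */

    stack_fail_message(msg, sizeof msg, current_task_name);
    fputs(msg, stderr);
    fputc('\n', stderr);
    abort();
}
```

Keeping the message buffer static means the handler adds very little to a stack that is already known to be corrupted.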
