Memory allocation

1. What's the largest array that you can allocate?

On the stack?
In the data segment?
On the heap?

2. Stack

First, some exploration in the terminal.

View maximum stack size of a process:

➜  cat /proc/17032/limits | grep stack
Max stack size            8388608              unlimited            bytes

View allocated stack size of a process:

➜  pmap 17032 | grep stack
00007ffdd41eb000    124K rw---   [ stack ]
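The soft limit shown above (8388608 bytes, i.e. 8 MiB) can also be read from inside a program. Here's a minimal sketch using getrlimit with RLIMIT_STACK; the function names stack_soft_limit and print_stack_limit are made up for this example and are not part of maxstack.c.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Returns the soft stack limit in bytes, or 0 if it could not be read.
 * An "unlimited" stack comes back as RLIM_INFINITY. */
unsigned long long stack_soft_limit(void)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_STACK, &lim) != 0)
        return 0;
    return (unsigned long long)lim.rlim_cur;
}

/* Prints the soft limit in bytes, or "unlimited". */
void print_stack_limit(void)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_STACK, &lim) != 0) {
        perror("getrlimit");
        return;
    }
    if (lim.rlim_cur == RLIM_INFINITY)
        printf("soft stack limit: unlimited\n");
    else
        printf("soft stack limit: %llu bytes\n",
               (unsigned long long)lim.rlim_cur);
}
```

Calling print_stack_limit() from main should report the same number as the "Max stack size" line for that process, assuming nothing changed the limit in between.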

Second, confirm programmatically.

One idea is to iterate up from 1 until the program fails, which I expect will be a segmentation fault. As long as we print every size we try, the last successful message in the terminal gives us our answer.

But that’s not very elegant.

It would be better if we could catch the segfault. For one, this would let us jump by values greater than 1 and then when we segfault we could reduce the value that we jumped by and try again. This would let us zero in on exactly which value causes the segfault.

We’ll need global variables for the allocation size and the step, because we’ll be changing them inside of our signal handler. When we catch a segfault, we’ll halve the step and try again. When the step is down to 1, we’ll log the final value and exit.

initialize global variables

/* volatile: both are modified from inside the signal handler */
volatile ulong change = 1 << 20;
volatile ulong i = 1;

Used by 1
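The search itself can be tried out separately from the segfault handling. Below is a minimal sketch of the "jump, then halve the step on failure" idea, with a plain predicate standing in for "did the allocation survive?". The names largest_passing and fits_demo, and the 8000000 threshold, are hypothetical. Unlike the loop in maxstack.c, this version keeps going until the step reaches 0, so it lands on the exact boundary.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a real probe: pretend anything up to
 * 8,000,000 bytes fits. */
bool fits_demo(unsigned long size)
{
    return size <= 8000000UL;
}

/* Returns the largest value v >= start for which probe(v) is true,
 * assuming probe(start) is true and probe is monotonic (once it fails,
 * it fails for every larger value). */
unsigned long largest_passing(bool (*probe)(unsigned long),
                              unsigned long start, unsigned long step)
{
    unsigned long size = start;
    while (step > 0) {
        unsigned long next = size + step;
        /* next > size guards against unsigned wraparound */
        if (next > size && probe(next))
            size = next;        /* success: keep the larger size */
        else
            step >>= 1;         /* failure: retry with half the step */
    }
    return size;
}
```

With a step of 1 << 20, the search needs a handful of big jumps followed by roughly 20 halvings, instead of millions of single-byte increments.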

The core of this file will be a function that tries to allocate some bytes on the stack and a main function that iterates up from 1 by the value of change. Each iteration, it will call the function to try to allocate some bytes. If the function segfaults, reset i to its previous value and reduce the size of the change by half and then try again. When the size of the change reaches 1, we’re done.

/maxstack.c

@{maxstack.c includes}
@{initialize global variables}
@{define segfault signal handler}

@{dummy `tryallocate` function that just allocates bytes on the stack}

int main()
{
    @{setup segfault signal handler}
    while (change != 1) {
        tryallocate();
        i += change;
        printf("Current allocation: %lu\n", i);
    }
    printf("Max allocation: %lu\n", i);
}

Our tryallocate function:

dummy `tryallocate` function that just allocates bytes on the stack

void tryallocate() {
    char buf[i];
    buf[0] = 0;
    return;
}

Used by 1

Note: I noticed that I was never segfaulting without the buf[0] = 0 line. I thought that might be because it was getting inlined, but adding __attribute__((noinline)) didn’t fix it. A more likely explanation is that declaring the array only moves the stack pointer, and nothing faults until the memory is actually touched. (You can’t declare it as static because the size isn’t constant.)

And here’s our signal handler that resets i to its previous value and then decreases the change so our next iteration might succeed.

define segfault signal handler

@{safe print}
void handleSegfault(int sig) {
    (void)sig; /* named sig so that it does not shadow the global i */
    safeprintf("Received SIGSEGV at %lu bytes\n", i);
    i -= change;
    change >>= 1;
    return;
}

Used by 1

A note on signal handling from CS:APP 8.5.5: don’t call printf from a signal handler.

Safe Signal Handling

Signal handlers are tricky because they can run concurrently with the main program and with each other, as we saw in Figure 8.31. If a handler and the main program access the same global data structure concurrently, then the results can be unpredictable and often fatal. We will explore concurrent programming in detail in Chapter 12. Our aim here is to give you some conservative guidelines for writing handlers that are safe to run concurrently. If you ignore these guidelines, you run the risk of introducing subtle concurrency errors. With such errors, your program works correctly most of the time. However, when it fails, it fails in unpredictable and unrepeatable ways that are horrendously difficult to debug. Forewarned is forearmed!
- G0. Keep handlers as simple as possible. The best way to avoid trouble is to keep your handlers as small and simple as possible. For example, the handler might simply set a global flag and return immediately; all processing associated with the receipt of the signal is performed by the main program, which periodically checks (and resets) the flag.
- G1. Call only async-signal-safe functions in your handlers. A function that is async-signal-safe, or simply safe, has the property that it can be safely called from a signal handler, either because it is reentrant (e.g., accesses only local variables; see Section 12.7.2), or because it cannot be interrupted by a signal handler. Figure 8.33 lists the system-level functions that Linux guarantees to be safe. Notice that many popular functions, such as printf, sprintf, malloc, and exit, are not on this list. The only safe way to generate output from a signal handler is to use the write function (see Section 10.1). In particular, calling printf or sprintf is unsafe. To work around this unfortunate restriction, we have developed some safe functions, called the Sio (Safe I/O) package, that you can use to print simple messages from signal handlers.

Here’s a very naive signal-safe print. (The write system call is safe; vsnprintf is not on the async-signal-safe list, so this is only an approximation.)

safe print

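#define SAFEPRINTF_MAXBUF 200
int safeprintf(char *msg, ...)
{
    /* Format into a local buffer: malloc is not async-signal-safe. */
    char buf[SAFEPRINTF_MAXBUF];
    va_list args;
    va_start(args, msg);
    int written = vsnprintf(buf, SAFEPRINTF_MAXBUF, msg, args);
    va_end(args);
    if (written < 0)
        return written;
    /* vsnprintf returns the untruncated length, so clamp it to what fits. */
    if (written >= SAFEPRINTF_MAXBUF)
        written = SAFEPRINTF_MAXBUF - 1;
    return write(1, buf, written);
}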

Used by 1 2

See the man pages for details on va_list and va_start.
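For the one thing the handler really needs to print, an unsigned long, the formatting can be done by hand so that write is the only library call involved. This is a rough sketch in the spirit of the Sio package mentioned in the quote, not a copy of it; format_ulong and safe_print_ulong are names invented for this example.

```c
#include <stddef.h>
#include <unistd.h>

/* Writes the decimal digits of v into out (no terminator) and returns
 * the number of digits, or 0 if out is too small to hold them. */
size_t format_ulong(unsigned long v, char *out, size_t cap)
{
    char tmp[24];                       /* a 64-bit value has at most 20 digits */
    size_t n = 0;
    do {
        tmp[n++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);
    if (n > cap)
        return 0;
    for (size_t k = 0; k < n; k++)      /* digits were produced in reverse */
        out[k] = tmp[n - 1 - k];
    return n;
}

/* Prints v followed by a newline on stdout with a single write call. */
ssize_t safe_print_ulong(unsigned long v)
{
    char line[25];
    size_t n = format_ulong(v, line, sizeof line - 1);
    line[n++] = '\n';
    return write(STDOUT_FILENO, line, n);
}
```

Inside a handler this would be used as safe_print_ulong(i), with any fixed label text sent through a separate write call.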

setup segfault signal handler

struct sigaction action = {0}; /* zero sa_flags and the other fields */
action.sa_handler = &handleSegfault;
sigemptyset(&action.sa_mask);
sigaction(SIGSEGV, &action, NULL);

Used by 1

maxstack.c includes

#include <stdarg.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

Used by 1

And that’s it for testing our stack size.

This unfortunately doesn’t work. For some reason, my signal handler’s not being called.
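A likely reason is that a handler normally runs on the same stack that just overflowed, so the kernel has nowhere to set up the handler's frame and the process is killed instead. The usual remedy is to give the handler its own stack with sigaltstack and the SA_ONSTACK flag. A second catch: returning from a SIGSEGV handler re-runs the faulting instruction, so the handler has to jump out with siglongjmp rather than return. Here's a minimal sketch under those assumptions. It is not a drop-in fix for maxstack.c; every name in it (on_segv, stack_fits, max_stack_allocation, and so on) and the 64 KiB alternate stack size are choices made for this example.

```c
#define _XOPEN_SOURCE 700
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>

static sigjmp_buf recover_point;   /* where the handler jumps back to */
static char alt_stack[1 << 16];    /* 64 KiB handler stack (arbitrary size) */

static void on_segv(int sig)
{
    (void)sig;
    siglongjmp(recover_point, 1);  /* abandon the faulting frame */
}

/* Installs on_segv on an alternate stack. Returns 0 on success. */
int install_segv_handler(void)
{
    stack_t ss;
    memset(&ss, 0, sizeof ss);
    ss.ss_sp = alt_stack;
    ss.ss_size = sizeof alt_stack;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sa.sa_flags = SA_ONSTACK;      /* run the handler on alt_stack */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

__attribute__((noinline)) static void touch_stack(unsigned long size)
{
    volatile char buf[size];
    buf[0] = 0;                    /* lowest address, deepest into the stack */
    buf[size - 1] = 0;
}

/* True if size bytes could be allocated and touched on the stack. */
bool stack_fits(unsigned long size)
{
    if (sigsetjmp(recover_point, 1) != 0)
        return false;              /* we got here from on_segv */
    touch_stack(size);
    return true;
}

/* Largest size (at most cap) that stack_fits accepts, or 0 on setup failure. */
unsigned long max_stack_allocation(unsigned long cap)
{
    unsigned long size = 1, step = 1UL << 20;
    if (install_segv_handler() != 0)
        return 0;
    while (step > 0) {
        if (size + step <= cap && stack_fits(size + step))
            size += step;          /* fits: keep it */
        else
            step >>= 1;            /* segfault or over cap: halve the step */
    }
    return size;
}
```

Calling max_stack_allocation(64UL << 20) from main and printing the result should give a value a little below the soft stack limit, since the frames already on the stack use some of it. The cap only matters when the stack limit is unlimited. Passing 1 as the second argument to sigsetjmp saves the signal mask, so SIGSEGV is unblocked again after each jump.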

3. Other memory explorations.

/memory-explorations.c

#include <stdarg.h>
#include <limits.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

@{safe print}
void handleSegfault(int sig) {
    (void)sig;
    safeprintf("Received SIGSEGV\n");
    _exit(0); /* exit() is not async-signal-safe; _exit() is */
}
char buf[400] = {0};
void maxStackAllocation() {
    struct sigaction action = {0};
    action.sa_handler = &handleSegfault;
    sigemptyset(&action.sa_mask);
    sigaction(SIGSEGV, &action, NULL);
    for (uint i = 0; i < UINT_MAX; i += 10000) {
        /* i + 1 so the first iteration is not a zero-length array */
        char someArray[i + 1];
        someArray[0] = 0;
        sprintf(buf, "%u\t%d\n", i, someArray[0]);
        safeprintf("%s", buf);
    }
}

int main()
{
    maxStackAllocation();
}

➜  cache git:(main) ✗ ./a.out | head
0       0
10000   0
# ...
8370000 0
[1]    17203 segmentation fault (core dumped)  ./a.out

Data segment

ulimit reports an unlimited data segment size.

➜  lib64 ulimit -a | grep data
-d: data seg size (kbytes)          unlimited

Since the size of an array in the data segment is fixed at compile time, we can’t vary it from inside a C program. I’m using a shell script to regenerate and recompile the source for each size.

data segment

SIZE=0
while true; do
    cat > /tmp/scratch.c <<EOF
#include <stdio.h>
char someArray[$SIZE];
int main()
{
    printf("$SIZE\n");
}
EOF
    cc -O0 -o /tmp/scratch /tmp/scratch.c
    /tmp/scratch
    SIZE=$((SIZE+1000000000))
done

This segfaults just past 50 gigabytes: the 50 GB build still runs, and the next one (51 GB) crashes.

➜ ./scratch.sh
0
1000000000
# ...
50000000000
./scratch.sh: line 15: 20181 Segmentation fault      /tmp/scratch
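An alternative to regenerating the source for every size is to keep one C file and pass the size in as a macro with the compiler's -D option. This is only a sketch of that variation, not what the script above does; the default of 1000000 and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* SIZE is meant to be supplied on the command line, e.g. -DSIZE=2000000000UL.
 * The fallback below is an arbitrary default for this example. */
#ifndef SIZE
#define SIZE 1000000UL
#endif

/* Zero-initialized global array whose size is fixed at compile time. */
char someArray[SIZE];

/* Size of the array in bytes. */
size_t data_array_size(void)
{
    return sizeof someArray;
}

/* Prints the size, like the printf in the generated scratch.c. */
void report_data_array(void)
{
    printf("%zu\n", sizeof someArray);
}
```

The shell loop would then shrink to something like cc -O0 -DSIZE=${SIZE}UL followed by running the binary, with a small main that calls report_data_array().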

Heap

heap

#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

int main()
{
    for (ulong i = 1; i < 64; i++) {
        /* malloc's result is not checked, so "Allocated" prints even when it returns NULL */
        char *someMemory = malloc(sizeof(char) * (2UL << i));
        printf("Allocated 2 << %lu (%lu) bytes\n", i, 2UL << i);
        free(someMemory);
    }
}

➜  .cache git:(main) ✗ cc heap.c  && ./a.out | head -80
Allocated 2 << 1 (4) bytes
# ...
Allocated 2 << 62 (9223372036854775808) bytes
Allocated 2 << 63 (0) bytes
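The output above says "Allocated" for every size because the program never looks at what malloc returned. Here's a minimal sketch that checks for NULL and stops at the first failure; largest_malloc_shift is a name invented for this example. Even a non-NULL result is only a reservation of address space: on Linux, overcommit may let a request succeed that could never be fully backed by physical memory.

```c
#include <limits.h>
#include <stdlib.h>

/* Returns the largest shift s such that malloc(1 << s) succeeded for
 * every power of two up to and including 1 << s, or -1 if none did. */
int largest_malloc_shift(void)
{
    int best = -1;
    int max_shift = (int)(sizeof(size_t) * CHAR_BIT) - 1;
    for (int s = 0; s < max_shift; s++) {
        size_t size = (size_t)1 << s;
        void *p = malloc(size);
        if (p == NULL)
            break;              /* first real failure: stop here */
        free(p);                /* nothing was touched, so this is cheap */
        best = s;
    }
    return best;
}
```

Printing the result as printf("largest successful malloc: 2^%d bytes\n", largest_malloc_shift()) gives the heap counterpart of the stack and data segment numbers. The value depends on the machine's memory, swap, and overcommit settings.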

4. How fast can you allocate memory?