Let's start by describing how the data on a freelist is laid out in memory. These are the first two blocks on the freelist for thread id 3 in bin 3 (8 bytes):
+----------------+
| next* ---------|--+  (_S_bin[ 3 ].first[ 3 ] points here)
|                |  |
|                |  |
|                |  |
+----------------+  |
| thread_id = 3  |  |
|                |  |
|                |  |
|                |  |
+----------------+  |
| DATA           |  |  (A pointer to here is what is returned to
|                |  |   the application when needed)
|                |  |
|                |  |
|                |  |
|                |  |
|                |  |
|                |  |
+----------------+  |
+----------------+  |
| next*          |<-+  (If next == NULL it's the last one on the list)
|                |
|                |
|                |
+----------------+
| thread_id = 3  |
|                |
|                |
|                |
+----------------+
| DATA           |
|                |
|                |
|                |
|                |
|                |
|                |
|                |
+----------------+
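The layout in the diagram can be sketched in C++ roughly as follows. This is only an illustration: the field names follow the diagram, while the helper names data_of and record_of are hypothetical and the real block_record definition may differ.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical header mirroring the diagram: a link plus the owning thread id.
struct block_record {
    block_record* next;      // next free block, or nullptr for the last one
    std::size_t   thread_id; // owner of the block (0 is the global pool)
};

// The application is handed the address just past the header (the DATA area).
inline void* data_of(block_record* b) {
    assert(b != nullptr);
    return reinterpret_cast<char*>(b) + sizeof(block_record);
}

// Inverse mapping: step back from the DATA pointer to the header.
inline block_record* record_of(void* p) {
    assert(p != nullptr);
    return reinterpret_cast<block_record*>(
        static_cast<char*>(p) - sizeof(block_record));
}
```

Because the header sits directly in front of the data, no separate bookkeeping table is needed to find a block's list link or owner from the pointer the application holds.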
With this in mind, we simplify things for a while and assume that there is only one thread (a single-threaded, or ST, application). In this case all operations are made on what is referred to as the global pool - thread id 0. (No thread may be assigned this id, since thread ids span from 1 to _S_max_threads in a multi-threaded, or MT, application.)
When the application requests memory (by calling allocate()), we first look at the requested size; if it is greater than _S_max_bytes, we call new() directly and return.
If the requested size is within limits we start by finding out from which bin we should serve this request by looking in _S_binmap.
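A minimal sketch of such a lookup table, assuming that bin i holds blocks of 2^i bytes (which matches "bin 3 (8 bytes)" above). The function name make_binmap and the element type are assumptions made for illustration, not the actual definition of _S_binmap.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build a table where binmap[n] is the index of the bin that serves an
// n-byte request. Assumption: bin i stores blocks of 2^i bytes.
std::vector<unsigned short> make_binmap(std::size_t max_bytes) {
    assert(max_bytes > 0);
    std::vector<unsigned short> binmap(max_bytes + 1);
    unsigned short bin = 0;
    std::size_t bin_size = 1; // bytes stored by the current bin
    for (std::size_t n = 0; n <= max_bytes; ++n) {
        if (n > bin_size) {   // request no longer fits: move to the next bin
            bin_size <<= 1;
            ++bin;
        }
        binmap[n] = bin;
    }
    return binmap;
}
```

Building the table once up front turns the size-to-bin decision on every allocate() and deallocate() call into a single array index.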
A quick look at _S_bin[ bin ].first[ 0 ] tells us whether there are any blocks of this size on the freelist of thread id 0. If this pointer is not NULL, we simply remove the block that _S_bin[ bin ].first[ 0 ] points to from the list, update _S_bin[ bin ].first[ 0 ] and return a pointer to that block's data.
If the freelist is empty (the pointer is NULL), we must get memory from the system and build a freelist within this memory. All requests for new memory are made in chunks of _S_chunk_size. Knowing the size of a block_record and the number of bytes that this bin stores, we calculate how many blocks we can create within this chunk, build the list, remove the first block, update the pointer (_S_bin[ bin ].first[ 0 ]) and return a pointer to that block's data.
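The chunk-carving step might look roughly like the sketch below. The function carve_chunk and its parameters are hypothetical, and rounding the per-block stride up to the alignment of block_record is an addition made here so that every header is properly aligned; the text above does not say how the real allocator handles that.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical header, following the diagram.
struct block_record {
    block_record* next;
    std::size_t   thread_id;
};

// Carve `chunk` into a singly linked freelist of blocks that each store
// `bin_bytes` of data. Returns the list head; *count receives the number
// of blocks that fit into the chunk.
block_record* carve_chunk(void* chunk, std::size_t chunk_size,
                          std::size_t bin_bytes, std::size_t* count) {
    const std::size_t align  = alignof(block_record);
    const std::size_t raw    = sizeof(block_record) + bin_bytes;
    const std::size_t stride = (raw + align - 1) / align * align; // assumption
    const std::size_t n      = chunk_size / stride;
    assert(n > 0);

    char* p = static_cast<char*>(chunk);
    for (std::size_t i = 0; i < n; ++i) {
        block_record* b = reinterpret_cast<block_record*>(p + i * stride);
        b->thread_id = 0; // global pool
        b->next = (i + 1 < n)
            ? reinterpret_cast<block_record*>(p + (i + 1) * stride)
            : nullptr;    // last block terminates the list
    }
    *count = n;
    return reinterpret_cast<block_record*>(p);
}
```

The caller would then pop the head of the returned list for the current request and store the remainder in the bin's first pointer.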
Deallocation is equally simple: the pointer is cast back to a block_record pointer, we look up which bin to use based on the size, add the block to the front of the global freelist and update the pointer (_S_bin[ bin ].first[ 0 ]) as needed.
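Putting the single-threaded path together, a toy version of the allocate/deallocate cycle could be sketched as below. The class tiny_pool, its constants (128-byte limit, 4096-byte chunks, eight power-of-two bins) and its member names are all assumptions chosen for illustration; this is not the actual allocator code. Like the text above, it never returns chunks to the system.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct block_record { block_record* next; std::size_t thread_id; };

class tiny_pool {
public:
    static constexpr std::size_t max_bytes  = 128;  // assumed size limit
    static constexpr std::size_t chunk_size = 4096; // assumed chunk size
    static constexpr std::size_t bin_count  = 8;    // bins of 1, 2, ..., 128 bytes

    void* allocate(std::size_t n) {
        if (n > max_bytes) return ::operator new(n); // too large: bypass the pool
        const std::size_t bin = bin_for(n);
        if (first_[bin] == nullptr) refill(bin);     // empty freelist: carve a chunk
        block_record* b = first_[bin];               // pop the front block
        first_[bin] = b->next;
        return reinterpret_cast<char*>(b) + sizeof(block_record);
    }

    void deallocate(void* p, std::size_t n) {
        if (n > max_bytes) { ::operator delete(p); return; }
        const std::size_t bin = bin_for(n);
        block_record* b = reinterpret_cast<block_record*>(
            static_cast<char*>(p) - sizeof(block_record));
        b->next = first_[bin];                       // push onto the front
        first_[bin] = b;
    }

private:
    static std::size_t bin_for(std::size_t n) {
        std::size_t bin = 0, size = 1;
        while (size < n) { size <<= 1; ++bin; }
        return bin;
    }

    void refill(std::size_t bin) {
        const std::size_t align  = alignof(block_record);
        const std::size_t raw    = sizeof(block_record) + (std::size_t(1) << bin);
        const std::size_t stride = (raw + align - 1) / align * align;
        const std::size_t count  = chunk_size / stride;
        assert(count > 0);
        char* p = static_cast<char*>(::operator new(chunk_size)); // never freed
        for (std::size_t i = 0; i < count; ++i) {
            block_record* b = reinterpret_cast<block_record*>(p + i * stride);
            b->thread_id = 0;
            b->next = (i + 1 < count)
                ? reinterpret_cast<block_record*>(p + (i + 1) * stride)
                : nullptr;
        }
        first_[bin] = reinterpret_cast<block_record*>(p);
    }

    block_record* first_[bin_count] = {}; // freelist heads for thread id 0
};
```

Because freed blocks are pushed onto the front of the list, a block that was just released is the first one handed out again for the same bin, and neither operation has to walk the list.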
The decision to add deallocated blocks to the front of the freelist was made after a set of performance measurements that showed that this is roughly 10% faster than maintaining a set of "last pointers" as well.