# CSAPP malloclab

This is the writeup for CSAPP Malloclab

I am writing this blog after spending two days working on the malloclab. Finally came out with a piece of shit. I got 85/100 in this lab, which almost “totally references” others’ implementations. I should do more work on C programming. I still have a long way to go. :(

## Goals

In this lab, we are going to write our own dynamic memory allocator, including the functionalities of what we usually use: malloc, free, and realloc. We will have to finish four functions in the provided lab handout: mm_init, mm_malloc, mm_free, and mm_realloc. Similar to the original malloc (ptmalloc), our dynamic allocator is an immediate general-purpose allocator, which (1) we may not make any assumption on the data structure we are going to store. (2) have to make allocation immediately after calling the allocation function. (3) data should align with the 8-byte alignment.

We will have a check program mdriver.c, after we do make in the handout directory, we are able to run the driver and test our code on different functionalities. (Caution: the trace file does not provide us with the handout, you may search on the web or use what I found in my CSAPP github repo)

Similar to the syscall that manages the virtual memory: brk, sbrk, and mmap (which we are not going to use), the handout provides us some syscall-like functions in memlib.c: mem_sbrk, mem_heap_lo, mem_heap_hi, mem_heapsize, and mem_pagesize.

## Implementations

### Implicit Free List

Implicit free list implementation is actually provided by the code in the book. The data structure used in this implementation is as follows:

By the way, this project doesn’t allow us to define any explicitly stated structure like above, so we may first find out a good way to do the pointer arithmetic.

We are suggested to use the macro to solve this issue. The reason not to use different helper functions to implement pointer arithmetic is that function calls are expensive. Each pointer arithmetic is really simple, but to make a function call needs to allocate a stack, jump to the function body, and jump back. This lab will be going to measure both the memory utilization of the implementation and the throughput. Macros, on the other hand, will do the pointer arithmetic in the compile time. The compiler will transform different macros to the corresponding assembly instructions and replace them everywhere in the binary ELF file.

The book provides us with the macro below. With those macros, we can easily manipulate the heap block data structure.

The implementation details will be discussed through the sequence of the calling routine.

mm_init function is straightforward. First open part of the memory region for a default heap data structure: prologue and epilogue. These two data blocks are used to constrain the search will not go ahead of the heap start (prologue) and will not go beyond the heap end (epilogue). Then we extend a CHUNKSIZE of the heap to store values.

extend_heap function mainly does two things: (1) use sbrk to open new space for heap. (2) make sure the newly opened space will be consistent with the original blocks, which we will use coalesce function for the newly inserted free block.

coalesce function will coalesce the previous and next free block to make a continuous big free block. Discuss the condition case by case based on the allocation bit that is stored in the header and footer of the previous and next blocks will be very easy to implement.

mm_malloc function will deal with first aligning the size (including header, footer, and data block size) provided by the user, trying to find a fit block of this size. There are two different search algorithms that will be discussed later: First Fit Search and Best Fit Search. If it fails, try to extend the heap by at least a CHUNKSIZE and return the new ptr as the allocated block.

place function just to place the newly allocated block, mark it as allocated and do the split if the block is too large to fit.

mm_free and mm_realloc will be easy to implement, so just place the code here.

First fit search version of find_fit helper function:

Only implement the first fit search is not good enough, the utilization is fine, but throuput is too low.

The next fit search version of find_fit helper function is shown below. We will need an additional global rover to store what is the last allocated block. We search from that block to the end, then search from the beginning until around back to rover block.

The result of the test shows that next fit is a lot better than the first fit search (probably because most of the test trace are written in malloc a lot of block at the beginning)

I didn’t implement the best fit search, but I did implement it on my wisc easier version of malloc lab. So the logic is easy and can add on to both the first fit and next fit search. Have an additional variable to store the best size fit and the best fit block pointer. If we find another block fits better than the current one, we replace it with the new one. Once we find a perfect fit (same size), then we terminate the search and return that block.

### Segregate Free List

Rather than using the implicit free list, we could make the free list visible to our program so that we can search through the free list instead of the. Though there are still many different implementations on the free list. I choose a relatively more sufficient implementation: segregate free list.

Segregate free list, by definition, is to put multiple free blocks within a range of size into a separate free list. We could have multiple ranges of sizes, here for malloclab, I choose the range cutoff: 16 32 64 128 256 512 1024 2048 4096. Each free block will have a new data structure as follow:

Since new data structure is introduced, we will need new macrod to deal with the free list.

In order to store the free list, we will need additional space in the heap block to store it, so we will first make more room in the heap block to store the list.

sgrgt_list_init function is used to initialize the segregate list before we initialize any prologue and epilogue block in the mm_init function.

sgrgt_list_insrt and sgrgt_list_remov functions are used to insert and remove a block to/from the segregate list. The implementation is easy, just a double-linked list insert and remove. A block should be inserted if it was coalesced in coalesce function and removed if it was to place as an allocated block in place function or coalesced in coalesce function to make a bigger free block.

With all those helper functions above (and some small logic like setting the prev and next pointer when a free a block, remove the redundant pointer after alloc the block. Check out the CSAPP github repo), we could successfully implement the segregate list version of malloc. The improvement is huge compared to the first version.

## Debugging Tips

Modify the Makefile to compile the program with debugging symbols and also turns on the gprof profile check.

gprof is a great tool to find out where we could do to improve the program. It will give a detailed call graph of the program and also some analysis on it. Here is what I have on the segregate free list version

## Summary

Malloclab is inevitably the hardest lab among all labs in CSAPP. But I learned a lot from it (though my code is still a piece of shit). The next step will be to take look at what real libc ptmalloc is doing by studying its source code. Then we could head to the last lab - proxylab.

## References

Source code from the book

https://zhuanlan.zhihu.com/p/150100073

https://littlecsd.net/2019/02/14/csapp-Malloclab/