Programming

A collection of resources relating to C programming, including Makefiles.

Linked Lists

Similar to arrays, linked lists store a sequence of data all of the same type. One benefit of linked lists is that elements are created as you need them, so linked lists use only the memory they need. Another benefit is that adding or removing elements is constant time.

Structs

As a reminder, a structure is a collection of variables. Unlike arrays, the elements of a structure are permitted to have different types. When memory is reserved for a struct, a chunk large enough to hold all the variables is allocated. A simple structure might look like


struct Animal {
    char name[20];
    Colour colour;
    int powerLevel;
};

This creates a new type struct Animal. The Animal in struct Animal is called the struct tag.

In this example, our struct is a record of the name, colour, and power level of an animal.

We can declare an instance of our struct and set its fields using, for example,


struct Animal animal;
strncpy(animal.name, "cat", 20);
animal.colour = RED;
animal.powerLevel = 20;

Or we can use the heap


struct Animal *animal = malloc(sizeof(struct Animal));
strncpy(animal->name, "cat", 20);
animal->colour = RED;
animal->powerLevel = 20;

Basic Chaining

A linked list can be created from an existing structure by adding a self-referential field. In other words, a field whose type is a pointer to that same structure type. In the following example, we have modified the struct we saw earlier to have a new self-referential field next with type struct Animal *


struct Animal {
    char name[20];
    Colour colour;
    int powerLevel;
    struct Animal *next;
};

By having the next field of a node point at the subsequent node, we are able to chain a sequence of structs together to form a list. If there is no subsequent node, then next should be NULL.

Creating a linked list

The first thing you need to do is create a variable that will be used to keep track of the start of your linked list, ie. the head of the list.

struct Animal *head = NULL;

We set this variable to be NULL initially, which indicates that there is nothing in the list. When we add an element to the list for the first time, we will set our head variable to point to the new element.

Appending

Appending a node means to attach the node to the end of the list. To add to the end of the list, we need a reference to the last element in the list. This reference is called the tail of the list.


// Create our new element
struct Animal *newThing = malloc(sizeof(struct Animal));
// Set next to be NULL, as there is nothing after this element.
newThing->next = NULL;

// Add the new element to the list 
if (head == NULL) { 
    // Nothing in the list yet, so make our new element the head. 
    head = newThing; 
} else { 
    // Add our element to the end of the list. 
    struct Animal *tail = head; 
    while (tail->next != NULL) { 
        tail = tail->next; 
    } 
    tail->next = newThing; 
} 

// Go off and use our new element...

As you can see in the example, we are taking into account the possibility that we don't have anything in our list yet. If we find head to be NULL, we are setting head to our new element and then moving on. If head does have something in it, then we search through our list for the last element (the tail) then add our new element after it.

We only need to search for the tail if we are not tracking it - if you have a tail variable as well as a head variable, you can use it (but remember to update tail to point to the new element afterwards):

Set tail->next to be the new node animal
Set animal->next to NULL, to indicate the end of the list
Update tail to point to the new node.

Inserting

Whereas appending would add a node the end of the list, inserting is more general. A node can be inserted anywhere in the middle of a list or at either end. In the following, we assume that node represents the node after which the new node is to be inserted, and animal is the node to be inserted. In other words, we are inserting between node and node->next.

Set animal->next to be node->next
Update node->next to be animal

This is simple enough, all we had to do was update the next fields to sandwich our new node into the list. However, we have to be careful. If we happen to be inserting at the end of the list and if we have a tail node that tracks the end of the list, we'll need to update this to point at the newly inserted node.

Traversing the list

Once our list is populated, we might want to go through each element of the list. Since we want to start at the beginning, we'll need a reference to the first node in the list. This reference to the first node is called the head of the list.

Let's try printing the name of each animal in our list. The following assumes we have a variable animal which is a pointer to our struct.

Set animal to the head of the list
Print animal->name
Update animal to point to the next node in the list
Repeat the last two steps until we reach the end of the list.

As long as we ensure that the last node in our list has next set to NULL, then we can use that property to check whether we have reached the end of the list.

If we think about this, we can come up with a general pattern for traversing the loop


for(animal = head; animal != NULL; animal = animal->next) {
	/* ... do stuff ... */
}

List Struct

Everything discussed up to this point is sufficient to build a satisfactory linked list. However, if we are using a head and tail, it can sometimes be nice to put these into a struct of their own. This also makes it easy to add other fields that provide information about the list as a whole, such as the length of our list.


struct List {
    struct Animal *head;
    struct Animal *tail;
    int size;
};

Initialising this struct is equivalent to creating an empty list. We initialise the fields to sensible values, NULL for the two pointers and 0 for the size.

Methods

You can think of structs as a primitive version of classes. When you're coding a data structure, like a linked list or a tree, you should write methods to perform operations on your data structure. For any data structure, you'll probably need a constructor and a destructor.

The constructor allocates (using malloc) memory for the data structure and the destructor frees it. The constructor also initialises the data structure, either by setting fields of the struct from the constructor arguments or by setting them to sensible defaults, like NULL for pointers. These ideas should be familiar to you from Java or Python or C++. In Python, __init__ is the constructor for a class.

For a linked list, we'll need some other methods, for example we might need methods to

append elements to the end
remove an element
find an element to check if its in the list

Writing a function for each operation, like those above, means that you don't have to litter code that's specific to your data structure throughout your code and even better means that you can test the methods to make sure your data structure is robust.

Makefiles

It is common to use the make program to compile code. This is especially the case for larger projects with multiple binaries or complicated compilation procedures.

An advantage of using make is having a single, simple command to use to compile a program. After creating a Makefile, you are able to run any target to produce a program, clean up the working environment, create a tar distribution or anything else that you think would be useful.

Makefiles

When run, make will search for a Makefile in the current directory. The contents of the Makefile contain the instructions that make will use to perform some action, such as build your project.

Makefiles follow a set structure, and it is important that you follow this to ensure it will work as expected. Of particular note is the indentation of commands within a rule - these must be TAB characters. If they are not, then make will substitute your rule with a default, which is undesirable.

Rules

Rules are composed of targets and dependencies and have an associated command. To build a rule, make first checks that all dependencies are met. Each dependency specifies a file which is necessary for the rule to be built. If the file does not exist, make searches the Makefile for a target that matches the dependency.

The format for a rule is:

target: dependencies 
	command

The target is the name of the file to be output, or a dummy name (in the case of all, clean, etc. This will have zero or more dependencies, which is a list of files (which may or may not have their own rules in the Makefile). Finally there are zero or more commands listed for the rule, indented with a single TAB character underneath the target line.

A target will only be run if the target has dependencies that are newer than it (assuming the target is not specified under .PHONY. When using a target, make compares the timestamp of the target against the timestamps of the dependencies. If any of the dependencies have a newer timestamp than the target, make will run the commands for that target again. If the target is newer, no commands for that rule will be run.

An example from the Makefile below is:

server: server.c shared.o 
	$(CC) $(CFLAGS) shared.o server.c -o server

This rule will compile a binary called server. This depends on two files, one of which has a rule defined for it elsewhere in the Makefile. When make runs this rule, it will work out that shared.o has its own rule, and will then go about checking its dependencies to find out if its rule needs to be run before running the server rule.

By default, when you run make, it will look at the first rule in the Makefile and any rules that it depends on. If you want to use a different rule as your default, or to ensure make always uses the correct rule as the default, include a .DEFAULT rule in your Makefile.

Variables

Make supports variables that have a similar syntax to Bash variables. Some common variables are

CC: specifies the compiler, such as gcc, g++ or clang,
CFLAGS: specifies compilation flags, and
LDFLAGS: specifies linker flags.

Variables are set like so: CC = gcc. The value of a variable can be used in rules by prepending a dollar sign ($) and wrapping the variable name in parentheses (although single-letter variables do not require parentheses), like so: $(CC)

Example Makefile

The following is an example Makefile. It will compile two programs, server and client. There is also a shared object that is compiled and linked in to both of these binaries, shared.o. There is also a special debug rule, which updates the flags for compilation before running all of the rules for the binaries listed in the TARGETS variable.

CC = gcc
CFLAGS = -Wall -pedantic -std=gnu99
DEBUG = -g
TARGETS = server client

# Mark the default target to run (otherwise make will select the first target in the file)
.DEFAULT: all
# Mark targets as not generating output files (ensure the targets will always run)
.PHONY: all debug clean

all: $(TARGETS)

# A debug target to update flags before cleaning and compiling all targets
debug: CFLAGS += $(DEBUG)
debug: clean $(TARGETS)

# Create a shared object for inclusion in our programs
shared.o: shared.c shared.h
	$(CC) $(CFLAGS) -c shared.c -o shared.o

server: server.c shared.o
	$(CC) $(CFLAGS) shared.o server.c -o server

client: client.c shared.o
	$(CC) $(CFLAGS) shared.o client.c -o client

# Clean up our directory - remove objects and binaries
clean:
	rm -f $(TARGETS) *.o

Pointers

Pointers are a part of the C programming language that are very useful, but can be confusing when starting off learning C.

Pointer fun with Binky

A good resource to learn about pointers is Pointer fun with Binky. This is a video developed by Stanford University for their Computer Science Library. There are also some resources that accompany the video which may be of use.