A collection of resources relating to C programming, including Makefiles.
Similar to arrays, linked lists store a sequence of data all of the same type. One benefit of linked lists is that elements are created as you need them, so linked lists use only the memory they need. Another benefit is that adding or removing elements is constant time.
As a reminder, a structure is a collection of variables. Unlike arrays, the elements of a structure are permitted to have different types. When memory is reserved for a struct, a chunk large enough to hold all the variables is allocated. A simple structure might look like
struct Animal {
char name[20];
Colour colour;
int powerLevel;
};
This creates a new type struct Animal
. The Animal
in struct Animal
is called the struct tag.
In this example, our struct is a record of the name, colour, and power level of an animal.
We can declare an instance of our struct and set its fields using, for example,
struct Animal animal;
strncpy(animal.name, "cat", 20);
animal.colour = RED;
animal.powerLevel = 20;
Or we can use the heap
struct Animal *animal = malloc(sizeof(struct Animal));
strncpy(animal->name, "cat", 20);
animal->colour = RED;
animal->powerLevel = 20;
A linked list can be created from an existing structure by adding a self-referential field. In other words, a field whose type is a pointer to that same structure type. In the following example, we have modified the struct we saw earlier to have a new self-referential field next
with type struct Animal *
struct Animal {
char name[20];
Colour colour;
int powerLevel;
struct Animal *next;
};
By having the next
field of a node point at the subsequent node, we are able to chain a sequence of structs together to form a list. If there is no subsequent node, then next
should be NULL.
The first thing you need to do is create a variable that will be used to keep track of the start of your linked list, ie. the head of the list.
struct Animal *head = NULL;
We set this variable to be NULL initially, which indicates that there is nothing in the list. When we add an element to the list for the first time, we will set our head
variable to point to the new element.
Appending a node means to attach the node to the end of the list. To add to the end of the list, we need a reference to the last element in the list. This reference is called the tail
of the list.
// Create our new element
struct Animal *newThing = malloc(sizeof(struct Animal));
// Set next to be NULL, as there is nothing after this element.
newThing->next = NULL;
// Add the new element to the list
if (head == NULL) {
// Nothing in the list yet, so make our new element the head.
head = newThing;
} else {
// Add our element to the end of the list.
struct Animal *tail = head;
while (tail->next != NULL) {
tail = tail->next;
}
tail->next = newThing;
}
// Go off and use our new element...
As you can see in the example, we are taking into account the possibility that we don't have anything in our list yet. If we find head
to be NULL
, we are setting head
to our new element and then moving on. If head
does have something in it, then we search through our list for the last element (the tail
) then add our new element after it.
We only need to search for the tail
if we are not tracking it - if you have a tail
variable as well as a head
variable, you can use it (but remember to update tail
to point to the new element afterwards):
tail->next
to be the new node animal
animal->next
to NULL
, to indicate the end of the listtail
to point to the new node.Whereas appending would add a node the end of the list, inserting is more general. A node can be inserted anywhere in the middle of a list or at either end. In the following, we assume that node
represents the node after which the new node is to be inserted, and animal
is the node to be inserted. In other words, we are inserting between node
and node->next
.
animal->next
to be node->next
node->next
to be animal
This is simple enough, all we had to do was update the next
fields to sandwich our new node into the list. However, we have to be careful. If we happen to be inserting at the end of the list and if we have a tail
node that tracks the end of the list, we'll need to update this to point at the newly inserted node.
Once our list is populated, we might want to go through each element of the list. Since we want to start at the beginning, we'll need a reference to the first node in the list. This reference to the first node is called the head
of the list.
Let's try printing the name of each animal in our list. The following assumes we have a variable animal
which is a pointer to our struct.
animal
to the head of the listanimal->name
animal
to point to the next node in the listAs long as we ensure that the last node in our list has next
set to NULL
, then we can use that property to check whether we have reached the end of the list.
If we think about this, we can come up with a general pattern for traversing the loop
for(animal = head; animal != NULL; animal = animal->next) {
/* ... do stuff ... */
}
Everything discussed up to this point is sufficient to build a satisfactory linked list. However, if we are using a head
and tail
, it can sometimes be nice to put these into a struct of their own. This also makes it easy to add other fields that provide information about the list as a whole, such as the length of our list.
struct List {
struct Animal *head;
struct Animal *tail;
int size;
};
Initialising this struct is equivalent to creating an empty list. We initialise the fields to sensible values, NULL for the two pointers and 0 for the size.
You can think of structs as a primitive version of classes. When you're coding a data structure, like a linked list or a tree, you should write methods to perform operations on your data structure. For any data structure, you'll probably need a constructor and a destructor.
The constructor allocates (using malloc) memory for the data structure and the destructor frees it. The constructor also initialises the data structure, either by setting fields of the struct from the constructor arguments or by setting them to sensible defaults, like NULL
for pointers. These ideas should be familiar to you from Java or Python or C++. In Python, __init__
is the constructor for a class.
For a linked list, we'll need some other methods, for example we might need methods to
append
elements to the endremove
an elementfind
an element to check if its in the listWriting a function for each operation, like those above, means that you don't have to litter code that's specific to your data structure throughout your code and even better means that you can test the methods to make sure your data structure is robust.
It is common to use the make program to compile code. This is especially the case for larger projects with multiple binaries or complicated compilation procedures.
An advantage of using make is having a single, simple command to use to compile a program. After creating a Makefile, you are able to run any target to produce a program, clean up the working environment, create a tar distribution or anything else that you think would be useful.
When run, make will search for a Makefile in the current directory. The contents of the Makefile contain the instructions that make will use to perform some action, such as build your project.
Makefiles follow a set structure, and it is important that you follow this to ensure it will work as expected. Of particular note is the indentation of commands within a rule - these must be TAB characters. If they are not, then make will substitute your rule with a default, which is undesirable.
Rules are composed of targets and dependencies and have an associated command. To build a rule, make first checks that all dependencies are met. Each dependency specifies a file which is necessary for the rule to be built. If the file does not exist, make searches the Makefile for a target that matches the dependency.
The format for a rule is:
target: dependencies
command
The target
is the name of the file to be output, or a dummy name (in the case of all
, clean
, etc. This will have zero or more dependencies, which is a list of files (which may or may not have their own rules in the Makefile). Finally there are zero or more commands listed for the rule, indented with a single TAB character underneath the target line.
A target will only be run if the target has dependencies that are newer than it (assuming the target is not specified under .PHONY
. When using a target, make compares the timestamp of the target against the timestamps of the dependencies. If any of the dependencies have a newer timestamp than the target, make will run the commands for that target again. If the target is newer, no commands for that rule will be run.
An example from the Makefile below is:
server: server.c shared.o
$(CC) $(CFLAGS) shared.o server.c -o server
This rule will compile a binary called server
. This depends on two files, one of which has a rule defined for it elsewhere in the Makefile. When make runs this rule, it will work out that shared.o
has its own rule, and will then go about checking its dependencies to find out if its rule needs to be run before running the server rule.
By default, when you run make, it will look at the first rule in the Makefile and any rules that it depends on. If you want to use a different rule as your default, or to ensure make always uses the correct rule as the default, include a .DEFAULT
rule in your Makefile.
Make supports variables that have a similar syntax to Bash variables. Some common variables are
Variables are set like so: CC = gcc
. The value of a variable can be used in rules by prepending a dollar sign ($) and wrapping the variable name in parentheses (although single-letter variables do not require parentheses), like so: $(CC)
The following is an example Makefile. It will compile two programs, server
and client
. There is also a shared object that is compiled and linked in to both of these binaries, shared.o
. There is also a special debug
rule, which updates the flags for compilation before running all of the rules for the binaries listed in the TARGETS
variable.
CC = gcc
CFLAGS = -Wall -pedantic -std=gnu99
DEBUG = -g
TARGETS = server client
# Mark the default target to run (otherwise make will select the first target in the file)
.DEFAULT: all
# Mark targets as not generating output files (ensure the targets will always run)
.PHONY: all debug clean
all: $(TARGETS)
# A debug target to update flags before cleaning and compiling all targets
debug: CFLAGS += $(DEBUG)
debug: clean $(TARGETS)
# Create a shared object for inclusion in our programs
shared.o: shared.c shared.h
$(CC) $(CFLAGS) -c shared.c -o shared.o
server: server.c shared.o
$(CC) $(CFLAGS) shared.o server.c -o server
client: client.c shared.o
$(CC) $(CFLAGS) shared.o client.c -o client
# Clean up our directory - remove objects and binaries
clean:
rm -f $(TARGETS) *.o
Pointers are a part of the C programming language that are very useful, but can be confusing when starting off learning C.
A good resource to learn about pointers is Pointer fun with Binky. This is a video developed by Stanford University for their Computer Science Library. There are also some resources that accompany the video which may be of use.