Similar to arrays, linked lists store a sequence of data all of the same type. One benefit of linked lists is that elements are created as you need them, so linked lists use only the memory they need. Another benefit is that adding or removing elements is constant time.
Structs
As a reminder, a structure is a collection of variables. Unlike arrays, the elements of a structure are permitted to have different types. When memory is reserved for a struct, a chunk large enough to hold all the variables is allocated. A simple structure might look like
struct Animal {
char name[20];
Colour colour;
int powerLevel;
};
This creates a new type struct Animal
. The Animal
in struct Animal
is called the struct tag.
In this example, our struct is a record of the name, colour, and power level of an animal.
We can declare an instance of our struct and set its fields using, for example,
struct Animal animal;
strncpy(animal.name, "cat", 20);
animal.colour = RED;
animal.powerLevel = 20;
Or we can use the heap
struct Animal *animal = malloc(sizeof(struct Animal));
strncpy(animal->name, "cat", 20);
animal->colour = RED;
animal->powerLevel = 20;
Basic Chaining
A linked list can be created from an existing structure by adding a self-referential field. In other words, a field whose type is a pointer to that same structure type. In the following example, we have modified the struct we saw earlier to have a new self-referential field next
with type struct Animal *
struct Animal {
char name[20];
Colour colour;
int powerLevel;
struct Animal *next;
};
By having the next
field of a node point at the subsequent node, we are able to chain a sequence of structs together to form a list. If there is no subsequent node, then next
should be NULL.
Creating a linked list
The first thing you need to do is create a variable that will be used to keep track of the start of your linked list, ie. the head of the list.
struct Animal *head = NULL;
We set this variable to be NULL initially, which indicates that there is nothing in the list. When we add an element to the list for the first time, we will set our head
variable to point to the new element.
Appending
Appending a node means to attach the node to the end of the list. To add to the end of the list, we need a reference to the last element in the list. This reference is called the tail
of the list.
// Create our new element
struct Animal *newThing = malloc(sizeof(struct Animal));
// Set next to be NULL, as there is nothing after this element.
newThing->next = NULL;
// Add the new element to the list
if (head == NULL) {
// Nothing in the list yet, so make our new element the head.
head = newThing;
} else {
// Add our element to the end of the list.
struct Animal *tail = head;
while (tail->next != NULL) {
tail = tail->next;
}
tail->next = newThing;
}
// Go off and use our new element...
As you can see in the example, we are taking into account the possibility that we don't have anything in our list yet. If we find head
to be NULL
, we are setting head
to our new element and then moving on. If head
does have something in it, then we search through our list for the last element (the tail
) then add our new element after it.
We only need to search for the tail
if we are not tracking it - if you have a tail
variable as well as a head
variable, you can use it (but remember to update tail
to point to the new element afterwards):
- Set
tail->next
to be the new nodeanimal
- Set
animal->next
toNULL
, to indicate the end of the list - Update
tail
to point to the new node.
Inserting
Whereas appending would add a node the end of the list, inserting is more general. A node can be inserted anywhere in the middle of a list or at either end. In the following, we assume that node
represents the node after which the new node is to be inserted, and animal
is the node to be inserted. In other words, we are inserting between node
and node->next
.
- Set
animal->next
to benode->next
- Update
node->next
to beanimal
This is simple enough, all we had to do was update the next
fields to sandwich our new node into the list. However, we have to be careful. If we happen to be inserting at the end of the list and if we have a tail
node that tracks the end of the list, we'll need to update this to point at the newly inserted node.
Traversing the list
Once our list is populated, we might want to go through each element of the list. Since we want to start at the beginning, we'll need a reference to the first node in the list. This reference to the first node is called the head
of the list.
Let's try printing the name of each animal in our list. The following assumes we have a variable animal
which is a pointer to our struct.
- Set
animal
to the head of the list - Print
animal->name
- Update
animal
to point to the next node in the list - Repeat the last two steps until we reach the end of the list.
As long as we ensure that the last node in our list has next
set to NULL
, then we can use that property to check whether we have reached the end of the list.
If we think about this, we can come up with a general pattern for traversing the loop
for(animal = head; animal != NULL; animal = animal->next) {
/* ... do stuff ... */
}
List Struct
Everything discussed up to this point is sufficient to build a satisfactory linked list. However, if we are using a head
and tail
, it can sometimes be nice to put these into a struct of their own. This also makes it easy to add other fields that provide information about the list as a whole, such as the length of our list.
struct List {
struct Animal *head;
struct Animal *tail;
int size;
};
Initialising this struct is equivalent to creating an empty list. We initialise the fields to sensible values, NULL for the two pointers and 0 for the size.
Methods
You can think of structs as a primitive version of classes. When you're coding a data structure, like a linked list or a tree, you should write methods to perform operations on your data structure. For any data structure, you'll probably need a constructor and a destructor.
The constructor allocates (using malloc) memory for the data structure and the destructor frees it. The constructor also initialises the data structure, either by setting fields of the struct from the constructor arguments or by setting them to sensible defaults, like NULL
for pointers. These ideas should be familiar to you from Java or Python or C++. In Python, __init__
is the constructor for a class.
For a linked list, we'll need some other methods, for example we might need methods to
append
elements to the endremove
an elementfind
an element to check if its in the list
Writing a function for each operation, like those above, means that you don't have to litter code that's specific to your data structure throughout your code and even better means that you can test the methods to make sure your data structure is robust.