Data Structures Part 2: Arrays

In the previous section, we answered the question of just what a data structure is. Now it’s time to finally start diving in head first and looking at our first real data structure: arrays.

Arrays are the simplest and most fundamental of the data structures we will be covering in this series. In fact, pretty much all programming languages come with arrays built right in!

A Bit About Computer Memory

First of all, yes, that pun is totally intended!

Anyway, before we jump into looking at arrays we need to first understand the basics of computer memory. Don’t worry, we won’t be looking at all of the excruciating details of how the memory works in this series!

When we run a program, that application needs somewhere to store and access all of the data used in the program (integers, strings, etc.). All of this information is stored in the random access memory (RAM). We sometimes also refer to this memory as working memory, or simply just memory.

NOTE: It’s worth mentioning here that this explanation of how computer memory works is a gross over-simplification. The real picture requires a lot more complexity to fully understand, but that’s outside the scope of this series. This explanation should suffice for our purposes.

How RAM is Organized

To understand how computer memory is organized, I want you to picture a bookcase. A very, very tall bookcase. When I say a tall bookcase, I mean one that has (literally) billions of shelves.

Each shelf on this bookcase is assigned a number, which we refer to as it’s address. These memory addresses start at 0 and continue to grow as you move down. So, a visual representation of memory might look like this:

Diagram of how RAM is organized.
Again, this is a bit of an over-simplification, but it works to illustrate the point I want to make. Just remember, there are literally billions of these memory addresses in a modern computer!

Each address in memory can store 8-bits (1-byte) of data. This is refereed to as byte addressable memory.

There’s a whole lot more to know about how memory in a computer works, but this will be enough detail for us to understand arrays. If anyone is interested in a much more in depth look at computer memory let me know in the comments below!

What is an Array?

Okay, with that little bit about memory out of the way, just what is an array?

To answer that question, let’s first recall what a variable is. As you’ll recall, a variable is nothing but a container where we can store a value of some type. So, we could store a single integer in an int variable, or a single character in a char. What if we want to store several values?

One approach might be to create a variable for every value. This isn’t very efficient, nor is it scalable. Just consider an example where we want to store every character in a text file. This file might be thousands of characters long. In addition, what if we want to store text files of different sizes? The single variable approach for every value simply would not work here!

This is where arrays come into play. An array is just a contiguous section of memory. An array is declared with a data type (which is the type of data that will be stored in it. E.G. int, char, float, etc.) and a size, which is the number of slots in the array. So, an int array of size 6 can store 6 integers.

We could declare this int array in C as follows:

int exampleArray[6];

Pretty simple concept so far, right? If not, the following sections should help further clear things up!

How Arrays are Stored in Memory

Now you’re about to see why we had to have that mini-lesson about computer memory at the beginning of this!

We’ve already established that an array is just a contiguous section of memory. Consider the example array we created above in C. When we compile the program, the compiler will set aside 6 contiguous blocks of memory that are big enough to store 6 integers. On most systems, an integer is represented by 4-bits, so the compiler will set aside 6, 4-bit sections of memory, or 24 continuous bits. Our example array would look similar to this in memory:

Example of how an empty array looks in memory.

NOTE: You’ll notice that I have switched to representing the memory for arrays as being horizontal, instead of vertical. This is simply because it’s easier to visualize arrays this way. It works exactly the same way. In this representation, the memory addresses increase as you move to the right.

Accessing Array Members

Notice that each section of memory in the above diagram has a number above it. These are not memory addresses, but rather an offset into the array. We refer to this number as an index. Most languages, including C, start numbering array indices at 0, but there are some oddball languages that start at 1.

If we wanted to access the first (zeroth) element of our array, we could do so with:

exampleArray[0];

Similarly, we could access the 6th element with:

exampleArray[5];

Going Deeper: Want to know some more details of how array indices work? Then this little section is for you! When you access an element in an array, like in the examples above, there is some magic going on behind the scenes. First, the computer determines the starting address for the exampleArray. It will then look at the index you’re calling and calculate the offset into that memory address to access that element. So, let’s say we want to access the 4th element in an array of integers. Each integer takes up 4-bits of storage, so the memory address the computer would read would be the address of the first element of the array + (4 * i), where i is the index we are accessing. In our case, 3 (remember, arrays start at 0). At the end of this post I’ll include a bonus section that explores this idea a bit more!

Let’s Look at an Example

Alright, we’ve got the basic theory behind arrays down, now let’s look at an example. Let’s start by taking a look at the code:

#include <stdio.h>

int main(int argc, char **argv)
{
    // An example array, pre-populated with some values
    int exampleArray[10] = {44, 14, 6, 7, 90, 100, 15, 77, 83, 23};
    // The size of our array
    int n = 10;
    // Counter for the loop
    int i;

    for (i = 0; i < n; i++)
    {
        printf("exampleArray[%d]: %d\n", i, exampleArray[i]);
    }

    return 0;
}

First, on line 6, we instantiate an integer array of size 10. Notice here that I have populated the array with some values.

Then, on lines 12-15, we print out the value of each entry in the array. Running this program would produce the following output:

exampleArray[0]: 44
exampleArray[1]: 14
exampleArray[2]: 6
exampleArray[3]: 7
exampleArray[4]: 90
exampleArray[5]: 100
exampleArray[6]: 15
exampleArray[7]: 77
exampleArray[8]: 83
exampleArray[9]: 23

Pretty straight-forward, eh?

NOTE: If you are curious how to dynamically determine the size of an array (in C), then read the bonus section at the end of this post! It will include how to do that. It’s also worth noting that some other languages make getting the size of an array extremely easy. For example, in Java you could do exampleArray.length. But this isn’t meant to be a lesson in programming languages, just the data structures 🙂

A Few More Things

Let’s cover just a few more things about arrays. First, let’s take a look at how to modify the value of an element of an array. This is extremely straight-forward. For example, let’s say we want to set the 4th element of exampleArray to the value 67. This could be done as follows:

exampleArray[3] = 67;

Just for completeness, let’s modify our example a bit:

#include <stdio.h>

int main(int argc, char **argv)
{
    // An example array, pre-populated with some values
    int exampleArray[10] = {44, 14, 6, 7, 90, 100, 15, 77, 83, 23};
    // The size of our array
    int n = 10;
    // Counter for the loop
    int i;

    for (i = 0; i < n; i++)
    {
        exampleArray[i] = exampleArray[i] * i;
        printf("exampleArray[%d]: %d\n", i, exampleArray[i]);
    }

    return 0;
}

As you can see, this code is almost identical to the old code, with the exception of line 14. Here, we set the array element at the current index of our loop to the value of the current value of that array element at the current array index. Running this will now produce:

exampleArray[0]: 0
exampleArray[1]: 14
exampleArray[2]: 12
exampleArray[3]: 21
exampleArray[4]: 360
exampleArray[5]: 500
exampleArray[6]: 90
exampleArray[7]: 539
exampleArray[8]: 664
exampleArray[9]: 207

Bonus Section: More About Array Indexes

I have marked this section as bonus material, but I feel like it’s well worth a read! I want to take a deeper look at how array indexing works. Let’s consider this example code:

#include <stdio.h>

int main(int argc, char **argv)
{
    // An example array, pre-populated with some values
    int exampleArray[10] = {44, 14, 6, 7, 90, 100, 15, 77, 83, 23};
    // The size of our array
    int n = 10;
    // Counter for the loop
    int i;

    printf("%d\n", exampleArray[3]);

    return 0;
}

At this point, it should be abundantly clear what’s going on here. We are simply printing the value of the 4th element of exampleArray. So, this code will simply print out the value 7 to the console.

That’s all well and good, but let’s look at what exactly is happening when we call exampleArray[3].

First of all, the program has an address to where exampleArray is in memory. This address points to the start of the first item in the array. So, in our case, the address that is stored for exampleArray will point to exampleArray[0]. Let’s just say, for the sake of argument, that this address is 4,120,194,304 (0xf5952d00 in hex).

To get the value at the index 3, there will need to be some magic behind the scenes for the program to know what offset in memory to read from. The first step is that the compiler looks at the type of data stored in our array. In this case, it’s an array of int values. An int is 4-bits.

Now the compiler looks at the index we want to read from, in this case, 3. So, the program needs to load the value that is 3, 4-bit blocks away from the starting address of the array. In other words, it needs to read 4*3 bits ahead of the address it has for the array. So, the final value read would be:

0xf5952d00 + (4 * 3)

So, the value at index 3 is located at the memory address: 4,120,194,316, or 0x4120194316, in hex.

Dynamically Getting the Size of an Array in C

Here’s one more bonus thing to consider, how can we dynamically calculate the size of an array? Let’s reconsider the example from earlier:

#include <stdio.h>

int main(int argc, char **argv)
{
    // An example array, pre-populated with some values
    int exampleArray[10] = {44, 14, 6, 7, 90, 100, 15, 77, 83, 23};
    // The size of our array
    int n = 10;
    // Counter for the loop
    int i;

    for (i = 0; i < n; i++)
    {
        exampleArray[i] = exampleArray[i] * i;
        printf("exampleArray[%d]: %d\n", i, exampleArray[i]);
    }

    return 0;
}

Notice that we are simply storing the size of our array in the variable n. This is not a very scalable way of doing things, so let’s take a look at how it would be done dynamically.

#include <stdio.h>

int main(int argc, char **argv)
{
    // An example array, pre-populated with some values
    int exampleArray[10] = {44, 14, 6, 7, 90, 100, 15, 77, 83, 23};
    // The size of our array
    int n = sizeof(exampleArray) / sizeof(exampleArray[0]);
    // Counter for the loop
    int i;

    for (i = 0; i < n; i++)
    {
        exampleArray[i] = exampleArray[i] * i;
        printf("exampleArray[%d]: %d\n", i, exampleArray[i]);
    }

    return 0;
}

As you can see, this code is almost exactly the same, except now we are dynamically calculating the size of our array.

First, we determine how much room the entire array takes up in memory: sizeof(exampleArray). We then divide that by the size of each element in the array: sizeof(exampleArray[0]). That’s all there is to it!

Conclusion

Well, there’s arrays in a nutshell. It’s important that you have a firm grasp on arrays before continuing on in this series. You will find that they are the fundamental building blocks of all sorts of additional data structures.

Before fully wrapping up, I want to consider the disadvantages of arrays. The biggest one is that, once created, an array cannot be resized. If you want to change the size of an array you’d have to create a whole new one of the new size. When a data structure can’t be edited in this way it’s known as non-mutable.

This has the side effect of meaning that you can’t really delete an element from an array. You could change the value of a particular element, but you can delete it.

Another drawback comes from the fact that the values stored in an array are stored consecutively in memory. This means that operations like insertions and deletions can become costly.

But, that’s enough babbling about arrays for one day!

Table of Contents

  • Part 1: Introduction
  • Part 2: Arrays
  • Part 3: Lists
  • Part 4: Stacks
  • Part 5: Queues
  • Part 6: Sets
  • Part 7: Maps
  • Part 8: Hash Tables
  • Part 9: Trees
  • Part 10: Graphs

1 Comment

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.