We interview a lot of people at Excella. Many people we interview for developer positions have many years of experience under their belts, come from prestigious engineering schools, or have worked on million-dollar systems.
However, you’d be surprised by the amount of people who miss one of the most common questions that we ask: “What is the difference between an array and a list?” and “When would you use one over the other?”
Admittedly, it’s a loaded question for a developer. Both are a very distinct and fundamental data structure, and in an age where we develop using CLR’s, garbage collection, and optimizing compilers – it’s easy to forget that there are differences between the two.
This is increasingly apparent as our applications are being deployed on Mobile phones or IoT devices with battery, memory, and processing power constraints. Being smart with your processing cycles, memory usage, and consequently power consumption can be the difference between a total flop and a huge success.
Here, we’ll clear up the mystery so that you can nail the question in your next interview.
Too Long; Didn’t Read
What’s the Difference between Arrays and Lists?
The difference between an array and a list is that the array has statically allocated memory, whereas lists resize dynamically as the collection grows.
When Would You Use Arrays?
Realistically, almost never. However, you might want to use arrays when you know the size of the data that you are going to collect, or if the speed for reading and writing data is crucial to the application. For example, say that you want to shave a couple seconds off a batch processing job and you’ve exhausted all other options.
When Would You Use Lists?
When the collection of data you are storing is likely to be resized, you don’t know how large the collection will become, or if there are many changes that are going to be made to the collection.
The (Long) Argument for Arrays
Arrays allocate memory statically, meaning that they are a specified size and cannot resize. This becomes a problem when you want to increase the capacity of your array. However, it does have several advantages.
The first advantage is performance. Since arrays are statically allocated spaces in memory and the time that it takes to access a single element in the array is O(1). If you don’t understand Big-O notation, it basically means that the number of checks the application must perform in order to confirm that it has found its target in a collection is 1.
Think of it as you’re looking at a carton of eggs with one brown egg, and you’re asked “point at the brown egg.” You can see the entire carton, so you can instantly spot the brown egg. Corollary, finding a single element in a list is like someone handing you a single egg at a time, and asking you to tell them which is the brown egg. It will take the number of eggs before the brown egg, (or O(N)) in order to find a single element.
The second advantage is the amount of memory that it takes to store information in an array. In .NET, Lists incur a slight amount of overhead for the additional functionality that is offered when using a List. I’ll explain why later. However, because arrays are stored in a contiguous block of memory, they are as compact as they can possibly be.
If Arrays Are So Great, Why Do We Use Lists?
While you may receive a performance gain when using arrays, in the real world, you’ll probably use Lists exclusively. In .NET and most other managed languages, you’ll very rarely see arrays being defined, and if they are, there better be a very good reason for using them.
Unlike arrays, Lists dynamically allocate memory when the collection grows. This gives Lists a huge advantage over their statically allocated brethren. The reason? Arrays cannot grow in size! When an array reaches its capacity, a new array will actually be allocated in memory, the contents of the original array are copied over, and the original array is deleted. This is a very costly procedure and negates any performance improvements you would get from using an array if resizing becomes a frequent operation.
Using my egg carton analogy, imagine that your carton is full, and you’re given a new egg. An array would get a new carton with one more slot, transfer all the eggs, and throw away the original carton!
You can consider Lists as a collection of arrays. When a List reaches its capacity, instead of copying over the contents to another List, a new block of memory is allocated (often doubling the size) and is tied to the original list. This ensures that there is always room for more elements.
Lists try to take advantage of the speed of arrays while allocating memory dynamically as needed. This, of course, means that there will be some wasted space than an array would normally use.
Again, comparing it to egg cartons: if a carton of eggs is about to reach its capacity, a new carton is brought in and a string is tied to the cartons so we know they’re all meant to be sold together. This means that there will always be space for enough eggs, but that some of the cartons will be empty.
In summary, there are times when you need to use arrays, but they should be only used in very specific circumstances. Lists are considered the de-facto data structure to use in C# because they combine the performance of an array with the dynamic nature of a linked-list.