Arrays and Vectors
Overview
Teaching: 20 min
Exercises: 20 minQuestions
How can I create collections of things in C++
How can I get and set values in a collection?
If I don’t know the size of the collection in advance, what do I do?
Objectives
Understand static arrays of objects in C++ for creating collections.
Know how to use the standard library vector to create variable sized collections.
Be able to access members of the collection and update, insert and delete values.
N.B.
I feel after having written this episode that it would be very useful to have covered
for
loops at this point, even just to be able to write an example where one
can loop over the container.
It’s also appropriate to introduce container iteration loops alongside containers.
Also, any iterator type really, really should be handled with auto
…
Introduction
We saw previously how to create variables that represented the basic types of C++. However, usually in scientific and data intensive problems we need to deal a huge amount of data and these are usually grouped into collections of numbers rather than being created and accessed one by one (which would be incredibly tedious!).
Fixed Size Containers: Arrays
In C++ we have the possibility to create arrays of the basic types (well, we’ll see later this can be extended to virtually any object in C++). Here’s a simple example:
#include <iostream>
#include <array>
int main() {
std::array<double, 4> four_vector {2., 3., 4., -10.};
std::cout << "This 4-vector has time component " << four_vector[3] << std::endl;
four_vector[3] = 7.5;
std::cout << "This 4-vector now has time component " << four_vector[3] << std::endl;
return 0;
}
What do we see here?
First, arrays in modern C++ are part of the standard library
and first we need to tell the compiler we want to use that
functionality by including the header file array
.
Second, we define an array in C++ using a syntax like this:
array<TYPE, NUMBER>
This is actually an example of what’s called a template in C++, which we’ll study in more detail later, but in this simple example it’s pretty clear how to use it passing these two arguments in.
We’re also showing you a way to initialise multiple value objects like arrays
using a standard C++ syntax called uniform initialization: {a, b, c, ... }
.
Third, data in the array is accessed using the [n]
notation where
n
counts from 0. So an array of size N
has elements 0
, 1
,
2
, ...
, N-1
.
When we compile and run we get:
$ g++ -std=c++17 array-basic.cpp -o array-basic
$ ./array-basic
This 4-vector has time component -10
This 4-vector now has time component 7.5
$
Note that arrays are always a fixed size in C++.
Array indexes
As we saw, array indexes run from
0
toN-1
for an array of sizeN
. If you accidentally try to access a value outside of this range then the results are undefined behavior - you’re program might crash, or worse it might silently just produce wrong answers. So this is something to be very careful of.Arrays also support an access method called
at
which will check that a valid data element is being accessed:
four_vector.at(3)
will access the third element, but will fail (throwing an exception - more on them latter) if the third element wasn’t valid.
If you needed to know the number of elements in an array, this code will return that value to you:
four_vector.size()
EXERCISE FOR ARRAYS HERE
Blah blah blah
Variable Size Containers: Vectors
Arrays are great to use when the data size is known up-front. However, in many cases we might not know how large a container we need at the beginning. In this case C++ comes to our aid with a variable sized container type called a vector.
One can use a vector in much the same way as an array:
#include <iostream>
#include <vector>
int main() {
std::vector<double> n_vector {2., 3., 4., -10.};
std::cout << "This n-vector has time component " << n_vector[3] << std::endl;
n_vector[3] = 7.5;
std::cout << "This n-vector now has time component " << n_vector[3] << std::endl;
n_vector[0]++;
std::cout << "The first element of the vector is " << n_vector.front() <<
" and the last is " << n_vector.back() << std::endl;
std::cout << "The vector size is " << n_vector.size() << std::endl;
return 0;
}
This looks very like our array code above and one of the nice things about these C++ containers is that they are used in very similar ways!
- Instead of including the
array
header, this time we usevector
. - When we define our vector we don’t need to give the size, as this is mutable.
- In this case, we initialised the vector with 4 elements the size of the vector is 4.
- Accessing the elements of a vector uses the same
[]
notation as arrays.- Vectors also have the methods
front()
andback()
that access the start and the end of the container.
- Vectors also have the methods
- The size of the vector is returned by
size()
, just like it was for an array.
Compiling and running the code produces the expected results:
$ g++ --std=c++17 -Wall vector-basic.cpp -o vector-basic
$ ./vector-basic
This n-vector has time component -10
This n-vector now has time component 7.5
The first element of the vector is 3 and the last is 7.5
The vector size is 4
$
Adding and Deleting Elements from Vectors
Adding
To add a new element to a vector there is a the push_back()
method:
#include <iostream>
#include <vector>
int main() {
std::vector<double> n_vector {};
n_vector.push_back(1.5);
n_vector.push_back(2.7);
std::cout << "Vector elements are " << n_vector[0] << " and " << n_vector[1] <<
" and the size is " << n_vector.size() << std::endl;
n_vector.push_back(3.9);
std::cout << "Vector now has " << n_vector.size() << " elements and the last value is " << n_vector[n_vector.size()-1] << std::endl;
return 0;
}
The example shows that the push_back()
extends the container by one element, adding the argument
to as the last value.
$ g++ --std=c++17 -Wall vector-push-back.cpp -o vector-push-back
$ ./vector-push-back
Vector elements are 1.5 and 2.7 and the size is 2
Vector now has 3 elements; and the last value is 3.9
If you need to add an element to an arbitrary position in a vector then you can use the
insert()
method:
#include <iostream>
#include <vector>
int main() {
std::vector<double> n_vector {-1.0, -2.0, -3.0};
n_vector.insert(n_vector.begin(), 100.0);
std::cout << "The first vector element is " << n_vector[0] <<
" and the size is " << n_vector.size() << std::endl;
n_vector.insert(n_vector.begin()+1, 200.0);
std::cout << "The second vector element is " << n_vector[1] <<
" and the size is " << n_vector.size() << std::endl;
return 0;
}
Here we used the method begin()
to get the start of the vector and you can see
with the second insert
that adding values from this starting point
work as you would expect (begin()+3
, for example, is the 4th element position).
$ g++ --std=c++17 -Wall vector-insert.cpp -o vector-insert
$ ./vector-insert
The first vector element is 100 and the size is 4
The second vector element is 200 and the size is 5
Iterator arithmetic
Formally in C++ the vector
v.begin()
method returns what is called an iterator that can be used to identify a position inside a container. Adding and subtracting then moves the access point up and down the vector, respectively, e.g.,
auto second_element = v.begin()+1
Just be careful that you don’t try to insert into an invalid place in the vector as bad things will happen, i.e., inserting into a location before the beginning of the vector or inserting into a place more than one step beyond the end. Inserting right at the end is valid and makes
insert()
behave likepush_back()
:
v.insert(v.end(), VALUE)
behaves the same as
v.push_back(VALUE)
Note that
v.end()
is used a lot in C++ and returns the position in the container one step beyond the last valid value. (Thus in loops the comparison is always less than:still_valid < v.end()
.)
EXERCISE FOR VECTORS HERE
Blah blah blah
Deleting
If you want to delete the final element of the vector the pop_back()
method will
do this. Note this method returns nothing, i.e., it’s a void
function in C++.
#include <iostream>
#include <vector>
int main() {
std::vector<double> n_vector {-1.0, -2.0, -3.0};
n_vector.pop_back();
std::cout << "After pop the last vector element is " << n_vector.back() <<
" and the size is " << n_vector.size() << std::endl;
return 0;
}
$ g++ --std=c++17 -Wall vector-pop-back.cpp -o vector-pop-back
$ ./vector-pop-back
After pop the last vector element is -2 and the size is 2
And to delete an arbitrary position, use the iterator position that we
saw above with erase()
.
#include <iostream>
#include <vector>
int main() {
std::vector<double> n_vector {-1.0, -2.0, -3.0, -4.0, -5.0};
n_vector.erase(n_vector.begin()+2);
std::cout << "After single erase the third vector element is " << n_vector[2] <<
" and the size is " << n_vector.size() << std::endl;
n_vector.erase(n_vector.begin()+2, n_vector.end());
std::cout << "After a block erase the vector size is " << n_vector.size() << std::endl;
return 0;
}
$ g++ --std=c++17 -Wall vector-erase.cpp -o vector-erase
$ ./vector-erase
After single erase the third vector element is -4 and the size is 4
After a block erase the vector size is 2
Here we illustrated two ways to use erase
, first with a single element
and then with a range where all the elements between the starting
point and before the end point are removed.
If you want to delete all of the current entries in a vector then use
the clear()
method:
#include <iostream>
#include <vector>
int main() {
std::vector<double> n_vector {-1.0, -2.0, -3.0, -4.0, -5.0};
std::cout << "Vector initial size is " << n_vector.size() << std::endl;
n_vector.clear();
std::cout << "After a clear() the vector size is " << n_vector.size() << std::endl;
return 0;
}
$ g++ --std=c++17 -Wall vector-clear.cpp -o vector-clear
$ ./vector-clear
Vector initial size is 5
After a clear() the vector size is 0
EXERCISE FOR VECTORS EXTENSION HERE
Blah blah blah
How vector storage works in C++
It’s useful to know they way in which a vector works in C++ so you understand how to use it most effectively. Vectors are guaranteed in C++ to occupy contiguous memory, which means that processing a vector’s elements one after another is usually rather efficient on modern CPUs.
However, that does mean that if elements are inserted at the beginning or in the middle of a vector then all the elements after it have to be moved, which is slower. The advice is always try to fill up vectors at the end!
As a vector’s size is mutable the library will usually allocate some more space for the vector elements than are currently used. If there is spare space in the underlying allocation then
push_back()
will be very quick. However, when the underlying storage is exhausted the library has to reallocate memory in order to add a new element. Roughly this process is:
- Call to
push_back()
- Realise that the current storage if full.
- Allocate a new, larger, storage block.
- Copy each element in turn from the old block to the new one.
- Add the final, new, element.
- Free the old storage.
This is slow, and gets slower as the vector grows in size. To offset this problem, use the
reserve(N)
method of a vector that will tell the library to pre-allocate enough space forN
elements in advance. e.g.,
vector<double> v;
v.reserve(1000000);
for (unsigned int i=0; i<1000000; ++i) v.push_back(i);
will avoid a lot of unnecessary reallocations and copying compared to the same code without the
reserve()
call.
Key Points
C++ can assign fixed sized collections as arrays.
C++ can manage variable sized collections as vectors of objects.
For vectors, data elements can be added and removed, as well as updated.