By default, Python's arithmetic operators (+, -, *, /, **) operate element-wise on NumPy arrays; matrix multiplication is the exception. For element-wise operations to work, the shapes of the operands must match, and this is where broadcasting comes in: when the shapes do not match, NumPy tries to make them compatible by broadcasting.
In this post, I explain the concept of array broadcasting and how to use it in NumPy.
Limitation with Array Arithmetic
We can perform arithmetic directly on NumPy arrays, such as addition and subtraction.
For example, two arrays can be added together to create a new array in which the values at each index are summed.
Suppose array a is defined as [1, 2, 3] and array b is defined as [1, 2, 3]; adding them together results in a new array with the values [2, 4, 6].
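A minimal sketch of this addition, assuming NumPy is imported as np (the name c for the result is only illustrative):

```python
import numpy as np

# Two one-dimensional arrays with the same shape
a = np.array([1, 2, 3])
b = np.array([1, 2, 3])

# Element-wise addition: c[i] = a[i] + b[i]
c = a + b
print(c.tolist())  # [2, 4, 6]
```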
Strictly speaking, element-wise arithmetic may only be performed on arrays that have the same number of dimensions and the same size along each dimension.
This means that a one-dimensional array with the length of 10 can only perform arithmetic with another one-dimensional array with the length 10.
This restriction on array arithmetic is quite limiting. Thankfully, NumPy provides a built-in mechanism that allows arithmetic between arrays of differing sizes.
Array Broadcasting
The term broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations. Taken strictly, arrays with different sizes cannot be added, subtracted, or otherwise combined in arithmetic.
One way to overcome this is to duplicate the smaller array so that it has the same dimensionality and size as the larger array. This is called array broadcasting, and NumPy applies it automatically when performing array arithmetic, which can greatly shorten and simplify our code.
The duplication is only conceptual: NumPy reuses the original values without actually making copies, so broadcasting operations stay efficient in both memory and computation.
NumPy operations are usually done element by element, which requires the two arrays to have exactly the same shape:
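A sketch of such a same-shape operation (the values are illustrative and not taken from the original listing):

```python
import numpy as np

# Both operands have shape (3,), so elements are paired one-to-one
a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 2.0, 2.0])

product = a * b
print(product.tolist())  # [2.0, 4.0, 6.0]
```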
NumPy's broadcasting rule relaxes this constraint when the arrays' shapes meet certain conditions. The simplest broadcasting example occurs when an array and a scalar value are combined in an operation:
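A minimal sketch of the scalar case, again with illustrative values:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = 2.0  # a plain Python scalar, not an array

# NumPy broadcasts the scalar across every element of a
scaled = a * b
print(scaled.tolist())  # [2.0, 4.0, 6.0]
```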
The result is equivalent to the previous example, where b was an array. We can think of the scalar b as being stretched during the arithmetic operation into an array with the same shape as a. The new elements in b, as shown in figure (1) below, are simply copies of the original scalar. The stretching analogy is only conceptual: NumPy is smart enough to use the original scalar value without actually making copies, so that broadcasting operations are as memory and computationally efficient as possible. Because example (2) moves less memory around during the multiplication (b is a scalar, not an array), it is about 10% faster than example (1) in a test using the standard NumPy on Windows 2000 with one-million-element arrays.
In the simplest example of broadcasting, the scalar b is stretched to become an array with the same shape as a, so the shapes are compatible for element-by-element multiplication.
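One way to see that the stretching is only conceptual is to inspect a broadcast view directly. The sketch below uses np.broadcast_to, which returns a read-only view rather than a copy. Wrapping the scalar in a zero-dimensional array is a choice made here for illustration only.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array(2.0)  # zero-dimensional array holding a single value

# View b as if it had the same shape as a; no data is copied
stretched = np.broadcast_to(b, a.shape)

print(stretched.shape)     # (3,)
print(stretched.strides)   # (0,)
print(stretched.tolist())  # [2.0, 2.0, 2.0]
```

A stride of 0 means that stepping to the next element does not move in memory, so all three elements read the same stored value.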
The rule governing whether arrays have compatible shapes for broadcasting can be stated compactly.
The Broadcasting Rule
In order to broadcast, at least one of the following must hold:
- The arrays all have exactly the same shape.
- The arrays all have the same number of dimensions, and the length of each dimension is either a common length or 1.
- The arrays that have too few dimensions can have their shapes prepended with dimensions of length 1 so that the second condition is satisfied.
If none of these conditions is met, a ValueError exception is raised (in current NumPy the message reads "operands could not be broadcast together"), indicating that the arrays have incompatible shapes. The size of the result array created by a broadcast operation is the maximum size along each dimension of the input arrays.
Note that the rule does not say anything about the two arrays needing to have the same number of dimensions. So, for example, if you have a 256 x 256 x 3 array of RGB values and you want to scale each color in the image by a different value, you can multiply the image by a one-dimensional array with 3 values. Lining up the sizes of the trailing axes of these arrays according to the broadcast rule shows that they are compatible.
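The RGB case can be sketched as follows. The image contents and the scale factors are made up for illustration; only the shapes come from the text.

```python
import numpy as np

# Hypothetical image: 256 x 256 pixels with 3 color channels (R, G, B)
image = np.ones((256, 256, 3))

# One scale factor per channel (illustrative values)
scale = np.array([0.5, 1.0, 2.0])

# Shapes are aligned starting from the trailing axis:
#   image  (3d array): 256 x 256 x 3
#   scale  (1d array):             3
#   result (3d array): 256 x 256 x 3
result = image * scale

print(result.shape)           # (256, 256, 3)
print(result[0, 0].tolist())  # [0.5, 1.0, 2.0]
```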
In the following example, both the A and B arrays have axes with length one that are expanded to a larger size in a broadcast operation.
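A sketch with shapes chosen for illustration, in which each array has length-one axes that get expanded:

```python
import numpy as np

A = np.zeros((8, 1, 6, 1))  # 4d array
B = np.zeros((7, 1, 5))     # 3d array

# Aligned from the right, B is treated as 1 x 7 x 1 x 5:
#   A      (4d array): 8 x 1 x 6 x 1
#   B      (3d array):     7 x 1 x 5
#   result (4d array): 8 x 7 x 6 x 5
result = A + B
print(result.shape)  # (8, 7, 6, 5)
```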
Below are several code examples and graphical representations that help make the broadcast rule visually obvious. Example 3 adds a one-dimensional array to a two-dimensional array:
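A sketch of this kind of addition, with values chosen here for illustration:

```python
import numpy as np

a = np.array([[ 0.0,  0.0,  0.0],
              [10.0, 10.0, 10.0],
              [20.0, 20.0, 20.0],
              [30.0, 30.0, 30.0]])  # shape (4, 3)
b = np.array([1.0, 2.0, 3.0])       # shape (3,)

# b is broadcast across the rows of a
result = a + b
print(result.shape)  # (4, 3)
print(result.tolist())
```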
As shown in figure (2) below, b is added to each row of a. An operation between a two-dimensional array and a one-dimensional array broadcasts if the number of elements in the 1-d array matches the number of columns in the 2-d array.
When b is longer than the rows of a, as in figure (3) below, an exception is raised because of the incompatible shapes. When the trailing dimensions of the arrays are unequal, broadcasting fails because it is impossible to align the values in the rows of the first array with the elements of the second array for element-by-element addition.
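The failure can be reproduced with a small sketch. The shapes are illustrative, and the exact wording of the error message may differ between NumPy versions.

```python
import numpy as np

a = np.zeros((4, 3))                # rows of length 3
b = np.array([1.0, 2.0, 3.0, 4.0])  # length 4, longer than a row of a

try:
    a + b
    error_message = None
except ValueError as exc:
    # Trailing dimensions 3 and 4 differ and neither is 1
    error_message = str(exc)

print(error_message)
```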
Broadcasting provides a convenient way of taking the outer product (or any other outer operation) of two arrays. The following example shows an outer addition operation of two 1-d arrays.
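A sketch of an outer addition with illustrative values. The comparison with np.add.outer is added here as a cross-check only.

```python
import numpy as np

a = np.array([0.0, 10.0, 20.0, 30.0])  # shape (4,)
b = np.array([1.0, 2.0, 3.0])          # shape (3,)

# a[:, np.newaxis] has shape (4, 1); broadcasting it against (3,) gives (4, 3)
outer_sum = a[:, np.newaxis] + b

print(outer_sum.shape)  # (4, 3)
print(outer_sum.tolist())

# Same result as NumPy's explicit outer operation
print(np.array_equal(outer_sum, np.add.outer(a, b)))  # True
```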
Here the newaxis index operator inserts a new axis into a, making it a two-dimensional 4x1 array. Figure (4) illustrates the stretching of both arrays to produce the desired 4x3 output array.
Limitations of Broadcasting
- Broadcasting is a handy shortcut that proves very useful in practice when working with NumPy arrays.
- That being said, it does not work for all cases, and in fact imposes a strict rule that must be satisfied for broadcasting to be performed.
- Arithmetic, including broadcasting, can only be performed when the sizes of each pair of corresponding dimensions are equal or one of them is 1. The dimensions are considered in reverse order, starting with the trailing dimension; for example, columns are compared before rows in the two-dimensional case.
- This makes more sense when we consider that NumPy in effect pads missing dimensions with a size of "1" when comparing arrays.
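The comparison described in the list above can be sketched in plain Python. The helper broadcast_shape is hypothetical, written here only to illustrate the rule; it is not part of NumPy. Its output is checked against the shape that np.broadcast reports.

```python
import numpy as np

def broadcast_shape(*shapes):
    """Hypothetical helper: return the broadcast shape or raise ValueError."""
    ndim = max(len(s) for s in shapes)
    # Pad shorter shapes on the left with 1s so trailing dimensions line up
    padded = [(1,) * (ndim - len(s)) + tuple(s) for s in shapes]
    result = []
    for sizes in zip(*padded):
        # Along each dimension, all sizes other than 1 must agree
        non_one = {n for n in sizes if n != 1}
        if len(non_one) > 1:
            raise ValueError(f"incompatible sizes {sizes}")
        result.append(non_one.pop() if non_one else 1)
    return tuple(result)

print(broadcast_shape((4, 3), (3,)))             # (4, 3)
print(broadcast_shape((8, 1, 6, 1), (7, 1, 5)))  # (8, 7, 6, 5)

# Cross-check against NumPy itself
print(np.broadcast(np.empty((4, 3)), np.empty((3,))).shape)  # (4, 3)
```

Calling broadcast_shape((4, 3), (4,)) raises ValueError, because the trailing sizes 3 and 4 differ and neither is 1.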
References
i) https://numpy.org/doc/stable/user/basics.broadcasting.html
ii) https://numpy.org/doc/stable/user/theory.broadcasting.html#array-broadcasting-in-numpy