Master NumPy array manipulation for machine learning & data science. Learn to reshape, index, slice, and modify ndarrays effectively with Python.

NumPy Array Manipulation

NumPy is a fundamental package for scientific computing in Python, offering powerful tools for working with arrays. This guide covers essential NumPy routines for manipulating elements within ndarray objects, categorized for clarity.

1. Changing Shape

These routines alter an array's dimensions without changing its data.

numpy.reshape(a, newshape, order='C'): Gives a new shape to an array without changing its data. The total number of elements must remain the same.

import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
new_arr = arr.reshape((2, 3))
print(new_arr)
# Output:
# [[1 2 3]
#  [4 5 6]]
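
The constraint that the element count must stay the same can be seen in a short sketch. It assumes a 12-element array chosen purely for illustration, and uses -1 to let NumPy infer one dimension from the others.

```python
import numpy as np

arr = np.arange(12)

# -1 asks NumPy to infer that dimension from the total size (12 / 3 = 4).
inferred = arr.reshape((3, -1))
print(inferred.shape)
# Output: (3, 4)

# A shape whose product differs from arr.size raises ValueError.
try:
    arr.reshape((5, 3))
except ValueError as exc:
    failed = True
    print("reshape failed:", exc)
else:
    failed = False
```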

ndarray.flat: A 1-D iterator over the array. Allows iterating through array elements one by one, regardless of the original shape.

import numpy as np
arr = np.array([[1, 2], [3, 4]])
for element in arr.flat:
    print(element, end=' ')
# Output: 1 2 3 4

ndarray.flatten(order='C'): Returns a copy of the array collapsed into one dimension. This is a method of the ndarray object, not a top-level numpy function.

import numpy as np
arr = np.array([[1, 2], [3, 4]])
flattened_arr = arr.flatten()
print(flattened_arr)
# Output: [1 2 3 4]

numpy.ravel(a, order='C'): Returns a contiguous flattened array. This function returns a view of the original array whenever possible, making it more memory-efficient than flatten().
```
import numpy as np
arr = np.array([[1, 2], [3, 4]])
raveled_arr = np.ravel(arr)
print(raveled_arr)
# Output: [1 2 3 4]
```
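
The difference between a view and a copy is easiest to see by writing into the result. The sketch below assumes a small C-contiguous array, which is the case where ravel() can return a view; for non-contiguous input it may have to copy.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])
flat_copy = arr.flatten()  # always a copy
flat_view = arr.ravel()    # a view here, because arr is C-contiguous

print(np.shares_memory(arr, flat_copy))
# Output: False
print(np.shares_memory(arr, flat_view))
# Output: True

flat_view[0] = 99   # writes through to arr
print(arr[0, 0])
# Output: 99
print(flat_copy[0])  # the copy is unaffected
# Output: 1
```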

numpy.pad(array, pad_width, mode='constant', **kwargs): Returns a padded array with its shape increased according to pad_width. Useful for adding borders or margins to arrays.

import numpy as np
arr = np.array([1, 2, 3])
padded_arr = np.pad(arr, (1, 2), 'constant', constant_values=(0, 10))
print(padded_arr)
# Output: [ 0  1  2  3 10 10]

2. Transpose Operations

Transpose operations swap rows and columns in 2D arrays or rearrange axes in higher-dimensional arrays.

numpy.transpose(a, axes=None): Permutes the dimensions of an array. By default, it reverses the order of axes.

import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
transposed_arr = np.transpose(arr)
print(transposed_arr)
# Output:
# [[1 4]
#  [2 5]
#  [3 6]]
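
For arrays with more than two dimensions, the axes argument decides where each axis ends up. A minimal sketch, assuming an illustrative array of shape (2, 3, 4):

```python
import numpy as np

arr = np.zeros((2, 3, 4))

# Default: the axis order is reversed.
default_t = np.transpose(arr)
print(default_t.shape)
# Output: (4, 3, 2)

# Explicit permutation: swap the first two axes, keep the last one in place.
custom_t = np.transpose(arr, axes=(1, 0, 2))
print(custom_t.shape)
# Output: (3, 2, 4)
```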

ndarray.T: A shorthand for numpy.transpose(). It's an attribute that returns the transposed array.

import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.T)
# Output:
# [[1 4]
#  [2 5]
#  [3 6]]

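numpy.rollaxis(a, axis, start=0): Rolls the specified axis backward until it lies at position start. Only the order of the axes changes; element values are not shifted. NumPy's documentation recommends numpy.moveaxis for new code.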

import numpy as np
arr = np.arange(6).reshape((2,3))
rolled_arr = np.rollaxis(arr, 1, 0)
print(rolled_arr)
# Output:
# [[0 3]
#  [1 4]
#  [2 5]]

numpy.swapaxes(a, axis1, axis2): Interchanges two axes of an array.

import numpy as np
arr = np.arange(6).reshape((2,3))
swapped_arr = np.swapaxes(arr, 0, 1)
print(swapped_arr)
# Output:
# [[0 3]
#  [1 4]
#  [2 5]]

numpy.moveaxis(a, source, destination): Moves axes of an array to new positions. Allows for flexible rearrangement of dimensions.

import numpy as np
arr = np.random.rand(3, 4, 5)
moved_arr = np.moveaxis(arr, 0, -1)
print(moved_arr.shape)
# Output: (4, 5, 3)

3. Changing Dimensions

These functions reshape or restructure arrays without altering data.

numpy.broadcast: Produces an object that mimics broadcasting of its inputs, exposing the resulting shape and an iterator over the paired elements. It is mainly useful for inspecting how broadcasting will behave.
numpy.broadcast_to(array, shape, subok=False): Broadcasts an array to a new shape and returns a read-only view. Comparing shapes from the trailing dimension backward, each dimension of the array must either equal the target dimension or be 1.
```
import numpy as np
arr = np.array([1, 2, 3])
broadcasted_arr = np.broadcast_to(arr, (3, 3))
print(broadcasted_arr)
# Output:
# [[1 2 3]
#  [1 2 3]
#  [1 2 3]]
```
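
numpy.broadcast has no example above, so here is a small sketch. The arrays x and y are illustrative choices, picked so that their shapes (3, 1) and (3,) broadcast to (3, 3). It also checks that the result of broadcast_to is not writeable.

```python
import numpy as np

x = np.array([[1], [2], [3]])  # shape (3, 1)
y = np.array([10, 20, 30])     # shape (3,)

b = np.broadcast(x, y)
print(b.shape)
# Output: (3, 3)

# Iterating yields the element pairs that broadcasting would combine.
pairs = list(b)
print(len(pairs))
# Output: 9

# broadcast_to returns a read-only view rather than a full copy.
view = np.broadcast_to(y, (3, 3))
print(view.flags.writeable)
# Output: False
```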

numpy.expand_dims(a, axis): Expands the shape of an array by inserting a new axis at the specified position.

import numpy as np
arr = np.array([1, 2, 3])
expanded_arr = np.expand_dims(arr, axis=0)
print(expanded_arr)
# Output: [[1 2 3]]
print(expanded_arr.shape)
# Output: (1, 3)

numpy.squeeze(a, axis=None): Removes single-dimensional entries from the shape of an array.

import numpy as np
arr = np.array([[1, 2, 3]])
squeezed_arr = np.squeeze(arr)
print(squeezed_arr)
# Output: [1 2 3]
print(squeezed_arr.shape)
# Output: (3,)

4. Joining Arrays

Joining combines multiple arrays along specified axes.

numpy.concatenate((a1, a2, ...), axis=0, out=None): Joins a sequence of arrays along an existing axis.

import numpy as np
arr1 = np.array([1, 2])
arr2 = np.array([3, 4])
concatenated_arr = np.concatenate((arr1, arr2))
print(concatenated_arr)
# Output: [1 2 3 4]

numpy.stack(arrays, axis=0, out=None): Joins arrays along a new axis. This increases the dimensionality of the resulting array.

import numpy as np
arr1 = np.array([1, 2])
arr2 = np.array([3, 4])
stacked_arr = np.stack((arr1, arr2))
print(stacked_arr)
# Output:
# [[1 2]
#  [3 4]]
print(stacked_arr.shape)
# Output: (2, 2)

numpy.hstack(tup): Stacks arrays horizontally (column-wise). Equivalent to concatenation along axis 1, except for 1-D arrays, which are concatenated along axis 0 as in the example below.

import numpy as np
arr1 = np.array([1, 2])
arr2 = np.array([3, 4])
hstacked_arr = np.hstack((arr1, arr2))
print(hstacked_arr)
# Output: [1 2 3 4]
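
With 2-D inputs the column-wise behavior becomes visible. A short sketch, assuming two illustrative arrays with the same number of rows:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5], [6]])

h = np.hstack((a, b))
print(h)
# Output:
# [[1 2 5]
#  [3 4 6]]

# For 2-D inputs this matches concatenation along axis 1.
same = np.array_equal(h, np.concatenate((a, b), axis=1))
print(same)
# Output: True
```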

numpy.vstack(tup): Stacks arrays vertically (row-wise). Equivalent to concatenation along axis 0 after 1-D arrays of shape (N,) have been reshaped to (1, N).

import numpy as np
arr1 = np.array([1, 2])
arr2 = np.array([3, 4])
vstacked_arr = np.vstack((arr1, arr2))
print(vstacked_arr)
# Output:
# [[1 2]
#  [3 4]]

numpy.dstack(tup): Stacks arrays depth-wise (along the third axis).

import numpy as np
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
dstacked_arr = np.dstack((arr1, arr2))
print(dstacked_arr)
# Output:
# [[[1 5]
#   [2 6]]
#
#  [[3 7]
#   [4 8]]]

numpy.column_stack(tup): Stacks 1-D arrays as columns into a 2-D array. 2-D arrays are stacked as-is, as with hstack; 1-D arrays are first turned into columns.

import numpy as np
arr1 = np.array([1, 2])
arr2 = np.array([3, 4])
col_stacked_arr = np.column_stack((arr1, arr2))
print(col_stacked_arr)
# Output:
# [[1 3]
#  [2 4]]

numpy.row_stack(tup): Stacks 1-D arrays as rows into a 2-D array. It is an alias for vstack and is deprecated as of NumPy 2.0, so prefer vstack in new code.

5. Splitting Arrays

Splitting divides arrays into smaller arrays along specified axes.

numpy.split(ary, indices_or_sections, axis=0): Splits an array into multiple sub-arrays. indices_or_sections can be an integer (the number of equal-sized sub-arrays; the axis length must divide evenly, otherwise a ValueError is raised) or a sorted list of indices at which the splits occur.

import numpy as np
arr = np.arange(10)
# An integer splits the array into that many equal parts
sub_arrays = np.split(arr, 2)
print(sub_arrays)
# Output: [array([0, 1, 2, 3, 4]), array([5, 6, 7, 8, 9])]
# A list of indices marks where each split occurs
sub_arrays_by_index = np.split(arr, [3, 7])
print(sub_arrays_by_index)
# Output: [array([0, 1, 2]), array([3, 4, 5, 6]), array([7, 8, 9])]

numpy.hsplit(ary, indices_or_sections): Splits an array horizontally (column-wise). Equivalent to split along axis 1.

import numpy as np
arr = np.arange(12).reshape((3, 4))
h_split_arr = np.hsplit(arr, 2)
print(h_split_arr[0])
# Output:
# [[0 1]
#  [4 5]
#  [8 9]]

numpy.vsplit(ary, indices_or_sections): Splits an array vertically (row-wise). Equivalent to split along axis 0.

import numpy as np
arr = np.arange(12).reshape((3, 4))
v_split_arr = np.vsplit(arr, 3)
print(v_split_arr[0])
# Output:
# [[0 1 2 3]]

numpy.dsplit(ary, indices_or_sections): Splits an array along the third axis (depth).
numpy.array_split(ary, indices_or_sections, axis=0): Splits an array into multiple sub-arrays, even if the split does not result in equal sized parts.
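
Neither dsplit nor array_split has an example above, so the following sketch shows both. The array sizes are illustrative: 10 elements split three ways to show uneven parts, and a (2, 2, 4) array split along its third axis.

```python
import numpy as np

arr = np.arange(10)
# np.split(arr, 3) would raise ValueError because 10 is not divisible by 3;
# array_split allows the parts to differ in size.
parts = np.array_split(arr, 3)
print([p.tolist() for p in parts])
# Output: [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]

cube = np.arange(16).reshape((2, 2, 4))
front, back = np.dsplit(cube, 2)  # split along axis 2
print(front.shape, back.shape)
# Output: (2, 2, 2) (2, 2, 2)
```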

6. Adding / Removing Elements

These functions allow insertion or deletion of elements.

numpy.resize(a, new_shape): Returns a new array with specified shape. If the new shape requires more elements than the original array, the original elements are repeated. If fewer elements are needed, the extra elements are discarded.
```
import numpy as np
arr = np.array([1, 2, 3])
resized_arr = np.resize(arr, (2, 3))
print(resized_arr)
# Output:
# [[1 2 3]
#  [1 2 3]]
```

numpy.append(arr, values, axis=None): Appends values to the end of an array. If axis is specified, the arrays are joined along that axis.

import numpy as np
arr = np.array([1, 2, 3])
appended_arr = np.append(arr, [4, 5])
print(appended_arr)
# Output: [1 2 3 4 5]
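
The effect of the axis argument shows up with a 2-D array. In this sketch (the array grid is an illustrative choice), leaving axis unset flattens everything first, while setting it requires the appended values to match the array's shape along the other axes.

```python
import numpy as np

grid = np.array([[1, 2], [3, 4]])

flat = np.append(grid, [[5, 6]])  # axis=None: both inputs are flattened
print(flat)
# Output: [1 2 3 4 5 6]

rows = np.append(grid, [[5, 6]], axis=0)  # add a row
print(rows)
# Output:
# [[1 2]
#  [3 4]
#  [5 6]]

cols = np.append(grid, [[5], [6]], axis=1)  # add a column
print(cols)
# Output:
# [[1 2 5]
#  [3 4 6]]
```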

numpy.insert(arr, obj, values, axis=None): Inserts values before specified indices along an axis.

import numpy as np
arr = np.array([1, 2, 3])
inserted_arr = np.insert(arr, 1, [9, 8])
print(inserted_arr)
# Output: [1 9 8 2 3]

numpy.delete(arr, obj, axis=None): Returns a new array with elements deleted along an axis.

import numpy as np
arr = np.array([1, 2, 3, 4, 5])
deleted_arr = np.delete(arr, [1, 3])
print(deleted_arr)
# Output: [1 3 5]

numpy.unique(ar, return_index=False, return_inverse=False, return_counts=False, axis=None): Returns the sorted unique elements of an array and, optionally, the indices of their first occurrences, the indices that reconstruct the input, and their counts.
```
import numpy as np
arr = np.array([1, 2, 2, 3, 1, 4])
unique_elements = np.unique(arr)
print(unique_elements)
# Output: [1 2 3 4]
```
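
The optional return values can be requested together. A minimal sketch using the same input array, asking for first-occurrence indices and counts:

```python
import numpy as np

arr = np.array([1, 2, 2, 3, 1, 4])
values, first_index, counts = np.unique(
    arr, return_index=True, return_counts=True
)
print(values)
# Output: [1 2 3 4]
print(first_index)  # position of the first occurrence of each value
# Output: [0 1 3 5]
print(counts)       # how many times each value appears
# Output: [2 2 1 1]
```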

7. Repeating and Tiling Arrays

Techniques to create larger arrays by duplicating elements.

numpy.repeat(a, repeats, axis=None): Repeats each element of an array.

import numpy as np
arr = np.array([1, 2, 3])
repeated_arr = np.repeat(arr, 2)
print(repeated_arr)
# Output: [1 1 2 2 3 3]
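
On multi-dimensional input, axis selects what gets repeated, and repeats may be a list giving a separate count for each entry along that axis. A sketch with an illustrative 2x2 array:

```python
import numpy as np

grid = np.array([[1, 2], [3, 4]])

# Repeat every row twice.
rows_twice = np.repeat(grid, 2, axis=0)
print(rows_twice)
# Output:
# [[1 2]
#  [1 2]
#  [3 4]
#  [3 4]]

# Repeat the first column once and the second column twice.
uneven = np.repeat(grid, [1, 2], axis=1)
print(uneven)
# Output:
# [[1 2 2]
#  [3 4 4]]
```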

numpy.tile(a, reps): Constructs an array by repeating an array a specified number of times. reps can be an integer or a tuple specifying the number of repetitions along each dimension.

import numpy as np
arr = np.array([1, 2])
tiled_arr = np.tile(arr, 3)
print(tiled_arr)
# Output: [1 2 1 2 1 2]

tiled_arr_2d = np.tile(arr, (2, 1)) # Repeat 2 times along axis 0
print(tiled_arr_2d)
# Output:
# [[1 2]
#  [1 2]]

8. Rearranging Elements

Operations to reorder elements within an array.

numpy.flip(m, axis=None): Reverses the order of elements along a given axis or axes.

import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
flipped_arr = np.flip(arr, axis=0)
print(flipped_arr)
# Output:
# [[4 5 6]
#  [1 2 3]]

numpy.fliplr(m): Reverses the order of elements along axis 1 (left/right).

import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
fliplr_arr = np.fliplr(arr)
print(fliplr_arr)
# Output:
# [[3 2 1]
#  [6 5 4]]

numpy.flipud(m): Reverses the order of elements along axis 0 (up/down).

import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
flipud_arr = np.flipud(arr)
print(flipud_arr)
# Output:
# [[4 5 6]
#  [1 2 3]]

numpy.roll(a, shift, axis=None): Rolls array elements along a specified axis. This shifts elements cyclically.

import numpy as np
arr = np.array([1, 2, 3, 4])
rolled_arr = np.roll(arr, 2)
print(rolled_arr)
# Output: [3 4 1 2]

9. Sorting and Searching

Powerful tools for sorting arrays and searching within them.

numpy.sort(a, axis=-1, kind=None, order=None): Returns a sorted copy of the array.

import numpy as np
arr = np.array([3, 1, 4, 2])
sorted_arr = np.sort(arr)
print(sorted_arr)
# Output: [1 2 3 4]

numpy.argsort(a, axis=-1, kind=None, order=None): Returns the indices that would sort the array.

import numpy as np
arr = np.array([3, 1, 4, 2])
sorted_indices = np.argsort(arr)
print(sorted_indices)
# Output: [1 3 0 2]
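
A common use of these indices is to reorder a second, parallel array. The sketch below assumes hypothetical scores and names arrays that line up element by element.

```python
import numpy as np

scores = np.array([3, 1, 4, 2])
names = np.array(["carol", "alice", "dave", "bob"])  # hypothetical labels

order = np.argsort(scores)
print(names[order])  # names arranged by ascending score
# Output: ['alice' 'bob' 'carol' 'dave']

descending = order[::-1]  # reverse the index array for descending order
print(scores[descending])
# Output: [4 3 2 1]
```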

numpy.lexsort(keys, axis=-1): Performs an indirect stable sort using a sequence of keys. The last key in keys is the primary sort key.
numpy.searchsorted(a, v, side='left', sorter=None): Finds indices where elements should be inserted into a sorted array to maintain order.
```
import numpy as np
arr = np.array([1, 3, 5, 7])
indices = np.searchsorted(arr, [2, 6])
print(indices)
# Output: [1 3]
```
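
numpy.lexsort, described above, is easier to follow with a concrete case. This sketch sorts by surname first and breaks ties by first name; the name arrays are illustrative data. Note that the primary key is passed last.

```python
import numpy as np

surnames = np.array(["Hertz", "Galilei", "Hertz"])
first_names = np.array(["Heinrich", "Galileo", "Gustav"])

# Keys are given from least to most significant: surnames is the primary key.
order = np.lexsort((first_names, surnames))
print(order)
# Output: [1 2 0]

for i in order:
    print(surnames[i], first_names[i])
```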

numpy.argmax(a, axis=None, out=None): Returns the indices of the maximum values along an axis.

import numpy as np
arr = np.array([[1, 5, 3], [4, 2, 6]])
max_indices = np.argmax(arr, axis=1)
print(max_indices)
# Output: [1 2]

numpy.argmin(a, axis=None, out=None): Returns the indices of the minimum values along an axis.

import numpy as np
arr = np.array([[1, 5, 3], [4, 2, 6]])
min_indices = np.argmin(arr, axis=1)
print(min_indices)
# Output: [0 1]

numpy.nonzero(a): Returns the indices of the non-zero elements in an array.

import numpy as np
arr = np.array([[0, 1, 0], [2, 0, 0]])
non_zero_indices = np.nonzero(arr)
print(non_zero_indices)
# Output: (array([0, 1]), array([1, 0]))

numpy.where(condition[, x, y]): Returns elements chosen from x or y depending on the condition. If only condition is provided, it returns indices of True elements.
```
import numpy as np
arr = np.array([1, 2, 3, 4])
where_result = np.where(arr > 2)
print(where_result)
# Output: (array([2, 3]),)
```
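
The three-argument form picks values element by element instead of returning indices. A short sketch on the same array:

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

# Keep values greater than 2, replace the rest with 0.
replaced = np.where(arr > 2, arr, 0)
print(replaced)
# Output: [0 0 3 4]

# x and y can also be scalars of another type, such as strings.
labels = np.where(arr % 2 == 0, "even", "odd")
print(labels)
# Output: ['odd' 'even' 'odd' 'even']
```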

10. Set Operations

Perform mathematical set operations on arrays such as union, intersection, and difference. These functions operate on flattened arrays.

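numpy.in1d(ar1, ar2, assume_unique=False, invert=False): Tests whether each element of one array is present in another array and returns a flat boolean array. In NumPy 2.0 and later, in1d is deprecated in favor of numpy.isin.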

import numpy as np
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([3, 4, 5, 6])
in_arr = np.in1d(arr1, arr2)
print(in_arr)
# Output: [False False  True  True]
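
numpy.isin performs the same membership test while preserving the shape of the first argument, which makes it convenient for masking multi-dimensional arrays. A minimal sketch with an illustrative 2-D input:

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([3, 4, 5, 6])

mask = np.isin(arr1, arr2)
print(mask)
# Output:
# [[False False]
#  [ True  True]]

# The mask can be used directly for boolean indexing.
print(arr1[mask])
# Output: [3 4]
```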

numpy.intersect1d(ar1, ar2, assume_unique=False, return_indices=False): Finds the intersection of two arrays (unique elements present in both).

import numpy as np
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([3, 4, 5, 6])
intersection = np.intersect1d(arr1, arr2)
print(intersection)
# Output: [3 4]

numpy.setdiff1d(ar1, ar2, assume_unique=False): Finds the set difference of two arrays (unique values in the first array that are not in the second).

import numpy as np
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([3, 4, 5, 6])
difference = np.setdiff1d(arr1, arr2)
print(difference)
# Output: [1 2]

numpy.setxor1d(ar1, ar2, assume_unique=False): Finds the set symmetric difference of two arrays (unique values present in either, but not both).

import numpy as np
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([3, 4, 5, 6])
symmetric_difference = np.setxor1d(arr1, arr2)
print(symmetric_difference)
# Output: [1 2 5 6]

numpy.union1d(ar1, ar2): Finds the sorted union of two arrays (unique elements from both).

import numpy as np
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([3, 4, 5, 6])
union = np.union1d(arr1, arr2)
print(union)
# Output: [1 2 3 4 5 6]

11. Other Array Operations

Additional useful array operations.

numpy.clip(a, a_min, a_max, out=None): Limits (clips) values in an array to be within a specified range.

import numpy as np
arr = np.array([-1, 0, 5, 10])
clipped_arr = np.clip(arr, 0, 5)
print(clipped_arr)
# Output: [0 0 5 5]

numpy.round(a, decimals=0, out=None): Rounds array values to the given number of decimal places.

import numpy as np
arr = np.array([1.234, 5.678, 9.012])
rounded_arr = np.round(arr, decimals=1)
print(rounded_arr)
# Output: [1.2 5.7 9. ]

numpy.diagonal(a, offset=0, axis1=0, axis2=1): Returns specified diagonals of an array.

import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
diagonals = np.diagonal(arr)
print(diagonals)
# Output: [1 5 9]

numpy.trace(a, offset=0, axis1=0, axis2=1, dtype=None, out=None): Returns the sum along the diagonals of an array.

import numpy as np
arr = np.array([[1, 2], [3, 4]])
trace_sum = np.trace(arr)
print(trace_sum)
# Output: 5 (1 + 4)

numpy.take(a, indices, axis=None, out=None, mode='raise'): Takes elements from an array along an axis. This is an alternative to fancy indexing.

import numpy as np
arr = np.array([0, 1, 2, 3, 4])
taken_elements = np.take(arr, [1, 3, 4])
print(taken_elements)
# Output: [1 3 4]
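
The axis and mode arguments are worth a quick look. In this sketch (the arrays are illustrative), axis=1 selects whole columns, while mode='clip' and mode='wrap' show two ways of handling an index that is out of range.

```python
import numpy as np

grid = np.array([[0, 1, 2], [3, 4, 5]])
cols = np.take(grid, [0, 2], axis=1)  # pick columns 0 and 2
print(cols)
# Output:
# [[0 2]
#  [3 5]]

arr = np.array([0, 1, 2, 3, 4])
clipped = np.take(arr, [1, 7], mode="clip")  # 7 is clipped to the last index, 4
print(clipped)
# Output: [1 4]

wrapped = np.take(arr, [1, 7], mode="wrap")  # 7 wraps around to 7 % 5 = 2
print(wrapped)
# Output: [1 2]
```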

numpy.put(a, ind, v, mode='raise'): Replaces specified elements of an array with given values.

import numpy as np
arr = np.array([0, 1, 2, 3, 4])
np.put(arr, [1, 3], [9, 8])
print(arr)
# Output: [0 9 2 8 4]

numpy.choose(a, choices, out=None, mode='raise'): Constructs an array from an index array and a list of arrays (choices). For 1-D inputs, element i of the result is choices[a[i]][i].

import numpy as np
arr = np.array([0, 1, 2]) # Indices
choices_arr = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
chosen_elements = np.choose(arr, choices_arr)
print(chosen_elements)
# Output: [10 50 90]
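
One way to read this result: position i of the index array says which row of choices supplies the value for column i. The sketch below reproduces the same output with ordinary integer indexing, as a cross-check rather than a replacement for choose.

```python
import numpy as np

index = np.array([0, 1, 2])
choices = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])

via_choose = np.choose(index, choices)

# Equivalent lookup: row index[i], column i, for each i.
via_indexing = choices[index, np.arange(index.size)]

print(via_choose)
# Output: [10 50 90]
print(via_indexing)
# Output: [10 50 90]
```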
