NumPy Array Manipulation for ML & Data Science
Master NumPy array manipulation for machine learning & data science. Learn to reshape, index, slice, and modify ndarrays effectively with Python.
NumPy Array Manipulation
NumPy is a fundamental package for scientific computing in Python, offering powerful tools for working with arrays. This guide covers essential NumPy routines for manipulating elements within ndarray objects, categorized for clarity.
1. Changing Shape
These routines alter an array's dimensions without changing its data.
-
numpy.reshape(a, newshape, order='C')
: Gives a new shape to an array without changing its data. The total number of elements must remain the same.
    import numpy as np
    arr = np.array([1, 2, 3, 4, 5, 6])
    new_arr = arr.reshape((2, 3))
    print(new_arr)
    # Output:
    # [[1 2 3]
    #  [4 5 6]]
-
ndarray.flat
: A 1-D iterator over the array. Allows iterating through array elements one by one, regardless of the original shape.
    import numpy as np
    arr = np.array([[1, 2], [3, 4]])
    for element in arr.flat:
        print(element, end=' ')
    # Output: 1 2 3 4
-
ndarray.flatten(order='C')
: Returns a copy of the array collapsed into one dimension. This is a method of the ndarray object, not a top-level numpy function.
    import numpy as np
    arr = np.array([[1, 2], [3, 4]])
    flattened_arr = arr.flatten()
    print(flattened_arr)
    # Output: [1 2 3 4]
-
numpy.ravel(a, order='C')
: Returns a contiguous flattened array. This function returns a view of the original array whenever possible, making it more memory-efficient than flatten().
    import numpy as np
    arr = np.array([[1, 2], [3, 4]])
    raveled_arr = np.ravel(arr)
    print(raveled_arr)
    # Output: [1 2 3 4]
-
numpy.pad(array, pad_width, mode='constant', **kwargs)
: Returns a padded array with its shape increased according to pad_width. Useful for adding borders or margins to arrays.
    import numpy as np
    arr = np.array([1, 2, 3])
    # Pad 1 element (value 0) before and 2 elements (value 10) after
    padded_arr = np.pad(arr, (1, 2), 'constant', constant_values=(0, 10))
    print(padded_arr)
    # Output: [ 0  1  2  3 10 10]
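The difference between the copy returned by flatten() and the view that ravel() returns when possible can be checked directly. The following is a minimal sketch; the variable names are illustrative, and it relies on the input being C-contiguous, which is the case where ravel() can return a view.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])

flat_copy = arr.flatten()  # always a new, independent array
flat_view = arr.ravel()    # a view here, because arr is C-contiguous

copy_shares = np.shares_memory(arr, flat_copy)
view_shares = np.shares_memory(arr, flat_view)
print(copy_shares, view_shares)  # False True

# Writing through the view changes the original; the copy is unaffected
flat_view[0] = 99
print(arr[0, 0])     # 99
print(flat_copy[0])  # 1
```

If the result must never alias the input, flatten() is the safer choice; ravel() avoids an allocation when a view is acceptable.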
2. Transpose Operations
Transpose operations swap rows and columns in 2D arrays or rearrange axes in higher-dimensional arrays.
-
numpy.transpose(a, axes=None)
: Permutes the dimensions of an array. By default, it reverses the order of axes.
    import numpy as np
    arr = np.array([[1, 2, 3], [4, 5, 6]])
    transposed_arr = np.transpose(arr)
    print(transposed_arr)
    # Output:
    # [[1 4]
    #  [2 5]
    #  [3 6]]
-
ndarray.T
: A shorthand for numpy.transpose(). It's an attribute that returns the transposed array.
    import numpy as np
    arr = np.array([[1, 2, 3], [4, 5, 6]])
    print(arr.T)
    # Output:
    # [[1 4]
    #  [2 5]
    #  [3 6]]
-
numpy.rollaxis(a, axis, start=0)
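: Rolls the specified axis backward until it lies at the position given by start. Only the order of the axes changes; element values are not shifted. The NumPy documentation recommends moveaxis for new code and keeps rollaxis for backward compatibility.
    import numpy as np
    arr = np.arange(6).reshape((2, 3))
    # Move axis 1 to position 0, giving shape (3, 2)
    rolled_arr = np.rollaxis(arr, 1, 0)
    print(rolled_arr)
    # Output:
    # [[0 3]
    #  [1 4]
    #  [2 5]]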
-
numpy.swapaxes(a, axis1, axis2)
: Interchanges two axes of an array.
    import numpy as np
    arr = np.arange(6).reshape((2, 3))
    swapped_arr = np.swapaxes(arr, 0, 1)
    print(swapped_arr)
    # Output:
    # [[0 3]
    #  [1 4]
    #  [2 5]]
-
numpy.moveaxis(a, source, destination)
: Moves axes of an array to new positions. Allows for flexible rearrangement of dimensions.
    import numpy as np
    arr = np.random.rand(3, 4, 5)
    # Move axis 0 to the last position
    moved_arr = np.moveaxis(arr, 0, -1)
    print(moved_arr.shape)
    # Output: (4, 5, 3)
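On 2-D input the routines above all produce the same result, so their differences only show up with three or more axes. The sketch below compares them on a hypothetical array of shape (2, 3, 4); the names are illustrative.

```python
import numpy as np

arr = np.arange(24).reshape((2, 3, 4))

default_t = np.transpose(arr)            # reverses all axes
custom_t = np.transpose(arr, (1, 0, 2))  # swaps only the first two axes
swapped = np.swapaxes(arr, 0, 1)         # same effect as custom_t
rolled = np.rollaxis(arr, 2, 0)          # axis 2 rolled back to position 0
moved = np.moveaxis(arr, 2, 0)           # same effect as rolled

print(default_t.shape)  # (4, 3, 2)
print(custom_t.shape)   # (3, 2, 4)
print(rolled.shape)     # (4, 2, 3)
print(np.array_equal(custom_t, swapped))  # True
print(np.array_equal(rolled, moved))      # True
```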
3. Changing Dimensions
These functions reshape or restructure arrays without altering data.
-
numpy.broadcast
: Produces an object that mimics broadcasting. It is mainly useful for understanding the broadcasting mechanism and for inspecting the shape that a set of arrays would broadcast to.
-
numpy.broadcast_to(array, shape, subok=False)
: Broadcasts an array to a new shape. This works when the array's shape is compatible with the target shape under NumPy's broadcasting rules: comparing dimensions from the last one backward, each must either match or be 1. The result is a read-only view of the original array.
    import numpy as np
    arr = np.array([1, 2, 3])
    broadcast_arr = np.broadcast_to(arr, (3, 3))
    print(broadcast_arr)
    # Output:
    # [[1 2 3]
    #  [1 2 3]
    #  [1 2 3]]
-
numpy.expand_dims(a, axis)
: Expands the shape of an array by inserting a new axis at the specified position.
    import numpy as np
    arr = np.array([1, 2, 3])
    expanded_arr = np.expand_dims(arr, axis=0)
    print(expanded_arr)
    # Output: [[1 2 3]]
    print(expanded_arr.shape)
    # Output: (1, 3)
-
numpy.squeeze(a, axis=None)
: Removes single-dimensional entries from the shape of an array.
    import numpy as np
    arr = np.array([[1, 2, 3]])
    squeezed_arr = np.squeeze(arr)
    print(squeezed_arr)
    # Output: [1 2 3]
    print(squeezed_arr.shape)
    # Output: (3,)
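The numpy.broadcast entry in this section has no example, so here is a small sketch of how the object can be inspected. The arrays col and row are illustrative; the block also checks that the result of broadcast_to is read-only.

```python
import numpy as np

col = np.array([[1], [2], [3]])   # shape (3, 1)
row = np.array([10, 20, 30, 40])  # shape (4,)

# The broadcast object reports the shape both arrays broadcast to
b = np.broadcast(col, row)
print(b.shape)  # (3, 4)

# Iterating yields the paired elements in row-major order
sums = np.array([x + y for x, y in b]).reshape(b.shape)
print(np.array_equal(sums, col + row))  # True

# broadcast_to returns a read-only view rather than a full copy
view = np.broadcast_to(row, (3, 4))
print(view.flags.writeable)  # False
```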
4. Joining Arrays
Joining combines multiple arrays along specified axes.
-
numpy.concatenate((a1, a2, ...), axis=0, out=None)
: Joins a sequence of arrays along an existing axis.
    import numpy as np
    arr1 = np.array([1, 2])
    arr2 = np.array([3, 4])
    concatenated_arr = np.concatenate((arr1, arr2))
    print(concatenated_arr)
    # Output: [1 2 3 4]
-
numpy.stack(arrays, axis=0, out=None)
: Joins arrays along a new axis. This increases the dimensionality of the resulting array.
    import numpy as np
    arr1 = np.array([1, 2])
    arr2 = np.array([3, 4])
    stacked_arr = np.stack((arr1, arr2))
    print(stacked_arr)
    # Output:
    # [[1 2]
    #  [3 4]]
    print(stacked_arr.shape)
    # Output: (2, 2)
-
numpy.hstack(tup)
: Stacks arrays horizontally (column-wise). Equivalent to concatenate along axis 1, except for 1-D arrays, which are concatenated along axis 0 as in the example below.
    import numpy as np
    arr1 = np.array([1, 2])
    arr2 = np.array([3, 4])
    hstacked_arr = np.hstack((arr1, arr2))
    print(hstacked_arr)
    # Output: [1 2 3 4]
-
numpy.vstack(tup)
: Stacks arrays vertically (row-wise). Equivalent to concatenate along axis 0, after 1-D arrays of shape (N,) have been reshaped to (1, N).
    import numpy as np
    arr1 = np.array([1, 2])
    arr2 = np.array([3, 4])
    vstacked_arr = np.vstack((arr1, arr2))
    print(vstacked_arr)
    # Output:
    # [[1 2]
    #  [3 4]]
-
numpy.dstack(tup)
: Stacks arrays depth-wise (along the third axis).
    import numpy as np
    arr1 = np.array([[1, 2], [3, 4]])
    arr2 = np.array([[5, 6], [7, 8]])
    dstacked_arr = np.dstack((arr1, arr2))
    print(dstacked_arr)
    # Output:
    # [[[1 5]
    #   [2 6]]
    #
    #  [[3 7]
    #   [4 8]]]
-
numpy.column_stack(tup)
: Stacks 1-D arrays as columns into a 2-D array. 2-D arrays are stacked as-is, just as with hstack.
    import numpy as np
    arr1 = np.array([1, 2])
    arr2 = np.array([3, 4])
    col_stacked_arr = np.column_stack((arr1, arr2))
    print(col_stacked_arr)
    # Output:
    # [[1 3]
    #  [2 4]]
-
numpy.row_stack(tup)
: Stacks 1-D arrays as rows into a 2-D array. It is an alias of vstack; the alias is deprecated as of NumPy 2.0, so vstack is the safer choice in new code.
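The examples in this section use 1-D inputs, where hstack and vstack behave slightly differently from plain concatenation. With 2-D inputs the correspondence with concatenate is direct, as the following sketch illustrates (array names are illustrative).

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

h = np.hstack((a, b))
v = np.vstack((a, b))
print(h)
# [[1 2 5 6]
#  [3 4 7 8]]
print(v.shape)  # (4, 2)

# For 2-D inputs, the helpers match concatenate along axis 1 and axis 0
same_h = np.array_equal(h, np.concatenate((a, b), axis=1))
same_v = np.array_equal(v, np.concatenate((a, b), axis=0))

# dstack on 2-D inputs matches stack along a new last axis
same_d = np.array_equal(np.dstack((a, b)), np.stack((a, b), axis=-1))
print(same_h, same_v, same_d)  # True True True
```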
5. Splitting Arrays
Splitting divides arrays into smaller arrays along specified axes.
-
numpy.split(ary, indices_or_sections, axis=0)
: Splits an array into multiple sub-arrays. indices_or_sections can be an integer (the number of equal-sized sub-arrays) or a list of indices where the splits occur.
    import numpy as np
    arr = np.arange(10)
    # An integer splits the array into that many equal parts
    sub_arrays = np.split(arr, 2)
    print(sub_arrays)
    # Output: [array([0, 1, 2, 3, 4]), array([5, 6, 7, 8, 9])]
    # A list of indices gives the positions where the splits occur
    sub_arrays_at = np.split(arr, [3, 7])
    print(sub_arrays_at)
    # Output: [array([0, 1, 2]), array([3, 4, 5, 6]), array([7, 8, 9])]
-
numpy.hsplit(ary, indices_or_sections)
: Splits an array horizontally (column-wise). Equivalent to split along axis 1.
    import numpy as np
    arr = np.arange(12).reshape((3, 4))
    h_split_arr = np.hsplit(arr, 2)
    print(h_split_arr[0])
    # Output:
    # [[0 1]
    #  [4 5]
    #  [8 9]]
-
numpy.vsplit(ary, indices_or_sections)
: Splits an array vertically (row-wise). Equivalent to split along axis 0.
    import numpy as np
    arr = np.arange(12).reshape((3, 4))
    v_split_arr = np.vsplit(arr, 3)
    print(v_split_arr[0])
    # Output:
    # [[0 1 2 3]]
-
numpy.dsplit(ary, indices_or_sections)
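: Splits an array along the third axis (depth). Equivalent to split along axis 2, so the input needs at least three dimensions.
-
numpy.array_split(ary, indices_or_sections, axis=0)
: Splits an array into multiple sub-arrays, even if the split does not result in equal-sized parts. In that case split raises an error, while array_split makes the leading sub-arrays one element longer.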
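dsplit and array_split are described above without examples. The following sketch shows both, along with the error that split raises on an uneven division. The variable names are illustrative.

```python
import numpy as np

arr = np.arange(10)

# split insists on equal parts and raises ValueError otherwise
try:
    np.split(arr, 3)
    split_failed = False
except ValueError:
    split_failed = True
print(split_failed)  # True

# array_split tolerates the uneven division: part sizes are 4, 3, 3
parts = np.array_split(arr, 3)
print([p.tolist() for p in parts])
# [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]

# dsplit cuts along the third axis of a 3-D array
cube = np.arange(16).reshape((2, 2, 4))
front, back = np.dsplit(cube, 2)
print(front.shape, back.shape)  # (2, 2, 2) (2, 2, 2)
```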
6. Adding / Removing Elements
These functions allow insertion or deletion of elements.
-
numpy.resize(a, new_shape)
: Returns a new array with the specified shape. If the new shape requires more elements than the original array, the original elements are repeated. If fewer elements are needed, the extra elements are discarded.
    import numpy as np
    arr = np.array([1, 2, 3])
    resized_arr = np.resize(arr, (2, 3))
    print(resized_arr)
    # Output:
    # [[1 2 3]
    #  [1 2 3]]
-
numpy.append(arr, values, axis=None)
: Appends values to the end of an array and returns a new array. If axis is specified, the arrays are joined along that axis; if it is omitted, both inputs are flattened first.
    import numpy as np
    arr = np.array([1, 2, 3])
    appended_arr = np.append(arr, [4, 5])
    print(appended_arr)
    # Output: [1 2 3 4 5]
-
numpy.insert(arr, obj, values, axis=None)
: Inserts values before specified indices along an axis.
    import numpy as np
    arr = np.array([1, 2, 3])
    inserted_arr = np.insert(arr, 1, [9, 8])
    print(inserted_arr)
    # Output: [1 9 8 2 3]
-
numpy.delete(arr, obj, axis=None)
: Returns a new array with elements deleted along an axis.
    import numpy as np
    arr = np.array([1, 2, 3, 4, 5])
    # Remove the elements at indices 1 and 3
    deleted_arr = np.delete(arr, [1, 3])
    print(deleted_arr)
    # Output: [1 3 5]
-
numpy.unique(ar, return_index=False, return_inverse=False, return_counts=False, axis=None)
: Finds the sorted unique elements of an array and can also return their indices and counts.
    import numpy as np
    arr = np.array([1, 2, 2, 3, 1, 4])
    unique_elements = np.unique(arr)
    print(unique_elements)
    # Output: [1 2 3 4]
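Two options mentioned in this section are not exercised by the examples: the axis argument of append and the optional return values of unique. A brief sketch, with illustrative names:

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])

# With axis=0 the new row is joined to the 2-D array
with_row = np.append(m, [[5, 6]], axis=0)
print(with_row.tolist())  # [[1, 2], [3, 4], [5, 6]]

# Without axis, both inputs are flattened before joining
flat = np.append(m, [[5, 6]])
print(flat)  # [1 2 3 4 5 6]

arr = np.array([1, 2, 2, 3, 1, 4])

# return_counts reports how often each unique value occurs
values, counts = np.unique(arr, return_counts=True)
print(values, counts)  # [1 2 3 4] [2 2 1 1]

# return_inverse gives indices that rebuild the input from the unique values
values, inverse = np.unique(arr, return_inverse=True)
print(np.array_equal(values[inverse], arr))  # True
```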
7. Repeating and Tiling Arrays
Techniques to create larger arrays by duplicating elements.
-
numpy.repeat(a, repeats, axis=None)
: Repeats each element of an array.
    import numpy as np
    arr = np.array([1, 2, 3])
    repeated_arr = np.repeat(arr, 2)
    print(repeated_arr)
    # Output: [1 1 2 2 3 3]
-
numpy.tile(a, reps)
: Constructs an array by repeating an array a specified number of times. reps can be an integer or a tuple specifying the number of repetitions along each dimension.
    import numpy as np
    arr = np.array([1, 2])
    tiled_arr = np.tile(arr, 3)
    print(tiled_arr)
    # Output: [1 2 1 2 1 2]
    # Repeat 2 times along axis 0 and once along axis 1
    tiled_arr_2d = np.tile(arr, (2, 1))
    print(tiled_arr_2d)
    # Output:
    # [[1 2]
    #  [1 2]]
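The distinction between repeat (element-wise duplication) and tile (whole-array duplication) is clearest on a 2-D array. The sketch below also shows that repeats may be a per-element list; the names are illustrative.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])

# repeat duplicates each row in place when axis=0
rows_repeated = np.repeat(m, 2, axis=0)
print(rows_repeated.tolist())  # [[1, 2], [1, 2], [3, 4], [3, 4]]

# tile lays copies of the whole array side by side
tiled = np.tile(m, (1, 2))
print(tiled.tolist())  # [[1, 2, 1, 2], [3, 4, 3, 4]]

# repeats can differ per element; a count of 0 drops the element
uneven = np.repeat(np.array([1, 2, 3]), [1, 0, 2])
print(uneven)  # [1 3 3]
```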
8. Rearranging Elements
Operations to reorder elements within an array.
-
numpy.flip(m, axis=None)
: Reverses the order of elements along a given axis or axes.
    import numpy as np
    arr = np.array([[1, 2, 3], [4, 5, 6]])
    flipped_arr = np.flip(arr, axis=0)
    print(flipped_arr)
    # Output:
    # [[4 5 6]
    #  [1 2 3]]
-
numpy.fliplr(m)
: Reverses the order of elements along axis 1 (left/right).
    import numpy as np
    arr = np.array([[1, 2, 3], [4, 5, 6]])
    fliplr_arr = np.fliplr(arr)
    print(fliplr_arr)
    # Output:
    # [[3 2 1]
    #  [6 5 4]]
-
numpy.flipud(m)
: Reverses the order of elements along axis 0 (up/down).
    import numpy as np
    arr = np.array([[1, 2, 3], [4, 5, 6]])
    flipud_arr = np.flipud(arr)
    print(flipud_arr)
    # Output:
    # [[4 5 6]
    #  [1 2 3]]
-
numpy.roll(a, shift, axis=None)
: Rolls array elements along a specified axis. Elements are shifted cyclically, so those pushed past the last position reappear at the first.
    import numpy as np
    arr = np.array([1, 2, 3, 4])
    rolled_arr = np.roll(arr, 2)
    print(rolled_arr)
    # Output: [3 4 1 2]
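For roll and flip, the axis argument changes the result noticeably on 2-D input. A short sketch with an illustrative array:

```python
import numpy as np

m = np.arange(6).reshape((2, 3))  # [[0 1 2], [3 4 5]]

# Rolling along axis 1 shifts each row independently
rolled_rows = np.roll(m, 1, axis=1)
print(rolled_rows.tolist())  # [[2, 0, 1], [5, 3, 4]]

# Without axis, the array is flattened, rolled, then reshaped back
rolled_flat = np.roll(m, 1)
print(rolled_flat.tolist())  # [[5, 0, 1], [2, 3, 4]]

# flip without axis reverses along every axis
flipped_all = np.flip(m)
print(flipped_all.tolist())  # [[5, 4, 3], [2, 1, 0]]
```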
9. Sorting and Searching
Powerful tools for sorting arrays and searching within them.
-
numpy.sort(a, axis=-1, kind=None, order=None)
: Returns a sorted copy of the array.
    import numpy as np
    arr = np.array([3, 1, 4, 2])
    sorted_arr = np.sort(arr)
    print(sorted_arr)
    # Output: [1 2 3 4]
-
numpy.argsort(a, axis=-1, kind=None, order=None)
: Returns the indices that would sort the array.
    import numpy as np
    arr = np.array([3, 1, 4, 2])
    sorted_indices = np.argsort(arr)
    print(sorted_indices)
    # Output: [1 3 0 2]
-
numpy.lexsort(keys, axis=-1)
: Performs an indirect stable sort using a sequence of keys. The last key in keys is the primary sort key.
-
numpy.searchsorted(a, v, side='left', sorter=None)
: Finds indices where elements should be inserted into a sorted array to maintain order.
    import numpy as np
    arr = np.array([1, 3, 5, 7])
    indices = np.searchsorted(arr, [2, 6])
    print(indices)
    # Output: [1 3]
-
numpy.argmax(a, axis=None, out=None)
: Returns the indices of the maximum values along an axis.
    import numpy as np
    arr = np.array([[1, 5, 3], [4, 2, 6]])
    max_indices = np.argmax(arr, axis=1)
    print(max_indices)
    # Output: [1 2]
-
numpy.argmin(a, axis=None, out=None)
: Returns the indices of the minimum values along an axis.
    import numpy as np
    arr = np.array([[1, 5, 3], [4, 2, 6]])
    min_indices = np.argmin(arr, axis=1)
    print(min_indices)
    # Output: [0 1]
-
numpy.nonzero(a)
: Returns the indices of the non-zero elements in an array, as a tuple with one index array per dimension.
    import numpy as np
    arr = np.array([[0, 1, 0], [2, 0, 0]])
    non_zero_indices = np.nonzero(arr)
    print(non_zero_indices)
    # Output: (array([0, 1]), array([1, 0]))
-
numpy.where(condition[, x, y])
: Returns elements chosen from x or y depending on the condition. If only condition is provided, it returns the indices of the True elements.
    import numpy as np
    arr = np.array([1, 2, 3, 4])
    where_result = np.where(arr > 2)
    print(where_result)
    # Output: (array([2, 3]),)
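Two behaviours described in this section have no example: the key ordering of lexsort and the three-argument form of where. The sketch below uses made-up ages and scores arrays purely for illustration.

```python
import numpy as np

ages = np.array([30, 25, 30, 25])
scores = np.array([88, 92, 75, 60])

# The LAST key (ages) is the primary key; scores breaks ties
order = np.lexsort((scores, ages))
print(order)          # [3 1 2 0]
print(ages[order])    # [25 25 30 30]
print(scores[order])  # [60 92 75 88]

# Three-argument where: keep values above 2, replace the rest with 0
arr = np.array([1, 2, 3, 4])
masked = np.where(arr > 2, arr, 0)
print(masked)  # [0 0 3 4]
```

Passing the keys in the opposite order, (ages, scores), would sort primarily by score instead.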
10. Set Operations
Perform mathematical set operations on arrays such as union, intersection, and difference. These functions operate on flattened arrays.
-
numpy.in1d(ar1, ar2, assume_unique=False, invert=False)
: Tests whether each element of one array is present in another array. In NumPy 2.0 and later, in1d is deprecated in favor of numpy.isin.
    import numpy as np
    arr1 = np.array([1, 2, 3, 4])
    arr2 = np.array([3, 4, 5, 6])
    in_arr = np.in1d(arr1, arr2)
    print(in_arr)
    # Output: [False False  True  True]
-
numpy.intersect1d(ar1, ar2, assume_unique=False, return_indices=False)
: Finds the intersection of two arrays (sorted unique elements present in both).
    import numpy as np
    arr1 = np.array([1, 2, 3, 4])
    arr2 = np.array([3, 4, 5, 6])
    intersection = np.intersect1d(arr1, arr2)
    print(intersection)
    # Output: [3 4]
-
numpy.setdiff1d(ar1, ar2, assume_unique=False)
: Finds the set difference of two arrays (unique values in the first array that are not in the second).
    import numpy as np
    arr1 = np.array([1, 2, 3, 4])
    arr2 = np.array([3, 4, 5, 6])
    difference = np.setdiff1d(arr1, arr2)
    print(difference)
    # Output: [1 2]
-
numpy.setxor1d(ar1, ar2, assume_unique=False)
: Finds the set symmetric difference of two arrays (unique values present in either, but not both).
    import numpy as np
    arr1 = np.array([1, 2, 3, 4])
    arr2 = np.array([3, 4, 5, 6])
    symmetric_difference = np.setxor1d(arr1, arr2)
    print(symmetric_difference)
    # Output: [1 2 5 6]
-
numpy.union1d(ar1, ar2)
: Finds the sorted union of two arrays (unique elements from both).
    import numpy as np
    arr1 = np.array([1, 2, 3, 4])
    arr2 = np.array([3, 4, 5, 6])
    union = np.union1d(arr1, arr2)
    print(union)
    # Output: [1 2 3 4 5 6]
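Membership tests are often used as boolean masks for filtering. The sketch below uses numpy.isin, a closely related routine that is not covered in the list above; unlike the flattened set operations, it preserves the shape of its first argument. The arrays are illustrative.

```python
import numpy as np

grid = np.array([[1, 2], [3, 4]])
allowed = np.array([3, 4, 5, 6])

# The mask has the same shape as grid
mask = np.isin(grid, allowed)
print(mask.tolist())  # [[False, False], [True, True]]

# Boolean indexing with the mask keeps only the matching elements
print(grid[mask])  # [3 4]

# invert=True selects the elements that are NOT in allowed
outside = grid[np.isin(grid, allowed, invert=True)]
print(outside)  # [1 2]
```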
11. Other Array Operations
Additional useful array operations.
-
numpy.clip(a, a_min, a_max, out=None)
: Limits (clips) values in an array to be within a specified range.
    import numpy as np
    arr = np.array([-1, 0, 5, 10])
    clipped_arr = np.clip(arr, 0, 5)
    print(clipped_arr)
    # Output: [0 0 5 5]
-
numpy.round(a, decimals=0, out=None)
: Rounds array values to the given number of decimal places.
    import numpy as np
    arr = np.array([1.234, 5.678, 9.012])
    rounded_arr = np.round(arr, decimals=1)
    print(rounded_arr)
    # Output: [1.2 5.7 9. ]
-
numpy.diagonal(a, offset=0, axis1=0, axis2=1)
: Returns specified diagonals of an array.
    import numpy as np
    arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    diagonals = np.diagonal(arr)
    print(diagonals)
    # Output: [1 5 9]
-
numpy.trace(a, offset=0, axis1=0, axis2=1, dtype=None, out=None)
: Returns the sum along the diagonals of an array.
    import numpy as np
    arr = np.array([[1, 2], [3, 4]])
    trace_sum = np.trace(arr)
    print(trace_sum)
    # Output: 5 (1 + 4)
-
numpy.take(a, indices, axis=None, out=None, mode='raise')
: Takes elements from an array along an axis. This is an alternative to fancy indexing. By default (mode='raise'), an out-of-bounds index raises an error.
    import numpy as np
    arr = np.array([0, 1, 2, 3, 4])
    taken_elements = np.take(arr, [1, 3, 4])
    print(taken_elements)
    # Output: [1 3 4]
-
numpy.put(a, ind, v, mode='raise')
: Replaces specified elements of an array with given values. The array is modified in place.
    import numpy as np
    arr = np.array([0, 1, 2, 3, 4])
    np.put(arr, [1, 3], [9, 8])
    print(arr)
    # Output: [0 9 2 8 4]
-
numpy.choose(a, choices, out=None, mode='raise')
: Constructs an array from an index array and a list of arrays (choices). For each position i, the result takes element i from the choice selected by the index array.
    import numpy as np
    arr = np.array([0, 1, 2])  # Indices into choices_arr
    choices_arr = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
    chosen_elements = np.choose(arr, choices_arr)
    print(chosen_elements)
    # Output: [10 50 90]
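The mode and axis parameters of take determine how indices are interpreted. The following sketch, using illustrative arrays, contrasts the available modes for an out-of-bounds index and shows column selection on a 2-D array.

```python
import numpy as np

arr = np.array([0, 1, 2, 3, 4])

# Index 7 is out of bounds for an array of length 5
clipped = np.take(arr, [1, 7], mode='clip')  # 7 is clamped to index 4
wrapped = np.take(arr, [1, 7], mode='wrap')  # 7 wraps around to index 2
print(clipped, wrapped)  # [1 4] [1 2]

# The default mode='raise' rejects the same index
try:
    np.take(arr, [1, 7])
    raised = False
except IndexError:
    raised = True
print(raised)  # True

# With axis=1, take selects whole columns of a 2-D array
m = np.array([[10, 20, 30], [40, 50, 60]])
cols = np.take(m, [0, 2], axis=1)
print(cols.tolist())  # [[10, 30], [40, 60]]
```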