NumPy Array Splitting for ML: Split, Hsplit, Vsplit
Master NumPy array splitting with split, array_split, hsplit, vsplit, and dsplit. Essential techniques for data preparation in machine learning and AI.
Splitting Arrays in NumPy
NumPy provides powerful tools to divide a single array into multiple sub-arrays. This capability is essential for organizing data, preparing datasets for machine learning models, and performing computations on specific partitions of your data.
NumPy offers several functions for array splitting:
numpy.split()
numpy.array_split()
numpy.hsplit()
numpy.vsplit()
numpy.dsplit()
1. Splitting Arrays Using numpy.split()
The numpy.split() function divides an array into equal-sized sub-arrays or at specified indices along a given axis.
Syntax:
numpy.split(array, indices_or_sections, axis=0)
array: The input NumPy array to be split.
indices_or_sections:
- If an integer N, the array is divided into N equal-sized sub-arrays. The length of the array along the specified axis must be evenly divisible by N; otherwise a ValueError is raised.
- If a 1D array of sorted integers, the array is split at these indices. For example, [1, 2] will split the array before index 1 and before index 2, producing three sub-arrays.
axis: The axis along which to split the array. Defaults to 0 (rows).
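The divisibility requirement for an integer argument can be checked with a short sketch. The array and variable names here are illustrative only and do not come from the examples in this article.

```python
import numpy as np

arr = np.arange(10)  # 10 elements along axis 0

# 10 is divisible by 5, so this yields five sub-arrays of length 2.
parts = np.split(arr, 5)
print([p.tolist() for p in parts])

# 10 is not divisible by 3, so np.split raises ValueError.
try:
    np.split(arr, 3)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print("np.split(arr, 3) failed:", error_message)
```

When an uneven division is acceptable, numpy.array_split() accepts the same arguments without raising.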
Example 1: Splitting into Equal-sized Sub-arrays
This example splits a 2D array into 3 equal parts along axis=1 (columns).
import numpy as np
arr = np.arange(9).reshape(3, 3)
split_arr = np.split(arr, 3, axis=1)
print("Original Array:")
print(arr)
print("\nSplit into 3 equal sub-arrays along axis 1:")
for sub_arr in split_arr:
    print(sub_arr)
Output:
Original Array:
[[0 1 2]
 [3 4 5]
 [6 7 8]]

Split into 3 equal sub-arrays along axis 1:
[[0]
 [3]
 [6]]
[[1]
 [4]
 [7]]
[[2]
 [5]
 [8]]
Example 2: Splitting at Specific Indices
This example splits a 2D array at specific row indices ([1, 2]) along axis=0 (rows).
import numpy as np
arr = np.arange(9).reshape(3, 3)
split_arr = np.split(arr, [1, 2], axis=0)
print("Split at indices [1, 2] along axis 0:")
for sub_arr in split_arr:
    print(sub_arr)
Output:
Split at indices [1, 2] along axis 0:
[[0 1 2]]
[[3 4 5]]
[[6 7 8]]
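Index-based splitting maps naturally onto dividing a dataset into training, validation, and test portions. The sketch below assumes a made-up 10-sample dataset and an arbitrary 60/20/20 ratio; the names data, train, val, and test are hypothetical.

```python
import numpy as np

# Hypothetical dataset: 10 samples with 2 features each.
data = np.arange(20).reshape(10, 2)

# Split before row 6 and before row 8 (an assumed 60/20/20 ratio).
train, val, test = np.split(data, [6, 8], axis=0)

print(train.shape, val.shape, test.shape)  # (6, 2) (2, 2) (2, 2)
```

Because the indices mark where each new sub-array begins, the three pieces concatenate back into the original array in order.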
2. Splitting Arrays Using numpy.array_split()
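The numpy.array_split() function is a more flexible version of split(). It allows you to split an array into sub-arrays that may not be equal in size, which is particularly useful when the array size is not evenly divisible by the number of splits. For an array of length l split into N sections, it returns l % N sub-arrays of size l // N + 1, followed by the remaining sub-arrays of size l // N.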
Syntax:
numpy.array_split(array, indices_or_sections, axis=0)
array: The input NumPy array to be split.
indices_or_sections: Similar to numpy.split(), this can be an integer specifying the number of sub-arrays or a list of indices at which to split.
axis: The axis along which to split. Defaults to 0.
Example 3: Splitting Unequally
This example splits a 1D array into 3 sub-arrays, where the sizes might differ.
import numpy as np
arr = np.arange(10)
split_arr = np.array_split(arr, 3)
print("Original Array:")
print(arr)
print("\nSplit into 3 unequal sub-arrays:")
for sub_arr in split_arr:
    print(sub_arr)
Output:
Original Array:
[0 1 2 3 4 5 6 7 8 9]

Split into 3 unequal sub-arrays:
[0 1 2 3]
[4 5 6]
[7 8 9]
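The same behavior is handy for cutting a dataset into mini-batches when the sample count is not a multiple of the batch count. This is a minimal sketch with assumed names (samples, batches, sizes), showing how the leftover elements go to the leading sub-arrays.

```python
import numpy as np

samples = np.arange(10)  # 10 hypothetical sample indices

# Request 4 batches; 10 % 4 == 2, so the first two batches get one extra element.
batches = np.array_split(samples, 4)
sizes = [len(b) for b in batches]
print(sizes)  # [3, 3, 2, 2]
```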
3. Horizontal Splitting Using numpy.hsplit()
The numpy.hsplit() function is a convenience function for splitting an array along the horizontal axis (axis 1). For arrays with two or more dimensions, this is equivalent to using numpy.split() with axis=1; a 1D array is split along axis 0 instead. It's particularly useful for splitting 2D arrays into groups of columns.
Syntax:
numpy.hsplit(array, indices_or_sections)
array: The input NumPy array to be split.
indices_or_sections: Similar to numpy.split(), this can be an integer specifying the number of sub-arrays or a list of indices at which to split horizontally.
Example 4: Splitting Horizontally
This example splits a 2D array into 2 equal parts horizontally.
import numpy as np
arr = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8]])
split_arr = np.hsplit(arr, 2)
print("Original Array:")
print(arr)
print("\nSplit into 2 equal parts along axis 1 (horizontally):")
for sub_arr in split_arr:
    print(sub_arr)
Output:
Original Array:
[[1 2 3 4]
 [5 6 7 8]]

Split into 2 equal parts along axis 1 (horizontally):
[[1 2]
 [5 6]]
[[3 4]
 [7 8]]
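A common use of horizontal splitting in data preparation is separating feature columns from a label column. The table below is invented for illustration, and the column layout (three features followed by one label) is an assumption of this sketch.

```python
import numpy as np

# Hypothetical table: 3 samples, 3 feature columns followed by 1 label column.
table = np.array([[5.1, 3.5, 1.4, 0],
                  [4.9, 3.0, 1.4, 0],
                  [6.3, 3.3, 6.0, 1]])

# Split before column index 3: columns 0-2 are features, column 3 is the label.
features, labels = np.hsplit(table, [3])
print(features.shape, labels.shape)  # (3, 3) (3, 1)

# The same result via np.split with axis=1.
f2, l2 = np.split(table, [3], axis=1)
print(np.array_equal(features, f2) and np.array_equal(labels, l2))  # True
```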
4. Vertical Splitting Using numpy.vsplit()
The numpy.vsplit() function is a convenience function for splitting an array along the vertical axis (axis 0). This is equivalent to using numpy.split() with axis=0, and it requires an array with at least two dimensions. It's particularly useful for splitting 2D arrays into groups of rows.
Syntax:
numpy.vsplit(array, indices_or_sections)
array: The input NumPy array to be split.
indices_or_sections: Similar to numpy.split(), this can be an integer specifying the number of sub-arrays or a list of indices at which to split vertically.
Example 5: Splitting Vertically
This example splits a 2D array into 3 equal parts vertically.
import numpy as np
arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])
split_arr = np.vsplit(arr, 3)
print("Original Array:")
print(arr)
print("\nSplit into 3 equal parts along axis 0 (vertically):")
for sub_arr in split_arr:
    print(sub_arr)
Output:
Original Array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]

Split into 3 equal parts along axis 0 (vertically):
[[1 2 3]]
[[4 5 6]]
[[7 8 9]]
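vsplit() also accepts a list of row indices, and it rejects 1D input because there is no row axis to split. The following sketch uses an arbitrary 4x3 array to show both points along with the equivalence to numpy.split() on axis 0; the variable names are illustrative.

```python
import numpy as np

arr = np.arange(12).reshape(4, 3)

# Split before row 3: the first three rows, then the last row.
top, bottom = np.vsplit(arr, [3])
print(top.shape, bottom.shape)  # (3, 3) (1, 3)

# Same as np.split along axis 0.
same = all(np.array_equal(a, b)
           for a, b in zip(np.vsplit(arr, [3]), np.split(arr, [3], axis=0)))
print(same)  # True

# A 1D array has fewer than two dimensions, so vsplit raises ValueError.
try:
    np.vsplit(np.arange(4), 2)
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```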
5. Depth Splitting Using numpy.dsplit()
The numpy.dsplit() function splits an array along the depth axis, which is the third axis (axis 2). This is equivalent to using numpy.split() with axis=2, and it requires an array with at least three dimensions.
Syntax:
numpy.dsplit(array, indices_or_sections)
array: The input NumPy array to be split. It must have at least three dimensions.
indices_or_sections: Similar to numpy.split(), this can be an integer specifying the number of sub-arrays or a list of indices at which to split along the depth.
Example 6: Splitting in Depth (3D Array)
This example splits a 3D array into 4 equal parts along the depth (axis 2).
import numpy as np
arr = np.arange(24).reshape((2, 3, 4))
split_arr = np.dsplit(arr, 4)
print("Original Array:")
print(arr)
print("\nSplit into 4 equal parts along axis 2 (depth):")
for sub_arr in split_arr:
    print(sub_arr)
    print()  # Add a blank line for better readability between splits
Output:
Original Array:
[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]

Split into 4 equal parts along axis 2 (depth):
[[[ 0]
  [ 4]
  [ 8]]

 [[12]
  [16]
  [20]]]

[[[ 1]
  [ 5]
  [ 9]]

 [[13]
  [17]
  [21]]]

[[[ 2]
  [ 6]
  [10]]

 [[14]
  [18]
  [22]]]

[[[ 3]
  [ 7]
  [11]]

 [[15]
  [19]
  [23]]]
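One place depth splitting shows up in practice is separating the color channels of an image stored as (height, width, channels). The tiny array below is a stand-in for a real image, and the channel order and the names red, green, and blue are assumptions of this sketch.

```python
import numpy as np

# Hypothetical 2x2 RGB image stored as (height, width, channels).
image = np.arange(12).reshape(2, 2, 3)

# Three equal sections along axis 2, one per channel.
red, green, blue = np.dsplit(image, 3)
print(red.shape)  # (2, 2, 1)

# Drop the trailing length-1 axis to get a plain 2D channel.
red_2d = red.squeeze(axis=2)
print(red_2d)
```

Each piece keeps a trailing axis of length 1, so squeezing that axis is a typical follow-up step when a 2D array is needed.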
Conclusion
Array splitting is a fundamental operation in NumPy for efficient data manipulation, preprocessing, and analysis. By leveraging functions like split, array_split, hsplit, vsplit, and dsplit, you can effectively partition arrays along any axis according to your specific needs.