NumPy Array Splitting for ML: Split, Hsplit, Vsplit

Master NumPy array splitting with split, array_split, hsplit, vsplit, and dsplit. Essential techniques for data preparation in machine learning and AI.

Splitting Arrays in NumPy

NumPy provides powerful tools to divide a single array into multiple sub-arrays. This capability is essential for organizing data, preparing datasets for machine learning models, and performing computations on specific partitions of your data.

NumPy offers several functions for array splitting:

  • numpy.split()
  • numpy.array_split()
  • numpy.hsplit()
  • numpy.vsplit()
  • numpy.dsplit()

1. Splitting Arrays Using numpy.split()

The numpy.split() function divides an array along a given axis, either into equal-sized sub-arrays or at specified indices, and returns the sub-arrays as a list.

Syntax:

numpy.split(array, indices_or_sections, axis=0)
  • array: The input NumPy array to be split.
  • indices_or_sections:
    • If an integer N, the array is divided into N equal-sized sub-arrays. The length of the array along the specified axis must be evenly divisible by N; otherwise a ValueError is raised.
    • If a 1D array of sorted integers, the array is split at these indices. For example, [1, 2] along axis 0 yields arr[:1], arr[1:2], and arr[2:].
  • axis: The axis along which to split the array. Defaults to 0 (the rows of a 2D array).
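The two forms of indices_or_sections described above can be checked with a short sketch. The array and the chosen indices here are illustrative only; the snippet shows that an index list corresponds to plain slices and that an integer count which does not divide the axis length raises ValueError.

```python
import numpy as np

arr = np.arange(6)

# An index list [2, 5] produces the slices arr[:2], arr[2:5], and arr[5:].
parts = np.split(arr, [2, 5])
matches_slices = all(
    np.array_equal(p, s) for p, s in zip(parts, (arr[:2], arr[2:5], arr[5:]))
)

# An integer section count must divide the axis length exactly.
try:
    np.split(arr, 4)  # 6 elements cannot form 4 equal parts
    raised = False
except ValueError:
    raised = True

print([p.tolist() for p in parts])  # [[0, 1], [2, 3, 4], [5]]
print(matches_slices)               # True
print(raised)                       # True
```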

Example 1: Splitting into Equal-sized Sub-arrays

This example splits a 2D array into 3 equal parts along axis=1 (columns).

import numpy as np

arr = np.arange(9).reshape(3, 3)
split_arr = np.split(arr, 3, axis=1)

print("Original Array:")
print(arr)
print("\nSplit into 3 equal sub-arrays along axis 1:")
for sub_arr in split_arr:
    print(sub_arr)

Output:

Original Array:
[[0 1 2]
 [3 4 5]
 [6 7 8]]

Split into 3 equal sub-arrays along axis 1:
[[0]
 [3]
 [6]]
[[1]
 [4]
 [7]]
[[2]
 [5]
 [8]]

Example 2: Splitting at Specific Indices

This example splits a 2D array at specific row indices ([1, 2]) along axis=0 (rows).

import numpy as np

arr = np.arange(9).reshape(3, 3)
split_arr = np.split(arr, [1, 2], axis=0)

print("Split at indices [1, 2] along axis 0:")
for sub_arr in split_arr:
    print(sub_arr)

Output:

Split at indices [1, 2] along axis 0:
[[0 1 2]]
[[3 4 5]]
[[6 7 8]]
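
One detail that is easy to miss in the examples above: the sub-arrays returned by numpy.split() are views into the original array, not copies. The following sketch, using an illustrative 1D array, shows that writing to a piece changes the original and that .copy() is one way to decouple a piece.

```python
import numpy as np

arr = np.arange(6)
first, second = np.split(arr, 2)

# The pieces share memory with the original array.
shares = np.shares_memory(arr, first)
print(shares)  # True

# Writing through a piece is visible in the original.
first[0] = 99
print(arr.tolist())  # [99, 1, 2, 3, 4, 5]

# Copying a piece decouples it from the original.
independent = second.copy()
independent[0] = -1
print(arr.tolist())  # [99, 1, 2, 3, 4, 5]
```

This matters when a split piece is modified in place, for example during normalization of one partition of a dataset.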

2. Splitting Arrays Using numpy.array_split()

The numpy.array_split() function works like split(), except that an integer section count does not have to divide the axis length evenly. When it does not, the leading sub-arrays each receive one extra element, so the sub-array sizes differ by at most one.

Syntax:

numpy.array_split(array, indices_or_sections, axis=0)
  • array: The input NumPy array to be split.
  • indices_or_sections: Similar to numpy.split(), this can be an integer specifying the number of sub-arrays or a list of indices at which to split.
  • axis: The axis along which to split. Defaults to 0.
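
As a rough illustration of how the sizes are distributed: for an axis of length l split into n sections, array_split returns l % n sub-arrays of size l // n + 1 followed by the rest of size l // n. The helper expected_sizes below is a hypothetical name introduced only for this sketch; it is not part of NumPy.

```python
import numpy as np

def expected_sizes(length, sections):
    """Hypothetical helper: the first (length % sections) parts get one extra element."""
    base, extra = divmod(length, sections)
    return [base + 1] * extra + [base] * (sections - extra)

arr = np.arange(11)
parts = np.array_split(arr, 4)
sizes = [len(p) for p in parts]

print(sizes)                  # [3, 3, 3, 2]
print(expected_sizes(11, 4))  # [3, 3, 3, 2]
```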

Example 3: Splitting Unequally

This example splits a 1D array of 10 elements into 3 sub-arrays. Because 10 is not divisible by 3, the first sub-array gets 4 elements and the other two get 3 each.

import numpy as np

arr = np.arange(10)
split_arr = np.array_split(arr, 3)

print("Original Array:")
print(arr)
print("\nSplit into 3 unequal sub-arrays:")
for sub_arr in split_arr:
    print(sub_arr)

Output:

Original Array:
[0 1 2 3 4 5 6 7 8 9]

Split into 3 unequal sub-arrays:
[0 1 2 3]
[4 5 6]
[7 8 9]

3. Horizontal Splitting Using numpy.hsplit()

The numpy.hsplit() function is a convenience function for splitting an array horizontally (column-wise). For arrays with two or more dimensions, this is equivalent to using numpy.split() with axis=1; a 1D array is split along axis 0 instead. It's particularly useful for splitting 2D arrays into groups of columns.

Syntax:

numpy.hsplit(array, indices_or_sections)
  • array: The input NumPy array to be split.
  • indices_or_sections: Similar to numpy.split(), this can be an integer specifying the number of sub-arrays or a list of indices at which to split horizontally.
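
A common data-preparation use of the index-list form is separating feature columns from a label column. The sketch below assumes a made-up dataset whose last column holds the label; the values and the names features and labels are illustrative only.

```python
import numpy as np

# Hypothetical dataset: 4 samples, 3 feature columns followed by 1 label column.
data = np.array([[5.1, 3.5, 1.4, 0.0],
                 [4.9, 3.0, 1.4, 0.0],
                 [6.2, 2.9, 4.3, 1.0],
                 [5.9, 3.0, 5.1, 1.0]])

# Split before column index 3: columns 0-2 are features, column 3 is the label.
features, labels = np.hsplit(data, [3])

print(features.shape)           # (4, 3)
print(labels.shape)             # (4, 1)
print(labels.ravel().tolist())  # [0.0, 0.0, 1.0, 1.0]

# On a 1D array, hsplit splits along axis 0.
halves = np.hsplit(np.arange(6), 2)
print([h.tolist() for h in halves])  # [[0, 1, 2], [3, 4, 5]]
```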

Example 4: Splitting Horizontally

This example splits a 2D array into 2 equal parts horizontally.

import numpy as np

arr = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8]])
split_arr = np.hsplit(arr, 2)

print("Original Array:")
print(arr)
print("\nSplit into 2 equal parts along axis 1 (horizontally):")
for sub_arr in split_arr:
    print(sub_arr)

Output:

Original Array:
[[1 2 3 4]
 [5 6 7 8]]

Split into 2 equal parts along axis 1 (horizontally):
[[1 2]
 [5 6]]
[[3 4]
 [7 8]]

4. Vertical Splitting Using numpy.vsplit()

The numpy.vsplit() function is a convenience function for splitting an array vertically (row-wise), along axis 0. This is equivalent to using numpy.split() with axis=0, except that the array must have at least two dimensions. It's particularly useful for splitting 2D arrays into groups of rows.

Syntax:

numpy.vsplit(array, indices_or_sections)
  • array: The input NumPy array to be split.
  • indices_or_sections: Similar to numpy.split(), this can be an integer specifying the number of sub-arrays or a list of indices at which to split vertically.
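
Since rows usually correspond to samples, the index-list form of vsplit can carve a dataset into training, validation, and test partitions. The following is a minimal sketch under stated assumptions: the dataset is made up, the 60/20/20 proportions are arbitrary, and the rows are taken in their existing order (real pipelines typically shuffle first).

```python
import numpy as np

# Hypothetical dataset: 10 samples (rows) with 2 features each.
data = np.arange(20).reshape(10, 2)

# Row boundaries for a 60/20/20 split, computed with integer arithmetic.
n = data.shape[0]
train_end = n * 6 // 10  # 6
val_end = n * 8 // 10    # 8

# Rows 0-5 go to train, rows 6-7 to val, rows 8-9 to test.
train, val, test = np.vsplit(data, [train_end, val_end])

print(train.shape, val.shape, test.shape)  # (6, 2) (2, 2) (2, 2)
```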

Example 5: Splitting Vertically

This example splits a 2D array into 3 equal parts vertically.

import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])
split_arr = np.vsplit(arr, 3)

print("Original Array:")
print(arr)
print("\nSplit into 3 equal parts along axis 0 (vertically):")
for sub_arr in split_arr:
    print(sub_arr)

Output:

Original Array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]

Split into 3 equal parts along axis 0 (vertically):
[[1 2 3]]
[[4 5 6]]
[[7 8 9]]

5. Depth Splitting Using numpy.dsplit()

The numpy.dsplit() function splits an array along the depth axis, which is the third axis (axis 2). This is equivalent to using numpy.split() with axis=2, and it requires an array with three or more dimensions.

Syntax:

numpy.dsplit(array, indices_or_sections)
  • array: The input NumPy array to be split. It must have at least three dimensions.
  • indices_or_sections: Similar to numpy.split(), this can be an integer specifying the number of sub-arrays or a list of indices at which to split along the depth.
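
One place a depth split shows up in practice is separating the channels of an image stored as (height, width, channels). The sketch below assumes a tiny made-up 2x2 array standing in for an RGB image; the names red, green, and blue are illustrative.

```python
import numpy as np

# Hypothetical 2x2 "RGB image" stored as (height, width, channels).
image = np.arange(12).reshape(2, 2, 3)

# One section per channel; each piece keeps a trailing axis of length 1.
red, green, blue = np.dsplit(image, 3)

print(red.shape)              # (2, 2, 1)
print(red[..., 0].tolist())   # [[0, 3], [6, 9]]
print(blue[..., 0].tolist())  # [[2, 5], [8, 11]]

# dsplit gives the same pieces as split with axis=2.
same = all(
    np.array_equal(a, b)
    for a, b in zip((red, green, blue), np.split(image, 3, axis=2))
)
print(same)  # True
```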

Example 6: Splitting in Depth (3D Array)

This example splits a 3D array into 4 equal parts along the depth (axis 2).

import numpy as np

arr = np.arange(24).reshape((2, 3, 4))
split_arr = np.dsplit(arr, 4)

print("Original Array:")
print(arr)
print("\nSplit into 4 equal parts along axis 2 (depth):")
for sub_arr in split_arr:
    print(sub_arr)
    print()  # Blank line between sub-arrays for readability

Output:

Original Array:
[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]

Split into 4 equal parts along axis 2 (depth):
[[[ 0]
  [ 4]
  [ 8]]

 [[12]
  [16]
  [20]]]

[[[ 1]
  [ 5]
  [ 9]]

 [[13]
  [17]
  [21]]]

[[[ 2]
  [ 6]
  [10]]

 [[14]
  [18]
  [22]]]

[[[ 3]
  [ 7]
  [11]]

 [[15]
  [19]
  [23]]]

Conclusion

Array splitting is a fundamental operation in NumPy for efficient data manipulation, preprocessing, and analysis. By leveraging functions like split, array_split, hsplit, vsplit, and dsplit, you can effectively partition arrays along any axis according to your specific needs.

