Unique Elements: np.unique()
Often, you need to identify the unique elements within a NumPy array. The np.unique()
function simplifies this process considerably. It returns a sorted array containing only the unique values.
import numpy as np
arr = np.array([1, 2, 2, 3, 4, 4, 5, 1])
unique_elements = np.unique(arr)
print(unique_elements) # Output: [1 2 3 4 5]
np.unique() can also return the index of the first occurrence of each unique value in the original array when you pass return_index=True. This is helpful when you need to know the original positions of the unique values.
arr = np.array([1, 2, 2, 3, 4, 4, 5, 1])
unique_elements, indices = np.unique(arr, return_index=True)
print(unique_elements) # Output: [1 2 3 4 5]
print(indices) # Output: [0 1 3 4 6]
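One practical use of these indices, sketched here with a made-up input array, is removing duplicates while keeping the values in their order of first appearance rather than in sorted order. The variable name unique_in_order is only illustrative.

```python
import numpy as np

# Hypothetical input whose first appearances are not in sorted order
arr = np.array([3, 1, 3, 2, 1, 5])

# indices holds the position of the first occurrence of each unique value
unique_elements, indices = np.unique(arr, return_index=True)

# Sorting the positions and indexing back into arr restores the original order
unique_in_order = arr[np.sort(indices)]

print(unique_elements)  # [1 2 3 5]
print(unique_in_order)  # [3 1 2 5]
```

Sorting the indices works because each one points at a first occurrence, so ascending positions correspond to the order in which the values first appear.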
You can also get the count of each unique element using the return_counts argument.
arr = np.array([1, 2, 2, 3, 4, 4, 5, 1])
unique_elements, counts = np.unique(arr, return_counts=True)
print(unique_elements) # Output: [1 2 3 4 5]
print(counts) # Output: [2 2 1 2 1]
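As a small sketch of what the counts can be used for, the snippet below builds a frequency table and picks out the most common value. The input array and the names frequency and most_common are assumptions made for illustration. Note that np.argmax returns the first maximum, so a tie would resolve to the smallest value.

```python
import numpy as np

# Hypothetical input with a single most frequent value (4 appears three times)
arr = np.array([1, 2, 2, 3, 4, 4, 4, 5])

unique_elements, counts = np.unique(arr, return_counts=True)

# Pair each unique value with its count to get a frequency table
frequency = dict(zip(unique_elements.tolist(), counts.tolist()))

# The position of the largest count selects the most common value
most_common = unique_elements[np.argmax(counts)]

print(frequency)    # {1: 1, 2: 2, 3: 1, 4: 3, 5: 1}
print(most_common)  # 4
```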
Set Operations: np.intersect1d(), np.union1d(), np.setdiff1d(), np.setxor1d()
NumPy provides functions mirroring standard set operations. Each one returns a sorted array of unique values:
np.intersect1d(arr1, arr2): Returns the common elements between two arrays.
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([3, 5, 6, 7, 8])
intersection = np.intersect1d(arr1, arr2)
print(intersection) # Output: [3 5]
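If you also need to know where the common values sit in each input, np.intersect1d() accepts a return_indices argument (available in NumPy 1.15 and later, as far as I know). The sketch below reuses the arrays from the example above. The names common, idx1, and idx2 are illustrative.

```python
import numpy as np

arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([3, 5, 6, 7, 8])

# return_indices=True also returns the first-occurrence positions of the
# common values in each of the two input arrays
common, idx1, idx2 = np.intersect1d(arr1, arr2, return_indices=True)

print(common)  # [3 5]
print(idx1)    # [2 4]
print(idx2)    # [0 1]
```

Indexing each array with its returned positions, arr1[idx1] and arr2[idx2], gives back the common values.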
np.union1d(arr1, arr2): Returns the union of two arrays (all unique elements from both).
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([3, 5, 6, 7, 8])
union = np.union1d(arr1, arr2)
print(union) # Output: [1 2 3 4 5 6 7 8]
np.setdiff1d(arr1, arr2): Returns the elements in arr1 that are not in arr2.
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([3, 5, 6, 7, 8])
difference = np.setdiff1d(arr1, arr2)
print(difference) # Output: [1 2 4]
np.setxor1d(arr1, arr2): Returns the elements that are in either arr1 or arr2, but not both (symmetric difference).
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([3, 5, 6, 7, 8])
symmetric_difference = np.setxor1d(arr1, arr2)
print(symmetric_difference) # Output: [1 2 4 6 7 8]
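To see how these operations relate to each other, the following sketch rebuilds the symmetric difference from two set differences and a union, then checks it against np.setxor1d(). The names only_in_arr1, only_in_arr2, and rebuilt are illustrative.

```python
import numpy as np

arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([3, 5, 6, 7, 8])

# Elements unique to each side
only_in_arr1 = np.setdiff1d(arr1, arr2)  # [1 2 4]
only_in_arr2 = np.setdiff1d(arr2, arr1)  # [6 7 8]

# Their union is the symmetric difference
rebuilt = np.union1d(only_in_arr1, only_in_arr2)

print(rebuilt)                                            # [1 2 4 6 7 8]
print(np.array_equal(rebuilt, np.setxor1d(arr1, arr2)))   # True
```

Note that np.setdiff1d() is not symmetric: swapping the arguments changes the result.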
These functions offer a concise way to perform set operations on NumPy arrays without writing explicit Python loops, which keeps your code short and is usually faster on large arrays. They are useful tools for data cleaning, analysis, and manipulation tasks.