Understanding Boolean Indexing
Boolean indexing uses Boolean arrays (arrays containing only True and False values) to filter elements from another array. The Boolean array must either have the same shape as the array being indexed or match the shape of its leading dimensions; a one-dimensional mask of length n, for example, can select rows from an array of shape (n, m). A mask of the wrong size is not stretched by broadcasting; it raises an IndexError.
Let’s illustrate with a simple example:
import numpy as np
arr = np.array([10, 20, 30, 40, 50])
bool_index = np.array([True, False, True, False, True])
result = arr[bool_index]
print(result) # Output: [10 30 50]
In this example, bool_index selects the elements of arr whose corresponding mask value is True.
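As a quick check of the shape requirement, the sketch below indexes the same array with a mask of the right length and then with one that is too short. The names good_mask, bad_mask, and mismatch_rejected are illustrative only.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# A mask with the same length as arr selects the True positions.
good_mask = np.array([True, False, True, False, True])
selected = arr[good_mask]
print(selected)  # [10 30 50]

# A mask of a different length is rejected rather than stretched to fit.
bad_mask = np.array([True, False, True])
try:
    arr[bad_mask]
    mismatch_rejected = False
except IndexError as exc:
    mismatch_rejected = True
    print("IndexError:", exc)
```

Catching the IndexError here is only for demonstration; in real code a shape mismatch usually signals a bug in how the mask was built.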
Creating Boolean Arrays with Comparison Operators
The power of Boolean indexing truly shines when you generate the Boolean array dynamically using comparison operators. These operators directly compare array elements against a value or another array, resulting in a Boolean array:
arr = np.array([10, 20, 30, 40, 50])
greater_than_25 = arr > 25
print(greater_than_25) # Output: [False False  True  True  True]
result = arr[greater_than_25]
print(result) # Output: [30 40 50]
within_range = (arr >= 20) & (arr <= 40) # Combining conditions with & (and) and | (or)
print(within_range) # Output: [False  True  True  True False]
result = arr[within_range]
print(result) # Output: [20 30 40]
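This demonstrates how to select subsets of your data based on specific criteria. Note the use of & (element-wise AND) and | (element-wise OR) to combine multiple conditions. Each condition needs its own parentheses because & and | bind more tightly than comparison operators, and Python’s and/or keywords do not work element by element on arrays.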
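The following sketch rounds out the operators: it uses | for OR, ~ for NOT (not shown in the example above), and demonstrates what happens when the parentheses around each comparison are left out. The variable names are illustrative.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# OR: keep values at either extreme.
extremes = arr[(arr < 20) | (arr > 40)]
print(extremes)  # [10 50]

# NOT: ~ inverts a Boolean mask element by element.
outside_range = arr[~((arr >= 20) & (arr <= 40))]
print(outside_range)  # [10 50]

# Without parentheses, & is applied before the comparisons, so the
# expression becomes a chained comparison that needs a single truth
# value for a whole array, which NumPy refuses to provide.
try:
    arr >= 20 & arr <= 40
    unparenthesized_failed = False
except ValueError as exc:
    unparenthesized_failed = True
    print("ValueError:", exc)
```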
Multi-Dimensional Boolean Indexing
Boolean indexing isn’t limited to one-dimensional arrays. It extends seamlessly to multi-dimensional arrays:
arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
greater_than_5 = arr_2d > 5
print(greater_than_5)
# Output:
# [[False False False]
#  [False False  True]
#  [ True  True  True]]
result = arr_2d[greater_than_5]
print(result) # Output: [6 7 8 9]
# Selecting rows based on a condition on a column
row_condition = arr_2d[:, 0] > 3
print(row_condition) # Output: [False  True  True]
result = arr_2d[row_condition, :]
print(result)
# Output:
# [[4 5 6]
#  [7 8 9]]
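Two behaviors are at work here. Indexing with a Boolean array of the same shape as arr_2d returns the matching elements as a flattened one-dimensional array. Indexing the first axis with a one-dimensional Boolean array instead keeps whole rows, so the result stays two-dimensional.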
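The same idea applies along the second axis. The sketch below selects columns with a one-dimensional mask and confirms that a full-shape mask produces a one-dimensional result; col_condition, odd_cols, and evens are illustrative names.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Keep the columns whose first-row value is odd.
col_condition = arr_2d[0, :] % 2 == 1
print(col_condition)  # [ True False  True]

odd_cols = arr_2d[:, col_condition]
print(odd_cols)
# [[1 3]
#  [4 6]
#  [7 9]]

# A mask with the full shape of arr_2d flattens the result to 1-D.
evens = arr_2d[arr_2d % 2 == 0]
print(evens)        # [2 4 6 8]
print(evens.shape)  # (4,)
```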
Modifying Arrays with Boolean Indexing
Boolean indexing isn’t just for selection; you can also use it to modify array elements.
arr = np.array([10, 20, 30, 40, 50])
arr[arr > 25] = 100
print(arr) # Output: [ 10  20 100 100 100]
The assignment modifies arr in place: every element for which the condition is True is overwritten, and all other elements are left untouched.
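Masked assignment also works with augmented operators, and it is worth contrasting with masked selection, which returns a copy. The sketch below uses made-up data (readings and scores are hypothetical) to show both points.

```python
import numpy as np

# Hypothetical sensor readings; negative values are treated as invalid.
readings = np.array([3.5, -1.0, 7.2, -4.8, 0.0])
readings[readings < 0] = 0.0  # overwrite only the negative entries

# Augmented assignment updates just the selected elements.
scores = np.array([10, 20, 30, 40, 50])
scores[scores > 25] += 5
print(scores)  # [10 20 35 45 55]

# Selecting with a mask returns a copy, so changing the selection
# does not write back to the original array.
subset = scores[scores > 25]
subset[:] = 0
print(scores)  # [10 20 35 45 55]
```

To change the original array, put the mask on the left-hand side of the assignment, as in the first two statements.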
Advanced Techniques: np.where()
NumPy’s np.where() function provides a concise way to build a new array by choosing between two values element by element:
arr = np.array([1, 2, 3, 4, 5])
arr_new = np.where(arr > 2, 100, arr)
print(arr_new) # Output: [  1   2 100 100 100]
np.where(condition, x, y) returns a new array that takes its values from x where the condition is true and from y where it’s false; the original array is left unchanged. This adds another layer of flexibility to your Boolean indexing operations.
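Two further uses of np.where() are sketched below: passing arrays for both branches, and calling it with only a condition, in which case it returns a tuple of index arrays (one per dimension) for the True positions. The variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# The three-argument form builds a new array; arr itself is untouched.
capped = np.where(arr > 2, 100, arr)
print(arr)  # [1 2 3 4 5]

# Both branches can be arrays computed from the input.
doubled_or_negated = np.where(arr % 2 == 0, arr * 2, -arr)
print(doubled_or_negated)  # [-1  4 -3  8 -5]

# With only a condition, np.where returns a tuple of index arrays.
(indices,) = np.where(arr > 2)
print(indices)  # [2 3 4]
```

Use the three-argument form when you want a modified copy and masked assignment when you want to update the array in place.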