Using .loc for Label-Based Selection
The .loc accessor is your go-to method when selecting rows based on their labels (index values). It’s intuitive and highly versatile.
import pandas as pd

data = {'col1': [1, 2, 3, 4, 5],
        'col2': [6, 7, 8, 9, 10]}
df = pd.DataFrame(data, index=['A', 'B', 'C', 'D', 'E'])
print(df.loc['A'])              # Single label: returns the row as a Series
print(df.loc[['A', 'C', 'E']])  # List of labels: returns a DataFrame
print(df.loc['B':'D'])          # Label slice: inclusive of 'D'
.loc also allows for boolean indexing, enabling powerful row selection based on conditions.
print(df.loc[df['col1'] > 2])
print(df.loc[(df['col1'] > 2) & (df['col2'] < 9)])
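To make the inclusive slice and the boolean mask concrete, here is a minimal sketch that rebuilds the same df so it runs on its own. The names sliced, mask, and subset are illustrative only; the second argument to .loc, which restricts the columns, is an addition not shown in the examples above.

```python
import pandas as pd

df = pd.DataFrame(
    {'col1': [1, 2, 3, 4, 5], 'col2': [6, 7, 8, 9, 10]},
    index=['A', 'B', 'C', 'D', 'E'],
)

# Label slices include both endpoints, so this returns rows B, C and D.
sliced = df.loc['B':'D']

# A boolean mask picks the rows; the second argument restricts the columns.
mask = (df['col1'] > 2) & (df['col2'] < 9)
subset = df.loc[mask, 'col2']

print(sliced)
print(subset)
```

Wrapping each comparison in parentheses matters here, because & binds more tightly than > and < in Python.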
Using .iloc for Integer-Based Selection
.iloc provides integer-based indexing, allowing you to select rows based on their position in the DataFrame. This is particularly useful when you don’t rely on the DataFrame’s index.
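print(df.iloc[0])    # First row, returned as a Series
print(df.iloc[1:4])  # Rows at positions 1 to 3; the end position is excluded
print(df.iloc[::2])  # Every second row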
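Similar to .loc, .iloc can be combined with boolean indexing for more complex selection. However, the mask must be a plain boolean list or NumPy array with one entry per row position. A boolean Series that carries its own index is rejected by .iloc, so convert it first (for example with .to_numpy()) or use .loc instead.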
boolean_array = [False, True, False, True, False]
print(df.iloc[boolean_array])
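When the mask comes from a condition rather than a hand-written list, it is a labelled Series, which .iloc does not accept as-is. The sketch below shows one way to handle that, assuming the same df as above; condition, by_position, and by_label are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    {'col1': [1, 2, 3, 4, 5], 'col2': [6, 7, 8, 9, 10]},
    index=['A', 'B', 'C', 'D', 'E'],
)

# A comparison produces a boolean Series indexed by label.
condition = df['col1'] > 2

# .iloc works with a plain boolean array, so drop the index first.
by_position = df.iloc[condition.to_numpy()]

# .loc accepts the labelled Series directly and selects the same rows.
by_label = df.loc[condition]

print(by_position.equals(by_label))
```

In practice, .loc is usually the simpler choice for condition-based filtering; the .iloc form is mainly useful when the mask is already a positional array.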
Using .at and .iat for Single Element Selection
For accessing a single element (one row and one column), .at (label-based) and .iat (integer-based) are the fastest option. .loc and .iloc return the same value, but they carry extra overhead because they must also handle slices, lists, and masks. Prefer .at and .iat when scalar access happens repeatedly, such as inside a loop.
print(df.at['B', 'col1'])
print(df.iat[2, 0])
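The following sketch checks that .at and .loc return the same scalar, shows single-cell assignment with .iat, and gives a rough way to compare read speed on your own machine. It assumes the same df as above; the repetition count n is arbitrary, and the timings will vary with hardware and pandas version.

```python
import timeit

import pandas as pd

df = pd.DataFrame(
    {'col1': [1, 2, 3, 4, 5], 'col2': [6, 7, 8, 9, 10]},
    index=['A', 'B', 'C', 'D', 'E'],
)

# Both accessors return the same scalar for a single row/column pair.
value_at = df.at['B', 'col1']
value_loc = df.loc['B', 'col1']

# .at and .iat also support assignment to a single cell.
df.iat[2, 0] = 30

# Rough timing of repeated scalar reads (results depend on the machine).
n = 10_000
t_at = timeit.timeit(lambda: df.at['B', 'col1'], number=n)
t_loc = timeit.timeit(lambda: df.loc['B', 'col1'], number=n)
print(f".at: {t_at:.4f}s  .loc: {t_loc:.4f}s for {n} reads")
```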
Filtering with query() for Readable Code
The query() method offers a highly readable way to filter rows based on conditions, especially with complex queries.
print(df.query('col1 > 2'))
print(df.query('col1 > 2 and col2 < 9'))
This method keeps complex filters readable, which makes them easier to understand and maintain. Note that column names containing spaces must be wrapped in backticks within the query string, for example `col 1` > 2.
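As an illustration of those quoting rules, here is a small sketch with a hypothetical sales DataFrame (the column names and values are invented for the example). It also uses the @ prefix, which lets a query string refer to a local Python variable.

```python
import pandas as pd

# Hypothetical data: one column name contains a space.
sales = pd.DataFrame({'unit price': [5, 12, 20], 'qty': [10, 3, 1]})

threshold = 10

# Backticks quote the column name; @threshold reads the local variable.
expensive = sales.query('`unit price` > @threshold and qty < 5')

print(expensive)
```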
Together, these techniques cover most row-selection needs in Pandas: .loc for labels and conditions, .iloc for positions, .at and .iat for single values, and query() for readable filters. Experiment with these methods to find the approach best suited to your specific needs and data structure.