Basic Slicing: Selecting Rows and Columns
The simplest form of slicing uses bracket notation ([]
). To select rows, you specify the row indices (or labels) you want. To select columns, you specify the column names.
import pandas as pd
= {'col1': [1, 2, 3, 4, 5],
data 'col2': [6, 7, 8, 9, 10],
'col3': [11, 12, 13, 14, 15]}
= pd.DataFrame(data)
df
print(df[1:3])
print(df[['col1', 'col2']])
print(df['col1'][0]) # Accesses the first element of 'col1'
Using .loc
for Label-Based Slicing
The .loc
accessor allows you to select rows and columns using labels. This is particularly useful when your DataFrame has custom index labels.
= ['A', 'B', 'C', 'D', 'E']
df.index
print(df.loc[['B', 'D']])
print(df.loc['B':'D'])
print(df.loc[['B', 'D'], ['col1', 'col3']])
Using .iloc
for Integer-Based Slicing
The .iloc
accessor uses integer-based indexing, mirroring standard Python list slicing. This provides a more direct way to access data by position.
print(df.iloc[:3])
print(df.iloc[[1, 3], [0, 2]])
#Select every other row starting from the first row.
print(df.iloc[::2])
Boolean Indexing: Conditional Selection
Boolean indexing allows you to select rows based on a condition. This is incredibly useful for filtering data based on specific criteria.
print(df[df['col1'] > 2])
#Select rows where col1 is greater than 2 and col2 is less than 9
print(df[(df['col1'] > 2) & (df['col2'] < 9)])
print(df[df['col1'].isin([1, 5])])
Combining Slicing Techniques
You can combine these methods for more complex selections. For instance, you can slice rows using .iloc
and then select specific columns by name. The flexibility and power of these techniques are crucial for effectively working with DataFrames.
print(df.iloc[:2]['col2'])