DataFrame iloc

pandas
Published

August 14, 2024

Understanding .iloc

.iloc stands for “integer location” and lets you access data within a DataFrame by integer position. Unlike .loc, which selects by label, .iloc selects rows and columns by where they sit in the DataFrame, regardless of what the index or column labels are. This makes it particularly useful when you know the exact row and column positions you need. Positions are zero-based: the first row is at position 0, the second at position 1, and so on.
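The difference from .loc is easiest to see when the index labels are not simply 0, 1, 2. The following is a minimal sketch, assuming a hypothetical DataFrame named prices whose integer labels were deliberately chosen to differ from the row positions:

```python
import pandas as pd

# Hypothetical data: index labels (10, 20, 30) differ from positions (0, 1, 2)
prices = pd.DataFrame({"price": [1.5, 2.5, 3.5]}, index=[10, 20, 30])

by_position = prices.iloc[0, 0]       # first row, first column, by position
by_label = prices.loc[10, "price"]    # row labelled 10, column labelled "price"
print(by_position, by_label)          # both point at the same cell

# There is no position 10 in a three-row DataFrame, so .iloc raises IndexError
try:
    prices.iloc[10]
except IndexError as exc:
    print("IndexError:", exc)
```

Here .iloc[0] and .loc[10] reach the same row, while .iloc[10] fails because the DataFrame has only three positions.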

Basic Usage: Selecting Single Elements

The simplest application of .iloc is selecting a single element. Let’s create a sample DataFrame:

import pandas as pd

data = {'col1': [1, 2, 3], 'col2': [4, 5, 6], 'col3': [7, 8, 9]}
df = pd.DataFrame(data)
print(df)

To access the element in the first row (index 0) and second column (index 1), we use:

element = df.iloc[0, 1]
print(element)  # Output: 4

Selecting Rows and Columns: Slicing

.iloc excels at selecting ranges of rows and columns using slicing:

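# Rows at positions 0 and 1, all columns
first_two_rows = df.iloc[:2, :]
print(first_two_rows)

# All rows, columns at positions 0 and 1
first_two_cols = df.iloc[:, :2]
print(first_two_cols)

# Rows at positions 1 and 2, columns at positions 0 and 2
specific_selection = df.iloc[1:3, [0, 2]]
print(specific_selection)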

Slices (:) select contiguous ranges of positions. As with Python lists, the upper bound of an .iloc slice is exclusive, so 1:3 selects positions 1 and 2. This differs from .loc, where a label slice includes both endpoints.
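Two further behaviours of positional slicing are worth knowing: negative positions count from the end, and a slice that runs past the end is truncated instead of raising an error. A short sketch, rebuilding the sample DataFrame so the block runs on its own:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6], 'col3': [7, 8, 9]})

# Negative positions count from the end, as with Python lists
last_row = df.iloc[-1]
last_two_rows = df.iloc[-2:]

# A slice that runs past the end is truncated rather than raising
clipped = df.iloc[1:10]
print(len(clipped))  # 2

# A single out-of-range integer, by contrast, raises IndexError
try:
    df.iloc[10]
except IndexError:
    print("position 10 is out of bounds")
```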

Selecting Specific Rows and Columns with Lists

You can also select specific rows and columns by passing lists of integer indices to .iloc:

selected_rows_cols = df.iloc[[0, 2], [1, 2]]
print(selected_rows_cols)

This method allows non-contiguous selections, and the result follows the order of the positions you pass.
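Because the result follows the order of the lists, the same syntax can also reorder rows and columns. A small sketch on the sample DataFrame (the variable name reordered is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6], 'col3': [7, 8, 9]})

# Row position 2 comes first, then position 0; likewise for the columns
reordered = df.iloc[[2, 0], [2, 0]]

print(list(reordered.columns))    # ['col3', 'col1']
print(reordered.values.tolist())  # [[9, 3], [7, 1]]
```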

Handling Multiple Conditions

.iloc accepts a boolean array but, unlike .loc, it does not accept a boolean Series. A simple way to combine a condition with positional selection is to filter the DataFrame with a boolean mask first and then apply .iloc to the result. For instance:

rows_to_select = df['col1'] > 1
selected_data = df[rows_to_select].iloc[:, 1:3]
print(selected_data)

This example first filters the DataFrame based on a condition and then uses .iloc to select specific columns from the filtered result.
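The same idea extends to several conditions, and the mask can be passed to .iloc directly once it is converted to a plain NumPy array. A minimal sketch, assuming the sample DataFrame from above (mask and selected are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6], 'col3': [7, 8, 9]})

# Combine conditions with & (and) or | (or); each needs its own parentheses
mask = (df['col1'] > 1) & (df['col3'] < 9)

# .iloc raises an error for a boolean Series, so pass a NumPy array instead
selected = df.iloc[mask.to_numpy(), 1:3]
print(selected)
```

Only the row at position 1 satisfies both conditions, so the result contains col2 and col3 for that single row.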

Working with Single Rows and Columns

Accessing entire rows or columns is straightforward:

first_row = df.iloc[0]
print(first_row)

second_col = df.iloc[:, 1]
print(second_col)

Both selections return a Series: the first row is indexed by the column names, and the second column is indexed by the row labels.
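If you would rather get a DataFrame back, wrap the position in a list. A brief sketch comparing the two forms on the sample DataFrame:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6], 'col3': [7, 8, 9]})

row_as_series = df.iloc[0]       # scalar position: returns a Series
row_as_frame = df.iloc[[0]]      # list of one position: one-row DataFrame
col_as_frame = df.iloc[:, [1]]   # one-column DataFrame

print(type(row_as_series).__name__)  # Series
print(row_as_frame.shape)            # (1, 3)
print(col_as_frame.shape)            # (3, 1)
```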

Modifying Data with .iloc

.iloc isn’t just for reading; you can also modify the DataFrame using it:

df.iloc[1, 0] = 10
print(df)

# Modify a whole column
df.iloc[:, 0] = [100, 200, 300]
print(df)

This updates the DataFrame in place, whether you target a single element, a row, or a column. Because the original values are overwritten, work on a copy (df.copy()) if you may need them later.
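One way to keep the original values is to take a copy before assigning. The sketch below uses an illustrative variable named backup and also shows that a single scalar can be assigned to a whole block of cells:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6], 'col3': [7, 8, 9]})

# DataFrame.copy() makes a deep copy by default, so backup is unaffected below
backup = df.copy()

df.iloc[1, 0] = 10
print(df.iloc[1, 0], backup.iloc[1, 0])  # 10 2

# A scalar is broadcast to every cell in the selected block
df.iloc[:2, 1:] = 0
print(df)
```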