Method 1: Direct Assignment
The simplest way to add a new column is by assigning a list, array, or Series to a new column name. The length of the assigned data must match the number of rows in your DataFrame.
import pandas as pd
= {'col1': [1, 2, 3], 'col2': [4, 5, 6]}
data = pd.DataFrame(data)
df print("Original DataFrame:\n", df)
'col3'] = [7, 8, 9]
df[print("\nDataFrame after adding 'col3':\n", df)
#Adding a column from a list of the same size
'col4'] = [10,11,12]
df[print("\nDataFrame after adding 'col4':\n", df)
#Adding a new column with a scalar value (same value for all rows)
'col5'] = 100
df[print("\nDataFrame after adding 'col5':\n", df)
This method is efficient and straightforward for adding simple columns.
Method 2: Using insert()
The insert()
method allows you to add a column at a specific position within the DataFrame. This is useful when you need to control the order of columns. The method takes three arguments: the location (index), the column name, and the data.
1, 'col6', [13, 14, 15])
df.insert(print("\nDataFrame after inserting 'col6':\n", df)
Note that the index starts from 0.
Method 3: Creating a New Column Based on Existing Columns
Often, you’ll need to create a new column based on calculations or transformations of existing columns.
'col7'] = df['col1'] + df['col2']
df[print("\nDataFrame after adding 'col7':\n", df)
#Creating 'col8' based on a conditional statement
'col8'] = ['High' if x > 10 else 'Low' for x in df['col7']]
df[print("\nDataFrame after adding 'col8':\n", df)
This method leverages Pandas’ vectorized operations for efficiency.
Method 4: Applying a Function with apply()
For more complex calculations, you can use the apply()
method with a custom function.
def square(x):
return x**2
'col9'] = df['col1'].apply(square)
df[print("\nDataFrame after adding 'col9':\n", df)
This allows for flexibility in how you generate the values for the new column.
Method 5: Using assign()
The assign()
method is particularly useful for adding multiple columns at once. It returns a new DataFrame with the added columns, leaving the original DataFrame unchanged.
#Adding multiple columns at once using assign
= df.assign(col10 = df['col1'] * 2, col11 = df['col2'] / 2)
new_df print("\nNew DataFrame after using assign:\n",new_df)
print("\nOriginal DataFrame remains unchanged:\n", df)
This method promotes cleaner and more readable code when adding several columns simultaneously.
Handling Different Data Types
Remember to ensure the data type of the new column is consistent with the values you’re adding. Pandas will often infer the data type automatically, but you can explicitly specify it if needed using .astype()
. For example, to create a column of strings:
'col12'] = df['col1'].astype(str) df[