From a Single List to a DataFrame
If you have a single flat list, Pandas interprets it as one column in your DataFrame. Pass the columns parameter to name that column; if you leave it out, the column is labelled 0.
import pandas as pd
data = [10, 20, 30, 40, 50]
df = pd.DataFrame(data, columns=['Values'])
print(df)
This will output:
   Values
0      10
1      20
2      30
3      40
4      50
From a List of Lists to a DataFrame
For more complex datasets, you’ll often use a list of lists. Each inner list represents a row in your DataFrame. You can optionally specify column names.
import pandas as pd
data = [[1, 'Alice', 25], [2, 'Bob', 30], [3, 'Charlie', 22]]
df = pd.DataFrame(data, columns=['ID', 'Name', 'Age'])
print(df)
This will produce:
   ID     Name  Age
0   1    Alice   25
1   2      Bob   30
2   3  Charlie   22
If you omit the columns parameter, Pandas will automatically assign numerical column names (0, 1, 2, …).
import pandas as pd
data = [[1, 'Alice', 25], [2, 'Bob', 30], [3, 'Charlie', 22]]
df = pd.DataFrame(data)
print(df)
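To see the default labels in practice, the minimal sketch below inspects df.columns and then assigns descriptive names after the DataFrame has been built. The variable names default_labels and renamed_labels are illustrative only and do not come from the original example.

```python
import pandas as pd

data = [[1, 'Alice', 25], [2, 'Bob', 30], [3, 'Charlie', 22]]
df = pd.DataFrame(data)

# With no columns argument, the labels are the integers 0, 1, 2
default_labels = list(df.columns)

# Descriptive names can be assigned afterwards; the list length
# must match the number of columns
df.columns = ['ID', 'Name', 'Age']
renamed_labels = list(df.columns)

print(default_labels)
print(renamed_labels)
```

Assigning to df.columns replaces all labels at once, which is convenient when the names only become known after loading the data.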
Handling Different Data Types
DataFrames can handle various data types within a single column or across columns.
import pandas as pd
data = [[1, 'Alice', 25.5, True], [2, 'Bob', 30, False], [3, 'Charlie', 22, True]]
df = pd.DataFrame(data, columns=['ID', 'Name', 'Age', 'Status'])
print(df)
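This example shows a mix of integers, strings, floats, and booleans. Pandas infers a data type for each column automatically. Because the Age column mixes 25.5 with whole numbers, the whole column is stored as floating-point values.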
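One way to check what Pandas inferred is the dtypes attribute. The sketch below reuses the data from the example above. The exact dtype reported for the Name column can differ between Pandas versions (object in many releases, a dedicated string dtype in newer ones), so it is not asserted here.

```python
import pandas as pd

data = [[1, 'Alice', 25.5, True], [2, 'Bob', 30, False], [3, 'Charlie', 22, True]]
df = pd.DataFrame(data, columns=['ID', 'Name', 'Age', 'Status'])

# dtypes lists the inferred type of every column
print(df.dtypes)

# Individual columns can be checked with the helpers in pd.api.types
id_is_integer = pd.api.types.is_integer_dtype(df['ID'])
age_is_float = pd.api.types.is_float_dtype(df['Age'])
status_is_bool = pd.api.types.is_bool_dtype(df['Status'])
print(id_is_integer, age_is_float, status_is_bool)
```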
Using Dictionaries for Column Names and Data
An alternative, and often more readable, approach is to use a dictionary where keys represent column names and values are lists representing the data for each column.
import pandas as pd
data = {'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 22]}
df = pd.DataFrame(data)
print(df)
This offers a clearer way to structure your data, especially when dealing with numerous columns.
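One caveat with the dictionary-of-lists form is that every list needs the same length. The sketch below deliberately makes the Name list one item short to show the resulting ValueError. The variable error_message is illustrative only.

```python
import pandas as pd

# 'Name' has two items while 'ID' has three
data = {'ID': [1, 2, 3], 'Name': ['Alice', 'Bob']}

try:
    df = pd.DataFrame(data)
    error_message = None
except ValueError as exc:
    # Pandas refuses to build columns of different lengths
    error_message = str(exc)

print(error_message)
```

If the columns really do have different lengths, the lists have to be padded (for example with None) before the DataFrame is built.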
Creating DataFrames from Lists of Dictionaries
You can also create a DataFrame from a list of dictionaries. Each dictionary represents a row, and keys represent column names.
import pandas as pd
data = [{'ID': 1, 'Name': 'Alice', 'Age': 25}, {'ID': 2, 'Name': 'Bob', 'Age': 30}, {'ID': 3, 'Name': 'Charlie', 'Age': 22}]
df = pd.DataFrame(data)
print(df)
This method is useful when your data is naturally structured as a list of individual records. Note that all dictionaries should ideally contain the same keys (columns).
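To illustrate what happens when the keys do not match, the sketch below drops the Age key from one record. The records variable and its contents are a hypothetical variation on the example above.

```python
import pandas as pd

records = [
    {'ID': 1, 'Name': 'Alice', 'Age': 25},
    {'ID': 2, 'Name': 'Bob'},  # no 'Age' key in this record
    {'ID': 3, 'Name': 'Charlie', 'Age': 22},
]
df = pd.DataFrame(records)

# The missing entry appears as NaN, and the Age column becomes float
print(df)
bob_age_missing = pd.isna(df.loc[1, 'Age'])
print(bob_age_missing)
```

The columns are the union of all keys seen across the records, so a key that appears in only one dictionary still produces a column.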
Handling Missing Data
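If the inner lists (rows) have unequal lengths, or the dictionaries in a list of dictionaries have missing keys, Pandas fills the gaps with a missing-value marker, typically NaN (Not a Number). This does not apply to a dictionary of column lists: there, every list must have the same length, or Pandas raises a ValueError.
import pandas as pd
data = [[1, 'Alice', 25], [2, 'Bob'], [3, 'Charlie', 22, 'extra']]
df = pd.DataFrame(data)
print(df)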
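Here the result has four columns, because the longest row has four items, and the shorter rows are padded with missing values. This makes data input flexible. You can then deal with the missing values using Pandas’ own tools, for example fillna for imputation or dropna for removal.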
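As a rough sketch of that follow-up step, the code below counts the missing values per column, drops incomplete rows, and fills one column with its mean. Mean imputation is only one illustrative choice, and the variable names are assumptions made for this example.

```python
import pandas as pd

data = [[1, 'Alice', 25], [2, 'Bob'], [3, 'Charlie', 22, 'extra']]
df = pd.DataFrame(data)

# Count the missing entries in each column (columns are labelled 0 to 3)
missing_per_column = df.isna().sum()
print(missing_per_column)

# Removal: keep only rows that have no missing values
complete_rows = df.dropna()

# Imputation: fill column 2 with the mean of its known values
filled = df.fillna({2: df[2].mean()})
print(filled)
```

Both dropna and fillna return new DataFrames here, so the original df is left unchanged.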