NumPy’s timedelta64 dtype offers a powerful way to handle time differences within your Python code, especially when working with large datasets. Unlike Python’s standard datetime.timedelta objects, timedelta64 arrays offer the speed and efficiency of NumPy’s vectorized operations, making them ideal for time series analysis and manipulation. This post dives deep into working with NumPy timedelta64 arrays, demonstrating their capabilities with clear examples.
Creating Timedelta Arrays
Creating timedelta64 arrays is straightforward. You can specify the unit ('D' for days, 'h' for hours, 'm' for minutes, 's' for seconds, 'ms' for milliseconds, 'us' for microseconds, 'ns' for nanoseconds) directly in the dtype when creating the array:
import datetime
import numpy as np

# Days
timedeltas = np.array([1, 2, 3], dtype='timedelta64[D]')
print(timedeltas)

# Hours
timedeltas_hours = np.array([1, 2, 3], dtype='timedelta64[h]')
print(timedeltas_hours)

# From timedelta64 scalars (unit-suffixed strings such as '1D' are not parsed)
timedeltas_scalars = np.array([np.timedelta64(1, 'D'), np.timedelta64(2, 'D'), np.timedelta64(3, 'D')])
print(timedeltas_scalars)

# From a list of datetime.timedelta objects; without an explicit dtype
# NumPy would build an object array instead of timedelta64
td_list = [datetime.timedelta(days=1), datetime.timedelta(days=2), datetime.timedelta(days=3)]
timedeltas_from_list = np.array(td_list, dtype='timedelta64[D]')
print(timedeltas_from_list)
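Besides np.array, durations can also be built from a single timedelta64 scalar. The following is a minimal sketch of that idea; the names one_week and steps are illustrative and not part of the examples above.

```python
import numpy as np

# A single duration is a timedelta64 scalar: an integer count plus a unit.
one_week = np.timedelta64(7, 'D')

# Multiplying an integer array by a scalar duration gives evenly spaced
# durations with the scalar's unit (here: 0, 15, 30 and 45 minutes).
steps = np.arange(4) * np.timedelta64(15, 'm')

print(one_week)
print(steps.dtype)
print(steps)
```

Scaling a scalar this way avoids spelling out every value by hand when the durations follow a regular pattern.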
Performing Arithmetic Operations
The real power of timedelta64 shines through its ability to perform arithmetic operations efficiently on entire arrays:
result_add = timedeltas + np.array([4, 5, 6], dtype='timedelta64[D]')
print(result_add)
result_sub = timedeltas - np.array([1, 1, 1], dtype='timedelta64[D]')
print(result_sub)
result_mul = timedeltas * 2
print(result_mul)
result_div = timedeltas / 2  # still timedelta64[D]; the fractional part is dropped
print(result_div)
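Division deserves a closer look, because the result type depends on the divisor. The sketch below uses an illustrative array named durations to contrast dividing by a plain number with dividing by another duration.

```python
import numpy as np

durations = np.array([1, 2, 3], dtype='timedelta64[D]')

# Dividing by a plain integer keeps the timedelta64 dtype.
halved = durations / 2

# Dividing by another timedelta64 yields a unitless float ratio,
# here the length of each duration expressed in weeks.
in_weeks = durations / np.timedelta64(1, 'W')

print(halved.dtype)
print(in_weeks.dtype)
print(in_weeks)
```

If fractional results matter, dividing by a one-unit timedelta64 is usually the safer choice, since the result is an ordinary float array.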
Converting Units
You can easily convert between different time units using NumPy’s casting abilities:
days_to_hours = timedeltas.astype('timedelta64[h]')
print(days_to_hours)
hours_to_seconds = days_to_hours.astype('timedelta64[s]')
print(hours_to_seconds)
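Converting to a finer unit, as above, is exact. Going the other way, to a coarser unit, discards the remainder. The sketch below illustrates this with an assumed array of hours and shows one way to keep the fraction instead.

```python
import numpy as np

hours = np.array([12, 36, 60], dtype='timedelta64[h]')

# Casting to a coarser unit drops the remainder: 12 h becomes 0 D.
as_days = hours.astype('timedelta64[D]')

# Dividing by a one-day timedelta64 keeps the fraction as a float.
fractional_days = hours / np.timedelta64(1, 'D')

print(as_days)          # whole days only
print(fractional_days)  # 0.5, 1.5 and 2.5 days
```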
Combining with datetime64
timedelta64 arrays work seamlessly with datetime64 arrays, enabling powerful date and time calculations:
dates = np.array(['2024-03-10', '2024-03-11', '2024-03-12'], dtype='datetime64')
future_dates = dates + timedeltas
print(future_dates)
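The relationship also runs in the other direction: subtracting datetime64 values produces timedelta64 values. A short sketch, using illustrative dates that are not taken from the example above:

```python
import numpy as np

dates = np.array(['2024-03-10', '2024-03-15', '2024-04-01'], dtype='datetime64[D]')

# Gaps between consecutive dates, as timedelta64[D].
gaps = np.diff(dates)

# Time elapsed since the first date.
elapsed = dates - dates[0]

print(gaps)     # 5 and 17 days
print(elapsed)  # 0, 5 and 22 days
```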
Handling Missing Values
NumPy’s timedelta64 also gracefully handles missing values represented by NaT (Not a Time):
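# Mark the missing entry with NaT rather than the float np.nan
timedeltas_with_nat = np.array([1, 2, np.timedelta64('NaT')], dtype='timedelta64[D]')
print(timedeltas_with_nat)
# Any arithmetic involving NaT yields NaT
result_nat = timedeltas_with_nat + np.array([4, 5, 6], dtype='timedelta64[D]')
print(result_nat)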
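Because NaT propagates through arithmetic, it often helps to locate and exclude missing entries before aggregating. The sketch below is one way to do that with np.isnat; the array and variable names are illustrative.

```python
import numpy as np

durations = np.array(
    [np.timedelta64(1, 'D'), np.timedelta64('NaT'), np.timedelta64(3, 'D')],
    dtype='timedelta64[D]',
)

# np.isnat returns a boolean mask that is True where a value is NaT.
missing = np.isnat(durations)

# Keep only the valid entries, then aggregate them.
valid = durations[~missing]
total = valid.sum()

print(missing)
print(total)  # 4 days
```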
Beyond the Basics: Advanced Usage
NumPy’s timedelta64 arrays integrate well with other NumPy functions and features, allowing for complex time series analysis, including aggregations, filtering, and more. Explore NumPy’s documentation for a complete understanding of its capabilities. This post only scratches the surface of what you can achieve with timedelta64 for efficient time-based computations.
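As a small illustration of the aggregation and filtering mentioned above, the following sketch works on an assumed array of durations in minutes. It is only a starting point, not a complete analysis workflow.

```python
import numpy as np

durations = np.array([30, 90, 45, 120], dtype='timedelta64[m]')

# Aggregations return timedelta64 scalars in the array's unit.
longest = durations.max()
total = durations.sum()

# Comparisons work across units, so a boolean mask can filter by one hour.
over_an_hour = durations[durations > np.timedelta64(1, 'h')]

print(longest)       # 120 minutes
print(total)         # 285 minutes
print(over_an_hour)  # 90 and 120 minutes
```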