Pandas Dates and Times
Using Dates and Times in Pandas
As with many of the other topics, Pandas has helpful methods built in for date time functionality. However, before we get too far into the different methods that we can use we need to make sure that the data is in the correct format for Pandas date time functions:
list_1 = ['7/1/2021', '7/4/2021', '8/20/2021', '9/2/2021', '4/1/2050']
list_2 = ['Presentation', 'Class', 'Presentation', 'Event', 'Class']
list_3 = [25, 0, 50, 48, 1000000]
myDF = pd.DataFrame(zip(list_1, list_2, list_3), columns=['date', 'type', 'attendance'])
print(myDF.dtypes)
date object type object attendance int64
As we can see in the example, the first date
column shows as object
in the output. In this case if we tried to use any of the date time functions we would get an error. Thankfully we can convert the column to the correct date time format:
myDF['date'] = pd.to_datetime(myDF['date'], format='%m/%d/%Y')
print(myDF.dtypes)
date datetime64[ns] type object attendance int64
We can now see in the output that the date
column has been updated to datetime64[ns]
. This will allow us to utilize the Pandas date time functions. The format=
field can be a bit confusing. When using this field we need to pass the type of date format that we’d expect to use. In this case %m
indicates month, %d
indicates day, and %Y
indicates year.
Date time formatting does pay attention to capitalization. For example, %y
is the year without the century while %Y
is the year with the century. Think 13
compared to 2013
. Thankfull there are lots of helpful tables to explain the different formats. You’ll also need to match the separators for the different date formats. If the date field is 8-1-2021
instead of 8/1/2021
you’ll need to match that in the format input.
Date Time Methods
Once you have the data in the correct data type there are lots of functions that can be beneficial when working with date time data. A few of the most common functions are included below:
What if I just wanted the month or year?
myDF['month'] = myDF['date'].dt.month
myDF['year'] = myDF['date'].dt.year
print(myDF)
date type attendance month year 0 2021-07-01 Presentation 25 7 2021 1 2021-07-04 Class 0 7 2021 2 2021-08-20 Presentation 50 8 2021 3 2021-09-02 Event 48 9 2021 4 2050-04-01 Class 1000000 4 2050
What if I wanted to find the day of the week?
myDF['day_of_the_week'] = myDF['date'].dt.day_name()
print(myDF)
date type attendance day_of_the_week 0 2021-07-01 Presentation 25 Thursday 1 2021-07-04 Class 0 Sunday 2 2021-08-20 Presentation 50 Friday 3 2021-09-02 Event 48 Thursday 4 2050-04-01 Class 1000000 Friday
There is also a month_name()
function that could be used!
What if I wanted to shift all the days forward one week?
myDF['one_week_later'] = myDF['date'] + pd.Timedelta("7 days")
print(myDF)
date type attendance one_week_later 0 2021-07-01 Presentation 25 2021-07-08 1 2021-07-04 Class 0 2021-07-11 2 2021-08-20 Presentation 50 2021-08-27 3 2021-09-02 Event 48 2021-09-09 4 2050-04-01 Class 1000000 2050-04-08
What if I wanted to see how many days until a specific date?
one_year_later = myDF['date'] + pd.offsets.DateOffset(years=1)
myDF['end_of_school'] = pd.to_datetime({'month': 5, 'day': 31, 'year':one_year_later.dt.year})
myDF['days_until_school_is_over'] = myDF['end_of_school'] - myDF['date']
print(myDF)
date type attendance end_of_school days_until_school_is_over 0 2021-07-01 Presentation 25 2022-05-31 334 days 1 2021-07-04 Class 0 2022-05-31 331 days 2 2021-08-20 Presentation 50 2022-05-31 284 days 3 2021-09-02 Event 48 2022-05-31 271 days 4 2050-04-01 Class 1000000 2051-05-31 425 days
Resources
As you can see there is a ton of different functionality when working with date time objects in Pandas. As usual, Pandas has some great documentation regarding date time functionality and all the different methods that are available.