Skip to content

Latest commit

 

History

History

06_date_and_time_handling

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 

Date and Time Functions

Description: Date and time functions handle and manipulate datetime data within pandas, enabling time-based analysis and manipulation of time series data.

  • pd.to_datetime(): Converts a series or list to datetime format.
  • pd.date_range(): Generates a range of datetime values.
  • DataFrame.resample(): Resamples time-series data based on a specified frequency.
  • DataFrame.asfreq(): Converts time-series data to a specified frequency.
  • DataFrame.tz_localize(): Localizes time-series data to a specific timezone.
  • DataFrame.tz_convert(): Converts time-series data from one timezone to another.
  • from datetime import datetime: Imports the datetime module for handling date and time operations.

pd.to_datetime()

Explanation: The pd.to_datetime() function converts arguments to datetime objects. It is used to transform date and time information from various formats into a standard datetime64 format for easy manipulation and analysis.

Features:

  • Parsing: Can parse dates from strings, integers, or other formats.
  • Flexible: Supports various date formats and can infer formats automatically.
  • Handling Time Zones: Can handle and convert time zones.

Problem Solved:

  • Converts date/time data into a consistent format for further analysis, making it easier to perform operations like sorting, filtering, and time-based calculations.

Parameters:

  • arg: The data to convert. This can be a string, list, or other formats.
  • format: A specific date/time format string (optional). If not provided, the function tries to infer the format.
  • errors: Defines how to handle errors. Options are 'raise', 'coerce', or 'ignore'.

Example Code:

import pandas as pd

# Sample data with date strings
date_strings = ['2024-01-01', '2024-02-15', '2024-03-10']

# Converting to datetime
dates = pd.to_datetime(date_strings)

# Displaying the result
print("Converted Dates:")
print(dates)

Explanation:

  • pd.to_datetime(date_strings) converts the list of date strings into datetime64 objects.

Apply Filter on Date Column

  • dataframe.column.dt
  • dt.strftime
  • dt.strptime

In pandas, when you have a datetime column in a DataFrame, you can format the date and time information using the strftime method, which allows for flexible formatting by specifying format codes. Each code represents a different element of the date and time. Below are some of the commonly used format codes that you can use with strftime:

  • %a - Abbreviated weekday name.
  • %A - Full weekday name.
  • %w - Weekday as a decimal number, where 0 is Sunday and 6 is Saturday.
  • %d - Day of the month as a zero-padded decimal number.
  • %b - Abbreviated month name.
  • %B - Full month name.
  • %m - Month as a zero-padded decimal number.
  • %y - Year without century as a zero-padded decimal number.
  • %Y - Year with century as a decimal number.
  • %H - Hour (24-hour clock) as a zero-padded decimal number.
  • %I - Hour (12-hour clock) as a zero-padded decimal number.
  • %p - Locale’s equivalent of either AM or PM.
  • %M - Minute as a zero-padded decimal number.
  • %S - Second as a zero-padded decimal number.
  • %f - Microsecond as a decimal number, zero-padded on the left.
  • %z - UTC offset in the form ±HHMM[SS[.ffffff]] (empty string if the object is naive).
  • %Z - Time zone name (empty string if the object is naive).
  • %j - Day of the year as a zero-padded decimal number.
  • %U - Week number of the year (Sunday as the first day of the week) as a zero-padded decimal number.
  • %W - Week number of the year (Monday as the first day of the week) as a zero-padded decimal number.
  • %c - Locale’s appropriate date and time representation.
  • %x - Locale’s appropriate date representation.
  • %X - Locale’s appropriate time representation.
  • %G - ISO 8601 year with century representing the year that contains the greater part of the ISO week (%V).
  • %u - ISO 8601 weekday as a decimal number where 1 is Monday.
  • %V - ISO 8601 week number as a decimal number with Monday as the first day of the week.

pd.date_range()

Explanation: The pd.date_range() function generates a range of dates with a specified frequency. It is useful for creating sequences of dates for time series analysis or plotting.

Features:

  • Range Generation: Creates a range of dates with defined start and end dates or periods.
  • Frequency: Supports various frequencies (e.g., daily, monthly, yearly).

Problem Solved:

  • Generates a sequence of dates for use in time series data, analysis, or as an index for DataFrames.

Parameters:

  • start: The start date of the range.
  • end: The end date of the range (optional). If not provided, use periods instead.
  • periods: Number of periods to generate (optional).
  • freq: Frequency of date generation (e.g., 'D' for daily, 'M' for monthly).

Example Code:

import pandas as pd

# Generating a range of dates
date_range = pd.date_range(start='2024-01-01', end='2024-01-10', freq='D')

# Displaying the result
print("Date Range:")
print(date_range)

Explanation:

  • pd.date_range(start='2024-01-01', end='2024-01-10', freq='D') creates a range of daily dates between January 1 and January 10, 2024.

DataFrame.resample()

Explanation: The DataFrame.resample() function is used to resample time-series data at different frequencies. It allows you to aggregate, downsample, or upsample time-based data.

Features:

  • Resampling: Aggregates data based on a new frequency.
  • Flexible: Supports various aggregation functions and frequencies.

Problem Solved:

  • Resamples time-series data to different frequencies, which is useful for aggregating data (e.g., daily to monthly) or filling missing values.

Parameters:

  • rule: The frequency rule (e.g., 'D' for daily, 'M' for monthly).
  • how: Aggregation method to apply (e.g., 'sum', 'mean').
  • fill_value: Value to use for missing data.

Example Code:

import pandas as pd

# Sample DataFrame with time series data
data = {
    'Date': pd.date_range(start='2024-01-01', periods=10, freq='D'),
    'Value': range(10)
}
df = pd.DataFrame(data)
df.set_index('Date', inplace=True)

# Resampling to get monthly data
df_resampled = df.resample('M').sum()

# Displaying the result
print("Resampled DataFrame:")
print(df_resampled)

Explanation:

  • df.resample('M').sum() resamples the data to a monthly frequency and sums the values within each month.

DataFrame.asfreq()

Explanation: The DataFrame.asfreq() function is used to change the frequency of a time series DataFrame. It allows you to specify a new frequency and fill missing data if needed.

Features:

  • Frequency Change: Changes the frequency of the time series data.
  • Filling Values: Can fill in missing data using specified methods.

Problem Solved:

  • Adjusts the frequency of time series data and handles missing values, which is essential for aligning data with specific time intervals.

Parameters:

  • freq: The new frequency (e.g., 'D' for daily, 'H' for hourly).
  • fill_value: Value to use for missing data (optional).
  • how: Method to use for filling (e.g., 'ffill' for forward fill).

Example Code:

import pandas as pd

# Sample DataFrame with time series data
data = {
    'Date': pd.date_range(start='2024-01-01', periods=5, freq='D'),
    'Value': range(5)
}
df = pd.DataFrame(data)
df.set_index('Date', inplace=True)

# Changing frequency to hourly
df_asfreq = df.asfreq('H', fill_value=0)

# Displaying the result
print("DataFrame with Hourly Frequency:")
print(df_asfreq)

Explanation:

  • df.asfreq('H', fill_value=0) changes the frequency of the DataFrame to hourly and fills missing values with 0.

DataFrame.tz_localize()

Explanation: The DataFrame.tz_localize() function localizes the timezone of a datetime index. It adds timezone information to naive datetime objects.

Features:

  • Timezone Localization: Adds timezone information to datetime indices.
  • Flexible: Supports various timezones.

Problem Solved:

  • Adds timezone information to datetime data, which is necessary for working with time zones in time series analysis.

Parameters:

  • tz: The timezone to localize to (e.g., 'UTC', 'America/New_York').

Example Code:

import pandas as pd

# Sample DataFrame with naive datetime index
data = {
    'Date': pd.date_range(start='2024-01-01', periods=3, freq='D'),
    'Value': [1, 2, 3]
}
df = pd.DataFrame(data)
df.set_index('Date', inplace=True)

# Localizing the timezone to 'UTC'
df_localized = df.tz_localize('UTC')

# Displaying the result
print("DataFrame with Localized Timezone:")
print(df_localized)

Explanation:

  • df.tz_localize('UTC') localizes the datetime index to the 'UTC' timezone.

DataFrame.tz_convert()

Explanation: The DataFrame.tz_convert() function converts the timezone of a datetime index from one timezone to another. It requires the datetime index to be already localized.

Features:

  • Timezone Conversion: Converts from one timezone to another.
  • Requires Localized Time: Works on datetimes that already have timezone information.

Problem Solved:

  • Converts datetime data from one timezone to another, which is useful for handling time series data across different time zones.

Parameters:

  • tz: The target timezone to convert to (e.g., 'UTC', 'Europe/Berlin').

Example Code:

import pandas as pd

# Sample DataFrame with datetime index localized to 'UTC'
data = {
    'Date': pd.date_range(start='2024-01-01', periods=3, freq='D'),
    'Value': [1, 2, 3]
}
df = pd.DataFrame(data)
df.set_index('Date', inplace=True)
df = df.tz_localize('UTC')

# Converting timezone to 'America/New_York'
df_converted = df.tz_convert('America/New_York')

# Displaying the result
print("DataFrame with Converted Timezone:")
print(df_converted)

Explanation:

  • df.tz_convert('America/New_York') converts the datetime index from 'UTC' to 'America/New_York' timezone.

from datetime import datetime

Explanation: The datetime module from the datetime library is used for working with dates and times in Python. It provides classes for manipulating dates and times, including datetime, date, time, and timedelta.

Features:

  • Date and Time Manipulation: Provides functionality for creating and manipulating date and time objects.
  • Formatting: Supports formatting dates and times as strings.

Problem Solved:

  • Enables handling of date and time data in Python, including creation, manipulation, and formatting.

Parameters:

  • None: The datetime class itself does not have parameters, but its methods and functions do.

Example Code:

from datetime import datetime

# Creating a datetime object
now = datetime.now()

# Formatting the datetime object
formatted_date =

 now.strftime('%Y-%m-%d %H:%M:%S')

# Displaying the result
print("Current Date and Time:")
print(formatted_date)

Explanation:

  • datetime.now() creates a datetime object with the current date and time, and strftime('%Y-%m-%d %H:%M:%S') formats it as a string.