
Target Data for Respiratory Virus Detection Surveillance System's Lab Detections (2024-25)

Overview

The target-data folder contains the CSV data against which the forecasts are compared. This data serves as the "gold standard" for evaluating the forecasting models. For the current flu season, the data is stored in target-data/season_2024_2025/target_rvdss_data.csv.

Table of Contents

Lab Detections Data
Accessing Target Data
Data Processing
Additional Resources

Lab Detections Data

Source

Respiratory Virus Detection Surveillance System (RVDSS)

Our hub's prediction targets (sarscov2_pct_positive, rsv_pct_positive, and flu_pct_positive) are scraped from the Respiratory Virus Detection Surveillance System (RVDSS), published by the Public Health Agency of Canada (PHAC). The data was historically published in weekly reports, but for the current season it has moved to an interactive dashboard. Historic reports and the interactive dashboard can be found here. The target data file data_reports.csv is generated from the raw data made available through the web-scraping scripts provided by Delphi Epi Data.

Data collected in earlier seasons is included in the auxiliary-data/target-data-archive directory, in the respective season sub-directories, as data_reports.csv. The sarscov2_pct_positive data starts in season_2022_2023, so this column is absent from the data files of earlier seasons.

Target Data Column Names (data_report.csv)

time_value: the last day of the epiweek
geo_type: the type of geographical location
geo_value: the actual geographical location
[virus]_pct_positive: the percentage of tests for a given virus that are positive (target)
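
As a minimal sketch of how these columns might be used, the snippet below parses a small in-memory sample with pandas, checks that the expected columns are present, and filters to national rows. The sample rows and their values are invented for illustration, and the labels "nation", "province", and "ca" are assumptions about how geo_type and geo_value are encoded; a real run would pass the path of the target CSV to the loader instead.

```python
import io

import pandas as pd

# Illustrative rows only: the values are invented, not real RVDSS observations.
SAMPLE_CSV = """time_value,geo_type,geo_value,sarscov2_pct_positive,rsv_pct_positive,flu_pct_positive
2024-09-07,nation,ca,12.5,0.8,0.6
2024-09-07,province,nl,10.1,0.4,0.3
2024-09-14,nation,ca,13.0,0.9,0.7
"""

TARGET_COLUMNS = ["sarscov2_pct_positive", "rsv_pct_positive", "flu_pct_positive"]
KEY_COLUMNS = ["time_value", "geo_type", "geo_value"]


def load_target_data(source):
    """Read target data from a path or file-like object and parse time_value as a date."""
    df = pd.read_csv(source, parse_dates=["time_value"])
    missing = set(KEY_COLUMNS + TARGET_COLUMNS) - set(df.columns)
    if missing:
        raise ValueError(f"missing expected columns: {sorted(missing)}")
    return df


df = load_target_data(io.StringIO(SAMPLE_CSV))

# Keep only national rows, ordered by the last day of each epiweek.
national = df[df["geo_type"] == "nation"].sort_values("time_value")
print(national[["time_value", "geo_value"] + TARGET_COLUMNS])
```

Files from seasons before season_2022_2023 would fail the column check above because they lack sarscov2_pct_positive; dropping that name from TARGET_COLUMNS is one way to read them.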

Accessing Target Data

Primary Data Source: Respiratory Virus Detection Surveillance System (RVDSS)

CSV Files

A set of CSV files is updated weekly with the latest observed values for the percentage of positive virus detections. These are available at:

./target-data/season_2024_2025/target_rvdss_data.csv
auxiliary-data/season_2024_2025_raw_files (Raw Files)

Data Processing

The rvdss_update.py script processes and updates weekly data on respiratory virus detections in Canada, automatically adding new entries. It begins by defining functions that standardize virus and geographic names (e.g., "parainfluenza" to "hpiv" and "Newfoundland" to "nl") and that categorize geographic areas (as nation, region, or province) for consistent organization.
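
The standardization step described above could look roughly like the following sketch. It is not the code in rvdss_update.py: the function names and lookup tables are hypothetical, and only the "parainfluenza" to "hpiv" and "Newfoundland" to "nl" pairs come from this README. The other entries ("canada", "atlantic") are assumed examples.

```python
# Hypothetical lookup tables. Only "parainfluenza" -> "hpiv" and
# "newfoundland" -> "nl" are taken from the README; the rest are assumed.
VIRUS_ABBREVIATIONS = {"parainfluenza": "hpiv"}
GEO_ABBREVIATIONS = {"newfoundland": "nl", "canada": "ca"}

# Assumed membership sets used to categorize a standardized geo_value.
NATIONS = {"ca"}
REGIONS = {"atlantic"}
PROVINCES = {"nl"}


def standardize_virus(name: str) -> str:
    """Lower-case and trim a virus name, then map it to its abbreviation if known."""
    key = name.strip().lower()
    return VIRUS_ABBREVIATIONS.get(key, key)


def standardize_geo(name: str) -> str:
    """Lower-case and trim a location name, then map it to its abbreviation if known."""
    key = name.strip().lower()
    return GEO_ABBREVIATIONS.get(key, key)


def geo_type_of(geo_value: str) -> str:
    """Categorize a standardized location as nation, region, or province."""
    if geo_value in NATIONS:
        return "nation"
    if geo_value in REGIONS:
        return "region"
    if geo_value in PROVINCES:
        return "province"
    raise ValueError(f"unrecognized location: {geo_value!r}")


print(standardize_virus("Parainfluenza"))           # hpiv
print(standardize_geo("Newfoundland"))              # nl
print(geo_type_of(standardize_geo("Newfoundland")))  # province
```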

Two main functions then retrieve and transform the data. get_revised_data() accesses historical weekly data, reformats it with a multi-index structure, and ensures date consistency. get_weekly_data() retrieves data for the latest epidemiological week, determining the correct year and week from a summary file, and then applies the same formatting and standardization as for the historical data.

After processing, the code saves the data in the positive_tests.csv and respiratory_detections.csv files. If these files already exist, it compares indices to find new entries and appends only those, which prevents duplication. The code then consolidates both datasets into a unified file, target_rvdss_data.csv. This file includes updated geo_type values and has duplicates removed, keeping only the latest (revised) entry for each combination of time_value, geo_type, and geo_value. It retains our target columns (COLUMNS_TARGET) and rounds percentage values to two decimal places, creating a ready-to-analyze file with standardized weekly data across Canada.
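
The index comparison used when appending can be sketched as follows. This is an illustration under assumptions rather than the script's actual code: the helper name append_new_rows, the choice of index levels, and the sample values are all hypothetical.

```python
import pandas as pd

# Assumed index levels for this sketch; the real files may use a different set.
INDEX_COLS = ["time_value", "issue", "geo_value"]


def append_new_rows(existing: pd.DataFrame, incoming: pd.DataFrame) -> pd.DataFrame:
    """Return existing plus only the incoming rows whose index is not already present."""
    is_new = ~incoming.index.isin(existing.index)
    return pd.concat([existing, incoming.loc[is_new]]).sort_index()


# Invented sample values, for illustration only.
existing = pd.DataFrame(
    {
        "time_value": ["2024-09-07"],
        "issue": ["2024-09-12"],
        "geo_value": ["ca"],
        "flu_pct_positive": [0.6],
    }
).set_index(INDEX_COLS)

incoming = pd.DataFrame(
    {
        "time_value": ["2024-09-07", "2024-09-14"],
        "issue": ["2024-09-12", "2024-09-19"],
        "geo_value": ["ca", "ca"],
        "flu_pct_positive": [0.6, 0.7],
    }
).set_index(INDEX_COLS)

combined = append_new_rows(existing, incoming)
print(combined)
```

Because the comparison is on the full index, a row that was already saved is skipped, while a revision of the same week published under a later issue would be kept as a separate row.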

Source Field

For each season, the code generates three files:

positive_tests.csv
- Displays the percentage of positive tests for each virus by week.
- Aggregated at the regional level, with national totals included.
- Includes revisions for each update.
- Matches Table 1 in the reports, typically titled “Respiratory virus detections for the week ending...”
respiratory_detections.csv
- Shows the number of positive tests for each virus (including subtypes) by week.
- Aggregated at the lab level, with summaries at the regional level.
- Includes revisions for each update.
- Matches Figures 3-9 in the reports, typically titled “Positive [virus] tests (%)...”
target_rvdss_data.csv
- Consolidates data from positive_tests.csv and respiratory_detections.csv.
- Updates the geo_type field based on location corrections (from LOC_CORRECTION).
- Removes duplicate rows, keeping only the latest (revised) entry for each combination of time_value, geo_type, and geo_value.
- Drops unnecessary columns (e.g., issue and epiweek), creating a streamlined dataset for further analysis.
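
The consolidation steps listed for target_rvdss_data.csv can be sketched with pandas as below. This is a simplified illustration, not the script itself: the function name consolidate and the sample rows are hypothetical, the LOC_CORRECTION step is omitted, and using the issue column to decide which entry is the latest is an assumption.

```python
import pandas as pd

KEY_COLS = ["time_value", "geo_type", "geo_value"]


def consolidate(frames):
    """Combine frames, keep the latest issue per key, drop helper columns, round percentages."""
    combined = pd.concat(frames, ignore_index=True)
    # Assumption: a later issue date marks a more recent revision of the same week.
    combined = combined.sort_values("issue", kind="stable")
    latest = combined.drop_duplicates(subset=KEY_COLS, keep="last")
    latest = latest.drop(columns=["issue", "epiweek"])
    pct_cols = [c for c in latest.columns if c.endswith("_pct_positive")]
    latest = latest.round({c: 2 for c in pct_cols})
    return latest.sort_values(KEY_COLS).reset_index(drop=True)


# Invented sample values, for illustration only.
first_issue = pd.DataFrame(
    {
        "epiweek": [202436],
        "time_value": ["2024-09-07"],
        "issue": ["2024-09-12"],
        "geo_type": ["nation"],
        "geo_value": ["ca"],
        "flu_pct_positive": [0.612],
    }
)
second_issue = pd.DataFrame(
    {
        "epiweek": [202436, 202437],
        "time_value": ["2024-09-07", "2024-09-14"],
        "issue": ["2024-09-19", "2024-09-19"],
        "geo_type": ["nation", "nation"],
        "geo_value": ["ca", "ca"],
        "flu_pct_positive": [0.587, 0.7449],
    }
)

result = consolidate([first_issue, second_issue])
print(result)
```

In this sample the week ending 2024-09-07 appears in both issues, so only the revised value from the later issue survives in the result.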
