Detailed Instructions

Catherine Birney edited this page Dec 9, 2021 · 4 revisions

Step-By-Step Guide to Generating a Flow-By-Activity Dataset

Flow-By-Activity (FBA) datasets are environmental and other data formatted into a standardized table. The data are standardized to enable use for Flow-By-Sector (FBS) table generation.

This example demonstrates creating the 2017 US Department of Agriculture (USDA) Census of Agriculture (CoA) Cropland FBA. The name for the FBA is “USDA_CoA_Cropland_2017.”

  1. The first step is to write instructions for where the data can be found in a YAML file. These instructions are written in a human-readable form and are read in as a Python dictionary. The YAML for USGS_NWIS_WU can be found here, with an explanation of all possible parameters found in the README.

  2. The first lines of the YAML are used to generate bibliography information.

  3. The next lines indicate if an API key is required and if so, the name of the API key.

  4. If an API key is required, the user must generate their own API key (instructions in the wiki) and store the key in a .env file in the user’s MODULEPATH. An example .env file is found in FLOWSA’s example folder, here.

  5. Most of the information in the YAML is used to build the URLs called to import data and to indicate the format in which the data are loaded (json, csv, pdf, etc.). Within the URLs, a variable name can be surrounded by double underscores to indicate that a string function will dynamically replace the variable when the URLs are built, such as "__secLevel__".

  6. If any variables need replacement in the URL build (such as "__secLevel__"), the list of strings with which to replace each variable is also given in the YAML.

  7. Although the functions to load, build, and parse Flow-By-Activity datasets are generalized, each FBA requires functions specific to the dataset. These functions are listed in the method yaml.
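
The URL-building idea in steps 3 through 6 can be sketched roughly as follows. This is a minimal illustration, not FLOWSA's actual implementation: the dictionary stands in for what `yaml.safe_load()` would return for a source YAML, and every key name, the `build_urls` helper, the `EXAMPLE_API_KEY` variable, and the example.com URL are assumptions made up for this sketch. Only the double-underscore placeholder convention and the "secLevel" variable come from the steps above.

```python
import os
from itertools import product

# Hypothetical stand-in for a parsed source YAML (what yaml.safe_load()
# would return). The keys and values are illustrative assumptions,
# not FLOWSA's real schema.
config = {
    "api_name": "EXAMPLE_API",
    "api_key_required": True,
    "url": {
        "base_url": "https://example.com/api?",
        "params": "key=__apiKey__&level=__secLevel__&year=__year__",
    },
    # Variables to substitute in the URL, each with its list of values.
    "replacements": {"secLevel": ["NATIONAL", "STATE"]},
}


def build_urls(config, year, api_key=None):
    """Return one URL per combination of replacement values (hypothetical helper)."""
    template = config["url"]["base_url"] + config["url"]["params"]
    template = template.replace("__year__", str(year))

    if config.get("api_key_required"):
        if api_key is None:
            raise ValueError("this source requires an API key")
        template = template.replace("__apiKey__", api_key)

    replacements = config.get("replacements", {})
    names = list(replacements)
    value_lists = [replacements[name] for name in names]

    urls = []
    # product() yields every combination of values; with no replacement
    # variables it yields a single empty combination, so one URL is built.
    for combo in product(*value_lists):
        url = template
        for name, value in zip(names, combo):
            url = url.replace(f"__{name}__", value)
        urls.append(url)
    return urls


# In practice the key would come from the user's .env file; here it is read
# from an assumed environment variable, with a placeholder default.
api_key = os.environ.get("EXAMPLE_API_KEY", "demo-key")
urls = build_urls(config, 2017, api_key)
for url in urls:
    print(url)
```

With the two "secLevel" values listed in the dictionary, the sketch produces two URLs, one per value. Keeping the replacement lists in the YAML rather than in code means a new level can be added without touching the URL-building function.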