
﻿<?xml version="1.0" encoding="UTF-8"?>

<chapter id="ch_concepts">

<title>SMOKE Concepts</title>

<section>

<title>Introduction</title>

<para>The purpose of SMOKE is to convert the resolution of the data in an emission inventory to the resolution needed by an air quality model (AQM). Emission inventories typically have an annual-total emissions value for each emissions source, or perhaps an average-day emissions value. AQMs, however, typically require emissions data on an hourly basis, for each model grid cell (and perhaps model layer), and for each model species. Consequently, to meet the input requirements of the AQM, emissions processing must (at a minimum) transform inventory data by temporal allocation, chemical speciation, spatial allocation, and perhaps layer assignment.</para>
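<para>As a rough illustration of the transformation described above, the following Python sketch splits one source's annual total into per-cell, per-species values for a single hour by multiplying three sets of fractions. The function name, the fractions, and the grid-cell and species labels are all hypothetical, and the sketch ignores layer assignment and unit conversion; it is not a description of how SMOKE is implemented.</para>

```python
def hourly_gridded_species(annual_total, hour_fraction, cell_fractions, species_fractions):
    """Split one source's annual total into per-cell, per-species values for one hour.

    hour_fraction     -- share of the annual total assigned to this hour (temporal allocation)
    cell_fractions    -- grid cell -> share of the source in that cell (spatial allocation)
    species_fractions -- model species -> share of the inventory pollutant (speciation)
    """
    hourly_total = annual_total * hour_fraction
    return {
        (cell, species): hourly_total * cell_frac * spec_frac
        for cell, cell_frac in cell_fractions.items()
        for species, spec_frac in species_fractions.items()
    }


# Hypothetical inputs: a flat temporal profile and made-up spatial and speciation splits.
result = hourly_gridded_species(
    annual_total=8760.0,
    hour_fraction=1.0 / 8760.0,
    cell_fractions={(10, 12): 0.75, (10, 13): 0.25},
    species_fractions={"NO": 0.9, "NO2": 0.1},
)
for key in sorted(result):
    print(key, round(result[key], 4))
```

<para>Because each set of fractions sums to one, the per-cell, per-species values for the hour add back up to that hour's share of the annual total.</para>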

<para>In addition to changing the resolution of the data, SMOKE must also provide the AQM input files in the correct file format. SMOKE can create the Input/Output Applications Programming Interface (I/O API) Network Common Data Form (NetCDF) output format needed by the CMAQ model. It can also create the Fortran binary format for the 2-D emissions needed by UAM and CAM<subscript>X</subscript>, and the ASCII elevated-point-source format used by the Ptsrce preprocessor for these models. File format is also important for the input files used by SMOKE, most of which are ASCII files, but some of which are I/O API NetCDF or CF-compliant NetCDF format files.</para>

<para>In this chapter, we introduce you to various concepts that are critical to understanding the technical description of emissions processing, as well as provide more detail about the processing capabilities of SMOKE. (Later, <xref linkend="ch_utilities" />, <xref linkend="ch_programs" />, <xref linkend="ch_quality_assurance" />, <xref linkend="ch_input_files" />, <xref linkend="ch_intermediate_files" />, and <xref linkend="ch_output_files" /> give more specifics about each program&rsquo;s capabilities and each file&rsquo;s format.) This chapter provides the context and framework for the rest of the user&rsquo;s manual. To assist you in reading and using this chapter, we provide <xref linkend="app_glossary" /> for definitions of emissions inventory and emissions modeling terminology.</para>

</section>

<comment>
<section id="sect_concepts_assigns">

<title>Assigns file and environment variables</title>

<para>The Assigns file is a script used to set up the parameters of a SMOKE run. The file configures the UNIX environment so that all of the correct input, intermediate, and output directories and files can be identified and used by the SMOKE programs. It also sets things like the name of the grid and the time period for which you will run SMOKE for a given case. It does this by setting many UNIX environment variables, explained in the next paragraph. The Assigns file also uses environment variables to configure compiler options, so that SMOKE can be compiled on operating systems other than the ones provided with the SMOKE distribution. More information on the Assigns file is provided in <xref linkend="sect_scripts_assigns_files" />.</para>

<para>Environment variables are aliases that can be set by a UNIX operating system. These variables are defined during a user&rsquo;s UNIX session, usually defined by an <command>xterm</command> or other UNIX terminal window. The environment variables that SMOKE uses store the input, intermediate, and output files and directories. For example, the environment variable for the directory that is the SMOKE root directory is <envar>SMKROOT</envar>. At the UNIX prompt, this environment variable could be defined to an actual path such as <filename class="directory">/home/mylogin/smoke</filename>. To set an environment variable, the UNIX <command>setenv</command> command is needed. In this example, the command to define <envar>SMKROOT</envar> as the given path is:</para>

<para><userinput><command>setenv</command> <envar>SMKROOT</envar> <filename class="directory">/home/mylogin/smoke</filename></userinput></para>

<para>After this command is issued, the <envar>SMKROOT</envar> environment variable stores the characters <filename class="directory">/home/mylogin/smoke</filename> as its value. To use the value of an environment variable, the dollar sign must precede the variable name at the UNIX prompt. In the following example, we give the UNIX command <command>echo</command> to print the contents of the <envar>SMKROOT</envar> environment variable at the UNIX prompt. Note the use of the dollar sign before the <envar>SMKROOT</envar> variable name.</para>

<para><userinput><command>echo</command> <envar>$SMKROOT</envar></userinput></para>

<para>When the UNIX system executes this command, the following is displayed at the UNIX prompt:</para>

<para><computeroutput>/home/mylogin/smoke</computeroutput></para>

<para>The environment variables set by the Assigns file for directories are described in <xref linkend="ch_dirs_files" />. The variables used by the SMOKE scripts for controlling SMOKE execution are described in <xref linkend="sect_scripts_script_settings" />. Finally, the environment variables that control program behavior are described in <xref linkend="ch_utilities" />, <xref linkend="ch_programs" />, and <xref linkend="ch_quality_assurance" />.</para>

</section>
</comment>

<section id="sect_concepts_emis_inv">

<title>Emission inventories</title>

<para>Emission inventories are the key input files to SMOKE and emissions modeling. The data types that these inventories contain are called inventory pollutants (e.g., carbon monoxide, ammonia, mercury). By itself, SMOKE does not require specific data types in the inventory files it reads. However, the AQMs that SMOKE supports do require certain input data, called model species, which in turn requires SMOKE to use certain inventory pollutants.</para>

<para>In this section, we focus on the inventory files that SMOKE uses. <xref linkend="sect_concepts_inv_data_types" /> describes the major inventory types usable by SMOKE. In <xref linkend="sect_concepts_inv_source_categories" />, we describe the inventory source categories, and in <xref linkend="sect_concepts_inv_file_formats" /> we discuss the inventory file formats. The remaining sections describe the various codes used in specific inventory sources: <xref linkend="sect_concepts_costcy_codes" />, <xref linkend="sect_concepts_scc_codes" />, <xref linkend="sect_concepts_sic_codes" />, <xref linkend="sect_concepts_mact_codes" />, <xref linkend="sect_concepts_section_112" />, and <xref linkend="sect_concepts_source_type_codes" />.</para>

<section id="sect_concepts_inv_data_types">

<title>Inventory data types</title>

<para>SMOKE processes criteria, particulate, toxics, and activity data inventories. Activity data will be discussed along with on-road mobile sources in the next section. By criteria inventories, we mean inventories containing EPA&rsquo;s criteria pollutants: carbon monoxide (CO), nitrogen oxides (NO<subscript>x</subscript>), and volatile organic compounds (VOC) or total organic gases (TOG). Particulate inventories contain ammonia (NH<subscript>3</subscript>), sulfur dioxide (SO<subscript>2</subscript>), particulate matter (PM) of size 10 microns or less (PM<subscript>10</subscript>), and PM of size 2.5 microns or less (PM<subscript>2.5</subscript>).</para>

<para>Additionally, SMOKE can process inventories with pre-speciated criteria and/or particulate emissions. For example, elemental carbon of size 2.5 microns or less can be provided as input to SMOKE directly, instead of letting SMOKE&rsquo;s speciation step compute it from the PM<subscript>2.5</subscript> total emissions. To ensure that SMOKE correctly processes the data when you are using pre-speciated emissions, other input files must be configured in specific ways<comment>, as explained in <xref linkend="sect_scripts_change_speciation" /></comment>.</para>

<para>The toxics inventories that SMOKE can process are data from the National Emission Inventory (NEI) for Hazardous Air Pollutants (HAPs). This inventory contains hundreds of specific compounds representing the 188 HAPs defined by the Clean Air Act. The original list of 189 HAPs and modifications representing the current list are available from the <ulink url="http://www.epa.gov/ttn/atw/orig189.html">EPA&rsquo;s web site</ulink>. The inventory contains many more than 188 pollutants because several entries on the list of 188 are pollutant groups, such as polycyclic organic matter, cyanide compounds, and numerous metal compounds including chromium compounds, cadmium compounds, manganese compounds, and others. Note that because of these groups, the specific compounds in the inventory for one inventory year may not exactly match the compounds in another inventory year. For example, lead oxide may be reported in one year but not in a subsequent year. However, those compounds not belonging to compound groups are likely to be in the inventory year after year, particularly the common gaseous HAPs emitted by mobile sources such as benzene, 1,3-butadiene, acrolein, formaldehyde, and acetaldehyde.</para>

</section>

<section id="sect_concepts_inv_source_categories">

<title>Inventory source categories</title>

<section>

<title>Overview</title>

<para>Emission inventories are divided into several source categories. These divisions stem from both differing methods for preparing the inventories and from different characteristics and attributes of the categories (more on these terms later). Generally, emission inventories are divided into the following source categories:</para>

<itemizedlist>
<listitem>
<para><emphasis role="bold">Stationary area/Nonpoint sources:</emphasis> Sources that are treated as being spread over a spatial extent (usually a county or air district) and that are not movable (as compared to nonroad mobile and on-road mobile sources). Because it is not possible to collect the emissions at each point of emission, they are estimated over larger regions. The EPA introduced the term <quote>nonpoint</quote> to replace <quote>stationary area</quote> in order to avoid confusion with the term <quote>area source</quote>, which is used as a regulatory term in the toxics realm. However, <quote>nonpoint</quote> has not gained acceptance (thus far) by the criteria inventory/modeling community. Thus, in this manual we will use the term <quote>stationary area</quote> to refer to these sources when they are in criteria inventories, while we use the term <quote>nonpoint</quote> to refer to these sources when they are in toxics inventories. Examples of nonpoint or stationary area sources are residential heating and architectural coatings. Numerous sources, such as dry cleaning facilities, may be treated either as stationary area/nonpoint sources or as point sources; in particular, the toxics inventory contains numerous small sources (based on emissions) that are not inventoried as nonpoint sources because their locations are known and are provided.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Nonroad mobile sources:</emphasis> Vehicular and otherwise movable sources that do not include vehicles that travel on roadways. These sources are also computed as being spread over a spatial extent (again, a county or air district). Examples of nonroad mobile sources include locomotives, lawn and garden equipment, construction vehicles, and boating emissions. These sources are included in both criteria and toxics inventories.</para>
</listitem>

<listitem>
<para><emphasis role="bold">On-road mobile sources:</emphasis> Vehicular sources that travel on roadways. These sources can be computed either as being spread over a spatial extent or as being assigned to a line location (called a link). Data in on-road inventories can be either emissions or activity data. Activity data consist of vehicle miles traveled (VMT) and, optionally, vehicle speed. Activity data are used when SMOKE will be computing emission factors via another model such as MOVES. Examples of on-road mobile sources include light-duty gasoline vehicles and heavy-duty diesel vehicles. On-road mobile sources are included in both criteria and toxics inventories.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Point sources:</emphasis> These are sources that are identified by point locations, typically because they are regulated and their locations are available in regulatory reports. Point sources are often further subdivided into electric generating utilities (EGUs) and non-EGU sources, particularly in criteria inventories in which EGUs are a primary source of NO<subscript>x</subscript> and SO<subscript>2</subscript>. Examples of non-EGU point sources include chemical manufacturers and furniture refinishers. Point sources are included in both the criteria and toxics inventories.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Wildfire sources:</emphasis> Traditionally, wildfire emissions have been treated as stationary area sources. More recently, data have also been developed for point locations, with day-specific emissions and hour-specific plume rise (vertical distribution of emissions). In this case, the wildfire emissions are processed by SMOKE as point sources.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Biogenic land use data:</emphasis> Biogenic land use data characterize the type of vegetation present, as either county-total or grid-cell values. The biogenic land use data in North America are available using two different sets of land use categories: the Biogenic Emissions Landcover Database (BELD) version 2 (BELD2), and the BELD version 3 (BELD3).</para>
</listitem>
</itemizedlist>

<para>Emission processing in SMOKE is divided into four processing categories: area, biogenic, mobile, and point. The definitions of these categories that SMOKE uses are different from those used for defining emission inventories. <xref linkend="tbl_concepts_inv_categories" /> lists the inventory source categories, the types of inventories (activity data, criteria, particulates, and toxics) that SMOKE can process, the temporal resolution that is acceptable to SMOKE, and the SMOKE processing category that should be used for processing the inventory.</para>

<table id="tbl_concepts_inv_categories">
<title>Inventory source categories and SMOKE processing capabilities and categories</title>

<tgroup cols="6">
<colspec colname="c1" colwidth="20*" />
<colspec colname="c2" colwidth="10*" />
<colspec colname="c3" colwidth="10*" />
<colspec colname="c4" colwidth="10*" />
<colspec colname="c5" colwidth="10*" />
<colspec colname="c6" colwidth="15*" />

<thead>
<row>
<entry morerows="1" valign="bottom" align="center">Inventory source category</entry>
<entry namest="c2" nameend="c5" align="center">Temporal resolution that SMOKE can process*</entry>
<entry morerows="1" valign="bottom" align="center">SMOKE processing category</entry>
</row>

<row>
<entry align="center">Activity data</entry>
<entry align="center">Criteria</entry>
<entry align="center">Particulates</entry>
<entry align="center">Toxics</entry>
</row>
</thead>

<tbody>
<row>
<entry>Nonpoint or stationary area</entry>
<entry align="center">N/A</entry>
<entry align="center">A, S, D, H</entry>
<entry align="center">A, S, D, H</entry>
<entry align="center">A, S, D, H</entry>
<entry>Area</entry>
</row>

<row>
<entry>Nonroad mobile</entry>
<entry align="center">N/A</entry>
<entry align="center">A, S, D, H</entry>
<entry align="center">A, S, D, H</entry>
<entry align="center">A, S, D, H</entry>
<entry>Area</entry>
</row>

<row>
<entry>On-road mobile (MOBILE 6)</entry>
<entry align="center">A</entry>
<entry align="center">A, S, D, H</entry>
<entry align="center">A, S, D, H</entry>
<entry align="center">A, S, D, H</entry>
<entry>Mobile</entry>
</row>

<row>
<entry>On-road mobile (MOVES)</entry>
<entry align="center">A</entry>
<entry align="center">H</entry>
<entry align="center">H</entry>
<entry align="center">H</entry>
<entry>Mobile</entry>
</row>

<row>
<entry>EGU</entry>
<entry align="center">N/A</entry>
<entry align="center">A, S, D, H</entry>
<entry align="center">A, S, D, H</entry>
<entry align="center">A, S, D, H</entry>
<entry>Point</entry>
</row>

<row>
<entry>Non-EGU</entry>
<entry align="center">N/A</entry>
<entry align="center">A, S, D, H</entry>
<entry align="center">A, S, D, H</entry>
<entry align="center">A, S, D, H</entry>
<entry>Point</entry>
</row>

<row>
<entry>Wildfire with precomputed plume rise</entry>
<entry align="center">N/A</entry>
<entry align="center">D, H</entry>
<entry align="center">D, H</entry>
<entry align="center">N/A</entry>
<entry>Point</entry>
</row>

<row>
<entry>Wildfire with internal plume rise calculation</entry>
<entry align="center">N/A</entry>
<entry align="center">D</entry>
<entry align="center">D</entry>
<entry align="center">N/A</entry>
<entry>Point</entry>
</row>

<row>
<entry>Biogenic land use</entry>
<entry align="center">N/A</entry>
<entry align="center">X</entry>
<entry align="center">N/A</entry>
<entry align="center">N/A</entry>
<entry>Biogenic</entry>
</row>

<row>
<entry namest="c1" nameend="c6" align="center">* <emphasis role="bold">A</emphasis> = Supports annual data; <emphasis role="bold">S</emphasis> = Supports average-day data; <emphasis role="bold">D</emphasis> = Supports day-specific data; <emphasis role="bold">H</emphasis> = Supports hourly data; <emphasis role="bold">X</emphasis> = Supports available data</entry>
</row>
</tbody>

</tgroup>
</table>
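<para>The table above can also be treated as a lookup. The following Python sketch transcribes its rows and answers whether a given temporal resolution is listed for a source category and inventory type. The names <literal>TABLE</literal>, <literal>row</literal>, and <literal>supports</literal> are hypothetical and exist only for this illustration; they are not part of SMOKE.</para>

```python
from typing import NamedTuple


class Row(NamedTuple):
    activity: frozenset
    criteria: frozenset
    particulates: frozenset
    toxics: frozenset
    processing_category: str


def row(activity, criteria, particulates, toxics, processing_category):
    """Build a Row from table cells such as "A, S, D, H"; "N/A" becomes an empty set."""
    def codes(cell):
        if cell == "N/A":
            return frozenset()
        return frozenset(code.strip() for code in cell.split(","))
    return Row(codes(activity), codes(criteria), codes(particulates),
               codes(toxics), processing_category)


ALL = "A, S, D, H"  # annual, average-day, day-specific, hourly

# Rows transcribed from the table; X means "supports available data".
TABLE = {
    "Nonpoint or stationary area": row("N/A", ALL, ALL, ALL, "Area"),
    "Nonroad mobile": row("N/A", ALL, ALL, ALL, "Area"),
    "On-road mobile (MOBILE 6)": row("A", ALL, ALL, ALL, "Mobile"),
    "On-road mobile (MOVES)": row("A", "H", "H", "H", "Mobile"),
    "EGU": row("N/A", ALL, ALL, ALL, "Point"),
    "Non-EGU": row("N/A", ALL, ALL, ALL, "Point"),
    "Wildfire with precomputed plume rise": row("N/A", "D, H", "D, H", "N/A", "Point"),
    "Wildfire with internal plume rise calculation": row("N/A", "D", "D", "N/A", "Point"),
    "Biogenic land use": row("N/A", "X", "N/A", "N/A", "Biogenic"),
}


def supports(source_category, inventory_type, resolution):
    """Return True if the table lists this resolution code for the given cell."""
    return resolution in getattr(TABLE[source_category], inventory_type)


print(supports("On-road mobile (MOVES)", "criteria", "A"))
print(TABLE["Nonroad mobile"].processing_category)
```

<para>For example, the lookup reports that annual criteria emissions are not listed for the MOVES row, and that nonroad mobile inventories are handled by the area processing category.</para>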

</section>

<section>

<title>Detailed source category descriptions</title>

<para>Each inventory source category has source characteristics, source attributes, data values, and data attributes. <emphasis>Source characteristics</emphasis> are unique to each inventory source category and also distinguish one source in the inventory from another. <emphasis>Source attributes</emphasis> further describe the sources with other information that is useful for emissions processing, such as point-source flue gas exit height and temperature. The <emphasis>data values</emphasis> are either emissions values or activity values. The <emphasis>data attributes</emphasis> are additional information about the data values, such as the percentage reduction in emission from controls already applied to the source. In the following subsections, we summarize the source characteristics and attributes and the data values and attributes that are used by SMOKE for each of the inventory categories.</para>
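<para>One way to picture these four kinds of information is as a record with a key and three groups of fields. The Python sketch below is only a mental model: the class names, field names, and example values are hypothetical and do not correspond to any SMOKE file format or internal data structure.</para>

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceKey:
    """Source characteristics: the fields that tell one source from another."""
    region_code: str  # country/state/county code
    scc: str


@dataclass
class InventoryRecord:
    key: SourceKey
    # Source attributes: extra facts about the source, e.g. stack height.
    source_attributes: dict = field(default_factory=dict)
    # Data values: pollutant or activity name mapped to its value.
    data_values: dict = field(default_factory=dict)
    # Data attributes: facts about a data value, e.g. control efficiency.
    data_attributes: dict = field(default_factory=dict)


# Placeholder codes and numbers, chosen only to show the shape of a record.
record = InventoryRecord(
    key=SourceKey(region_code="037063", scc="2104008000"),
    source_attributes={"inventory_year": 2005},
    data_values={"CO": 12.5},
    data_attributes={"CO": {"control_efficiency_pct": 0.0}},
)

# Frozen keys are hashable, so records can be indexed by their characteristics.
by_key = {record.key: record}
print(by_key[SourceKey("037063", "2104008000")].data_values["CO"])
```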

<section>

<title>Nonpoint/stationary area and nonroad mobile (SMOKE category: area)</title>

<itemizedlist>

<listitem>
	<para><emphasis role="bold">Source characteristics:</emphasis> For all typical inventories, the source characteristics that identify these sources are country/state/county code, SCC, and/or GEOCODE_LEVEL[1-4]. See
	<xref linkend="sect_concepts_costcy_codes" />, <xref linkend="sect_concepts_scc_codes" />, and <xref linkend="sect_concepts_geocodes" /> for further information.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Optional source characteristics:</emphasis> SMOKE can also use pregridded data from the same modeling domain as a SMOKE area source; this is described in more detail in <xref linkend="sect_concepts_pregridded_data" />. In this case, the source characteristics and attributes (country/state/county code and SCC) are <emphasis>not</emphasis> used in SMOKE. SMOKE can also use pregridded data from a different modeling domain along with geographical codes (GEOCODE_LEVEL[1-4]) and source information from the ARINV file to specify the source characteristics and associated source attributes for each grid cell.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Source attributes:</emphasis> The inventory year is associated with all sources in the inventory input files. In addition, SMOKE assigns a time zone (see <xref linkend="sect_concepts_assign_countries" />) and an approach for normalization of temporal profiles (see <xref linkend="sect_concepts_set_weekday_approach" />). In the nonpoint toxics inventory only, Standard Industrial Classification (SIC) codes, Maximum Achievable Control Technology (MACT) codes, and North American Industrial Classification System (NAICS) codes are optional source attributes; the NAICS code is read by SMOKE but not otherwise used at this time. Additionally, a <quote>source type</quote> field is available in the nonpoint inventory to identify major and Clean Air Act (CAA) section 112 area sources. See <xref linkend="sect_concepts_sic_codes" /> for a description of SIC codes, <xref linkend="sect_concepts_mact_codes" /> for more about MACT codes, and <xref linkend="sect_concepts_section_112" /> for more about source types. We will refer to the CAA section 112 area sources as simply <quote>section-112 area sources</quote>.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Data:</emphasis> SMOKE can read emissions data for criteria, particulate, and toxics pollutants for nonpoint/stationary area and nonroad inventories. The SMOKE system is not constrained with regard to the pollutants read (although typical examples were given in <xref linkend="sect_concepts_inv_data_types" />). SMOKE accepts annual emissions data, average-day emissions data, or both (though not all input formats support all types). An emission factor value can also be read by SMOKE, but SMOKE does nothing with it.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Data attributes:</emphasis> Inventories for nonpoint/stationary area and nonroad mobile sources can contain control efficiency, rule penetration, and rule effectiveness information for each pollutant. SMOKE will use these data if provided; otherwise it will set default values that indicate that no control-based adjustments have been applied to the inventory pollutant data. The defaults are listed in the file formats in <xref linkend="ch_input_files" />.</para>
</listitem>

</itemizedlist>
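<para>To make the three control-related data attributes concrete, the sketch below uses the conventional inventory relation in which the controlled share of emissions is the product of control efficiency, rule effectiveness, and rule penetration. This is a general illustration and not a statement of how SMOKE applies these fields. The function name is hypothetical, and the default arguments (100 percent effectiveness and penetration) are assumptions of this sketch; the defaults SMOKE actually uses are listed with the file formats.</para>

```python
def remaining_fraction(control_eff_pct, rule_eff_pct=100.0, rule_pen_pct=100.0):
    """Fraction of uncontrolled emissions left after controls (inputs in percent)."""
    controlled = (control_eff_pct / 100.0) * (rule_eff_pct / 100.0) * (rule_pen_pct / 100.0)
    return 1.0 - controlled


# A control efficiency of zero means no adjustment, whatever the other two values are.
print(remaining_fraction(0.0))

# Hypothetical source: 90% efficient control, 80% effective, applied to half the sources.
print(round(remaining_fraction(90.0, 80.0, 50.0), 4))
```

<para>With a control efficiency of zero the remaining fraction is one, which is the sense in which a set of default values can indicate that no control-based adjustment has been applied.</para>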

</section>

<section>

<title>On-road mobile (SMOKE category: mobile)</title>

<itemizedlist>

<listitem>
<para><emphasis role="bold">Source characteristics:</emphasis> For on-road mobile inventories, the minimum source characteristics that identify these sources are country/state/county code and either SCC <emphasis>or</emphasis> road class and vehicle type codes. When the SCC is provided, it must follow a specific pattern in order to contain the road class and vehicle type codes (see <xref linkend="sect_concepts_onroad_sccs" />). When road class and vehicle type codes are provided to SMOKE directly, SMOKE translates these to SCC values.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Optional source characteristics:</emphasis> A link code may also identify on-road sources. This code must be unique within each county and SCC (or road class/vehicle type combination).</para>
</listitem>

<listitem>
<para><emphasis role="bold">Source attributes:</emphasis> The inventory year is associated with all sources in the inventory input files. In addition, SMOKE assigns a time zone (see <xref linkend="sect_concepts_assign_countries" />) and an approach for normalization of temporal profiles (see <xref linkend="sect_concepts_set_weekday_approach" />). For sources with link codes, SMOKE will use the starting and ending coordinates of the link, using either latitude-longitude (lat-lon) values or coordinates in the Universal Transverse Mercator (UTM) coordinate system.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Data:</emphasis> Emissions data for criteria, particulate, and toxics pollutants can be read for on-road mobile inventories. SMOKE is not constrained with regard to the pollutants read (although typical examples were given in <xref linkend="sect_concepts_inv_data_types" />). SMOKE accepts annual emissions data, average-day emissions data, or both (though not all input formats support all types).</para>

<para>Additionally, on-road mobile inventories can contain VMT and average speed activity data, which are needed when you would like SMOKE to run MOVES to compute emissions. A combination of precomputed emissions and VMT data is also acceptable for input to SMOKE, but you are responsible for preventing duplication of emissions. Duplication could occur if you input precomputed emissions for the same sources for which SMOKE computes emissions on the fly, by multiplying the on-road emission factors from MOVES by hourly VMT and the off-network emission factors from MOVES by annual vehicle populations.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Data attributes:</emphasis> No data attributes are associated with on-road mobile sources.</para>
</listitem>

</itemizedlist>
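<para>Because preventing duplication between precomputed emissions and activity data is left to the user, a simple pre-check can help. The Python sketch below reports sources that appear in both inputs, assuming each source is keyed by its country/state/county code and SCC. The function name and the example codes are hypothetical placeholders, and reading the keys out of real inventory files is outside the scope of the sketch.</para>

```python
def duplicated_sources(emission_keys, activity_keys):
    """Return the source keys present in both inputs, sorted for stable output."""
    return sorted(set(emission_keys).intersection(activity_keys))


# Hypothetical (region code, SCC) keys from a precomputed-emissions inventory...
emission_keys = [("037063", "2201001110"), ("037063", "2230074330")]
# ...and from a VMT activity inventory for the same domain.
activity_keys = [("037063", "2201001110"), ("037183", "2201001110")]

overlap = duplicated_sources(emission_keys, activity_keys)
if overlap:
    print("Sources supplied both as emissions and as activity data:")
    for region_code, scc in overlap:
        print(" ", region_code, scc)
```

<para>An empty result means that no source would be counted twice under this keying; sources identified by link code would need the link code added to the key.</para>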

</section>

<section>

<title>Point sources (SMOKE category: point)</title>

<itemizedlist>

<listitem>
	<para><emphasis role="bold">Source characteristics:</emphasis> The source characteristics for point sources depend on the inventory input format. The SMOKE one-record-per-line (ORL) and Flat File 10 Format (FF10) formats identify sources by country/state/county code, plant code, point code, stack code, segment code, and SCC.
	</para>
</listitem>

<listitem>
<para><emphasis role="bold">Optional source characteristics:</emphasis> SMOKE can support up to five location identifiers within a plant, although the most used in any currently implemented input file format is four.</para>
</listitem>

<listitem>
	<para><emphasis role="bold">Source attributes:</emphasis> As with other source categories, inventory year is associated with all sources in the inventory input files. SMOKE also assigns a time zone (see <xref linkend="sect_concepts_assign_countries" />) and an approach for normalization of temporal profiles (see <xref linkend="sect_concepts_set_weekday_approach" />). In addition, point sources have the following required source attributes not associated with other source categories: latitude, longitude, stack height, stack diameter (at the exit location), flue gas exit velocity, and flue gas exit temperature. Finally, the following optional source attributes are also used by SMOKE: SIC codes, MACT codes, plant descriptions, emissions release point type (e.g., horizontal stack, fugitive), source type (major or section-112 area), Office of Regulatory Information Systems (ORIS) identification codes, and boiler identification codes. The MACT code and source type are supported only by the ORL format. See <xref linkend="sect_concepts_sic_codes" />, <xref linkend="sect_concepts_mact_codes" />, and <xref linkend="sect_concepts_section_112" /> for more information.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Data:</emphasis> Emissions data for criteria, particulate, and toxics pollutants can be read for point inventories. SMOKE is not constrained with regard to the pollutants read (although typical examples were given in <xref linkend="sect_concepts_inv_data_types" />). SMOKE accepts annual emissions data, average-day emissions data, or both.</para>

<para>Optionally, point-source emissions data can be provided using day-specific or hour-specific records. The formats for these data are described in <xref linkend="sect_input_ptday" /> and <xref linkend="sect_input_pthour" />.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Data attributes:</emphasis> EGU and non-EGU point sources can contain control efficiency and rule effectiveness information for each pollutant. SMOKE will use these data if provided; otherwise it will set default values that indicate that no control-based adjustments have been applied to the inventory pollutant data. The defaults are listed in the file formats in <xref linkend="ch_input_files" />.</para>
</listitem>

</itemizedlist>

</section>

<section>

<title>Wildfire with precomputed plume rise (SMOKE category: point)</title>

<itemizedlist>

<listitem>
<para><emphasis role="bold">Source characteristics:</emphasis> Wildfires with precomputed plumes are identified by the country/state/county code and the fire name.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Optional source characteristics:</emphasis> There are no optional source characteristics for wildfire sources.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Source attributes:</emphasis> Like other source categories, inventory year is associated with all sources in the inventory input files. SMOKE also assigns a time zone (see <xref linkend="sect_concepts_assign_countries" />) and an approach for normalization of temporal profiles (see <xref linkend="sect_concepts_set_weekday_approach" />). In addition, wildfire sources require the latitude and longitude source attributes. Finally, additional hour-specific source attributes for wildfire sources <emphasis>must</emphasis> be provided for the fraction of emissions in the surface layer, the height of the bottom of the plume, and the height of the top of the plume. These hour-specific attributes are provided to SMOKE using the point source hour-specific formats described in <xref linkend="sect_input_pthour" />.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Data:</emphasis> Wildfire source inventories can contain criteria and particulate pollutants. SMOKE is not constrained with regard to the pollutants read (although typical examples were given in <xref linkend="sect_concepts_inv_data_types" />). These data must be provided as day-specific or hour-specific emissions values using point source formats specified in <xref linkend="sect_input_ptday" /> and <xref linkend="sect_input_pthour" />.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Data attributes:</emphasis> No data attributes are associated with wildfire sources.</para>
</listitem>

</itemizedlist>

</section>

<section>

<title>Wildfires with internal plume rise calculation (SMOKE category: point)</title>

<itemizedlist>

<listitem>
<para><emphasis role="bold">Source characteristics:</emphasis> Wildfires with internal plume rise calculation are identified by the country/state/county code, fire identification, fire name, location identification, and SCC.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Optional source characteristics:</emphasis> There are optional source characteristics for fire sources, such as material burned, vegetation types, size of area burned, fuel loading, and fire start/end hour. The size of area burned and fuel loading are used for computing the fire-specific plume rise. Fire starting and ending hours are needed to adjust the hourly temporal profiles for the emissions.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Source attributes:</emphasis> Like other source categories, inventory year is associated with all sources in the inventory input files. SMOKE also assigns a time zone (see <xref linkend="sect_concepts_assign_countries" />) and will re-normalize temporal profiles based on the starting and ending hours of the fire. In addition, wildfire sources require the latitude and longitude source attributes to locate the fire. Note that all emissions for a fire will be assumed to come from the single grid cell that contains the latitude and longitude of the fire. Finally, the additional day-specific source attributes listed above for fire sources <emphasis>must</emphasis> be provided for calculating the heat flux of each fire, which is used to estimate the fraction of emissions in the surface layer, the height of the bottom of the plume, and the height of the top of the plume. <comment>See more information about how to process at <xref linkend="sect_ptfire_emis_cmaq"/></comment></para>
</listitem>

<listitem>
<para><emphasis role="bold">Data:</emphasis> Fire source inventories can contain criteria and particulate pollutants. SMOKE is not constrained with regard to the pollutants read (although typical examples were given in <xref linkend="sect_input_ptinv_fire" />). These data must be provided as day-specific emissions values using point source formats specified in <xref linkend="sect_input_ptday_fireemis" />.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Data attributes:</emphasis> No data attributes are associated with wildfire sources.</para>
</listitem>

</itemizedlist>

</section>

<section>

<title>Biogenic land use (SMOKE category: biogenic)</title>

<itemizedlist>

<listitem>
<para><emphasis role="bold">Source characteristics:</emphasis> Biogenic emission data do not fit as neatly into the source-characteristic paradigm as the previously described source types. Emissions for biogenic sources are estimated starting with land use data, which are available for both BELD2 and BELD3 processing. The BELD2 data are available either by U.S. state/county and BELD2 land use category or by grid cell and BELD2 land use category. BELD3 land use data are available by 1-km grid cell over North and Central America and by BELD3 land use category.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Optional source characteristics:</emphasis> Biogenic land use data do not include optional source characteristics. The data are either by state/county or by grid cell.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Source attributes:</emphasis> There are no source attributes for biogenic land use data.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Data:</emphasis> The biogenic land use data consist of fractions associated with each land use type within a county or grid cell.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Data attributes:</emphasis> There are no data attributes for biogenic land use data.</para>
</listitem>

</itemizedlist>

</section>

</section>

</section>

<section id="sect_concepts_inv_file_formats">

<title>Inventory file formats</title>

<para>SMOKE supports a variety of inventory formats for criteria, particulate, toxics, and activity data inventories, which are described in detail in <xref linkend="sect_input_inventory" />. Here, we provide a brief introduction to these formats, which will be helpful as you read more about SMOKE in the remainder of this chapter and the chapters before <xref linkend="ch_input_files" />. All formats described here are text files. To convert your data to these formats, the best approach is to use a database or spreadsheet program to reformat and output the data in the requested format. SMOKE does not come with a standard format-conversion tool.</para>

<para> In the following paragraphs, we describe the formats available for nonpoint/stationary area, nonroad mobile, on-road mobile, point, and point-wildfire sources.</para>

<itemizedlist>

<listitem>
	<para><emphasis role="bold">Nonpoint/stationary area sources:</emphasis> SMOKE supports two formats for nonpoint/stationary area sources. The ORL and FF10 (Flat File 10) formats are list-directed (comma or semicolon delimited), and these file formats may be used to represent many different sources. The header of the file indicates what source data are in the file.</para>
</listitem>

<listitem>
	<para><emphasis role="bold">Nonroad mobile sources:</emphasis> There are three available inventory formats for nonroad mobile sources. The FF10 (Flat File 10) format is list-directed (comma or semicolon delimited), and the header of the file indicates that the file contains nonroad mobile source data.</para>
</listitem>

<listitem>
	<para><emphasis role="bold">On-road mobile sources:</emphasis> The Flat File 10 (FF10) format is list directed (comma or semicolon delimited) and contains activity inventory such as VMT, speed, and vehicle population data.  This format requires VMT, SPEED, and VPOP inventory data.</para>
</listitem>

<listitem>
	<para><emphasis role="bold">Point sources:</emphasis> SMOKE has formats for annual or average-day inventories, for day-specific inventories, and for hour-specific inventories. For annual or average-day inventories, the ORL and FF10 formats can be used for criteria, particulate, and toxics inventories. Finally, the CEM data format can be used for day-specific or hour-specific data; SMOKE uses the ORIS codes and boiler codes in the annual inventory files to match sources from the CEM data files.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Wildfire sources:</emphasis> There are two approaches available for providing wildfire data, treated as point sources, to SMOKE using the ORL and FF10 point-source formats.</para>
    <itemizedlist spacing="compact">
	    <listitem><emphasis role="bold">Precomputed plume rise approach:</emphasis> Certain fields must be left blank (such as stack parameters) because they do not apply to wildfire sources. When using wildfire data provided as point sources, you must also provide day-specific or hour-specific wildfire emissions and hour-specific precomputed plume rise using the FF10 day-specific and hour-specific formats.</listitem>
        <listitem><emphasis role="bold">Internal plume rise calculation approach:</emphasis> Requires two separate inventory files that are provided in a modified ORL format: (1) a list of fires with fire-specific characteristics including country/state/county, fire identification, location coordinates, fire name, SCC and others, as described in <xref linkend="sect_input_ptinv_fire" />, and (2) day-specific fire data including size of area burned, fuel loading, and start/end hour of fire (<xref linkend="sect_input_ptday_fireemis" />). Unlike the approach listed above, this approach internally estimates the plume rise using the size of the area burned and fuel loading, and it adjusts temporal profiles using the start and end hours of the fire. <comment>See detail at <xref linkend="sect_ptfire_emis_cmaq" /></comment></listitem>
    </itemizedlist>
</listitem>

</itemizedlist>

</section>

<section id="sect_concepts_costcy_codes">

<title>Country, state, and county codes</title>

<para>SMOKE uses a 6-digit integer code to identify a country, state (or province), and county (or other region) for a particular source. Most U.S. inventories input to SMOKE have the 5-digit U.S. Federal Information Processing Standards (FIPS) state and county codes. All inventory input formats have been adapted to include a special header record with which you can specify the country, effectively allowing the inventories to be provided with the 6-digit code that SMOKE uses. The 6-digit system was designed for use in the United States with states and counties, as well as Canada and Mexico, but it can be adapted for other uses. The format used by SMOKE for the codes is:</para>

<informalfigure>
<mediaobject>
<imageobject role="pdf">
<imagedata width="3in" fileref="images/concepts/costcy_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/costcy_html.jpg" />
</imageobject>
</mediaobject>
</informalfigure>

<para>The SMOKE installation is set up to use U.S.-centered codes as defined in the <envar>COSTCY</envar> or the <envar>GEOCODE_LEVEL4</envar> (if USE_EXP_GEOCODES Y) file, which contains the codes and their associated names and time zone settings. In this file, the U.S. country code is zero, which allows the U.S. country/state/county codes to be the same as the FIPS state/county codes that appear in U.S. inventories. See <xref linkend="sect_input_costcy" /> for more information on the <envar>COSTCY</envar> file format.</para>

<para>To change the meaning of the country, state, or county codes in SMOKE, the <envar>COSTCY</envar> or the <envar>GEOCODE_LEVEL4</envar> (if USE_EXP_GEOCODES Y) file must be modified to use different names associated with each country, state, county or tribal number. All SMOKE input files must also use this new numbering scheme, including inventory files and cross-reference files.</para>

<para>Acceptable values in SMOKE for the country code are 0 through 9. Acceptable values of the state code are 1 through 99. Acceptable values of the county code are 1 through 999. No alphabetic codes are accepted, since SMOKE stores these values as integers.</para>
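<para>As an illustration only, the following Python sketch packs and unpacks a code of this kind. It assumes the digit layout is one country digit, two state digits, and three county digits, which is inferred from the value ranges above rather than taken from SMOKE source code. The function names are hypothetical and are not part of SMOKE.</para>

<programlisting language="python"># Hypothetical helpers, not part of SMOKE. The C-SS-YYY digit layout is an
# assumption inferred from the ranges country 0-9, state 1-99, county 1-999.

def pack_region_code(country, state, county):
    """Combine country, state, and county into one 6-digit integer code."""
    if country not in range(0, 10):
        raise ValueError("country code must be 0 through 9")
    if state not in range(1, 100):
        raise ValueError("state code must be 1 through 99")
    if county not in range(1, 1000):
        raise ValueError("county code must be 1 through 999")
    return country * 100000 + state * 1000 + county


def unpack_region_code(code):
    """Split a 6-digit integer code into (country, state, county)."""
    country, rest = divmod(code, 100000)
    state, county = divmod(rest, 1000)
    return country, state, county


# Durham County, North Carolina (FIPS 37063) with the U.S. country code 0.
us_code = pack_region_code(0, 37, 63)
print(us_code)                      # 37063, identical to the FIPS code
print(unpack_region_code(us_code))  # (0, 37, 63)</programlisting>

<para>Because the U.S. country code is zero, the packed value equals the 5-digit FIPS code, which is the property described earlier in this section.</para>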

</section>

<section id="sect_concepts_scc_codes">

<title>Source Classification Codes</title>

<para>EPA uses Source Classification Codes (SCCs) and area and mobile source (AMS) codes to classify different types of anthropogenic emission activities. SCCs have 8 digits for point sources, while AMS codes have 10 digits, and sometimes include a leading <quote>A</quote> as an eleventh character. In SMOKE, we refer to both kinds of codes as <quote>SCCs</quote>, and we ignore the leading <quote>A</quote> in the area and mobile codes. Additionally, SMOKE permits the nonpoint and point toxics inventories to use both 8-digit and 10-digit SCCs in the same inventory input file, because both 8- and 10-digit codes are contained in the nonpoint and point inventories in the 1999 NEI for HAPs. The maximum field width in SMOKE and its input files for SCCs is 20 characters as of SMOKE v4.0. The 8- and 10-digit SCCs are still supported, but if an SCC is longer than 10 digits, the SCC hierarchical approach is not supported.</para>

<para>For SCCs of 10 characters or less, the codes use a hierarchical system in which the definition of the code gets increasingly more specific as you move from left to right. (NOTE: if the SCC is longer than 10 characters, the hierarchical system is not used.) For SCCs of 10 characters or less, it is important to understand the hierarchy of the codes, because you can take advantage of the hierarchy in building cross-reference files for assigning emissions processing factors to inventory emission sources. In the diagrams below, level 1 is the least specific and level 4 is the most specific.</para>

<para>The code structure for the 8-digit point-source codes is:</para>

<informalfigure>
<mediaobject>
<imageobject role="pdf">
<imagedata width="3in" fileref="images/concepts/scc_point_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/scc_point_html.jpg" />
</imageobject>
</mediaobject>
</informalfigure>

<para>An example point-source activity and corresponding SCC can be taken directly from SMOKE&rsquo;s SCC description file (<envar>SCCDESC</envar>): <quote>External Combustion Boilers; Electric Generation; Lignite; Spreader Stoker</quote> is represented by 10100306. Below we have mapped the levels of this description with the levels of the SCC:</para>

<informalfigure>
<mediaobject>
<imageobject role="pdf">
<imagedata width="3in" fileref="images/concepts/scc_example_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/scc_example_html.jpg" />
</imageobject>
</mediaobject>
</informalfigure>

<para>Similarly, the code structure for the 10-digit area- and mobile-source codes is:</para>

<informalfigure>
<mediaobject>
<imageobject role="pdf">
<imagedata width="3in" fileref="images/concepts/scc_nonpoint_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/scc_nonpoint_html.jpg" />
</imageobject>
</mediaobject>
</informalfigure>

<para>SMOKE treats SCCs as character strings, though in practice the values in the inventories and cross-reference files are usually numeric. In <xref linkend="sect_concepts_cross_referencing" /> on cross-references and profiles, we explain how these hierarchies are used by SMOKE and how you should use them in preparing SMOKE input files.</para>
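<para>The level structure can be sketched in Python as follows. The level widths used here (1-2-3-2 digits for 8-digit codes and 2-2-3-3 digits for 10-digit codes) are an assumption based on the customary EPA layout; the diagrams above remain the authoritative description. The function <literal>split_scc</literal> is hypothetical and is not a SMOKE routine.</para>

<programlisting language="python"># Assumed level widths (shown only in the diagrams, not stated in the text):
# 8-digit point SCCs as 1-2-3-2, 10-digit area/mobile SCCs as 2-2-3-3.
LEVEL_WIDTHS = {8: (1, 2, 3, 2), 10: (2, 2, 3, 3)}


def split_scc(scc):
    """Return the four hierarchy levels of an 8- or 10-character SCC."""
    scc = scc.strip()
    # SMOKE ignores the leading "A" of 11-character area/mobile codes.
    if len(scc) == 11 and scc[0] in "Aa":
        scc = scc[1:]
    widths = LEVEL_WIDTHS.get(len(scc))
    if widths is None:
        raise ValueError("no hierarchy is defined for this SCC length")
    levels, start = [], 0
    for width in widths:
        levels.append(scc[start:start + width])
        start += width
    return levels


# External Combustion Boilers; Electric Generation; Lignite; Spreader Stoker
print(split_scc("10100306"))  # ['1', '01', '003', '06']</programlisting>

<para>The SCC is kept as a string throughout, consistent with SMOKE treating SCCs as character strings.</para>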

<para>For on-road mobile sources, SCCs are treated somewhat differently than for other source categories. We explain more about this in <xref linkend="sect_concepts_onroad_sccs" />.</para>

</section>

<section id="sect_concepts_sic_codes">

<title>Standard Industrial Classification codes</title>

<para>Although SIC codes are being replaced by NAICS codes in building emission inventories at EPA, SIC codes are still used in emissions processing. As of SMOKE v4.0, the SIC codes may be up to 20 characters in length, but for SIC codes longer than 4 digits, the hierarchical system is not used. For 4-digit SICs, a 2-level hierarchical system is recognized by SMOKE for application of growth, control, and chemical speciation factors. The two code levels are illustrated below.</para>

<informalfigure>
<mediaobject>
<imageobject role="pdf">
<imagedata width="3in" fileref="images/concepts/sic_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/sic_html.jpg" />
</imageobject>
</mediaobject>
</informalfigure>

</section>

<section id="sect_concepts_geocodes">

	<title>Geographical Code (GEOCODE_LEVEL[1-4])</title>
	<para>Geographical codes may be specified to the user's desired level of detail using the GEOCODE_LEVEL[1-4] files. GEOCODE_LEVEL1 contains three-character codes for each country in the inventory (CCC). GEOCODE_LEVEL2 contains six-character codes for each state that the user would like to track in the inventory (CCCSSS). GEOCODE_LEVEL3 contains nine-character codes for each county that the user would like to track in the inventory (CCCSSSYYY). GEOCODE_LEVEL4 contains twelve-character codes for each tribal region that the user would like to track in the inventory (CCCSSSYYYTTT).</para>

	<informalfigure>
		<mediaobject>
			<imageobject role="pdf">
				<imagedata width="3in" fileref="images/concepts/geocode_pdf.jpg" />
			</imageobject>
			<imageobject role="html">
				<imagedata fileref="images/concepts/geocode_html.jpg" />
			</imageobject>
		</mediaobject>
	</informalfigure>
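	<para>As a minimal sketch of how the four levels nest, the Python function below slices a twelve-character CCCSSSYYYTTT code into the prefix used at each level. The function name and the sample value are hypothetical and serve only to show the slicing.</para>

<programlisting language="python">def split_geocode(geocode):
    """Split a 12-character CCCSSSYYYTTT code into the prefix for each level."""
    if len(geocode) != 12:
        raise ValueError("expected a 12-character GEOCODE_LEVEL4 code")
    country, state, county, tribe = (geocode[i:i + 3] for i in range(0, 12, 3))
    return {
        "level1": country,                   # CCC
        "level2": country + state,           # CCCSSS
        "level3": country + state + county,  # CCCSSSYYY
        "level4": geocode,                   # CCCSSSYYYTTT
    }


# "000037063000" is a made-up value used only to show the slicing.
print(split_geocode("000037063000")["level3"])  # 000037063</programlisting>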

</section>


<section id="sect_concepts_mact_codes">

<title>Maximum Achievable Control Technology codes</title>

<para>The following quote from EPA explains what MACT codes are and why they are used in some inventories and not others:</para>

<blockquote>
<para>To evaluate EPA&rsquo;s progress in reducing air toxic emissions through the Maximum Achievable Control Technology (MACT) standards and to identify sources that may be modeled as part of residual risk assessments, operations within facilities that are subject to MACT standards are identified in the NTI by 4-digit MACT codes. <emphasis>[note that the term NTI (National Toxics Inventory) has since been replaced with NEI and that the codes are now 6 digits]</emphasis></para>

<para>A MACT category is one for which emissions limitations have been or are being developed under section 112(d) of the Clean Air Act (National Emissions Standards for Hazardous Air Pollutants).  EPA sets source category, technology based standards through its MACT program that sharply reduce emissions of HAPs. EPA&rsquo;s ATW web site includes information on the MACT source categories and the MACT program (www.epa.gov/ttn/atw/eparules.html). The tagging of data with MACT codes allows EPA to determine reductions attributable to the MACT program. The NTI associates MACT codes corresponding to MACT source categories with stationary major and [section-112] area source data.  MACT codes are assigned at the process level or at the site level in the point source data, e.g., the MACT code for municipal waste combustors (MWCs) is assigned at the site level whereas the MACT code for petroleum refinery catalytic cracking is assigned at the process level. MACT codes are also assigned to source categories in the nonpoint source file.</para>
</blockquote>

<para>In SMOKE, MACT codes are treated as 6-character strings, with no internal hierarchy associated with the number.</para>

</section>

<section id="sect_concepts_section_112">

<title>Source types: major and section-112 area sources</title>

<para>For point and nonpoint toxics inventories, each source can be labeled as <quote>major</quote> or <quote>section-112 area</quote> for input to SMOKE (the following paragraph explains how the term <quote>area</quote> can be applied to a point-source inventory). The Clean Air Act defines major sources as those stationary facilities that emit or have the potential to emit 10 tons per year or more of any one toxic air pollutant or 25 tons per year or more of any combination of toxic air pollutants. Section-112 area sources include facilities that have air toxics emissions below the major source threshold as defined in section 112 of the Clean Air Act and thus emit less than 10 tons per year of a single toxic air pollutant or less than 25 tons per year combined of multiple toxic air pollutants. Another source type exists in principle for nonpoint sources: the <quote>other</quote> source type; an example of this source type is wildfires. However, these source types are not labeled differently from the section-112 area sources in the nonpoint toxics inventories, so the <quote>other</quote> source type has not been included in SMOKE to date.</para>

<para>A note about the confusing use of <quote>area</quote> terminology to describe point sources: The designation of sources in the point inventories as section-112 area sources has no relationship whatsoever to SMOKE&rsquo;s area processing category. The point sources that are section-112 area sources are still processed by SMOKE as point sources using a lat-lon location and stack parameters.</para>

<para>In practice, all <quote>major</quote> sources should appear only in the point toxics inventory, but in some cases, <quote>major</quote> sources have shown up in the nonpoint inventory (specifically in inventory year 1996, in the July 2001 version of that inventory). Thus, the source type designation is provided in both the point and nonpoint toxics input formats.</para>

<para>The major and section-112 area designations are used when applying MACT-based control factors. These control factors are assigned based on a source&rsquo;s MACT code and may be applied to major sources only, to section-112 area sources only, or to both types of sources regardless of designation.</para>

</section>

<section id="sect_concepts_source_type_codes">

<title>Source types: nonroad and onroad mobile sources</title>

<para>The nonroad and onroad mobile source type designations are used when applying MACT-based control factors. These control factors are assigned based on a source&rsquo;s MACT code.</para>

</section>

</section>

<section id="sect_concepts_cross_referencing">

<title>Cross-referencing and profiles</title>

<para>The emission inventories described in <xref linkend="sect_concepts_emis_inv" /> can contain hundreds of thousands or even millions of sources. Collecting specific information for each source about its temporal allocation, chemical speciation, and spatial allocation is not practical. Therefore, a part of emissions processing involves assuming that many sources share the same factors for these major processing steps. For example, we apply monthly, day-of-week, and hourly temporal factors (called profiles) to convert from an annual emissions value to an hour-specific emissions value. A limited set of monthly, day-of-week, and hourly diurnal profiles are available from various studies, and these profiles each have their own unique profile number (also called profile code or profile ID). This limited set of profiles is assigned to the much more numerous inventory sources using an approach called cross-referencing, which is implemented using cross-reference files.</para>

<para>The cross-reference files assign the profiles based on source characteristics such as country, state, and county codes and/or SCCs, using the profile numbers to associate source characteristics with the profiles. While the profile numbers are unique in the profile files, they will appear many times in the cross-reference; this is how SMOKE is able to group the sources to treat them in the same manner. This approach is used for temporal allocation profiles, chemical speciation profiles and the spatial <quote>profiles</quote>, which are called spatial (or gridding) surrogates.</para>

<para>The cross-reference tables are applied to the sources in a stepwise manner, such that the most specific entry available is always applied. For example, if a cross-reference entry were available that matched a source by state, county, and SCC, SMOKE would apply that entry instead of a different cross-reference entry that matched that source only by SCC. The hierarchy that describes how each cross-reference file is applied to the inventory is described for each program in <xref linkend="ch_programs" />.</para>

<para><xref linkend="fig_concepts_xref" /> provides a generic example of how cross-reference files and profile files work together. In the example, the profile to be used for most of North Carolina is profile ID 16. Durham and Orange counties, however, are assigned profile 15, which would be preferentially applied to all sources in Durham and Orange counties, instead of using the general North Carolina profile. South Carolina sources would be assigned profile 17.</para>

<figure id="fig_concepts_xref">
<title>Generic example of how cross-reference files and profiles work together</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="3.5in" fileref="images/concepts/xref_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/xref_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para>This example does not correspond to a particular processing step (i.e., temporal allocation, chemical speciation, or spatial allocation), but rather assigns generic <quote>factors</quote> from profiles 15, 16, and 17 based on the state and county information in the cross-reference file. (Note that we have used the state and county names in this example, whereas real cross-reference files would use the country, state, and county codes according to the file format of the actual cross-reference files.)</para>

<para>SMOKE handles cross-references and profile application in a very efficient manner. In reading a cross-reference file, SMOKE first sorts the cross-reference entries using the same sort criteria as are used for the inventory sources (e.g., by country/state/county code, then by SCC, then by remaining source characteristics if any). Next, the cross-reference entries are grouped according to the <quote>level</quote> of matching of each of the entries. For example, all entries that could match to the inventory using only state and county codes would be in one group, while entries that could match to the inventory using only SCCs would be in another group. Once the cross-reference entries are grouped, SMOKE processes each source in the inventory and attempts to find a matching entry in one of the cross-reference groups. The most specific groups are searched first, and when a match is found for a particular source, the other groups are not searched. This helps increase efficiency. In addition, because the cross-reference entries are sorted within each group, an efficient searching algorithm can be used for each individual search. When a match to one of the cross-reference groups has been found, SMOKE continues to the next source in the inventory until all sources have been processed.</para>
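<para>The most-specific-match-first idea can be sketched in Python using the generic example from <xref linkend="fig_concepts_xref" />. This is an illustration under stated assumptions, not SMOKE's implementation: it uses dictionaries in place of SMOKE's sorted groups and search routine, it matches only on state and county, and the names <literal>XREF_BY_STATE_COUNTY</literal>, <literal>XREF_BY_STATE</literal>, and <literal>find_profile</literal> are hypothetical.</para>

<programlisting language="python"># Cross-reference entries grouped by match level, most specific first.
# Keys are (state, county) pairs or state codes; values are profile IDs.
# FIPS codes: North Carolina = 37, South Carolina = 45,
# Durham County = 063, Orange County = 135.
XREF_BY_STATE_COUNTY = {(37, 63): 15, (37, 135): 15}
XREF_BY_STATE = {37: 16, 45: 17}


def find_profile(state, county):
    """Return the profile ID from the most specific matching group."""
    groups = (
        (XREF_BY_STATE_COUNTY, (state, county)),
        (XREF_BY_STATE, state),
    )
    for table, key in groups:
        if key in table:
            return table[key]  # stop at the first (most specific) match
    return None                # no cross-reference entry matched


# A Durham source, another North Carolina source, a South Carolina source.
for source in [(37, 63), (37, 183), (45, 79)]:
    print(source, find_profile(*source))</programlisting>

<para>The Durham County source receives profile 15, the other North Carolina source falls back to profile 16, and the South Carolina source receives profile 17, matching the figure.</para>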

<para>Cross-references and profiles are used in the following SMOKE processing steps. These steps and their associated programs (listed in parentheses) will be described in the sections to come.</para>

<itemizedlist spacing="compact">
<listitem>Inventory import (<command>Smkinven</command>)

    <itemizedlist spacing="compact">
    <listitem>cross-references: <envar>NHAPEXCLUDE</envar>, <envar>VMTMIX</envar>, <envar>PSTK</envar>, <envar>ARTOPNT</envar></listitem>
    <listitem>profiles: none (factors are included in the cross-reference files when needed)</listitem>
    </itemizedlist>

</listitem>
<listitem>Temporal allocation (<command>Temporal</command>)

    <itemizedlist spacing="compact">
    <listitem>cross-references: <envar>ATREF</envar>, <envar>MTREF</envar>, <envar>PTREF</envar></listitem>
    <listitem>profiles: <envar>ATPRO</envar>, <envar>MTPRO</envar>, <envar>PTPRO</envar></listitem>
    </itemizedlist>

</listitem>
<listitem>Chemical speciation (<command>Spcmat</command>)

    <itemizedlist spacing="compact">
    <listitem>cross-references: <envar>GSREF</envar>, <envar>GSCNV</envar></listitem>
    <listitem>profiles: <envar>GSPRO</envar></listitem>
    </itemizedlist>

</listitem>
<listitem>Spatial allocation (<command>Grdmat</command>)

    <itemizedlist spacing="compact">
    <listitem>cross-references: <envar>AGREF</envar>, <envar>MGREF</envar></listitem>
    <listitem>profiles: <envar>AGPRO</envar>, <envar>MGPRO</envar> (<command>* Note</command>)</listitem>
    </itemizedlist>

</listitem>
<listitem>Growth and controls (<command>Cntlmat</command>)

    <itemizedlist spacing="compact">
    <listitem>cross-references: <envar>GCNTL</envar></listitem>
    <listitem>profiles: none (factors are included in the cross-reference files)</listitem>
    </itemizedlist>

</listitem>
<listitem>Mobile-source speed assignment (<command>Movesmrg</command>)

    <itemizedlist spacing="compact">
    <listitem>cross-references: <envar>MCXREF</envar>, <envar>MFMREF</envar></listitem>
    <listitem>profiles: <envar>SPDPRO</envar></listitem>
    </itemizedlist>

</listitem>
</itemizedlist>

<para>The hierarchies that each SMOKE program uses to assign cross-reference entries to sources are provided in <xref linkend="ch_programs" />, where the programs are described at length. The file contents and formats are described in more detail in <xref linkend="ch_input_files" />.</para>

<para><command>Note</command>: The environment variables <envar>AGPRO</envar> (area spatial surrogate file) and <envar>MGPRO</envar> (mobile spatial surrogate file) have been discontinued. Two new environment variables have been introduced to SMOKE: <envar>SRGPRO_PATH</envar> (spatial surrogate profile file location) and <envar>SRGDESC</envar> (description file with the specific list of available surrogates located in <envar>SRGPRO_PATH</envar>). See <xref linkend="fig_programs_grdmat" />. The surrogate files located in <envar>SRGPRO_PATH</envar> are refinements of the old [A|M]GPRO files. They have the same format as the old files; however, there may now be one or more surrogate files. <command>Grdmat</command> now processes each surrogate separately. On domains with large cell counts, this approach limits memory usage at the expense of slightly longer run times.</para>

</section>

<section id="sect_concepts_basic_formats">

<title>Input and output file types</title>

<para>Before we describe more about the SMOKE processing, we first need to explain the types of files you will encounter in this documentation. SMOKE primarily uses two types of file formats: ASCII files and I/O API files. In addition, the output file format for the UAM-based air quality model is a Fortran binary file format. <xref linkend="ch_input_files" />, <xref linkend="ch_intermediate_files" />, and <xref linkend="ch_output_files" /> describe all input, intermediate, and output files, including the file format for each one. Input files are files that are read by at least one core SMOKE program (listed in <xref linkend="ch_programs" />), but are not written by a core program. Intermediate files are files that are written by a core program and read by at least one other core program. Output files are files output by a SMOKE core program but not read by any of them; these files include reports, log files, and the model-ready files to be input to an air quality model. (Exception: one intermediate file [used by a core program] is also an output file [used by an AQM]: the <envar>STACK_GROUPS</envar> file, described in <xref linkend="sect_intmed_stack_groups" />.) In this section, we further describe the ASCII and I/O API files, and then provide information about the two approaches for formatting the model-ready output files produced by SMOKE (the CMAQ/Models-3 approach and the UAM-based approach).</para>

<para>SMOKE&rsquo;s input files are primarily ASCII files, although a few I/O API files are used. The intermediate files in SMOKE are primarily I/O API files, although there are several important ASCII files as well. The output files from SMOKE are primarily I/O API files and Fortran binary files for model-ready emissions files, and ASCII files for reports and logs.</para>

<section>

<title>ASCII files</title>

<para>ASCII files are simply the text files with which most computer users are familiar. The ASCII files input by SMOKE come in two structures: <emphasis>column-specific</emphasis> and <emphasis>list-directed</emphasis>.</para>

<section id="sect_concepts_column_specific">

<title>Column-specific ASCII files</title>

<para>In column-specific files, the fields in the files must appear in certain columns in the file. Each character on a line represents a single column. The lines below represent a column-specific ASCII data file:</para>

<programlisting>TEST 1 2 3

Additional data</programlisting>

<para>The letters <literal>TEST</literal> are in columns 1 through 4 of the file and the numbers 1, 2, and 3 are in columns 6, 8, and 10 respectively:</para>

<programlisting>123456789012345
TEST 1 2 3

Additional data</programlisting>
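
<para>As a brief illustration of reading such a line by column position, the Python sketch below extracts the same fields. The column numbers in the text are 1-based, whereas Python slices are 0-based and end-exclusive.</para>

<programlisting language="python">line = "TEST 1 2 3"

# Columns 1 through 4 correspond to the slice line[0:4].
name = line[0:4]
# Columns 6, 8, and 10 correspond to indexes 5, 7, and 9.
values = [int(line[col - 1]) for col in (6, 8, 10)]

print(name, values)  # TEST [1, 2, 3]</programlisting>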

</section>

<section id="sect_concepts_list_directed">

<title>List-directed ASCII files</title>

<para>In list-directed files, the exact positioning of the fields on a line is not important, but the order of the fields on that line is crucial. The fields must be delimited (separated) by special characters called delimiters; in SMOKE, valid delimiters are <emphasis role="bold">spaces</emphasis>, <emphasis role="bold">commas</emphasis>, or <emphasis role="bold">semicolons</emphasis>. If a particular field happens to contain any of these delimiters within it, then that field must be surrounded by single or double quotes in the input file.</para>
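<para>A minimal Python sketch of splitting a list-directed line follows. It is an illustration only and is not SMOKE's parser: the function name <literal>split_fields</literal> is hypothetical, and the sketch treats any run of delimiters as a single separator, so it does not preserve empty fields between consecutive commas or semicolons.</para>

<programlisting language="python">import re

# One field is a double-quoted string, a single-quoted string, or a run of
# characters that contains no space, comma, or semicolon.
FIELD = re.compile(r'"([^"]*)"|\'([^\']*)\'|([^\s,;]+)')


def split_fields(line):
    """Split one list-directed line into its fields, honoring quotes."""
    return [a or b or c for a, b, c in FIELD.findall(line)]


# The quoted field keeps its embedded comma and space.
print(split_fields('37063, "Durham, NC"; 15'))
# ['37063', 'Durham, NC', '15']</programlisting>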

</section>

</section>

<section>

<title>I/O API files</title>

<para>I/O API files are read and written by the I/O API library used by SMOKE and other Models-3 programs. A library is a set of routines that have been created and compiled for use by multiple programs. The I/O API library, in turn, is built upon yet another library called the NetCDF library. For this reason, I/O API files are also referred to as I/O API NetCDF files. More information on both of these libraries is available at the <ulink url="http://www.baronams.com/products/ioapi/">I/O API web site</ulink>. <comment><xref linkend="sect_install_compile" /> contains instructions for obtaining the I/O API and NetCDF libraries.</comment></para>

<para>The I/O API files cannot be viewed with a text editor because they are binary files. These binary files use less disk space than ASCII files containing the same data. They also allow much more efficient input and output of the data, and the I/O API library provides many quality assurance (QA) features useful for all input and output (I/O), including I/O for emissions processing.</para>

<para>The basic I/O API file has a limitation of 120 variables per file. To overcome this, SMOKE uses a wrapper called the FileSetAPI that creates and manages multiple I/O API files when more than 120 variables are needed in a single I/O API dataset in SMOKE. For example, if the SMOKE speciation matrix requires 140 pollutant-to-species variables, SMOKE will open by default two standard I/O API files: one with 120 variables and one with 20 variables. This resulting <quote>file set</quote> will be treated by other SMOKE programs as a single file, which enables processing of any number of pollutants and species in a single run, despite the I/O API variable limitation.</para>
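<para>The splitting arithmetic in this example can be sketched in Python as follows. This shows only how a variable list divides into groups of at most 120; it does not use or represent the FileSetAPI itself, and the function and variable names are hypothetical.</para>

<programlisting language="python">MAX_VARS_PER_FILE = 120  # I/O API per-file variable limit cited above


def split_into_file_set(variable_names, limit=MAX_VARS_PER_FILE):
    """Chunk a variable list into groups that each fit in one I/O API file."""
    return [variable_names[i:i + limit]
            for i in range(0, len(variable_names), limit)]


names = ["VAR%03d" % n for n in range(1, 141)]  # 140 hypothetical variables
print([len(chunk) for chunk in split_into_file_set(names)])  # [120, 20]</programlisting>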

<para>Some I/O API files can be viewed by the <ulink url="http://www.verdi-tool.org">Visualization Environment for Rich Data Interpretation</ulink> (VERDI). In SMOKE, any gridded output file from the <command>Smkmerge</command>, <command>Mrggrid</command>, or <command>Smk2emis</command> programs can be viewed by VERDI.</para>

<para>In some cases, it can be helpful to directly view the contents of the I/O API files in text form. This provides a quick way to check grid settings, time period, or species names in the model-ready output files. By viewing the text version of the model-ready output files produced by SMOKE, you can easily confirm that the correct species have been created or that the emission units are correct. To convert the I/O API files to text, one can use a combination of the NetCDF-provided <command>ncdump</command> utility and UNIX commands. The <command>ncdump</command> utility is created when you compile the NetCDF library, or you can download it from the <ulink url="http://www.unidata.ucar.edu/packages/netcdf/">NetCDF web site</ulink>. The command to convert the files to text format is:</para>

<para><userinput><command>ncdump</command> <replaceable>&lt;infile&gt;</replaceable> | <command>cut -c1-80</command> &gt; <replaceable>&lt;outfile&gt;</replaceable></userinput></para>

<para>Replace <replaceable>&lt;infile&gt;</replaceable> in the command above with your input I/O API file name, and <replaceable>&lt;outfile&gt;</replaceable> with your desired ASCII output file name. The output file contains all the applicable data stored in the I/O API file, including grid information, time period, and variable names.</para>

</section>

<section id="sect_concepts_model_ready_files">

<title>Model-ready files</title>

<para>SMOKE supports two major approaches for formatting its output files that are used as inputs to air quality models (i.e., model-ready files): the CMAQ/Models-3 approach and the UAM-based approach. The CMAQ/Models-3 approach is used for the CMAQ model, and the UAM-based approach is used for the UAM models and CAM<subscript>X</subscript>.</para>

<para>The CMAQ/Models-3 approach uses one required 3-D I/O API file that contains the gridded, hourly, speciated, and vertically distributed emissions. In SMOKE, it is called the <envar>EGTS3D_L</envar> file. To create the 3-D model-ready emissions file, SMOKE computes plume rise for some or all point sources. For CMAQ, two additional optional files can be provided for plume-in-grid (PinG) processing. The first, called the <envar>STACK_GROUPS</envar> file, must contain the locations and stack parameters for the PinG sources. The second, called the <envar>PINGTS_L</envar> file, must contain the hourly, speciated emissions for the same PinG sources.</para>

<para>The UAM-based approach has two required files: (1) a 2-D emissions Fortran binary file that contains the emissions from all sources other than point sources, plus all low-level point sources, and (2) an elevated-point-source Fortran binary file. The SMOKE program <command>Smk2emis</command> can create the 2-D emissions Fortran binary file (called the <envar>UAM_EGTS</envar> file) by converting a 2-D <envar>EGTS_L</envar> file from I/O API format. To obtain the elevated-point-source Fortran binary file, first use the SMOKE program <command>Smkmerge</command> to create an ASCII elevated-point-source file, then convert that file to the required binary format using the UAM preprocessor <ulink url="http://www.remsad.com/ptsrce.htm">Ptsrce</ulink>.</para>

</section>

</section>

<section>

<title>Modeling parameters</title>

<para>Emissions modeling requires information about the subsequent air quality modeling that will be done. For example, to produce appropriate model-ready files using SMOKE, you must know which AQM will be used, the model grid and map projection, the episode dates, and the chemical mechanism to be used. In this manual, we refer to these settings collectively as <quote>modeling parameters</quote>. In this section, we provide information on what these modeling parameters are and SMOKE&rsquo;s capabilities to support them.</para>

<para>SMOKE reads in the modeling parameters from both script settings (environment variables) and input files. In the subsections below, we provide the relevant settings and files that control the modeling parameters<comment>. More information about how to configure your scripts and files to change these parameters can be found in <xref linkend="sect_scripts_how_use_smoke" /></comment>; how the settings affect the programs is described in <xref linkend="ch_utilities" /> and <xref linkend="ch_programs" />.</para>

<section>

<title>Map projections and model grids</title>

<para>A map projection is the mathematical representation of the spherical surface of the earth on a 2-D plane. SMOKE supports Lambert conformal, lat-lon, UTM, and polar stereographic map projections. There are many different settings that you may use to define your Lambert conformal, UTM, and polar stereographic projections, to make these projections match the one being used by your meteorology model and AQM. (Lat-lon is a fixed projection and cannot be changed.)</para>

<para>A model grid is a two-dimensional region overlaid on a map projection. It is defined by the starting <emphasis>x-y</emphasis> coordinates, the number of grid cells in each direction, and the physical size of the grid cells. <xref linkend="fig_concepts_grid" /> shows an example of a model grid that includes most of the eastern U.S. This example has <comment>starting coordinates of,</comment> 81 grid cells in the <emphasis>x</emphasis>-direction, 75 grid cells in the <emphasis>y</emphasis>-direction, and grid cells that each measure 36 by 36 kilometers. Each set of 10 cells by 10 cells (counting from the starting coordinates) is enclosed in black grid lines.</para>

<figure id="fig_concepts_grid">
<title>Example model grid</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="5in" fileref="images/concepts/grid_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/grid_html.jpg" />
</imageobject>
</mediaobject>
</figure>
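
<para>To make the grid definition concrete, the following Python sketch computes the extent of the example grid and finds the cell that contains a given point. It is an illustration only. The class and method names are hypothetical, the starting coordinates of (0, 0) are placeholders rather than the values of the example grid, and the sketch assumes that the starting coordinates mark the lower-left corner of the grid.</para>

<programlisting language="python"><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelGrid:
    """A regular grid in projection coordinates (all distances in km)."""
    x_orig_km: float  # starting x coordinate (assumed lower-left corner)
    y_orig_km: float  # starting y coordinate (assumed lower-left corner)
    ncols: int        # number of cells in the x-direction
    nrows: int        # number of cells in the y-direction
    cell_km: float    # edge length of each square cell

    def extent_km(self):
        """Return (xmin, ymin, xmax, ymax) of the whole grid."""
        return (
            self.x_orig_km,
            self.y_orig_km,
            self.x_orig_km + self.ncols * self.cell_km,
            self.y_orig_km + self.nrows * self.cell_km,
        )

    def cell_of(self, x_km, y_km):
        """Return the 1-based (column, row) holding a point, or None."""
        col = int((x_km - self.x_orig_km) // self.cell_km)
        row = int((y_km - self.y_orig_km) // self.cell_km)
        if 0 <= col < self.ncols and 0 <= row < self.nrows:
            return col + 1, row + 1
        return None


# Cell counts and cell size from the example; the origin is a placeholder.
grid = ModelGrid(x_orig_km=0.0, y_orig_km=0.0, ncols=81, nrows=75, cell_km=36.0)
xmin, ymin, xmax, ymax = grid.extent_km()
print(xmax - xmin, ymax - ymin)   # 2916.0 2700.0
print(grid.cell_of(100.0, 40.0))  # (3, 2)
```
]]></programlisting>

<para>With 81 by 75 cells of 36 kilometers each, the example grid spans 2916 kilometers in the <emphasis>x</emphasis>-direction and 2700 kilometers in the <emphasis>y</emphasis>-direction, whatever its starting coordinates are.</para>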

<para>The model grid is set in SMOKE using the <envar>IOAPI_GRIDNAME_1</envar> setting to select a grid and map projection from among those defined in the <envar>GRIDDESC</envar> input file. The name of the grid set with the <envar>IOAPI_GRIDNAME_1</envar> setting must match a grid name in the <envar>GRIDDESC</envar> file to allow SMOKE to obtain the grid and map projection parameters from the <envar>GRIDDESC</envar> file.</para>

</section>

<section>

<title>Base year and past/future years</title>

<para>For any modeling effort, the emissions base year and future year are key modeling parameters needed for performing emissions processing. The base year is usually the year for which the air quality model is being run in order to compare modeling results with observed air quality data. Such comparisons allow modelers to tune the emissions data and air quality model, to ensure that the AQM is performing adequately during the modeling episode.</para>

<para>The base year is most often a year for which an emission inventory is available. This is usually the same year for which the meteorology model has been run to prepare input to SMOKE and an AQM and for which air quality observations are available. Of course there are exceptions to this principle, but generally that is how one establishes a base year. <comment>At the time of this writing, versions of the 1996 criteria and particulate inventory and 1999 criteria, particulate, and toxics inventories are available from EPA.</comment></para>

<para>Several different files and settings are used to set the base year in SMOKE, all of which should be consistent with one another for ideal results.</para>

<itemizedlist>
<listitem>
<para>The <envar>YEAR</envar> setting in the SMOKE Assigns file is the reference point used by the scripts to determine the base year and set the names of various year-specific input files.</para>
</listitem>
<listitem>
<para>The episode and run settings (see <xref linkend="sect_concepts_modeling_episodes" />) determine the base year that will be used in the model-ready output files. This base year must match the <envar>YEAR</envar> setting so that the correct input files are used.</para>
</listitem>
<listitem>
<para>The input emissions files should ideally contain data for the same base year, and the #YEAR header setting in those files should be consistent with the <envar>YEAR</envar> environment variable in the Assigns file. If the years in the annual inventory files are not consistent with each other, SMOKE will determine the year used by the most sources and set that as the base year. If day-specific or hour-specific data are used, all years in those files must be consistent with the base year of the annual emissions.</para>
</listitem>
<listitem>
<para>The MOVES input data, if they are being used, should also be consistent with the base year. SMOKE is capable of running MOVES with inputs from a different year, but certain inputs may not be correct.</para>
</listitem>
<listitem>
<para>Finally, the dates in the I/O API meteorology data from the Meteorology-Chemistry Interface Processor (MCIP) must be consistent with both the base year and the episode and run settings.</para>
</listitem>
</itemizedlist>

<para>The future (or past) year is a chosen year in the future (or past) for which a modeler needs to run an air quality model; for example, to model the future effects of particular emission control strategies. To model a future year with SMOKE, you must have either an inventory that has been computed for a future year, or growth and control factors to project the base-year inventory to the future year. The settings and files that must be considered are as follows:</para>

<itemizedlist>
<listitem>
<para>The setting <envar>FYEAR</envar> is set in the run script and is used by the script to automatically assign the name of the <command>Cntlmat</command> input file <envar>GCNTL</envar>, which contains the growth factors. <envar>FYEAR</envar> must be set to the future year even if a future-year inventory is not being created because it has already been provided to you.</para>
</listitem>
<listitem>
<para>If you already have a future-year inventory and so do not need to use SMOKE to project one from the base year inventory, then the emissions data year must match the future year, and the #YEAR header in the inventory file must match that year as well. In this case, the <envar>SMK_BASEYR_OVERRIDE</envar> setting must also be used to indicate what the base year is (which will be the same as the year of the meteorology data).</para>
</listitem>
<listitem>
<para>The MOVES input data, if they are being used, must also include the correct settings for the future year of interest.</para>
</listitem>
<listitem>
<para>The episode and run settings, meteorology files, and day- or hour-specific inventories should <emphasis>not</emphasis> match the future year, but rather should use the base-year episode dates.</para>
</listitem>
</itemizedlist>

</section>

<section id="sect_concepts_modeling_episodes">

<title>Modeling episodes</title>

<para>The modeling episode is the total time period for which you will run SMOKE and your AQM. Unless the episode is just a few days long, users typically set up SMOKE to create emissions files of a shorter duration than their modeling episode, often creating one-day files for each day of their episode. Though SMOKE can create a single file for an entire episode, the file often becomes too large for some computers to handle (for example, 32-bit operating systems typically limit files to 2 GB), so necessity rather than preference dictates that smaller files (usually one-day files) be created by SMOKE. We use the term <quote>run period</quote> to distinguish between these shorter durations and the full modeling episode; unless otherwise noted, we will assume that the run period is one day. For example, a typical SMOKE episode might cover July 1, 1996 through July 31, 1996. There will be 31 run periods (days) within this episode, the first starting on July 1, 1996 and the last starting on July 31, 1996.</para>

<comment><para>In the SMOKE Assigns file, there are several settings that you need to change to cause SMOKE to create emissions for the episode of interest. <xref linkend="sect_scripts_how_use_smoke" /> provides more guidance on the particular form and approaches needed for using these settings.</para></comment>

<itemizedlist>
<listitem>
<para>The episode start date (<envar>EPI_STDATE</envar>), episode start time (<envar>EPI_STTIME</envar>), episode duration in hours (<envar>EPI_RUNLEN</envar>), and the episode number of days (<envar>EPI_NDAY</envar>) all must be set to cover the modeling episode. Note that SMOKE can only be run for periods contained within a single calendar year. It cannot, for example, start in December of 1996 and run through January of 1997. Two separate episodes would need to be set up in this case, with the first ending on December 31, 1996, and the second starting on January 1, 1997.</para>
</listitem>
<listitem>
<para>The start date of the first run period needs to be set using the <envar>G_STDATE</envar> and <envar>ESDATE</envar> settings. The <envar>G_STDATE</envar> is the year and Julian day setting used by the SMOKE programs; in our example above, <envar>G_STDATE</envar> would be set to 1996183, since July 1 is the 183rd day of 1996. The <envar>ESDATE</envar> is the Gregorian date used in naming the SMOKE intermediate and output files; for our example, <envar>ESDATE</envar> would be 19960701. The SMOKE scripts will use the <envar>EPI_NDAY</envar> setting to automatically loop through the number of run periods in the episode, starting with the first <envar>G_STDATE</envar> value in the Assigns file. The <envar>G_STDATE</envar> and <envar>ESDATE</envar> settings are changed for each run period.</para>
</listitem>
<listitem>
<para>The run period start time (<envar>G_STTIME</envar>) and duration (<envar>G_RUNLEN</envar>) must also be set to indicate the start time and length of each run period. Both values are provided as a number of hours, using a HHMMSS (hours, minutes, seconds) format.</para>

<para>The run period duration (<envar>G_RUNLEN</envar>) is usually not the same as the episode duration (<envar>EPI_RUNLEN</envar>). For example, if the episode length is 30 days (720 hours), the run period duration setting could be just 1 day (25 hours), 2 days (49 hours), or 3 days (73 hours) (the reason for the extra hour in each case is explained below). In the first case, SMOKE would create thirty 25-hour files; in the second case, fifteen 49-hour files; and in the third case, ten 73-hour files.</para>
</listitem>
<listitem>
<para>The <envar>NDAYS</envar>, <envar>MSDATE</envar>, and <envar>MDAYS</envar> settings are used for naming files. The <envar>NDAYS</envar> setting should be set to the number of days in each run period, and is used by default for naming time-based files. The <envar>NDAYS</envar> setting is also used along with the <envar>EPI_NDAY</envar> setting to loop through the run periods in the episode. The <envar>MSDATE</envar> and <envar>MDAYS</envar> settings can be used for naming the meteorology input files, but are not being used by the default Assigns file provided with SMOKE.</para>
</listitem>
</itemizedlist>
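
<para>The two date forms used by these settings can be derived from each other. The Python sketch below is an illustration only and is not part of the SMOKE scripts; the function names are hypothetical. It builds the year-and-Julian-day value (the form used for <envar>G_STDATE</envar>) and the Gregorian value (the form used for <envar>ESDATE</envar>) for each one-day run period, and rejects an episode that would cross a calendar-year boundary.</para>

<programlisting language="python"><![CDATA[
```python
from datetime import date, timedelta


def to_julian(day):
    """Return year and day-of-year as a YYYYDDD integer."""
    return day.year * 1000 + day.timetuple().tm_yday


def to_gregorian(day):
    """Return the date as a YYYYMMDD integer."""
    return int(day.strftime("%Y%m%d"))


def run_period_starts(first_day, n_days):
    """List (Julian, Gregorian) start dates for one-day run periods."""
    if n_days < 1:
        raise ValueError("an episode needs at least one day")
    last_day = first_day + timedelta(days=n_days - 1)
    if last_day.year != first_day.year:
        raise ValueError("an episode must stay within one calendar year")
    days = [first_day + timedelta(days=i) for i in range(n_days)]
    return [(to_julian(d), to_gregorian(d)) for d in days]


# The July 1996 example episode: 31 one-day run periods.
periods = run_period_starts(date(1996, 7, 1), 31)
print(periods[0])   # (1996183, 19960701)
print(periods[-1])  # (1996213, 19960731)
```
]]></programlisting>

<para>The first pair reproduces the values given above for July 1, 1996. A call such as <literal>run_period_starts(date(1996, 12, 31), 2)</literal> raises an error, which mirrors the rule that an episode running from December into January must be split into two episodes.</para>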

<para>There are a few key things to remember when you are verifying that you have the correct episode settings:</para>

<itemizedlist>
<listitem>
<para>SMOKE cannot process emissions over a calendar-year break. Thus, the longest run that can be done is for one full calendar year (365 days, or 366 days in a leap year), with the episode start date being January 1. If a modeling episode spans multiple years, then a different Assigns file, script, and sets of input files must be created for each year.</para>
</listitem>
<listitem>
<para>The AQMs supported by SMOKE always need one extra hour in each emissions input file due to how they calculate boundary conditions. Therefore, if you are inputting emissions to run a 24-hour period, the <envar>G_RUNLEN</envar> setting should be 250000 for 25 hours.</para>
</listitem>
<listitem>
<para>The CMAQ and CAM<subscript>X</subscript> models can accept emissions files for multiple days, but the UAM must have 25-hour files only. As stated earlier, however, all of these models are often run using 25-hour files, with one file for each day of the episode.</para>
</listitem>
<listitem>
<para>All times are associated with a time zone, including the episode and run period start time settings. These settings must be consistent with the time zone of the meteorology files. If the meteorology data were created using MM5, the time zone is most likely Greenwich Mean Time (GMT); therefore, the <envar>EPI_STDATE</envar>, <envar>EPI_STTIME</envar>, <envar>G_STDATE</envar>, and <envar>G_STTIME</envar> settings would have to be provided in that same time zone. Whatever time zone is inherent in the meteorology files and these date settings will also be the time zone of the dates and times in the output emissions files from SMOKE. This ensures that the dates and times of the emissions and meteorology files are consistent for input to the AQM.</para>
</listitem>
</itemizedlist>
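
<para>The extra-hour rule and the file counts from the 30-day example can be checked with a short Python sketch. It is an illustration only; the function names are hypothetical, and it assumes run periods that are a whole number of days and that divide the episode evenly.</para>

<programlisting language="python"><![CDATA[
```python
def run_length_hhmmss(days_per_period, extra_hours=1):
    """Format a run-period duration as an HHMMSS string.

    The duration is 24 hours per day plus the one extra hour that the
    AQMs need in each emissions file.
    """
    if days_per_period < 1:
        raise ValueError("a run period needs at least one day")
    hours = days_per_period * 24 + extra_hours
    return f"{hours:02d}0000"


def files_in_episode(episode_days, days_per_period):
    """Return the number of files when run periods divide the episode evenly."""
    if episode_days % days_per_period != 0:
        raise ValueError("episode length must be a multiple of the run period")
    return episode_days // days_per_period


for days in (1, 2, 3):
    print(days, run_length_hhmmss(days), files_in_episode(30, days))
# 1 250000 30
# 2 490000 15
# 3 730000 10
```
]]></programlisting>

<para>The first row corresponds to the usual case of one 25-hour file for each day of the episode.</para>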

</section>

<section>

<title>Chemical mechanisms</title>

<para>SMOKE can accommodate a variety of chemical mechanisms for the models it supports. From the emissions processing perspective, the chemical mechanism is the mapping of the pollutants provided in the emissions inventory to the species needed by the AQM of interest. For example, the input files for five chemical mechanisms for the CMAQ model are available for download from the EPA; these mechanisms are Carbon Bond 6 (CB6), CB6 with particulates, Regional Acid Deposition Model, version 2 (RADM2), RADM2 with particulates, and a research version of CB6 with toxics.</para>

<comment><para>In <xref linkend="sect_scripts_change_speciation" />, we provide the settings needed in the Assigns file to use a different chemical mechanism with SMOKE. SMOKE is not constrained to the files available for download. If you need to process other data (e.g., a tracer species) with SMOKE, they can be added to several input files, including the chemical mechanism file, to be output to the AQM. Some additions to chemical mechanisms are easier than others, and we explain how to determine whether you can create the files you need for your situation. We also give instructions on how to add species to the chemical mechanism files and how to make sure that the inventory pollutants are mapped to the correct chemical species.</para></comment>

<para>SMOKE users must know what chemical mechanism will be used in the AQM for which the SMOKE output emissions are intended. Once that has been determined, the following files must be configured to be consistent with the inventory being used and the chemical mechanism: the inventory table (<envar>INVTABLE</envar>), speciation profiles (<envar>GSPRO</envar>), speciation cross-reference (<envar>GSREF</envar>), and the mobile processes file (<envar>MEPROC</envar>) when creating on-road mobile emissions with MOVES through SMOKE.</para>

</section>

<section>

<title>Layer structures</title>

<para>SMOKE needs information on layer structures for processing elevated point sources&rsquo; plume rise in the <command>Laypoint</command> program and creating the ASCII elevated-point-source file (<envar>ELEVTS_L</envar> or <envar>ELEVTS_S</envar>) with the <command>Smkmerge</command> program. The way SMOKE obtains the layer information differs depending on whether you are creating emissions using a CMAQ-based or UAM-based approach (see <xref linkend="sect_concepts_model_ready_files" />). For the CMAQ-based approach, SMOKE determines the layer structure from the structure included in the header of the <envar>GRID_CRO_3D</envar> meteorology file. For the UAM-based approach, SMOKE does not really need to know the layer structure, except to output it to the ASCII elevated-point-source file. In this case, there are many settings obtained by <command>Smkmerge</command> from environment variable names starting with <envar>UAM_</envar>.</para>

</section>

</section>

<section id="sect_concepts_sparse_matrix">

<title>Sparse matrix approach to emissions modeling</title>

<para>The paradigm for atmospheric emissions models prior to SMOKE was a network of pipes and filters. This means that at any given stage in the processing, an emissions file includes self-contained records describing each source and <emphasis>all</emphasis> of the attributes acquired from previous processing stages. Each processing stage acts as a filter that inputs a stream of these fully-defined records, combines it with data from one or more support files, and produces a new stream of these records. Redundant data are passed down the pipe at the cost of extra I/O, storage, data processing, and program complexity. Using this method, all processing is performed one record at a time, without necessarily a structure or order to the records.</para>

<para>This old paradigm came about as a way to avoid repeatedly searching through data files for needed information, which would be very inefficient. It is admirably suited to older computer architectures with very small available memories and tape-only storage, but is not suitable for current desktop machines or high-performance computers. SMOKE developers demonstrated this when the Emissions Preprocessor System (EPS) 2.0 was run on a Cray Y-MP. It ran four times slower on the Cray machine (a much faster computer) than on a desktop 150 MHz DEC Alphastation 3000/300. This paradigm also fostered a serial approach to the emissions processing steps, as shown in <xref linkend="fig_concepts_serial_approach" />.</para>

<figure id="fig_concepts_serial_approach">
<title>Serial approach to emissions processing</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/serial_approach_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/serial_approach_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para>The new paradigm implemented in SMOKE came about from analyses indicating that emissions computations should be quite adaptable to high-performance computing if the paradigm were appropriately changed. For each SMOKE processing category (i.e., area, biogenic, mobile, and point sources), the following tasks are performed:</para>

<itemizedlist>
<listitem>
<para>read emissions inventory data files</para>
</listitem>
<listitem>
<para>optionally grow emissions from the base year to the (future or past) modeled year (except biogenic sources)</para>
</listitem>
<listitem>
<para>transform inventory species into chemical mechanism species defined by an AQM</para>
</listitem>
<listitem>
<para>optionally apply emissions controls (except for biogenic sources)</para>
</listitem>
<listitem>
<para>model the temporal distribution of the emissions, including any meteorology effects</para>
</listitem>
<listitem>
<para>model the spatial distribution of the emissions</para>
</listitem>
<listitem>
<para>merge the various source categories of emissions to form input files for the AQM</para>
</listitem>
<listitem>
<para>at every step of the processing, perform quality assurance on the input data and the results</para>
</listitem>
</itemizedlist>

<para>Each processing category has its particular complexities and deviations from the above list; these are described in <xref linkend="sect_concepts_processing_summaries" />. For all categories, however, most of the needed processing steps are <emphasis>factor-based</emphasis>; they are linear operations that can be represented as multiplication by matrices. Further, some of the matrices are <emphasis>sparse</emphasis> matrices (i.e., most of their entries are zeros).</para>

<para>SMOKE is designed to take advantage of these facts by formulating emissions modeling in terms of sparse matrix operations, which can be performed by optimized sparse matrix libraries. Specifically, the inventory emissions are arranged as a vector of emissions sorted in a particular order, with associated vectors that include characteristics about the sources such as the state/county and SCCs. SMOKE then creates matrices that apply the control, gridding, and speciation factors to the vector of emissions. In many cases, these matrices are independent from one another, and can therefore be generated in parallel and applied to the inventory in a final <quote>merge</quote> step, which combines the inventory emissions vector (now an hourly inventory file) with the control, speciation, and gridding matrices to create model-ready emissions. <xref linkend="fig_concepts_parallel_approach" /> shows how the matrix approach allows for a more parallel approach to emissions processing, in which fewer steps depend on other needed steps.</para>

<para>Note that in <xref linkend="fig_concepts_parallel_approach" />, temporal allocation outputs hourly emissions instead of a temporal matrix. This is because of some peculiarities with temporal modeling for point sources, which can use hourly emissions as input data. To be able to overwrite the inventory emissions with these hourly emissions, the temporal allocation step must output the emissions data. The matrix approach is used internally in the temporal allocation step.</para>

<para>The growth and controls steps shown in <xref linkend="fig_concepts_parallel_approach" /> are optional. If the inventory is not grown to a future or past year, then the temporal allocation step uses the original inventory vectors to calculate the hourly emissions.</para>
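
<para>The merge step described above can be illustrated with a small Python sketch. It is a toy example only and does not reflect SMOKE&rsquo;s file formats or code: the emissions, factors, species names, and grid cells are all made up, and the sparse matrices are held in plain dictionaries that store only the nonzero entries.</para>

<programlisting language="python"><![CDATA[
```python
# Hourly emissions of one inventory pollutant for three sources (one hour).
emissions = [10.0, 4.0, 6.0]

# Control matrix: one multiplicative factor per source (1.0 = uncontrolled).
control = [1.0, 0.5, 1.0]

# Speciation matrix: for each source, species name -> split factor.
speciation = [
    {"SPEC_A": 0.9, "SPEC_B": 0.1},
    {"SPEC_A": 0.9, "SPEC_B": 0.1},
    {"SPEC_A": 0.5, "SPEC_B": 0.5},
]

# Gridding matrix, stored sparsely: grid cell -> {source index: fraction}.
# Only nonzero entries are kept; most source/cell pairs have none.
gridding = {
    (1, 1): {0: 1.0, 1: 0.25},
    (2, 1): {1: 0.75},
    (2, 2): {2: 1.0},
}


def merge(emissions, control, speciation, gridding):
    """Combine the vectors and matrices into gridded, speciated emissions."""
    gridded = {}
    for cell, sources in gridding.items():
        for src, fraction in sources.items():
            controlled = emissions[src] * control[src]
            for species, split in speciation[src].items():
                key = (cell, species)
                gridded[key] = gridded.get(key, 0.0) + fraction * controlled * split
    return gridded


result = merge(emissions, control, speciation, gridding)
for (cell, species), value in sorted(result.items()):
    print(cell, species, round(value, 6))
```
]]></programlisting>

<para>Because the control, speciation, and gridding factors are held separately until the merge, any one of them can be replaced and the merge repeated without recomputing the others.</para>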

<figure id="fig_concepts_parallel_approach">
<title>Parallel approach to emissions processing</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/parallel_approach_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/parallel_approach_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para>Several benefits can be realized from this more parallel approach. For example, given a single emissions inventory, temporal modeling is performed only once per inventory and episode (though in practice, this step is often performed once per episode day). Also, gridding matrices typically need only be calculated once per inventory and model grid definition, without having to reprocess other steps. As shown in <xref linkend="fig_concepts_additional_grid" />, SMOKE usually needs to rerun only the gridding and merge steps to process a different grid for the same inventory. The merge step in the figure will read the previously created results from the temporal allocation, chemical speciation, and control processing steps.</para>

<figure id="fig_concepts_additional_grid">
<title>Processing steps for running an additional grid in SMOKE</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/additional_grid_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/additional_grid_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para>In addition, speciation matrices need only be calculated once per inventory and chemical mechanism. Similar to the gridding example, <xref linkend="fig_concepts_additional_chemical" /> shows the SMOKE steps that generally need to be rerun for running an additional chemical mechanism.</para>

<figure id="fig_concepts_additional_chemical">
<title>Processing steps for running an additional chemical mechanism in SMOKE</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/additional_chemical_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/additional_chemical_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para>A final example of how this approach is beneficial is processing with a control strategy. Because of SMOKE&rsquo;s parallel processing, changing a control strategy requires only the control and merge steps to be processed again (<xref linkend="fig_concepts_running_control" />). In serial processing, on the other hand, the growth and controls step occurs as the second processing step, which requires that all downstream steps be redone. In <xref linkend="fig_concepts_running_control" />, the speciation, temporal allocation, and gridding steps have already been run, and can be fed to the merge step without being altered or regenerated.</para>

<figure id="fig_concepts_running_control">
<title>Processing steps for running a control scenario in SMOKE</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/running_control_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/running_control_html.jpg" />
</imageobject>
</mediaobject>
</figure>
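
<para>The rerun rules in the three examples above can be summarized as a small dependency table. The Python sketch below is a simplified illustration under stated assumptions: the step and input names are informal labels rather than SMOKE program names, and growth, inventory import, and the exceptions listed later in this section are left out.</para>

<programlisting language="python"><![CDATA[
```python
# Inputs that each processing step depends on, as described in the text.
STEP_INPUTS = {
    "temporal allocation": {"inventory", "episode"},
    "chemical speciation": {"inventory", "chemical mechanism"},
    "gridding": {"inventory", "model grid"},
    "controls": {"inventory", "control strategy"},
}


def steps_to_rerun(changed_input):
    """Return the steps to rerun, in order, when one input changes."""
    rerun = [
        step for step, inputs in STEP_INPUTS.items()
        if changed_input in inputs
    ]
    if rerun:
        rerun.append("merge")  # the merge combines the outputs of all steps
    return rerun


print(steps_to_rerun("model grid"))          # ['gridding', 'merge']
print(steps_to_rerun("chemical mechanism"))  # ['chemical speciation', 'merge']
print(steps_to_rerun("control strategy"))    # ['controls', 'merge']
```
]]></programlisting>

<para>In a serial design, by contrast, a change to an early step would force every downstream step to be rerun.</para>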

<para>Although SMOKE processing generally follows the structure shown in <xref linkend="fig_concepts_parallel_approach" />, there are some exceptions. In the list below, we summarize these exceptions and provide references to the sections of this chapter where these exceptions are explained and shown through diagrams. These exceptions are also described in more detail in <xref linkend="sect_concepts_area_source_processing" />, <xref linkend="sect_concepts_biogenic_source_processing" />, <xref linkend="sect_concepts_mobile_source_processing_moves" />, and <xref linkend="sect_concepts_point_source_processing" />.</para>

<itemizedlist>

<listitem>
<para><emphasis role="bold">On-road mobile processing with MOVES:</emphasis> One way of processing on-road mobile-source emissions is to have SMOKE run the MOVES model based on hourly, gridded meteorology data. To run a different grid or control strategy using this approach, users usually need to run a number of additional processing steps that we have not yet discussed. These differences from the standard processing approach are described in <xref linkend="sect_concepts_mobile_source_processing_moves" />.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Biogenics processing:</emphasis> Biogenics processing uses different processors than those for anthropogenic sources. The emissions from biogenic sources are based on land use data and meteorology data instead of on actual emission inventories. For more information, please see <xref linkend="sect_concepts_biogenic_source_processing" />.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Toxics processing for different chemical speciation mechanisms:</emphasis> Toxics processing may require some special processing steps during import of the inventory data when integrating the criteria and toxics inventories. This step depends on which chemical speciation approach is going to be used. Therefore, when changing the toxics speciation mechanism, it is sometimes necessary to rerun the data import step. See <xref linkend="sect_concepts_combine_toxics" /> for more information.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Point-source processing for CMAQ versus UAM or CAM<subscript>X</subscript>:</emphasis> Point-source processing for CMAQ uses some programs that differ from those used for UAM or CAM<subscript>X</subscript>. In some cases, it may be necessary to rerun several programs in order to run for one model rather than another. Further details on this additional processing can be found in <xref linkend="sect_concepts_point_source_processing" />.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Adding hour-specific or day-specific point-source data:</emphasis> If you want to add hour-specific or day-specific point-source data after a point source run has already been performed, several processing steps must be rerun. Further details on this additional processing can be found in <xref linkend="sect_concepts_point_source_processing" />.</para>
</listitem>

</itemizedlist>

</section>

<section id="sect_concepts_processing_summaries">

<title>Area, biogenic, mobile, and point processing summaries</title>

<section id="sect_concepts_summary_source_processing">

<title>Summary of SMOKE processing categories</title>

<para>Each SMOKE processing category is defined by its source <emphasis>characteristics</emphasis>, which correspond to the identifiers used in creating the emission inventory (e.g., state/county FIPS code and SCC). The processing categories also have source <emphasis>attributes</emphasis>, which are the other useful data in the emission inventories that SMOKE uses; examples are point-source flue gas exit height and temperature. Source characteristics <emphasis>define</emphasis> the sources as area, biogenic, mobile, or point sources and also distinguish one source in the inventory from another. Source attributes are additional data about the source that do not contribute to the source&rsquo;s uniqueness in SMOKE. We have previously described in <xref linkend="sect_concepts_inv_data_types" /> the data types and the data attributes that are contained in the inventories that SMOKE uses. In the subsections below, we summarize the source characteristics of area, biogenic, mobile, and point sources. Please refer to <xref linkend="tbl_concepts_inv_categories" /> for more information about how SMOKE processing categories map to the inventory source categories.</para>

<para>In SMOKE, each processing category is defined by source characteristics as follows:</para>

<itemizedlist>

<listitem>
<para><emphasis role="bold">Area sources</emphasis> are defined by (1) <link linkend="sect_concepts_costcy_codes">country, state, and county codes</link>, (2) <link linkend="sect_concepts_scc_codes">SCCs</link>, and (3) optionally, grid cell.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Biogenic sources</emphasis> are defined differently depending on the type of processing you are using. They can be defined either by (1) <link linkend="sect_concepts_costcy_codes">country, state, and county codes</link> and (2) land use code, or by (1) grid cell and (2) land use code.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Mobile sources</emphasis> are defined by (1) <link linkend="sect_concepts_costcy_codes">country, state, and county codes</link>, (2) <link linkend="sect_concepts_scc_codes">SCCs</link>, and (3) optionally, link codes.</para>
</listitem>

<listitem>
<para><emphasis role="bold">Point sources</emphasis> are defined by (1) <link linkend="sect_concepts_costcy_codes">country, state, and county codes</link>, (2) plant/facility codes, and (3) characteristics 1 through 5, one of which must be the <link linkend="sect_concepts_scc_codes">SCC</link>.</para>
</listitem>

</itemizedlist>

</section>

<section id="sect_concepts_area_source_processing">

<title>Area-source processing</title>

<para>In SMOKE, there are two major processing routes that you can take for area sources: the typical route and the pregridded data route. (Recall that by <quote>area sources</quote> in SMOKE we mean stationary area/nonpoint sources and nonroad mobile sources.)</para>

<section>

<title>Typical route</title>

<para>The typical route involves processing data identified by country/state/county codes and SCCs. The processing steps vary depending on whether you are doing base-year processing or future- or past-year processing. The steps for base-year processing are shown in <xref linkend="fig_concepts_area_base" />. The major intermediate vectors and matrices are shown in <xref linkend="fig_concepts_parallel_approach" />; please refer to that diagram for those details. The inventory import step reads the raw emissions data, screens them, processes them, and converts them to the SMOKE intermediate inventory file (the inventory vectors in <xref linkend="fig_concepts_parallel_approach" />). The emissions in the inventory file are subdivided into hourly emissions during temporal allocation, assigned chemical speciation factors during speciation, and assigned spatial allocation factors during gridding. The merge step combines the hourly emissions, speciation matrix, and gridding matrix to create model-ready emissions.</para>
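
<para>As a rough illustration of the merge step, the following Python sketch combines hourly emissions with a speciation matrix and a gridding matrix. It is not SMOKE code: the function name <literal>merge</literal>, the data layout, and the numbers are assumptions made for this example only.</para>

<programlisting language="python"><![CDATA[
```python
def merge(hourly, speciation, gridding):
    """Combine hourly emissions with speciation and gridding factors.

    hourly[s]         emissions of one inventory pollutant from source s for one hour
    speciation[s][m]  fraction of source s's pollutant assigned to model species m
    gridding[s][c]    fraction of source s located in grid cell c
    Returns out[m][c], the emissions of model species m in grid cell c.
    """
    n_species = len(speciation[0])
    n_cells = len(gridding[0])
    out = [[0.0] * n_cells for _ in range(n_species)]
    for s, emis in enumerate(hourly):
        for m in range(n_species):
            for c in range(n_cells):
                out[m][c] += emis * speciation[s][m] * gridding[s][c]
    return out


# Toy inputs (hypothetical): two sources, two model species, three grid cells.
hourly = [10.0, 4.0]
speciation = [[0.9, 0.1], [0.5, 0.5]]
gridding = [[1.0, 0.0, 0.0], [0.25, 0.75, 0.0]]

model_ready = merge(hourly, speciation, gridding)
total = sum(sum(row) for row in model_ready)
```
]]></programlisting>

<para>Because each row of the toy speciation and gridding matrices sums to one, the total mass in <literal>model_ready</literal> equals the total of the hourly emissions.</para>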

<figure id="fig_concepts_area_base">
<title>Base case area-source processing steps</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/area_base_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/area_base_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para>In <xref linkend="fig_concepts_area_growth" />, we show the area-source processing steps for future- or past-year processing. This processing is similar to the base-year processing flow, except the growth and controls step is added to create the growth matrix and optionally one or more control matrices. The grow inventory step applies the growth matrix to convert the base-year inventory to a future or past year. Also, the control matrix can optionally be used in the merge step to apply control factors to the future- or past-year emissions. The steps shown with dotted lines represent steps that can be reused from the base-year processing because they do not depend on any of the new steps.</para>

<figure id="fig_concepts_area_growth">
<title>Future- or past-year growth and optional control area-processing steps</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/area_growth_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/area_growth_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para>Finally, inventory controls as well as growth can be applied at the front end of processing if such a scheme is needed (<xref linkend="fig_concepts_area_projection" />). This method permits up to 80 growth and/or control matrices to be applied to an inventory. By contrast, the method shown in <xref linkend="fig_concepts_area_growth" /> allows only one control matrix in the merge step, although it allows any number of growth matrices on the front end. The processing scheme shown in <xref linkend="fig_concepts_area_projection" /> can therefore be useful when mixing and matching many control strategies for simulations.</para>

<figure id="fig_concepts_area_projection">
<title>Alternative future- or past-year growth and control area-processing steps</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/area_projection_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/area_projection_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para>In sections later in this chapter, we describe the SMOKE programs that are needed for each of these processing steps and additional details about what activities are accomplished during each step. These sections are:</para>

<itemizedlist spacing="compact">
<listitem><xref linkend="sect_concepts_inventory_import" /></listitem>
<listitem><xref linkend="sect_concepts_temporal_processing" /></listitem>
<listitem><xref linkend="sect_concepts_chemical_processing" /></listitem>
<listitem><xref linkend="sect_concepts_spatial_processing" /></listitem>
<listitem><xref linkend="sect_concepts_growth_processing" /></listitem>
<listitem><xref linkend="sect_concepts_control_processing" /></listitem>
<listitem><xref linkend="sect_concepts_merge_processing" /></listitem>
<listitem><xref linkend="sect_concepts_qa_processing" /></listitem>
</itemizedlist>

</section>

<section id="sect_concepts_pregridded_data">

<title>Pregridded emissions</title>
<section>
<title>Pregridded data route for same modeling domain</title>

<para>The second processing approach for area sources involves using pregridded data. As indicated in <xref linkend="sect_concepts_summary_source_processing" />, area sources can be specified by grid cell instead of by country/state/county code and SCC. This optional approach to modeling area sources requires the inventory emissions data to be gridded prior to inventory import. The gridded area sources do <emphasis>not</emphasis> have country/state/county codes or SCCs, and can be provided via an I/O API time-independent gridded data file. <comment>Users are responsible for getting their data into this format, which is described in more detail in <xref linkend="ch_input_files" />.</comment> The flow diagrams that describe this type of processing are identical to those in <xref linkend="fig_concepts_area_base" />, <xref linkend="fig_concepts_area_growth" />, and <xref linkend="fig_concepts_area_projection" />. Although the gridding step is quite trivial when the grid cell numbers are already specified, the gridding step must still be run to create a gridding matrix required for the merge step.</para>

<para>The disadvantage of using pregridded emissions for area-source processing is that there are no country/state/county codes and SCCs to use in the cross-referencing of any processing step. Therefore, temporal profiles, speciation profiles, growth factors, and control factors must be applied uniformly across the model grid by pollutant.</para>
</section>
<section>
<title>Pregridded data route for a different modeling domain</title>
<para>The sequence for processing global emissions data (e.g., EDGAR, RCP, and HTAP) for hemispheric CMAQ involves projecting the data from a latitude-longitude projection to a polar stereographic projection, converting the inventory species to the species required by the CMAQ chemical mechanism, and allocating the annual emissions to hourly estimates.</para>
<para><ulink url="https://docs.google.com/document/d/1veqEjaTPbDpkqHAIcyReokvv6MaENz0TrwOcFl1ewac/edit#heading=h.xo1iz1nkhapd">Detailed information</ulink> on how to process pregridded global emissions data for CMAQ hemispheric modeling in SMOKE is available online.
</para>
</section>
</section>
<section>

<title>Day-specific and Hour-specific Emissions</title>

<para>Emissions from area sources are sometimes available as day- or hour-specific values. <command>Smkinven</command> can import the day- and hour-specific data, and it can also convert the hour-specific data to hour-specific temporal profiles. When these data are available, the <command>Temporal</command> program overrides the annual or daily emissions with the most specific data available. If day-specific data are available, <command>Temporal</command> uses them to overwrite the annual or average-day emissions during the time periods for which these data are available. If hour-specific data are available, <command>Temporal</command> uses them to overwrite the annual, average-day, or day-specific emissions.</para>
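
<para>The precedence described above can be sketched as follows. This Python fragment is only an illustration, not the logic of <command>Temporal</command> itself: the function name, its arguments, and the assumption that daily values are spread over hours with a single hourly fraction are all hypothetical.</para>

<programlisting language="python"><![CDATA[
```python
def hourly_emission(hourly_fraction, avg_day_value, day_value=None, hour_value=None):
    """Return the emissions for one hour from the most specific data available.

    hourly_fraction  share of a day's emissions assigned to this hour (hypothetical)
    avg_day_value    average-day emissions derived from the inventory
    day_value        day-specific emissions for this date, or None if not reported
    hour_value       hour-specific emissions for this hour, or None if not reported
    """
    if hour_value is not None:
        # Hour-specific data override everything else.
        return hour_value
    if day_value is not None:
        # Day-specific data override the annual or average-day value.
        return day_value * hourly_fraction
    return avg_day_value * hourly_fraction


inventory_only = hourly_emission(0.05, avg_day_value=100.0)
with_day = hourly_emission(0.05, avg_day_value=100.0, day_value=140.0)
with_hour = hourly_emission(0.05, avg_day_value=100.0, day_value=140.0, hour_value=9.0)
```
]]></programlisting>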

</section>

</section>

<section id="sect_concepts_biogenic_source_processing">

<title>Biogenic-source processing</title>

<para>SMOKE biogenic emissions modeling can be accomplished with the Biogenic Emissions Inventory System, version 4 (BEIS4) approach using the processing scheme shown in <xref linkend="fig_concepts_biogenic_base" />. The raw land use inventory data are imported and output as normalized emissions.<comment>add description of normalized emissions</comment> Meteorology adjustments are then applied to the normalized emissions to create hourly model-ready emissions estimates.</para>

<figure id="fig_concepts_biogenic_base">
<title>Biogenic-source processing steps and intermediate files</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/biogenic_base_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/biogenic_base_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para>The land use import can start with gridded BELD6 inputs along with summer and winter emission factors. In <xref linkend="sect_concepts_biogenic_processing" />, we provide additional details about the SMOKE programs used for BEIS4 processing and their capabilities.</para>

<para>A variation can be run on the processing steps shown in <xref linkend="fig_concepts_biogenic_base" /> (see <xref linkend="fig_concepts_biogenic_seasons" />). In this variation, some grid cells use summer emission factors and some use winter emission factors, which is useful during the change of seasons. Based on guidance from EPA, the summer emission factors should be used for time periods after the last frost of the spring until the first frost of the fall, and winter emission factors should be used at other times of the year. To make such assignments by grid cell, the SMOKE utility <command>Metscan</command> analyzes the meteorology data for the entire year (or the period of interest) to establish on which days each grid cell should use winter or summer emission factors. <command>Metscan</command> creates a winter/summer switch file that indicates the appropriate season for each grid cell for each day. More information on <command>Metscan</command> is available in <xref linkend="sect_utilities_metscan" />. The results of the meteorology analysis can then be used in the processing approach shown in <xref linkend="fig_concepts_biogenic_seasons" />, in which both the summer and winter normalized emissions are provided to the meteorology adjustments step, along with the winter/summer switch file. The resulting model-ready emissions use the winter emission factors for each grid cell on days before the last spring freeze or after the first fall freeze (within a calendar year, these are the periods January through March and November through December in many regions), and the summer emission factors on days between the last spring freeze and the first fall freeze.</para>
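
<para>The winter/summer assignment can be pictured with the short Python sketch below. It does not reproduce how <command>Metscan</command> reads meteorology or writes its switch file; the function name, the per-cell freeze dates, and the treatment of the freeze days themselves as winter are assumptions for illustration.</para>

<programlisting language="python"><![CDATA[
```python
from datetime import date


def season_for_day(day, last_spring_freeze, first_fall_freeze):
    """Return 'summer' strictly between the two freeze dates, else 'winter'."""
    if last_spring_freeze < day < first_fall_freeze:
        return "summer"
    return "winter"


# Hypothetical freeze dates for two grid cells, keyed by (column, row).
freeze_dates = {
    (1, 1): (date(2020, 3, 20), date(2020, 11, 5)),   # milder cell
    (1, 2): (date(2020, 5, 10), date(2020, 9, 25)),   # colder cell
}

# Season assigned to each cell on one day during the change of seasons.
day = date(2020, 4, 15)
switch = {cell: season_for_day(day, spring, fall)
          for cell, (spring, fall) in freeze_dates.items()}
```
]]></programlisting>

<para>On the example day, the milder cell has already passed its last spring freeze and is assigned summer emission factors, while the colder cell is still assigned winter emission factors.</para>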

<figure id="fig_concepts_biogenic_seasons">
<title>Biogenic-source processing steps and intermediate files using both winter and summer emission factors</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/biogenic_seasons_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/biogenic_seasons_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para>In <xref linkend="sect_concepts_biogenic_processing" /> we describe the SMOKE programs that are needed for each of these processing types for BEIS4 processing, and additional details about what activities are accomplished during each step.</para>

</section>

<section id="sect_concepts_mobile_source_processing_moves">

<title>Mobile-source processing using MOVES</title>

<para>SMOKE provides two ways of processing mobile sources using MOVES. (Recall that by <quote>mobile sources</quote> in SMOKE we mean on-road mobile sources.) The first approach is to compute mobile emissions values prior to running SMOKE and provide them to SMOKE as input; we call this the precomputed-emissions approach. The second approach is to provide SMOKE with VMT data, vehicle population (VPOP) data, meteorology data, and MOVES outputs, and have SMOKE compute the mobile emissions based on these data; we call this the MOVES approach. These approaches are not mutually exclusive, so it is possible to provide both precomputed emissions and VMT and VPOP data to SMOKE and have the system compute only some of the emissions using MOVES outputs. Both processing approaches can produce criteria, particulate, and toxics emissions results.</para>

<para>The precomputed-emissions approach is quite similar to the processing method for area sources. In fact, <xref linkend="fig_concepts_area_base" />, <xref linkend="fig_concepts_area_growth" />, and <xref linkend="fig_concepts_area_projection" /> from <xref linkend="sect_concepts_area_source_processing" /> show exactly the processing steps needed for processing mobile sources using SMOKE and the precomputed-emissions approach. As in base-case processing for area sources, emissions in the inventory file are subdivided into hourly emissions during temporal allocation, assigned chemical speciation factors during speciation, and assigned spatial allocation factors during gridding. The merge step combines the hourly emissions, speciation matrix, and gridding matrix to create model-ready emissions. For future- or past-year processing, the growth and controls step is added to create the growth and control matrices, while the grow inventory step converts the inventory from the base year to a future or past year. The control matrix can optionally be used in the merge step to apply control factors to the future- or past-year emissions. Note that, unlike the MOVES approach, in the precomputed-emissions approach SMOKE will not model the variations in emissions caused by temperature, humidity, or other meteorological conditions.</para>

<para>The MOVES approach differs substantially from the precomputed-emissions approach. <xref linkend="fig_concepts_mobile_moves_onroad" /> and <xref linkend="fig_concepts_mobile_moves_offroad" /> summarize the MOVES approach. First, county-total VMT activity data by road class and vehicle type, or county-total VPOP activity data by vehicle type, are input to SMOKE. The chemical speciation step computes the chemical speciation factors for each county, road class, vehicle type, emissions process (e.g., exhaust start, exhaust running, evaporative processes, extended idle, and crankcase), and pollutant and stores the necessary factors for this transformation. The gridding step allocates the sources to grid cells and uses spatial surrogates to allocate county-total emissions to grid cells, storing the emission rates needed for these allocations based on hourly gridded ambient temperature meteorology.</para>

<para>The approach for running MOVES for SMOKE relies on the concepts of representative counties and reference fuel months. A representative county is a single county for which MOVES is run to represent itself and other counties that share the same MOVES input parameters and thus have the same emission rates for any given speed, temperature, and humidity. Similarly, a reference fuel month is a single month whose MOVES run covers the temperatures that occur in the other months it represents as well as its own. The mapping of calendar months to a reference fuel month should be assigned on the basis of shared fuel parameters, because it is the interaction of fuel and temperature that is important. For example, an average hourly temperature of 70°F may occur in some hour of any day in each of four months: May, June, July, and August. If those four months share the same fuel properties (i.e., summer fuel), then an emission factor is determined for just the reference fuel month, reducing by a factor of four the number of calculations that MOVES needs to perform.</para>
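
<para>The fuel-month example above can be written as a small lookup. The Python sketch below is illustrative only; the mapping, the choice of July as the reference month, and the names used are assumptions, not a SMOKE or MOVES file format.</para>

<programlisting language="python"><![CDATA[
```python
# Hypothetical mapping of calendar months to reference fuel months.
# May through August share summer fuel, so one month (July, an arbitrary
# choice for this example) represents all four.
FUEL_MONTH_MAP = {5: 7, 6: 7, 7: 7, 8: 7}


def reference_fuel_month(month):
    """Return the reference fuel month whose MOVES rates represent `month`."""
    return FUEL_MONTH_MAP[month]


months_covered = len(FUEL_MONTH_MAP)
moves_runs_needed = len(set(FUEL_MONTH_MAP.values()))
reduction_factor = months_covered / moves_runs_needed
```
]]></programlisting>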
<para>Unlike MOBILE6, MOVES differentiates between on-roadway emission processes and off-network emission processes. <xref linkend="fig_concepts_mobile_moves_onroad" /> summarizes the approach used by MOVES for on-roadway mobile sources. The on-roadway emission processes use the county-total VMT and average speed inventory as input. The off-network emission processes use the county-total vehicle population by vehicle type as input. <xref linkend="fig_concepts_mobile_moves_offroad" /> summarizes the approach used by MOVES for off-network mobile sources. Both on-roadway and off-network emission processes require gridded meteorology data from MCIP files to estimate temperature-dependent emission rates.</para>

<figure id="fig_concepts_mobile_moves_onroad">
<title>MOVES mobile RatePerDistance processing steps</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/mobile_moves_onroad_base_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/mobile_moves_onroad_base_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<figure id="fig_concepts_mobile_moves_offroad">
<title>MOVES mobile RatePerVehicle and RatePerProfile (off-network) processing steps</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/mobile_moves_offroad_base_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/mobile_moves_offroad_base_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para>In sections later in this chapter, we describe the SMOKE programs that are needed for each of the processing steps just described for MOVES-processed mobile sources, and provide additional details about what activities are accomplished during each step. These sections are:</para>

<itemizedlist spacing="compact">
<listitem><xref linkend="sect_concepts_inventory_import" /></listitem>
<listitem><xref linkend="sect_concepts_temporal_processing" /></listitem>
<listitem><xref linkend="sect_concepts_chemical_processing" /></listitem>
<listitem><xref linkend="sect_concepts_spatial_processing" /></listitem>
<listitem><xref linkend="sect_concepts_onroad_processing_moves" /></listitem>
<listitem><xref linkend="sect_concepts_merge_processing_moves" /></listitem>

</itemizedlist>

<para>Processing mobile sources involves a number of concepts that are unique to mobile sources. These include a special classification of road types in MOVES, SMOKE and MOVES vehicle types, emissions processes, MOVES emission factors, representative counties, reference fuel months, and meteorological processing using <command>Met4moves</command>. The following subsections explain these topics in more detail.</para>

<section id="sect_concepts_onroad_sccs">

<title>Special approach for on-road mobile MOVES SCCs</title>

<para>SMOKE handles SCCs differently for on-road mobile sources compared with all other source categories. SMOKE programs assume that on-road mobile SCCs have the following form:</para>

<informalfigure>
<mediaobject>
<imageobject role="pdf">
<imagedata width="3in" fileref="images/concepts/scc_mobile_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/scc_mobile_html.jpg" />
</imageobject>
</mediaobject>
</informalfigure>

</section>

<section>

<title>Fuel types in MOVES</title>

<para>MOVES can model six different fuel types: Gasoline, Diesel, Compressed Natural Gas (CNG), Liquefied Petroleum Gas (LPG), Ethanol, and Electricity. <xref linkend="tbl_concepts_moves_fuel"/> lists the MOVES fuel type codes.<!-- <xref linkend="tbl_concepts_moves_agg_fuel"/> shows how the four aggregated SCC fuel types are mapped to the six MOVES fuel types.--></para>

<table id="tbl_concepts_moves_fuel">
<?dbfo table-width="4in"?>
<title>MOVES Fuel Type</title>

<tgroup cols="2">
<colspec colwidth="1*" />
<colspec colwidth="2*" />

<thead>
<row>
<entry align="center">MOVES Fuel Type</entry>
<entry align="center">Description</entry>
</row>
</thead>

<tbody>
<row>
<entry>01</entry>
<entry>Gasoline</entry>
</row>
<row>
<entry>02</entry>
<entry>Diesel</entry>
</row>
<row>
<entry>03</entry>
<entry>Compressed Natural Gas (CNG)</entry>
</row>
<row>
<entry>04</entry>
<entry>Liquefied Petroleum Gas (LPG)</entry>
</row>
<row>
<entry>05</entry>
<entry>Ethanol (E-85)</entry>
</row>
<row>
<entry>09</entry>
<entry>Electricity</entry>
</row>
</tbody>
</tgroup>
</table>

<!--
<table id="tbl_concepts_moves_agg_fuel">
<?dbfo table-width="4in"?>
<title>Example of aggregated fuel types and corresponding MOVES fuel types</title>

<tgroup cols="3">
<colspec colwidth="1*" />
<colspec colwidth="2*" />
<colspec colwidth="2*" />

<thead>
<row>
<entry align="center">SCC</entry>
<entry align="center">Example of aggregated Fuel Type Description</entry>
<entry align="center">MOVES Fuel Type</entry>
</row>
</thead>

<tbody>
<row>
<entry>x1</entry>
<entry>All non-diesel fuels</entry>
<entry>01; 03; 04; 05; 09</entry>
</row>
<row>
<entry>x2</entry>
<entry>All gasoline and ethanol blends</entry>
<entry>01; 05</entry>
</row>
<row>
<entry>x3</entry>
<entry>All fossil fuels</entry>
<entry>01; 02; 03; 04; 05</entry>
</row>
<row>
<entry>00</entry>
<entry>All fuels</entry>
<entry>01; 02; 03; 04; 05; 09</entry>
</row>
</tbody>
</tgroup>
</table>
-->

</section>

<section>

<title>MOVES Vehicle Types</title>

<para>The vehicle types used in SMOKE&rsquo;s on-road mobile source processing are described in <xref linkend="tbl_concepts_vehicle_types" />.<!--<xref linkend="tbl_concepts_agg_vehicle_types"/> shows how the eight aggregated MOVES vehicle types are mapped to the original MOVES vehicle types.--></para>

<table id="tbl_concepts_vehicle_types">
<?dbfo table-width="3.5in"?>
<title>MOVES Vehicle type codes and descriptions</title>

<tgroup cols="2">
<colspec colwidth="1*" />
<colspec colwidth="2*" />

<thead>
<row>
<entry align="center">MOVES Vehicle Type</entry>
<entry align="center">Description</entry>
</row>
</thead>

<tbody>
<row>
<entry>11</entry>
<entry>Motorcycle</entry>
</row>
<row>
<entry>21</entry>
<entry>Passenger Car</entry>
</row>
<row>
<entry>31</entry>
<entry>Passenger Truck</entry>
</row>
<row>
<entry>32</entry>
<entry>Light Commercial Truck</entry>
</row>
<row>
<entry>41</entry>
<entry>Intercity Bus</entry>
</row>
<row>
<entry>42</entry>
<entry>Transit Bus</entry>
</row>
<row>
<entry>43</entry>
<entry>School Bus</entry>
</row>
<row>
<entry>51</entry>
<entry>Refuse Truck</entry>
</row>
<row>
<entry>52</entry>
<entry>Single Unit Short-haul Truck</entry>
</row>
<row>
<entry>53</entry>
<entry>Single Unit Long-haul Truck</entry>
</row>
<row>
<entry>54</entry>
<entry>Motor Home</entry>
</row>
<row>
<entry>61</entry>
<entry>Combination Short-haul Truck</entry>
</row>
<row>
<entry>62</entry>
<entry>Combination Long-haul Truck</entry>
</row>
</tbody>
</tgroup>
</table>

<!--
<table id="tbl_concepts_agg_vehicle_types">
<?dbfo table-width="3.5in"?>
<title>Example of aggregated vehicle types and corresponding MOVES vehicle types</title>

<tgroup cols="3">
<colspec colwidth="1*" />
<colspec colwidth="2*" />
<colspec colwidth="3*" />

<thead>
<row>
<entry align="center"> SCC </entry>
<entry align="center">Example of aggregated Vehicle Type Description</entry>
<entry align="center">MOVES Vehicle Type</entry>
</row>
</thead>

<tbody>
<row>
<entry>30</entry>
<entry>Light Duty Trucks</entry>
<entry>31; 32</entry>
</row>
<row>
<entry>40</entry>
<entry>Buses</entry>
<entry>41; 42; 43</entry>
</row>
<row>
<entry>70</entry>
<entry>All Heavy Duty Trucks and Buses</entry>
<entry>41; 42; 43; 51; 52; 53; 54; 61; 62 </entry>
</row>
<row>
<entry>71</entry>
<entry>All Heavy Duty Trucks</entry>
<entry>51; 52; 53; 54; 61; 62 </entry>
</row>
<row>
<entry>72</entry>
<entry>All Combination Trucks</entry>
<entry>61; 62 </entry>
</row>
<row>
<entry>80</entry>
<entry>All Trucks and Buses</entry>
<entry>31; 32; 41; 42; 43; 51; 52; 53; 54; 61; 62 </entry>
</row>
<row>
<entry>81</entry>
<entry>All Trucks except Buses</entry>
<entry>31; 32; 51; 52; 53; 54; 61; 62 </entry>
</row>
<row>
<entry>00</entry>
<entry>All Vehicles</entry>
<entry>11; 21; 31; 32; 41; 42; 43; 51; 52; 53; 54; 61; 62 </entry>
</row>
</tbody>
</tgroup>
</table>

<para>MOVES can produce emission factors for possible combinations between 13 MOVES vehicle types and fuel types. For the efficiency of SMOKE processing, the emission factors can be aggregated to a list of vehicle types listed in <xref linkend="tbl_concepts_agg_vehicle_types" />.</para>
-->

</section>

<section>

<title>MOVES Road Types</title>

<para>MOVES can model nine different road types: off-network; rural and urban restricted access; rural and urban unrestricted access; rural and urban restricted access without ramps; and rural and urban restricted-access ramps only (<xref linkend="tbl_concepts_moves_roads" />). <!--<xref linkend="tbl_concepts_moves_agg_roads" /> indicates how the 6 aggregated MOVES road classes are mapped to the nine MOVES road types.--></para>

<table id="tbl_concepts_moves_roads">
<?dbfo table-width="4in"?>
<title>MOVES road type codes and descriptions</title>

<tgroup cols="2">
<colspec colwidth="1*" />
<colspec colwidth="2*" />

<thead>
<row>
<entry align="center">MOVES Road Type</entry>
<entry align="center">Description</entry>
</row>
</thead>

<tbody>
<row>
<entry>01</entry>
<entry>Off-Network</entry>
</row>
<row>
<entry>02</entry>
<entry>Rural Restricted Access</entry>
</row>
<row>
<entry>03</entry>
<entry>Rural Unrestricted Access</entry>
</row>
<row>
<entry>04</entry>
<entry>Urban Restricted Access</entry>
</row>
<row>
<entry>05</entry>
<entry>Urban Unrestricted Access</entry>
</row>
<row>
<entry>06</entry>
<entry>Rural Restricted without Ramps</entry>
</row>
<row>
<entry>07</entry>
<entry>Urban Restricted without Ramps</entry>
</row>
<row>
<entry>08</entry>
<entry>Rural Restricted only Ramps</entry>
</row>
<row>
<entry>09</entry>
<entry>Urban Restricted only Ramps</entry>
</row>

</tbody>
</tgroup>
</table>

<!--
<table id="tbl_concepts_moves_agg_roads">
<?dbfo table-width="4in"?>
<title>Example of aggregated road types and corresponding MOVES road types</title>

<tgroup cols="3">
<colspec colwidth="1*" />
<colspec colwidth="2*" />
<colspec colwidth="2*" />

<thead>
<row>
<entry align="center"> SCC </entry>
<entry align="center">Example of aggregated Road Type Description</entry>
<entry align="center">MOVES Road Type</entry>
</row>
</thead>

<tbody>
<row>
<entry>70</entry>
<entry>Freeway</entry>
<entry>2; 4 </entry>
</row>
<row>
<entry>71</entry>
<entry>freeway except ramps</entry>
<entry>6; 7 </entry>
</row>
<row>
<entry>72</entry>
<entry>Ramps</entry>
<entry>8; 9 </entry>
</row>
<row>
<entry>80</entry>
<entry>Non-Freeway</entry>
<entry>3; 5 </entry>
</row>
<row>
<entry>90</entry>
<entry>All On-network</entry>
<entry>2; 3; 4; 5 </entry>
</row>
<row>
<entry>00</entry>
<entry>All on and off-network</entry>
<entry>1; 2; 3; 4; 5 </entry>
</row>
</tbody>
</tgroup>
</table>
-->

</section>

<section>

<title>MOVES Process Types</title>

<para>MOVES can model 14 different process types, including on-roadway and off-network emission processes, for the selected pollutants. Off-network emission processes include, for example, parked engine-off, engine starts, idling, and fuel vapor venting. The MOVES process types are listed in <xref linkend="tbl_concepts_moves_proc" />. <!--<xref linkend="tbl_concepts_moves_agg_proc" /> indicates how the 14 aggregated MOVES process types are mapped to the 14 MOVES process types.--></para>

<table id="tbl_concepts_moves_proc">
<?dbfo table-width="4in"?>
<title>MOVES process type codes and descriptions</title>

<tgroup cols="2">
<colspec colwidth="1*" />
<colspec colwidth="2*" />

<thead>
<row>
<entry align="center">MOVES Process</entry>
<entry align="center">Description</entry>
</row>
</thead>

<tbody>
<row>
<entry>01</entry>
<entry>Running Exhaust</entry>
</row>
<row>
<entry>02</entry>
<entry>Start Exhaust</entry>
</row>
<row>
<entry>09</entry>
<entry>Brakewear</entry>
</row>
<row>
<entry>10</entry>
<entry>Tirewear</entry>
</row>
<row>
<entry>11</entry>
<entry>Evaporative Permeation</entry>
</row>
<row>
<entry>12</entry>
<entry>Evaporative Fuel Vapor Venting</entry>
</row>
<row>
<entry>13</entry>
<entry>Evaporative Fuel Leaks</entry>
</row>
<row>
<entry>15</entry>
<entry>Crankcase Running Exhaust</entry>
</row>
<row>
<entry>16</entry>
<entry>Crankcase Start Exhaust</entry>
</row>
<row>
<entry>17</entry>
<entry>Crankcase Extended Idle Exhaust</entry>
</row>
<row>
<entry>18</entry>
<entry>Refueling Displacement Vapor Loss</entry>
</row>
<row>
<entry>19</entry>
<entry>Refueling Spillage Loss</entry>
</row>
<row>
<entry>90</entry>
<entry>Extended Idle Exhaust</entry>
</row>
<row>
<entry>91</entry>
<entry>Auxiliary Power Exhaust</entry>
</row>
<row>
<entry>99</entry>
<entry>Well-to-Pump</entry>
</row>
</tbody>
</tgroup>
</table>

<!--
<table id="tbl_concepts_moves_agg_proc">
<?dbfo table-width="4in"?>
<title>Example of aggregated process types and corresponding MOVES process types</title>

<tgroup cols="3">
<colspec colwidth="1*" />
<colspec colwidth="2*" />
<colspec colwidth="2*" />

<thead>
<row>
<entry align="center"> SCC </entry>
<entry align="center">Example of aggregated Process Type Description</entry>
<entry align="center">MOVES Process Type</entry>
</row>
</thead>

<tbody>
<row>
<entry>50</entry>
<entry>All Exhaust</entry>
<entry>1; 2; 15; 16; 17; 90; 91 </entry>
</row>
<row>
<entry>51</entry>
<entry>All Exhaust except Hotelling</entry>
<entry>1; 2; 15; 16 </entry>
</row>
<row>
<entry>52</entry>
<entry>All hotelling exhaust</entry>
<entry>17; 90; 91 </entry>
</row>
<row>
<entry>53</entry>
<entry>All Extended Idle Exhaust</entry>
<entry>17; 90 </entry>
</row>
<row>
<entry>60</entry>
<entry>All Evaporative and Refueling</entry>
<entry>11; 12; 13; 18; 19 </entry>
</row>
<row>
<entry>61</entry>
<entry>All Evaporative except Refueling</entry>
<entry>11; 12; 13 </entry>
</row>
<row>
<entry>61</entry>
<entry>All Evaporative except Refueling</entry>
<entry>11; 12; 13 </entry>
</row>
<row>
<entry>62</entry>
<entry>All Refueling</entry>
<entry>18; 19 </entry>
</row>
<row>
<entry>63</entry>
<entry>All Evaporative except Permeation and Refueling</entry>
<entry>12; 13 </entry>
</row>
<row>
<entry>70</entry>
<entry>All Exhaust and Evaporative and Refueling</entry>
<entry>1; 2; 11; 12; 13; 15; 16; 17; 18; 19; 90; 91 </entry>
</row>
<row>
<entry>71</entry>
<entry>All Exhaust and Evaporative except Refueling</entry>
<entry>1; 2; 11; 12; 13; 15; 16; 17; 90; 91 </entry>
</row>
<row>
<entry>72</entry>
<entry>All Exhaust and Evaporative except Refueling and Hotelling</entry>
<entry>1; 2; 11; 12; 13; 15; 16 </entry>
</row>
<row>
<entry>80</entry>
<entry>All Exhaust and Evaporative and Brake and Tire Wear except Refueling</entry>
<entry>1; 2; 9; 10; 11; 12; 13; 15; 16; 17; 90; 91 </entry>
</row>
<row>
<entry>81</entry>
<entry>All Exhaust and Evaporative and Brake and Tire Wear except Refueling and Hotelling</entry>
<entry>1; 2; 9; 10; 11; 12; 13; 15; 16; 17 </entry>
</row>
<row>
<entry>00</entry>
<entry>All Processes</entry>
<entry>1; 2; 9; 10; 11; 12; 13; 15; 16; 17; 18; 19; 90; 91 </entry>
</row>
</tbody>
</tgroup>
</table>
-->

</section>


<section id="sect_concepts_reference_counties_moves">
<title>Representative Counties</title>
<para>The approach for running MOVES for SMOKE relies on the concept of representative counties. A representative county is used during the creation and use of emission rates to represent a set of similar counties (i.e., inventory counties) called a county group. The purpose of the representative county approach is to reduce the computational burden of running MOVES on every county in your modeling domain. By using a representative county, the user generates key emission rates for a single county in MOVES and then uses these rates to estimate emissions for all counties in the county group through SMOKE. The representative county is modeled at a range of speeds and temperatures to produce emission rate lookup tables (grams/mile or grams/vehicle/hour, depending on the mobile emission process). The variables that are assumed to be constant across the county group members (and the representative county) are fuel parameters, fleet age distribution, and inspection/maintenance (I/M) programs. The variables that can vary within the county group are vehicle miles traveled (VMT), source type vehicle population, roadway speed, and grid cell temperatures. Determining the representative counties and their respective county groups is a key aspect of using the SMOKE-MOVES tool. Ideally, the user creates each county group based on the similarity of the county characteristics (e.g., urban and rural) and of the meteorological conditions (e.g., temperature and relative humidity). The user should avoid grouping counties that have significantly different meteorological conditions.</para>
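<para>The relationship between representative counties and county groups can be pictured as a simple lookup. The following Python sketch is illustrative only: the county codes and the function name are hypothetical and are not taken from any SMOKE input file. It inverts a group definition so that each inventory county points to its representative county, and it rejects a county that is assigned to two groups.</para>
<programlisting language="python"><![CDATA[# Hypothetical county groups: representative county -> inventory counties.
# The codes below are placeholders chosen for illustration.
COUNTY_GROUPS = {
    "37001": ["37001", "37003", "37005"],
    "37183": ["37183", "37063"],
}


def build_reverse_map(groups):
    """Return a dict mapping each inventory county to its representative county."""
    reverse = {}
    for representative, members in groups.items():
        for county in members:
            if county in reverse:
                raise ValueError(f"county {county} appears in two county groups")
            reverse[county] = representative
    return reverse


INVENTORY_TO_REP = build_reverse_map(COUNTY_GROUPS)
]]></programlisting>
<para>With this mapping, emission rates generated for one representative county can be looked up for any inventory county in its group.</para>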
</section>
<section id="sect_concepts_moves_reference_fuel_month">
<title>Reference Fuel Month</title>
<para>Along with the representative county approach, the concept of a fuel month is very important. A fuel month indicates when a particular set of fuel properties should be used in a MOVES simulation. Similar to the representative county, the fuel month reduces the computational time of MOVES by using a single month to represent a set of months. To determine the fuel month and the months it represents, the user should review the State-provided fuel supply data in the MOVES database for each representative county. If the fuel supply data change throughout the year, then group the months by fuel parameters. For example, if the grams/mile exhaust emission rates in January are identical to the rates in February for a given representative county, then use a single fuel month to represent January and February. In other words, only one of the two months needs to be modeled through MOVES.</para>
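<para>As an illustration of grouping months by fuel parameters, the Python sketch below collects months that share identical fuel properties and uses the first month of each group as the fuel month. The fuel values, the choice of parameters, and the rule of picking the first month are assumptions made for this example; they do not come from the MOVES database.</para>
<programlisting language="python"><![CDATA[from collections import defaultdict

# Hypothetical fuel parameters per month: (vapor pressure in psi, ethanol volume %).
FUEL_BY_MONTH = {
    1: (13.5, 10.0),
    2: (13.5, 10.0),
    6: (9.0, 10.0),
    7: (9.0, 10.0),
    8: (9.0, 10.0),
}


def group_fuel_months(fuel_by_month):
    """Group months with identical fuel parameters.

    Returns a dict mapping the chosen fuel month (assumed here to be the
    earliest month in each group) to the list of months it represents.
    """
    groups = defaultdict(list)
    for month in sorted(fuel_by_month):
        groups[fuel_by_month[month]].append(month)
    return {months[0]: months for months in groups.values()}


FUEL_MONTHS = group_fuel_months(FUEL_BY_MONTH)
]]></programlisting>
<para>In this example, only months 1 and 6 would need to be modeled through MOVES.</para>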
</section>

<section id="sect_concepts_moves_met_processing">
<title>Meteorological Data Processing</title>
<para>The meteorological data processor program <command>Met4moves</command> prepares spatially and temporally averaged temperatures and relative humidity data to set up the meteorological input conditions for MOVES and SMOKE using the Meteorology-Chemistry Interface Processor (MCIP) output files.</para>
<para><link linkend="sect_programs_met4moves"><command>Met4moves</command></link> must be run after MCIP and before the MOVES Driver script <command><quote>Runspec_generator.pl</quote></command> and the SMOKE modeling system.</para>
<para>The following are the major processing steps that <command>Met4moves</command> performs:</para>
<itemizedlist>
<listitem>
<para>Read the representative county cross-reference file <link linkend="sect_input_mcxref"><envar>MCXREF</envar></link> that contains a list of representative counties and the county groups that map to those representative counties.</para>
</listitem>
<listitem>
<para>Read the surrogate description file <link linkend="sect_input_srgdesc"><envar>SRGDESC</envar></link> and a list of associated spatial surrogate(s) chosen for use in selecting grid cells.</para>
</listitem>
<listitem>
<para>Determine a list of grid cells for each county. Only the selected grid cells are used to estimate the min/max temperatures, 24-hour temperature profiles, and RH over the user-specified modeling period.</para>
</listitem>
<listitem>
<para>Set the dates of the modeling episode in local time using the flags <envar>STDATE</envar> and <envar>ENDATE</envar>.</para>
</listitem>
<listitem>
<para>Determine the fuel month for the representative county using the <link linkend="sect_input_mfmref"><envar>MFMREF</envar></link> input file.</para>
</listitem>
<listitem>
<para>Read the country/state/county <link linkend="sect_input_costcy"><envar>COSTCY</envar></link> or <link linkend="sect_input_geocode"><envar>GEOCODE_LEVEL[1-4]</envar></link> (if <envar>USE_EXP_GEOCODES</envar> is set to Y) file to define the time zones for county groups.</para>
</listitem>
<listitem>
<para>Read the meteorology data that have been processed by MCIP.</para>
</listitem>
<listitem>
<para>Calculate the min/max temperatures hourly and over the modeling period.</para>
</listitem>
<listitem>
<para>Calculate average RH for the specified hour range over the modeling period.</para>
</listitem>
<listitem>
<para>Once min/max temperatures and averaged RH are estimated for all representative counties and all inventory counties in the county groups, estimate diurnal 24-hour temperature profiles for use by the MOVES Driver script. The result is a normalized 24-hour shape profile over the user-specified period or fuel month.</para></listitem>
</itemizedlist>
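<para>The temperature and RH steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the <command>Met4moves</command> implementation: the function names are hypothetical, the input is a plain list of 24 hourly values for one county, and the normalization shown (scaling to the range 0 to 1 between the minimum and maximum temperature) is only one plausible way to form a shape profile.</para>
<programlisting language="python"><![CDATA[def min_max(hourly_temps):
    """Return the (minimum, maximum) of a series of hourly temperatures."""
    return min(hourly_temps), max(hourly_temps)


def normalized_profile(hourly_temps):
    """Scale a 24-hour temperature series to the range 0..1.

    Assumed normalization: (T - Tmin) / (Tmax - Tmin).
    """
    if len(hourly_temps) != 24:
        raise ValueError("expected 24 hourly temperatures")
    tmin, tmax = min_max(hourly_temps)
    span = tmax - tmin
    if span == 0:
        # A flat day has no shape; return zeros to avoid dividing by zero.
        return [0.0] * 24
    return [(t - tmin) / span for t in hourly_temps]


def mean_rh(hourly_rh, first_hour, last_hour):
    """Average RH over an inclusive range of hours (0-23)."""
    window = hourly_rh[first_hour:last_hour + 1]
    return sum(window) / len(window)
]]></programlisting>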
</section>

<section id="sect_concepts_moves_emission_processes">
<title>MOVES Emission Processes by Emission Rate Tables</title>
<para>When the MOVES model runs for SMOKE, it runs for all emissions processes (or modes), including on-roadway and off-network emissions processes, for the selected pollutants. Off-network emission processes (e.g., parked engine-off, engine starts, idling, and fuel vapor venting) in MOVES are hour-dependent due to vehicle activity assumptions built into the MOVES model; the emission rate depends on both hour of the day and temperature. On-roadway emission processes (e.g., running exhaust, crankcase running exhaust, brake wear, tire wear, and on-road evaporative), on the other hand, do not depend on hour. In MOVES, these emission processes are categorized into six major groups:</para>
<itemizedlist>
<listitem>
<para><emphasis role="bold">RatePerDistance (RPD) - </emphasis>The emission rate of on-roadway vehicles (e.g., driving) from MOVES. The rate is expressed in grams/mile traveled.</para>
</listitem>
<listitem>
<para><emphasis role="bold">RatePerVehicle (RPV) - </emphasis>The emission rate of off-network vehicles (e.g., idling, refueling, parked) from MOVES. The rate is expressed in grams/vehicle/hour.</para>
</listitem>
<listitem>
<para><emphasis role="bold">RatePerProfile (RPP) - </emphasis>The emission rate of off-network vehicles, specifically the evaporation from parked vehicles (vapor-venting emissions), from MOVES. The rate is expressed in grams/vehicle/hour.</para>
</listitem>
<listitem>
<para><emphasis role="bold">RatePerHour (RPH) - </emphasis>The emission rate of extended idle exhaust from on-roadway vehicles. The rate is expressed in grams/hour.</para>
</listitem>
<listitem>
<para><emphasis role="bold">RatePerStart (RPS) - </emphasis>The emission rate of off-network vehicles from the engine start exhaust and crankcase start exhaust processes. The rate is expressed in grams per engine start.</para>
</listitem>
<listitem>
<para><emphasis role="bold">Off-Network Idling (RPHO or ONI) - </emphasis>The emission rate of off-network vehicles from the idling process at locations such as parking lots and distribution centers. The rate is expressed in grams per hour of vehicle idling.</para>
</listitem>


</itemizedlist>

<para>MOVES emission rates are organized into six lookup tables (RPD, RPV, RPP, RPH, RPS, and RPHO), depending on emission process and whether the vehicle is parked or in motion. The approach to running MOVES for SMOKE is unique for each emission rate table listed in <xref linkend="tbl_concepts_moves_emission_rate_tbl"/>. A complete inventory must use the emission rates from all of these tables. Note that the refueling emission process is not yet covered by the MOVES emission rate table approach.</para>

<table id="tbl_concepts_moves_emission_rate_tbl">
<title>MOVES Emission Processes by Emission Rate Tables</title>
<tgroup cols="3">
<colspec colname="c1" colwidth="10*" />
<colspec colname="c2" colwidth="10*" />
<colspec colname="c3" colwidth="10*" />

<thead>
<row>
<entry morerows="1" valign="bottom" align="center">MOVES Lookup Table</entry>
<entry morerows="1" valign="bottom" align="center">Units</entry>
<entry morerows="1" valign="bottom" align="center">Emissions Process</entry>
</row>
</thead>

<tbody>
<row>
<entry>RatePerDistance (RPD)</entry>
<entry align="center">Grams/mile</entry>
<entry align="left">
<simplelist>
<member>Running Exhaust</member>
<member>Crankcase Running Exhaust</member>
<member>Tire Wear</member>
<member>Brake Wear</member>
<member>On-road Evaporative Permeation</member>
<member>On-road Evaporative Fuel Leaks</member>
<member>On-road Evaporative Fuel Vapor Venting</member>
</simplelist>
	</entry>
</row>

<row>
<entry>RatePerVehicle (RPV)</entry>
<entry align="center">Grams/vehicle/hour</entry>
<entry align="left">
<simplelist>
<member>Start Exhaust</member>
<member>Crankcase Start Exhaust</member>
<member>Off-network Evaporative Permeation</member>
<member>Off-network Evaporative Fuel Leaks</member>
<member>Crankcase Extended Idle Exhaust</member>
<member>Extended Idle Exhaust</member>
</simplelist>
	</entry>
</row>
<row>
<entry>RatePerProfile (RPP)</entry>
<entry align="center">Grams/vehicle/hour</entry>
<entry align="left">Off-network Evaporative Fuel Vapor Venting</entry>
</row>
<row>
<entry>RatePerHour (RPH)</entry>
<entry align="center">Grams/hour</entry>
<entry align="left">On-roadway Extended Idle Exhaust</entry>
</row>
<row>
<entry>RatePerStart (RPS)</entry>
<entry align="center">Grams/start</entry>
<entry align="left">Off-network Engine Start Exhaust</entry>
</row>
<row>
<entry>Off-Network Idling (RPHO or ONI)</entry>
<entry align="center">Grams/hour</entry>
<entry align="left">Off-Network Idling</entry>
</row>
</tbody>
</tgroup>
</table>

<para><emphasis>The RPD lookup table</emphasis> is used to provide estimates of on-roadway emissions processes from mobile sources, using a separate file for each representative county. The on-road running processes that appear in this table include running exhaust, crankcase running exhaust, brake wear, tire wear, on-road evaporative permeation, on-road evaporative fuel leaks, and on-road evaporative vapor venting. The units of the emission rates in this table are grams/mile. The lookup fields for the factors are temperature and average speed. There are 16 set speed bins defined in <xref linkend="tbl_moves_speed_bins" /> (i.e., avgSpeedBinID 1=2.5mph, 2=5mph, 3=10mph, …16=75mph). The avgBinSpeed is used for interpolation in the RPD table.</para>
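<para>To illustrate how avgBinSpeed can drive an interpolation, the Python sketch below linearly interpolates a grams/mile rate between the two neighboring speed bins and multiplies the result by VMT. It is a simplified sketch, not the SMOKE algorithm: the temperature dimension of the lookup is held fixed and omitted, speeds outside the bin range are clamped to the end bins, and the function names and the rate values used with them are hypothetical.</para>
<programlisting language="python"><![CDATA[from bisect import bisect_right

# avgBinSpeed values (mph) for the 16 MOVES default speed bins: 2.5, 5, 10, ..., 75.
BIN_SPEEDS = [2.5] + [5.0 * i for i in range(1, 16)]


def interpolate_rate(speed_mph, rates_by_bin):
    """Linearly interpolate a g/mile rate at speed_mph.

    rates_by_bin holds 16 rates aligned with BIN_SPEEDS, for one fixed
    temperature (the temperature lookup is left out of this sketch).
    """
    if len(rates_by_bin) != len(BIN_SPEEDS):
        raise ValueError("expected one rate per speed bin")
    if speed_mph <= BIN_SPEEDS[0]:
        return rates_by_bin[0]
    if speed_mph >= BIN_SPEEDS[-1]:
        return rates_by_bin[-1]
    hi = bisect_right(BIN_SPEEDS, speed_mph)  # first bin speed above speed_mph
    lo = hi - 1
    frac = (speed_mph - BIN_SPEEDS[lo]) / (BIN_SPEEDS[hi] - BIN_SPEEDS[lo])
    return rates_by_bin[lo] + frac * (rates_by_bin[hi] - rates_by_bin[lo])


def rpd_emissions(vmt_miles, speed_mph, rates_by_bin):
    """Emissions in grams: VMT (miles) times the interpolated g/mile rate."""
    return vmt_miles * interpolate_rate(speed_mph, rates_by_bin)
]]></programlisting>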
<para><emphasis>The RPV, RPH, and RPS lookup tables</emphasis> are used to provide estimates of off-network emission processes (parked engine-off, engine starts, and idling), except for the evaporative off-network vapor venting emissions process. A separate file is provided for each representative county. The off-network emission processes include start exhaust, crankcase start exhaust, off-network evaporative permeation, off-network evaporative fuel leaks, extended idle exhaust, and crankcase extended idle exhaust. Fuel month, temperature, and local hour are the lookup fields, and hours are in the local time of the modeling county. In the RPV table, the units of the emission rates are grams/vehicle/hour. Note: Although the units are grams/vehicle/hour, the number of vehicles (i.e., population) should not be temporally allocated to hours in SMOKE. Instead, the county total vehicle population should be multiplied by the emission rate for any given hour. The number of starts per vehicle by hour is already accounted for in the MOVES lookup table.</para>
<para><emphasis>The RPP table</emphasis> is used only to estimate emissions for off-network fuel vapor venting when the vehicle is parked. This process type includes diurnal (when the vehicle is parked during the day) and hot soak (immediately after a trip when the vehicle parks) emissions types. The process depends on the rate of rise in temperature and the maximum temperature achieved during the day for the diurnal emissions type, and on the hourly temperatures for the hot soak emission type. The lookup fields for this table are reference fuel month and hour of day. As with the RPV table, the units of the emission rates are grams/vehicle/hour. The estimated emissions rates need to be multiplied by the county vehicle population. The representative county lookup tables contain 24-hour emission rates per hour per vehicle using a representative county temperature profile with different minimum and maximum temperatures. The average day county emissions are determined by interpolating between the minimum and maximum temperatures for the modeling county generated by <command>Met4moves</command>. <xref linkend="sect_concepts_moves_met_processing" /> summarizes how <command>Met4moves</command> processes meteorological data for both MOVES and SMOKE.</para>
<para><emphasis>The RPH table</emphasis> is used only to estimate emissions for the on-roadway extended idle exhaust process from mobile sources, using a separate file for each representative county.</para>
<para><emphasis>The RPS table</emphasis> is used only to estimate emissions for the off-network engine start and crankcase start exhaust processes, which occur when vehicle engines are started. Emissions are computed by multiplying the number of vehicle starts in the activity inventory by the RPS emission rate in grams/start.</para>
<para><emphasis>The RPHO (ONI) table</emphasis> is used only to estimate emissions for the off-network vehicle idling process, which occurs mainly when heavy-duty vehicle engines idle in parking lots and at distribution centers. Emissions are computed by multiplying the number of vehicle idling hours in the activity inventory by the RPHO (ONI) emission rate in grams/hour.</para>
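<para>The activity-times-rate arithmetic described for the RPV, RPS, and RPHO (ONI) tables can be written out as follows. This Python sketch only restates the multiplication in the text; the function names are hypothetical and the numbers used with them are made up for illustration.</para>
<programlisting language="python"><![CDATA[def rpv_emissions(county_vehicle_population, rate_g_per_vehicle_hour):
    """Grams emitted in one hour: county total population times the hourly rate.

    The population is a county total and is not temporally allocated.
    """
    return county_vehicle_population * rate_g_per_vehicle_hour


def rps_emissions(engine_starts, rate_g_per_start):
    """Grams emitted: number of engine starts times the grams/start rate."""
    return engine_starts * rate_g_per_start


def oni_emissions(idling_hours, rate_g_per_hour):
    """Grams emitted: hours of off-network idling times the grams/hour rate."""
    return idling_hours * rate_g_per_hour
]]></programlisting>
<para>For example, 1000 engine starts at a hypothetical rate of 2.5 grams/start would yield 2500 grams.</para>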
<table id="tbl_moves_speed_bins">
<?dbfo table-width="6in"?>
<title>MOVES Default Speed Bins</title>

<tgroup cols="3">
<colspec colwidth="5*" />
<colspec colwidth="5*" />
<colspec colwidth="10*" />

<thead>
<row>
<entry align="center">avgSpeedBinID</entry>
<entry align="center">avgBinSpeed (mph)</entry>
<entry align="center">avgSpeedBinDesc</entry>
</row>
</thead>

<tbody>
<row>
<entry align="center">1</entry>
<entry align="center">2.5</entry>
<entry align="center">speed &lt; 2.5mph</entry>
</row>
<row>
<entry align="center">2</entry>
<entry align="center">5</entry>
<entry align="center">2.5mph &#8804; speed &lt; 7.5mph</entry>
</row>
<row>
<entry align="center">3</entry>
<entry align="center">10</entry>
<entry align="center">7.5mph &#8804; speed &lt; 12.5mph</entry>
</row>
<row>
<entry align="center">4</entry>
<entry align="center">15</entry>
<entry align="center">12.5mph &#8804; speed &lt; 17.5mph</entry>
</row>
<row>
<entry align="center">5</entry>
<entry align="center">20</entry>
<entry align="center">17.5mph &#8804; speed &lt; 22.5mph</entry>
</row>
<row>
<entry align="center">6</entry>
<entry align="center">25</entry>
<entry align="center">22.5mph &#8804; speed &lt; 27.5mph</entry>
</row>
<row>
<entry align="center">7</entry>
<entry align="center">30</entry>
<entry align="center">27.5mph &#8804; speed &lt; 32.5mph</entry>
</row>
<row>
<entry align="center">8</entry>
<entry align="center">35</entry>
<entry align="center">32.5mph &#8804; speed &lt; 37.5mph</entry>
</row>
<row>
<entry align="center">9</entry>
<entry align="center">40</entry>
<entry align="center">37.5mph &#8804; speed &lt; 42.5mph</entry>
</row>
<row>
<entry align="center">10</entry>
<entry align="center">45</entry>
<entry align="center">42.5mph &#8804; speed &lt; 47.5mph</entry>
</row>
<row>
<entry align="center">11</entry>
<entry align="center">50</entry>
<entry align="center">47.5mph &#8804; speed &lt; 52.5mph</entry>
</row>
<row>
<entry align="center">12</entry>
<entry align="center">55</entry>
<entry align="center">52.5mph &#8804; speed &lt; 57.5mph</entry>
</row>
<row>
<entry align="center">13</entry>
<entry align="center">60</entry>
<entry align="center">57.5mph &#8804; speed &lt; 62.5mph</entry>
</row>
<row>
<entry align="center">14</entry>
<entry align="center">65</entry>
<entry align="center">62.5mph &#8804; speed &lt; 67.5mph</entry>
</row>
<row>
<entry align="center">15</entry>
<entry align="center">70</entry>
<entry align="center">67.5mph &#8804; speed &lt; 72.5mph</entry>
</row>
<row>
<entry align="center">16</entry>
<entry align="center">75</entry>
<entry align="center">72.5mph &#8804; speed</entry>
</row>
</tbody>
</tgroup>
</table>
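<para>The bin boundaries in <xref linkend="tbl_moves_speed_bins" /> follow a regular pattern, so a speed can be assigned to its bin arithmetically. The Python sketch below encodes the ranges shown in the table; the function name is hypothetical.</para>
<programlisting language="python"><![CDATA[def avg_speed_bin_id(speed_mph):
    """Return the avgSpeedBinID (1-16) for a speed in mph, per the table ranges."""
    if speed_mph < 0:
        raise ValueError("speed must be non-negative")
    if speed_mph < 2.5:
        return 1
    if speed_mph >= 72.5:
        return 16
    # Bins 2 through 15 are each 5 mph wide, starting at 2.5 mph.
    return int((speed_mph - 2.5) // 5) + 2
]]></programlisting>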

</section>

<section>
<title>MOVES Pollutant Groups</title>
<para><xref linkend="tbl_concepts_moves_pollutant_groups"/> lists the MOVES pollutant groups that the user can choose to model within MOVES. The choice of pollutant group(s) determines which pollutants are included in the emission rate lookup tables (such as RPD, RPV, and RPP) output by MOVES. The letter 'X' marks the key pollutants for inclusion, and the letter 'd' signifies that the pollutant is included by default in the MOVES run because a key pollutant depends on it. The user modifies the <link linkend="sect_input_runctlfile">control.in</link> input file to specify the pollutant group.</para>

<table id="tbl_concepts_moves_pollutant_groups">
<title>MOVES Pollutant Groups</title>
<tgroup cols="6">
<colspec colname="c1" colwidth="7*" />
<colspec colname="c2" colwidth="20*" />
<colspec colname="c3" colwidth="7*" />
<colspec colname="c4" colwidth="7*" />
<colspec colname="c5" colwidth="7*" />
<colspec colname="c6" colwidth="7*" />
<thead>
<row>
<entry morerows="1" valign="center" align="center">pollutantID</entry>
<entry morerows="1" valign="center" align="center">pollutantName</entry>
<entry namest="c3" nameend="c6" align="center">Pollutant Group</entry>
</row>
<row>
<entry align="center">Ozone </entry>
<entry align="center">Toxics </entry>
<entry align="center">PM </entry>
<entry align="center">GHG </entry>
</row>
</thead>
<tbody>
<row>
<entry align="center">1</entry>
<entry align="center">Total Gaseous Hydrocarbons</entry>
<entry align="center">d</entry>
<entry align="center">d</entry>
<entry align="center">d</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">79</entry>
<entry align="center">Non-Methane Hydrocarbons</entry>
<entry align="center">d</entry>
<entry align="center">d</entry>
<entry align="center">d</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">80</entry>
<entry align="center">Non-Methane Organic Gases</entry>
<entry align="center">d</entry>
<entry align="center">d</entry>
<entry align="center">d</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">86</entry>
<entry align="center">Total Organic Gases</entry>
<entry align="center">X</entry>
<entry align="center">X</entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">87</entry>
<entry align="center">Volatile Organic Compounds</entry>
<entry align="center">X</entry>
<entry align="center">X</entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">2</entry>
<entry align="center">Carbon Monoxide (CO)</entry>
<entry align="center">X</entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">3</entry>
<entry align="center">Oxides of Nitrogen</entry>
<entry align="center">X</entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">30</entry>
<entry align="center">Ammonia (NH3)</entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">32</entry>
<entry align="center">Nitrogen Oxide</entry>
<entry align="center">X</entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">33</entry>
<entry align="center">Nitrogen Dioxide</entry>
<entry align="center">X</entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">31</entry>
<entry align="center">Sulfur Dioxide (SO2)</entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">100</entry>
<entry align="center">Primary Exhaust PM10 - Total</entry>
<entry align="center"></entry>
<entry align="center">d</entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">101</entry>
<entry align="center">Primary PM10 - Organic Carbon</entry>
<entry align="center"></entry>
<entry align="center">d</entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">102</entry>
<entry align="center">Primary PM10 - Elemental Carbon</entry>
<entry align="center"></entry>
<entry align="center">d</entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">105</entry>
<entry align="center">Primary PM10 - Sulfate Particulate</entry>
<entry align="center"></entry>
<entry align="center">d</entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">106</entry>
<entry align="center">Primary PM10 - Brakewear Particulate</entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">107</entry>
<entry align="center">Primary PM10 - Tirewear Particulate</entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">110</entry>
<entry align="center">Primary Exhaust PM2.5 - Total</entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">111</entry>
<entry align="center">Primary Exhaust PM2.5 - Organic Carbon</entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">112</entry>
<entry align="center">Primary Exhaust PM2.5 - Elemental Carbon</entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">115</entry>
<entry align="center">Primary Exhaust PM2.5 - Sulfate Particulate</entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">116</entry>
<entry align="center">Primary Exhaust PM2.5 - Brakewear Particulate</entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">117</entry>
<entry align="center">Primary Exhaust PM2.5 - Tirewear Particulate</entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">91</entry>
<entry align="center">Total Energy Consumption</entry>
<entry align="center"></entry>
<entry align="center">d</entry>
<entry align="center">d</entry>
<entry align="center">X</entry>
</row>
<row>
<entry align="center">92</entry>
<entry align="center">Petroleum Energy Consumption</entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center">X</entry>
</row>
<row>
<entry align="center">93</entry>
<entry align="center">Fossil Fuel Energy Consumption</entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center">X</entry>
</row>
<row>
<entry align="center">5</entry>
<entry align="center">Methane (CH4)</entry>
<entry align="center">d</entry>
<entry align="center">d</entry>
<entry align="center">d</entry>
<entry align="center">X</entry>
</row>
<row>
<entry align="center">6</entry>
<entry align="center">Nitrous Oxide (N2O)</entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center">X</entry>
</row>
<row>
<entry align="center">90</entry>
<entry align="center">Atmospheric CO2</entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center">X</entry>
</row>
<row>
<entry align="center">98</entry>
<entry align="center">CO2 Equivalent</entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center">X</entry>
</row>
<row>
<entry align="center">20</entry>
<entry align="center">Benzene</entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center">X</entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">21</entry>
<entry align="center">Ethanol</entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center"></entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">22</entry>
<entry align="center">MTBE</entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">23</entry>
<entry align="center">Naphthalene</entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">24</entry>
<entry align="center">1,3-Butadiene</entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">25</entry>
<entry align="center">Formaldehyde</entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">26</entry>
<entry align="center">Acetaldehyde</entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
<entry align="center"></entry>
</row>
<row>
<entry align="center">27</entry>
<entry align="center">Acrolein</entry>
<entry align="center"></entry>
<entry align="center">X</entry>
<entry align="center"></entry>
<entry align="center"></entry>
</row>
</tbody>
</tgroup>
</table>


</section>

</section>

<section id="sect_concepts_point_source_processing">

<title>Point-source processing</title>

<para>Point-source emissions processing in SMOKE focuses on converting annual, daily, or hourly emissions to hourly, gridded model-ready emissions of the chemical species used by an AQM. Recall that by <quote>point sources</quote> in SMOKE we mean point sources in the usual sense plus wildfires with/without precomputed plumes. SMOKE processing may be performed either with or without growth and control of emissions. SMOKE can process both criteria and toxics inventories for point sources and combine consistent criteria and toxics inventories in one run (as explained in more detail in <xref linkend="sect_concepts_combine_toxics" />).</para>

<para>Point-source processing can be performed using a CMAQ-based approach or a UAM-based approach, as previously described in <xref linkend="sect_concepts_model_ready_files" />. The processing steps for CMAQ base-year processing are shown in <xref linkend="fig_concepts_point_base_CMAQ" />. In <xref linkend="fig_concepts_parallel_approach" />, we also included the major intermediate vectors and matrices; please refer to that diagram for those details.</para>

<figure id="fig_concepts_point_base_CMAQ">
<title>Base case point-source processing steps for the CMAQ-based approach</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/point_base_CMAQ_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/point_base_CMAQ_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para>The inventory import step reads the raw emissions data, screens them, processes them, and converts the data to the SMOKE intermediate inventory file (inventory vectors in <xref linkend="fig_concepts_parallel_approach" />). The import can optionally include day-specific and hour-specific data. The emissions in the inventory file are subdivided into hourly emissions during temporal allocation, assigned chemical speciation factors during speciation, and assigned spatial allocation factors during gridding. The plume-rise computation estimates the vertical plume rise of emissions sources and computes the fraction of emissions from each source that goes into each model layer. The results of these steps are combined in a merge step, which creates model-ready files for CMAQ.</para>
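<para>Conceptually, the merge step multiplies the factors produced by the individual steps. The Python sketch below shows that arithmetic for a single source and a single hour. It is a simplified illustration with hypothetical names and values: the temporal factor is a single hourly fraction of the annual total, the speciation factors are treated as plain mass fractions, and the gridding and layer fractions are small dictionaries rather than the matrices that SMOKE uses.</para>
<programlisting language="python"><![CDATA[def model_ready_emissions(annual_emis, hour_fraction, spec_factors,
                          grid_fracs, layer_fracs):
    """Combine temporal, speciation, gridding, and layer factors for one source.

    Returns a dict keyed by (species, grid cell, layer) holding the
    emissions for the hour that hour_fraction represents.
    """
    hourly = annual_emis * hour_fraction  # temporal allocation
    merged = {}
    for species, spec in spec_factors.items():      # chemical speciation
        for cell, grid in grid_fracs.items():       # spatial allocation
            for layer, frac in layer_fracs.items():  # plume-rise layer fractions
                merged[(species, cell, layer)] = hourly * spec * grid * frac
    return merged
]]></programlisting>
<para>When the speciation, gridding, and layer fractions each sum to one, the merged values sum to the hourly emissions of the source.</para>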

<para>Users may optionally choose to select specific sources to be elevated sources and/or PinG sources. If this approach is taken, the selection process can depend on daily-total emissions summed from the <command>Temporal</command> output files. Hence, <xref linkend="fig_concepts_point_base_CMAQ" /> shows that the elevated-source selection may optionally depend on the output from the <command>Temporal</command> program. If elevated-source selection is being included, the plume-rise computation uses that information to skip the point sources that have not been selected as elevated. Thus, plume rise is only computed for the elevated sources. The elevated-source selection also provides its results to the merge step, which is where the special PinG data files for CMAQ are created in addition to the 3-D model-ready file.</para>

<para><xref linkend="fig_concepts_point_base_UAM" /> describes base-case processing for the UAM-based approach. For this type of modeling, the elevated-source selection step is required, and the plume rise computation is not performed. Otherwise, the major processing steps are the same as for the CMAQ-based approach.</para>

<figure id="fig_concepts_point_base_UAM">
<title>Base case point-source processing steps for the UAM-based approach</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/point_base_UAM_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/point_base_UAM_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para>In <xref linkend="fig_concepts_point_growth_CMAQ" />, we show the point-source CMAQ-based processing steps for future- or past-year processing. This processing is similar to the base-year processing flow, except that the growth and controls step is added to calculate the growth and control matrices. The grow inventory step is added to convert the inventory from the base year to a future or past year. The control matrix can optionally be merged to apply control factors to the future- or past-year emissions. The steps shown with dotted lines represent steps that can be reused from the base-year processing because they do not necessarily depend on any of the new steps. However, if the elevated-source selection is to be performed based on the grown emissions, then the elevated-source selection and plume rise computation steps would need to be redone. Note that usually the same elevated-source list is used in both the base- and future-year modeling.</para>
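<para>The effect of the growth and control factors on a single source can be sketched as two multiplications. In this Python sketch the function name is hypothetical, and the control factor is assumed to be a multiplicative factor giving the fraction of emissions that remains after control; the actual contents of the SMOKE growth and control matrices are described elsewhere in this manual.</para>
<programlisting language="python"><![CDATA[def project_emissions(base_emis, growth_factor, control_factor=None):
    """Project base-year emissions to a future or past year.

    growth_factor scales the base-year value; control_factor, if given,
    is assumed to be the fraction of emissions remaining after control.
    """
    grown = base_emis * growth_factor
    if control_factor is None:
        return grown
    return grown * control_factor
]]></programlisting>
<para>Leaving out the control factor corresponds to the case in which the control matrix is not merged.</para>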

<figure id="fig_concepts_point_growth_CMAQ">
<title>Future- or past-year growth and control point-source processing steps for the CMAQ-based approach</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/point_growth_CMAQ_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/point_growth_CMAQ_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para>Growth and control can also be used for UAM-based processing (<xref linkend="fig_concepts_point_growth_UAM" />). As with the other figures, the dotted lines indicate steps that may be reused from the base-case processing. If the elevated-source selection depends on the grown emissions, then you will need to regenerate the elevated-source list. The usual modeling practice, however, is to use the same elevated-source list in both the base- and future-year modeling, which makes the air quality modeling results more comparable.</para>

<figure id="fig_concepts_point_growth_UAM">
<title>Future- or past-year growth and control point-source processing steps for the UAM-based approach</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/point_growth_UAM_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/point_growth_UAM_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para>As with area sources, you may apply many growth and control matrices at the front end of processing. The area-source diagram for this approach was provided as <xref linkend="fig_concepts_area_projection" />, and the point-source approach is quite similar, with the addition of the elevated-source selection and plume rise computation steps.</para>

<para>In sections later in this chapter, we describe the SMOKE programs that are needed for each of the processing steps just discussed and additional details about what activities are accomplished during each step. These sections are:</para>

<itemizedlist spacing="compact">
<listitem><xref linkend="sect_concepts_inventory_import" /></listitem>
<listitem><xref linkend="sect_concepts_temporal_processing" /></listitem>
<listitem><xref linkend="sect_concepts_chemical_processing" /></listitem>
<listitem><xref linkend="sect_concepts_spatial_processing" /></listitem>
<listitem><xref linkend="sect_concepts_growth_processing" /></listitem>
<listitem><xref linkend="sect_concepts_control_processing" /></listitem>
<listitem><xref linkend="sect_concepts_elevated_processing" /></listitem>
<listitem><xref linkend="sect_concepts_merge_processing" /></listitem>
<listitem><xref linkend="sect_concepts_qa_processing" /></listitem>
</itemizedlist>

<para>Point-source processing includes a number of additional features that are not applicable for other SMOKE source categories:</para>

<orderedlist spacing="compact">
<listitem>Flexible source definitions</listitem>
<listitem>Stack parameters</listitem>
<listitem>Day- and hour-specific emissions</listitem>
<listitem>Different approaches for elevated sources for different AQMs, including the use of PinG sources</listitem>
<listitem>Elevated-source selection</listitem>
<listitem>Wild and prescribed fires point sources</listitem>
</orderedlist>

<section>

<title>Flexible source definitions</title>

<para>Depending on the input format of the point-source emissions (e.g., FF10, ORL), the set of characteristics that are used to uniquely identify a point source can be different. For example, the ORL-formatted inventories define a point source using a country, state, and county code, an SCC, a plant identifier, a stack number, a point identifier, and a segment number. The FF10 format, however, identifies a source using a FIPS state/county code, a plant code, a unit ID, a segment ID, and SCC. To better support the formats and be adaptable if new formats are created in the future, SMOKE uses a flexible definition of point sources. This definition consists of the following source characteristics to uniquely define the sources:</para>

<itemizedlist>
<listitem>
<para>Country, state, and county code</para>
</listitem>
<listitem>
<para>Plant ID (15 characters or less)</para>
</listitem>
<listitem>
<para>Characteristics 1 through 5 (each 15 characters or less)</para>
</listitem>
</itemizedlist>

<para>Depending on the input format, SMOKE assigns different variables from the input format to the parts of the SMOKE point-source definition.  The assignments for the remaining characteristics are as follows:</para>

<para><emphasis>ORL format:</emphasis></para>

<itemizedlist spacing="compact">
<listitem>Char 1: Point ID</listitem>
<listitem>Char 2: Stack ID</listitem>
<listitem>Char 3: Segment ID</listitem>
<listitem>Char 4: SCC</listitem>
<listitem>Char 5: unused</listitem>
</itemizedlist>

<para><emphasis>FF10 format:</emphasis></para>

<itemizedlist spacing="compact">
<listitem>Char 1: Stack ID</listitem>
<listitem>Char 2: Unit ID</listitem>
<listitem>Char 3: Segment ID</listitem>
<listitem>Char 4: SCC</listitem>
<listitem>Char 5: unused</listitem>
</itemizedlist>

<para>The meaning of these source characteristics for a given inventory type needs to be considered when cross-reference files are created, if cross-reference entries other than state/county and SCC-specific entries are provided.</para>
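<para>As an illustration only, the assignments above can be pictured as building a key of fixed shape from format-specific fields. The following Python sketch is not SMOKE code; the function <literal>source_key</literal>, the table <literal>CHAR_FIELDS</literal>, and the record field names are hypothetical and are introduced here only to show the idea.</para>

<programlisting language="python"><![CDATA[
```python
# Hypothetical sketch: map format-specific fields onto the flexible
# point-source definition (geographic code, plant ID, Char 1 through 5).
# The record field names are illustrative, not SMOKE identifiers.

CHAR_FIELDS = {
    "ORL": ["point_id", "stack_id", "segment_id", "scc", None],
    "FF10": ["stack_id", "unit_id", "segment_id", "scc", None],
}


def source_key(fmt, record):
    """Return (geo_code, plant_id, char1, ..., char5) for one record."""
    chars = tuple(
        "" if name is None else str(record[name])
        for name in CHAR_FIELDS[fmt]
    )
    return (record["geo_code"], record["plant_id"]) + chars


# Illustrative ORL-style record; all values are made up.
orl_record = {
    "geo_code": "037001", "plant_id": "P100", "point_id": "1",
    "stack_id": "S2", "segment_id": "3", "scc": "10100202",
}
key = source_key("ORL", orl_record)
```
]]></programlisting>

<para>Because the key always has the same number of elements, a cross-reference entry written against, say, Char 1 needs to be interpreted according to the inventory format that supplied it.</para>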

<section>

<title>Point definition header row in cross-reference files</title>

<para>As just described, SMOKE uses a flexible definition of point sources. This definition may or may not include the SCC (although SCC is always at least a source attribute). Cross-reference files for point sources can contain source-specific records, and they usually use the SCC to perform the needed assignments during emissions processing. For the files to be self-describing, they use a header that indicates the number of characteristics in addition to the plant ID that are being used in the cross-reference file. This number needs to be consistent with the number of point-source characteristics used in the inventory files. In addition, the header indicates which, if any, of the point-source characteristics is the SCC. This header starts with the characters /POINT DEFN/, and the files that use it describe it as part of the file format definition in <xref linkend="ch_input_files" />.</para>

</section>

</section>

<section>

<title>Stack parameters</title>

<para>Several of the source attributes for point sources are stack parameters: specifically, the stack height, stack diameter, and the stack flue gas exit temperature, velocity, and flow rate. SMOKE can use hourly data for the stack flue gas exit temperature, velocity, and flow rate when using the CMAQ-based approach to modeling, in which SMOKE computes hourly plume rise. The hourly stack parameters cannot be used when modeling using a UAM-based elevated-point-source approach.</para>

<para>During the <command>Smkinven</command> program&rsquo;s import of the stack parameters from the annual or average-day inventory file (i.e., not the hourly stack parameters), SMOKE needs to read or assign stack parameters for all point sources. <xref linkend="sect_concepts_check_stack_params" /> explains in greater detail what the <command>Smkinven</command> program does with stack parameters. The hourly stack parameters are read in without modification or adjustment.</para>

</section>

<section>

<title>Day-specific and hour-specific emissions</title>

<para>Emissions from point sources are sometimes available as day- or hour-specific values. <command>Smkinven</command> can import the day- and hour-specific data, and it can also convert the hour-specific data to hour-specific temporal profiles. When these data are available, the <command>Temporal</command> program overrides the annual or average-day emissions with the most specific data available. If day-specific data are available, <command>Temporal</command> uses them to overwrite the annual or average-day emissions during the time periods for which these data are available. If hour-specific data are available, <command>Temporal</command> uses them to overwrite the annual, average-day, or day-specific emissions data.</para>
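<para>The precedence just described can be sketched as follows. This is a minimal illustration of the selection logic only; the function <literal>select_emissions</literal> and its arguments are hypothetical and do not reflect how <command>Temporal</command> is implemented.</para>

<programlisting language="python"><![CDATA[
```python
def select_emissions(annual_avg_day, day_specific=None, hour_specific=None):
    """Pick the most specific value available for one source and hour.

    annual_avg_day: hourly value derived from annual or average-day data
    day_specific:   hourly value derived from day-specific data, or None
    hour_specific:  hour-specific value, or None
    """
    # None means "no data for this period"; zero is a valid emissions value.
    if hour_specific is not None:
        return hour_specific
    if day_specific is not None:
        return day_specific
    return annual_avg_day
```
]]></programlisting>

<para>Note that the sketch tests for missing data explicitly, so that a reported value of zero still overrides a less specific value.</para>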

</section>

<section>

<title>Different approaches for elevated sources for different AQMs</title>

<para>As introduced in <xref linkend="sect_concepts_model_ready_files" />, there are two different major approaches for creating emissions inputs to AQMs: the CMAQ-based approach and the UAM-based approach. The two approaches differ only in how point sources are treated. In the CMAQ-based approach, SMOKE calculates the plume rise using an algorithm based on a Briggs plume rise formulation<comment>add reference</comment>. SMOKE then includes the vertical distribution of the point-source emissions in the 3-D model-ready file for CMAQ. For the CMAQ model only, SMOKE can also create two special PinG files: one to identify the sources, their locations, and their stack parameters, and the other to provide the hour-specific emissions for just these sources. In the UAM-based approach, SMOKE creates a special elevated-point-source file that both identifies the elevated and PinG sources and includes the hourly emissions values for those sources.</para>

<para>PinG sources are those sources that will be treated in greater detail by the AQM. In simple terms, the AQMs preprocess the chemistry of the plume emissions before those emissions are provided to the AQM grid cells and layers. The intent of the PinG approach is to provide more accurate modeling at and around very large point sources.</para>

<para><xref linkend="sect_concepts_elevated_processing" /> describes in greater detail the steps taken by SMOKE in the layer fraction processing using a Briggs formulation and the elevated and PinG source selection.</para>

</section>

<section>

<title>Wild and prescribed fires point sources</title>

<para>You may also either provide precomputed point-source plume rise to SMOKE or have SMOKE internally compute plume rise using the acres burned and fuel loading of fires, with both the CMAQ-based and UAM-based approaches to modeling point sources. Point sources with precomputed plume rise are called explicit plume sources. This capability was implemented for modeling wildfire sources as point sources in SMOKE using plume rise computed with a different approach from the Briggs-based approach used for stack-based plumes<comment> (<xref linkend="sect_ptfire_emis_cmaq"/>)</comment>. The input data for this approach are the fraction of emissions in layer 1, the bottom of the plume, and the top of the plume. SMOKE distributes the emissions across the layers by weighting the emissions by the pressure difference in each layer over the total pressure difference between the top and bottom of the plume. <comment><xref linkend="sect_programs_laypoint" /> provides the technical description of how the emissions are distributed among the layers.</comment></para>
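<para>The pressure weighting can be illustrated with a small worked example. The sketch below is a simplified illustration and not the SMOKE algorithm: it assumes that the plume bottom and top have already been expressed as pressures, it ignores the separate layer-1 fraction input, and the function name <literal>plume_layer_fractions</literal> and all numeric values are hypothetical.</para>

<programlisting language="python"><![CDATA[
```python
def plume_layer_fractions(interfaces, p_bottom, p_top):
    """Fraction of plume emissions in each layer, weighted by pressure.

    interfaces: pressures at the layer boundaries, surface first
                (values decrease with height)
    p_bottom:   pressure at the bottom of the plume
    p_top:      pressure at the top of the plume (less than p_bottom)
    """
    total = p_bottom - p_top
    fractions = []
    for lower, upper in zip(interfaces[:-1], interfaces[1:]):
        # Overlap between this layer's pressure range and the plume's range
        overlap = min(lower, p_bottom) - max(upper, p_top)
        fractions.append(max(overlap, 0.0) / total)
    return fractions


# Three layers bounded at 1000, 950, 900, and 800 (arbitrary pressure units);
# the plume spans 960 to 860.
fractions = plume_layer_fractions([1000.0, 950.0, 900.0, 800.0], 960.0, 860.0)
```
]]></programlisting>

<para>In this example the plume overlaps the three layers by 10, 50, and 40 pressure units out of a total of 100, so the layers receive 10%, 50%, and 40% of the emissions, respectively.</para>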

<para>For the UAM-based modeling approach, the file format does not readily allow you to provide precomputed plume rise; in fact, the entire premise of the format is that the AQM will compute the plume rise. To enable you to provide precomputed plume rise for the UAM-based modeling approach, the <command>Smkmerge</command> program creates an ASCII elevated-point-source file with an imaginary stack for each layer of each source (e.g., each wildfire). The stack parameters of each imaginary stack are set to values that ensure that a zero plume rise is computed for the stack, and the x-y location is the same for all imaginary stacks representing the same source. The emissions associated with the imaginary stacks are provided based on the emissions values that are to be entered in each layer for the source. The emissions for layer 1 are written in the point-source file to the imaginary stack associated with layer 1, and the same is done for all of the other layers. SMOKE uses a zero emissions value for the imaginary stack when the emissions from a given source are not in a layer for an hour or when the source stops (e.g., once a wildfire ends). While not particularly elegant, this approach permits providing precomputed plume rise to the UAM-based models without having to change those models.</para>

</section>

</section>
</section>

<section id="sect_concepts_inventory_import">

<title>Inventory import</title>

<para>The importing of emission inventory and related data is the first processing step needed for any emissions processing effort. The <command>Smkinven</command> program imports data for anthropogenic sources, and the <command>Normbeis4</command> program imports BEIS4 land use data for biogenic sources. In this section, we focus on the import of the anthropogenic inventories using <command>Smkinven</command>. The biogenic import is further described in <xref linkend="sect_concepts_biogenic_processing" />.</para>

<para><command>Smkinven</command> performs many types of activities during import of the anthropogenic inventories. Though the primary purpose is reading the data from ASCII formats and outputting an I/O API SMOKE intermediate inventory, there are many other actions that need to be performed during the inventory import stage of processing. These actions are the following:</para>

<orderedlist>
<listitem>
<para>Check that the formats of the input files are correct and consistent, and ensure that all data can be read properly.</para>
</listitem>
<listitem>
<para>Assign pollutant names to data input by code numbers.</para>
</listitem>
<listitem>
<para>Select pollutants from the input files to keep for further SMOKE processing.</para>
</listitem>
<listitem>
<para>When multiple files are provided, combine all annual and/or average-day data into a consistent inventory. This includes checking for duplicates and possibly aborting, depending on program options set by the user.</para>
</listitem>
<listitem>
<para>Combine toxics and criteria inventories, and eliminate duplicate mass using either an integrate or no-integrate approach.</para>
</listitem>
<listitem>
<para>Sort the inventory records into the order expected by other SMOKE programs.</para>
</listitem>
<listitem>
<para>Aggregate or disaggregate toxics emissions data as specified by user inputs.</para>
</listitem>
<listitem>
<para>Assign point-source locations to area sources, when available.</para>
</listitem>
<listitem>
<para>Fill in and check point-source stack parameters.</para>
</listitem>
<listitem>
<para>Convert stack locations from UTM to lat-lon.</para>
</listitem>
<listitem>
<para>Optionally ensure that lat-lon coordinates are in the Western Hemisphere.</para>
</listitem>
<listitem>
<para>Convert units of emissions and activities to the units used in the SMOKE intermediate inventory.</para>
</listitem>
<listitem>
<para>Set the weekday averaging approach.</para>
</listitem>
<listitem>
	<para>Assign country codes and/or geographic codes (GEOCODE_LEVEL[1-4]), years, and time zones.</para>
</listitem>
<listitem>
<para>Handle inventories that have data for multiple years.</para>
</listitem>
<listitem>
<para>Set the base year.</para>
</listitem>
<listitem>
<para>Report results of import including pollutant totals for toxics data and other information needed for quality assurance.</para>
</listitem>
<listitem>
<para>Import day-specific and hour-specific data, if available, and ensure that records in these files match inventory records provided in the annual or average-day inputs.</para>
</listitem>
</orderedlist>

<para>In the following subsections, we describe what SMOKE does for each of these activities.</para>

<section id="sect_concepts_check_inventory">

<title>Check the correctness and consistency of input file formats</title>

<para><command>Smkinven</command> can read the following ASCII formats for annual and average-day inventory data:</para>

<itemizedlist>
<listitem>
<para><emphasis role="bold">ORL format:</emphasis> This set of input formats is used for inputting point, nonpoint, on-road, and nonroad HAP emissions inventories, also called toxics emission inventories. There is a different ORL format for nonpoint, point, nonroad mobile, and on-road mobile sources.</para>
</listitem>

<listitem>
	<para><emphasis role="bold">List format:</emphasis> This is the input format used to provide multiple files to <command>Smkinven</command> in a single run. This format is simply an ASCII file that contains a list of other files.</para>
</listitem>
<listitem>
	<para><emphasis role="bold">List GRID format:</emphasis> This is the input format used to provide multiple global gridded emission inventory files to <command>Smkinven</command> in a single run. This format is simply an ASCII file that contains a list of other files. To support input of multiple pregridded NetCDF files, the keyword #LIST GRID in the header of this file switches SMOKE into gridded inventory processing mode. This approach is described more in <xref linkend="sect_concepts_pregridded_data" />.</para>
	</listitem>


<listitem>
<para><emphasis role="bold">Gridded I/O API format:</emphasis> This format is a gridded I/O API file for allowing the import of pregridded data from the same modeling domain. This approach is described more in <xref linkend="sect_concepts_pregridded_data" />.</para>
</listitem>
</itemizedlist>

<para><command>Smkinven</command> ensures that all file formats provided to SMOKE are correct and include the required data fields. The formats and their required fields are provided in <xref linkend="ch_input_files" />.</para>

</section>

<section id="sect_concepts_assign_pollutant_names">

<title>Assign pollutant names to data input by code numbers</title>

<para>The ORL format uses code numbers (usually Chemical Abstracts Service [CAS] numbers) to distinguish which chemical compound or inventory pollutant is provided on each line of the file. <command>Smkinven</command> matches these numbers with the CAS numbers from an inventory table (<envar>INVTABLE</envar>) file, described in <xref linkend="sect_input_invtable" />. The CAS number does not necessarily have to be a valid CAS number; it can be any number as long as there is a match between the numbers in the ORL file and the inventory table. The inventory table provides the inventory data names, such as the pollutant names that SMOKE uses in the remaining processing steps. Note that the SMOKE inventory pollutants may not be identical to the pollutants in the inventory fed to SMOKE because of the aggregation/disaggregation that is performed by <command>Smkinven</command> (see <xref linkend="sect_concepts_aggregate_toxics" /> for more information).</para>

<para>For toxics processing, if multiple inventory data names apply for the same CAS number, the Factor column of <envar>INVTABLE</envar> will contain the split factor used by <command>Smkinven</command> to disaggregate the emissions from that CAS number to multiple inventory data values. If multiple CAS numbers apply for the same pollutant name, then <command>Smkinven</command> will sum these emissions, but will not report duplicate records unless there are indeed duplicates in the inventory file. This is described more fully in <xref linkend="sect_concepts_aggregate_toxics" />. Duplicate reporting is described more in <xref linkend="sect_concepts_check_duplicates" />.</para>

<para>The use of the inventory data name as the unique pollutant identifier in SMOKE differs from the approach of EMS-HAP, in which the SAROAD code is the unique identifier for the pollutants to be modeled. Because we anticipated that some toxics pollutants that do not have unique SAROAD codes (e.g., divalent particulate mercury) would need to be modeled explicitly, we did not want to take this approach. If we had, the user would have been required to create fake and unique SAROAD codes to be able to model these emissions explicitly.</para>

</section>

<section>

<title>Select pollutants from the input files for further SMOKE processing</title>

<para>In version 1.5 and higher of SMOKE, users can specify the valid data (pollutants and activities) using the inventory table discussed in <xref linkend="sect_input_invtable" />, as described in <xref linkend="sect_concepts_assign_pollutant_names" />. <command>Smkinven</command> will read only those entries that have a <quote>Y</quote> (for <quote>Yes</quote>) in the Keep column of the inventory table. If a pollutant has an <quote>N</quote> (for <quote>No</quote>) in the Keep column, it will not be output to the SMOKE intermediate files, but will be included in the inventory reports that <command>Smkinven</command> creates. If the pollutant is not listed at all in the inventory table, it will be dropped as well; <command>Smkinven</command> will also write a warning message and will not include the pollutant in any reporting. It is good practice to list every pollutant in the input inventory in the inventory table, even if the pollutant will not be used for the AQM. Use of the <envar>INVTABLE</envar> file replaces the use of the <envar>SIPOLS</envar> and <envar>ACTVNAMS</envar> files from previous versions of SMOKE.</para>
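<para>The three outcomes described above (kept, reported only, and dropped with a warning) can be sketched as follows. This is an illustration under simplifying assumptions: the inventory table is reduced to a mapping from pollutant name to its Keep flag, and the function <literal>filter_pollutants</literal> and the warning text are hypothetical rather than taken from <command>Smkinven</command>.</para>

<programlisting language="python"><![CDATA[
```python
def filter_pollutants(records, keep_flags):
    """Split inventory records by the Keep flag of a simplified table.

    records:    list of (pollutant, emissions) pairs
    keep_flags: dict mapping pollutant name to "Y" or "N" (Keep column)
    Returns (kept, report_only, warnings).
    """
    kept, report_only, warnings = [], [], []
    for pollutant, emissions in records:
        flag = keep_flags.get(pollutant)
        if flag == "Y":
            # Written to the intermediate inventory
            kept.append((pollutant, emissions))
        elif flag == "N":
            # Not written, but still available for reporting
            report_only.append((pollutant, emissions))
        else:
            # Not listed at all: dropped, with a warning and no reporting
            warnings.append(f"WARNING: {pollutant} not in table; dropped")
    return kept, report_only, warnings


kept, report_only, warnings = filter_pollutants(
    [("NOX", 4.0), ("CO", 9.0), ("UNLISTED", 1.0)],
    {"NOX": "Y", "CO": "N"},
)
```
]]></programlisting>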

</section>

<section id="sect_concepts_check_duplicates">

<title>Check for duplicate records</title>

<para>In <xref linkend="sect_concepts_check_inventory" />, we explained that <command>Smkinven</command> ensures that the input files are correct and that the same format is used. In addition, <command>Smkinven</command> checks for duplicate records across the entire set of inventory input files. A duplicate record is one that has the same source characteristics (defined in <xref linkend="sect_concepts_summary_source_processing" />) and pollutants as another record in the inventory. <command>Smkinven</command> provides an option that allows you to instruct <command>Smkinven</command> whether duplicates should cause a warning message or an error. In some cases, you may not expect duplicates in your inventory, in which case you can have <command>Smkinven</command> abort after reporting all duplicates. In other cases, you can instruct <command>Smkinven</command> to sum the emissions across all duplicate records. See <xref linkend="sect_programs_smkinven" /> for more information on the different approaches.</para>
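<para>A minimal sketch of this duplicate check follows. It assumes that each record has already been reduced to a tuple of source characteristics, a pollutant name, and an emissions value; the function <literal>check_duplicates</literal> and its <literal>allow_duplicates</literal> argument are hypothetical stand-ins for the program option and are not <command>Smkinven</command> settings.</para>

<programlisting language="python"><![CDATA[
```python
from collections import defaultdict


def check_duplicates(records, allow_duplicates):
    """Find records sharing source characteristics and pollutant.

    records: list of (source_characteristics, pollutant, emissions)
    If allow_duplicates is True, duplicate emissions are summed;
    otherwise an error is raised after all duplicates are collected.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for source, pollutant, emissions in records:
        totals[(source, pollutant)] += emissions
        counts[(source, pollutant)] += 1
    duplicates = sorted(key for key, n in counts.items() if n > 1)
    if duplicates and not allow_duplicates:
        raise ValueError(f"{len(duplicates)} duplicate source/pollutant pair(s)")
    return dict(totals), duplicates


# Illustrative area-source records keyed by (geographic code, SCC).
source = ("037001", "2102004000")
records = [(source, "NOX", 1.0), (source, "NOX", 2.5), (source, "VOC", 0.5)]
totals, duplicates = check_duplicates(records, allow_duplicates=True)
```
]]></programlisting>

<para>Collecting all duplicates before deciding whether to stop mirrors the behavior described above, in which every duplicate is reported before the program aborts.</para>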

</section>

<section id="sect_concepts_combine_toxics">

<title>Combine toxics and criteria inventories</title>

<para>For point, nonpoint, on-road mobile, and nonroad-mobile sources, the toxics inventory contains emissions for VOC pollutants that are provided as explicit chemical compounds (for example, benzene). These same VOC emissions are also included as an aggregated VOC value in the criteria emissions inventory. To use these inventories together, <command>Smkinven</command> provides the necessary options to ensure that double counting of VOC emissions will not occur. These two options are the <quote>integrate</quote> and <quote>no-integrate</quote> options.</para>

<para>The <quote>integrate</quote> option involves subtracting toxic VOC emissions from the criteria VOC emissions to avoid double counting of VOC when the emissions are speciated. With this option the user must ensure that the sources in the toxics and criteria inventories match up one-to-one, so that <command>Smkinven</command> can properly compute the emissions.</para>

<para>Note that in any discussion of the toxics inventory we have assumed that all emissions are annual total emissions, because the toxics inventory that is currently available does not include average-day emissions. We have also assumed that the inventory contains VOC emissions, but the same approach can be used to process TOG emissions.</para>

<para>During import of both toxics and criteria emission inventories, SMOKE matches the area/nonpoint, on-road mobile, and nonroad mobile emission inventories by country/state/county code and SCC. SMOKE also matches the toxics and criteria records for the point sources, provided that the point sources in the two inventories use identical fields for their source characteristics. You are required to ensure that the source characteristics for all source categories match between the two inventories for any sources that you wish to have matched. Once they are matched, SMOKE will have both a criteria VOC emissions value and toxics emission values for individual VOC chemical compounds.</para>

<para>SMOKE can optionally compute a NONHAPVOC value by subtracting the sum of toxics VOC from the criteria VOC value. This same approach can be used to create a NONHAPTOG value if the inventory or MOVES (when processing on-road mobile emissions using VMT data) uses a TOG value instead of a VOC value. (We will not mention NONHAPTOG again, but it could be used to replace NONHAPVOC throughout this section.)</para>

<para>The case of computing NONHAPVOC is called the <quote>integrate</quote> case because it involves integration of the VOC mass between the criteria and toxics inventories. Likewise, the case of not computing NONHAPVOC is called the <quote>no-integrate</quote> case. With the <quote>integrate</quote> approach, the NONHAPVOC mass and the toxics VOC mass are independent from one another and will not double count emissions. The calculation must be performed for each source, and <command>Smkinven</command> will set the criteria VOC value to zero when it computes the NONHAPVOC value. <command>Smkinven</command> determines which pollutants should be subtracted from VOC using the <quote>VOC or TOG</quote> column in the inventory table (<envar>INVTABLE</envar>) file.</para>
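<para>For a single matched source, the <quote>integrate</quote> calculation can be sketched as shown below. The sketch assumes that the list of toxics to subtract has already been read from the <quote>VOC or TOG</quote> column of the inventory table; the function <literal>integrate_voc</literal> and the emissions values are hypothetical, and the sketch does not address what happens if the toxics sum exceeds the criteria VOC value.</para>

<programlisting language="python"><![CDATA[
```python
def integrate_voc(emissions, voc_toxics):
    """Compute NONHAPVOC for one source under the integrate approach.

    emissions:  dict of pollutant name to emissions for one source
    voc_toxics: names of the toxics that are part of VOC
    """
    result = dict(emissions)
    toxics_voc = sum(emissions.get(name, 0.0) for name in voc_toxics)
    result["NONHAPVOC"] = emissions["VOC"] - toxics_voc
    # The criteria VOC value is zeroed once NONHAPVOC has been computed.
    result["VOC"] = 0.0
    return result


source_emissions = {"VOC": 10.0, "BENZENE": 1.5, "TOLUENE": 2.5, "NOX": 4.0}
integrated = integrate_voc(source_emissions, ["BENZENE", "TOLUENE"])
```
]]></programlisting>

<para>Here the source ends up with 6.0 units of NONHAPVOC alongside the explicit benzene and toluene values, so the total VOC mass of 10.0 is carried forward without double counting.</para>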

</section>

<section>
<title>Sort the inventory</title>
<para>When <command>Smkinven</command> reads an inventory, it also puts its sources into a special sorted order prior to outputting the SMOKE intermediate inventory files. All programs that read <command>Smkinven</command> outputs, which includes most of the SMOKE programs, expect this order. The order is determined by sorting the source characteristics listed in <xref linkend="sect_concepts_summary_source_processing" /> in ascending order. For example, area sources will be sorted in order of increasing country/state/county code, and within a single country/state/county code will be sorted in order of increasing SCC. In <xref linkend="fig_concepts_sort_inventory" />, we show how the sorted order may be completely different from the order of the files and records provided to <command>Smkinven</command>. The figure shows the unsorted ASCII input files at left (provided to <command>Smkinven</command> by logical files <envar>ARINV</envar>, <envar>MBINV</envar>, or <envar>PTINV</envar>) and how the records can be rearranged by <command>Smkinven</command> to create the sorted I/O API output files (output from <command>Smkinven</command> as <envar>AREA</envar>, <envar>MOBL</envar>, or <envar>PNTS</envar>). Each record in this diagram represents a complete inventory record with all source characteristics, source attributes, and emissions.</para>

<figure id="fig_concepts_sort_inventory">
<title>Combining and sorting ASCII inputs to create sorted I/O API outputs</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/sort_inventory_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/sort_inventory_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para>Having this sorted order is important because the programs that depend on <command>Smkinven</command> outputs also store their outputs in the same order. However, these other programs (such as <command>Temporal</command> for temporal allocation, <command>Spcmat</command> for chemical speciation, and <command>Grdmat</command> for spatial allocation) store their records using the record numbers to match their outputs with the <command>Smkinven</command> outputs. This means that the source characteristics that are stored in the <command>Smkinven</command> outputs are not also included in the outputs from <command>Temporal</command>, <command>Spcmat</command>, <command>Grdmat</command>, and other programs; these programs rely on the sorted order in the <command>Smkinven</command> outputs not changing. This approach minimizes redundant data storage, thus reducing disk space needs.</para>

<para>As explained previously, the records output from <command>Smkinven</command> are vectors of emissions and source characteristics that make up the SMOKE intermediate inventory files. Each record number in the file identifies an element of the vector. The outputs from <command>Temporal</command> are also vectors of hourly emissions. The record number in each hourly vector will match the record number in the intermediate inventory files. The outputs from <command>Spcmat</command> are a matrix of speciation factors, in which the record numbers (rows of the matrix) will match the record numbers of the intermediate inventory files. The columns of the matrix are each valid pollutant-to-species transformations. The outputs from <command>Grdmat</command> are a sparse matrix, but again the rows of the matrix match the rows of the intermediate inventory file.  Therefore, assignment of factors is a simple matter of selecting the same record number from the <command>Smkinven</command> output files; this is in fact one part of the vector-matrix multiplication used by SMOKE.</para>
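<para>The role of the sorted order can be illustrated with a short sketch. It is not SMOKE code: the function <literal>sort_inventory</literal>, the record layout, and the factor values are hypothetical, and a single per-source factor stands in for the matrices produced by the downstream programs.</para>

<programlisting language="python"><![CDATA[
```python
def sort_inventory(records):
    """Sort area-source records by geographic code, then by SCC."""
    return sorted(records, key=lambda r: (r["geo_code"], r["scc"]))


# Records in the order they happened to appear in the input files
# (codes and emissions values are illustrative).
unsorted = [
    {"geo_code": "037063", "scc": "2102004000", "emissions": 4.0},
    {"geo_code": "037001", "scc": "2103006000", "emissions": 2.0},
    {"geo_code": "037001", "scc": "2102004000", "emissions": 8.0},
]
inventory = sort_inventory(unsorted)

# Downstream outputs carry no source characteristics: element i of each
# vector refers to inventory[i], so factors are applied by position alone.
annual = [record["emissions"] for record in inventory]
factors = [0.5, 0.25, 1.0]
adjusted = [e * f for e, f in zip(annual, factors)]
```
]]></programlisting>

<para>If a source were added to or removed from <literal>unsorted</literal>, the positions would shift and the existing <literal>factors</literal> list would no longer line up with the inventory, which is why dependent steps need to be rerun after a re-import.</para>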

<para>It is important to remember this sorted-order approach when you have run an inventory through all of the programs once, and then want to change your inventory and re-import the data with <command>Smkinven</command>. For the re-importing and subsequent rerun, if any source characteristics in the inventory change, or if any sources are added or removed, then the number and/or order of the output sources in the new <command>Smkinven</command> outputs will be different. This means that the outputs from all processing steps that depend on the <command>Smkinven</command> outputs will need to be rerun.</para>

</section>

<section id="sect_concepts_aggregate_toxics">

<title>Aggregate or disaggregate toxics emissions</title>

<para><command>Smkinven</command> also supports aggregation and disaggregation of toxics inventory pollutants to match the input needs for AQMs. There are two components of this aggregation and disaggregation. The first involves what to do during inventory import to resolve discrepancies between the inventory content and what data are most useful for further processing through SMOKE; this topic is addressed in this subsection. The second issue is conversion of the toxics inventory to the species needed by the AQM, which is handled by the <command>Spcmat</command> program in SMOKE, as described in <xref linkend="sect_concepts_chemical_processing" />.</para>

<para><command>Smkinven</command> uses the inventory table (<envar>INVTABLE</envar>) file to determine what aggregation and disaggregation are needed. The <envar>INVTABLE</envar> file can have both multiple pollutant names per CAS number and multiple CAS numbers per pollutant name. In the first case, <command>Smkinven</command> disaggregates the emissions to two or more pollutants using the number provided in the <quote>Factor</quote> column of the <envar>INVTABLE</envar> file. In the second case, <command>Smkinven</command> combines the separate records in the inventory into the same pollutant. Depending on the relationship between the <quote>Keep</quote> column, the <quote>Pollutant Name,</quote> and the <quote>CAS Number</quote> in the inventory table file, the reader routines ensure that they will keep emissions only for pollutants (not CAS numbers) with a <quote>Y</quote> in the <quote>Keep</quote> column. <command>Smkinven</command> can also tell the difference between two CAS numbers that are being aggregated for the same source and a duplicate record (two records for the same source with emissions for the same pollutant).</para>
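<para>Both cases can be sketched for a single source as follows. The inventory table is reduced here to four columns, and the code numbers, pollutant names, and factors are made up for illustration; the function <literal>apply_invtable</literal> is hypothetical and does not represent the <command>Smkinven</command> reader routines.</para>

<programlisting language="python"><![CDATA[
```python
from collections import defaultdict


def apply_invtable(source_records, table_rows):
    """Map code-level emissions for one source to pollutant names.

    source_records: list of (code_number, emissions) for one source
    table_rows:     list of (code_number, pollutant_name, factor, keep)
    """
    totals = defaultdict(float)
    for code, emissions in source_records:
        for row_code, name, factor, keep in table_rows:
            if row_code == code and keep == "Y":
                # One code with several rows is disaggregated by Factor;
                # several codes with one name are summed together.
                totals[name] += emissions * factor
    return dict(totals)


table_rows = [
    ("1001", "POLL_A", 1.0, "Y"),   # codes 1001 and 1002 aggregate
    ("1002", "POLL_A", 1.0, "Y"),   # into the single pollutant POLL_A
    ("2001", "POLL_B", 0.25, "Y"),  # code 2001 is split between
    ("2001", "POLL_C", 0.75, "Y"),  # POLL_B and POLL_C
]
pollutants = apply_invtable(
    [("1001", 2.0), ("1002", 3.0), ("2001", 8.0)], table_rows
)
```
]]></programlisting>

<para>In this example the two records for codes 1001 and 1002 are combined rather than flagged as duplicates, because they carry different code numbers that map to the same pollutant name.</para>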

</section>

<section id="sect_concepts_assign_point_locations">

<title>Assign point-source locations to area sources</title>

<para>Some area/nonpoint and nonroad mobile sources can be assigned point-source locations instead of being assigned spatial surrogates. At this time, the only allowable cases of this are for toxics processing for nonroad mobile and stationary area/nonpoint sources related to airport emissions processes. <command>Smkinven</command> has the capability to make such assignments for any area source from both the toxics and criteria inventories<comment>; the example file that comes with SMOKE for this purpose (the <envar>ARTOPNT</envar> file) only contains entries for the airport emissions processes. <xref linkend="sect_dirs_example_data_files" /> provides more information about the example <envar>ARTOPNT</envar> file provided with SMOKE</comment>.</para>

<para>Because airport location data are readily available, this feature gives users the ability to model airport-related area-source emissions at airport locations, as opposed to spatially allocating them to grid cells using spatial surrogates. Although spatial surrogates could be (and have been) developed to reproduce the same or similar results as the point-source assignments for models such as CMAQ, other models that SMOKE may eventually support (namely ISCST3) require that point-source inputs include the location coordinates. The area-to-point assignment feature in SMOKE will support such a need. Note that for large airports and a small grid, one may want to have the airport emissions gridded using surrogate data since the airport can be large enough to encompass multiple grid cells. Thus, the user should decide whether and when to use this area-to-point feature as opposed to gridding using an airport surrogate.</para>

<para>In SMOKE version 1.5 and higher, SMOKE can assign an area source to one or more locations in a county. For example, when multiple airports are in a county, SMOKE adds sources to the inventory and splits the county-total emissions among the locations based on a factor provided in the <envar>ARTOPNT</envar> file. All sources remain with the area-source inventory in which they started; they are not moved to a point-source inventory. Any area sources that are assigned point locations are spatially allocated using those locations, as described in <xref linkend="sect_concepts_spatial_processing" />. Sources that are not assigned point-source locations are allocated in the standard way using spatial surrogates.</para>
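<para>The split of a county total among several locations can be sketched as follows. This is a minimal illustration, assuming that each factor is the fraction of the county total assigned to a location; the function name and data layout are hypothetical and do not reflect the actual <envar>ARTOPNT</envar> file format.</para>

```python
def split_county_emissions(county_total, locations):
    """Split a county-total emissions value among point locations.

    locations is a list of (latitude, longitude, factor) tuples, where
    factor is assumed to be the fraction of the county total assigned
    to that location.
    """
    return [
        {"lat": lat, "lon": lon, "emissions": county_total * factor}
        for lat, lon, factor in locations
    ]
```

<para>Each returned entry corresponds to one source added to the area-source inventory at the given location.</para>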

</section>

<section id="sect_concepts_check_stack_params">

<title>Fill in and check point-source stack parameters</title>

<para>An additional action taken by the <command>Smkinven</command> program for point sources only is to check stack parameters and fill them in with valid values if they are missing. <command>Smkinven</command> performs the following steps on stack parameters from the annual or average-day inventory files, but not the hour-specific stack parameters file.</para>

<itemizedlist>

<listitem>
<para>If the stack parameters are zero, <command>Smkinven</command> treats them as missing values and writes warning messages.</para>
</listitem>

<listitem>
<para>If the exit velocity is missing or zero and the exit flow is not, or if the <envar>VELOC_RECALC</envar> option is set to Y in the run script, <command>Smkinven</command> automatically calculates the exit velocity using the formula:</para>

<para>velocity = flow / (&pi; * diameter<superscript>2</superscript>/4)</para>
</listitem>

<listitem>
<para>If the stack parameters are not missing or zero, <command>Smkinven</command> ensures that their values are within the allowed ranges. The range for each stack parameter is as follows:</para>

<itemizedlist spacing="compact">
<listitem><para>Height: 0.5 to 2100 meters</para></listitem>
<listitem><para>Diameter: 0.01 to 100 meters</para></listitem>
<listitem><para>Exit temperature: 260 to 2000 K</para></listitem>
<listitem><para>Exit velocity: 0.0001 to 500 m/s</para></listitem>
</itemizedlist>

<para>When a stack parameter falls outside of its associated range, <command>Smkinven</command> sets it to the top or bottom of the range, depending on whether it is higher than the upper end of the range or lower than the lower end. Note that a zero value is not treated as an out-of-range parameter, but is treated as a missing value.</para>
</listitem>

<listitem>
<para>If the stack parameters are missing or zero, <command>Smkinven</command> uses the <envar>PSTK</envar> file to assign new stack parameters using country/state/county and SCC assignments. The format of the <envar>PSTK</envar> file is described in <xref linkend="sect_input_pstk" />.</para>
</listitem>
</itemizedlist>
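<para>The velocity calculation and range checks above can be sketched in Python as follows. This is an illustration, not the <command>Smkinven</command> source code: the function names and status strings are hypothetical, the exit flow is assumed to be in cubic meters per second, and the replacement of missing values from the <envar>PSTK</envar> file is omitted.</para>

```python
import math

# Allowed ranges quoted in the list above (meters, meters, K, m/s).
RANGES = {
    "height": (0.5, 2100.0),
    "diameter": (0.01, 100.0),
    "temperature": (260.0, 2000.0),
    "velocity": (0.0001, 500.0),
}


def exit_velocity(flow, diameter):
    """velocity = flow / (pi * diameter^2 / 4)."""
    return flow / (math.pi * diameter ** 2 / 4.0)


def check_parameter(name, value):
    """Return (value, status) for one stack parameter.

    None or zero is reported as missing; out-of-range values are
    set to the nearest end of the allowed range.
    """
    if value is None or value == 0:
        return None, "missing"
    low, high = RANGES[name]
    if value > high:
        return high, "clamped"
    if low > value:
        return low, "clamped"
    return value, "ok"
```

<para>Note that a zero value is reported as missing before any range comparison takes place, so it is never reset to the bottom of the range.</para>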

</section>

<section>

<title>Convert coordinates from UTM to lat-lon</title>

<para>The ORL and FF10 point-source formats permit stack coordinates to be provided in UTM or lat-lon coordinates, and the SMOKE intermediate inventory stores the coordinates as lat-lon values. <command>Smkinven</command> converts the UTM coordinates to lat-lon coordinates using the I/O API routine <ulink url="http://www.baronams.com/products/ioapi/LL2UTM.html">UTM2LL</ulink>. Lat-lon coordinates must be provided in decimal degrees, while UTM coordinates must be provided in meters.</para>

</section>

<section>

<title>Optionally ensure that lat-lon coordinates are in the Western Hemisphere</title>

<para>For any lat-lon coordinates input to <command>Smkinven</command> (point-source locations, link coordinates, and area-to-point coordinates), SMOKE can optionally ensure that the longitude values are in the Western Hemisphere. In some cases, the data prepared for input to SMOKE do not contain the negative sign on the longitude value that indicates the Western Hemisphere. If the modeling domain is in the Western Hemisphere, then <command>Smkinven</command> will convert all positive longitudes to negative ones when the <envar>WEST_HSPHERE</envar> option is set to Y.</para>

</section>

<section>

<title>Convert units of emissions and activities</title>

<para>Different input formats in SMOKE have different emissions units. The SMOKE intermediate inventory file stores all annual emissions values from the ORL and FF10 formats in tons/year. These two formats also support average-day emissions values, which are stored separately in the SMOKE intermediate inventory in tons/day. Average-weekday emissions values are also stored in tons/year, but SMOKE sets an internal variable called TPFLAG so that later SMOKE processing steps can properly treat the computed <quote>annual</quote> value as an average-day value. The annual VMT data are stored as miles/year. In all cases, <command>Smkinven</command> converts the units of the input emissions to the units used in the SMOKE intermediate inventory. When this conversion involves a day-to-year conversion, <command>Smkinven</command> accounts for leap years by using 366 instead of 365 days in the year. See <xref linkend="sect_input_inventory" /> for information about the units required for each inventory format.</para>

<para>When emissions are provided as average-day values from the ORL and FF10 formats and these emissions are then used in later processing steps, SMOKE does not further adjust the emissions using the monthly profiles. SMOKE assumes that the average-day emissions from these two formats have been adjusted to a specific month already. In addition, when <command>Smkinven</command> is configured using the <envar>FILL_ANNUAL</envar> option to fill in missing annual values using average-day values, <command>Smkinven</command> sets the TPFLAG internal variable to indicate that monthly adjustments should not be applied and that the <quote>annual</quote> emissions should just be divided by the number of days of the year before being used. <command>Smkinven</command> makes this setting on a source-by-source basis.</para>
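<para>The leap-year handling in the day-to-year conversion can be illustrated with a short Python sketch. The function names are hypothetical, and the sketch covers only the arithmetic, not the TPFLAG bookkeeping.</para>

```python
import calendar


def days_in_year(year):
    """Return 366 for leap years and 365 otherwise."""
    return 366 if calendar.isleap(year) else 365


def tons_per_day_to_tons_per_year(tons_per_day, year):
    """Convert an average-day value to an annualized value."""
    return tons_per_day * days_in_year(year)


def tons_per_year_to_tons_per_day(tons_per_year, year):
    """Divide an annualized value by the number of days in the year."""
    return tons_per_year / days_in_year(year)
```

<para>For a source whose <quote>annual</quote> value was filled in from an average-day value, dividing by the same number of days recovers the original average-day value.</para>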

</section>

<section id="sect_concepts_set_weekday_approach">

<title>Set the weekday averaging approach</title>

<para>There are two approaches for processing weekly temporal profiles: weekly normalization and weekday normalization. Although the actual normalization happens during temporal allocation, <command>Smkinven</command> sets the approach to use for each source. In the weekly approach, the <command>Temporal</command> program normalizes the weekly temporal profiles over every day of the week. This approach is appropriate for annual-total inventories or average-day inventories. Weekday normalization normalizes over just the weekdays (i.e., Monday through Friday) in the profile, or if the profile indicates no emissions on weekdays, then over just the weekend days in the profile. This approach is appropriate only for average-weekday inventories. <xref linkend="sect_concepts_temporal_processing" /> describes the two forms of weekly profile normalization in more detail.</para>

<para>The default normalization setting for ORL and FF10 formats in <command>Smkinven</command> is weekly normalization.  The default settings can be changed using the <envar>WKDAY_NORMALIZE</envar> option in the SMOKE run scripts. When this option is set to Y, the sources will be set to instruct <command>Temporal</command> to use weekday normalization, and when this option is set to N, the sources will be set to cause weekly normalization. <command>Smkinven</command> gives a warning when an approach is being used that is inconsistent with the expected approach for the inventory format being used.</para>
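<para>The difference between the two approaches can be sketched as follows. This is only an illustration: the formulas actually used by <command>Temporal</command> are documented with that program, and the scaling chosen here (factors that average to one over the days used for normalization) is an assumption, as are the function and argument names.</para>

```python
def normalize_weekly_profile(weights, weekday_only=False):
    """Scale seven day-of-week weights (Monday first) to factors.

    weekday_only=False: normalize over all seven days.
    weekday_only=True: normalize over Monday through Friday, or over
    Saturday and Sunday if the weekday weights sum to zero.
    """
    if len(weights) != 7:
        raise ValueError("expected seven day-of-week weights")
    if not weekday_only:
        basis = weights
    elif sum(weights[:5]) > 0:
        basis = weights[:5]
    else:
        basis = weights[5:]
    total = sum(basis)
    if total == 0:
        raise ValueError("profile has no nonzero weights to normalize over")
    # Assumption: factors average to one over the normalization days.
    return [w * len(basis) / total for w in weights]
```

<para>With a profile that has equal weekday weights and no weekend emissions, the weekly approach yields weekday factors above one, while the weekday approach yields weekday factors of exactly one, which is why the latter suits average-weekday inventories.</para>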
</section>

<section id="sect_concepts_assign_countries">

<title>Assign country codes, years, and time zones</title>

<para>For all source categories, <command>Smkinven</command> assigns country codes, years, and time zones to inventory files that do not contain this information in the inventory records.</para>

<para>Because most inventory formats do not include a column for country or year, the country code and year must be provided in header fields in the inventory. Users can provide data for up to 10 countries using as many separate files, or separate headers within a single file, as needed. The inventory file formats listed in <xref linkend="sect_input_inventory" /> provide a further description of these country-setting headers and the valid country codes.</para>

<para>Users can also provide multiple inventory years in a single <command>Smkinven</command> run. This is necessary in some cases when an inventory for one region (e.g., Canada or Mexico) is unavailable for the same year as the majority of the inventory region (e.g., the U.S.). <command>Smkinven</command> stores the inventory year as part of the source attributes.</para>

<para>In addition to country codes and years, <command>Smkinven</command> assigns a time zone to each source based on the county associated with the source. <command>Smkinven</command> uses the <envar>COSTCY</envar> file, or the <envar>GEOCODE_LEVEL4</envar> file if <envar>USE_EXP_GEOCODES</envar> is set to Y, to get this information. It matches the county code from the inventory to the county code in this file. If a county code included in the inventory is missing from this file, or if the time zone is not provided in the <envar>COSTCY</envar> or <envar>GEOCODE_LEVEL4</envar> file, <command>Smkinven</command> uses the <envar>SMK_DEFAULT_TZONE</envar> setting to obtain a default time zone for sources in such a county. In some cases, the time zone set in the <envar>COSTCY</envar> or <envar>GEOCODE_LEVEL4</envar> file is an approximation, since counties that are bisected by two time zones have only the predominant zone represented in this file.</para>

</section>

<section>

<title>Handle inventories that have data for multiple years</title>

<para>In some cases, inventories may use data from multiple years; this can occur when inventory data are not available for the modeling year of interest, but data from another year are available. SMOKE can handle this case. Unless growth factors are available to grow the available data to the desired year, the SMOKE intermediate inventory may include data from two years. Usually, using data from a nearby year is preferable to not performing the modeling at all.</para>

<comment>
<para><remark><command>Smkinven</command> can import data from two years and the <command>Cntlmat</command> and <command>Grwinven</command> programs can later be used to selectively grow only one of the years. This can be used to combine inventories from two years and update one part of the inventory to have a new year, or to grow both inventories to a new consistent year.</remark></para>
</comment>

</section>

<section>

<title>Set the base year</title>

<para><command>Smkinven</command> imports all data and then sets the inventory base year in the file as the year that has the most data values. The year of the inventories is set using the #YEAR header for ORL and FF10 inputs (see <xref linkend="sect_input_inventory" /> for more information on this header and packet). Usually, the year of the data is the same for all sources, so you will know what <command>Smkinven</command> will assume is the base year. However, if multiple inventory years are provided, <command>Smkinven</command> will set the base year to the year that is associated with the largest number of sources.</para>

<para>If you provide a future-year inventory to SMOKE, you must override the base-year setting obtained from the values given by the #YEAR header or the /INVYEAR/ packet. This is accomplished with the <envar>SMK_BASEYR_OVERRIDE</envar> setting. This is necessary so that later SMOKE steps will be able to verify that the episode year is consistent with the base year.</para>
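<para>The base-year selection can be sketched in Python as follows. The function and its <literal>override</literal> argument are hypothetical stand-ins for the behavior described above, and the sketch does not define what happens when two years are tied.</para>

```python
from collections import Counter


def select_base_year(source_years, override=None):
    """Pick the inventory base year.

    source_years holds one inventory year per source. override mimics
    an explicit user setting and, when given, takes precedence.
    """
    if override is not None:
        return override
    counts = Counter(source_years)
    # most_common(1) returns the (year, count) pair with the largest count.
    return counts.most_common(1)[0][0]
```

<para>For example, an inventory in which most sources carry one year and a few carry another resolves to the majority year unless an override is supplied.</para>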

</section>

<section>

<title>Report results of import step</title>

<para>The <command>Smkinven</command> program writes several reports to help you determine what it has done.</para>

<orderedlist>
<listitem>
<para>A summary by CAS number of the emissions in tons/year. This includes the number of inventory records; whether all, some, or none of the pollutants associated with that CAS code are kept; and the CAS description. This report is written to a special report file (<envar>REPINVEN</envar>).</para>
</listitem>
<listitem>
<para>A summary of emissions by CAS number and pollutant before and after the application of disaggregation factors based on the inventory table. This report shows, for instance, how chromium emissions get split out into different pollutants. The report is written to the <envar>REPINVEN</envar> file.</para>
</listitem>
<listitem>
<para>A listing of the first 10 nonheader records in each inventory file. This information is written to the program log file.</para>
</listitem>
<listitem>
<para>For area-to-point conversions (from <xref linkend="sect_concepts_assign_point_locations" />), a list of the SCCs that were converted, and the section number of the <envar>ARTOPNT</envar> input file used to assign the point locations. This information is written to the program log file.</para>
</listitem>
<listitem>
<para>For area-to-point conversions, a list of any SCCs included in the <envar>ARTOPNT</envar> input file that do not also appear in the inventory. This report is written to the <envar>REPINVEN</envar> file.</para>
</listitem>
<listitem>
<para>For area-to-point conversions, a summary of emissions totals by SCC and pollutant before and after the factors are applied, and the total number of country/state/county codes affected. A separate summary is provided that reports emissions by state, SCC, and pollutant, but otherwise has the same information. These reports are written to the <envar>REPINVEN</envar> file.</para>
</listitem>
<listitem>
<para>For area-to-point conversions, a summary by SCC of the number of country/state/county codes being assigned area-to-point factors, and the number not being assigned those factors. This report is written to the <envar>REPINVEN</envar> file.</para>
</listitem>
<listitem>
<para>For import of CEM data, a summary of ORIS IDs and boiler IDs from the CEM data that have been matched to the inventory. SMOKE allows for multiple plant descriptions and country/state/county codes for a single ORIS ID that may be identical in their characteristics but have different boiler IDs. They are treated as separate sources by matching boiler IDs. This report is written to the <envar>REPINVEN</envar> file.</para>
</listitem>
</orderedlist>

<para>More information about these reports can be found in <xref linkend="sect_output_repinven" />.</para>

</section>

<section>

<title>Import day-specific and hour-specific data</title>

<para>Another function of <command>Smkinven</command> that applies only to point sources is importing day- and hour-specific data. It can read the following ASCII input formats:</para>

<itemizedlist>
<listitem>
<para><emphasis role="bold">CEM data format:</emphasis> The CEM data format provides standard local time hourly emissions by ORIS identification code and boiler number. It was developed in conjunction with the Market Trading Division of EPA, and that division now provides this ASCII file format when the data are released.</para>
</listitem>
<listitem>
<para><emphasis role="bold">FF10 format:</emphasis> The FF10 day- and hour-specific format can be input to SMOKE using either the FF10 annual or average-day inventories.</para>
</listitem>

</itemizedlist>

<para>In addition to reading the files, <command>Smkinven</command> ensures that the records listed in day- and hour-specific files match records provided in the annual inventory inputs. This is necessary because SMOKE builds its list of all point sources based on the annual and/or average-day inventory files, and these are the only sources that will be listed in the SMOKE intermediate inventory file. <command>Smkinven</command> requires that the year associated with the date of the day-specific and hour-specific emissions is the same as the base year. <command>Smkinven</command> generates an error message when it cannot match a day- or hour-specific source with an inventory source. Please refer to <xref linkend="sect_programs_smkinven" /> for more information about how <command>Smkinven</command> can be used to import these data.</para>

<para><command>Smkinven</command> can also process the hourly CEM data in a more sophisticated way: hourly heat input from the CEM data is used to allocate annual emissions to hourly emissions. To use this approach, the utility program <command>CEMScan</command> must be run before <command>Smkinven</command>; see <xref linkend="sect_utilities_cemscan" /> for more information.</para>

</section>

<section id="sect_concepts_hour_specific_cem">

<title>Processing hour-specific CEM data</title>

<para><command>Smkinven</command> can allocate annual emissions to hourly data using the hourly heat input from the standard local time CEM data. <command>Smkinven</command> must match the ORIS/boiler combinations that appear in the standard local time CEM data to sources in the annual emissions inventory. <command>Smkinven</command> will first skip any ORIS/boiler combinations in the input CEM data that do not appear in the summary list (<envar>CEMSUM</envar>) created by CEMScan; it is important that users ensure that their input CEM data and CEM summary file remain consistent. <command>Smkinven</command> will also skip any CEM ORIS IDs that are not in the inventory and also any ORIS and boiler combinations that are not in the inventory. Note that the inventory may contain sources with valid ORIS IDs but blank boiler codes; none of these sources will be matched to the CEM data. The format of the CEM hour-specific data is shown in <xref linkend="tbl_input_pthour_cem" />.</para>

<para><emphasis role="bold">Data Check:</emphasis> Before calculating the hourly emissions, <command>Smkinven</command> first checks whether the hourly NOx emissions from the CEM data are zero or null. If so, <command>Smkinven</command> also checks whether the hourly heat input, gross load, and steam load values are zero or null; if they are, all output values are set to zero and a warning is written to the <command>Smkinven</command> log file.</para>

<para>Special handling is needed when processing SO<subscript>2</subscript> and NO<subscript>x</subscript> emissions. If the summed annual CEM SO<subscript>2</subscript> and NO<subscript>x</subscript> emissions for a particular ORIS/boiler combination are zero or null, <command>Smkinven</command> will calculate hourly emissions based on the SO<subscript>2</subscript> and NO<subscript>x</subscript> emissions in the annual inventory rather than using the hourly CEM data. If the summed annual SO<subscript>2</subscript> and NO<subscript>x</subscript> emissions are valid but a particular hour is missing the hourly SO<subscript>2</subscript> and NO<subscript>x</subscript> emissions, <command>Smkinven</command> will set the hourly SO<subscript>2</subscript> and NO<subscript>x</subscript> emissions to zero and write a warning to the log file.</para>

<para><emphasis role="bold">Calculation:</emphasis> When calculating hourly emissions, <command>Smkinven</command> must use data from the annual inventory. If any of the matching inventory sources are missing the annual emissions value, the program will exit with an error. <command>Smkinven</command> will start with the hourly emissions from the CEM data and then disaggregate the emissions to the matching inventory sources as follows:</para>

<para><emphasis>Hourly NOx emissions for source <emphasis>i</emphasis> = (annual NOx emissions for source <emphasis>i</emphasis> / summed annual NOx emissions for all matching sources ) * hourly CEM NOx emissions * pounds to tons conversion</emphasis></para>

<para>If the summed annual emissions for all matching sources are zero, the hourly CEM emissions will be distributed evenly to the matching inventory sources. For all other pollutants, the hourly emissions are calculated as:</para>

<para><emphasis>Hourly emissions for source <emphasis>i</emphasis> = annual factor * annual emissions for source <emphasis>i</emphasis></emphasis></para>

<para>The annual factor in the above calculation will preferentially be:</para>

<para><emphasis>Annual factor = hourly heat input for ORIS/boiler / annual summed heat input for ORIS/boiler</emphasis></para>

<para>If heat input data are not available, <command>Smkinven</command> will fall back to steam load followed by gross load. The heat input values do not need to be disaggregated to the matching inventory sources because the same disaggregation factor would be used for both the hourly heat input and summed heat input.</para>
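<para>The emissions calculations above can be sketched in Python as follows. This is an illustration rather than the <command>Smkinven</command> implementation: the function names are hypothetical, and the pounds-to-tons conversion is assumed to use short tons (2000 pounds per ton).</para>

```python
LB_PER_TON = 2000.0  # assumption: short tons


def hourly_nox_by_source(annual_nox, hourly_cem_nox_lb):
    """Disaggregate hourly CEM NOx (pounds) to the matching sources.

    annual_nox maps source id to annual inventory NOx emissions.
    Returns hourly NOx in tons for each source.
    """
    total = sum(annual_nox.values())
    hourly_tons = hourly_cem_nox_lb / LB_PER_TON
    if total == 0:
        # Zero summed annual emissions: distribute evenly.
        share = 1.0 / len(annual_nox)
        return {src: share * hourly_tons for src in annual_nox}
    return {src: (value / total) * hourly_tons
            for src, value in annual_nox.items()}


def annual_factor(hourly_activity, annual_activity):
    """Hourly heat input (or steam load, or gross load) over its annual sum."""
    return hourly_activity / annual_activity


def hourly_other_pollutant(annual_emissions, factor):
    """Hourly emissions = annual factor * annual emissions."""
    return factor * annual_emissions
```

<para>Because the annual factor is a ratio of values for the same ORIS/boiler combination, it can be applied directly to each matching inventory source without first disaggregating the heat input.</para>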

<para>The <envar>FLOW_RATE_FACTOR</envar> setting is used to calculate hourly flow rates from hourly heat input when reading CEM data. If <envar>FLOW_RATE_FACTOR</envar> is set to zero or unset, then <command>Smkinven</command> will not calculate hourly flow rates. Otherwise, the hourly flow rate is calculated as:</para>

<para><emphasis>Hourly flow rate (m<superscript>3</superscript>/s) = [<envar>FLOW_RATE_FACTOR</envar> (ft<superscript>3</superscript>/MMBTU) * hourly heat input (MMBTU/hr) * 0.02831 m<superscript>3</superscript>/ft<superscript>3</superscript>] / 3600 s/hr</emphasis></para>

<para><command>Smkinven</command> then needs to assign the ORIS/boiler-level hourly flow rate to the matching inventory sources. To do this, it sums the flow rate for sources with the same stack. <command>Smkinven</command> uses the plant ID and stack ID to determine which sources feed into the same stack. Script setting information is available at <xref linkend="sect_programs_smkinven_envar" />.</para>

<para><emphasis role="bold">Note:</emphasis> Running <command>Smkinven</command> for an entire year's worth of CEM data is not recommended because the memory requirements are large. Instead, users should process only the CEM data for their episode of interest. This may require multiple runs of <command>Smkinven</command> to break up large episodes.</para>

</section>

</section>

<section id="sect_concepts_temporal_processing">

<title>Temporal processing</title>

<para>The temporal allocation of emission inventory data always occurs after the inventory import processing previously described in <xref linkend="sect_concepts_inventory_import" />. The <command>Temporal</command> program processes data for anthropogenic sources, while the <command>Tmpbeis3</command> or <command>Tmpbeis4</command> program allocates biogenic emissions. In this section, we focus on the temporal allocation of the anthropogenic inventories using <command>Temporal</command>. The biogenic processing is further described in <xref linkend="sect_concepts_biogenic_processing" />.</para>

<para>The primary purpose of the <command>Temporal</command> program is to create an intermediate hourly emissions file (<envar>ATMP</envar>, <envar>MTMP</envar>, or <envar>PTMP</envar>). It also creates a supplementary intermediate file that indicates which monthly, weekly (day-of-week), and diurnal (hourly) profiles were assigned to each source (<envar>ATSUP</envar>, <envar>MTSUP</envar>, or <envar>PTSUP</envar>). Because <command>Temporal</command> dynamically creates the names of its output files, the environment variables <envar>[A|M|P]TMPNAME</envar> and <envar>[A|M|P]TSUPNAME</envar> are used to set the directory and file prefix for naming the output files <envar>[A|M|P]TMP</envar> and <envar>[A|M|P]TSUP</envar>. The files are named using the starting date of each time period. For example, if <envar>ATMPNAME</envar> is set to /data/ntmp.nctox., then the <envar>ATMP</envar> file for a given time period will be put in the data directory and named ntmp.nctox.[start date].ncf.</para>
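<para>The naming rule in the example above amounts to simple string concatenation, as in the following Python sketch. The function name is hypothetical, and because the format of the start date is not specified here, the date is passed through as an opaque string.</para>

```python
def tmp_file_name(name_prefix, start_date):
    """Build an hourly-file path from a prefix and a start-date string.

    name_prefix plays the role of a setting such as ATMPNAME and is
    expected to end with the separator that precedes the date.
    """
    return f"{name_prefix}{start_date}.ncf"
```

<para>Note that the trailing period in the prefix matters: without it, the date would run directly into the preceding part of the file name.</para>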

<para>The temporal processing operation applies factors based on the source characteristics to the emissions data from the SMOKE inventory files. These factors can include monthly, weekly, and diurnal temporal profiles. The resulting emissions data vectors (not a matrix) contain hourly emissions for the inventory species. SMOKE assumes an hourly time step; although the time step is an input setting to SMOKE, it currently cannot be changed. Most of the calculations are implemented as sparse-matrix algebra based upon temporal cross-references and profiles, augmented by the substitution of values from day- and hour-specific emissions data sets. For mobile sources, hourly emissions values also depend on meteorology (e.g., the temperature dependence of evaporative emissions).</para>

<para><xref linkend="fig_concepts_temporal_processing" /> shows how data from the intermediate inventory are stored in the hourly file. The arrow represents the temporal processing steps which convert the annual, average-day, or day- and hour-specific data to hourly data. After the temporal processing, the hourly emissions are stored in the intermediate hourly file, by hour and source number. The emissions are stored in the same order as the sources in the sorted intermediate inventory file.</para>

<figure id="fig_concepts_temporal_processing">
<title>Transformation of inventory data to hourly data</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/temporal_processing_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/temporal_processing_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para>Temporal processing also addresses the following issues that need to be considered during emissions processing:</para>


<orderedlist>
<listitem>
<para>Using annual or average-day data when both are available in the inventory</para>
</listitem>
<listitem>
<para>Applying monthly, weekly, and diurnal profiles</para>
</listitem>
<listitem>
<para>Using day- and hour-specific emissions</para>
</listitem>
<listitem>
<para>Time zone adjustments</para>
</listitem>
<listitem>
<para>Holiday processing</para>
</listitem>
<listitem>
<para>Monday-weekday-Saturday-Sunday (MWSS) processing</para>
</listitem>
<listitem>
<para>Processing non-sequential dates</para>
</listitem>
<listitem>
<para>Creating the intermediate files</para>
</listitem>
</orderedlist>

<para>In the subsections below, we address each of these issues in the same order as the list above.</para>

<section>

<title>Using annual or average-day data</title>

<para>When using the ORL or FF10 inventory format, you may choose to use either the annual or the average-day emissions values when running the <command>Temporal</command> program. The default is to use the annual data. To apply average-day data instead, the <envar>SMK_AVEDAY</envar> setting is used (this setting is relevant only for ORL or FF10 inventories). These emissions are then used in the merge-processing step, resulting in model-ready emissions that depend on the data type selected. If part of your inventory is available as average-day data and part is available as annual data, you should first use <command>Smkinven</command> to fill in annual values based on the average-day values, and then run the <command>Temporal</command> program using annual values. <command>Temporal</command> ensures that for those sources for which <quote>annual</quote> values were created, only day-of-week and hourly adjustments (not monthly profiles) are applied. <command>Temporal</command> assumes that any average-day data provided have already been adjusted for a specific month, and therefore does not apply the monthly profiles to them.</para>

</section>

<section>

<title>Applying monthly, weekly, and diurnal profiles</title>

<para>The <command>Temporal</command> program uses a cross-reference file and a temporal profiles file to assign the temporal profiles to the inventory sources. The cross-reference file can assign the profiles using a hierarchy that is based on the source characteristics. A detailed list of valid assignments is given in <xref linkend="sect_programs_temporal" /> with the detailed description of the <command>Temporal</command> program. The cross-reference assigns a monthly, weekly, and diurnal profile to each source, and can also assign default profiles to sources that don&rsquo;t have more specific matches in the cross-reference file. When default assignments are made by the <command>Temporal</command> program, you can optionally choose to have warnings given for all of these assignments, using the <envar>REPORT_DEFAULTS</envar> option.</para>

<para>The temporal profiles file has different sections that contain the monthly, weekly, and diurnal profiles. Additionally, you may provide different weekend (Saturday, Sunday) and weekday (Monday through Friday) diurnal profiles, or you may provide a different set of diurnal profiles for each day of the week.</para>

<para>When using multiple diurnal profiles, a given source must use a diurnal profile with the same profile code for each day. For example, if a source uses diurnal profile 1 for Monday, it will also use profile 1 for Tuesday, but Monday&rsquo;s profile 1 can be different from Tuesday&rsquo;s profile 1.</para>

<para>For weekly profiles, <command>Temporal</command> evaluates the TPFLAG setting (set by <command>Smkinven</command> as explained in <xref linkend="sect_concepts_set_weekday_approach" />) and uses the weekly-normalized weekly profiles or weekday-normalized weekly profiles, depending on the setting. <comment>The formulas used are described in <xref linkend="sect_programs_temporal" />.</comment></para>

<para>One limitation of using monthly, weekly, and diurnal profiles is that there is no week-to-week variation within a month. For example, all Tuesdays in a month use the same emissions profile (unless one happens to be a holiday, see <xref linkend="sect_concepts_holiday_processing" />). This limitation can be overcome by using day-specific emissions data (only for point sources).</para>

</section>

<section>

<title>Using day- and hour-specific emissions</title>

<para>SMOKE uses the most specific data available for any given source. For point sources, <command>Temporal</command> will read in the day-specific and hour-specific emissions and use these instead of the annual or average-day emissions that have been adjusted to hourly values. If both day-specific and hour-specific data are available for the same source, <command>Temporal</command> will use the hour-specific data. The settings that control the use of these emissions are <envar>HOUR_SPECIFIC_YN</envar> and <envar>DAY_SPECIFIC_YN</envar>.</para>

</section>

<section id="sect_concepts_time_zones">

<title>Time zone adjustments</title>

<para>You can use the <envar>OUTZONE</envar> setting to control the output time zone that defines the hours in the output files. The one constraint on this setting is that it must be consistent with the time zone of the meteorology files. These files are used only when processing on-road mobile sources with MOVES/MOBILE6 (optional), processing elevated point sources for the CMAQ-based approach (required), or processing biogenic emissions with SMOKE BEIS3 (required). The time zones must be consistent among all source categories processed, so if one source category depends on the meteorology file, then all source categories must be processed with the same <envar>OUTZONE</envar> setting.</para>

<para><command>Temporal</command> compares the <envar>OUTZONE</envar> value with the time zone of the source, which was set in <command>Smkinven</command> (see <xref linkend="sect_concepts_assign_countries" />) based on the county. Additionally, it determines whether the date being processed falls within the range of Daylight Saving Time, and whether the county of the source observes Daylight Saving Time. <command>Temporal</command> uses the <envar>COSTCY</envar> file or the <envar>GEOCODE_LEVEL4</envar> file (if <envar>USE_EXP_GEOCODES</envar> is set to Y) to determine which counties observe Daylight Saving Time and which do not; for example, most of Arizona does not observe it. Using these pieces of information, <command>Temporal</command> interprets the diurnal profiles as local-time profiles in order to map the correct adjustment to the correct output hour in the output time zone. <command>Temporal</command> also uses the time zone of the source and the output time zone to determine the correct hour for switching from one month to the next and from one day of the week to the next.</para>
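<para>A minimal sketch of the hour mapping is shown below. It assumes whole-hour time zones expressed in the SMOKE sign convention (hours west of GMT are positive) and treats Daylight Saving Time as a simple flag; the function name and arguments are hypothetical and do not correspond to <command>Temporal</command> source code.</para>

```python
def shift_profile_to_output_zone(local_profile, source_zone, outzone,
                                 dst_in_effect=False):
    """Rotate a 24-value local-time diurnal profile into output-zone hours.

    source_zone and outzone use the SMOKE convention: positive values are
    hours west of GMT (for example, 5 for Eastern Standard Time).
    """
    if len(local_profile) != 24:
        raise ValueError("a diurnal profile needs 24 hourly values")
    # Daylight saving time advances local clocks by one hour, which
    # lowers the zone value by one in this sign convention.
    effective_zone = source_zone - 1 if dst_in_effect else source_zone
    shifted = [0.0] * 24
    for local_hour, value in enumerate(local_profile):
        output_hour = (local_hour + effective_zone - outzone) % 24
        shifted[output_hour] = value
    return shifted


# Midnight Eastern Standard Time lands on hour 5 of a GMT output file.
profile = [float(hour) for hour in range(24)]
gmt_profile = shift_profile_to_output_zone(profile, source_zone=5, outzone=0)
```

<para>Because the profile is only rotated, the 24 hourly values are preserved; what changes is the output hour to which each value is assigned.</para>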

<para><xref linkend="tbl_concepts_outzone" /> lists a sampling of <envar>OUTZONE</envar> settings and the time zones that they represent. Note that SMOKE expects <envar>OUTZONE</envar> to be set as a positive number for time zones in the Western Hemisphere, although standard notation would list these as negative values. For example, Eastern Standard Time is listed in this table as -5:00 hours from GMT, but <envar>OUTZONE</envar> for EST in SMOKE is 5. One result of this implementation is that SMOKE does not work perfectly for time zones east of GMT.</para>

<table id="tbl_concepts_outzone">
<?dbfo table-width="6in"?>
<title>Example <envar>OUTZONE</envar> settings and their associated time zones</title>

<tgroup cols="4">
<colspec colwidth="1*" />
<colspec colwidth="1*" />
<colspec colwidth="1*" />
<colspec colwidth="3*" />

<thead>
<row>
<entry align="center"><envar>OUTZONE</envar></entry>
<entry align="center">GMT</entry>
<entry align="center">Time zone</entry>
<entry align="center">Description</entry>
</row>
</thead>

<tbody>
<row>
<entry>12</entry>
<entry>-12:00</entry>
<entry>BIT</entry>
<entry>Baker Island Time</entry>
</row>
<row>
<entry>11</entry>
<entry>-11:00</entry>
<entry>SST</entry>
<entry>Samoa Standard Time</entry>
</row>
<row>
<entry>10</entry>
<entry>-10:00</entry>
<entry>HST</entry>
<entry>Hawaii Standard Time</entry>
</row>
<row>
<entry>9</entry>
<entry>-9:00</entry>
<entry>AKST</entry>
<entry>Alaska Standard Time</entry>
</row>
<row>
<entry>8</entry>
<entry>-8:00</entry>
<entry>PST</entry>
<entry>Pacific Standard Time</entry>
</row>
<row>
<entry>7</entry>
<entry>-7:00</entry>
<entry>MST</entry>
<entry>Mountain Standard Time</entry>
</row>
<row>
<entry>6</entry>
<entry>-6:00</entry>
<entry>CST</entry>
<entry>Central Standard Time</entry>
</row>
<row>
<entry>5</entry>
<entry>-5:00</entry>
<entry>EST</entry>
<entry>Eastern Standard Time</entry>
</row>
<row>
<entry>4</entry>
<entry>-4:00</entry>
<entry>AST</entry>
<entry>Atlantic Standard Time</entry>
</row>
<row>
<entry>3</entry>
<entry>-3:00</entry>
<entry>ART</entry>
<entry>Argentina Time</entry>
</row>
<row>
<entry>2</entry>
<entry>-2:00</entry>
<entry>FNT</entry>
<entry>Fernando de Noronha Time</entry>
</row>
<row>
<entry>1</entry>
<entry>-1:00</entry>
<entry>EGT</entry>
<entry>Eastern Greenland Time</entry>
</row>
<row>
<entry>0</entry>
<entry>0:00</entry>
<entry>GMT</entry>
<entry>Greenwich Mean Time</entry>
</row>
<row>
<entry>-1</entry>
<entry>1:00</entry>
<entry>CET</entry>
<entry>Central European Time</entry>
</row>
<row>
<entry>-2</entry>
<entry>2:00</entry>
<entry>EET</entry>
<entry>Eastern European Time</entry>
</row>
<row>
<entry>-3</entry>
<entry>3:00</entry>
<entry>MSK</entry>
<entry>Moscow Time</entry>
</row>
<row>
<entry>-4</entry>
<entry>4:00</entry>
<entry>GST</entry>
<entry>Gulf Standard Time</entry>
</row>
<row>
<entry>-5</entry>
<entry>5:00</entry>
<entry>PKT</entry>
<entry>Pakistan Standard Time</entry>
</row>
<row>
<entry>-6</entry>
<entry>6:00</entry>
<entry>BST</entry>
<entry>Bangladesh Standard Time</entry>
</row>
<row>
<entry>-7</entry>
<entry>7:00</entry>
<entry>THA</entry>
<entry>Thailand Time</entry>
</row>
<row>
<entry>-8</entry>
<entry>8:00</entry>
<entry>HKT</entry>
<entry>Hong Kong Time</entry>
</row>
<row>
<entry>-9</entry>
<entry>9:00</entry>
<entry>KST</entry>
<entry>Korean Standard Time</entry>
</row>
<row>
<entry>-10</entry>
<entry>10:00</entry>
<entry>AET</entry>
<entry>Australian Eastern Time</entry>
</row>
<row>
<entry>-11</entry>
<entry>11:00</entry>
<entry>AEDT</entry>
<entry>Australian Eastern Daylight Time</entry>
</row>
<row>
<entry>-12</entry>
<entry>12:00</entry>
<entry>FJT</entry>
<entry>Fiji Time</entry>
</row>
<row>
<entry>-13</entry>
<entry>13:00</entry>
<entry>NZDT</entry>
<entry>New Zealand Daylight Time</entry>
</row>
<row>
<entry>-14</entry>
<entry>14:00</entry>
<entry>LNT</entry>
<entry>Line Islands Time</entry>
</row>
</tbody>
</tgroup>
</table>
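<para>The sign convention can be summarized with a small helper. The function below is a hypothetical convenience for scripts, not part of SMOKE, and it assumes whole-hour offsets within the range covered by the table.</para>

```python
def outzone_from_utc_offset(utc_offset_hours):
    """Convert a standard GMT offset (for example, -5 for EST) to OUTZONE."""
    if utc_offset_hours != int(utc_offset_hours):
        raise ValueError("only whole-hour offsets are handled in this sketch")
    offset = int(utc_offset_hours)
    if offset > 14 or -12 > offset:
        raise ValueError("offset is outside the range -12 to 14 hours")
    # SMOKE uses positive values west of GMT, so the sign is flipped.
    return -offset


print(outzone_from_utc_offset(-5))  # Eastern Standard Time
```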

</section>

<section id="sect_concepts_holiday_processing">

<title>Holiday processing</title>

<para>Holidays that fall on weekdays can have different activity patterns than regular weekdays because commercial and commuting activities are altered. <command>Temporal</command> uses the <envar>HOLIDAYS</envar> file to identify which user-defined dates should be processed as holidays and to determine which day of the week (Saturday or Sunday) to use to model the holiday. Usually, holidays are modeled as if they were a Sunday; this attempts to account for things like plants being closed and traffic patterns being different. Of course, you can also set up specific temporal profile inputs for a specific holiday and model that day separately, but this is not automated in SMOKE. Users therefore simply pick Saturday or Sunday as an alternative date treatment in the hope that it will somewhat better represent the emissions than using a weekday.</para>
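<para>The day-of-week substitution can be illustrated as follows. The dictionary used here is a hypothetical in-memory stand-in for the contents of the <envar>HOLIDAYS</envar> file; the sketch does not parse the actual file format.</para>

```python
from datetime import date

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")


def effective_day_name(day, holidays):
    """Return the day of the week whose profiles are applied to a date.

    holidays maps a holiday date to the weekend day ("Saturday" or
    "Sunday") chosen to represent it.
    """
    if day in holidays:
        return holidays[day]
    return DAY_NAMES[day.weekday()]


# Model Monday, July 4, 2005 as if it were a Sunday.
holidays = {date(2005, 7, 4): "Sunday"}
print(effective_day_name(date(2005, 7, 4), holidays))
```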

<para>Holiday processing happens automatically if you are running the <command>Temporal</command> program for every day of your episode. If you are using the Monday-weekday-Saturday-Sunday approach described in <xref linkend="sect_concepts_mwss_processing" />, then additional scripting steps must be taken to ensure that the holidays are modeled properly. In particular, for inventories that span multiple time zones and/or when the output time zone is not the same as at least one time zone in the inventory, the scripts must be configured to model the holiday and the next day, so that the final hours in the holiday that are west of the output time zone will be included in the next day&rsquo;s file.</para>

</section>

<section id="sect_concepts_mwss_processing">

<title>Monday, weekday, Saturday, Sunday processing</title>

<para>The temporal variation of the days in a week is the same from week to week within a month because SMOKE uses weekly profiles. Also, the weekly profiles commonly do not vary from Monday through Friday. Consequently, it is often desirable for long (e.g., annual) simulations to use a Monday-weekday-Saturday-Sunday (MWSS) approach. With this approach, SMOKE computes emissions for a representative Monday, weekday, Saturday, and Sunday within each month. The representative days cannot be the first day of the month (to prevent effects from the previous month from being included in the emissions data), a holiday (as set in the <envar>HOLIDAYS</envar> file), or the day after a holiday. Monday is distinguished from other weekdays because in multi-time zone cases in the Western Hemisphere with <envar>OUTZONE</envar> set to 0 (i.e., GMT), a few late-night Sunday hours are included in the hours at the start of the Monday file. In addition, you must process the holidays and the days after holidays as separate runs for all holidays set by the <envar>HOLIDAYS</envar> file. During merging of emissions, the Monday, weekday, Saturday, and Sunday files are reused to create model-ready emissions for every day and hour needed. The holiday and day-after files are used for the holiday and day-after dates only.</para>
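<para>The selection rules for representative days can be sketched as below. The sketch makes two assumptions that are not stated in this manual: it picks the first eligible date of each type, and it treats Tuesday through Friday as the <quote>weekday</quote> type. The function name is hypothetical; in practice this logic lives in the run scripts.</para>

```python
from datetime import date, timedelta


def representative_days(year, month, holidays):
    """Pick the first eligible Monday, weekday, Saturday, and Sunday."""
    # Holidays and the day after each holiday are never representative.
    excluded = set(holidays) | {h + timedelta(days=1) for h in holidays}
    picks = {}
    day = date(year, month, 2)  # the first day of the month is not eligible
    while day.month == month:
        if day not in excluded:
            weekday = day.weekday()
            if weekday == 0:
                label = "Monday"
            elif weekday == 5:
                label = "Saturday"
            elif weekday == 6:
                label = "Sunday"
            else:
                label = "weekday"
            picks.setdefault(label, day)  # keep the earliest eligible date
        day += timedelta(days=1)
    return picks


july = representative_days(2005, 7, holidays=[date(2005, 7, 4)])
```

<para>With July 4, 2005 listed as a holiday, both July 4 and July 5 are skipped, so the representative Monday in this sketch falls in the following week.</para>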

<para>This approach relies heavily on scripting to select dates for each month&rsquo;s representative days and to ensure that the days that are run are consistent with the dates in the <envar>HOLIDAYS</envar> file. Scripting is also responsible for ensuring that the correct files are merged together by <command>Smkmerge</command> to create the model-ready files for all days of the episode. <comment>These scripting issues are described in more detail in <xref linkend="ch_scripts" />.</comment></para>

<para>There are some situations in which you cannot use the MWSS approach. In particular, on-road processing with MOVES must be run for all days of the episode, although some steps can be sped up by using the meteorology averaging approach. Additionally, biogenic emissions processing obtains all of its temporal variation from the meteorology data, and therefore must be run for all days. Fortunately, biogenic emissions processing is very fast compared to processing for other source categories.</para>

</section>


<section id="sect_concepts_non_seq_processing">

<title>Processing non-sequential dates</title>

<para>Previously, <command>Temporal</command> processed a single continuous time period during each execution of the program, producing one output file. Typically, only a representative Monday, weekday, Saturday, and Sunday, plus any holidays, are processed for a single month. This type of processing can require complex scripting, and <command>Temporal</command> would need to be run several times. <comment>See <xref linkend="sect_scripts_change_episode" />, <xref linkend="sect_scripts_duration_output" /> and <xref linkend="sect_scripts_non_seq_dates" />.</comment></para>

<para>SMOKE now provides the option of processing non-sequential dates during a single execution of <command>Temporal</command>, using a new input file, <envar>PROCDATES</envar>, that lists the dates <command>Temporal</command> should process. The format for <envar>PROCDATES</envar> is described in <xref linkend="sect_input_procdates" />. The format allows for blank lines, comment lines (any lines that start with a pound sign), and trailing comments (any characters after an exclamation point).</para>

<para><envar>G_TSTEP</envar> will be used to set the time step for all time periods. The output data for each time period will be written to an individual file.</para>

<para>An example <envar>PROCDATES</envar> file is shown below. In this case, we are processing the first day of each month in 2005. Twelve output files will be produced, and each file will contain 25 time steps.</para>

<itemizedlist>
<listitem># First day of each month</listitem>
<listitem>20050101 0 250000 ! January</listitem>
<listitem>20050201 0 250000 ! February</listitem>
<listitem>20050301 0 250000 ! March</listitem>
<listitem>20050401 0 250000 ! April</listitem>
<listitem>20050501 0 250000 ! May</listitem>
<listitem>20050601 0 250000 ! June</listitem>
<listitem>20050701 0 250000 ! July</listitem>
<listitem>20050801 0 250000 ! August</listitem>
<listitem>20050901 0 250000 ! September</listitem>
<listitem>20051001 0 250000 ! October</listitem>
<listitem>20051101 0 250000 ! November</listitem>
<listitem>20051201 0 250000 ! December</listitem>
</itemizedlist>
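<para>To illustrate how such a file could be read, the following sketch strips comments and computes the number of time steps for each period. It interprets the three fields as start date (YYYYMMDD), start time, and run length (HHMMSS), which is an interpretation of the example above rather than the format definition in <xref linkend="sect_input_procdates" />. The one-hour default time step and all function names are assumptions.</para>

```python
def hhmmss_to_seconds(value):
    """Convert an integer in HHMMSS form (for example, 250000) to seconds."""
    hours, rest = divmod(int(value), 10000)
    minutes, seconds = divmod(rest, 100)
    return hours * 3600 + minutes * 60 + seconds


def parse_procdates(text, tstep=10000):
    """Return (start_date, start_time, n_steps) for each period listed."""
    periods = []
    for raw in text.splitlines():
        line = raw.split("!", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("#"):  # skip blank and comment lines
            continue
        start_date, start_time, run_length = line.split()[:3]
        steps = hhmmss_to_seconds(run_length) // hhmmss_to_seconds(tstep)
        periods.append((start_date, int(start_time), steps))
    return periods


example = """# First day of each month

20050101 0 250000 ! January
20050201 0 250000 ! February
"""
for period in parse_procdates(example):
    print(period)
```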

<para><command>Temporal</command> produces two output files, <envar>[A|M|P]TMP</envar> and <envar>[A|M|P]TSUP</envar>. Because <command>Temporal</command> must create the output file names dynamically, two new environment variables, <envar>[A|M|P]TMPNAME</envar> and <envar>[A|M|P]TSUPNAME</envar>, set the directory and file prefix used to name the output files. The files are named using the starting date of each time period. For example, if <envar>ATMPNAME</envar> is set to /data/ntmp.nctox. (note the period at the end of the file string), the <envar>ATMP</envar> file for a given time period will be put in the <envar>STATIC</envar> directory and named "ntmp.nctox.&lt;start date&gt;.ncf".</para>

</section>

<section>

<title>Creating the intermediate files</title>

<para>Finally, the <command>Temporal</command> program must output intermediate files. The hourly emissions are written to the <envar>ATMP</envar>, <envar>MTMP</envar>, and <envar>PTMP</envar> I/O API files. Unlike the other major SMOKE intermediate files (e.g., the matrices), the actual emissions (not just factors) are written to these files. This is because day-specific and hour-specific emissions can be impossible to convert into factors: the annual inventory emissions for the day- or hour-specific sources could be zero, and no factor applied to zero can produce a nonzero value.</para>

<para>If <command>Temporal</command> has more than 120 variables to output (the limit for the number of variables in an I/O API file), <command>Temporal</command> opens as many files (using the FileSetAPI wrapper) as are needed to store the data. SMOKE also estimates how large the output files will be using 120 variables per file and automatically lowers the number of variables that will be put in each I/O API output file to ensure that the files use less than 2 GB of disk space. In addition, <command>Temporal</command> writes the supplementary files <envar>ATSUP</envar>, <envar>MTSUP</envar>, or <envar>PTSUP</envar>, which contain the temporal profiles assigned to each source. The structures of the SMOKE intermediate files output by <command>Temporal</command> are provided in <xref linkend="sect_intmed_temporal" />.</para>
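<para>The file-splitting logic can be approximated as shown below. How SMOKE estimates the size of each variable is not described here, so <literal>bytes_per_var</literal> is simply an input to this sketch, and the 2 GB limit is taken to be 2 &#215; 1024<superscript>3</superscript> bytes; both are assumptions, as is the function name.</para>

```python
import math

MAX_VARS_PER_FILE = 120        # I/O API limit on variables per file
MAX_FILE_BYTES = 2 * 1024**3   # assumed meaning of "2 GB"


def plan_output_files(n_vars, bytes_per_var):
    """Return (variables per file, number of files) for a set of outputs."""
    if n_vars == 0 or not bytes_per_var > 0:
        raise ValueError("need at least one variable of positive size")
    # Largest variable count that keeps a file strictly under the limit.
    fit = max(1, (MAX_FILE_BYTES - 1) // bytes_per_var)
    vars_per_file = min(MAX_VARS_PER_FILE, fit)
    n_files = math.ceil(n_vars / vars_per_file)
    return vars_per_file, n_files


print(plan_output_files(n_vars=300, bytes_per_var=1_000_000))
```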

</section>

</section>

<section id="sect_concepts_chemical_processing">

<title>Chemical speciation processing</title>

<para>An emission inventory is built and reported for a variety of pollutants, such as CO, NO<subscript>x</subscript>, VOC, PM<subscript>10</subscript>, and SO<subscript>2</subscript>. However, AQM chemical mechanisms (e.g., CB6) contain a simplified set of equations that use <quote>model species</quote> to represent atmospheric chemistry. Therefore, emissions processing requires speciation profiles to convert the emissions in terms of pollutant values to the species used in the photochemical mechanism. The purpose of the chemical speciation processing program <command>Spcmat</command> is to produce matrices that contain the factors for converting the input emissions pollutants to the model species used in the AQM. These species include organics, PM species, and toxics species.</para>

<para>The speciation matrices that <command>Spcmat</command> creates are used to transform column vectors of inventory-pollutant emissions into column vectors of model-species emissions. As shown in <xref linkend="fig_concepts_chemical_processing" />, the speciation matrix consists of columns for each required pollutant-to-species transformation and includes an entry for each source. The entries are the factors needed to convert the inventory pollutants into the model species. Note that speciation matrices depend only upon the chemical mechanism and the inventory, and they are therefore independent of the other factor-based operations for emissions processing.</para>

<figure id="fig_concepts_chemical_processing">
<title>Relationship of inventory sources to speciation matrix</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/chemical_processing_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/chemical_processing_html.jpg" />
</imageobject>
</mediaobject>
</figure>
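<para>The transformation that the speciation matrix performs can be sketched in a few lines. The data structures, the function name, and the NO<subscript>x</subscript> split factors below are hypothetical values chosen for illustration; they are not taken from any <envar>GSPRO</envar> file or from the I/O API file layout.</para>

```python
def apply_speciation_matrix(emissions, matrix):
    """Convert per-source pollutant emissions to model-species emissions.

    emissions: one {pollutant: value} dictionary per source.
    matrix: one {(pollutant, species): factor} dictionary per source,
            i.e. one row of the speciation matrix for each source.
    """
    speciated = []
    for source_emis, source_factors in zip(emissions, matrix):
        species_totals = {}
        for (pollutant, species), factor in source_factors.items():
            amount = factor * source_emis.get(pollutant, 0.0)
            species_totals[species] = species_totals.get(species, 0.0) + amount
        speciated.append(species_totals)
    return speciated


# One source emitting 10 units of NOX, split with illustrative factors.
emissions = [{"NOX": 10.0}]
matrix = [{("NOX", "NO"): 0.9, ("NOX", "NO2"): 0.1}]
print(apply_speciation_matrix(emissions, matrix))
```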

<para>Chemical speciation processing addresses the following issues during emissions processing:</para>

<orderedlist>
<listitem>
<para>Splitting inventory pollutants into chemical species</para>
</listitem>
<listitem>
<para>Pollutant-to-pollutant conversions</para>
</listitem>
<listitem>
<para>Checking the consistency of the speciation profiles with the inventory table</para>
</listitem>
<listitem>
<para>Setting the order of the output species</para>
</listitem>
<listitem>
<para>Creating speciation intermediate files</para>
</listitem>
</orderedlist>

<para>In the subsections below, we provide additional detail about each of these steps, in the order in which they are listed above.</para>

<section>

<title>Splitting inventory pollutants into chemical species</title>

<para>SMOKE supports run-time, user-selected inventory pollutants and chemical mechanisms. Before running the chemical speciation step, the only relevant information that SMOKE has is information about the inventory pollutants. After <command>Spcmat</command> runs, SMOKE then has the instructions for supporting a specific chemical mechanism, and through the speciation matrices, SMOKE will be able to generate model-ready emissions for the specific chemical mechanism set by the user. The inventory pollutants relate to the chemical mechanism because certain pollutants are needed to create certain species, but the pollutants do not dictate the chemical mechanism. As a SMOKE user, you must be aware of what pollutants are required to generate the model species needed by a chemical mechanism, so that all needed model species are created.</para>

<para>As we just mentioned, SMOKE learns of the species being created for a given run through the <command>Spcmat</command> program. The chemical speciation profiles input file (<envar>GSPRO</envar>) is the data file that controls the chemical species SMOKE will create. It contains the chemical speciation profile code, the pollutant-to-species relationships, and both mole-based and mass-based conversion factors. The format and contents of this file are described in detail in <xref linkend="sect_input_gspro" />. Several <envar>GSPRO</envar> files to support many chemical mechanisms are available for download<comment>; see <xref linkend="sect_dirs_example_data_files" /> for more information. How to select an existing chemical mechanism or specify a new one, and how to ensure that the <envar>GSPRO</envar> file is consistent with other input files, are described in <xref linkend="sect_scripts_change_speciation" /></comment>.</para>

<para><command>Spcmat</command> uses a cross-reference file to assign the chemical speciation profiles to the inventory sources and pollutants. The cross-reference file can assign the profiles using a hierarchy that is based on the source characteristics; a detailed list of valid assignments is given in <xref linkend="sect_programs_spcmat" />. All assignments are pollutant-specific, such that each pollutant for a source can (and often should) use a different speciation profile. The cross-reference file can also assign a default profile, and some pollutants that have only one way of being speciated (e.g., mapping the CO pollutant to the CO species) receive a default profile for every source. When <command>Spcmat</command> makes default assignments, you can optionally choose to have warnings given for all of these assignments using the <envar>REPORT_DEFAULTS</envar> option. <command>Spcmat</command> will also produce a warning about any inventory pollutants that are not assigned a speciation profile, because this will result in the emissions for that pollutant being dropped by SMOKE.</para>

<para><command>Spcmat</command> creates two speciation matrices during each run: a mole-based matrix and a mass-based matrix. The speciation profile file (<envar>GSPRO</envar>) has different factors for mass- and mole-based conversions. It is not trivial to convert between mass- and mole-based factors for some chemical mechanisms like CB6, which use aggregates of chemical compounds or parts of compounds to define the model species. One cannot simply use a molecular weight to convert accurately, because the molecular weight of the chemical species is different for every speciation profile. This is because different proportions of chemical compounds are present in each speciation profile, so even though the species are the same, their molecular weights are different from profile to profile. This is why SMOKE has the two speciation matrices. The mole-based matrices are used to create the model-ready files, and the mass-based matrices are used only to create the reports that require tons, kilograms, grams, or other mass units. One peculiarity of the mole-based matrix is that particulate species emissions cannot be expressed in moles, so the units are still grams in the mole-based matrices for particulate species.</para>

<para>Chemical speciation has both similarities to and differences from the aggregation and disaggregation that is performed during inventory import (see <xref linkend="sect_concepts_aggregate_toxics" />). It is similar in that it involves separating one data value into more than one data value. For example, inventory disaggregation can split Mercury &amp; Compounds (CAS=199) into elemental mercury, divalent gaseous mercury, and divalent particulate mercury during inventory import of nonroad mobile sources. Similarly, NO<subscript>x</subscript> can be split into nitric oxide (NO) and nitrogen dioxide (NO<subscript>2</subscript>) during chemical speciation. Aggregation is also similar to speciation because it can map multiple pollutants or parts of pollutants into the same chemical compound, just as speciation can map parts of different pollutants (e.g., HGSUM and HG) to the same model species (e.g., HG). On the other hand, the two concepts differ in three ways. First, the pollutants aggregated and disaggregated by <command>Smkinven</command> are still considered by SMOKE to be pollutants, not species, to pass to later processing steps. Second, chemical speciation allows multiple split factors to be used for the same pollutant-to-species conversion, whereas inventory aggregation/disaggregation uses just one factor across the whole inventory for each pollutant-to-pollutant conversion. Third, chemical speciation also converts the units of the pollutants (tons) to the units of the species (moles for gaseous species and grams for particulate species).</para>

<para>Speciation profiles do not necessarily conserve mass. For example, it is possible to input 100 tons of VOC into the <command>Spcmat</command> program and have it output factors that will produce 70 tons of VOC-based species or 110 tons of VOC-based species. The reduced mass occurs when some of the pollutant&rsquo;s mass does not map to chemically reactive species in the inventory. In some cases, the nonreactive (NR) species is included in the speciation profiles so that the speciation profiles do sum to 1. Increased mass happens because some compounds that are part of VOC may have chemical reactivity associated with two model species. Since this one part of the VOC is mapped to two model species, its mass appears to be double-counted when summing the model-species mass. This is merely an artifact of how the model species are defined and implemented in the AQM, and the AQM is responsible for accounting for such issues in its chemical mechanism.</para>
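<para>The arithmetic behind the 70-ton and 110-ton cases can be made concrete with two made-up profiles. The species names are CB-style names used elsewhere in this chapter, but the mass fractions are invented for illustration and the function name is hypothetical.</para>

```python
def speciated_mass(pollutant_tons, mass_fractions):
    """Apply mass-based split factors to one pollutant total."""
    return {species: pollutant_tons * fraction
            for species, fraction in mass_fractions.items()}


# Fractions sum to 0.7: part of the VOC maps to no reactive species.
losing_profile = {"PAR": 0.5, "OLE": 0.2}
# Fractions sum to 1.1: some compounds count toward two model species.
gaining_profile = {"PAR": 0.6, "OLE": 0.3, "TOL": 0.2}

for profile in (losing_profile, gaining_profile):
    total = sum(speciated_mass(100.0, profile).values())
    print(round(total, 6))
```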

</section>

<section>

<title>Pollutant-to-pollutant conversions</title>

<para>In some cases, the pollutant available in the inventory is not the same as the pollutant for which speciation profiles have been developed. At this time, the only known case of this situation is when the inventory is collected for VOC (or for reactive organic gases [ROG]) but the speciation profiles are available for TOG. For this situation, VOC-to-TOG factors have been developed that <command>Spcmat</command> assigns to the inventory sources based on the sources&rsquo; SCCs. The factors are input using the <envar>GSCNV</envar> file, which can support multiple pollutant-to-pollutant conversions in the same file. <comment>The example SMOKE <envar>GSCNV</envar> file has sections for ROG-to-TOG and VOC-to-TOG; it is described further in <xref linkend="sect_dirs_example_data_files" /> and <xref linkend="sect_input_gscnv" />.</comment></para>
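<para>A minimal sketch of an SCC-based conversion is shown below. The SCC, the factor value, and the function name are placeholders for illustration only, and raising an error for an SCC without a factor is a choice made for this sketch, not a description of what <command>Spcmat</command> does in that situation.</para>

```python
def voc_to_tog(voc_by_scc, factor_by_scc):
    """Scale VOC emissions to TOG using one conversion factor per SCC."""
    tog = {}
    for scc, voc in voc_by_scc.items():
        if scc not in factor_by_scc:
            raise KeyError("no VOC-to-TOG factor for SCC " + scc)
        tog[scc] = voc * factor_by_scc[scc]
    return tog


# Placeholder SCC and factor, for illustration only.
print(voc_to_tog({"2102004000": 50.0}, {"2102004000": 1.2}))
```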

</section>

<section>

<title>Checking the consistency of the speciation profiles with the inventory table</title>

<para>The <command>Spcmat</command> program checks that the two definitions of the NONHAPVOC pollutant (one from the speciation profiles (<envar>GSPRO</envar>) file and the other from the inventory table (<envar>INVTABLE</envar>) file) are consistent. When you are modeling with both criteria and toxics inventories and using toxics VOC (i.e., <quote>integrating</quote> toxics VOC), you must ensure that the definition of toxics VOC used in the <envar>INVTABLE</envar> file is consistent with the definition in the <envar>GSPRO</envar> file. <command>Spcmat</command> cannot verify that the contents of these files are consistent with each other; it checks only that the definitions you provide agree. It is up to you to make sure that the information provided is correct and consistent with what is actually in the files.</para>

<para>These files need to be consistent because the same pollutants that SMOKE subtracts out of the VOC to create NONHAPVOC must also be removed when the speciation profiles are being computed. This ensures that there will be no double-counting of mass when calculating model species from both toxics VOC and from the criteria aggregated VOC value. The <envar>GSPRO</envar> file format includes a header that allows you to define the toxics VOC that were removed when creating the NONHAPVOC speciation profiles. This is intended to help you ensure that the NONHAPVOC definition in the <envar>GSPRO</envar> file is consistent with the inventory table file. One important detail of how this is implemented is that <command>Spcmat</command> will not include inventory pollutants that are not actually in the inventory in the consistency check, even if the pollutant is listed in the NONHAPVOC definition in the <envar>GSPRO</envar> file.</para>

</section>

<section>

<title>Setting the order of the output species</title>

<para>The <command>Spcmat</command> program controls the order of the species in the output files. The order does not matter to any of the AQMs that SMOKE supports, so there is not very much user control of the order. <command>Spcmat</command> arranges the output species first by the order in which their associated pollutants appear in the inventory table (<envar>INVTABLE</envar>) file, and next in alphabetical order. For example, if an inventory included CO, NO<subscript>x</subscript>, and VOC, the CO species from the CO pollutant would be first, followed by the NO and NO<subscript>2</subscript> species for NO<subscript>x</subscript>. After that, all of the VOC species would be output in alphabetical order (e.g., ALD2, ETH, FORM, ISOP, NR, OLE, PAR, TOL, XYL). If two pollutants create the same species, the first pollutant in the <envar>INVTABLE</envar> file with which the species is associated will determine its output order.</para>
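<para>The ordering rule described above can be expressed compactly. The function and argument names below are hypothetical, and the inputs are plain Python structures rather than the <envar>INVTABLE</envar> and <envar>GSPRO</envar> files themselves.</para>

```python
def order_species(invtable_pollutants, species_by_pollutant):
    """Order species by pollutant position, then alphabetically.

    A species created by more than one pollutant keeps the position
    given by the first pollutant that produces it.
    """
    ordered = []
    seen = set()
    for pollutant in invtable_pollutants:
        for species in sorted(species_by_pollutant.get(pollutant, [])):
            if species not in seen:
                seen.add(species)
                ordered.append(species)
    return ordered


species = {
    "CO": ["CO"],
    "NOX": ["NO2", "NO"],
    "VOC": ["XYL", "TOL", "PAR", "OLE", "NR", "ISOP", "FORM", "ETH", "ALD2"],
}
print(order_species(["CO", "NOX", "VOC"], species))
```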

</section>

<section>

<title>Creating speciation intermediate files</title>

<para>The last task for the chemical speciation processing is creating the speciation intermediate files. The <command>Spcmat</command> program creates a mole-based speciation matrix (<envar>ASMAT_L</envar>, <envar>MSMAT_L</envar>, or <envar>PSMAT_L</envar>) and a mass-based speciation matrix (<envar>ASMAT_S</envar>, <envar>MSMAT_S</envar>, or <envar>PSMAT_S</envar>), as previously mentioned. These matrices can have any number of pollutant-to-species conversions; the <command>Spcmat</command> program will open multiple speciation matrix output files if the number of pollutant-to-species conversions is greater than 120 (the limit for the number of variables in an I/O API file). <command>Spcmat</command> opens and writes as many files with 120 variables (using the FileSetAPI wrapper) as are needed to store the data. In addition, the <command>Spcmat</command> program writes a supplementary speciation file (<envar>ASSUP</envar>, <envar>MSSUP</envar>, or <envar>PSSUP</envar>) that contains the speciation profile assignments for each source. The structures of the SMOKE intermediate files output by <command>Spcmat</command> are provided in <xref linkend="sect_intmed_spcmat" />.</para>

</section>

</section>

<section id="sect_concepts_spatial_processing">

<title>Spatial processing</title>

<para>The spatial processing operation, or <emphasis>gridding</emphasis>, combines the grid specification for the air-quality modeling domain with source locations from the SMOKE inventory file. The resulting gridding matrix is a sparse matrix that describes in which grid cells the emissions for each source occur within the modeling domain. The gridding matrix is applied to the inventory emissions to transform source-based inventory emissions to gridded emissions.</para>

<para>The SMOKE <command>Grdmat</command> program creates the gridding matrix for area, mobile, and point sources. The gridding step is different depending on the type of source being processed.</para>

<itemizedlist>
<listitem>
<para>For area sources, county-total emissions are spread among the cells intersecting the county through the use of gridding surrogates.</para>
</listitem>
<listitem>
<para>For mobile sources, the data can be provided by county (as area sources are), or the data can be provided as line sources (<quote>links</quote>). County-based mobile emissions are apportioned with gridding surrogates, preferably with surrogates based on the different road types for which the mobile emissions are provided. The line-source emissions are apportioned depending on the length of the link in each cell.</para>
</listitem>
<listitem>
<para>For point sources, emissions are apportioned to the grid cell intersecting the point.</para>
</listitem>
</itemizedlist>
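<para>For line sources, one plausible reading of length-based apportionment is a simple proportion, sketched below. The cell identifiers, lengths, and function name are illustrative assumptions; the exact calculation used by <command>Grdmat</command> is not reproduced here.</para>

```python
def link_cell_fractions(length_in_cell, total_length):
    """Return the share of a link's emissions assigned to each grid cell.

    length_in_cell maps a (column, row) cell to the link length inside it;
    total_length is the full link length, including any part that lies
    outside the grid.
    """
    if not total_length > 0:
        raise ValueError("total link length must be positive")
    return {cell: length / total_length
            for cell, length in length_in_cell.items()}


# A 10 km link with 2 km in one cell, 6 km in another, and 2 km off-grid.
print(link_cell_fractions({(3, 4): 2.0, (4, 4): 6.0}, total_length=10.0))
```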

<para>As shown in <xref linkend="fig_concepts_spatial_processing" />, the gridding matrix contains the SMOKE source IDs that intersect each grid cell, and the source-to-cell factors for each. The gridding matrix is a sparse matrix because each source intersects only a small number of cells relative to the total number of cells in the domain, and the storage format shown in the figure reflects a sparse storage format. In the example in <xref linkend="fig_concepts_spatial_processing" />, source 1 intersects cells 1 and 2, with 10% of the emissions in cell 1 and 30% of the emissions in cell 2. The remainder of the source&rsquo;s emissions (60%) are either outside the grid or in other cells not shown in this example. Source 2 is completely inside the domain; 40% of its emissions are in cell 1, 30% are in cell 2, and 30% are in cell 4. If this were a point-source gridding matrix, there would be only one cell associated with each source because each point source exists in only one cell.</para>

<figure id="fig_concepts_spatial_processing">
<title>Relationship between inventory and gridding matrix</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/spatial_processing_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/spatial_processing_html.jpg" />
</imageobject>
</mediaobject>
</figure>
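
<para>The sparse-matrix idea can be illustrated with a short Python sketch. It is not SMOKE code: the triplet layout, the function name <literal>grid_emissions</literal>, and the emissions values are assumptions made for illustration, and only the source-to-cell fractions are taken from the example above.</para>

<programlisting language="python"><![CDATA[
```python
# Sparse gridding matrix from the example: (source ID, cell ID, fraction).
# Only source/cell pairs that actually intersect are stored.
GRIDDING_MATRIX = [
    (1, 1, 0.10),
    (1, 2, 0.30),   # the other 60% of source 1 is outside the cells shown
    (2, 1, 0.40),
    (2, 2, 0.30),
    (2, 4, 0.30),
]


def grid_emissions(source_emissions, matrix, n_cells):
    """Return per-cell emissions as a list (index 0 holds cell 1)."""
    gridded = [0.0] * n_cells
    for source_id, cell_id, fraction in matrix:
        gridded[cell_id - 1] += source_emissions[source_id] * fraction
    return gridded


# Hypothetical source totals (for example, tons/year) keyed by source ID.
emissions = {1: 100.0, 2: 50.0}
gridded = grid_emissions(emissions, GRIDDING_MATRIX, n_cells=4)
```
]]></programlisting>

<para>With these illustrative totals, cells 1, 2, and 4 receive 30, 45, and 15 units, and cell 3 receives none. The gridded total of 90 is less than the inventory total of 150 because 60% of source 1 is not assigned to any of the four cells.</para>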

<para>Note that the gridding matrix depends only upon the source locations, the grid definition, and in some cases gridding surrogates and cross-references. It is therefore independent of the other steps of emissions processing.</para>

<para>Spatial processing addresses the following issues during emissions processing:</para>

<orderedlist>
<listitem>
<para>Defining the gridded region to output from SMOKE</para>
</listitem>
<listitem>
<para>Assigning sources to grid cells</para>
</listitem>
<listitem>
<para>Creating the spatial allocation intermediate files</para>
</listitem>
</orderedlist>

<section>

<title>Defining the gridded region to output from SMOKE</title>

<para>The grid for which the spatial allocation step outputs the gridding matrix depends upon the <envar>IOAPI_GRIDNAME_1</envar> environment variable setting and the <envar>GRIDDESC</envar> input file. The <envar>GRIDDESC</envar> file contains all of the settings needed by the I/O API to define each grid (together, these settings are called a grid definition), and it can contain many such grid definitions. The <envar>IOAPI_GRIDNAME_1</envar> setting selects the specific grid to use during a specific run, and it must match a grid name provided in the <envar>GRIDDESC</envar> file. The settings used by the I/O API to define a grid are documented on the <ulink url="http://www.baronams.com/products/ioapi/GRIDDESC.html">I/O API web site</ulink>.</para>

<para>There is another file format, called the <envar>G_GRIDPATH</envar> file, that can be provided to <command>Grdmat</command> instead of the <envar>GRIDDESC</envar> file. However, the <envar>G_GRIDPATH</envar> format allows only one grid per file. It is accepted as an input so that SMOKE remains backward compatible with previous versions. All new users are advised to use a single <envar>GRIDDESC</envar> file to store all of their modeling grids, instead of using multiple <envar>G_GRIDPATH</envar> files, one for each grid. The <envar>IOAPI_GRIDNAME_1</envar> setting is not used by <command>Grdmat</command> when the <envar>G_GRIDPATH</envar> format is used.</para>

<para>The gridded region selected at run time does not need to cover all of the counties in the inventory. If the gridded region is smaller than the inventory, this processing step will exclude the counties or parts of counties that do not overlap the grid. This allows users who are performing emissions processing using nested grids to import the inventory once and apply gridding matrices for each grid to the same inventory, creating gridded emissions for all nested grids without having to adjust the inventory files. The downside of this approach is that SMOKE does not give a warning if a county that is in the inventory is not in the spatial surrogates, or if a lat-lon coordinate is not inside the grid. It assumes that these sources are intended to be dropped and proceeds without comment. Thus, you must ensure that your surrogate files contain the counties in your inventory that are inside the grid and that your lat-lon coordinates for point and link sources are correct.</para>

</section>

<section>

<title>Assigning sources to grid cells</title>

<para>SMOKE takes a different approach to assigning sources to grid cells for each SMOKE source category and for some special cases. In the following subsections, we describe the concepts of spatial allocation of (1) area/nonpoint, nonroad mobile, and on-road mobile nonlink sources; (2) area-to-point sources; (3) pregridded area sources; (4) on-road mobile link sources; and (5) point sources.</para>

<section>

<title>Spatial allocation of area/nonpoint, nonroad mobile, and on-road mobile nonlink sources</title>

<para>The area/nonpoint, nonroad mobile, and on-road mobile nonlink sources all have emissions in the inventory with county-total values. To spatially allocate these emissions, factors must be assigned to each source to distribute the county-total emissions across the grid cells that intersect the county. This is accomplished using a cross-referencing approach that assigns a spatial surrogate to each source in the inventory.</para>

<para>The spatial surrogates files, located in <envar>SRGPRO_PATH</envar>, contain factors for allocating emissions from a county-total to a gridded value, and there are usually many sets of factors available for each county. The data in these files are used to estimate the spatial distribution of county-total emissions inside the county. These sets of factors are calculated from other data that are available at a finer resolution than the county data, such as census tracts. Examples of commonly available surrogates are population, housing, urban area, rural area, agriculture, water, railroads, major highways, airports, ports, and forest. To ensure correct emissions processing, it is essential that all counties within the inventory and domain be included in the spatial surrogates files. If any counties inside the domain are left out of the files, SMOKE will not be able to detect this; instead, it will act as though the county is outside of the domain and drop the emissions.</para>

<para>The general case of cross-referencing and profiles was described in <xref linkend="sect_concepts_cross_referencing" />. During spatial allocation with the <command>Grdmat</command> program, the spatial cross-reference file (<envar>AGREF</envar> or <envar>MGREF</envar>) assigns spatial surrogates for area/nonpoint, nonroad mobile, and on-road mobile nonlink sources. The spatial cross-reference file associates SCCs with a spatial surrogate code, which is an arbitrary positive integer code that also appears in the spatial surrogates file along with the spatial surrogate data. It is important to ensure (if possible) that the spatial surrogate codes assigned to each SCC are actually available in the surrogate file for all counties in the inventory with that SCC. For example, if the water surrogate is assigned to motorboat sources in a given county, the spatial surrogate file should have values for the water surrogate in that county. If this is not the case, SMOKE will be forced to use a <quote>fallback</quote> surrogate, defined with the <envar>SMK_DEFAULT_SRGID</envar> environment variable setting, which assigns a surrogate that is defined for every county in the domain. <envar>SMK_DEFAULT_SRGID</envar> is mandatory and must be set to the population surrogate code listed in the <envar>SRGDESC</envar> file. The population surrogate in this case serves two purposes: (1) it defines all the FIPS codes contained within the gridded domain, which reduces I/O in the gridding process; and (2) as the fallback, it prevents the emissions for that county and source category from being dropped from the emissions processing. <command>Grdmat</command> produces warnings in the log file whenever the fallback surrogate is used. If the fallback surrogate also causes the emissions to go to zero, an additional warning that indicates the emissions are being dropped is also written to the <command>Grdmat</command> log file. Whether to use the fallback described in purpose 2 is at your discretion; it is controlled by setting the <envar>SMK_USE_FALLBACK</envar> environment variable to Y or N.</para>
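
<para>The fallback logic described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the <command>Grdmat</command> implementation: the function name <literal>pick_surrogate</literal>, the dictionary layout, the county codes, and the surrogate codes 100 and 350 are all hypothetical.</para>

<programlisting language="python"><![CDATA[
```python
def pick_surrogate(county, assigned_code, surrogates, default_code,
                   use_fallback=True):
    """Choose the surrogate fractions for one county-based source.

    surrogates maps (county, surrogate code) -> {cell ID: fraction}.
    Returns (code used, fractions); code None means nothing was usable
    and the emissions for this source would be dropped.
    """
    fractions = surrogates.get((county, assigned_code))
    if fractions:
        return assigned_code, fractions
    if use_fallback:
        # Analogous to SMK_USE_FALLBACK = Y with SMK_DEFAULT_SRGID set.
        fallback = surrogates.get((county, default_code))
        if fallback:
            return default_code, fallback
    return None, {}


# Hypothetical surrogate data: 100 = population, 350 = water.
surrogates = {
    ("37063", 100): {1: 0.6, 2: 0.4},
    ("37063", 350): {2: 1.0},
    ("37135", 100): {3: 1.0},   # this county has no water surrogate
}

primary = pick_surrogate("37063", 350, surrogates, default_code=100)
fallback = pick_surrogate("37135", 350, surrogates, default_code=100)
dropped = pick_surrogate("37135", 350, surrogates, default_code=100,
                         use_fallback=False)
```
]]></programlisting>

<para>In this sketch the first source keeps its assigned water surrogate, the second falls back to the population surrogate, and the third is dropped because the fallback is switched off.</para>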

<para>The surrogate codes are defined and described in the surrogate description file, which also lists the spatial surrogate files in which each surrogate is stored; the location of this file is set with the <envar>SRGDESC</envar> environment variable. Because the surrogate codes are arbitrarily assigned, you need to make sure that the spatial cross-reference is developed in conjunction with the spatial surrogates so that the surrogate codes used in each are consistent. It is not generally wise to develop surrogates without a cross-reference, unless the surrogate codes used in the surrogates file are set to the same values that are in a spatial cross-reference file that is already available. In other words, the surrogate codes in <envar>SRGDESC</envar>, <envar>AMGREF</envar>, and <envar>AMGPRO</envar> should be in agreement.</para>

<para><xref linkend="fig_concepts_spatial_allocation_county" /> illustrates spatial allocation of all county-based emissions sources. The box at the left represents a grid with a single county, shaded in gray to represent an emissions value in that county. The arrow represents the spatial allocation steps. The box at right shows that all grid cells that intersect the county have emissions contributed by the county. Cells that intersect more of the county are shown in darker colors, simulating the effect of using area surrogates. The darker cells could also represent cells in which a large city causes the population surrogate to concentrate the emissions in those cells, while the surrounding regions within the county have lower emissions.</para>

<figure id="fig_concepts_spatial_allocation_county">
<title>Spatial allocation of county-total emissions</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="4in" fileref="images/concepts/spatial_allocation_county_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/spatial_allocation_county_html.jpg" />
</imageobject>
</mediaobject>
</figure>

</section>

<section>

<title>Spatial allocation of area-to-point sources</title>

<para>In some cases, the inventory import processing (by the <command>Smkinven</command> program) will have assigned point locations to some of the county-based inventory sources such as airports. Spatial allocation for these sources ignores the assigned surrogate, sets the surrogate assignment to 0, and uses the point-source locations and assigned fractions to determine the cells and associated magnitudes of the emissions from the county-total source.</para>

</section>

<section>

<title>Spatial allocation of pregridded sources</title>

<para>When pregridded data are provided to SMOKE, the spatial allocation step must still be run. In this case, <command>Grdmat</command> just maps the pregridded emissions to the correct cells and outputs a simple gridding matrix that is used in later processing steps to maintain the pregridded nature of the inventory.</para>

</section>

<section>

<title>Spatial allocation of on-road mobile link sources</title>

<para>On-road mobile link sources are straight line segments within a county that have emissions and/or VMT data associated with them as well as the latitudes and longitudes of the starting and ending positions of the link. SMOKE determines the fraction of each link within each grid cell, and then assigns the emissions or VMT from the link to those grid cells that the link intersects, by weighting the emissions according to the length of the link within a grid cell divided by the total length of the link.</para>

<para>As shown in <xref linkend="fig_concepts_spatial_allocation_link" />, three link sources in the left-hand diagram are joined at their ends to represent a road cutting through the county. As shown in the figure, link sources do not bisect county boundaries (new link sources would be used to continue the road into an adjoining county). The three portions of the link have different VMT values, as represented by the three different shades of gray that highlight the link. The arrow represents the spatial allocation step<comment>; the details of the calculation are included in <xref linkend="sect_programs_grdmat" /> where the <command>Grdmat</command> program is described in more detail</comment>. In the right-hand diagram, the cells that intersect the link sources are shaded, with darker shading used in cells in which the link intersects the cell more and the link has higher VMT.</para>

<figure id="fig_concepts_spatial_allocation_link">
<title>Spatial allocation of on-road mobile link sources</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="4in" fileref="images/concepts/spatial_allocation_link_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/spatial_allocation_link_html.jpg" />
</imageobject>
</mediaobject>
</figure>
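
<para>The length weighting for link sources can be sketched in Python as shown below. This is an illustration only and not the calculation used by <command>Grdmat</command>. It assumes that the link end points have already been converted to grid coordinates, that the cells are square and regular, and that every cell can be tested by brute force. All function names and values are hypothetical.</para>

<programlisting language="python"><![CDATA[
```python
def clip_fraction(x0, y0, x1, y1, xmin, ymin, xmax, ymax):
    """Fraction of the segment (x0, y0)-(x1, y1) inside one cell."""
    t_enter, t_exit = 0.0, 1.0
    dx, dy = x1 - x0, y1 - y0
    # Liang-Barsky clipping: test the segment against the four cell edges.
    edges = ((-dx, x0 - xmin), (dx, xmax - x0),
             (-dy, y0 - ymin), (dy, ymax - y0))
    for p, q in edges:
        if p == 0.0:
            if q < 0.0:
                return 0.0      # parallel to this edge and outside it
            continue
        t = q / p
        if p < 0.0:
            t_enter = max(t_enter, t)
        else:
            t_exit = min(t_exit, t)
    return max(0.0, t_exit - t_enter)


def link_fractions(start, end, xorig, yorig, cell_size, ncols, nrows):
    """Map (col, row), 1-based, to the fraction of the link in that cell.

    A link lying exactly on a cell edge is not handled specially here.
    """
    fractions = {}
    for row in range(nrows):
        for col in range(ncols):
            xmin = xorig + col * cell_size
            ymin = yorig + row * cell_size
            frac = clip_fraction(*start, *end, xmin, ymin,
                                 xmin + cell_size, ymin + cell_size)
            if frac > 0.0:
                fractions[(col + 1, row + 1)] = frac
    return fractions


# Hypothetical link crossing three cells of a 3 x 1 grid of unit cells.
fractions = link_fractions((0.5, 0.5), (2.5, 0.5), 0.0, 0.0, 1.0, 3, 1)
link_vmt = 1000.0
cell_vmt = {cell: link_vmt * frac for cell, frac in fractions.items()}
```
]]></programlisting>

<para>Here one quarter of the link falls in the first cell, one half in the second, and one quarter in the third, so the VMT is split in the same proportions.</para>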

</section>

<section>

<title>Spatial allocation of point sources</title>

<para>Spatial allocation of point sources in <command>Grdmat</command> is straightforward. The <command>Grdmat</command> program simply determines which cell the lat-lon coordinates are in and assigns all emissions from that source to that single grid cell. Except for (1) plume-in-grid sources for all models and (2) elevated sources for UAM-based models (both addressed in <xref linkend="sect_concepts_elevated_processing" />), this means that the AQMs do not have information about where in the grid cell the point source is located.</para>
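
<para>A minimal Python sketch of this lookup follows. It assumes that the source coordinates have already been converted from lat-lon to the grid coordinate system, and that the grid is regular with square cells. The function name, the 1-based column and row convention, and the numbers are hypothetical and are not taken from <command>Grdmat</command>.</para>

<programlisting language="python"><![CDATA[
```python
import math


def point_cell(x, y, xorig, yorig, cell_size, ncols, nrows):
    """Return the 1-based (col, row) containing the point, or None."""
    col = math.floor((x - xorig) / cell_size) + 1
    row = math.floor((y - yorig) / cell_size) + 1
    if 1 <= col <= ncols and 1 <= row <= nrows:
        return col, row
    return None     # outside the grid: the source would be dropped


# Hypothetical 10 x 10 grid of 12-km cells with its origin at (0, 0).
inside = point_cell(12500.0, 3500.0, 0.0, 0.0, 12000.0, 10, 10)
outside = point_cell(-1.0, 3500.0, 0.0, 0.0, 12000.0, 10, 10)
```
]]></programlisting>

<para>All of the emissions from the first point would go to column 2, row 1. The second point lies outside the grid, which corresponds to the case in which a source is dropped without a warning.</para>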

</section>

</section>

<!--
<section>

<title>Computing an ungridding matrix for on-road mobile sources</title>

<para>For on-road mobile sources when using MOBILE6, the gridded meteorology data must be converted to county-specific data to facilitate the very large number of MOBILE6 runs that will ultimately be performed. This is done in SMOKE by computing a weighted-average value for each county, with the weighting calculation being done using the VMT data. Each weighting factor is the VMT in the grid cell divided by the total county VMT, resulting in a factor for each grid cell that intersects the county. These factors are stored in the <quote>ungridding</quote> matrix created during spatial allocation by the <command>Grdmat</command> program when running on-road mobile sources in SMOKE. <xref linkend="fig_concepts_ungridding" /> shows how the spatially varied meteorology data at left would be averaged using the averaging weights to calculate a value for each county in the domain.</para>

<figure id="fig_concepts_ungridding">
<title>Representation of ungridding</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="4in" fileref="images/concepts/ungridding_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/ungridding_html.jpg" />
</imageobject>
</mediaobject>
</figure>

</section>
-->

<section>

<title>Creating the spatial allocation intermediate files</title>

<para>The last task for the spatial allocation processing is creating the gridding intermediate files. The <command>Grdmat</command> program creates a gridding matrix (<envar>AGMAT</envar>, <envar>MGMAT</envar>, or <envar>PGMAT</envar>) and for on-road mobile sources an ungridding matrix (<envar>MUMAT</envar>). These matrices contain all of the factors needed to calculate gridded emissions from a source-based inventory. In addition, the <command>Grdmat</command> program writes a supplementary output file (<envar>AGSUP</envar> or <envar>MGSUP</envar>) that contains the spatial surrogates assigned for each source, indicates whether the assignment was a primary or a fallback, and identifies area-to-point sources. The structures of the SMOKE intermediate files output by <command>Grdmat</command> are provided in <xref linkend="sect_intmed_grdmat" />.</para>

</section>

</section>

<section id="sect_concepts_growth_processing">

<title>Growth processing</title>

<para>Growth processing creates emission data sets for years other than a year for which an emissions inventory is available. For example, if an inventory is available for 1996, but the modeling effort involves predicting ozone levels in 2007, then the emissions inventory must be grown to the year 2007. Previous versions of this document used the term <quote>projection</quote> for this function; in this version, however, <quote>projection</quote> refers to both growth of emissions (which is covered in this subsection) and control of emissions (addressed in <xref linkend="sect_concepts_control_processing" />). The <command>Cntlmat</command> program performs both growth and control functions. For growth processing, <command>Cntlmat</command> creates a growth matrix that contains the growth factors for each source and pollutant in the inventory. The <command>Grwinven</command> program then combines the growth matrix with the emission inventory to create a grown emission inventory.</para>

<para>If no new sources are being added when moving from the inventory year to the future year, then <command>Grwinven</command> can be used with the base case inventory and the growth matrix based on it. If new sources must be added, then the data structuring step (performed by <command>Smkinven</command>) must be rerun for the new number of sources, followed by running <command>Cntlmat</command> to create the growth matrix; then <command>Grwinven</command> applies the matrix to the new inventory file. Alternatively, users may elect to prepare a future-year inventory outside of SMOKE and import it directly with <command>Smkinven</command>, which skips the <command>Cntlmat</command> and <command>Grwinven</command> steps.</para>

<para><xref linkend="fig_concepts_growth_processing" /> shows the relationship between the inventory and the growth matrix created by <command>Cntlmat</command>, which consists of columns for each pollutant being grown from one year to another. The entries in the matrix are the growth factors needed to grow the inventory to a future or past year; note that these entries can be greater than, equal to, or less than 1 depending on whether the emissions should increase, stay the same, or decrease after the inventory is grown. If the growth factors are the same for every pollutant in the inventory, then only one column, called <quote>pfac</quote>, is included in the growth matrix, rather than using duplicate columns for every pollutant. The growth matrix depends only upon the growth factors and the inventory, so it is independent of other factor-based operations for emissions processing; however, growth of the inventory (using the <command>Grwinven</command> program) must occur before the temporal allocation step when creating model-ready emissions using inventories grown with SMOKE.</para>

<figure id="fig_concepts_growth_processing">
<title>Relationship between inventory sources and growth matrix</title>

<mediaobject>
<imageobject role="pdf">
<imagedata width="6.5in" fileref="images/concepts/growth_processing_pdf.jpg" />
</imageobject>
<imageobject role="html">
<imagedata fileref="images/concepts/growth_processing_html.jpg" />
</imageobject>
</mediaobject>
</figure>

<para><command>Grwinven</command> combines the intermediate inventory files with one or more growth matrices to create a new intermediate inventory file with the same structure as the original file but with a future (or past) year stored in the header of this file.</para>

<para>In growth processing, the <command>Cntlmat</command> program addresses the following emissions processing needs when creating the growth matrix:</para>

<orderedlist>
<listitem>
<para>Assigning growth factors</para>
</listitem>
<listitem>
<para>Reporting on the factors assigned to each source in the inventory</para>
</listitem>
<listitem>
<para>Creating the growth matrix</para>
</listitem>
</orderedlist>

<para>The <command>Grwinven</command> program addresses the remaining needs to create a grown inventory:</para>

<orderedlist continuation="continues">
<listitem>
<para>Applying the growth matrix</para>
</listitem>
<listitem>
<para>Creating a grown inventory file</para>
</listitem>
</orderedlist>

<para>Each of the issues in the list above is addressed in the following subsections, in the order in which they appear in the list.</para>

<section>

<title>Assigning growth factors</title>

<para>The <command>Cntlmat</command> program assigns growth factors using a cross-reference approach similar to the approaches used for chemical speciation and gridding. <command>Cntlmat</command> reads the information about the growth factors from the /PROJECTION/ packet in the control input file (<envar>GCNTL</envar>). As described in detail in <xref linkend="sect_programs_cntlmat" />, <envar>GCNTL</envar> can assign growth factors by state/county FIPS code, SCC, SIC, MACT, pollutant, and various combinations of these. The most specific entry is selected by <command>Cntlmat</command> based on the hierarchy described in <xref linkend="sect_programs_cntlmat" />. Since the /PROJECTION/ packet includes both the cross-referencing information and the growth factors, there is no need for a profile file (like those used for chemical speciation). The growth factors may represent positive growth (factors greater than 1), negative growth (factors less than 1), or no growth (factors equal to 1).</para>

<comment>
<para>Part of the /PROJECTION/ packet indicates the base year (the year from which emissions will be grown) and the future or past year (the year to which the emissions will be grown). Having the dates in this packet ensures that <command>Cntlmat</command> will apply the growth factors to only those sources with a base year that matches the growth factors. <remark>If there are multiple base years in the inventory file, only some of the sources (those with a year that matches the base year in the packet) will be assigned a growth factor.</remark></para>

<para><remark>Only one /PROJECTION/ packet may be input to <command>Cntlmat</command> for a given run. This means that if several inventory years need to be grown to a single year, several steps will need to be taken. The best approach in this case is to first import the inventory data for all years into a single inventory. Then, you would run the <command>Cntlmat</command> program with different /PROJECTION/ packets, once for each year-to-year transformation. Finally, the <command>Grwinven</command> program would be used to apply all resulting growth matrices to the inventory in a single run. This approach will work only if the destination years from all of the /PROJECTION/ packets are the same.</remark></para>
</comment>

<para>The /REACTIVITY/ packet can also be used to grow emissions to a future or past year, but a different approach is used that includes a <quote>phase-in</quote> period and other differences. We describe this packet in <xref linkend="sect_concepts_control_processing" />.</para>

</section>

<section>

<title>Reporting on the factors assigned to each source in the inventory</title>

<para>In addition to actually applying the growth factors, the growth processing reports the factors that were applied to each source. <command>Cntlmat</command> writes a report that includes the source characteristics and the factor from the /PROJECTION/ packet assigned to the source. For example, the area-source report would include the state, county, and SCCs along with the factor assigned to the source. Only those sources that have growth factors applied are included in the report. This report permits you to verify that the assignments you intended were in fact made by the program. The report is called the <envar>APROJREP</envar>, <envar>MPROJREP</envar>, or <envar>PPROJREP</envar> file for SMOKE area, mobile, and point sources, respectively.</para>

</section>

<section>

<title>Creating the growth matrix</title>

<para>The growth matrix created by <command>Cntlmat</command> contains the growth factors for each source and pollutant. If pollutant-specific growth factors are used, the growth matrix contains one column for each pollutant that is grown. The names of the columns are the same as the names of the pollutants from the SMOKE inventory file, so that the <command>Grwinven</command> program knows which column to apply to which inventory pollutant. If the growth factors are applied uniformly to all pollutants in the inventory, then <command>Cntlmat</command> does not output a separate but identical factor for each pollutant. Instead, it writes only one column in the matrix, with the name <quote>pfac</quote>.</para>

</section>

<section id="sect_concepts_apply_growth_with_grwinven">

<title>Applying the growth matrix</title>

<para>The <command>Grwinven</command> program applies the growth factors to the inventory pollutants by multiplying the annual and average-day emission values by the corresponding factors in the growth matrices. It reads in the SMOKE intermediate inventory file and up to 80 growth matrices, then applies the growth factors from all growth matrices to the appropriate pollutants. If a growth matrix has the <quote>pfac</quote> variable, <command>Grwinven</command> applies the factors to all pollutants in the inventory; if instead the matrix has pollutant-specific entries, the factors are applied by matching the pollutants between the matrix and the inventory.</para>

<para>As is further described in <xref linkend="sect_concepts_control_processing" />, <command>Grwinven</command> can also apply the control matrix at the same time it applies the growth matrix, and the total number of growth and control matrices that can be applied in one run is 80. The <command>Grwinven</command> program&rsquo;s growth processing features cannot be applied to a SMOKE intermediate file that <command>Grwinven</command> has created, because of the /FYEAR/ header element described in the next section.</para>
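
<para>The way the <quote>pfac</quote> column and the pollutant-specific columns are applied can be sketched in Python as follows. This is an illustration under stated assumptions, not the <command>Grwinven</command> implementation: the dictionary layout, the function name <literal>apply_growth</literal>, and all values are hypothetical.</para>

<programlisting language="python"><![CDATA[
```python
def apply_growth(inventory, growth_matrices):
    """Return grown emissions without modifying the input inventory.

    inventory:       {pollutant name: [emissions, one value per source]}
    growth_matrices: list of {column name: [factor, one value per source]}
    """
    grown = {pol: list(values) for pol, values in inventory.items()}
    for matrix in growth_matrices:
        for pol in list(grown):
            # A "pfac" column applies to every pollutant; otherwise the
            # column is matched to the inventory pollutant by name.
            factors = matrix.get("pfac", matrix.get(pol))
            if factors is None:
                continue    # this matrix does not grow this pollutant
            grown[pol] = [e * f for e, f in zip(grown[pol], factors)]
    return grown


# Hypothetical two-source inventory and two growth matrices.
inventory = {"NOX": [10.0, 20.0], "VOC": [5.0, 8.0]}
matrices = [
    {"pfac": [1.1, 0.9]},   # uniform across pollutants
    {"VOC": [2.0, 1.0]},    # pollutant-specific
]
grown = apply_growth(inventory, matrices)
```
]]></programlisting>

<para>In this sketch both pollutants are scaled by the first matrix, and only VOC is scaled again by the second. The source list is unchanged; only the emissions values differ.</para>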

</section>

<section>

<title>Creating a grown inventory file</title>

<para>After applying the growth factors to the SMOKE intermediate inventory file, the <command>Grwinven</command> program writes a new SMOKE intermediate inventory file that contains the new, grown emissions. <command>Grwinven</command> only writes the I/O API part of the inventory and not the ASCII part, because the ASCII part that contains the state/county codes, SCCs, and other character strings does not change between the base and future or past year. All of the sources in the base and grown inventory files are the same; the difference is in their emissions values.</para>

<para><command>Grwinven</command> adds a header element to the I/O API part of the inventory file that indicates the file was created using <command>Grwinven</command> and a growth matrix. The header element is /FYEAR/, which is followed by the future year (for example, /FYEAR/ 2018). Other programs (such as <command>Temporal</command>) will recognize the /FYEAR/ header element and alter their messages to indicate that a future year is being processed.</para>


</section>

</section>

<section id="sect_concepts_control_processing">

<title>Control processing</title>

<para>The control processing operation applies control factors from a control input file (<envar>GCNTL</envar>) based on source characteristics in the inventory. A control scenario involves changing the values of emissions based on regulations affecting industrial activities or personal behaviors. The resulting control matrix, created by the <command>Cntlmat</command> program, takes the form of the matrix shown in <xref linkend="fig_concepts_growth_processing" />. The control matrix depends only upon the source characteristics in the SMOKE inventory and the set of controls chosen, so control processing can therefore be decoupled from the rest of the processing steps. The <command>Cntlmat</command> program performs control processing for SMOKE area, mobile, and point sources; however, much more complex controls for on-road mobile sources can also be implemented when using MOVES through SMOKE to calculate emission factors and apply them to VMT.</para>

<para>The emissions control factors can be applied in addition to the emissions growth factors (described in <xref linkend="sect_concepts_growth_processing" />), and the net effect of this growth and control is called <quote>projection</quote>. SMOKE control processing can create two types of control matrices during a given run: a multiplicative control matrix and a reactivity control matrix.</para>

<para>The <command>Cntlmat</command> program performs the following emissions processing steps in creating the control matrices:</para>

<orderedlist>
<listitem>
<para>Assigning control factors from six control packets to the sources</para>
</listitem>
<listitem>
<para>Creating the multiplicative control matrix</para>
</listitem>
<listitem>
<para>Creating the reactivity control matrix</para>
</listitem>
<listitem>
<para>Reporting on factors assigned to each source in the inventory</para>
</listitem>
</orderedlist>

<para>The <command>Grwinven</command> program addresses the following control processing steps:</para>

<orderedlist continuation="continues">
<listitem>
<para>Applying the multiplicative control matrices to the inventory</para>
</listitem>
<listitem>
<para>Creating a controlled intermediate inventory file</para>
</listitem>
</orderedlist>

<para>Finally, the <command>Smkmerge</command> program can be used to perform the following control processing step:</para>

<orderedlist continuation="continues">
<listitem>
<para>Applying the multiplicative and/or reactivity control matrices to the inventory to create model-ready inputs</para>
</listitem>
</orderedlist>

<para>The next seven subsections explain the concepts involved with these processing steps in more detail.</para>

<section>

<title>Assigning control factors from six control packets to the sources</title>

<para>SMOKE provides six control packets with which users can control emissions:</para>

<itemizedlist>
<listitem>
<para>/MACT/ contains MACT-based assignments for toxics inventories and can be used to apply general MACT controls to sources affected by MACT regulations. This packet contributes to the multiplicative control matrix.</para>
</listitem>
<listitem>
<para>/CONTROL/ contains settings for control efficiency, rule effectiveness, and rule penetration that can be applied by nearly any combination of source characteristics, even targeting a specific source. This packet contributes to the multiplicative control matrix. This packet cannot appear in the same input file with an /EMS_CONTROL/ packet.</para>
</listitem>
<listitem>
<para>/EMS_CONTROL/ contains settings for control efficiency, rule effectiveness, and rule penetration for both the base year <emphasis>and</emphasis> a future year. It also contains a point-source conversion factor and an aggregated control factor that can override everything else in the packet. This packet contributes to the multiplicative control matrix. This packet cannot appear in the same input file with a /CONTROL/ packet, and it can be used for point sources only.</para>
</listitem>
<listitem>
<para>/CTG/ contains settings for control technology guideline (CTG) controls, MACT controls, and reasonably available control technology (RACT) controls. It contributes to the multiplicative control matrix.</para>
</listitem>
<listitem>
<para>/ALLOWABLE/ contains county-specific, SIC-specific, SCC-specific controls, caps, and replacement emissions. It contributes to the multiplicative control matrix.</para>
</listitem>
<listitem>
<para>/REACTIVITY/ contains settings needed for reactivity-based controls and its use results in the reactivity control matrix.</para>
</listitem>
</itemizedlist>

<para><xref linkend="sect_programs_cntlmat" /> describes the <command>Cntlmat</command> program in more detail, including the cross-reference hierarchy of these packets and how they relate to one another. All packets can be included in a single <command>Cntlmat</command> run, with the exception of the /CONTROL/ and /EMS_CONTROL/ packets (either one or the other of these can be included, but not both). In general, these packets can assign control factors by state/county FIPS code, SCC, SIC, MACT, pollutant, other plant-specific source characteristics, and various combinations of these.</para>

</section>

<section>

<title>Creating the multiplicative control matrix</title>

<para><command>Cntlmat</command> creates the multiplicative control matrix based on the information contained in the /MACT/, /CONTROL/, /EMS_CONTROL/, /CTG/, and /ALLOWABLE/ packets. The specific details of how these packets interact and how the control factors are calculated are described in <xref linkend="sect_programs_cntlmat" />. Once the control factors have been calculated, <command>Cntlmat</command> writes a multiplicative control matrix file that contains one column of control factors for each pollutant; each row in the matrix represents a source. The names of the columns are the same as the names of the pollutants from the SMOKE inventory file, so that either the <command>Grwinven</command> or the <command>Smkmerge</command> program knows which control column to apply to which inventory pollutant.</para>
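
<para>As a rough illustration of how control efficiency, rule effectiveness, and rule penetration might combine into a single multiplicative factor, the Python sketch below uses a relationship that is common in emission inventory practice. It is an assumption made for illustration, with values expressed as percentages; it does not cover adjustments for existing base-year controls, and the calculation actually performed by <command>Cntlmat</command> is the one documented in <xref linkend="sect_programs_cntlmat" />. The function name and values are hypothetical.</para>

<programlisting language="python"><![CDATA[
```python
def control_factor(ceff, reff=100.0, rpen=100.0):
    """Multiplicative factor from percentages in the range 0 to 100.

    Assumed relationship: factor = 1 - (CE/100) * (RE/100) * (RP/100).
    """
    return 1.0 - (ceff / 100.0) * (reff / 100.0) * (rpen / 100.0)


# Hypothetical control: 90% efficient, 80% effective, full penetration.
factor = control_factor(90.0, reff=80.0, rpen=100.0)

# One column of a multiplicative matrix applied to hypothetical emissions
# for three sources; a factor of 1.0 leaves a source uncontrolled.
column = [factor, 1.0, control_factor(50.0)]
emissions = [100.0, 40.0, 10.0]
controlled = [e * f for e, f in zip(emissions, column)]
```
]]></programlisting>

<para>Under this assumed relationship the first source keeps 28% of its emissions, the second is unchanged, and the third is halved.</para>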

</section>

<section id="sect_concepts_create_reactivity">

<title>Creating the reactivity control matrix</title>

<para>Reactivity controls have been included in SMOKE to allow users to examine what happens to the air quality modeling results when the chemical mixture of the emissions is changed to reduce its ozone-forming potential. Examining the effect of reactivity controls is known as <quote>reactivity assessment</quote>. The implementation of this type of control includes permitting users to reset the base-year emissions, reset the SCC, and reset the chemical speciation profile. The controls can be applied to all sources that match an SCC or a specific facility or process within a facility.</para>

<para>Several issues are important when addressing emissions processing requirements for reactivity assessments. Reactivity assessment involves replacing one compound in the inventory by another compound. This replacement can impact emission projections, the total magnitude of the inventory pollutants, and the associated SCCs. The market penetration of the replacement compound may vary in time and space, which affects the future-year emissions. Also, the replacement compound may be needed in much greater or much smaller amounts, thereby affecting the total inventory emissions. Finally, if a different process is required in order for a source to use the different compound, the SCC for that source may change.</para>

<para>The scale of the reactivity assessment is important; it could be local, statewide, or national. A local case could involve investigating reactivity for one source. A statewide case could be implementing a change in compound based on reactivity considerations for a State Implementation Plan (SIP), and this would affect sources across the state. A national case could involve an EPA investigation of the formulation of nationally distributed consumer products.</para>

<para>In addition, exemptions from controls for certain sources must be permitted as part of an emissions control strategy. These exemptions can occur when a reactivity assessment determines that certain compounds and/or processes do not significantly affect pollution formation.</para>

<para>To address these issues, SMOKE is able to target changes in a VOC for specific classes of VOC emissions, and address the spatial and temporal considerations implied by market penetration issues. Furthermore, when replacement options are being investigated, the correct replacement operations are facilitated by SMOKE. These operations include selecting sources, changing underlying pollutant emissions, changing SCCs, correctly projecting future-year emissions based in part on market penetration issues, and appropriately speciating emissions for the new compound.</para>

<para>For a single run of <command>Cntlmat</command>, reactivity controls can be applied to only one pollutant, typically a VOC pollutant. Therefore, if more than one pollutant in the inventory needs reactivity controls (e.g., a toxics VOC and a particulate pollutant), a separate reactivity matrix must be created for each such pollutant, and these controls must be applied in separate SMOKE runs (see <xref linkend="sect_concepts_apply_controls_with_smkmerge" /> for more information).</para>

</section>

<section>

<title>Reporting on factors assigned to each source in the inventory</title>

<para>In addition to assigning the control factors, <command>Cntlmat</command> also creates reports detailing the factors applied to each source. These reports include the source characteristics and the factors assigned to the sources. The reports are:</para>

<itemizedlist>
<listitem>
<para><emphasis role="bold"><envar>ACREP</envar>, <envar>MCREP</envar>, or <envar>PCREP</envar>:</emphasis> For area and mobile sources, this report includes emissions before and after all multiplicative controls have been applied by state, county, and SCC. For point sources, the report includes emissions before and after all multiplicative controls by facility.</para>
</listitem>
<listitem>
<para><emphasis role="bold"><envar>AREACREP</envar>, <envar>MREACREP</envar>, or <envar>PREACREP</envar>:</emphasis> For pollutants controlled with reactivity controls (usually VOC), this report includes source number, base-year emissions, replacement base-year emissions, projection factor, future-year SCC, future-year speciation profile number, and market penetration rate of reactivity control. This report is the only reactivity report that can be generated by SMOKE since the <command>Smkreport</command> program cannot yet import and apply the reactivity matrix to generate reports.</para>
</listitem>
<listitem>
<para><emphasis role="bold"><envar>ACSUMREP</envar>, <envar>MCSUMREP</envar>, or <envar>PCSUMREP</envar>:</emphasis> For each source, this report lists the specific multiplicative controls that were applied. It includes the emissions before and after each control, the control factor that was applied, and the control packet from which each control came. The reactivity controls are not included in this report.</para>
</listitem>
</itemizedlist>

</section>

<section>

<title>Using <command>Grwinven</command> to apply the multiplicative control matrices</title>

<para>The <command>Grwinven</command> program applies the multiplicative control matrices to the inventory pollutants by multiplying the control factors for the pollutants with the annual and average-day emission values. (Reactivity matrices cannot be applied by <command>Grwinven</command>; only the <command>Smkmerge</command> program can apply those, as discussed in <xref linkend="sect_concepts_apply_controls_with_smkmerge" />.) <command>Grwinven</command> reads in the SMOKE intermediate inventory file and up to 80 control matrices and then applies the control factors from all matrices to the appropriate pollutants.</para>

<para>As discussed in <xref linkend="sect_concepts_apply_growth_with_grwinven" />, the <command>Grwinven</command> program can also apply growth matrices at the same time it applies control matrices, and the total number of growth and control matrices that can be applied in one run is 80. If desired, the control processing features of <command>Grwinven</command> can be used on a SMOKE intermediate file that has been created by <command>Grwinven</command> with grown and/or controlled emissions, as long as no <emphasis>growth</emphasis> processing features are included in that second pass. (See section <xref linkend="sect_concepts_apply_growth_with_grwinven" /> for discussion of this restriction on applying growth matrices, caused by <command>Grwinven</command>&rsquo;s addition of the /FYEAR/ header element.)</para>

</section>

<section>

<title>Creating a controlled intermediate inventory file</title>

<para>After applying the control factors to the SMOKE intermediate inventory file, the <command>Grwinven</command> program writes a new SMOKE intermediate inventory file that contains the new, controlled emissions. <command>Grwinven</command> writes only the I/O API part of the inventory and not the ASCII part, because the ASCII part that contains the state/county codes, SCCs, and other character strings does not change between the base and future or past year. All of the sources in the base and controlled inventory files are the same; the difference is in their emissions values.</para>

<para><command>Grwinven</command> can also output the ORL inventory format (in addition to I/O API format), which is also a SMOKE input format.
</para>

</section>

<section id="sect_concepts_apply_controls_with_smkmerge">

<title>Using <command>Smkmerge</command> to apply the multiplicative and/or reactivity control matrices</title>

<para>The <command>Smkmerge</command> program is a second option for applying control matrices. <command>Smkmerge</command> can apply both multiplicative control matrices and reactivity control matrices; it is the only program that can do the latter. Because the overall purpose of <command>Smkmerge</command> is to create the model-ready input files for an AQM (as discussed in <xref linkend="sect_concepts_merge_processing" />), the result of using <command>Smkmerge</command> to apply control matrices is a controlled, model-ready file, as opposed to the controlled intermediate inventory file output by <command>Grwinven</command>.</para>

<para><command>Smkmerge</command> is limited to applying one control matrix and one reactivity matrix for each SMOKE source category (area, mobile, or point). Therefore, if multiple reactivity control matrices need to be applied to create a single set of model-ready emission inputs, then the processing must be done as multiple SMOKE runs. This includes separating the sources that need different controls into separate inventory files and performing all SMOKE processing steps separately on the runs, using different reactivity controls for each. The resulting two or more sets of model-ready emissions should not have duplicate sources if the inventories were separated correctly at the start, and the model-ready files are combined using the <command>Mrggrid</command> program, as discussed in <xref linkend="sect_concepts_merge_processing" />.</para>

</section>

</section>

<section id="sect_concepts_elevated_processing">

<title>Elevated-source processing</title>

<para>As introduced in <xref linkend="sect_concepts_model_ready_files" /> and further explained in <xref linkend="sect_concepts_point_source_processing" />, there are two major approaches to processing elevated point sources for air quality modeling. The first approach is to have SMOKE compute the layer assignments for the point sources; this method is used for the CMAQ model. The second approach, which is used for the UAM models and CAM<subscript>X</subscript>, is to select specific sources as elevated and then create a special elevated-point-source file that contains the information needed so that the AQM can compute the plume rise. In both cases, users can select elevated sources specifically (in the second approach, that selection is mandatory). Also, PinG sources can be selected in both cases.</para>

<para>The two approaches have some steps that are the same and some that are different. The rest of this section is split into the following two subsections, one for each elevated-point-source processing approach:</para>

<orderedlist>
<listitem>
<para>Computing layer fractions for CMAQ</para>
</listitem>
<listitem>
<para>Creating an elevated-source file for UAM and CAM<subscript>X</subscript></para>
</listitem>
</orderedlist>

<para>Each subsection first gives an overview of the steps in the approach, then provides more detail on some of them.</para>

<section id="sect_concepts_compute_layer_cmaq">

<title>Computing layer fractions for CMAQ</title>

<para>The CMAQ model requires the layer fractions for elevated point sources to be computed by SMOKE. To do this, SMOKE performs the following steps:</para>

<itemizedlist>
<listitem>
<para>Uses the <command>Smkinven</command> program to import the annual, average-day, day-specific, and/or hour-specific emissions.</para>
</listitem>
<listitem>
<para>Optionally uses the <command>Temporal</command> program to calculate hourly emissions if emissions will be used as a criterion for selecting elevated sources or PinG sources. For example, you may wish to select facilities with NO<subscript>x</subscript> emissions greater than 100 tons/day.</para>
</listitem>
<listitem>
<para>Uses the <command>Elevpoint</command> program to select elevated and/or PinG sources. This step is required if modeling with PinG sources, but optional otherwise.</para>
</listitem>
<listitem>
<para>Uses the <command>Elevpoint</command> program to create the <envar>STACK_GROUPS</envar> file, which is needed for processing PinG sources with CMAQ.</para>
</listitem>
<listitem>
<para>Uses the <command>Laypoint</command> program to compute elevated plume rise for all elevated sources, and store the layer fractions for each source. This step can optionally read the output file from <command>Elevpoint</command> to identify the elevated sources, but otherwise will compute plume rise for <emphasis>all</emphasis> sources.</para>
</listitem>
<listitem>
<para>Uses the <command>Smkmerge</command> program to combine the layer fractions with the hourly emissions to generate the model-ready output files and optionally generate the PinG hourly emissions file for CMAQ.</para>
</listitem>
<listitem>
<para>Optionally uses the <command>Smkreport</command> program to report on elevated or PinG sources.</para>
</listitem>
</itemizedlist>

<section>

<title>Using <command>Elevpoint</command></title>

<para><command>Elevpoint</command> can select elevated and PinG sources using multiple criteria based on emissions values, emissions rank, stack parameters, plant numbers, and an analytical plume rise calculation. The elevated criteria and PinG criteria are provided to <command>Elevpoint</command> using a file called <envar>PELVCONFIG</envar>. You must configure this file to contain the criteria needed to select elevated and PinG sources (if these selections are needed at all). If emissions values are among the criteria, the <command>Elevpoint</command> program reads all of the hourly point-source files listed in the <envar>PTMPLIST</envar> file. This file lists all <envar>PTMP</envar> files that will be evaluated to determine which sources have maximum daily emissions that exceed the specified selection criteria or to determine the emissions rank. Only the maximum daily facility-total emissions can be used by <command>Elevpoint</command> to select sources based on emissions.</para>

<para>The elevated sources and PinG sources can each be selected using different criteria. In general, there are many more elevated sources than PinG sources for typical applications of SMOKE and AQMs. <command>Elevpoint</command> also permits you to group PinG sources and creates the <envar>STACK_GROUPS</envar> file so that the sources are treated as a single source in the PinG rise calculation by CMAQ. Grouping is useful to reduce the total number of PinG stacks processed by CMAQ (PinG processing is a computationally expensive calculation). Using grouping makes sense when several stacks at the same plant have the same, or nearly the same, stack parameters. When that is true, the emissions from the multiple stacks can be grouped and treated as a single PinG stack. Finally, there are two SMOKE settings (<envar>SMK_ELEV_METHOD</envar> and <envar>SMK_PING_METHOD</envar>) that instruct <command>Elevpoint</command> and other SMOKE programs to actually use these criteria to select the elevated and PinG sources. If these settings are not set to <quote>1</quote>, the elevated and PinG selections will not be made, and so will not affect any further processing steps.</para>
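
<para>The emissions-based criterion can be sketched in Python as follows. This is an illustration only, not the <command>Elevpoint</command> implementation, and it does not show <envar>PELVCONFIG</envar> syntax. The record layout, facility names, and the 100 tons/day threshold are hypothetical. The sketch sums stack emissions to facility totals for each day, takes each facility's maximum over the episode, and keeps the facilities whose maximum exceeds the threshold.</para>

<programlisting language="python"><![CDATA[
from collections import defaultdict


def select_by_max_daily_total(daily_stack_emissions, threshold):
    """Return facilities whose maximum daily facility-total emissions exceed threshold.

    daily_stack_emissions: iterable of (facility, day, tons) records, one per stack-day.
    """
    # Step 1: sum all stacks at a facility for each day.
    totals = defaultdict(float)
    for facility, day, tons in daily_stack_emissions:
        totals[(facility, day)] += tons

    # Step 2: keep the largest daily total seen for each facility.
    max_daily = defaultdict(float)
    for (facility, _day), total in totals.items():
        max_daily[facility] = max(max_daily[facility], total)

    # Step 3: apply the selection threshold.
    return sorted(f for f, peak in max_daily.items() if peak > threshold)


# Hypothetical records: (facility, day, tons/day for one stack).
records = [
    ("PLANT_A", "day1", 60.0),
    ("PLANT_A", "day1", 55.0),
    ("PLANT_A", "day2", 40.0),
    ("PLANT_B", "day1", 90.0),
    ("PLANT_B", "day2", 30.0),
]
selected = select_by_max_daily_total(records, threshold=100.0)
]]></programlisting>

<para>Here PLANT_A is selected because its two stacks together exceed the threshold on one day, even though neither stack does so alone; this reflects the facility-total basis described above.</para>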

</section>

<section>

<title>Using <command>Laypoint</command></title>

<para>In this CMAQ approach, <command>Laypoint</command> uses gridded, hourly meteorological data and stack parameters to calculate the plume rise for all point-source emissions. The program&rsquo;s approach is based on the Briggs algorithm, as explained in detail in <xref linkend="sect_programs_elevpoint_briggs"/>, and provides the top and bottom heights of the plume. <command>Laypoint</command> uses these heights to compute the plumes&rsquo; distributions into the vertical layers that the plumes intersect, using the pressure difference across each layer over the pressure difference across the entire plume as a weighting factor to make this calculation. This approach gives plume fractions by layer and source. Only these fractions are stored in the output file (<envar>PLAY</envar>) from the <command>Laypoint</command> program (not the emissions in each layer).</para>
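
<para>The pressure weighting can be illustrated with a minimal Python sketch. It is not the <command>Laypoint</command> source code; the function name, the interface pressures, and the plume pressures are hypothetical, and the sketch assumes that the plume top and bottom have already been converted from heights to pressures and that the plume lies entirely within the listed layers. Each layer's fraction is the pressure thickness of the plume that falls within the layer, divided by the pressure thickness of the whole plume.</para>

<programlisting language="python"><![CDATA[
def layer_fractions(interface_pressures, plume_bottom_p, plume_top_p):
    """Distribute a plume into layers using pressure differences as weights.

    interface_pressures: pressures (Pa) at layer interfaces, surface first,
    decreasing with height. plume_bottom_p > plume_top_p (pressure falls with height).
    """
    plume_dp = plume_bottom_p - plume_top_p
    if plume_dp <= 0.0:
        raise ValueError("plume bottom pressure must exceed plume top pressure")

    fractions = []
    for lower_p, upper_p in zip(interface_pressures, interface_pressures[1:]):
        # Pressure thickness of the part of the plume inside this layer.
        overlap = min(lower_p, plume_bottom_p) - max(upper_p, plume_top_p)
        fractions.append(max(overlap, 0.0) / plume_dp)
    return fractions


# Hypothetical four-layer column and a plume spanning parts of layers 2 and 3.
interfaces = [100000.0, 99000.0, 97000.0, 94000.0, 90000.0]
fractions = layer_fractions(interfaces, plume_bottom_p=98000.0, plume_top_p=95000.0)
]]></programlisting>

<para>In this example, one third of the plume falls in layer 2 and two thirds in layer 3; layers 1 and 4 receive nothing, and the fractions sum to one.</para>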

<para>If explicit plume rise sources (e.g., wildfires with precomputed hourly plume rise) are included in the inventory, <command>Laypoint</command> will skip the plume rise calculation for these sources. Instead, it will use the hourly data from the SMOKE <envar>PHOUR</envar> intermediate file, which describe the fraction of emissions in layer 1 and the top and bottom of the plume. <command>Laypoint</command> will combine these data with the pressure weights used for all elevated point sources to compute the fraction of emissions to go into each layer.</para>

</section>

<section>

<title>Using <command>Smkmerge</command></title>

<para><command>Smkmerge</command> applies the layer fractions from <command>Laypoint</command> to the elevated sources to compute the emissions in each layer. This approach has the advantage of allowing you to avoid repeating the plume rise calculations for each control strategy or grid. If the <envar>SMK_PING_METHOD</envar> setting (discussed above) indicates that the special CMAQ PinG file should be created, the <command>Smkmerge</command> program will also output this special file, called the <envar>PINGTS_L</envar> file. It contains the hourly, speciated emissions for each PinG source (which could be a stack group, as explained above).</para>
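
<para>The following Python sketch illustrates why stored fractions avoid repeated plume rise calculations. It is not <command>Smkmerge</command> code; the fractions, the emissions value, and the control factor are hypothetical. The same set of layer fractions for one source and hour is reused for a base case and for a controlled case.</para>

<programlisting language="python"><![CDATA[
def layer_emissions(hourly_emission, fractions):
    """Split one source-hour of emissions across layers using stored fractions."""
    return [hourly_emission * f for f in fractions]


# Hypothetical layer fractions for one source and hour, computed once.
fractions = [0.0, 0.25, 0.75, 0.0]

# The same fractions serve both the base case and a controlled case.
base_case = layer_emissions(8.0, fractions)
controlled_case = layer_emissions(8.0 * 0.5, fractions)
]]></programlisting>

<para>Only the emissions change between the two cases; the vertical distribution is computed once and reused.</para>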

</section>

<section>

<title>Optional use of <command>Smkreport</command></title>

<para>If desired, <command>Smkreport</command> can apply the layer fractions and elevated or PinG statuses to the inventory to generate reports that include layer information and/or the elevated or PinG status. This reporting could be used, for example, to create a list of all PinG sources in the inventory, or to determine the elevated versus layer-1 emissions by state or SCC.</para>

</section>

</section>

<section id="sect_concepts_create_elevated_uam">

<title>Creating an elevated-source file for UAM and CAM<subscript>X</subscript></title>

<para>The other approach to modeling elevated sources is to create an elevated-point-source input file for one of the UAM models or CAM<subscript>X</subscript>. To do this, SMOKE performs the following steps:</para>

<itemizedlist>
<listitem>
<para>Uses the <command>Smkinven</command> program to import the annual, average-day, day-specific, and/or hour-specific emissions.</para>
</listitem>
<listitem>
<para>Optionally uses the <command>Temporal</command> program to calculate hourly emissions if emissions will be used as a criterion for selecting elevated sources or PinG sources. For example, you may wish to select facilities with NO<subscript>x</subscript> emissions greater than 100 tons/day.</para>
</listitem>
<listitem>
<para>Uses the <command>Elevpoint</command> program to select elevated and optionally PinG sources. Unlike processing for CMAQ, this step is always required.</para>
</listitem>
<listitem>
<para>Uses the <command>Elevpoint</command> program to create the <envar>STACK_GROUPS</envar> file, which is needed for creating the elevated-point-source file with <command>Smkmerge</command>.</para>
</listitem>
<listitem>
<para>Uses the <command>Laypoint</command> program to compute elevated plume rise for explicit plume rise sources, and store the layer fractions for the explicit sources only.</para>
</listitem>
<listitem>
<para>Uses the <command>Smkmerge</command> program to combine the <envar>STACK_GROUPS</envar> information, optional explicit plume rise information, and the hourly emissions to generate an ASCII elevated-point-source file with optional PinG flags.</para>
</listitem>
</itemizedlist>

<section>

<title>Using <command>Elevpoint</command></title>

<para>The primary difference between this approach and the CMAQ approach is that the <command>Elevpoint</command> processing step is required. Traditionally, elevated point sources have been selected for the UAM or CAM<subscript>X</subscript> models using the analytical plume rise calculation. While <command>Elevpoint</command> can perform this calculation and use it exclusively to determine the elevated sources, you can also use the other selection criteria if desired: emissions values, emissions rank, stack parameters, and plant numbers. As described in <xref linkend="sect_concepts_compute_layer_cmaq" />, if the selection criteria include emissions, then the <envar>PTMPLIST</envar> file will be used to input all hourly emissions files for the entire modeling episode.</para>

</section>

<section>

<title>Using <command>Laypoint</command></title>

<para>If explicit plume rise sources (e.g., wildfires with precomputed hourly plume rise) are included in the inventory, you must run the <command>Laypoint</command> program to compute the layer fractions for these sources only. This is the <emphasis>only</emphasis> reason <command>Laypoint</command> would be run in the UAM/CAM<subscript>X</subscript> processing approach, and it requires that the <envar>EXPLICIT_PLUMES_YN</envar> and the <envar>HOUR_PLUMEDATA_YN</envar> settings be set to Y. These settings cause <command>Laypoint</command> to write the layer fractions to the <envar>PLAY_EX</envar> file instead of to the usual <envar>PLAY</envar> file. For these explicit sources, <command>Laypoint</command> will skip the plume rise calculation. Instead, it will use the hourly data from the SMOKE <envar>PHOUR</envar> intermediate file containing the fraction of emissions in layer 1 and the top and bottom of the plume. <command>Laypoint</command> will combine these data with the pressure weights used for all elevated point sources to compute the fraction of emissions to go into each layer.</para>

</section>

<section>

<title>Using <command>Smkmerge</command></title>

<para>The <command>Smkmerge</command> program looks for the <envar>SMK_ASCIIELEV_YN</envar> setting to determine whether the ASCII output file should be created. When this is set to Y, the output files from <command>Elevpoint</command> (<envar>PELV</envar> and <envar>STACK_GROUPS</envar>) are read to determine which sources should not be included in the 2-D emissions output file for point sources. The emissions from these sources are instead output to an ASCII elevated file along with stack parameters and locations so that the AQM can compute the plume rise.</para>

<para>If <envar>EXPLICIT_PLUMES_YN</envar> is also set to Y, <command>Smkmerge</command> will read the <envar>PLAY_EX</envar> file for the explicit plume sources. Since the UAM-based approach assumes that the AQM will compute the plume rise, SMOKE must manipulate the input file to trick the model into using precomputed plume rise. This is done by inserting fake stacks into the ASCII elevated file that extend to the center of each of the model layers and setting the stack parameters so that the plume rise calculation will keep the emissions associated with the fake stacks in the layer of the stack. As the emissions move from layer to layer, <command>Smkmerge</command> moves the reported emissions in the ASCII elevated file from fake stack to fake stack to represent the same behavior.</para>

</section>

</section>

</section>

<section id="sect_concepts_onroad_processing_moves">

<title>Mobile-source processing with MOVES</title>
<para>MOVES is the U.S. Environmental Protection Agency's (EPA) Motor Vehicle Emission Simulator.  In the modeling process, the user specifies vehicle types, time periods, geographical areas, pollutants, vehicle operating characteristics, and road types to be modeled. The model then performs a series of calculations, which have been carefully developed to accurately reflect vehicle operating processes (such as cold start or extended idle) and provide estimates of bulk emissions or emission rates. </para>
<para>An important feature of MOVES is that it allows users to choose between (1) the Inventory calculation type, which provides emissions as total quantities for a given time period; and (2) the Emission Rate calculation type, which provides emission rates in terms of grams/mile or grams/vehicle/hour. For large-scale emissions modeling, such as that needed for regional- and national-scale air quality modeling projects, it is desirable to use the Emission Rate calculation type, which populates emission rate lookup tables that can then be applied to many times and places, thus reducing the total number of MOVES runs required.</para>
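<para>The benefit of a lookup table can be illustrated with a small Python sketch. It is not MOVES or SMOKE code: the temperature keys, the rates, the county names, and the activity values are hypothetical, and a real lookup table has many more dimensions. The sketch shows one table of grams/mile rates being reused for several counties by multiplying the matching rate by each county's vehicle miles traveled.</para>

<programlisting language="python"><![CDATA[
# Hypothetical rate-per-distance lookup table: temperature (deg F) -> grams/mile.
rate_per_distance = {60: 0.5, 80: 0.25}

# Hypothetical activity records: (county, temperature in deg F, vehicle miles traveled).
activity = [
    ("county_1", 60, 1000.0),
    ("county_2", 80, 2500.0),
    ("county_3", 60, 200.0),
]

# One lookup table serves every county: emissions = rate * activity.
emissions_grams = {
    county: rate_per_distance[temperature] * vmt
    for county, temperature, vmt in activity
}
]]></programlisting>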
<para>To reduce time and effort and to help obtain more accurate modeling results, users need to prepare and post-process MOVES runs for a representative county (see <xref linkend="sect_concepts_reference_counties_moves" />) and reference fuel month (see <xref linkend="sect_concepts_moves_reference_fuel_month" />). This approach consists of a set of scripts that automate the proper use of the Emission Rate calculations for the purpose of estimating mobile-source emissions for air quality (AQ) modeling.</para>
<para>Integrating MOVES into the SMOKE modeling system consists of three major parts:</para>
<orderedlist>
<listitem>
<para>Meteorological data processing</para>
<itemizedlist>
<listitem>
<para>The meteorological data preprocessor program <link linkend="sect_programs_met4moves"><command>Met4moves</command></link> uses the Meteorology-Chemistry Interface Processor (MCIP) output files to prepare spatially and temporally averaged temperature and relative humidity data, which set up the meteorological input conditions for MOVES and SMOKE.</para>
</listitem>
</itemizedlist>
</listitem>
<listitem>
<para>MOVES model processing</para>
<itemizedlist>
<listitem>
<para><link linkend="section_moves_driver_script">The MOVES Driver script</link>: <command><quote>Runspec_generator.pl</quote></command> creates data importer files and the MOVES input file (runspec), which specifies the characteristics of the particular scenario to be modeled.</para>
</listitem>
<listitem>
<para><link linkend="section_moves_postprocessing_scripts">The MOVES postprocessing scripts</link>: <command><quote>Moves2smkEF.pl</quote></command> formats the MOVES emission rate lookup tables for SMOKE. <command><quote>gen_8digit_scc.pl</quote></command> generates an SCC mapping file used when importing activity data.</para>
</listitem>
</itemizedlist>
</listitem>
<listitem>
<para>SMOKE model processing</para>
<itemizedlist>
<listitem>
<para>Spatially and temporally allocates on-road mobile activity data (i.e., vehicle miles traveled and vehicle population).</para>
</listitem>
<listitem>
<para>Uses the MOVES postprocessing program, <link linkend="sect_programs_movesmrg"><command>Movesmrg</command></link>, to estimate emissions from on-road mobile sources based on MOVES-based emission rate lookup tables and meteorology data from <command>Met4moves</command>.</para>
</listitem>
<listitem>
<para>Creates hourly, gridded, speciated air quality model-ready input files.</para>
</listitem>
<listitem>
<para>Produces various types of reports for users.</para>
</listitem>
</itemizedlist>
</listitem>
</orderedlist>
<section>
	<title>Meteorology Data Processing</title>
		<para>Under the representative county and reference fuel month approach, <command>Met4moves</command> calculates the temperature and relative humidity (RH) inputs for the MOVES and SMOKE modeling systems. It uses hourly min/max temperatures and averaged RH over the spatial region that includes all of the inventory counties in a county group, over the user-defined modeling period. <command>Met4moves</command> supports monthly or daily averaging to create min/max temperatures and averaged RH for all inventory counties in the county group(s). Using Meteorology-Chemistry Interface Processor (MCIP) output files, <command>Met4moves</command> determines the min/max grid cell temperatures and associated RH for both SMOKE and MOVES, and computes average 24-hour temperature profiles for use in MOVES. The <command>Met4moves</command> program is discussed in detail in <xref linkend="sect_programs_met4moves" />.</para>
		<para>The 24-hour temperature profiles are averaged over a user-specified time period and grid cells for all representative counties. For <link linkend="section_moves_utilities">the MOVES Driver scripts</link>, <command>Met4moves</command> outputs monthly average RH, min/max temperatures, and 24-hour temperature profiles in local time for all representative counties into one output file. For the SMOKE model, <command>Met4moves</command> outputs county-specific min/max temperatures and averaged RH values in local time for every inventory county and averaging period in the modeling inventory.</para>
	</section>
<section>
	<title>MOVES Model Processing</title>
	<section>
	<title>MOVES Driver Script</title>
	<para>The inputs to <link linkend="section_moves_driver_script">the MOVES Driver Script</link> (<command>Runspec_generator.pl</command>) include the temperature and humidity conditions output from <command>Met4moves</command> and two additional inputs, the RunControl and RepCounty files. The RunControl file contains pollutant selections and the file path location to the <command>Met4moves</command> output. The RepCounty file contains file path locations to user-created MOVES-formatted inputs for age distribution, fuel supply and formulation, inspection and maintenance programs, county-level population, and annual VMT for each representative county. Precise formats of the RunControl and RepCounty files can be found in <xref linkend="sect_input_runctlfile" /> and <xref linkend="sect_input_refcountyfile" />, respectively.</para>
</section>

<section>
	<title>MOVES Post-processing Script</title>
		<para>The MOVES Driver Script automates the MOVES run setup and prepares two kinds of batch files. When launched, the batch files instruct MOVES to:</para>
	<itemizedlist>
		<listitem>
		<para>import data into MySQL County Scale databases</para>
		</listitem>
		<listitem>
		<para>run MOVES for each runspec file</para>
		</listitem>
	</itemizedlist>
	<para>The resulting RatePerDistance (RPD), RatePerVehicle (RPV), RatePerProfile (RPP), and RatePerHour (RPH) tables contain all the conditions needed for regional modeling using the SMOKE modeling system.</para>
	<para>Once a MOVES batch run completes, MOVES populates the four output lookup tables with the formats listed in <xref linkend="tbl_concepts_moves_lookup_tbl_mv_format" />. A <link linkend="section_moves_postprocessing_scripts">MOVES Post-processing Script</link> (<command>Moves2smkEF.pl</command>), written in Perl, interacts with MySQL to convert the default formats shown in <xref linkend="tbl_concepts_moves_lookup_tbl_mv_format" /> into an ASCII text format readable by SMOKE, shown in <xref linkend="tbl_concepts_moves_lookup_tbl_sm_format" />.</para>
<table id="tbl_concepts_moves_lookup_tbl_mv_format">
<title>MOVES Emission Rate Lookup Table (MOVES Format)</title>

<tgroup cols="4">
<colspec colname="c1" colwidth="10*" />
<colspec colname="c2" colwidth="10*" />
<colspec colname="c3" colwidth="10*" />
<colspec colname="c4" colwidth="10*" />

<thead>
<row>
<entry valign="bottom" align="center">RatePerDistance (grams/mile)</entry>
<entry valign="bottom" align="center">RatePerVehicle (grams/vehicle/hour)</entry>
<entry valign="bottom" align="center">RatePerProfile (grams/vehicle/hour)</entry>
<entry valign="bottom" align="center">RatePerHour (grams/activity-hour)</entry>
</row>
</thead>

<tbody>
<row>
<entry>MOVESScenarioID</entry>
<entry>MOVESScenarioID</entry>
<entry>MOVESScenarioID</entry>
<entry>MOVESScenarioID</entry>
</row>
<row>
<entry>MOVESRunID</entry>
<entry>MOVESRunID</entry>
<entry>MOVESRunID</entry>
<entry>MOVESRunID</entry>
</row>
<row>
	<entry>yearID</entry>
	<entry>yearID</entry>
	<entry>temperatureProfileID</entry>
	<entry>yearID</entry>
</row>
<row>
	<entry>monthID</entry>
	<entry>monthID</entry>
	<entry>yearID</entry>
	<entry>monthID</entry>
</row>
<row>
		<entry>dayID</entry>
		<entry>dayID</entry>
		<entry>dayID</entry>
		<entry>dayID</entry>
</row>
<row>
    <entry>hourID</entry>
    <entry>hourID</entry>
    <entry>hourID</entry>
    <entry>hourID</entry>
</row>
<row>
	<entry>linkID</entry>
	<entry>zoneID</entry>
	<entry>pollutantID</entry>
	<entry>linkID</entry>
</row>
<row>
	<entry>pollutantID</entry>
	<entry>pollutantID</entry>
	<entry>processID</entry>
	<entry>pollutantID</entry>
</row>
<row>
	<entry>processID</entry>
	<entry>processID</entry>
	<entry>sourceTypeID</entry>
	<entry>processID</entry>
</row>
<row>
	<entry>sourceTypeID</entry>
	<entry>sourceTypeID</entry>
	<entry>SCC</entry>
	<entry>sourceTypeID</entry>
</row>
<row>
	<entry>SCC</entry>
	<entry>SCC</entry>
	<entry>fuelTypeID</entry>
	<entry>SCC</entry>
</row>
<row>
	<entry>fuelTypeID</entry>
	<entry>fuelTypeID</entry>
	<entry>modelYearID</entry>
	<entry>fuelTypeID</entry>
</row>
<row>
	<entry>modelYearID</entry>
	<entry>modelYearID</entry>
	<entry>Temperature</entry>
	<entry>modelYearID</entry>
</row>
<row>
	<entry>roadTypeID</entry>
	<entry>Temperature</entry>
	<entry>RatePerProfile</entry>
	<entry>roadTypeID</entry>
</row>
<row>
	<entry>avgSpeedBinID</entry>
	<entry>RatePerVehicle</entry>
	<entry></entry>
	<entry>Temperature</entry>
</row>
<row>
	<entry>Temperature</entry>
	<entry></entry>
	<entry></entry>
	<entry>RatePerHour</entry>
</row>
<row>
	<entry>relHumidity</entry>
</row>
<row>
	<entry>RatePerDistance</entry>
</row>
</tbody>
</tgroup>
</table>

<para>The MOVES Post-processing Script converts the MOVES format Emission Rate Lookup Table to the SMOKE format Emission Rate Table by performing the following:</para>
<itemizedlist>
	<listitem>
		<para>Parses the state-county FIPS code from linkID, zoneID, or temperatureProfileID and stores it in a separate field, FIPS.</para>
	</listitem>
<listitem>
	<para>Removes fields that SMOKE does not use, including the MOVES source type, fuel type, and road type; hourID in RatePerDistance; and temperatureProfileID in RatePerProfile.</para>
</listitem>
<listitem>
	<para>Reduces the output database table size by performing a cross-tab query on the pollutant emissions, so that each pollutant is listed in its own field rather than in a single column spread across many more data records. The script also sorts the lookup tables by countyID, monthID, and SCC for more efficient processing in SMOKE.</para>
</listitem>
<listitem>
  <para>Optionally calculates additional output pollutants or species by applying user-specified formulas to MOVES-created emission factors.</para>
</listitem>
<listitem>
	<para>Writes the four processed, SMOKE-ready MOVES lookup tables [RatePerDistance (RPD), RatePerVehicle (RPV), RatePerProfile (RPP), and RatePerHour (RPH)] as ASCII-formatted files.</para>
</listitem>
</itemizedlist>
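<para>The cross-tab and sort steps in the list above can be illustrated with a short Python sketch. This is not the logic of <command>Moves2smkEF.pl</command> itself (which is a Perl script operating on MySQL tables); the record layout, the SCC labels, the rate values, and the helper name <literal>crosstab_rates</literal> are all hypothetical and serve only to show how one record per pollutant collapses into one record per key with a column per pollutant.</para>
<programlisting language="python"><![CDATA[
```python
def crosstab_rates(long_rows, key_fields, pollutant_field="pollutant", rate_field="rate"):
    """Collapse one-record-per-pollutant rows into one row per key, one column per pollutant."""
    wide = {}        # key tuple -> output record
    pollutants = []  # pollutant names in first-seen order (these become columns)
    for row in long_rows:
        key = tuple(row[field] for field in key_fields)
        record = wide.setdefault(key, dict(zip(key_fields, key)))
        pollutant = row[pollutant_field]
        if pollutant not in pollutants:
            pollutants.append(pollutant)
        record[pollutant] = row[rate_field]
    # Sort the wide records by their key fields, mirroring the sort step in the text.
    return [wide[key] for key in sorted(wide)], pollutants


# Hypothetical records: the FIPS code, SCC labels, and rates are made up for illustration.
long_rows = [
    {"FIPS": "37001", "monthID": 7, "SCC": "SCC_B", "temperature": 70.0, "pollutant": "CO", "rate": 2.0},
    {"FIPS": "37001", "monthID": 7, "SCC": "SCC_B", "temperature": 70.0, "pollutant": "NOX", "rate": 0.5},
    {"FIPS": "37001", "monthID": 7, "SCC": "SCC_A", "temperature": 70.0, "pollutant": "CO", "rate": 1.0},
    {"FIPS": "37001", "monthID": 7, "SCC": "SCC_A", "temperature": 70.0, "pollutant": "NOX", "rate": 0.25},
]

wide_rows, pollutants = crosstab_rates(long_rows, ["FIPS", "monthID", "SCC", "temperature"])
for record in wide_rows:
    print(record)
```
]]></programlisting>
<para>In this sketch, four long-format records become two wide records, which is the reduction in record count that the cross-tab query is meant to achieve.</para>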

<table id="tbl_concepts_moves_lookup_tbl_sm_format">
<title>MOVES Emission Rate Lookup Table (SMOKE Format)</title>
<tgroup cols="6">
<colspec colname="c1" colwidth="10*" />
<colspec colname="c2" colwidth="10*" />
<colspec colname="c3" colwidth="10*" />
<colspec colname="c4" colwidth="10*" />
<colspec colname="c5" colwidth="10*" />
<colspec colname="c6" colwidth="10*" />

<thead>
<row>
<entry valign="bottom" align="center">RatePerDistance (grams/mile)</entry>
<entry valign="bottom" align="center">RatePerVehicle (grams/vehicle/hour)</entry>
<entry valign="bottom" align="center">RatePerStart (grams/start)</entry>
<entry valign="bottom" align="center">RatePerProfile (grams/vehicle/hour)</entry>
<entry valign="bottom" align="center">RatePerHour (grams/activity-hour)</entry>
<entry valign="bottom" align="center">RatePerHour_ONI (grams/activity-hour)</entry>
</row>
</thead>

<tbody>
<row>
<entry>MOVESScenarioID</entry>
<entry>MOVESScenarioID</entry>
<entry>MOVESScenarioID</entry>
<entry>MOVESScenarioID</entry>
<entry>MOVESScenarioID</entry>
<entry>MOVESScenarioID</entry>
</row>
<row>
	<entry>yearID</entry>
	<entry>yearID</entry>
	<entry>yearID</entry>
	<entry>yearID</entry>
	<entry>yearID</entry>
	<entry>yearID</entry>
</row>
<row>
	<entry>monthID</entry>
	<entry>monthID</entry>
	<entry>monthID</entry>
	<entry>monthID</entry>
	<entry>monthID</entry>
	<entry>monthID</entry>
</row>
<row>
	<entry>FIPS</entry>
	<entry>dayID</entry>
	<entry>dayID</entry>
	<entry>dayID</entry>
	<entry>FIPS</entry>
	<entry>FIPS</entry>
</row>
<row>
    <entry>SCC</entry>
    <entry>hourID</entry>
    <entry>hourID</entry>
    <entry>hourID</entry>
    <entry>SCC</entry>
    <entry>SCC</entry>
</row>
<row>
	<entry>avgSpeedBin</entry>
	<entry>FIPS</entry>
	<entry>FIPS</entry>
	<entry>FIPS</entry>
	<entry>temperature</entry>
	<entry>temperature</entry>
</row>
<row>
	<entry>temperature</entry>
	<entry>SCC</entry>
	<entry>SCC</entry>
	<entry>SCC</entry>
	<entry>CO</entry>
	<entry>CO</entry>
</row>
<row>
	<entry>relHumidity</entry>
	<entry>temperature (for each grid cell)</entry>
	<entry>temperature (for each grid cell)</entry>
	<entry>temperature (24hr temporal profile)</entry>
	<entry>TOG</entry>
	<entry>TOG</entry>
</row>
<row>
	<entry>CO</entry>
	<entry>CO</entry>
	<entry>CO</entry>
	<entry>THC</entry>
	<entry>BENZENE</entry>
	<entry>BENZENE</entry>
</row>
<row>
	<entry>TOG</entry>
	<entry>NOX</entry>
	<entry>NOX</entry>
	<entry>TOG</entry>
	<entry>NOX</entry>
	<entry>NOX</entry>
</row>
<row>
	<entry>BENZENE</entry>
	<entry>PM10OC</entry>
	<entry>PM10OC</entry>
	<entry>VOC</entry>
	<entry>VOC</entry>
	<entry>VOC</entry>
</row>
<row>
    <entry>...</entry>
	<entry>...</entry>
	<entry>...</entry>
	<entry>...</entry>
	<entry>...</entry>
	<entry>...</entry>
</row>
</tbody>
</tgroup>
</table>
</section>
</section>
<section>
	<title>SMOKE Model Processing</title>
	<para>Once the <command>Met4moves</command> meteorology preprocessor and the MOVES model processing that results in the SMOKE-formatted emissions factor lookup tables are completed, we address the remaining major component of the SMOKE-MOVES tool: the SMOKE model processing step. The goals of this step are (1) to estimate emissions from on-road mobile sources based on MOVES-based emissions lookup tables and meteorology data, (2) to create hourly gridded speciated air quality model-ready input files, and (3) to produce various types of reports for the user.</para>
	<para>As some readers are aware, MOBILE6 and MOVES are both vehicle emissions modeling systems used with SMOKE. However, they differ in how they calculate off-network evaporative emissions. In MOBILE6, off-network emissions processes are calculated as emission factors in grams/mile, which are tied to vehicle miles traveled (VMT). MOVES, on the other hand, uses the source (vehicle) type population (VPOP) to calculate start and off-network evaporative emissions, which are assigned to off-network emissions processes; these processes are hour-dependent because of the VPOP (activity) assumptions built into the MOVES model. Thus, compared to the SMOKE-MOBILE6 approach, the SMOKE-MOVES approach requires additional vehicle population inventory data as input for estimating mobile-source emissions from off-network emissions processes. This requirement is reflected in the discussion below.</para>
  <para>When processing mobile-source emissions from MOVES, SMOKE performs the following basic steps.</para>
<itemizedlist>
	<listitem><link linkend="sect_programs_smkinven"><command>Smkinven</command></link> imports county-total VMT and average speed, and county-total vehicle hotelling hours (HOTELLING) by SCC for On-roadway Emission Processes and county-total vehicle population (VPOP) by vehicle type for Off-network Emission Processes.</listitem>
  <listitem><link linkend="sect_programs_spcmat"><command>Spcmat</command></link> computes the chemical speciation factors for each county, fuel type, source (=vehicle) type, road type, emission process, and pollutant, and stores the necessary factors for the VMT-to-species in RatePerDistance (RPD), HOTELLING-to-species in RatePerHour (RPH), and VPOP-to-species transformations in RatePerVehicle (RPV) and RatePerProfile (RPP).</listitem>
  <listitem><link linkend="sect_programs_grdmat"><command>Grdmat</command></link> allocates the county sources to grid cells and uses spatial surrogates to allocate county-total VMT, HOTELLING, and VPOP to grid cells, storing the factors needed for these allocations.</listitem>
  <listitem><link linkend="sect_programs_temporal"><command>Temporal</command></link> computes hourly VMT and HOTELLING activity data for On-roadway Emission Processes (RPD and RPH tables). Off-network Emission Processes (RPV and RPP tables) do not require the <command>Temporal</command> program because vehicle population does not need to be temporally allocated.</listitem>
</itemizedlist>
<para>The way the MOVES-generated emissions factor lookup tables are used varies according to whether SMOKE is modeling on-roadway emission processes or off-network emissions processes.</para>
<para>On-roadway emission processes: When estimates of all on-roadway emission processes except extended idle exhaust are needed, SMOKE requires county-total VMT and average hourly speed (SPEED) inventory data as inputs to a SMOKE postprocessor called <command>Movesmrg</command>, which is part of the SMOKE-MOVES tool. When the extended idle exhaust process is estimated, <command>Movesmrg</command> also needs county-total HOTELLING activity data. <command>Movesmrg</command> uses <link linkend="tbl_concepts_moves_lookup_tbl_sm_format">the SMOKE-ready MOVES RPD and RPH lookup tables</link> as inputs to estimate on-road source emissions. The key lookup fields for RPD are gridded hourly temperature and average hourly speed from the avgSpeedBinID field; SMOKE interpolates in the emission factors lookup table (in units of grams/mile) based on these two values. For RPH, the gridded hourly temperature is used to interpolate in the emission factors lookup table (in units of grams/hour). <xref linkend="fig_concepts_mobile_moves_onroad"/> shows the processing steps for on-roadway emissions processes in the SMOKE system using VMT and SPEED activity inventory data.</para>
<para>Off-network emission processes: When estimates are needed for off-network emissions processes, including the off-network vapor venting emissions process, SMOKE uses county-total VPOP by vehicle type as input to <command>Movesmrg</command>, together with the SMOKE-ready RatePerVehicle (RPV) and RatePerProfile (RPP) lookup tables. <xref linkend="fig_concepts_mobile_moves_offroad"/> shows the processing steps for off-network emissions processes in the SMOKE modeling system using vehicle population activity inventory data. A significant difference between the processing steps for on-roadway emissions processes (RPD table) and those for off-network emissions processes (RPV and RPP tables) is that off-network emissions processing does not require the <command>Temporal</command> program step, because vehicle population (VPOP) does not need to be temporally allocated. In the RPV table, gridded hourly temperature and hour of the day are the key lookup fields SMOKE uses to estimate hourly off-network emissions in units of grams/vehicle/hour. For the evaporative fuel off-network vapor venting emissions process only, <command>Movesmrg</command> uses the RPP lookup table to estimate the emission rates based on the minimum and maximum temperatures computed by <command>Met4moves</command>.</para>
<para>SMOKE performs linear interpolation when using any of the three SMOKE-ready emission rate lookup tables (RPD, RPV, and RPP).</para>
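<para>As a rough illustration of linear interpolation in a lookup table, the following Python sketch interpolates an RPD-style emission rate (grams/mile) by temperature and multiplies it by VMT. It is a simplified, one-dimensional sketch and not the <command>Movesmrg</command> implementation: the function name <literal>interpolate_rate</literal>, the table values, and the choice to clamp temperatures outside the table range are assumptions made for the example.</para>
<programlisting language="python"><![CDATA[
```python
from bisect import bisect_right


def interpolate_rate(temps, rates, temperature):
    """Linearly interpolate an emission rate at `temperature`.

    `temps` must be sorted in ascending order and align with `rates`.
    Temperatures outside the table are clamped to the end values
    (an assumption of this sketch, not documented SMOKE behavior).
    """
    if not temps or len(temps) != len(rates):
        raise ValueError("temps and rates must be non-empty and the same length")
    if temperature <= temps[0]:
        return rates[0]
    if temperature >= temps[-1]:
        return rates[-1]
    hi = bisect_right(temps, temperature)  # first table temperature above the target
    lo = hi - 1
    frac = (temperature - temps[lo]) / (temps[hi] - temps[lo])
    return rates[lo] + frac * (rates[hi] - rates[lo])


# Hypothetical lookup rows for one county, SCC, and speed bin.
table_temps = [40.0, 60.0, 80.0]  # temperature values in the table
table_rates = [1.0, 2.0, 4.0]     # grams/mile at those temperatures

rate = interpolate_rate(table_temps, table_rates, 70.0)
grams = 100.0 * rate  # 100 miles of hourly VMT in one grid cell
print(rate, grams)
```
]]></programlisting>
<para>For RPD, the same idea would be applied along a second axis (average speed) as well; that extension is omitted here to keep the sketch short.</para>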
</section>
</section>

<section id="sect_concepts_biogenic_processing">

<title>Biogenic processing</title>

<para>SMOKE currently supports the Biogenic Emissions Inventory System (BEIS) 4 and previous versions for processing emissions of biogenic origin. The overall processing performed by these models is quite different from the processing done for anthropogenic source categories. BEIS4 starts with spatial allocation of land use data as the first processing step (which corresponds to importing the raw inventory data for anthropogenic sources). In the next step, the model computes normalized emissions for each grid cell and land use category. The final step is adjusting the normalized emissions based on gridded, hourly meteorology data and assigning the chemical species to output a model-ready biogenic emissions file. The following subsection provides more detail about the most recent biogenic model implementation in SMOKE and gives references to additional detail elsewhere.</para>

<section>

<title>BEIS4 processing</title>

<para>The concepts in BEIS4 are similar to those in BEIS3 and previous versions, except that the program <command>Normbeis4</command> reads gridded land use data from a single file in I/O API input format (<envar>BELD6</envar>) and gridded leaf biomass inputs from a separate file (<envar>BIOMASS</envar>). <command>Normbeis4</command> creates normalized biogenic emissions for both winter and summer. The <command>Tmpbeis4</command> program reads the MCIP meteorology data and adjusts the normalized emissions for meteorological conditions; these adjustments include the effects of temperature and solar radiation.</para>

<para><command>Tmpbeis4</command> gives the user two options for using the winter and summer normalized emissions. The recommended option is to use the gridded soil temperature from the MCIP data to determine the fraction of summer and winter normalized emissions to use for each grid cell. When the gridded soil temperature is greater than or equal to 290 K, the summer normalized emissions are used in the hourly emissions calculations. If the gridded soil temperature is below 282 K, the winter normalized emissions are used. Any grid cell with a soil temperature between 282 K and 290 K uses a fraction of the winter and a fraction of the summer normalized emissions. This recommended option should provide more realistic emissions during seasonal transition periods (e.g., winter to spring).</para>

<para>The other option is the approach used with <command>Tmpbeis3</command>, which relies on <command>Metscan</command>. The winter emission factors are used during the "winter" period, which is defined as being after the first date of freezing ground temperature and before the last date of freezing ground temperature. The SMOKE <command>Metscan</command> program can determine this time period and create a gridded file that indicates whether each grid cell is in a winter or summer period for each day of the year. Using this optional file causes <command>Tmpbeis4</command> to read and use both the winter and summer normalized emissions files from <command>Normbeis4</command>; the output from <command>Metscan</command> then determines whether the winter or summer normalized emissions are used for a given grid cell and hour.</para>

<para>Users can define the chemical species that are output from <command>Tmpbeis4</command> using the speciation profile file, <envar>GSPRO</envar>. An environment variable (<envar>BIOG_SPRO</envar>) indicates which speciation profile code in this file should be used for biogenic emissions. More information on <command>Normbeis4</command> and <command>Tmpbeis4</command> is provided in <xref linkend="sect_programs_normbeis4" /> and <xref linkend="sect_programs_tmpbeis4" />.</para>
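<para>The soil-temperature option described in this section can be sketched in Python as follows. The 282 K and 290 K thresholds come from the text; the text does not state how the fraction between those thresholds is computed, so the linear ramp used here is purely an assumption, as are the function names and the example values.</para>
<programlisting language="python"><![CDATA[
```python
WINTER_K = 282.0  # below this soil temperature, winter normalized emissions are used
SUMMER_K = 290.0  # at or above this soil temperature, summer normalized emissions are used


def summer_fraction(soil_temp_k):
    """Fraction of the summer normalized emissions to use for one grid cell.

    ASSUMPTION: a linear ramp between the two thresholds. The manual only
    says that "a fraction" of each season is used in the transition range.
    """
    if soil_temp_k >= SUMMER_K:
        return 1.0
    if soil_temp_k < WINTER_K:
        return 0.0
    return (soil_temp_k - WINTER_K) / (SUMMER_K - WINTER_K)


def blended_normalized_emission(summer, winter, soil_temp_k):
    """Blend summer and winter normalized emissions for one grid cell."""
    frac = summer_fraction(soil_temp_k)
    return frac * summer + (1.0 - frac) * winter


# Hypothetical normalized emissions for a single grid cell.
blend_cold = blended_normalized_emission(10.0, 2.0, 275.0)
blend_mid = blended_normalized_emission(10.0, 2.0, 286.0)
blend_warm = blended_normalized_emission(10.0, 2.0, 295.0)
print(blend_cold, blend_mid, blend_warm)
```
]]></programlisting>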

</section>

</section>

<section id="sect_concepts_merge_processing">

<title>Creating model-ready emissions</title>

<para>Creating emissions with SMOKE that are ready for input to an AQM must always include merging the hourly emissions created during temporal processing with the gridding matrices and the speciation matrices. In addition, for point sources for CMAQ, creating the model-ready emissions must also include merging with the layer fractions (see <xref linkend="sect_concepts_compute_layer_cmaq" />), and for UAM-based models it must include creating the ASCII elevated file (see <xref linkend="sect_concepts_create_elevated_uam" />). The <command>Smkmerge</command> program performs these processing steps using vector-matrix multiplication to combine the matrices and layer fractions with the hourly emissions vectors from the <command>Temporal</command> program.</para>
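<para>The merge can be pictured with a small Python sketch for a single hour. It assumes a hypothetical in-memory layout: a list of hourly emissions per source, a dictionary of speciation factors per source, and a sparse list of grid-cell fractions per source. <command>Smkmerge</command> itself reads these data from SMOKE intermediate files; the names, values, and units below are illustrative only, and layer fractions are left out.</para>
<programlisting language="python"><![CDATA[
```python
def merge_hour(hourly, speciation, gridding, n_cells):
    """Combine per-source hourly emissions with speciation and gridding factors.

    hourly:     emissions for one hour, one value per source
    speciation: per source, a dict of species name -> speciation factor
    gridding:   per source, a list of (grid cell index, fraction) pairs
    Returns a dict of species name -> list of gridded values, one per cell.
    """
    species = sorted({name for factors in speciation for name in factors})
    merged = {name: [0.0] * n_cells for name in species}
    for source, emission in enumerate(hourly):
        for name, factor in speciation[source].items():
            for cell, fraction in gridding[source]:
                merged[name][cell] += emission * factor * fraction
    return merged


# Hypothetical inputs: two sources, two species, two grid cells.
hourly = [10.0, 4.0]
speciation = [{"NO": 0.75, "NO2": 0.25}, {"NO": 0.5, "NO2": 0.5}]
gridding = [[(0, 0.25), (1, 0.75)], [(1, 1.0)]]

merged = merge_hour(hourly, speciation, gridding, n_cells=2)
print(merged)
```
]]></programlisting>
<para>Storing the gridding factors sparsely reflects the fact that each source contributes to only a few grid cells; a dense source-by-cell matrix would give the same result.</para>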

<para>If the overall SMOKE processing setup includes running the nonroad mobile category (or other source categories) as separate runs<comment> (see <xref linkend="ch_scripts" />)</comment>, then <command>Smkmerge</command> cannot be used to combine all source categories into a single output file. Instead, the <link linkend="sect_programs_mrggrid"><command>Mrggrid</command></link> program would combine the model-ready files from the individual source categories; for example, from separate SMOKE (including <command>Smkmerge</command>) runs for stationary area/nonpoint, nonroad mobile, windblown dust, wildfire, on-road mobile, and point sources. There is no limit to the number of model-ready files that <command>Mrggrid</command> can combine into a single model-ready file, and the input files can be 2-D or 3-D.</para>

<para><command>Smkmerge</command> can be run for any or all SMOKE source categories, but it can use only one of each SMOKE inventory type (area, biogenic, mobile, and point source) per run. You can run it to create model-ready files for only one SMOKE source category (area, biogenic, mobile, or point), or you can run it to create both the individual and combined model-ready files. The following list indicates the modes in which <command>Smkmerge</command> can be run:</para>

<itemizedlist>
<listitem>
<para>Run for SMOKE area sources to create gridded, hourly, speciated emissions in moles/hour or moles/second. Can be used for all area sources and/or nonroad mobile sources.</para>
</listitem>
<listitem>
<para>Run for SMOKE mobile sources to create gridded, hourly, speciated emissions in moles/hour or moles/second.</para>
</listitem>
<listitem>
<para>Run for SMOKE point sources to create 3-D gridded, hourly, speciated emissions in moles/second for CMAQ, <emphasis>or</emphasis> run to create 2-D gridded, hourly, speciated emissions in moles/hour for UAM or CAM<subscript>X</subscript>, along with an ASCII elevated-point-source file.</para>
</listitem>
<listitem>
<para>Run to convert the units and calculate state and county totals of biogenic emissions output by the <command>Tmpbeis4</command> program.</para>
</listitem>
<listitem>
<para>Run to perform any combination of the previously listed steps simultaneously and create a combined model-ready file that includes multiple source categories. In this mode, only one each of SMOKE area, mobile, point, and biogenic sources can be included. The same output units must be used for all source categories in a single run.</para>
</listitem>
</itemizedlist>

<para>When creating model-ready emissions for any of the anthropogenic source categories, you may choose to apply one or more control matrices to the emissions to create controlled model-ready emissions. For each source category (area, mobile, or point), you can apply one multiplicative control matrix and one reactivity control matrix per run. <command>Smkmerge</command> is the only program that can apply the reactivity control matrix to the inventory, while the multiplicative control matrix can be applied by either <command>Smkmerge</command> or the <command>Grwinven</command> program.</para>

<para>Many processing steps in SMOKE are independent of one another; for example, chemical speciation and temporal allocation can change without affecting one another. This independence means that when one step changes, another step does not need to be rerun in many cases. However, because <command>Smkmerge</command> combines the data from all of these processing steps to create the model-ready emissions, if one of the earlier steps changes, then the merging step must also be rerun. This includes rerunning <command>Smkmerge</command> to generate model-ready files, and if the <command>Mrggrid</command> program was used, also rerunning that to merge data from multiple source categories together.</para>

<para><command>Smkmerge</command> also has the ability to input hourly emissions by day of the week and reuse days that are the same. For example, it can input separate hourly emissions files for Monday, a weekday, Saturday, and Sunday, and use these four files to generate model-ready emissions for every day in an entire month. This is accomplished using the <envar>MRG_BYDAY</envar> SMOKE option, described further in <xref linkend="sect_programs_smkmerge" />. Special treatment can also be given to holidays in this case, since users generally wish to model holidays differently than other days. <comment><xref linkend="sect_scripts_mwss_approach" /> explains more about how to configure scripts for processing with a Monday, weekday, Saturday, and Sunday approach.</comment></para>
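<para>The reuse of representative days can be sketched in Python as a mapping from each calendar date to one of the four hourly emissions files. This only illustrates the idea behind the Monday, weekday, Saturday, and Sunday approach; it is not how <envar>MRG_BYDAY</envar> is implemented. The labels and function names are hypothetical, and holiday handling is omitted.</para>
<programlisting language="python"><![CDATA[
```python
from collections import Counter
from datetime import date, timedelta


def representative_day(day):
    """Return which of the four representative hourly files applies to `day`."""
    weekday = day.weekday()  # Monday is 0, Sunday is 6
    if weekday == 0:
        return "monday"
    if weekday == 5:
        return "saturday"
    if weekday == 6:
        return "sunday"
    return "weekday"  # Tuesday through Friday share one file


def schedule(first_day, n_days):
    """Map each date in the period to the representative file it reuses."""
    days = (first_day + timedelta(days=offset) for offset in range(n_days))
    return {day: representative_day(day) for day in days}


# Example period: July 2024 (31 days, starting on a Monday).
july = schedule(date(2024, 7, 1), 31)
counts = Counter(july.values())
print(dict(counts))
```
]]></programlisting>
<para>Four input files cover all 31 days in this example, which is the saving the by-day option is designed to provide.</para>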

</section>

<section id="sect_concepts_merge_processing_moves">

<title>Creating model-ready emissions using MOVES lookup tables</title>

<para>Creating emissions with SMOKE using MOVES lookup tables (i.e., RatePerDistance [RPD], RatePerHour [RPH], RatePerVehicle [RPV], and RatePerProfile [RPP]) must always include merging the gridding matrices from <command>Grdmat</command> and the chemical speciation matrices from <command>Spcmat</command>. In addition, for RPD emissions based on VMT data by source for CMAQ, creating the model-ready emissions must also include merging with the hourly VMT from <command>Temporal</command>. The <command>Movesmrg</command> program performs these processing steps using vector-matrix multiplication to combine the matrices and the hourly emissions vectors to create CMAQ-ready gridded and speciated hourly emissions input data.</para>

<para>If the overall SMOKE processing setup includes running the nonroad mobile category (or other source categories) as separate runs<comment> (see <xref linkend="ch_scripts" />)</comment>, then <command>Movesmrg</command> cannot be used to combine all source categories into a single output file. Instead, the <link linkend="sect_programs_mrggrid"><command>Mrggrid</command></link> program would combine the model-ready files from the individual source categories; for example, from separate SMOKE (including <command>Movesmrg</command>) runs for RPD, RPV and RPP mobile sources. There is no limit to the number of model-ready files that <command>Mrggrid</command> can combine into a single model-ready file, and the input files can be 2-D or 3-D.</para>

<para><command>Movesmrg</command> can be used only for MOVES-based mobile SMOKE source categories. It can use only one of each MOVES lookup table (RPD, RPH, RPV, and RPP) per run. You can run it to create model-ready files for only one SMOKE source category (RPD, RPH, RPV, or RPP) at a time. The following list indicates the modes in which <command>Movesmrg</command> can be run:</para>

<itemizedlist>
<listitem>
<para>Run for MOVES RPD mobile sources based on VMT data by vehicle and road types to create gridded, hourly, speciated emissions in moles/hour or moles/second.</para>
</listitem>
<listitem>
<para>Run for MOVES RPH mobile sources based on vehicle hotelling (HOTELLING) hours activity data by vehicle and road types to create gridded, hourly, speciated emissions in moles/hour or moles/second.</para>
</listitem>
<listitem>
<para>Run for MOVES RPV mobile sources based on vehicle population (VPOP) data by vehicle type to create gridded, hourly, speciated emissions in moles/hour or moles/second.</para>
</listitem>
<listitem>
<para>Run for MOVES RPP mobile sources based on vehicle population (VPOP) data by vehicle type to create gridded, hourly, speciated emissions in moles/hour or moles/second.</para>
</listitem>
<listitem>
<para>Run to create daily total emissions reports by county, state, and SCC in units of tons/day or tons/hour.</para>
</listitem>
</itemizedlist>

<para>Many processing steps in SMOKE are independent of one another; for example, chemical speciation and temporal allocation can change without affecting one another. This independence means that when one step changes, another step does not need to be rerun in many cases. However, because <command>Movesmrg</command> combines the data from all of these processing steps to create the model-ready emissions, if one of the earlier steps changes, then the merging step must also be rerun. This includes rerunning <command>Movesmrg</command> to generate model-ready files, and if the <command>Mrggrid</command> program was used, also rerunning that to merge data from multiple source categories together.</para>

<para><command>Movesmrg</command> also has the ability to input hourly emissions by day of the week and reuse days that are the same. For example, it can input separate hourly emissions files for Monday, a weekday, Saturday, and Sunday, and use these four files to generate model-ready emissions for every day in an entire month. This is accomplished using the <envar>MRG_BYDAY</envar> SMOKE option, described further in <xref linkend="sect_programs_movesmrg" />. Special treatment can also be given to holidays in this case, since users generally wish to model holidays differently than other days. <comment><xref linkend="sect_scripts_mwss_approach" /> explains more about how to configure scripts for processing with a Monday, weekday, Saturday, and Sunday approach.</comment></para>

</section>

<section id="sect_concepts_qa_processing">

<title>Quality assurance</title>

<para>Quality assuring SMOKE emissions includes a combination of (1) steps performed by SMOKE programs and (2) postprocessing steps performed by the user.</para>

<para>The SMOKE components that play a role in quality assurance consist of the following:</para>

<orderedlist>
<listitem>
<para>The various SMOKE programs perform file format checks of all input files to ensure that the files can be read. The programs write errors and warnings if files cannot be read properly.</para>
</listitem>
<listitem>
<para>SMOKE gives error and warning messages about inventory data that are not complete or are invalid, and about problems or possible problems combining the inventory data with the support files.</para>
</listitem>
<listitem>
<para>Core SMOKE programs create reports, such as the area-to-point report provided by <command>Smkinven</command> and the control reports provided by <command>Cntlmat</command>.</para>
</listitem>
<listitem>
<para>The <command>Smkreport</command> program reports emissions totals at various levels of data aggregation. This reporting capability currently allows you to generate reports of emissions by source, SCC, region (e.g., state, county, or user-defined region), road class, layer, hour, grid cell, speciation profile, gridding surrogate code, temporal profile, and elevated status. The most powerful reporting feature is that you can combine these reporting resolutions in <emphasis>any</emphasis> combination. In addition, reports can be created at each stage of processing (import, gridding, speciation, temporal allocation, layer assignment) or any combination of stages.</para>
</listitem>
</orderedlist>

<para><command>Smkreport</command> can combine information from any SMOKE intermediate files (e.g., intermediate inventory file, speciation matrix, gridding matrix) to create emissions reports. One input file to <command>Smkreport</command>, called the <envar>REPCONFIG</envar> file, instructs <command>Smkreport</command> on how many and which reports to create. The <envar>REPCONFIG</envar> file contains a series of instructions that can be set by the user to control the contents of the reports. More details about <command>Smkreport</command> and the <envar>REPCONFIG</envar> are provided in <xref linkend="sect_qa_smkreport" /> and <xref linkend="sect_qa_repconfig" />.</para>

<para>The second major component of quality assuring the SMOKE emissions processing involves users taking steps to evaluate the information/reports provided by SMOKE. These steps include:</para>

<orderedlist>
<listitem>
<para>Check that the correct settings have been selected in the run scripts, including the settings that control which SMOKE programs are run.</para>
</listitem>
<listitem>
<para>Check the log files from all SMOKE programs for errors and warnings. Errors will keep the programs from running successfully, so the source of the error must be identified and repaired. Warnings may indicate that a problem exists that needs to be addressed, or warnings can be ignored if they are not something that will impact the results for the particular inventory of interest.</para>
</listitem>
<listitem>
<para>Compare the emissions totals provided by <command>Smkreport</command> (e.g., by state and county) to totals of the emissions inventories computed outside of SMOKE. Also, compare the emissions totals from SMOKE between each of the processing stages. For example, for area sources, compare the emissions after inventory import, gridding, chemical speciation, temporal allocation, and final merge to ensure that the emissions are consistent from step to step. This involves some subjectivity because the emissions do in fact change from step to step, and the magnitude of those changes depends on the support input files SMOKE uses with the inventory.</para>
</listitem>
<listitem>
<para>Check that the correct chemical speciation profiles, temporal profiles, and gridding surrogates were applied, using reports that provide this information from <command>Smkreport</command>.</para>
</listitem>
<listitem>
<para>Perform other specific checks of <command>Smkreport</command> outputs, such as ensuring that the correct major point sources are in the inventory, comparing population-normalized emissions among the counties, and checking stack parameters from point sources.</para>
</listitem>
<listitem>
<para>Ensure that the emissions data look reasonable by viewing them in the Visualization Environment for Rich Data Interpretation (VERDI).</para>
</listitem>
</orderedlist>
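<para>The stage-to-stage comparison of emissions totals in the list above lends itself to a simple automated check. The Python sketch below flags stages whose total differs from the first stage by more than a relative tolerance. It assumes the totals have already been extracted from <command>Smkreport</command> output into a dictionary in comparable units; the stage names, values, tolerance, and function name are hypothetical. Because emissions legitimately change between stages, a flag is a prompt for review rather than proof of an error.</para>
<programlisting language="python"><![CDATA[
```python
def flag_stage_differences(stage_totals, rel_tol=0.01):
    """Return (stage, relative difference) pairs that exceed `rel_tol`.

    `stage_totals` maps stage name -> emissions total, in processing order.
    Each stage is compared against the first stage in the dictionary.
    """
    stages = list(stage_totals.items())
    _, reference = stages[0]
    flagged = []
    for name, total in stages[1:]:
        if reference == 0.0:
            difference = 0.0 if total == 0.0 else float("inf")
        else:
            difference = abs(total - reference) / abs(reference)
        if difference > rel_tol:
            flagged.append((name, difference))
    return flagged


# Hypothetical totals for one pollutant (same units at every stage).
totals = {"import": 1000.0, "gridding": 998.0, "temporal": 1000.0, "merge": 950.0}
flagged = flag_stage_differences(totals)
for name, difference in flagged:
    print(f"{name}: differs from import by {difference:.1%}")
```
]]></programlisting>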

<para><xref linkend="ch_quality_assurance" /> provides considerably more detail about how to quality assure your inventories and emissions processing.</para>

</section>
</chapter>