- Projections of federal outlays, as required by law, reflect the assumption that current laws will generally remain unchanged. Those projections encompass the current year (the year in which the projections are made) and a projection period of 5 or 10 years into the future.
- BudgetPy incorporates machine learning and artificial intelligence algorithms to extract insights from large datasets.
- Uses vector embeddings and predictive modeling to forecast contaminant spread, and resource optimization to allocate resources effectively during emergencies.
- BudgetPy interacts with pre-trained Large Language Models (LLMs) like GPT-4o, o3, and o1-mini to enhance its analytical capabilities.
- Users leverage LLMs for rapid information retrieval from vast datasets, automated report generation, and potentially even expert consultation.
- The Outlay Projector is a forecasting model that uses historical expenditure data, generative AI, and machine learning to project future outlays by agency and fiscal year.
- Uses historical budget data from the Office of Management & Budget from FY1962 to FY2024 to predict FY2025 and beyond.
- Current data sets are available via Kaggle.
- Time Series Forecasting – Leverages ARIMA and Holt-Winters models to analyze seasonal and trend-based variations in budgetary spending.
- Batch Processing for Large Datasets – Optimized for handling extensive federal financial data without memory overload.
- Feature Engineering & Correlation Analysis – Utilizes PCA, Min-Max Scaling, Z-score Standardization, and K-Means clustering to enhance model performance.
- Automated Outlay Projections – Provides yearly budget forecasts per agency with a simple data frame output.
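The Min-Max Scaling and Z-score Standardization named in the feature list can be sketched with the standard library alone. This is a minimal illustration, not BudgetPy's actual code: the function names `min_max_scale` and `z_score` and the outlay figures are assumptions made up for the example.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center values on their mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical outlays (in billions of dollars) for one agency over five fiscal years.
outlays = [10.0, 12.0, 14.0, 16.0, 18.0]

scaled = min_max_scale(outlays)
standardized = z_score(outlays)
```

Min-Max scaling keeps the shape of the series while bounding it, which suits models sensitive to feature magnitude; Z-score standardization is the usual choice before PCA or K-Means, where features should contribute on a comparable scale.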
- Outlay Project Model
- Multiple data providers, including SQLite, MS Access, and SQL Server Express Edition, through pyodbc.
- Charting, plotting, and reporting with matplotlib, dash, and pandas.
- Pre-defined schema for 100 environmental data tables.
- Access to editors for SQLite, MS Access, and SQL CE.
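As a rough sketch of what querying one of these providers might look like, the example below uses Python's built-in `sqlite3` module rather than pyodbc, so that it runs without an ODBC driver. The `outlays` table, its columns, and the rows are hypothetical and do not reflect BudgetPy's pre-defined schema.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind; a real provider
# would open a database file or, for SQL Server and MS Access, a pyodbc connection.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE outlays (agency TEXT, fiscal_year INTEGER, amount REAL)"
)

# Hypothetical rows; the figures are placeholders, not real budget data.
rows = [
    ("Agency A", 2023, 100.0),
    ("Agency A", 2024, 110.0),
    ("Agency B", 2023, 50.0),
    ("Agency B", 2024, 55.0),
]
conn.executemany("INSERT INTO outlays VALUES (?, ?, ?)", rows)
conn.commit()

# Total outlays per agency, using a parameterized filter on fiscal year.
query = """
    SELECT agency, SUM(amount)
    FROM outlays
    WHERE fiscal_year >= ?
    GROUP BY agency
    ORDER BY agency
"""
totals = conn.execute(query, (2023,)).fetchall()
conn.close()
```

Passing the fiscal year as a bound parameter (`?`) instead of formatting it into the SQL string avoids injection problems and works the same way across DB-API drivers.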
- Vectorization is the process of converting textual data into numerical vectors; it is usually applied once the text has been cleaned.
- It can help improve the execution speed and reduce the training time of your code.
- BudgetPy provides the following vector stores on the OpenAI platform to support environmental data analysis with machine learning:
- Federal Appropriations - vectorized data set of federal appropriations available for fine-tuning learning models
- Federal Regulations - vectorized data set of federal financial regulations available for fine-tuning learning models
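To make the idea of vectorization concrete, here is a minimal bag-of-words sketch in plain Python. It is a deliberately simpler technique than the learned embeddings a hosted vector store would use, and the function names and sample sentences are assumptions for illustration only.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def vectorize(text, vocabulary):
    """Return a count vector with one slot per vocabulary token."""
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# Hypothetical snippets standing in for cleaned appropriations text.
docs = [
    "Funds are appropriated for salaries and expenses",
    "Funds remain available for expenses",
]
vocab = build_vocabulary(docs)
vectors = [vectorize(doc, vocab) for doc in docs]
```

Each document becomes a fixed-length list of counts, so documents that share words end up with overlapping non-zero positions. Embedding models replace these sparse counts with dense vectors that also capture meaning, but the input and output shapes follow the same pattern.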
- Loads federal budget data from "Budget Outlays.xlsx".
- Filters fiscal year data (2012–2024) and groups outlays by agency.
- Handles missing values and data inconsistencies.
- Uses Random Forest Regression as the primary predictive model.
- Splits data into training (80%) and testing (20%) sets.
- Trains on FY2012-FY2023 to predict FY2024 and validates performance.
- Predicts FY2025 outlays for each federal agency.
- Outputs results in a structured data frame for easy interpretation.
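The steps above can be sketched roughly as follows. This is not the project's actual implementation: it substitutes synthetic data for "Budget Outlays.xlsx", assumes column names `agency`, `fiscal_year`, and `outlay`, and assumes a single feature (the prior year's outlay). It also uses the time-based holdout (train through FY2023, validate on FY2024) rather than a random 80/20 split.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

# Synthetic stand-in for the workbook; a real run would start from
# something like pd.read_excel("Budget Outlays.xlsx").
rng = np.random.default_rng(0)
records = []
for agency, base in [("Agency A", 100.0), ("Agency B", 50.0), ("Agency C", 20.0)]:
    for year in range(2010, 2025):
        growth = 1.03 ** (year - 2010)
        records.append({"agency": agency, "fiscal_year": year,
                        "outlay": base * growth + rng.normal(0, 1)})
df = pd.DataFrame(records)

# Keep FY2012-FY2024, fill gaps within each agency, and total outlays by agency and year.
df = df[df["fiscal_year"].between(2012, 2024)].copy()
df["outlay"] = df.groupby("agency")["outlay"].transform(
    lambda s: s.interpolate().bfill()
)
df = df.groupby(["agency", "fiscal_year"], as_index=False)["outlay"].sum()

# Assumed feature: the prior year's outlay for the same agency.
df = df.sort_values(["agency", "fiscal_year"])
df["prev_outlay"] = df.groupby("agency")["outlay"].shift(1)
model_df = df.dropna(subset=["prev_outlay"])

train = model_df[model_df["fiscal_year"] <= 2023]
test = model_df[model_df["fiscal_year"] == 2024]

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(train[["prev_outlay"]], train["outlay"])

# Validate on the held-out FY2024 rows.
mae = mean_absolute_error(test["outlay"], model.predict(test[["prev_outlay"]]))

# Project FY2025 from each agency's FY2024 outlay.
latest = df[df["fiscal_year"] == 2024]
features_2025 = latest[["outlay"]].rename(columns={"outlay": "prev_outlay"})
forecast = pd.DataFrame({
    "agency": latest["agency"].to_numpy(),
    "fiscal_year": 2025,
    "predicted_outlay": model.predict(features_2025),
})
```

One caveat worth keeping in mind: a random forest predicts by averaging training targets, so it cannot project values above the largest outlay it saw during training. For steadily growing series, modeling year-over-year change instead of the raw level is a common workaround.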
git clone https://github.com/your-repo/federal-budget-forecast.git
cd federal-budget-forecast
- Minion - other tools used and available in BudgetPy.
- Booger - controls for the user interface and related functionality.
- Data - data access layer with environmental budget data models.
- FileSys - classes for interacting with the file system and input/output.
- Static - enumerations used in budgetary data analysis.
- Schema - schema definitions of the BudgetPy data tables.
- Ninja - budget data model classes for environmental programs.
- SQLite is a C-language library that implements a small, fast, self-contained, high-reliability, full-featured, SQL database engine. Learn more here
- SQL Server Express Edition is a scaled down, free edition of SQL Server, which includes the core database engine. Learn more here
- MS Access is a database management system (DBMS) from Microsoft that combines the relational Access Database Engine (ACE) with a graphical user interface and software-development tools. Learn more here
- See the User Guide for steps to get started.
- You will need these Requirements
BudgetPy uses free code signing provided by SignPath.io and a free code signing certificate from SignPath Foundation.
This program will not transfer any information to other networked systems unless specifically requested by the user or the person installing or operating it.
BudgetPy has integrated the following services for additional functions, which can be enabled or disabled at the first start (in the welcome dialog) or at any time in the settings:
- api.github.com (Check for program updates)
- ipify.org (Retrieve the public IP address used by the client)
- ip-api.com (Retrieve network information used by the client, such as geolocation, ISP, and DNS resolver)
BudgetPy uses the following projects and libraries. Please consider supporting them as well (e.g., by starring their repositories):
Project | Description |
---|---|
SciPy | Fundamental algorithms for scientific computing in Python |
TensorFlow | An end-to-end platform for machine learning |
Pandas | Open source, easy-to-use data structures and data analysis tools for the Python programming language. |
NumPy | The fundamental package for scientific computing with Python |
Keras | Deep learning API written in Python and capable of running on top of either JAX, TensorFlow, or PyTorch. |
PyTorch | Tensors and Dynamic neural networks in Python with strong GPU acceleration |
pyodbc | pyodbc is an open source Python module that makes accessing ODBC databases simple. |
PySimpleGUI | Python GUIs for Humans! PySimpleGUI is the top-rated Python application development environment. |
Scikit-Learn | Machine learning in Python |
OpenAI | The official Python library for the OpenAI API |
BudgetPy is published under the MIT General Public License v3.
The licenses of the libraries used can be found here.