This Python script performs Extract, Transform, Load (ETL) operations on medication stock data from an Excel file to a PostgreSQL database.
- Python 3.x
- pandas
- psycopg2
- SQLAlchemy
- Install the required Python packages:
pip install pandas psycopg2 sqlalchemy
- Set up a PostgreSQL database named
db_sediaan_obat
.
Before running the script, update the following variables in the code:
file_path
: Path to your Excel fileyear
: The year of the data (format: 'YYYY-')sheet_name
: The name of the Excel sheet (usually the month name in Indonesian)- Database connection details in the
create_engine
function
Run the script using Python:
python etl_process.py
The script performs the following operations:
- Extract: Reads data from a specified Excel file and sheet.
- Transform:
- Trims the DataFrame to relevant columns
- Cleans and renames columns
- Adds a 'Year-Month' column
- Calculates total usage
- Handles missing values
- Load: Inserts the transformed data into a PostgreSQL database table named 'stok_obat'.
The script will print a message indicating successful data processing and insertion, followed by the last few rows of the inserted data.
Ensure that the Excel file structure matches the expected format, with 'SISA AKHIR' as the keyword for the last relevant column.
If you encounter any issues related to data format or database connection, check the corresponding sections in the code and adjust as needed.