- National Gallery of Art: CSV files are automatically downloaded, extracted from open data ZIP files, and converted into a SQLite database file.
- Metropolitan Museum of Art: This module, currently used as a script, processes open data from the Metropolitan Museum of Art (The Met). It downloads and filters the museum's public domain painting data, then enriches it by scraping image URLs from The Met's website. Key features include:
  - Reads and filters The Met's open access CSV file
  - Uses Selenium WebDriver to fetch high-resolution image URLs for each painting
  - Creates a new CSV file with filtered and enriched painting data
  - Implements incremental updates, avoiding duplicate entries
  - Handles errors and continues processing if issues arise with specific entries
The resulting dataset includes detailed information about public domain paintings from The Met, complete with direct links to high-quality images, ready for integration into the Open Art Viewer.
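The incremental update and per-entry error handling listed above could be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the column names (`Object ID`, `Is Public Domain`, `Classification`, `Image URL`) are assumptions about the CSV layout, and `fetch_image_url` is a hypothetical callable standing in for the Selenium lookup.

```python
import csv
from pathlib import Path
from typing import Callable, Optional

# Assumed column names; check them against the actual open access CSV.
ID_COL = "Object ID"
PUBLIC_COL = "Is Public Domain"
CLASS_COL = "Classification"
IMAGE_COL = "Image URL"  # hypothetical column added by the enrichment step


def update_paintings(source_csv: Path, output_csv: Path,
                     fetch_image_url: Callable[[str], Optional[str]]) -> int:
    """Append new public domain paintings to output_csv; return rows added."""
    # Collect IDs already written so that reruns do not create duplicates.
    seen = set()
    if output_csv.exists():
        with output_csv.open(newline="", encoding="utf-8-sig") as f:
            seen = {row[ID_COL] for row in csv.DictReader(f)}

    # Only write the header when the output file is new or empty.
    write_header = not output_csv.exists() or output_csv.stat().st_size == 0
    added = 0
    with source_csv.open(newline="", encoding="utf-8-sig") as src, \
            output_csv.open("a", newline="", encoding="utf-8") as out:
        reader = csv.DictReader(src)
        fields = list(reader.fieldnames or []) + [IMAGE_COL]
        writer = csv.DictWriter(out, fieldnames=fields)
        if write_header:
            writer.writeheader()
        for row in reader:
            is_public = row[PUBLIC_COL].strip().lower() == "true"
            is_painting = row[CLASS_COL].strip().lower() == "paintings"
            if not (is_public and is_painting) or row[ID_COL] in seen:
                continue
            try:
                url = fetch_image_url(row[ID_COL])
            except Exception as exc:  # one bad entry should not stop the run
                print(f"Skipping {row[ID_COL]}: {exc}")
                continue
            if not url:
                continue
            row[IMAGE_COL] = url
            writer.writerow(row)
            seen.add(row[ID_COL])
            added += 1
    return added
```

Passing the image lookup in as a function keeps the filtering logic independent of the browser automation, so it can be exercised without starting a WebDriver.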
OpenArt is a Python-based project designed to download, process, and manage art data from the National Gallery of Art (NGA). As mentioned above, the Metropolitan Museum of Art module currently runs as a standalone script. OpenArt automates the retrieval of open data, processes it, and prepares it for further use or analysis.
- Data Retrieval
  - The `NGA` class handles downloading data from the National Gallery of Art's GitHub repository.
  - It checks for updates by comparing local file dates with the latest commit date on GitHub.
  - If an update is needed, it downloads a ZIP file containing the latest data.
- Data Extraction and Processing
  - The downloaded ZIP file is extracted to a specified directory.
  - CSV files, particularly 'objects.csv' and 'published_images.csv', are processed.
  - The `fix_nga_csv_in_folder` method prepares these files for merging.
- Data Merging and Cleaning
  - The `merge` method combines data from 'objects.csv' and 'published_images.csv'.
  - Unwanted columns are removed, and image properties are adjusted.
  - The resulting data is saved back to a CSV file, with redundant files removed.
- User Interface
  - A menu-driven interface (console) allows users to:
    - Download new data
    - Extract and process existing data
    - View file information
    - Perform database operations (SQLite)
- Constants and Configuration
  - The `Constants` class centralizes important variables and paths used throughout the project.
- File Handling
  - The project uses both `os` and `pathlib` for robust file and directory management across different operating systems.
- Error Handling and Logging
  - The code includes error handling for download issues, file processing errors, and API requests.
- Third-party Libraries
  - Utilizes libraries like `requests` for API calls, `pandas` for data manipulation, and `BeautifulSoup` for web scraping.
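
The update check described under Data Retrieval could look roughly like the sketch below. It is only an illustration under stated assumptions: the repository path in `COMMITS_URL` is assumed rather than taken from the project, the function names are hypothetical, and the standard library's `urllib` is used here in place of `requests` to keep the example dependency-free.

```python
import json
import urllib.request
from datetime import datetime, timezone
from pathlib import Path

# Assumed repository path; adjust it to wherever the open data is published.
COMMITS_URL = ("https://api.github.com/repos/"
               "NationalGalleryOfArt/opendata/commits?per_page=1")


def parse_github_date(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as '2024-08-01T12:00:00Z'."""
    # Older Python versions do not accept the trailing 'Z', so rewrite it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_commit_date(url: str = COMMITS_URL) -> datetime:
    """Return the committer date of the newest commit as an aware datetime."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(request, timeout=30) as response:
        commits = json.load(response)
    return parse_github_date(commits[0]["commit"]["committer"]["date"])


def needs_update(local_file: Path, remote_date: datetime) -> bool:
    """True if the local file is missing or older than the remote commit."""
    if not local_file.exists():
        return True
    modified = datetime.fromtimestamp(local_file.stat().st_mtime,
                                      tz=timezone.utc)
    return modified < remote_date
```

A caller would typically run `needs_update(path, latest_commit_date())` and download the ZIP file only when it returns `True`. Comparing timezone-aware datetimes avoids errors when the local clock is not set to UTC.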
To the best of my knowledge, all or nearly all code in this project was written by me (Wartem), rather than generated by artificial intelligence or automated code generation tools. I utilized AI assistance to help identify and fix bugs in August 2024.
Moving forward, I plan to refactor this codebase with the aid of AI tools. However, this refactoring process will be conducted in a balanced manner, ensuring that the project's core structure and logic remain primarily my own work. The AI will be used as a tool to suggest improvements, optimize code, and help with best practices, but all final decisions and implementations will be made by me. This approach aims to enhance the project while maintaining my role as the primary author and architect of the codebase. This declaration reflects my commitment to transparency about the use of AI in the development process, while also affirming my central role in the project's creation and evolution.
- Objective: Enhance the project by integrating open data from additional museums beyond the National Gallery of Art (NGA), such as the Metropolitan Museum of Art.
The SQLite file created by OpenArt can be directly used with the Open Art Viewer.
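
For orientation, loading a processed CSV file into a SQLite table can be sketched as below. This is not OpenArt's own conversion code: the function name `csv_to_sqlite` is hypothetical, every column is stored as `TEXT` for simplicity, and the table name is whatever the caller chooses.

```python
import csv
import sqlite3
from pathlib import Path


def quote(name: str) -> str:
    """Quote an SQL identifier so spaces and keywords are accepted."""
    return '"' + name.replace('"', '""') + '"'


def csv_to_sqlite(csv_path: Path, db_path: Path, table: str) -> int:
    """Load a CSV file into a SQLite table of TEXT columns; return row count."""
    with csv_path.open(newline="", encoding="utf-8-sig") as f:
        reader = csv.reader(f)
        header = next(reader)
        # Drop rows whose field count does not match the header.
        rows = [row for row in reader if len(row) == len(header)]

    columns = ", ".join(f"{quote(col)} TEXT" for col in header)
    placeholders = ", ".join("?" for _ in header)

    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(f"DROP TABLE IF EXISTS {quote(table)}")
            conn.execute(f"CREATE TABLE {quote(table)} ({columns})")
            conn.executemany(
                f"INSERT INTO {quote(table)} VALUES ({placeholders})", rows)
    finally:
        conn.close()
    return len(rows)
```

The table is dropped and recreated on each run, so repeated conversions replace the previous contents instead of appending to them.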
Art Viewer
You can download it here: Open Art Viewer 1.0 (SQLite file is already included).
This art viewer is based on Flask and HTML: Open Art Web Viewer Project