A tool for scraping and generating Q&A pairs for RNN training using GPT-4.
- Python 3.9 or higher
- Poetry (package manager)
- OpenAI API Key
1. Install Poetry (if not already installed): `curl -sSL https://install.python-poetry.org | python3 -`
2. Install dependencies: `poetry install`
3. Set up environment variables:
   - Copy `.env.example` to `.env`
   - Add your OpenAI API key to the `.env` file
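For illustration, here is a minimal sketch of how a script could pick up the key from the `.env` file using only the standard library. The project itself may well rely on a dedicated loader instead; the helper names `load_env_file` and `get_api_key` are hypothetical, and the variable name `OPENAI_API_KEY` is an assumption (check `.env.example` for the name actually expected).

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(path=".env", name="OPENAI_API_KEY"):
    """Prefer the real environment, then fall back to the .env file.

    The default variable name is an assumption, not taken from the project.
    """
    key = os.environ.get(name) or load_env_file(path).get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; add it to {path}")
    return key
```

Failing early with a clear error when the key is missing tends to be easier to debug than an authentication failure partway through a scraping run.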
Start the scraper in the Poetry environment:
`poetry run python scrape.*.py`
Generated Q&A pairs will be saved to `data/*.jsonl`.
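To inspect the output, a short script along these lines could read the generated files back in. This is a sketch under assumptions: it treats each non-empty line as one JSON object (the usual JSONL convention), the helper name `load_pairs` is hypothetical, and the `question` and `answer` field names are guesses that may not match the actual schema.

```python
import glob
import json


def load_pairs(pattern="data/*.jsonl"):
    """Read every JSON record from the files matching the pattern."""
    pairs = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:  # tolerate blank lines between records
                    pairs.append(json.loads(line))
    return pairs


if __name__ == "__main__":
    pairs = load_pairs()
    print(f"Loaded {len(pairs)} records")
    for record in pairs[:3]:
        # "question" and "answer" are assumed field names.
        print(record.get("question"), "->", record.get("answer"))
```

Printing the first few records is a quick sanity check before feeding the data into training.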