Article Scraper

Scrapes the title and body of the article at each given URL.

Tech Stack

Python, BeautifulSoup

Running locally

  1. Fork the repo by clicking the Fork button in the top-right corner.

  2. Clone the repo to your local machine using the following command:

git clone https://github.com/<your-github-username>/article-scraper.git
  3. Install the required packages: beautifulsoup4==4.10.0 (imported as bs4), requests==2.22.0, and validators==0.22.0. Run the following:
pip install beautifulsoup4==4.10.0 requests==2.22.0 validators==0.22.0
  4. Run scrape.py with one or more URLs as command-line arguments. Any number of URLs can be passed.
python scrape.py https://www.link-to-the-article.comes/here https://www.maybe-another-article.com/
  5. Look for the saved articles in the same directory. Each file is named after the index number of its URL.
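
For readers curious about what a run roughly involves, here is a minimal sketch of the fetch, extract, and save flow. It is not the actual contents of scrape.py. The function names, the choice of the `<title>` and `<p>` tags as the source of the title and body, the 1-based indexing, and the `.txt` extension are all assumptions made for illustration.

```python
from pathlib import Path

import requests
from bs4 import BeautifulSoup


def extract_article(html):
    """Return (title, body) pulled from an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    # Assumption: the title lives in <title> and the body is the text of all <p> tags.
    title = soup.title.get_text(strip=True) if soup.title else ""
    body = "\n\n".join(p.get_text(" ", strip=True) for p in soup.find_all("p"))
    return title, body


def save_article(index, title, body, out_dir="."):
    """Write one article to <out_dir>/<index>.txt (hypothetical naming scheme)."""
    path = Path(out_dir) / f"{index}.txt"
    path.write_text(f"{title}\n\n{body}\n", encoding="utf-8")
    return path


def scrape_all(urls, out_dir="."):
    """Fetch each URL and save its article under its 1-based position."""
    saved = []
    for index, url in enumerate(urls, start=1):
        response = requests.get(url, timeout=10)
        response.raise_for_status()  # stop on HTTP errors instead of saving an error page
        title, body = extract_article(response.text)
        saved.append(save_article(index, title, body, out_dir))
    return saved
```

Under these assumptions, calling `scrape_all(sys.argv[1:])` with two URLs would leave `1.txt` and `2.txt` in the current directory. Real article pages often need site-specific selectors, so collecting every `<p>` tag is only a rough starting point.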