
⛏️ A fully working (ATM) listings-to-json-file scraper for the website njuskalo.hr


xxzoltanxx/NjuskaloScraper


Njuskalo scraper

An open-source Python program to scrape Njuskalo using Playwright and BeautifulSoup.

Use the software provided at your own risk. I cannot be held responsible for any potential consequences, including any potential damages.

Overview

This program scrapes listing data from Njuskalo.hr. It uses Playwright to navigate the site and BeautifulSoup to parse the HTML and extract the listing fields, then saves the results as JSON in a directory of your choosing.
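The navigate-then-parse flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: the function names, the CSS selectors (`li.listing`, `.title`, `.description`, `.date`, `.price`), and the sample HTML are all made up for the example, and the real Njuskalo markup will differ.

```python
from bs4 import BeautifulSoup


def fetch_html(url: str) -> str:
    """Load a page in headless Chromium and return the rendered HTML."""
    # Imported lazily so the parsing helpers below work without Playwright.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        html = page.content()
        browser.close()
    return html


def _text(node, selector: str) -> str:
    """Return the stripped text of the first match, or '' if nothing matches."""
    found = node.select_one(selector)
    return found.get_text(strip=True) if found else ""


def parse_listings(html: str) -> list[dict]:
    """Extract one dict per listing. All selectors here are hypothetical."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        {
            "name": _text(item, ".title"),
            "location": _text(item, ".description"),
            "time": _text(item, ".date"),
            "price": _text(item, ".price"),
        }
        for item in soup.select("li.listing")
    ]


# Made-up markup used only to exercise the parser offline.
SAMPLE_HTML = """
<ul>
  <li class="listing">
    <h3 class="title">Example advert</h3>
    <div class="description">Zagreb</div>
    <time class="date">01.01.2024.</time>
    <strong class="price">1000 EUR</strong>
  </li>
</ul>
"""

if __name__ == "__main__":
    print(parse_listings(SAMPLE_HTML))
```

Keeping the fetch and parse steps separate means the parser can be checked against saved HTML without opening a browser or hitting the site.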

You can scrape any single category, or an entire top-level section of Njuskalo (Nekretnine, Auto-Moto, etc.).

Installation and usage

1) Clone the repository

2) Navigate to the repository in your terminal

3) Install the dependencies:

pip install -r requirements.txt

4) Run the program:

python main.py

Data format

  {
    "name": "ADVERT NAME",
    "location": "LOCATION DATA, KILOMETERS, YEAR OF CAR",
    "time": "DATE POSTED",
    "price": "PRICE"
  },
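As a rough sketch of how records in this shape could be written to, and read back from, a directory of your choosing: the helper names `save_listings` and `load_listings` and the default file name `listings.json` are assumptions for illustration, not names taken from the project.

```python
import json
from pathlib import Path


def save_listings(listings: list[dict], out_dir: str,
                  filename: str = "listings.json") -> Path:
    """Write the listings as a JSON array inside out_dir and return the file path."""
    directory = Path(out_dir)
    directory.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    target = directory / filename
    with target.open("w", encoding="utf-8") as fh:
        # ensure_ascii=False keeps Croatian characters readable in the file.
        json.dump(listings, fh, ensure_ascii=False, indent=2)
    return target


def load_listings(path) -> list[dict]:
    """Read a previously saved JSON file back into a list of dicts."""
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    # Placeholder values mirroring the documented fields.
    example = [
        {
            "name": "ADVERT NAME",
            "location": "LOCATION DATA, KILOMETERS, YEAR OF CAR",
            "time": "DATE POSTED",
            "price": "PRICE",
        }
    ]
    print(save_listings(example, "output"))
```

All four fields are stored as strings, so values such as the price need their own parsing before any numeric comparison.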

Language:

Requirements:

  • Python 3.x
  • Playwright
  • Streamlit
  • BeautifulSoup

Modules:

  • Playwright for web crawling
  • BeautifulSoup for HTML parsing
  • JSON for data formatting
