- Assume you want to apply for jobs at specific companies that you care about, and you only want notifications for specific categories of jobs, for example, data engineer, data analyst, or economist.
- The idea of this repo is to let you do that quickly and to give you the flexibility to extend it (for example, add the ability to pull jobs from certain websites).
- Given the size of the database you need, you can easily store everything on your local box. In fact, one of the major motivations for this repo is doing everything locally.
- Set up once. Add the companies, job titles, your email, and a cron job. Then you are done. (Well, that's the plan.)
- Add the companies you want. If the companies are not listed, extend it yourself (and maybe contribute to this project).
- For developers see this developer's readme.
- Get Docker
- Ensure you can run docker as a non-root user. Check the post-install steps for Linux.
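If `docker` only works with `sudo`, the usual Linux post-install steps look like this (taken from Docker's standard post-install guidance; adjust to your distro):

```bash
# Add your user to the docker group so docker runs without sudo.
sudo groupadd docker            # the group may already exist
sudo usermod -aG docker $USER
# Log out and back in (or run `newgrp docker`) for the change to take effect.
```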
- Clone the repo: `git clone https://github.com/thap2331/jobs_tracker.git`
- Go to the jobs tracker directory, i.e., use the command line and `cd` into it.
- Copy `.env.example` and create a `.env` file.
  - Run command: `cp .env.example .env`
- Fill out the rest of the `.env` file as needed.
  - Fill in your `email` and `onePasswordEmail` if you want to send email from yourself.
    - See the email setup below for more.
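For reference, a filled-out `.env` might look roughly like this (the values are placeholders; take the exact key names from `.env.example`):

```
# Placeholder values; copy the real key list from .env.example
email=you@gmail.com
onePasswordEmail=your-gmail-app-password
absolute_path=/home/you/jobs_tracker
```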
- To find the current working directory and save it in the `.env` file, do the following:
  - Find the full path for this cloned repository. Use `pwd`. Copy and paste it in the `.env` file as `absolute_path=paste_path_here`.
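For example, from inside the cloned repository you can append the path in one step:

```bash
# Appends the repo's absolute path to .env (run this from the repo root).
# If an absolute_path= line already exists in .env, edit that line instead.
echo "absolute_path=$(pwd)" >> .env
```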
- Now, run `bash setup/one_time_setup.sh` to start the containers and create the tables in your database. This will also add a few sample rows.
  - Now go to `localhost:5000`. You should see a page with some data.
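If the page does not load, these standard Docker Compose commands (not specific to this repo) help confirm that the containers actually came up:

```bash
docker compose ps        # list the services and their status
docker compose logs -f   # follow the logs if something failed to start
```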
- To see more data, ensure that you have psql (link for Linux) and query the database as you like.
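As a sketch, connecting with psql could look like the following; the host, user, and database name here are assumptions, so take the real values from your `.env` / Docker Compose configuration:

```bash
# All connection details below are placeholders -- check .env / docker-compose for the real ones.
psql -h localhost -U postgres -d jobs_tracker
# Then, inside psql, for example:
#   \dt                              -- list tables
#   SELECT * FROM jobs LIMIT 10;     -- table name is an assumption
```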
- Run crawl
  - For the entry container and frontend, run `docker compose up prod_entrypoint frontend -d`.
  - Wait for it to be done. Then run `docker exec -it prod_box bash -c "python scraping/crawl.py -f all"`. See argparse for more options.
- Use the frontend to generate command line code for cron jobs
  - Go to `localhost:5000`. If it fails, run `docker compose up prod_database frontend -d`.
  - Go to the tab `Cron Jobs Generator`. Select `Add New Cronjob Entry`.
  - Rows to fill
    - Absolute Path
      - Check your `.env` file. Here you can add `absolute_path=` if it is not there yet.
      - Alternatively, you can manually find the repo path and add it there.
    - Job type
      - Default is crawl. You can also add the email option.
    - Cron Job
      - Go to https://crontab.guru/ and copy-paste a schedule as you desire (see the example after this list).
    - Box Type
      - Default is linux. Others are yet to be tested. Godspeed.
  - Hit submit. If you have the absolute path in your `.env` file, you can just hit submit, provided you are OK with the default options.
  - Copy the `Full cronjob` command and paste it into your bash command line. Use `crontab -l` to see all cron jobs.
  - Remove a cron job: copy the `Remove cronjob` command and paste it into your bash command line. Use `crontab -l` to see all cron jobs.
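As an illustration only, a generated cron entry pairs a schedule expression with a command roughly like the one below; the real line to install is the `Full cronjob` output from the frontend, and the path here is a placeholder:

```bash
# Illustrative sketch: run the crawl every day at 09:00.
# The exact command comes from the frontend's "Full cronjob" output.
0 9 * * * cd /path/to/jobs_tracker && docker exec prod_box bash -c "python scraping/crawl.py -f all"
```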
- Linux: Easy support on Linux boxes for now.
- MacOS (M2 chip): Tested with an M2 chip; you may need to export a platform variable (see the sketch after this list).
- Other MacOS: To be tested.
- Windows: Get (Git) Bash on Windows. To be tested.
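On Apple silicon, the "export platform" note above most likely refers to Docker's default-platform variable; a minimal sketch, assuming the images are built for linux/amd64:

```bash
# Assumption: force Docker to use/emulate linux/amd64 images on an M1/M2 Mac.
export DOCKER_DEFAULT_PLATFORM=linux/amd64
```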
The resources below will help you set up a one-time password for your Google account so that you can send an email to yourself (from yourself).
- Stackoverflow to set up 1Password for Gmail (see first answer)
- Stackoverflow with images to set up 1Password for Gmail
For developers, see this developer's readme.