Usecase? #1

samyogdhital · 2025-02-14T04:48:09Z

Hello there.
I am running a self-hosted version of Firecrawl. I saw your comment in /dzhng/deep-research/issues/77.

I wanted to ask: have you set up crawlrouter yourself for a local copy of the deep-research repo?

If so, how did you configure it?
I mean, is it hot-swappable, or do I have to change some configuration in the deep-research repo?

Thanks. I think this is an awesome tool.

loorisr · 2025-02-14T10:00:56Z

Hello,

https://github.com/dzhng/deep-research needs a Firecrawl instance with the /search endpoint. It only uses the /search endpoint, to get a SERP and scrape the resulting pages.

This repo lets you use, for example, SearxNG as a self-hosted search engine and then use your self-hosted version of Firecrawl, Crawl4AI or Jina (which can also be self-hosted) to scrape the pages.

In the deep-research repo you have to set FIRECRAWL_BASE_URL="http://localhost:8000" (or the IP where you run crawlrouter).
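
To check that the URL points at something that answers /search, a minimal sketch is shown here. It only builds the request, and it assumes the path is /v1/search with a JSON body of `query` and `limit`, as in Firecrawl's hosted v1 API. Whether crawlrouter uses exactly that path is an assumption, and `build_search_request` is a hypothetical helper name.

```python
import json
import os
import urllib.request


def build_search_request(query, limit=5, base_url=None):
    """Build (but do not send) a POST to a Firecrawl-style search endpoint."""
    # FIRECRAWL_BASE_URL is the variable named in this thread; the fallback
    # is the localhost example given in the comment above.
    base = base_url or os.environ.get("FIRECRAWL_BASE_URL", "http://localhost:8000")
    base = base.rstrip("/")
    # Assumption: the search path is /v1/search, as in Firecrawl's v1 API.
    body = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        f"{base}/v1/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen(req, timeout=30)` sends it. A JSON response suggests the base URL is set correctly.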

samyogdhital · 2025-02-14T18:10:48Z

> This repo allows you to use for example SearxNG as a self hosted search engine and then use your self-hosted version of Firecrawl or Crawl4AI or Jina (that can be self hosted) to scrape the pages.

Brother, I desperately need this. I was actively looking for a solution and was even ready to implement it myself. Thanks for this.
It's the exact use case I am working on too.

I am closely following this repo.

Is there any roadmap for this project?
What are you planning to do?

samyogdhital · 2025-02-14T18:58:17Z

@loorisr For deep-research, will just swapping the URL work, or do I have to make some changes?
I have not looked deeply into this repo yet. I am out today and will look into it tomorrow. If you have already done the integration, it would be really awesome to know, brother.

Again, I am following this project. If there is a roadmap, please let me know. I may as well help with the code.

loorisr · 2025-02-14T20:11:36Z

I'm glad it can help :)

I'm using it with deep-research and it works fine. You just need to set FIRECRAWL_BASE_URL in the env file of deep-research to where you host crawlrouter.

Then on crawlrouter you need to set SEARCH_BACKEND to the one you want (for example searxng) and SCRAPE_BACKEND to, for example, crawl4ai or firecrawl.
Of course you need a working instance of SearxNG (with JSON mode activated), and the same for crawl4ai/firecrawl. This docker compose should work; it is very close to the one I'm using.
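
One way to confirm that JSON mode is active on SearxNG is to request a search with `format=json` and see whether JSON comes back. The sketch below assumes the instance is reachable at a base URL you pass in. The helper names are hypothetical, and the port in the usage note is only an example.

```python
import json
import urllib.parse
import urllib.request


def searxng_json_url(base_url, query):
    """Return the SearxNG search URL that asks for JSON output."""
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    return f"{base_url.rstrip('/')}/search?{params}"


def json_mode_enabled(base_url, query="test"):
    """True if the instance answers the JSON search request with parseable JSON."""
    try:
        with urllib.request.urlopen(searxng_json_url(base_url, query), timeout=10) as resp:
            json.load(resp)
            return True
    except (OSError, ValueError):
        # OSError covers connection failures and HTTP errors (for example when
        # JSON is not an allowed format); ValueError covers non-JSON bodies.
        return False
```

For example, `json_mode_enabled("http://localhost:8080")` should return True once the JSON format is enabled in the SearxNG settings.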

For the roadmap, I'm currently implementing the /crawl endpoint. It will also speed up the /search endpoint when scrape mode is activated (by default /search only returns the URL, title and a description of each link, but deep-research also needs the complete page scraped).
I will also complete the implementation a bit (options).
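
To illustrate what scrape mode adds on top of the default /search output, here is a rough sketch that attaches scraped content to each result and fetches the pages concurrently. This is not crawlrouter's actual implementation. `enrich_with_scrape`, the `scrape` callable and the `markdown` key are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def enrich_with_scrape(results, scrape, max_workers=4):
    """Attach scraped page content to each search result.

    results: list of dicts with at least a "url" key (url, title, description).
    scrape:  callable taking a URL and returning the page content as text;
             a stand-in for whichever scrape backend is configured.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so pages line up with results.
        pages = list(pool.map(scrape, [r["url"] for r in results]))
    # Copy each result and add the page content under an assumed "markdown" key.
    return [{**r, "markdown": page} for r, page in zip(results, pages)]
```

Running the scrapes in a thread pool means the total time is closer to that of the slowest page than to the sum of all pages, which is the kind of speed-up that matters when every result has to be scraped.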

After that, it will depend on what people need: other backends (https://scrapingant.com, ...) or other functions!

samyogdhital · 2025-02-16T19:56:58Z

This is a promising project.

samyogdhital closed this as completed Feb 16, 2025
