From 601986d53625af7ed33bacd3d1206e958770ed28 Mon Sep 17 00:00:00 2001
From: moons-14
Date: Fri, 25 Oct 2024 13:22:15 +0900
Subject: [PATCH 1/4] Add description of CF Puppeteer Loader in documentation.

---
 site/docs/pages/docs/loaders.mdx | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/site/docs/pages/docs/loaders.mdx b/site/docs/pages/docs/loaders.mdx
index 5a25aaa..02ce591 100644
--- a/site/docs/pages/docs/loaders.mdx
+++ b/site/docs/pages/docs/loaders.mdx
@@ -13,11 +13,12 @@ However, they are not recommended for production use.
 
 ## Overview of Loaders
 
-Webforai provides three different loaders:
+Webforai provides four different loaders:
 
 - **Fetch Loader**: The simplest option, using JavaScript's built-in Fetch API.
 - **Playwright Loader**: Ideal for sites requiring JavaScript execution, like SPAs.
 - **Puppeteer Loader**: Another option for handling websites with JavaScript execution.
+- **CF Puppeteer Loader**: Option to handle websites running JavaScript on cloudflare workers.
 
 ## Fetch Loader
 
@@ -101,3 +102,27 @@ import { loadHtml } from "webforai/loaders/puppeteer";
 
 const html = await loadHtml("https://example.com");
 ```
+## CF Puppeteer Loader
+The **CF Puppeteer Loader** is the best option for loading HTML from sites that rely on JavaScript execution on [cloudflare workers](https://workers.cloudflare.com/). This loader relies on [puppeteer on cloudflare workers](https://developers.cloudflare.com/browser-rendering/platform/puppeteer/).
+
+### Usage
+Before using the CF Puppeteer Loader, you need to prepare a wrangler environment and install @cloudflare/puppeteer. Refer to the [cookbook](/cookbook/cf-workers) for instructions on how to create a project.
+
+:::code-group
+
+```bash [npm]
+npm install @cloudflare/puppeteer --save-dev
+```
+
+```bash [pnpm]
+pnpm install -D @cloudflare/puppeteer
+```
+:::
+
+And then you can use the CF Puppeteer Loader as follows:
+
+```ts
+import { loadHtml } from "webforai/loaders/cf-puppeteer";
+
+const html = await loadHtml("https://example.com");
+```

From ef3225b3c2c4259c63fb251d907f7f2e226ee436 Mon Sep 17 00:00:00 2001
From: moons-14
Date: Fri, 25 Oct 2024 13:22:32 +0900
Subject: [PATCH 2/4] fix: authors name

---
 site/docs/pages/cookbook/cf-workers.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/site/docs/pages/cookbook/cf-workers.mdx b/site/docs/pages/cookbook/cf-workers.mdx
index f10e17f..1bae47a 100644
--- a/site/docs/pages/cookbook/cf-workers.mdx
+++ b/site/docs/pages/cookbook/cf-workers.mdx
@@ -1,7 +1,7 @@
 ---
 title: HTML to Markdown Conversion with Browser Rendering
 authors:
- "[Your Name]"
+ - "[inaridiy](https://github.com/inaridiy)"
 date: 2024-03-15
 ---
 

From b3caaba1c6ada3b292229ad2f076e96da102fb00 Mon Sep 17 00:00:00 2001
From: moons-14
Date: Fri, 25 Oct 2024 13:28:13 +0900
Subject: [PATCH 3/4] fix cf-puppeteer code

---
 site/docs/pages/docs/loaders.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/site/docs/pages/docs/loaders.mdx b/site/docs/pages/docs/loaders.mdx
index 02ce591..cdb3b43 100644
--- a/site/docs/pages/docs/loaders.mdx
+++ b/site/docs/pages/docs/loaders.mdx
@@ -124,5 +124,5 @@ And then you can use the CF Puppeteer Loader as follows:
 ```ts
 import { loadHtml } from "webforai/loaders/cf-puppeteer";
 
-const html = await loadHtml("https://example.com");
+const html = await loadHtml("https://example.com", browser); // browser is the puppeteer browser instance
 ```

From c3f012ca740ef33538ca5d4874277008daf5c5a1 Mon Sep 17 00:00:00 2001
From: moons-14
Date: Fri, 25 Oct 2024 13:29:50 +0900
Subject: [PATCH 4/4] changeset

---
 .changeset/perfect-pets-happen.md | 5 +++++
 1 file changed, 5 insertions(+)
 create mode 100644 .changeset/perfect-pets-happen.md

diff --git a/.changeset/perfect-pets-happen.md b/.changeset/perfect-pets-happen.md
new file mode 100644
index 0000000..5f3c3c6
--- /dev/null
+++ b/.changeset/perfect-pets-happen.md
@@ -0,0 +1,5 @@
+---
+"site": patch
+---
+
+Add description for cf puppeteer loader.
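
For context on the `browser` argument introduced in PATCH 3/4, here is a minimal sketch of how a caller might obtain and release the browser around `loadHtml(url, browser)`. It is an illustration under stated assumptions, not part of this patch series: `renderPage`, `BrowserLike`, and `LoaderDeps` are hypothetical names, and the launcher and loader are injected as parameters so the block runs without a Workers environment. In a real Worker, `launch` would typically wrap `puppeteer.launch(env.MYBROWSER)` from `@cloudflare/puppeteer` (the binding name `MYBROWSER` is an assumption), and `loadHtml` would be the function imported from `webforai/loaders/cf-puppeteer`.

```typescript
// Minimal structural type for the browser handle (hypothetical name).
// The real type would come from @cloudflare/puppeteer.
interface BrowserLike {
  close(): Promise<void>;
}

// Injected dependencies (hypothetical shape, for illustration only):
//   launch   -> assumed to wrap puppeteer.launch(env.MYBROWSER)
//   loadHtml -> the loadHtml(url, browser) call shown in PATCH 3/4
interface LoaderDeps<B extends BrowserLike> {
  launch: () => Promise<B>;
  loadHtml: (url: string, browser: B) => Promise<string>;
}

// Launch a browser, load the page HTML, and always close the browser,
// even when loading throws.
async function renderPage<B extends BrowserLike>(
  url: string,
  deps: LoaderDeps<B>,
): Promise<string> {
  const browser = await deps.launch();
  try {
    return await deps.loadHtml(url, browser);
  } finally {
    await browser.close();
  }
}
```

The `try`/`finally` is the main design point: a browser session that is never closed keeps consuming resources until it times out, so the sketch releases it on both the success and the error path.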