Edge Node Knowledge Mining

Getting Started

Follow the instructions below to set up and run the project on your local machine.

Prerequisites

Ensure you have the following installed on your system:

  • Node.js (v20 or higher)
  • Redis (for BullMQ job queues)

Installation

  1. Clone the repository:

    git clone https://github.com/OriginTrail/edge-node-knowledge-mining-js
    cd edge-node-knowledge-mining-js
  2. Install the dependencies:

    npm install
  3. Make sure Redis is running on its default port (6379).
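
To confirm step 3 before starting the service, a quick TCP probe of the port can help. The snippet below is a minimal sketch, not part of the project: `isPortOpen` is a hypothetical helper, and the host `127.0.0.1` and the one-second timeout are assumptions. An open port only shows that something is listening there, not that it is Redis.

```javascript
// Hypothetical helper: check whether anything accepts TCP connections on a port.
const net = require("node:net");

function isPortOpen(port = 6379, host = "127.0.0.1", timeoutMs = 1000) {
  return new Promise((resolve) => {
    const socket = net.connect({ port, host });
    const finish = (open) => {
      socket.destroy(); // release the socket whatever the outcome
      resolve(open);
    };
    socket.setTimeout(timeoutMs);
    socket.once("connect", () => finish(true));
    socket.once("timeout", () => finish(false));
    socket.once("error", () => finish(false)); // e.g. ECONNREFUSED
  });
}

// Probe the default Redis port and report the result.
isPortOpen().then((open) => {
  console.log(open ? "Port 6379 is accepting connections" : "Nothing is listening on port 6379");
});
```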

Configuration

The project requires environment variables to be set. Use the provided .env.example file as a template:

  1. Copy .env.example to .env:

    cp .env.example .env
  2. Populate the .env file with the required values. Example:

    PORT=5005
    UI_ENDPOINT=http://localhost:5173
    AUTH_SERVICE_ENDPOINT=http://localhost:3001
    
    KNOWLEDGE_MINING_QUEUE=knowledge-mining-queue
    KNOWLEDGE_MINING_CONCURRENCY=20
    
    OPENAI_API_KEY=your_openai_api_key
    UNSTRUCTURED_API_URL=your_unstructured_api_url
    UNSTRUCTURED_API_KEY=your_unstructured_api_key
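
A missing or malformed value in `.env` is easier to diagnose at startup than in the middle of a job. The following is a minimal sketch of such a check, not the project's actual configuration code: `loadConfig` and `REQUIRED_KEYS` are hypothetical names, and treating every variable from the example above as required is an assumption.

```javascript
// Hypothetical startup check for the variables listed in .env.example.
// Assumption: every variable shown in the example is required.
const REQUIRED_KEYS = [
  "PORT",
  "UI_ENDPOINT",
  "AUTH_SERVICE_ENDPOINT",
  "KNOWLEDGE_MINING_QUEUE",
  "KNOWLEDGE_MINING_CONCURRENCY",
  "OPENAI_API_KEY",
  "UNSTRUCTURED_API_URL",
  "UNSTRUCTURED_API_KEY",
];

function loadConfig(env = process.env) {
  const missing = REQUIRED_KEYS.filter((key) => !env[key] || env[key].trim() === "");
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  // Number() rejects trailing garbage such as "5005abc", unlike parseInt().
  const port = Number(env.PORT);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${env.PORT}"`);
  }
  const concurrency = Number(env.KNOWLEDGE_MINING_CONCURRENCY);
  if (!Number.isInteger(concurrency) || concurrency < 1) {
    throw new Error(
      `KNOWLEDGE_MINING_CONCURRENCY must be a positive integer, got "${env.KNOWLEDGE_MINING_CONCURRENCY}"`
    );
  }

  return {
    port,
    concurrency,
    uiEndpoint: env.UI_ENDPOINT,
    authServiceEndpoint: env.AUTH_SERVICE_ENDPOINT,
    queueName: env.KNOWLEDGE_MINING_QUEUE,
    openAiApiKey: env.OPENAI_API_KEY,
    unstructuredApiUrl: env.UNSTRUCTURED_API_URL,
    unstructuredApiKey: env.UNSTRUCTURED_API_KEY,
  };
}
```

Calling `loadConfig()` with no argument reads `process.env`, so it only sees the values once the `.env` file has been loaded into the environment.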

Usage

  1. Start the service:

    npm start
  2. The service will start on the configured port (default: 5005).

API Routes

1. Trigger Pipeline

POST /trigger-pipeline

This endpoint triggers a knowledge mining pipeline with a file upload.

Request:

  • Headers:
    • Authorization: Bearer token for authentication.
  • Body (form-data):
    • pipelineId (string, required): The ID of the pipeline to trigger. The ID is the name of the file in which the pipeline is defined (for example, simple_json_to_jsonld or pdf_to_jsonld).
    • fileFormat (string, optional): Format of the uploaded file (for example, json, csv, or pdf).
    • file (file, required): File to be processed.

Example cURL:

curl -X POST http://localhost:5005/trigger-pipeline \
  -H "Authorization: Bearer <your_token>" \
  -F "pipelineId=12345" \
  -F "fileFormat=pdf" \
  -F "file=@example.pdf"

Response:

  • Success (200):
    {
      "pipelineId": "12345",
      "runId": "jobId123",
      "message": "Pipeline triggered successfully",
      "success": true
    }
  • Error (400), one of the following depending on what is missing:
    { "error": "Missing pipelineId" }
    { "error": "No selected file" }
  • Error (500):
    { "error": "Failed to trigger pipeline" }
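
The same request can be made from Node.js 20 with the built-in `fetch`, `FormData`, and `Blob` globals. This is a rough sketch under stated assumptions rather than an official client: `triggerPipeline` and its parameter names are hypothetical, the base URL assumes the example port 5005, and error bodies are assumed to have the `{ "error": ... }` shape shown above.

```javascript
// Hypothetical client for POST /trigger-pipeline.
// fetchImpl is injectable so the function can be exercised without a running service.
async function triggerPipeline({
  baseUrl = "http://localhost:5005", // assumption: service runs on the example port
  token,
  pipelineId,
  fileFormat,
  fileBytes,
  fileName,
  fetchImpl = fetch,
}) {
  const form = new FormData();
  form.append("pipelineId", pipelineId);
  if (fileFormat) {
    form.append("fileFormat", fileFormat); // optional field
  }
  form.append("file", new Blob([fileBytes]), fileName);

  // fetch sets the multipart Content-Type (with boundary) from the FormData body.
  const response = await fetchImpl(new URL("/trigger-pipeline", baseUrl), {
    method: "POST",
    headers: { Authorization: `Bearer ${token}` },
    body: form,
  });

  const body = await response.json();
  if (!response.ok) {
    throw new Error(`Trigger failed (${response.status}): ${body.error}`);
  }
  return body; // { pipelineId, runId, message, success }
}
```

A call might pass `fileBytes: await readFile("example.pdf")` (from `node:fs/promises`) together with `fileName: "example.pdf"`, then keep the returned `runId` for the status check.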

2. Check Pipeline Status

GET /check-pipeline-status

This endpoint retrieves the status of a specific pipeline run.

Request:

  • Headers:
    • Authorization: Bearer token for authentication.
  • Query Parameters:
    • pipelineId (string, required): The ID of the pipeline.
    • runId (string, required): The ID of the specific run to check.

Example cURL:

curl -X GET "http://localhost:5005/check-pipeline-status?pipelineId=12345&runId=jobId123" \
  -H "Authorization: Bearer <your_token>"

Response:

  • Success (200):
    {
      "id": "jobId123",
      "status": "completed",
      "ka": <knowledge_asset_object>
    }
  • Error (400):
    { "error": "Missing pipelineId or runId" }
  • Error (404):
    { "error": "Pipeline not found" }
  • Error (500):
    { "error": "Failed to fetch pipeline status" }
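
Because a run is processed asynchronously, a client typically polls this endpoint until the status becomes "completed". The sketch below shows one way to do that; it is illustrative only. `waitForPipeline`, the polling interval, and the attempt limit are hypothetical, and the "failed" status is an assumption, since only "completed" appears in the example response above.

```javascript
// Hypothetical poller for GET /check-pipeline-status.
// fetchImpl is injectable so the loop can be exercised without a running service.
async function waitForPipeline({
  baseUrl = "http://localhost:5005", // assumption: service runs on the example port
  token,
  pipelineId,
  runId,
  intervalMs = 2000,
  maxAttempts = 30,
  fetchImpl = fetch,
}) {
  const url = new URL("/check-pipeline-status", baseUrl);
  url.searchParams.set("pipelineId", pipelineId);
  url.searchParams.set("runId", runId);

  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const response = await fetchImpl(url, {
      headers: { Authorization: `Bearer ${token}` },
    });
    const body = await response.json();
    if (!response.ok) {
      throw new Error(`Status check failed (${response.status}): ${body.error}`);
    }
    if (body.status === "completed") {
      return body; // includes the "ka" field from the example response
    }
    if (body.status === "failed") {
      // Assumption: the service reports a terminal "failed" status.
      throw new Error(`Run ${runId} failed`);
    }
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Run ${runId} did not complete after ${maxAttempts} checks`);
}
```

A fixed interval keeps the sketch short; a longer interval or a backoff may suit large files better.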

Dependencies

The project uses the following dependencies: