Upload multiple documents for asynchronous processing with vector embeddings.
This endpoint uploads documents and queues them for background processing using Celery. Each document is uploaded to S3 storage and processed to extract embeddings for semantic search and RAG applications. Only text-based file formats are supported.
Request Body (multipart/form-data):
files: List of files to upload (required)
categories (optional): List of category names to organize documents
Returns:
list[int] containing the task ID created for each uploaded document
Raises:
400: If no files are provided
400: If any file exceeds the 100 MB size limit
400: If a file type is unsupported (only text-based files are allowed)
400: If the user doesn’t belong to an organization (required for vector processing)
Supported File Types:
Text-based files only: .txt, .md, .json, .jsonl, .xml, .csv, .html
File Size Limit:
Maximum 100 MB per file
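The type and size rules above can be checked on the client before uploading, which avoids a round trip that would end in a 400. The following is a minimal sketch, not the server's actual validation: check_upload is a hypothetical helper name, the extension list and limit are copied from this page, and it assumes that extensions are matched case-insensitively and that "100 MB" means 100 * 1024 * 1024 bytes.

```python
from pathlib import Path

# Extensions listed under "Supported File Types" on this page.
SUPPORTED_EXTENSIONS = {".txt", ".md", ".json", ".jsonl", ".xml", ".csv", ".html"}

# Assumption: "100 MB" is interpreted as 100 MiB; the server may count 100 * 10**6 bytes.
MAX_FILE_SIZE_BYTES = 100 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return the problems that would likely cause a 400; an empty list means none found."""
    problems = []
    # Assumption: the server ignores case when comparing extensions.
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file exceeds the 100 MB limit")
    return problems
```

For example, check_upload("report.txt", 2048) returns an empty list, while a .png file or a file over the limit yields one message each. The server remains the source of truth, so a file that passes this check can still be rejected.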
Example Request:
POST /api/v3/documents
Headers: {"Authorization": "Bearer <api_key>"}
Content-Type: multipart/form-data
Form Data:
files: [report.txt, analysis.md]
categories: ["financial", "2024-q1"]
Example Response:
[
12345,
12346
]
Notes:
Authentication uses a bearer header of the form Authorization: Bearer <token>, where <token> is your auth token.
Returns a list of task IDs for the uploaded documents. The endpoint validates each file's type and size, creates a processing task for each document, and returns the unique task identifiers. Use these identifiers to track upload and processing status while the Celery background tasks run.