AI Extract Document Node

Extracts text and other content from documents, using a vision-capable AI model for OCR and enhanced image descriptions.

Internal name: ai_processing_extract_document_ai
Package: processing
Long running: yes
Inputs: 8
Outputs: 2
Security exposure: 5/10

Ratings

Scores range from 0 to 10. Higher values mean more impact, exposure, or operational weight.

Security: 5/10 (Medium). Attack surface and exposure impact.
Privacy: 5/10 (Medium). Potential sensitivity of processed data.
Performance: 4/10 (Medium). Runtime or resource pressure.
Governance: 5/10 (Medium). Policy, audit, or compliance impact.
Reliability: 3/10 (High). Operational stability considerations.
Cost: 6/10 (Medium). External or compute cost impact.

Input Pins

Input (exec_in)
Type: Execution
Execution trigger that starts AI-powered document extraction.

File (file)
Type: Struct, schema enforced
Document file to extract (PDF, DOCX, XLSX, images, etc.).

Model (model)
Type: Struct, schema enforced
Vision-capable AI model used for image analysis and OCR.

Extract Images (extract_images)
Type: Boolean
Default: true
Whether to extract and embed images from the document.

Images Per Message (images_per_message)
Type: Integer
Default: 1
Number of images to batch into each LLM request. Higher values are faster but may hit token limits.

Pages Per Batch (pages_per_batch)
Type: Integer
Default: 4
Number of PDF pages to process in parallel. Higher values are faster but use more memory.

Temperature (temperature)
Type: Float
Default: 0.1
LLM temperature (0.0 = deterministic, 1.0 = creative). Lower values are better suited to extraction.

Max Tokens (max_tokens)
Type: Integer
Default: 0
Maximum number of output tokens per LLM call. Leave at 0 to use the model default. Set a lower limit for unreliable models.
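
The tuning pins above interact in a simple way that a short sketch may make easier to see. The Python below is illustrative only and is not the node's implementation: `ExtractSettings`, `chunk`, and `request_options` are hypothetical names, and the validation bounds are assumptions, since the pin descriptions do not state minimum values. It mirrors the documented defaults, shows how `pages_per_batch` and `images_per_message` would split work into groups, and treats `max_tokens = 0` as "use the model default" by leaving the limit out of the per-call options.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Sequence, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class ExtractSettings:
    """Hypothetical container mirroring the node's tunable input pins."""

    extract_images: bool = True
    images_per_message: int = 1
    pages_per_batch: int = 4
    temperature: float = 0.1
    max_tokens: int = 0  # 0 means "use the model default"

    def __post_init__(self) -> None:
        # Assumed bounds: the pin descriptions do not document minimums.
        if self.images_per_message < 1:
            raise ValueError("images_per_message must be at least 1")
        if self.pages_per_batch < 1:
            raise ValueError("pages_per_batch must be at least 1")
        if self.max_tokens < 0:
            raise ValueError("max_tokens must be 0 or positive")


def chunk(items: Sequence[T], size: int) -> List[List[T]]:
    """Split items into consecutive groups of at most `size` elements."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def request_options(settings: ExtractSettings) -> Dict[str, Any]:
    """Build per-call options, omitting max_tokens when it is 0."""
    options: Dict[str, Any] = {"temperature": settings.temperature}
    if settings.max_tokens > 0:
        options["max_tokens"] = settings.max_tokens
    return options


settings = ExtractSettings(images_per_message=2)

# A 10-page PDF with the default pages_per_batch of 4.
page_batches = chunk(list(range(1, 11)), settings.pages_per_batch)
print(page_batches)  # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]

# Three images grouped two per LLM request.
image_messages = chunk(["img_a", "img_b", "img_c"], settings.images_per_message)
print(image_messages)  # [['img_a', 'img_b'], ['img_c']]

# max_tokens is 0, so no explicit limit is passed along.
print(request_options(settings))  # {'temperature': 0.1}
```

Under these assumptions, raising `images_per_message` reduces the number of LLM requests at the cost of larger prompts, while raising `pages_per_batch` reduces the number of batches at the cost of holding more pages in memory at once.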

Output Pins

Output (exec_out)
Type: Execution
Execution output that fires after extraction completes.

Pages (pages)
Type: Struct Array, schema enforced
Extracted document pages with AI-generated descriptions and images.

Node Info

Internal name: ai_processing_extract_document_ai
Category: AI/Processing
Version: 2