Skip to content

Document Conversion

LLMling provides document conversion capabilities to transform various document formats (PDFs, Office documents, HTML, etc.) into markdown text that can be processed by language models.

Overview

The conversion system:

  • Supports multiple document formats through different converters
  • Can handle both files and raw content
  • Uses async I/O for file operations
  • Includes a fallback plain text converter

Usage

Through Conversation Manager

Documents can be added to an agent's context:

# Add document from path
await agent.conversation.add_context_from_path(
    "document.pdf",
    convert_to_md=True  # Enable markdown conversion
)

# Add raw content
await agent.conversation.add_context_message(
    html_content,
    mime_type="text/html"  # Helps converter selection
)

Automatic Conversion

When passing paths to Agent.run(), documents are automatically converted if needed:

# Will convert PDF if model doesn't support it directly
await agent.run(Path("document.pdf"))

# Multiple inputs
await agent.run(
    "Analyze this document:",
    Path("document.pdf"),
    "And compare it with:",
    Path("other.docx")
)

Configuration

Document conversion is configured globally in your agent manifest:

conversion:
  providers:
    - type: markitdown  # Uses MarkItDown for document conversion
      enabled: true
      llm_model: gpt-4  # Optional, for image descriptions
    # More providers will be added soon

Current Status

The document conversion system is in active development. Currently supported:

  • MarkItDown converter for various document formats
  • Plain text fallback converter
  • Async file operations for efficient I/O

Coming soon:

  • More converter implementations

Technical Details

Converters are implemented as sync processors to keep them simple, while the conversion manager handles:

  • Async file I/O
  • Thread pool management for CPU-bound operations
  • Converter selection and fallback
  • Error handling

This design ensures efficient processing while maintaining a clean interface.