Document Conversion

LLMling can convert documents in various formats (PDF, Office documents, HTML, and others) into Markdown text that language models can process.

Overview

The conversion system:

Supports multiple document formats through different converters
Can handle both files and raw content
Uses async I/O for file operations
Includes a fallback plain text converter

Usage

Through Conversation Manager

Documents can be added to an agent's context:

# Add document from path
await agent.conversation.add_context_from_path(
    "document.pdf",
    convert_to_md=True  # Enable markdown conversion
)

# Add raw content (html_content is a placeholder for your own HTML string)
await agent.conversation.add_context_message(
    html_content,
    mime_type="text/html"  # Helps converter selection
)
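
The `mime_type` hint above suggests that converters are chosen by content type. As a rough illustration of that idea (not LLMling's actual selection logic), the sketch below maps MIME types to converter names and falls back to plain text. The `CONVERTERS` table, the `FALLBACK` name, and `pick_converter` are hypothetical; only `mimetypes.guess_type` is a real standard-library call.

```python
from __future__ import annotations

import mimetypes
from pathlib import Path

# Hypothetical registry: MIME type -> converter name (not LLMling's real table).
CONVERTERS = {
    "application/pdf": "markitdown",
    "text/html": "markitdown",
}
FALLBACK = "plain_text"  # assumed name for the plain text fallback converter


def pick_converter(path: str | Path | None = None, mime_type: str | None = None) -> str:
    """Return a converter name for a file path or an explicit MIME type."""
    if mime_type is None and path is not None:
        # Guess from the file extension; yields (None, None) when unknown.
        mime_type, _ = mimetypes.guess_type(str(path))
    return CONVERTERS.get(mime_type or "", FALLBACK)
```

For raw content there is no file extension to inspect, which is why passing an explicit MIME type matters in a scheme like this.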

Automatic Conversion

When you pass paths to Agent.run(), documents are converted automatically if the model cannot handle them directly:

from pathlib import Path

# Converts the PDF if the model doesn't support it directly
await agent.run(Path("document.pdf"))

# Multiple inputs
await agent.run(
    "Analyze this document:",
    Path("document.pdf"),
    "And compare it with:",
    Path("other.docx")
)
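
One way to picture the mixed text-and-path call above is as a pass over the inputs that converts only the paths the model cannot take natively. This is a minimal sketch under that assumption; `build_prompt_parts`, `supports_natively`, and `convert` are hypothetical names, not part of LLMling's API.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable, Sequence


def build_prompt_parts(
    inputs: Sequence[str | Path],
    supports_natively: Callable[[Path], bool],
    convert: Callable[[Path], str],
) -> list[str | Path]:
    """Replace unsupported paths with converted text; keep everything else."""
    parts: list[str | Path] = []
    for item in inputs:
        if isinstance(item, Path) and not supports_natively(item):
            # The model can't read this file directly, so hand it markdown text.
            parts.append(convert(item))
        else:
            # Plain strings and natively supported files pass through unchanged.
            parts.append(item)
    return parts
```

The order of the inputs is preserved, so prompts such as "And compare it with:" stay next to the document they refer to.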

Configuration

Document conversion is configured globally in your agent manifest:

conversion:
  providers:
    - type: markitdown  # Uses MarkItDown for document conversion
      enabled: true
      llm_model: gpt-4  # Optional, for image descriptions
    # More providers will be added soon
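
To make the shape of this configuration concrete, the sketch below models the same keys as a Python dataclass and filters for enabled providers. It is an illustration only: `ProviderConfig`, `parse_conversion_config`, and the default values are assumptions, and the manifest is written as a plain dict standing in for the parsed YAML.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ProviderConfig:
    """Hypothetical typed view of one entry under conversion.providers."""

    type: str
    enabled: bool = True  # assumed default, not confirmed by the docs
    llm_model: str | None = None  # optional, used for image descriptions


def parse_conversion_config(raw: dict) -> list[ProviderConfig]:
    """Build provider configs from an already-parsed manifest dict."""
    providers = raw.get("conversion", {}).get("providers", [])
    return [ProviderConfig(**entry) for entry in providers]


# Dict equivalent of the YAML manifest shown above.
manifest = {
    "conversion": {
        "providers": [
            {"type": "markitdown", "enabled": True, "llm_model": "gpt-4"},
        ]
    }
}
active = [p for p in parse_conversion_config(manifest) if p.enabled]
```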

Current Status

The document conversion system is in active development. Currently supported:

MarkItDown converter for various document formats
Plain text fallback converter
Async file operations for efficient I/O

Coming soon:

More converter implementations

Technical Details

Converters are implemented as synchronous processors to keep them simple, while the conversion manager handles:

Async file I/O
Thread pool management for CPU-bound operations
Converter selection and fallback
Error handling

This design keeps blocking conversion work off the event loop while giving converters a simple synchronous interface.
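
The split between synchronous converters and an async manager can be sketched roughly as follows. This is not LLMling's implementation: the `Converter` protocol, `PlainTextConverter`, `ConversionManager`, and all method names are assumptions made for illustration. Only the standard-library calls (`asyncio.get_running_loop`, `loop.run_in_executor`, `ThreadPoolExecutor`) are real.

```python
from __future__ import annotations

import asyncio
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Protocol, Sequence


class Converter(Protocol):
    """Hypothetical sync converter interface (names are assumptions)."""

    def supports(self, mime_type: str) -> bool: ...
    def convert(self, data: bytes) -> str: ...


class PlainTextConverter:
    """Fallback: decode bytes as UTF-8, replacing undecodable bytes."""

    def supports(self, mime_type: str) -> bool:
        return True

    def convert(self, data: bytes) -> str:
        return data.decode("utf-8", errors="replace")


class ConversionManager:
    def __init__(self, converters: Sequence[Converter], max_workers: int = 4) -> None:
        self._converters = list(converters)
        self._fallback = PlainTextConverter()
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    async def convert(self, data: bytes, mime_type: str) -> str:
        loop = asyncio.get_running_loop()
        for converter in self._converters:
            if not converter.supports(mime_type):
                continue
            try:
                # Run the blocking converter in the pool, off the event loop.
                return await loop.run_in_executor(self._pool, converter.convert, data)
            except Exception:
                continue  # error handling: try the next matching converter
        # No converter matched or all failed: use the plain text fallback.
        return await loop.run_in_executor(self._pool, self._fallback.convert, data)

    async def convert_file(self, path: str | Path, mime_type: str) -> str:
        loop = asyncio.get_running_loop()
        # Read the file in a worker thread so the event loop is never blocked.
        data = await loop.run_in_executor(self._pool, Path(path).read_bytes)
        return await self.convert(data, mime_type)

    def close(self) -> None:
        self._pool.shutdown(wait=True)
```

Because converters only need two plain methods, adding one does not require any async code; the manager alone decides where and how the work runs.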