converters
Class info
Classes
Name | Module | Description |
---|---|---|
BaseConverterConfig | llmling_agent.models.converters | Base configuration for document converters. |
ConversionConfig | llmling_agent.models.converters | Global conversion configuration. |
DoclingConverterConfig | llmling_agent.models.converters | Configuration for docling-based converter. |
DocumentConverter | llmling_agent_converters.base | Base class for document converters. |
GoogleSpeechConfig | llmling_agent.models.converters | Configuration for Google Cloud Speech-to-Text. |
LocalWhisperConfig | llmling_agent.models.converters | Configuration for local Whisper model. |
MarkItDownConfig | llmling_agent.models.converters | Configuration for MarkItDown-based converter. |
PlainConverterConfig | llmling_agent.models.converters | Configuration for plain text fallback converter. |
WhisperAPIConfig | llmling_agent.models.converters | Configuration for OpenAI's Whisper API. |
YouTubeConverterConfig | llmling_agent.models.converters | Configuration for YouTube transcript converter. |
DocStrings
BaseConverterConfig
Bases: BaseModel
Base configuration for document converters.
Source code in src/llmling_agent/models/converters.py
enabled
class-attribute
instance-attribute
enabled: bool = True
Whether this converter is currently active.
type
class-attribute
instance-attribute
type: str = Field(init=False)
Type discriminator for converter configs.
get_converter
get_converter() -> DocumentConverter
Get the converter instance.
Source code in src/llmling_agent/models/converters.py
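The `type` field documented above is described as a discriminator, which usually means a loader reads it to decide which config class to build. The sketch below illustrates that idea using only the standard library; it is not the package's implementation. `PlainSketch`, `MarkItDownSketch`, `REGISTRY`, and `parse_converter` are hypothetical names, and dataclasses stand in for the pydantic models so the example runs without extra dependencies.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class PlainSketch:
    """Hypothetical stand-in for a plain text converter config."""

    enabled: bool = True
    force: bool = False
    # Fixed per class and excluded from __init__, mirroring Field(init=False).
    type: str = field(default="plain", init=False)


@dataclass
class MarkItDownSketch:
    """Hypothetical stand-in for a MarkItDown converter config."""

    enabled: bool = True
    max_size: int | None = None
    type: str = field(default="markitdown", init=False)


# Map each discriminator value to the class that handles it.
REGISTRY = {cls.type: cls for cls in (PlainSketch, MarkItDownSketch)}


def parse_converter(raw: dict[str, Any]) -> Any:
    """Build the config class selected by the 'type' key of a raw mapping."""
    data = dict(raw)  # copy so the caller's mapping is not mutated
    kind = data.pop("type")
    try:
        cls = REGISTRY[kind]
    except KeyError:
        raise ValueError(f"unknown converter type: {kind!r}") from None
    return cls(**data)


cfg = parse_converter({"type": "plain", "force": True})
print(cfg.type, cfg.force)  # plain True
```

Keeping `type` out of the constructor means callers cannot build a class with a mismatched discriminator; the value only ever comes from the class definition.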
ConversionConfig
Bases: BaseModel
Global conversion configuration.
Source code in src/llmling_agent/models/converters.py
DoclingConverterConfig
Bases: BaseConverterConfig
Configuration for docling-based converter.
Source code in src/llmling_agent/models/converters.py
max_size
class-attribute
instance-attribute
max_size: int | None = None
Optional size limit in bytes.
type
class-attribute
instance-attribute
type: Literal['docling'] = Field('docling', init=False)
Type discriminator for docling converter.
get_converter
get_converter() -> DocumentConverter
Get the converter instance.
Source code in src/llmling_agent/models/converters.py
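The `max_size` field above is an optional byte limit, with `None` meaning no limit. A minimal sketch of how such a check could look follows; `within_size_limit` is a hypothetical helper, and treating the limit as inclusive is an assumption, since the page does not show how the converter applies it.

```python
from __future__ import annotations


def within_size_limit(size_bytes: int, max_size: int | None) -> bool:
    """Return True if a document of size_bytes passes the optional limit.

    Assumption: a size exactly equal to max_size is still accepted.
    """
    return max_size is None or size_bytes <= max_size


print(within_size_limit(2048, None))  # True
print(within_size_limit(2048, 1024))  # False
```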
GoogleSpeechConfig
Bases: BaseConverterConfig
Configuration for Google Cloud Speech-to-Text.
Source code in src/llmling_agent/models/converters.py
encoding
class-attribute
instance-attribute
encoding: Literal['LINEAR16', 'FLAC', 'MP3'] = 'LINEAR16'
Audio encoding format.
language
class-attribute
instance-attribute
language: str = 'en-US'
Language code for transcription.
type
class-attribute
instance-attribute
type: Literal['google_speech'] = Field('google_speech', init=False)
Type discriminator for converter config.
get_converter
get_converter() -> DocumentConverter
Get the converter instance.
Source code in src/llmling_agent/models/converters.py
LocalWhisperConfig
Bases: BaseConverterConfig
Configuration for local Whisper model.
Source code in src/llmling_agent/models/converters.py
compute_type
class-attribute
instance-attribute
compute_type: Literal['float32', 'float16'] = 'float16'
Compute precision to use.
device
class-attribute
instance-attribute
device: Literal['cpu', 'cuda'] | None = None
Device to run model on (None for auto-select).
model_size
class-attribute
instance-attribute
model_size: Literal['tiny', 'base', 'small', 'medium', 'large'] = 'base'
Size of the Whisper model to use.
type
class-attribute
instance-attribute
type: Literal['local_whisper'] = Field('local_whisper', init=False)
Type discriminator for converter config.
get_converter
get_converter() -> DocumentConverter
Get the converter instance.
Source code in src/llmling_agent/models/converters.py
MarkItDownConfig
Bases: BaseConverterConfig
Configuration for MarkItDown-based converter.
Source code in src/llmling_agent/models/converters.py
max_size
class-attribute
instance-attribute
max_size: int | None = None
Optional size limit in bytes.
type
class-attribute
instance-attribute
type: Literal['markitdown'] = Field('markitdown', init=False)
Type discriminator for MarkItDown converter.
get_converter
get_converter() -> DocumentConverter
Get the converter instance.
Source code in src/llmling_agent/models/converters.py
PlainConverterConfig
Bases: BaseConverterConfig
Configuration for plain text fallback converter.
Source code in src/llmling_agent/models/converters.py
force
class-attribute
instance-attribute
force: bool = False
Whether to attempt converting any file type.
type
class-attribute
instance-attribute
type: Literal['plain'] = Field('plain', init=False)
Type discriminator for plain text converter.
get_converter
get_converter() -> DocumentConverter
Get the converter instance.
Source code in src/llmling_agent/models/converters.py
WhisperAPIConfig
Bases: BaseConverterConfig
Configuration for OpenAI's Whisper API.
Source code in src/llmling_agent/models/converters.py
type
class-attribute
instance-attribute
type: Literal['whisper_api'] = Field('whisper_api', init=False)
Type discriminator for converter config.
get_converter
get_converter() -> DocumentConverter
Get the converter instance.
Source code in src/llmling_agent/models/converters.py
YouTubeConverterConfig
Bases: BaseConverterConfig
Configuration for YouTube transcript converter.
Source code in src/llmling_agent/models/converters.py
cookies_path
class-attribute
instance-attribute
cookies_path: str | None = None
Optional path to cookies file for age-restricted videos.
format
class-attribute
instance-attribute
format: FormatterType = 'text'
Output format. One of: text, json, vtt, srt.
https_proxy
class-attribute
instance-attribute
https_proxy: str | None = None
Optional HTTPS proxy URL (format: https://user:pass@domain:port).
languages
class-attribute
instance-attribute
Preferred language codes in priority order. Defaults to ['en'].
max_retries
class-attribute
instance-attribute
max_retries: int = 3
Maximum number of retries for failed requests.
preserve_formatting
class-attribute
instance-attribute
preserve_formatting: bool = False
Whether to keep HTML formatting elements in the transcript.
type
class-attribute
instance-attribute
type: Literal['youtube'] = Field('youtube', init=False)
Type discriminator for converter config.
get_converter
get_converter() -> DocumentConverter
Get the converter instance.
Source code in src/llmling_agent/models/converters.py
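The `languages` field of `YouTubeConverterConfig` is described as a list of preferred codes in priority order. One plausible reading is that the first preferred code with an available transcript wins. The sketch below shows that selection rule only; `pick_language` is a hypothetical helper, and the converter's actual lookup is not shown on this page.

```python
from __future__ import annotations

from typing import Sequence


def pick_language(preferred: Sequence[str], available: Sequence[str]) -> str | None:
    """Return the first preferred code that is available, or None if none match."""
    available_set = set(available)
    for code in preferred:  # earlier entries take priority
        if code in available_set:
            return code
    return None


print(pick_language(["de", "en"], ["en", "fr"]))  # en
```

Returning `None` rather than raising leaves it to the caller to decide whether a missing transcript is an error or a reason to try another converter.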