Granite Docling is a multimodal model for efficient document conversion.
10K+
Granite Docling is a multimodal Image-Text-to-Text model engineered for efficient document conversion. It preserves the core features of Docling while maintaining seamless integration with Docling Documents to ensure full compatibility.
| Attribute | Details |
|---|---|
| Provider | IBM Research |
| Architecture | Based on Idefics2-8B; vision encoder = siglip-base-patch16-512; LLM = Granite 165M |
| Cutoff date | - |
| Languages | English (with experimental support for Japanese, Arabic, Chinese) |
| Tool calling | ❌ |
| Input modalities | Text, Image |
| Output modalities | Text |
| License | Apache 2.0 |
| Model variant | Parameters | Quantization | Context window | VRAM¹ | Size |
|---|---|---|---|---|---|
ai/granite-docling:258Mai/granite-docling:258M-F16ai/granite-docling:latest | 258M | MOSTLY_F16 | 8K tokens | 0.86 GiB | 312.88 MB |
ai/granite-docling:258M-Q8_0 | 258M | MOSTLY_Q8_0 | 8K tokens | 0.72 GiB | 166.28 MB |
¹: VRAM estimated based on model characteristics.
latest→258M
docker model run ai/granite-docling
Granite-Docling-258M emphasizes layout fidelity and content integrity over creative or open-ended generation. It is released under Apache 2.0 and integrates seamlessly with the Docling ecosystem for structured document AI workflows.
Content type
Model
Digest
sha256:229f83681…
Size
497.5 MB
Last updated
7 months ago
docker model pull ai/granite-doclingPulls:
176
Last week