📄 DeepSeek-OCR PDF Parser by Jatevo LLM Inference

Upload a PDF to extract its text and convert it to Markdown using DeepSeek-OCR.
Each page is processed sequentially, and the results are combined into a single Markdown document.
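
The sequential page-by-page flow described above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: `combine_pages`, the `ocr_page` callable, the page comment markers, and the `---` separator are all hypothetical choices made for the example.

```python
from typing import Callable, Iterable


def combine_pages(pages: Iterable[bytes], ocr_page: Callable[[bytes], str]) -> str:
    """Run OCR on each page in order and join the results into one Markdown string.

    `ocr_page` is a hypothetical stand-in for whatever call sends a rendered
    page image to the OCR model and returns Markdown for that page.
    """
    sections = []
    for number, page in enumerate(pages, start=1):
        markdown = ocr_page(page).strip()
        # Label each page so the combined document can be traced back to the PDF.
        sections.append(f"<!-- Page {number} -->\n\n{markdown}")
    # A horizontal rule between pages is an assumed convention, not the app's.
    return "\n\n---\n\n".join(sections) + "\n"
```

Passing the OCR step in as a callable keeps the combining logic easy to test without a model behind it.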

✨ Features

🖼️ Image Embedding - Charts, graphs, and figures embedded directly in markdown
📝 Text Extraction - All text content from images and charts extracted
📊 Table Support - Tables converted to markdown format
🔍 Object Detection - Locate specific elements in documents
🎯 Multiple Models - Choose speed vs. accuracy trade-off

📏 Model Sizes

Tiny - Fastest, lower accuracy (512×512); best for large PDFs (30+ pages)
Small - Fast, good accuracy (640×640); good for 15-30 pages
Base - Balanced performance (1024×1024); good for 10-20 pages
Large - Best accuracy, slower (1280×1280); best for <10 pages
Gundam (Recommended) - Optimized for documents (1024 base, 640 image, crop mode)
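
One way to picture the presets above is as a small lookup table plus a helper that suggests a fixed-resolution preset from the page count. This is only a sketch: the field names `base_size`, `image_size`, and `crop_mode`, the assumption that crop mode is off for the fixed-resolution presets, and the exact cutoffs in `suggest_preset` are illustrative (the listed page ranges overlap, so the cutoffs here are one possible reading).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPreset:
    base_size: int    # resolution of the full-page view, in pixels
    image_size: int   # resolution of each tile or image view, in pixels
    crop_mode: bool   # whether the page is split into crops (assumed meaning)


# Values taken from the list above; the field mapping is an assumption.
PRESETS = {
    "Tiny": ModelPreset(512, 512, False),
    "Small": ModelPreset(640, 640, False),
    "Base": ModelPreset(1024, 1024, False),
    "Large": ModelPreset(1280, 1280, False),
    "Gundam": ModelPreset(1024, 640, True),
}


def suggest_preset(page_count: int) -> str:
    """Suggest a fixed-resolution preset by page count (hypothetical cutoffs)."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    if page_count >= 30:
        return "Tiny"
    if page_count >= 20:
        return "Small"
    if page_count >= 10:
        return "Base"
    return "Large"
```

For example, `suggest_preset(25)` returns `"Small"`, which matches the tip to use Tiny or Small for PDFs of 20 or more pages.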

💡 Tips

Enable "Embed Images" to include charts/figures (recommended)
Use the Tiny or Small model for large PDFs (20+ pages)
Processing time: ~2-5 seconds per page depending on model
Maximum recommended: 50 pages at once
Image embedding increases file size (~1-2MB per page with images)
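
The figures in these tips allow a quick back-of-the-envelope estimate before uploading. The helper below is a hypothetical sketch, not part of the app; it simply multiplies the quoted ranges by the page count, and the size upper bound assumes every page contains images.

```python
def estimate_job(page_count: int, embed_images: bool = True) -> dict:
    """Rough (low, high) estimates from the tips: 2-5 s and 1-2 MB per page."""
    if page_count < 0:
        raise ValueError("page_count must not be negative")
    seconds = (2 * page_count, 5 * page_count)
    # Extra output size applies only when images are embedded.
    extra_mb = (1 * page_count, 2 * page_count) if embed_images else (0, 0)
    return {
        "seconds": seconds,
        "extra_mb": extra_mb,
        "over_recommended_limit": page_count > 50,
    }
```

At the recommended maximum of 50 pages this gives roughly 100 to 250 seconds of processing and up to about 100 MB of embedded image data.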

📎 Upload PDF

🎯 Model Size

Use Tiny/Small for large PDFs (20+ pages)

📋 Task Type

⚡ Evaluation Mode

Plain text only (faster)

🖼️ Embed Images

Include charts/figures in output

📊 Processing Status

Watch the progress bar for real-time updates.

Note: Image embedding provides both:

👁️ Visual image (embedded as base64)
📝 Extracted text content (OCR'd from image)

You get the best of both worlds!
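
As an illustration of what "embedded as base64" means in Markdown, the sketch below builds an image tag with a data URI and places the extracted text after it. The function name, the default alt text, and the PNG MIME type are assumptions made for the example; the app's actual output layout may differ.

```python
import base64


def embed_image(image_bytes: bytes, extracted_text: str,
                alt: str = "figure", mime: str = "image/png") -> str:
    """Return Markdown with the image inlined as a data URI, followed by its OCR text."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"![{alt}](data:{mime};base64,{encoded})\n\n{extracted_text.strip()}\n"
```

Base64 encoding makes the image data about one third larger than the raw bytes, which is part of why embedding images increases the size of the output file.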

📝 Markdown Output Preview

Upload a PDF and click 'Process PDF' to see results here.

The output will include both images and extracted text.

📄 Raw Markdown Source (Copy/Download)
