📄 DeepSeek-OCR PDF Parser by Jatevo LLM Inference

Upload a PDF to extract text and convert to Markdown using DeepSeek-OCR.
Each page is processed sequentially and combined into a single markdown document.
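
The sequential page-by-page flow described above could be sketched roughly as follows. This is a minimal illustration, not the app's actual code: `combine_pages`, the `ocr_page` callback, the "Page N" headings, and the `---` separator are all assumptions made for the example.

```python
from typing import Callable, Iterable


def combine_pages(pages: Iterable[bytes], ocr_page: Callable[[bytes], str]) -> str:
    """Run OCR on each page in order and join the results into one Markdown string."""
    sections = []
    for number, page in enumerate(pages, start=1):
        # ocr_page is a hypothetical callback that returns Markdown for one page.
        markdown = ocr_page(page).strip()
        sections.append(f"## Page {number}\n\n{markdown}")
    # A horizontal rule between pages keeps page boundaries visible (an assumption).
    return "\n\n---\n\n".join(sections)


# Stand-in OCR function for demonstration; a real one would call the model.
def fake_ocr(page: bytes) -> str:
    return page.decode("utf-8").upper()


combined = combine_pages([b"first page", b"second page"], fake_ocr)
print(combined)
```

Because pages are handled one at a time, total processing time grows roughly linearly with page count.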

✨ Features

  • 🖼️ Image Embedding - Charts, graphs, and figures embedded directly in markdown
  • 📝 Text Extraction - All text content from images and charts extracted
  • 📊 Table Support - Tables converted to markdown format
  • 🔍 Object Detection - Locate specific elements in documents
  • 🎯 Multiple Models - Choose speed vs. accuracy trade-off

📏 Model Sizes

  • Tiny: Fastest, lower accuracy (512×512) - Best for large PDFs (30+ pages)
  • Small: Fast, good accuracy (640×640) - Good for 15-30 pages
  • Base: Balanced performance (1024×1024) - Good for 10-20 pages
  • Large: Best accuracy, slower (1280×1280) - Best for <10 pages
  • Gundam (Recommended): Optimized for documents (1024 base, 640 image, crop mode)
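
One way to picture the presets above is as a small lookup table plus a page-count heuristic. The sketch below is illustrative only: the `ModelPreset` fields, the choice to store each single resolution as both base and image size, and the way `suggest_preset` resolves the overlapping page ranges (15-30 vs. 10-20) are assumptions, not the app's documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPreset:
    base_size: int    # resolution of the base view, in pixels
    image_size: int   # resolution of the image/crop view, in pixels
    crop_mode: bool   # whether crop mode is enabled


# Resolutions are taken from the list above; field names are hypothetical.
PRESETS = {
    "Tiny": ModelPreset(512, 512, False),
    "Small": ModelPreset(640, 640, False),
    "Base": ModelPreset(1024, 1024, False),
    "Large": ModelPreset(1280, 1280, False),
    "Gundam": ModelPreset(1024, 640, True),
}


def suggest_preset(page_count: int) -> str:
    """Pick a fixed-size preset by page count (overlaps resolved by assumption)."""
    if page_count < 10:
        return "Large"
    if page_count <= 20:
        return "Base"
    if page_count < 30:
        return "Small"
    return "Tiny"


print(suggest_preset(25), PRESETS[suggest_preset(25)])
```

The heuristic deliberately leaves out Gundam, which the list recommends as the general default for documents regardless of length.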

💡 Tips

  • Enable "Embed Images" to include charts/figures (recommended)
  • Use Tiny or Small model for large PDFs (20+ pages)
  • Processing time: ~2-5 seconds per page depending on model
  • Maximum recommended: 50 pages at once
  • Image embedding increases file size (~1-2MB per page with images)
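
The rough figures in these tips can be turned into a quick back-of-the-envelope estimate. The helper below is a hypothetical sketch (the name `estimate_job` and its return shape are invented for illustration); it only multiplies out the ranges quoted above.

```python
def estimate_job(pages: int, pages_with_images: int = 0) -> dict:
    """Return rough (low, high) bounds for processing time and added output size."""
    if pages < 0 or not 0 <= pages_with_images <= pages:
        raise ValueError("page counts must satisfy 0 <= pages_with_images <= pages")
    return {
        # ~2-5 seconds per page, depending on model
        "seconds": (pages * 2, pages * 5),
        # ~1-2 MB added per page that contains embedded images
        "extra_megabytes": (pages_with_images * 1, pages_with_images * 2),
        # 50 pages is the recommended maximum per run
        "over_recommended_limit": pages > 50,
    }


estimate = estimate_job(pages=50, pages_with_images=20)
print(estimate)
```

For a 50-page PDF with 20 image-bearing pages, this gives roughly 100 to 250 seconds of processing and 20 to 40 MB of extra output.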

🎯 Model Size

Use Tiny/Small for large PDFs (20+ pages)

📋 Task Type

Plain text only (faster)

Include charts/figures in output


📊 Processing Status

Watch the progress bar for real-time updates.

Note: Image embedding provides both:

  • 👁️ Visual image (embedded as base64)
  • 📝 Extracted text content (OCR'd from image)

You get the best of both worlds!
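
As a rough illustration of what "embedded as base64" plus extracted text might look like in the output, the sketch below builds a Markdown image with a data URI followed by the OCR'd text. The function name `embed_image`, its parameters, and the exact layout are assumptions for the example; the app's real output format may differ.

```python
import base64


def embed_image(image_bytes: bytes, extracted_text: str,
                alt: str = "figure", mime: str = "image/png") -> str:
    """Return Markdown with an inline base64 image followed by its OCR text."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # A data URI keeps the image inside the Markdown file itself.
    return f"![{alt}](data:{mime};base64,{encoded})\n\n{extracted_text.strip()}"


# The bytes here are just the PNG file signature, used as placeholder data.
snippet = embed_image(b"\x89PNG\r\n\x1a\n", "Revenue grew 12% in Q3.")
print(snippet)
```

Base64 encoding inflates binary data by about a third, which is one reason embedded images noticeably increase file size. Not every Markdown renderer displays data URIs, so results may vary by viewer.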

📝 Markdown Output Preview

Upload a PDF and click 'Process PDF' to see results here.

The output will include both images and extracted text.

📄 Raw Markdown Source (Copy/Download)