Universal Plywood

OCR / PDF Extraction Status

Server OCR and PDF extraction check

This page shows whether the server can extract text from PDFs and convert scanned/image-only PDFs automatically.

Text PDFs: Limited
Scanned PDFs: Needs OCR tools
Images: Needs Tesseract

Tool        Purpose                                              Status        Path
pdftotext   Extracts selectable text from normal PDF files       Not detected  -
pdftoppm    Converts scanned PDF pages to images before OCR      Not detected  -
tesseract   Reads text from scanned images/PDF pages             Not detected  -
shell_exec  Allows PHP to call the above server tools            Enabled       -
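
The same check can be repeated by hand from a shell on the server. The sketch below is only an illustration, not the app's own detection code: tool_status is a hypothetical helper name, and it assumes the shell's PATH matches the one PHP sees, which is not guaranteed on every host.

```shell
#!/bin/sh
# Report whether each command-line tool from the table is on PATH.
# tool_status is a hypothetical helper name, not part of the app.
tool_status() {
    # command -v prints the resolved path when the tool exists
    tool_path=$(command -v "$1" 2>/dev/null) || tool_path=""
    if [ -n "$tool_path" ]; then
        printf '%s\n' "$tool_path"
    else
        printf 'Not detected\n'
    fi
}

# Print one line per tool: name, then path or "Not detected"
for tool in pdftotext pdftoppm tesseract; do
    printf '%-10s %s\n' "$tool" "$(tool_status "$tool")"
done
```

If a tool shows a path here but the page still reports "Not detected", the PHP process may be running with a different PATH or a restricted environment.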

What to ask your hosting/server team to install

For normal text-layer PDFs, the best tool is Poppler pdftotext. For scanned/image-only PDFs, the app needs Poppler pdftoppm and Tesseract OCR.
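
As a rough sketch of how these tools fit together, the function below tries the text layer first and falls back to OCR only when nothing readable comes out. It is an assumption about a reasonable workflow, not the app's actual implementation: extract_pdf_text is a hypothetical name, the 300 DPI rendering resolution is an illustrative choice, and "-l eng" assumes the English language data for Tesseract is installed.

```shell
#!/bin/sh
# extract_pdf_text is a hypothetical helper; the app's real logic may differ.
# Usage: extract_pdf_text input.pdf > output.txt
extract_pdf_text() {
    pdf=$1
    [ -f "$pdf" ] || { echo "no such file: $pdf" >&2; return 2; }

    # Step 1: try the text layer. "-" sends pdftotext output to stdout.
    text=$(pdftotext -layout "$pdf" - 2>/dev/null) || text=""
    # Any non-whitespace character counts as a usable text layer
    if printf '%s' "$text" | grep -q '[^[:space:]]'; then
        printf '%s\n' "$text"
        return 0
    fi

    # Step 2: no text layer, so render pages to PNG and OCR each one.
    workdir=$(mktemp -d) || return 1
    if ! pdftoppm -r 300 -png "$pdf" "$workdir/page"; then
        rm -rf "$workdir"
        return 1
    fi
    for img in "$workdir"/page*.png; do
        [ -f "$img" ] || continue
        # "stdout" as the output base makes tesseract print the text
        tesseract "$img" stdout -l eng 2>/dev/null
    done
    rm -rf "$workdir"
}
```

Trying pdftotext first keeps normal PDFs fast and exact; OCR is slower and can misread characters, so it is reserved for pages that have no text layer.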

On a Linux server, the required packages are normally (Debian/Ubuntu names shown; other distributions may name them differently):

poppler-utils
tesseract-ocr
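
For a server team that wants the exact command, the sketch below prints a likely install line for the detected package manager without running anything. install_hint is a hypothetical helper; the dnf branch assumes the Fedora-style package name "tesseract", which on RHEL-family systems may require an extra repository.

```shell
#!/bin/sh
# install_hint is a hypothetical helper: it prints a command, it does not run it.
install_hint() {
    if command -v apt-get >/dev/null 2>&1; then
        # Debian/Ubuntu use the package names listed above
        echo "sudo apt-get install poppler-utils tesseract-ocr"
    elif command -v dnf >/dev/null 2>&1; then
        # Fedora-style systems name the OCR package "tesseract" (assumption)
        echo "sudo dnf install poppler-utils tesseract"
    else
        echo "Install poppler-utils and tesseract-ocr with your system's package manager"
    fi
}

install_hint
```

After installation, reloading this status page should show a path next to each detected tool.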

On shared cPanel hosting, the host may not allow these packages or may disable shell_exec. In that case a manual fallback remains available: upload and store the scanned PDF, run OCR on it elsewhere, then paste the resulting text into the Parse Documents tab.