Universal Plywood

OCR / PDF Extraction Status

Server OCR and PDF extraction check

This page shows whether the server can extract text from PDFs automatically, including scanned/image-only PDFs.

Text PDFs: Limited
Scanned PDFs: Needs OCR tools
Images: Needs Tesseract

Tool	Purpose	Status	Path
pdftotext	Extracts selectable text from normal PDF files	Not detected	-
pdftoppm	Converts scanned PDF pages to images before OCR	Not detected	-
tesseract	Reads text from scanned images/PDF pages	Not detected	-
shell_exec	Allows PHP to call the above server tools	Enabled	-
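
The check behind a table like this can be approximated by looking each tool up on the server's PATH. The sketch below is only an illustration in Python, not the app's own PHP code; the names detect_tools and format_status are hypothetical, and shell_exec is left out because it is a PHP setting rather than an executable.

```python
import shutil

# Tool names and purposes copied from the status table above.
TOOLS = {
    "pdftotext": "Extracts selectable text from normal PDF files",
    "pdftoppm": "Converts scanned PDF pages to images before OCR",
    "tesseract": "Reads text from scanned images/PDF pages",
}


def detect_tools(which=shutil.which):
    """Return {tool name: full path, or None if not found on PATH}."""
    return {name: which(name) for name in TOOLS}


def format_status(detected):
    """Build tab-separated rows in the same column order as the table."""
    rows = []
    for name, path in detected.items():
        status = "Detected" if path else "Not detected"
        rows.append(f"{name}\t{TOOLS[name]}\t{status}\t{path or '-'}")
    return rows


if __name__ == "__main__":
    for row in format_status(detect_tools()):
        print(row)
```

Running this from an SSH session on the server gives a quick second opinion. Note that a tool visible on the shell PATH may still be invisible to the web server user if its PATH differs.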

What to ask your hosting/server team to install

For normal text-layer PDFs, the best tool is Poppler pdftotext. For scanned/image-only PDFs, the app needs Poppler pdftoppm and Tesseract OCR.
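
The two paths described above can be sketched as a single fallback routine: try the text layer first, and rasterise and OCR the pages only if little or no text comes back. This is a minimal Python illustration, not the app's actual code. The function names, the 300 DPI resolution, and the min_chars threshold are assumptions chosen for the example.

```python
import subprocess
import tempfile
from pathlib import Path


def run_command(cmd):
    """Run a command and return its standard output as text."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def page_number(image_path):
    """Page number from a pdftoppm output name such as page-03.png."""
    return int(image_path.stem.rsplit("-", 1)[1])


def extract_text(pdf_path, runner=run_command, min_chars=20):
    # Step 1: text-layer PDFs. "-" sends pdftotext output to stdout.
    text = runner(["pdftotext", "-layout", str(pdf_path), "-"])
    if len(text.strip()) >= min_chars:  # assumed heuristic for "has a text layer"
        return text

    # Step 2: scanned PDFs. Render each page to PNG, then OCR page by page.
    with tempfile.TemporaryDirectory() as tmp:
        prefix = Path(tmp) / "page"
        runner(["pdftoppm", "-r", "300", "-png", str(pdf_path), str(prefix)])
        pages = sorted(Path(tmp).glob("page-*.png"), key=page_number)
        # "stdout" as the output base makes tesseract print the text.
        return "\n".join(runner(["tesseract", str(p), "stdout"]) for p in pages)
```

Calling extract_text("scan.pdf") needs all three tools installed. The runner parameter exists only so the command execution can be swapped out, for example when testing on a machine without them.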

On a Debian or Ubuntu Linux server, the required packages are normally as follows (package names can differ on other distributions):

poppler-utils
tesseract-ocr

On shared cPanel hosting, the host may not allow these packages to be installed, or may disable shell_exec. In that case a manual fallback is still available: upload/store the scanned PDF, then paste its OCR text into the Parse Documents tab.