To extract all text from a PDF, including text embedded in images or scanned pages, we can combine Ghostscript with a command-line OCR tool called tesseract-ocr. First we convert the PDF into individual image files (TIFF) so that we can then run OCR on them. Ghostscript handles that conversion. It’s probably already installed on…
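
The two-step pipeline described above (PDF to TIFF with Ghostscript, then TIFF to text with tesseract) can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the `gs` and `tesseract` executables are on the PATH, and the function names, the `page_%03d.tif` naming pattern, the `tiffgray` output device, the 300 dpi resolution, and the `eng` language code are all illustrative choices rather than anything prescribed by the text.

```python
import subprocess
from pathlib import Path


def ghostscript_command(pdf_path, out_dir, dpi=300):
    """Build the Ghostscript call that renders each PDF page to its own TIFF.

    Assumptions (not from the original text): grayscale output via the
    'tiffgray' device and a resolution of 300 dpi.
    """
    # Ghostscript replaces %03d with the page number: page_001.tif, page_002.tif, ...
    pattern = str(Path(out_dir) / "page_%03d.tif")
    return [
        "gs", "-q", "-dNOPAUSE", "-dBATCH",
        "-sDEVICE=tiffgray",
        f"-r{dpi}",
        f"-sOutputFile={pattern}",
        str(pdf_path),
    ]


def tesseract_command(tiff_path, lang="eng"):
    """Build the tesseract call for one image.

    tesseract appends '.txt' to the output base itself, so the text for
    page_001.tif ends up in page_001.txt.
    """
    output_base = str(Path(tiff_path).with_suffix(""))
    return ["tesseract", str(tiff_path), output_base, "-l", lang]


def pdf_to_text(pdf_path, out_dir, lang="eng"):
    """Run both steps and return the recognized text of all pages."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: PDF -> one TIFF per page.
    subprocess.run(ghostscript_command(pdf_path, out_dir), check=True)

    # Step 2: OCR each TIFF; zero-padded names keep the pages in order
    # when sorted (up to 999 pages with this pattern).
    pages = []
    for tiff in sorted(out_dir.glob("page_*.tif")):
        subprocess.run(tesseract_command(tiff, lang), check=True)
        pages.append(tiff.with_suffix(".txt").read_text(encoding="utf-8"))
    return "\n".join(pages)
```

Calling `pdf_to_text("scan.pdf", "ocr_out")` (both names hypothetical) would leave the intermediate TIFF and text files in `ocr_out` and return the combined text. Keeping the command construction in separate functions makes it easy to print and inspect the exact command lines before running them.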