My Arabic/Hebrew PDF translator translates PDFs into English and outputs the result as a PDF. The issue is that after the OCR is done, Speechify skips some words when reading the output, and I don't know why. I need someone to fix the code so that Speechify reads the final PDF without skipping any words.

Nothing added to or removed from the code should affect the functionality, features, or speed of the existing program. The output should have the same words in the same format and language; the only difference should be that Speechify reads all of it. I am open to improvements inside the existing OCR code (Tesseract, PDFs tagged as PDF/A, Unicode mapping, accessibility tagging, etc.) or to a post-processing solution, whichever proves more reliable and keeps the workflow simple.
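
As a possible starting point for the post-processing route, here is a minimal diagnostic sketch. It assumes, without confirmation, that the skipped words come from invisible Unicode characters left in the PDF text layer (for example right-to-left marks carried over from the Arabic/Hebrew source, soft hyphens, or ligature glyphs). The function names `find_suspect_chars` and `clean_text_layer` are hypothetical and not part of the existing program; the input is text already extracted from the output PDF with whatever tool the pipeline uses.

```python
import unicodedata

# Unicode general categories that render as nothing on screen:
#   Cf = format characters (bidi marks, zero-width chars, soft hyphen)
#   Cc = control characters
#   Co = private-use code points (often a sign of a broken font mapping)
REPORT_CATEGORIES = {"Cf", "Cc", "Co"}
# Only these are removed; private-use characters are reported but kept,
# because deleting them could drop real content.
STRIP_CATEGORIES = {"Cf", "Cc"}
# Ordinary whitespace controls that must survive.
KEEP = {"\n", "\t"}


def find_suspect_chars(text):
    """Return (index, code point, name) for each invisible character found."""
    hits = []
    for i, ch in enumerate(text):
        if ch in KEEP:
            continue
        if unicodedata.category(ch) in REPORT_CATEGORIES:
            name = unicodedata.name(ch, "UNNAMED")
            hits.append((i, f"U+{ord(ch):04X}", name))
    return hits


def clean_text_layer(text):
    """Fold compatibility forms (e.g. ligatures) and drop invisible characters."""
    # NFKC turns a ligature such as U+FB01 into the plain letters "fi".
    text = unicodedata.normalize("NFKC", text)
    return "".join(
        ch for ch in text
        if ch in KEEP or unicodedata.category(ch) not in STRIP_CATEGORIES
    )


if __name__ == "__main__":
    # Hypothetical sample: an RTL mark, an LTR mark, a soft hyphen, a ligature.
    sample = "The \u200ftranslated\u200e te\u00adxt has a \ufb01nal word."
    for hit in find_suspect_chars(sample):
        print(hit)
    print(clean_text_layer(sample))
```

If the report comes back empty for a page where Speechify skips words, the cause is more likely in the PDF structure (reading order, tagging, font-to-Unicode mapping) than in the characters themselves. Note that NFKC also folds other compatibility forms (such as non-breaking spaces into regular spaces), and removing a zero-width space joins the two words around it, so the cleaned text should be compared against the original before this is wired into the pipeline.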