Mixed-Format PDF Data Extraction

Budget: $30

I have a set of PDFs that contain purely textual information presented in a mix of paragraphs, scattered labels, and occasional row-style groupings. I need every single data point moved into a clean Excel workbook so nothing is lost in translation.

Because the layout switches between narrative blocks and ad-hoc table-like sections, the job will likely involve both automated capture (Python, Power Query, or your preferred OCR tool) and some careful manual cleanup to preserve order and context.

Deliverable: one well-structured .xlsx file per source PDF, with all columns and data fields represented exactly as they appear, ready for filtering and analysis. Consistent header naming and no stray line breaks or merged cells, please.

If you’ve handled mixed-format PDFs before and can turn them into tidy spreadsheets quickly, I’m ready to get started as soon as you are.
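To make the cleanup expectations concrete, here is a minimal Python sketch of the post-extraction step, assuming the PDF text has already been pulled out as a list of lines. The function names (`normalize_header`, `clean_cell`, `lines_to_rows`) and the two layout heuristics ("Label: value" for scattered labels, two or more spaces between cells in row-style groupings) are illustrative assumptions, not a description of the actual files; real PDFs will likely need different rules plus manual review.

```python
import re

# Assumed heuristic: a short label, a colon, whitespace, then a value.
LABEL_RE = re.compile(r"^([^:]{1,40}):\s+(\S.*)$")


def normalize_header(label: str) -> str:
    """Turn a raw label into a consistent snake_case header name."""
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", label.strip().lower())
    return cleaned.strip("_")


def clean_cell(value: str) -> str:
    """Collapse stray line breaks, tabs, and repeated spaces into single spaces."""
    return re.sub(r"\s+", " ", value).strip()


def lines_to_rows(lines):
    """Classify each extracted line while preserving its original position.

    Each result keeps the 1-based source line number in "order" so the
    spreadsheet can be sorted back into document order.
    """
    rows = []
    for order, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # skip blank lines but keep numbering of the others
        match = LABEL_RE.match(text)
        if match:
            # Scattered label: one header, one value.
            rows.append({
                "order": order,
                "kind": "label",
                "header": normalize_header(match.group(1)),
                "cells": [clean_cell(match.group(2))],
            })
            continue
        cells = re.split(r"\s{2,}", text)
        if len(cells) > 1:
            # Row-style grouping: one cell per column, never merged.
            rows.append({
                "order": order,
                "kind": "row",
                "header": "",
                "cells": [clean_cell(c) for c in cells],
            })
        else:
            # Narrative text: kept whole in a single cell.
            rows.append({
                "order": order,
                "kind": "paragraph",
                "header": "",
                "cells": [clean_cell(text)],
            })
    return rows


sample = [
    "Invoice No.: 1042",
    "",
    "Widget A    3    9.50",
    "Payment is due\twithin 30 days.",
]
rows = lines_to_rows(sample)
```

The resulting list of dictionaries could then be written to one workbook per PDF with a library such as openpyxl or pandas; that step is left out here so the sketch runs with the standard library alone.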

Python
