AI Text Analysis Pipeline

Client: AI | Published: 18.01.2026
Budget: $750

I have a collection of PDFs that need to be treated as text-data sources for an AI project whose core purpose is data analysis. I already have Canva templates waiting for the finished insights, so the workflow you build must comfortably bridge PDF ingestion, natural-language analysis, and export in a format that slips straight into those designs.

Scope of work
• Create or configure a reliable extractor that pulls clean, structured text from multi-page PDFs.
• Apply NLP techniques that reveal meaningful patterns in the text. The mandatory focus is analysis; whether you surface sentiment, topics, entities, or a thoughtful mix is open for discussion, so long as the results are accurate and clearly interpretable.
• Deliver the findings as CSV, JSON, or another machine-friendly format I can feed into my Canva templates and, when needed, use to regenerate updated PDFs.

Acceptance criteria
1. 100% of pages successfully parsed with negligible text loss.
2. Analysis script/notebook documented well enough for me to rerun on new files.
3. Output matches agreed-upon schema and imports cleanly into Canva.

Python with libraries such as PyPDF2, pdfplumber, spaCy, NLTK, or a comparable stack is welcome. Feel free to propose alternatives if they achieve the same reliability.

If you've built something similar or can demo a quick proof-of-concept extraction on one of my sample PDFs, that will move the conversation forward quickly.
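
For reference, the ingestion, analysis, and export steps described in this posting could be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not a required design. It assumes pdfplumber for extraction and uses a plain word-frequency count as a stand-in for the real NLP step (sentiment, topics, or entities would replace it). The function names (`extract_pages`, `parse_coverage`, `top_terms`, `to_csv`), the stopword list, and the `term,count` schema are hypothetical placeholders; the actual schema is still to be agreed.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real run would use spaCy or NLTK.
STOPWORDS = {"the", "and", "for", "that", "with", "were", "while", "was", "are"}


def extract_pages(pdf_path):
    """Return one text string per page (empty string if a page yields no text)."""
    import pdfplumber  # third-party; imported lazily so the rest runs without it

    with pdfplumber.open(pdf_path) as pdf:
        return [page.extract_text() or "" for page in pdf.pages]


def parse_coverage(pages):
    """Fraction of pages that produced any non-whitespace text."""
    if not pages:
        return 0.0
    return sum(1 for text in pages if text.strip()) / len(pages)


def top_terms(pages, n=5):
    """Placeholder analysis: the n most frequent non-stopword tokens."""
    tokens = re.findall(r"[a-z']+", " ".join(pages).lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return counts.most_common(n)


def to_csv(rows, header=("term", "count")):
    """Serialize (term, count) rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # In-memory sample pages stand in for extract_pages("sample.pdf").
    sample = [
        "Revenue grew. Revenue targets were met.",
        "",  # simulates a page that yielded no text
        "Costs fell while revenue grew.",
    ]
    print(f"pages with text: {parse_coverage(sample):.0%}")
    print(to_csv(top_terms(sample, n=2)))
```

A coverage value below 100% would flag pages that need a fallback such as OCR before the first acceptance criterion can be met. CSV is used here only because it is the first format named in the scope; JSON output would follow the same pattern.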