Finish Python-AWS ELT Tool

Client: AI | Posted: 19.09.2025
Budget: $750

I’ve already scaffolded a web-based ELT platform that ingests raw data and handles user management, but the brain of the system (the flexible data-processing layer) still needs to be hardened. The codebase is Python (FastAPI + SQLAlchemy), running on AWS (S3, Lambda, ECS) with Postgres as the metadata store.

Here’s what remains on my side:

• Design and implement reusable transformation pipelines that can be configured from the UI and executed on demand or on a schedule.
• Add job-level monitoring, logging and graceful failure recovery so I can trace every run in CloudWatch and surface status back to the frontend.
• Optimize the actual transform logic (Pandas / PySpark acceptable) to cope with multi-GB CSV and JSON files without timeouts.
• Expose a REST endpoint to trigger, pause, resume and cancel jobs; wire it into the existing auth middleware.
• Package the whole thing in Terraform scripts so I can spin up a new environment quickly.

I’ll give you GitHub access, the current CI pipeline (GitHub Actions → ECR → ECS), and a small sample dataset.

Success for me is being able to:

1. Create a transformation in the UI, hit “Run,” and watch it complete via logs.
2. See structured error reports when rows fail validation.
3. Redeploy to a fresh AWS account and have it work end-to-end.

If you’re comfortable jumping into an existing codebase, writing clean Python, and leveraging AWS services efficiently, I’m ready to start as soon as you can.
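To make the "multi-GB files without timeouts" and "structured error reports" requirements concrete, here is a minimal sketch of the kind of chunked Pandas transform I have in mind. The `amount` column and its validation rule are illustrative stand-ins, not part of the existing codebase:

```python
import io
import pandas as pd

def run_transform(csv_source, chunksize=50_000):
    """Process a CSV in fixed-size chunks so a multi-GB file never loads
    fully into memory. Rows that fail validation are collected as
    structured error records instead of aborting the run."""
    good_chunks, errors = [], []
    for chunk in pd.read_csv(csv_source, chunksize=chunksize):
        # Illustrative rule: `amount` must parse as a non-negative number.
        amount = pd.to_numeric(chunk["amount"], errors="coerce")
        bad = amount.isna() | (amount < 0)
        for idx in chunk.index[bad]:
            errors.append({
                "row": int(idx),
                "reason": "invalid amount",
                "value": chunk.at[idx, "amount"],
            })
        good_chunks.append(chunk[~bad].assign(amount=amount[~bad]))
    result = (pd.concat(good_chunks, ignore_index=True)
              if good_chunks else pd.DataFrame())
    return result, errors

# Small in-memory example: rows 2 and 3 fail validation.
sample = io.StringIO("id,amount\n1,10\n2,-5\n3,oops\n4,7\n")
df, errs = run_transform(sample, chunksize=2)
```

The error dicts are what I would want surfaced to the frontend as the per-row failure report; a Spark port would follow the same split-validate-collect shape.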
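For the trigger/pause/resume/cancel endpoint, the part I care about is the job lifecycle behind it. A sketch of the state machine I expect the REST layer to enforce (state names and transition rules are my assumptions, to be adjusted to the real job runner):

```python
from enum import Enum

class JobState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    PAUSED = "paused"
    CANCELLED = "cancelled"
    COMPLETED = "completed"

# Which REST action is legal from which state (assumed rules).
TRANSITIONS = {
    "trigger": {JobState.PENDING: JobState.RUNNING},
    "pause":   {JobState.RUNNING: JobState.PAUSED},
    "resume":  {JobState.PAUSED: JobState.RUNNING},
    "cancel":  {JobState.PENDING: JobState.CANCELLED,
                JobState.RUNNING: JobState.CANCELLED,
                JobState.PAUSED: JobState.CANCELLED},
}

def apply_action(state: JobState, action: str) -> JobState:
    """Return the next state, or raise ValueError for an illegal action;
    the FastAPI handler would map that error to an HTTP 409."""
    try:
        return TRANSITIONS[action][state]
    except KeyError:
        raise ValueError(f"cannot {action!r} a job in state {state.value!r}")
</test>
```

Keeping the transitions in a plain table like this makes the endpoint handlers thin wrappers: auth middleware first, then one `apply_action` call, then a status write to Postgres.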
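On "trace every run in CloudWatch": on ECS and Lambda, anything printed to stdout lands in CloudWatch Logs, and JSON-per-line records can be filtered there with patterns like `{ $.job_id = "..." }`. A minimal sketch of the event format I'd expect (field names are illustrative):

```python
import json
import time

def job_event(job_id: str, status: str, **fields) -> str:
    """Serialize one job-run event as a single JSON line. Printed to
    stdout, each line becomes a structured CloudWatch record, so a whole
    run can be traced by filtering on job_id."""
    record = {"job_id": job_id, "status": status, "ts": time.time(), **fields}
    return json.dumps(record)

# Example: one event per lifecycle step of a run.
line = job_event("job-42", "failed", rows_ok=980, rows_failed=20)
```

The same records, read back from the Postgres metadata store or from CloudWatch, are what the frontend would poll to show run status.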