Role Purpose

We are building a high-integrity, AI-assisted intelligence platform that converts unstructured and semi-structured data into decision-support insights. The platform operates under explicit scope, governance, and safety constraints, prioritising correctness, explainability, and auditability. This role owns the entire SDLC and technical lifecycle: system architecture, data engineering, machine learning integration, infrastructure deployment, and quality assurance.

Core Responsibilities

Platform & Architecture
• Design and own a multi-store architecture (relational DB, object storage, search, vector systems)
• Define and enforce clear source-of-truth boundaries

Database Design & Data Engineering
• Lead schema and data model design for entities, relationships, events, and provenance
• Model data as timelines and relationship graphs
• Implement data validation, reconciliation, and integrity checks (see the sketches at the end of this document)
• Design and enforce a data hygiene and completeness framework

Unstructured Data Processing
• Build batch/offline pipelines for documents, text, and transcripts
• Implement OCR, parsing, and layout-aware extraction
• Preserve provenance, confidence scores, and audit logs (see the sketches at the end of this document)

Data Science & Machine Learning
• Integrate ML pipelines for extraction, classification, resolution, and similarity
• Ensure models are descriptive, explainable, versioned, and monitored
• Translate experimental data science into production systems

Backend Services & APIs
• Design APIs that expose synthesised, traceable outputs
• Integrate third-party APIs to interface with official (verified) data sources
• Enforce RBAC, logging, and misuse prevention

Frontend & UI
• Web app (Next.js or equivalent)
• UI kit (Material UI / Chakra or equivalent)
• PDF.js
• Sandboxed output

DevOps & Infrastructure
• Provision infrastructure using Infrastructure-as-Code
• Containerise and deploy services using Docker and Kubernetes (or equivalent)
• Build CI/CD pipelines and manage environments
• Ensure systems are monitorable, fault-tolerant, and recoverable

Observability & Operations
• Implement metrics, logging, and alerting
• Automate monitoring for pipeline health, latency, and failures
• Provide break/fix support (CI/CD, infrastructure, etc.)

Quality Assurance (basic/intermediate)
• Implement unit, integration, and basic end-to-end tests
• Validate data integrity and ML output sanity
• Support UAT and acceptance sign-off

Required Experience
• 10+ years in platform/backend/data engineering
• Advanced Python
• Strong PostgreSQL and relational modelling
• Experience with search engines and ML pipelines
• Docker, Kubernetes, and CI/CD experience

Non-Goals
• No real-time streaming systems
• No predictive or scoring systems
• No black-box AI outputs

Success Criteria (timeline to be confirmed)
• Platform deployed via reproducible infrastructure
• End-to-end pipelines operational with observability
• QA coverage in place
• No architectural path for prohibited behaviours
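
Illustrative sketch (non-normative): the bullet on preserving provenance, confidence scores, and audit logs can be read as a data-modelling requirement. Below is a minimal Python sketch of one way such a record could be shaped; all names here (ExtractionRecord, AuditEvent, and their fields) are hypothetical and not part of this specification.

```python
# Minimal sketch (assumed names throughout): an extracted value that
# always carries its source reference, confidence score, and audit trail.
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """One step in the processing history of an extracted value."""
    stage: str            # e.g. "ocr", "parse", "classify"
    model_version: str    # pinned version of the model or parser used
    timestamp: datetime
    detail: str = ""


@dataclass
class ExtractionRecord:
    """An extracted value that never loses sight of where it came from."""
    value: str                    # the extracted text or entity
    source_document_id: str       # key of the source document in object storage
    source_span: tuple[int, int]  # character offsets within the source
    confidence: float             # extractor confidence in [0.0, 1.0]
    audit_trail: list[AuditEvent] = field(default_factory=list)

    def record(self, stage: str, model_version: str, detail: str = "") -> None:
        """Append an audit event; the trail is treated as append-only."""
        self.audit_trail.append(
            AuditEvent(stage, model_version, datetime.now(timezone.utc), detail)
        )
```

The design point is that confidence and provenance travel with the value itself, so downstream APIs can expose traceable outputs without re-joining against pipeline logs.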
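
Illustrative sketch (non-normative): similarly, the validation and reconciliation bullet could be satisfied by batch-level checks that surface discrepancies for audit rather than silently repairing them. The sketch below assumes a PostgreSQL table named documents with batch_id and source_uri columns; both names are hypothetical.

```python
# Minimal reconciliation sketch. Table and column names ("documents",
# "batch_id", "source_uri") are assumptions for illustration only.
# `conn` is any DB-API-compatible PostgreSQL connection (e.g. psycopg2).


def reconcile_batch(conn, batch_id: str, expected_count: int) -> list[str]:
    """Return human-readable integrity findings for one ingested batch."""
    findings: list[str] = []
    with conn.cursor() as cur:
        # Completeness: did every record in the batch get loaded?
        cur.execute(
            "SELECT count(*) FROM documents WHERE batch_id = %s", (batch_id,)
        )
        (loaded,) = cur.fetchone()
        if loaded != expected_count:
            findings.append(
                f"batch {batch_id}: expected {expected_count} rows, found {loaded}"
            )
        # Provenance: no record may exist without a source reference.
        cur.execute(
            "SELECT count(*) FROM documents"
            " WHERE batch_id = %s AND source_uri IS NULL",
            (batch_id,),
        )
        (orphans,) = cur.fetchone()
        if orphans:
            findings.append(f"batch {batch_id}: {orphans} rows missing provenance")
    # Findings are reported, never auto-corrected: reconciliation exists to
    # surface discrepancies for audit, which keeps the data trail honest.
    return findings
```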