HTML-to-EPUB Automation

I maintain a growing archive of academic articles already marked up in clean, semantic HTML. I need a repeatable workflow (script, CLI tool, or lightweight web interface, whatever you feel is most robust) that ingests those HTML files and outputs polished EPUB 3 files ready for distribution on major e-reader platforms.

The finished EPUBs must include:

• a navigable, hierarchically correct Table of Contents generated from the existing heading structure
• support for embedded multimedia (e.g., audio clips, short video, and high-resolution images) that remains standards-compliant and passes EPUBCheck
• basic interactive elements such as internal quizzes or expandable footnotes rendered with unobtrusive JavaScript/CSS while degrading gracefully on devices that block scripts

I am open to the underlying toolchain (Pandoc, Calibre’s command-line tools, custom Python scripts, or a Node-based build), but the process must run unattended once configured. Batch conversion, clear logging, and a way to tweak metadata (title, author, keywords, academic citation info) before packaging are essential.

Deliverables

1. Source code or scripts with inline comments
2. One-click or single-command build instructions documented in README
3. Sample converted EPUB demonstrating all three requested features
4. Brief guide outlining how to extend the pipeline for new HTML inputs

Your proposal should explain the overall approach, key libraries or frameworks you plan to use, and any assumptions about the input markup. If you have prior work converting HTML to rich EPUB 3, mention the most relevant example and any challenges you solved. No long portfolio dump is needed, just the specifics that prove this will work.
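To illustrate the Table of Contents requirement, here is a minimal sketch of how a nested outline could be derived from the heading structure. It uses only the Python standard library and assumes the input uses plain h1 to h6 elements to express hierarchy. The names HeadingCollector and build_toc are hypothetical, chosen for this example. A real pipeline would also need to assign anchor ids and write the EPUB navigation document, which this sketch leaves out.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every h1-h6 element in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None   # level of the heading currently open, if any
        self._parts = []     # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            # Join fragments and collapse runs of whitespace.
            text = " ".join("".join(self._parts).split())
            self.headings.append((self._level, text))
            self._level = None


def build_toc(html):
    """Return a nested list of {"level", "title", "children"} dicts."""
    collector = HeadingCollector()
    collector.feed(html)

    # Assumption: a synthetic level-0 root acts as the parent of all top-level headings.
    root = {"level": 0, "title": None, "children": []}
    stack = [root]
    for level, title in collector.headings:
        # Pop until the top of the stack is a shallower heading (the parent).
        while stack[-1]["level"] >= level:
            stack.pop()
        node = {"level": level, "title": title, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


if __name__ == "__main__":
    sample = "<h1>Intro</h1><h2>Methods</h2><h3>Data</h3><h2>Results</h2>"
    for entry in build_toc(sample):
        print(entry["title"], [child["title"] for child in entry["children"]])
```

A stack-based pass keeps the outline hierarchically correct even when an article skips a level (for example, an h3 directly under an h1), because each heading attaches to the nearest shallower heading before it.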

Python
