Generic Rule-Based Automation Framework test

Client: AI | Published: 03.01.2026

Automation Framework: Live Testing, Debugging & Stabilization (AnyDesk)

CONTEXT

I already have a completed automation framework codebase (Dockerized, Python-based). All files, configs, workflows, and documentation are already available.

Your job is NOT to build anything from scratch. Your job is to live-test, debug, and stabilize an existing framework against a strict production-grade checklist. This is a verification and fixing task, not development.

ACCESS MODE

- Testing will be done via AnyDesk / remote session
- You will test on my machine / server
- No code exfiltration
- No refactor-for-style
- No architecture redesign unless strictly required to fix failures

SYSTEM SUMMARY (WHAT YOU WILL SEE)

- Docker Compose based system
- Python framework

Modules include:
- Supervisor / Scheduler
- Rules Engine (YAML/JSON)
- Browser Automation (Playwright, single browser)
- Telegram Bot (approvals, commands)
- Safety Layer (anti-loop, throttling, circuit breaker)
- Persistent state (SQLite)
- Hot-reload watcher
- Voice interface abstraction (stub only)

Hardware constraints:
- i3 CPU
- 8GB RAM
- No GPU
- Single browser instance

YOUR PRIMARY RESPONSIBILITY

You must run and verify the system against the following MANDATORY TESTS. (You must NOT assume anything works.)
REQUIRED LIVE TESTS (YOU MUST EXECUTE THESE)

1) Clean Start Test

    docker-compose down -v
    docker-compose up --build

Expected:
- No crash loop
- Supervisor stays alive
- No silent exits

2) Hot Reload Test (CRITICAL)

Modify a running YAML rule/workflow. System must:
- Reload automatically
- Not restart containers
- Not crash
- Log hot_reload_applied

If restart required → FAIL

3) Invalid Config Safety Test

Insert invalid YAML intentionally.

Expected:
- System does NOT crash
- Telegram alert sent
- Old workflows continue running

4) Telegram Command Test

Test these live:

    /status
    /workflows
    /run <test_workflow>
    /pending
    /quarantined

Expected:
- Human-readable responses
- No spam
- Correct system state reporting

5) Approval Queue Test (VERY IMPORTANT)

Run a workflow that requires approval. Do NOT approve.

Expected:
- Workflow freezes
- No auto-approve
- System continues other tasks

Test:

    /approve
    /reject

6) Fatigue Guard Test

Trigger many approval requests quickly.

Expected:
- Throttling
- Low-priority defer
- Critical approvals still allowed

7) Infinite Loop / Retry Storm Test

Force a failing browser action (bad selector).

Expected:
- Retry ladder
- Cooldown
- Quarantine
- No infinite restart
- Supervisor remains stable

8) Screenshot-on-Error Test

Trigger a browser error.

Expected:
- Screenshot auto captured
- Path logged
- Telegram alert includes reference

9) Cold Restart Memory Test

    docker-compose down
    docker-compose up

Expected:
- Failure counters preserved
- Quarantine preserved
- No reset storm

10) Voice Abstraction Test

Trigger a workflow using the voice hook.

Expected:
- Stub invoked
- No crash
- Clear extension point

WHAT YOU MUST DELIVER

A) PASS / FAIL REPORT

For each test (1–10), clearly state: PASS, FAIL, or PARTIAL.

B) ERROR FIXES

If anything fails:
- Fix it immediately
- Minimal changes only
- Explain what was wrong and why

C) FINAL CONFIRMATION

At the end, give a clear written statement: “This system is stable for long-running, rule-driven automation under low-resource constraints.” OR explicitly state what still blocks that claim.

IMPORTANT RESTRICTIONS

- Do NOT add business logic
- Do NOT add paid services
- Do NOT simplify tests
- Do NOT bypass approvals
- Do NOT convert this into a demo-only system

SKILL REQUIREMENTS (MANDATORY)

- Python orchestration systems
- Docker & Docker Compose
- Long-running process stability
- Rule-driven engines
- Debugging race conditions & restart storms
- Production mindset (not demo mindset)

PAYMENT EXPECTATION

This is a short, high-skill engagement:
- Testing
- Fixing
- Stabilization
- Verification

Not feature development.

If you are comfortable doing live AnyDesk testing, debugging under pressure, and giving an honest pass/fail verdict, reply YES and we will proceed.

FINAL NOTE (IMPORTANT)

This system is not a toy. It must behave correctly even when:
- configs are wrong
- browser fails
- approvals are ignored
- Telegram is slow
- restarts happen

Only contact me if you can test at this level.
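
REFERENCE SKETCH (TESTS 7 AND 9)

To make the expected behavior in tests 7 and 9 concrete, here is a minimal sketch of a retry ladder with cooldown and quarantine whose counters live in SQLite and therefore survive a restart. It is an illustration only, not the framework's actual code: the table name, the thresholds, and the function names (open_state, record_failure, may_run) are all hypothetical.

```python
import sqlite3

# Hypothetical values, chosen only for illustration.
MAX_RETRIES = 3             # failures allowed before quarantine
COOLDOWNS = [5, 30, 120]    # retry ladder: seconds to wait after failure 1, 2, 3+


def open_state(path):
    """Open (or create) the persistent state database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS workflow_state ("
        "name TEXT PRIMARY KEY, failures INTEGER NOT NULL, "
        "quarantined INTEGER NOT NULL, retry_after REAL NOT NULL)"
    )
    conn.commit()
    return conn


def get_state(conn, name):
    """Return (failures, quarantined, retry_after); zeros if the workflow is unknown."""
    row = conn.execute(
        "SELECT failures, quarantined, retry_after "
        "FROM workflow_state WHERE name = ?",
        (name,),
    ).fetchone()
    return row if row else (0, 0, 0.0)


def record_failure(conn, name, now):
    """Count one failure, set the next cooldown, quarantine at MAX_RETRIES."""
    failures, quarantined, _ = get_state(conn, name)
    failures += 1
    if failures >= MAX_RETRIES:
        quarantined = 1
    # Climb the ladder; stay on the last rung once it is reached.
    cooldown = COOLDOWNS[min(failures, len(COOLDOWNS)) - 1]
    conn.execute(
        "INSERT OR REPLACE INTO workflow_state VALUES (?, ?, ?, ?)",
        (name, failures, quarantined, now + cooldown),
    )
    conn.commit()
    return failures, bool(quarantined)


def may_run(conn, name, now):
    """A workflow may run only if it is not quarantined and its cooldown has passed."""
    _, quarantined, retry_after = get_state(conn, name)
    return not quarantined and now >= retry_after
```

In a check along these lines, a workflow that fails three times should end up quarantined, and closing and reopening the database (the equivalent of docker-compose down followed by docker-compose up with the volume kept) should still report three failures and a quarantined state. If the real system resets either value after a restart, test 9 is a FAIL.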