Open-Source Semantic RAG Web App

I’m putting together a lean Retrieval-Augmented-Generation web application that lets users type a natural-language query, semantically searches a private document collection, then returns ranked matches and an auto-generated answer. Everything must stay fully open-source.

Core stack
• Model: BERT (Hugging Face), fine-tuned or out-of-the-box for text embeddings.
• Vector store: FAISS, Milvus, Weaviate or a similarly licence-friendly alternative.
• Back-end: Python with FastAPI (or Flask) and PyTorch/Sentence-Transformers.
• Front-end: a minimal HTML/React page that submits a query and shows results + generated summary.

Scope of work
1. Set up the pipeline that ingests text files/markdown/PDF, chunks them, builds the vector index, and stores metadata.
2. Expose REST endpoints for “/ingest” and “/search”.
3. For every search, retrieve top-k passages with cosine similarity and feed them back into the generation step, then stream the answer.
4. Package the whole thing in Docker so I can run it locally or deploy to a small VPS.
5. Provide a concise README with environment setup, run commands, and sample curl calls.

Acceptance criteria
• Search returns relevant passages for at least 90% of the supplied test queries.
• Latency per query (on CPU) under two seconds for a 10k-document corpus.
• All code is clean, commented, and reproducible from a single docker-compose up.

Optional but nice to have
– Basic user authentication.
– Hot-reload ingestion so new documents appear without a full re-index.

If you’ve already wired up BERT-based semantic search or built RAG demos, I’d love to see a quick link or repo. Let’s keep this straightforward, open-source, and ready to extend.
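To make the chunking and top-k cosine retrieval steps in the scope of work concrete, here is a rough, standard-library-only sketch of the behaviour I have in mind. The function names (`chunk_text`, `toy_embed`, `cosine`, `top_k`) and the window sizes are placeholders of mine, not a required design. `toy_embed` is only a bag-of-words stand-in: the real build would call a BERT/Sentence-Transformers encoder and query a vector store such as FAISS instead of scanning a Python list.

```python
import math
from collections import Counter


def chunk_text(text, max_words=50, overlap=10):
    """Split text into overlapping windows of at most max_words words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once this window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


def toy_embed(text):
    """Placeholder embedding: lower-cased word counts (NOT a BERT vector)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, chunks, k=3):
    """Return the k (score, chunk) pairs most similar to the query."""
    query_vec = toy_embed(query)
    scored = [(cosine(query_vec, toy_embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

In the actual pipeline, the passages returned by something like `top_k` would be passed as context to the generation step, and the answer streamed back from the “/search” endpoint.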
