RAGScope: Drag and Drop RAG Assistant
A Glass-Box RAG Framework for Pipeline Debugging and Visualization
RAGScope is a RAG framework that exposes every step of the retrieval-augmented generation pipeline, from document ingestion and query embedding to vector retrieval and LLM generation. Built with Python 3.11, FastAPI, and LangChain (LCEL), it visualizes retrieved chunks with their similarity scores in real time and reports a per-stage latency breakdown, token usage, and cost. Multi-config support lets you compare chunking strategies (Recursive vs. Fixed), test different embedding models, and experiment with chunk size, overlap, and retrieval parameters to optimize RAG performance.
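To make the Recursive vs. Fixed comparison concrete, here is a minimal pure-Python sketch of the two strategies. It is an illustration only, not RAGScope's code and not LangChain's splitter: the function names `fixed_chunks`, `recursive_chunks`, and `summarize` are hypothetical, sizes are counted in characters rather than tokens, and the recursive variant is simplified (it applies no overlap).

```python
from typing import Dict, List, Sequence


def fixed_chunks(text: str, chunk_size: int, overlap: int) -> List[str]:
    """Cut text into windows of chunk_size characters.

    Each window shares `overlap` characters with the previous one.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def recursive_chunks(
    text: str,
    chunk_size: int,
    separators: Sequence[str] = ("\n\n", "\n", " "),
) -> List[str]:
    """Split on the coarsest separator first.

    Finer separators are tried only for pieces that are still too long.
    """
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separators left: fall back to hard cuts without overlap.
        return fixed_chunks(text, chunk_size, 0)
    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep packing pieces into the current chunk
            continue
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            chunks.extend(recursive_chunks(piece, chunk_size, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


def summarize(chunks: List[str]) -> Dict[str, int]:
    """Report chunk count and size range for a side-by-side comparison."""
    sizes = [len(c) for c in chunks]
    return {"count": len(sizes), "min": min(sizes, default=0), "max": max(sizes, default=0)}


if __name__ == "__main__":
    doc = "Para one is short.\n\nPara two is a bit longer than the first one.\n\nEnd."
    print("fixed    ", summarize(fixed_chunks(doc, chunk_size=30, overlap=5)))
    print("recursive", summarize(recursive_chunks(doc, chunk_size=30)))
```

The fixed strategy produces uniform windows that can cut through the middle of a sentence, while the recursive strategy keeps paragraph and word boundaries intact at the cost of uneven chunk sizes. That trade-off is the kind of difference a side-by-side configuration view is meant to surface.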
Key Insight: Building glass-box visibility into RAG pipelines revealed that 40% of retrieval failures stemmed from suboptimal chunking strategies. Real-time comparison of configurations reduced pipeline debugging time by 85%, and caching combined with configuration-based optimization cut API costs.
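The description does not say what RAGScope caches. One plausible target is embeddings, since re-running the same document under several retrieval configurations would otherwise re-embed identical chunks. The sketch below assumes that case: `EmbeddingCache` and `fake_embed` are hypothetical names, and `fake_embed` stands in for a paid embedding API call.

```python
import hashlib
from typing import Callable, Dict, List


class EmbeddingCache:
    """In-memory cache keyed by (model name, chunk text)."""

    def __init__(self, embed_fn: Callable[[str], List[float]], model_name: str) -> None:
        self._embed_fn = embed_fn
        self._model_name = model_name
        self._store: Dict[str, List[float]] = {}
        self.hits = 0
        self.misses = 0

    def _key(self, text: str) -> str:
        # Include the model name so vectors from different models never collide.
        raw = f"{self._model_name}\x00{text}".encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def embed(self, text: str) -> List[float]:
        key = self._key(text)
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = self._embed_fn(text)  # the only place the API is called
        return self._store[key]


def fake_embed(text: str) -> List[float]:
    """Stand-in for a real embedding call; returns a toy one-dimensional vector."""
    return [float(len(text))]


if __name__ == "__main__":
    cache = EmbeddingCache(fake_embed, model_name="demo-model")
    chunks = ["alpha", "beta", "gamma"]
    for _ in range(2):  # second pass simulates re-running with another config
        for chunk in chunks:
            cache.embed(chunk)
    print(f"API calls: {cache.misses}, served from cache: {cache.hits}")
```

Note that this only helps across configurations that produce identical chunk text. Changing chunk size or overlap yields new chunks, which miss the cache by design.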