Back to Projects

Institutional Chatbot - Retrieval‑Augmented Generation

RAG-based chatbot for student guidance using FAISS, MiniLM vectors and Mixtral — production-ready prototype.

May 28, 2025
E. Hindi • R. Bongibault • A. Meetoo • F. Rousseau • H. Hamdan • J-P. Loubejac
Compilers & Architecture

Overview

Built IAN, an institutional assistant for INSA Toulouse that answers student questions by combining semantic search (FAISS + MiniLM) with a large generator (Mixtral‑8x7B) using a RAG pipeline. The system prioritizes up‑to‑date factual answers and clear failure messages.

RAG pipeline diagram

Data Collection

We implemented a Java scraper targeting INSA public pages and Moodle:

  • 6,215 pages harvested in ~20 minutes
  • ≈4.45M words extracted, 55% of documents with identifiable update date
  • Limitations: scanned PDFs, images and complex tables required manual handling
RAG pipeline diagram

Experimental Roadmap

We compared three approaches:

  • From‑scratch small language models (PyTorch) — computationally expensive, poor results without large infrastructure
  • Fine‑tuning GPT‑2 on local data — modest gains; human evaluation: ~3.8% fully satisfactory responses
  • RAG (final solution) — retrieval + generation gave best practical results and supports live updates

RAG Pipeline

Pipeline summary:

  • Chunk documents into 1000‑char segments
  • Embed with MiniLM, index vectors in FAISS
  • At query time, retrieve top segments and pass them to Mixtral‑8x7B alongside the prompt

Benefits: up‑to‑date answers without heavy re‑training.

RAG pipeline diagram

Evaluation & Tools

Prototype interface (Streamlit) named IAN. Evaluation on a small question set: 7/8 relevant answers. Additional tooling:

  • Cosine similarity diagnostics for chunk usefulness
  • Cross‑encoder re‑rankers for precision (MiniLM)
  • Binary classifier to detect non‑answerable queries

Results & Impact

RAG significantly outperformed local training and fine‑tuning for the use case. The system provides reliable, updateable guidance for students and administrative queries.

Next Steps

  • Index sub‑specializations in FAISS for faster, more accurate retrieval
  • Integrate structured sources (schedules, databases) and multimodal inputs
  • Deploy monitoring for drift and automated re-indexing