Tutorial

How to Build a Full Retrieval-Augmented Generation (RAG) System with FastAPI and Eden AI

Unlock the power of Retrieval-Augmented Generation (RAG)! This article shows you how combining language models with retrieval systems boosts response accuracy and relevance. Dive into a hands-on tutorial where you'll build a RAG backend using FastAPI, Eden AI, and top AI tools like OpenAI and Qdrant. Get ready to go from beginner to pro!


Why RAG Is the Future of AI-Powered Search

Language models are smart, but they don’t always know everything—especially when real-time or domain-specific knowledge is needed.

Enter Retrieval-Augmented Generation (RAG): a powerful paradigm that combines large language models (LLMs) with information retrieval systems to provide smarter, more relevant, and grounded responses.

Building a Full Retrieval-Augmented Generation

In this tutorial, based on our in-depth YouTube tutorial, we’ll build a full-featured RAG backend using FastAPI, Eden AI, and some of the most powerful AI tools on the market (like OpenAI and Qdrant).

You can watch the detailed tutorial for the full breakdown and step-by-step instructions: we guide you through every stage, from setting up the environment to deploying a robust backend, so you understand each concept and how it’s applied.

Whether you're a developer looking to add smart Q&A to your app, or an ML enthusiast curious about how RAG works under the hood, you're in the right place.

What We’ll Build

We're going to build a complete backend RAG system that allows you to:

  • Create RAG projects using Eden AI’s API.
  • Upload data (files, URLs, or plain text).
  • Generate embeddings and store them in a vector database.
  • Ask contextual questions using LLMs.
  • Create and manage conversations for chat-based interfaces.

Tech Stack

  • FastAPI – For building our backend API.
  • Eden AI – Abstracts multiple AI providers and gives access to OCR, STT, embeddings, LLMs, and vector DBs.
  • Qdrant – Vector store used to hold your document embeddings.
  • OpenAI – Used as the provider for embeddings and LLMs.
  • CORS Middleware – To allow frontend apps to connect to our API.

Step-by-Step Guide:

Part 1: Setting Up the Project

Let’s start with the basic setup:

from fastapi import FastAPI, HTTPException, UploadFile, File, Form, Query
from fastapi.middleware.cors import CORSMiddleware
import requests
import os

We’re importing the essential FastAPI classes and tools for building APIs, along with requests for making HTTP requests to Eden AI, and os for accessing environment variables.

We also bring in Pydantic models and some typing utilities:

from typing import List, Optional, Dict, Any
from pydantic import BaseModel, HttpUrl

These help with defining clear and structured request/response models.

Load Eden AI Key

We load the API key from environment variables:

EDEN_AI_API_KEY = os.getenv("EDEN_AI_API_KEY")
if not EDEN_AI_API_KEY:
    raise ValueError("Environment variable EDEN_AI_API_KEY is not set")

This keeps the key out of your source code. Note that if you pass a fallback string as the second argument to os.getenv, the check below it can never fail, so avoid hardcoded defaults; manage the key via a .env file or your deployment’s secret store instead.

Define Eden API Base URL and Helper Function

Now we define the base URL for the Eden AI RAG endpoint and a helper to attach authorization headers to all outgoing requests:

BASE_URL = "https://api.edenai.run/v2/agent/rag"

def get_headers():
    return {
        "Authorization": f"Bearer {EDEN_AI_API_KEY}",
        "Content-Type": "application/json"
    }

Part 2: Initializing FastAPI

app = FastAPI(
    title="Eden AI RAG Interface",
    description="API for interacting with Eden AI's RAG capabilities",
    version="1.0.0",
)

This initializes our FastAPI app with a title and description. Useful for auto-generated docs at /docs.

Add CORS middleware to allow frontend access:

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

This is crucial for development and integration with frontend apps like React or Vue.

Part 3: Models for RAG Project Creation

Here we define Pydantic models that structure our incoming requests.

class CreateProjectRequest(BaseModel):
    ocr_provider: str = "amazon"
    speech_to_text_provider: str = "openai"
    llm_provider: Optional[str] = None
    llm_model: Optional[str] = None
    project_name: str
    collection_name: str
    db_provider: str = "qdrant"
    embeddings_provider: str = "openai"
    chunk_size: Optional[int] = None
    chunk_separators: Optional[List[str]] = None

These parameters allow for deep customization of how data is chunked, embedded, and stored.
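To make chunk_size and chunk_separators concrete, here is a toy chunker for intuition only (this is not Eden AI's actual implementation): split on each separator in turn, then hard-wrap any long pieces at chunk_size characters.

```python
from typing import List, Optional

def naive_chunk(text: str, chunk_size: int = 25,
                separators: Optional[List[str]] = None) -> List[str]:
    """Toy chunker: split on separators, then hard-wrap pieces at chunk_size chars."""
    seps = separators or ["\n\n", "\n"]
    pieces = [text]
    for sep in seps:
        pieces = [part for piece in pieces for part in piece.split(sep) if part]
    chunks: List[str] = []
    for piece in pieces:
        # Hard-wrap any piece longer than chunk_size.
        for i in range(0, len(piece), chunk_size):
            chunks.append(piece[i:i + chunk_size])
    return chunks

print(naive_chunk("Intro paragraph.\n\nA much longer second paragraph here."))
```

Real chunkers split on separator boundaries first precisely so that chunks stay semantically coherent; the hard wrap is only a fallback for oversized pieces.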

Part 4: Creating and Managing RAG Projects

Create a Project

@app.post("/projects")
async def create_project(project_request: CreateProjectRequest):
    payload = project_request.dict(exclude_none=True)
    response = requests.post(
        f"{BASE_URL}/",
        headers=get_headers(),
        json=payload,
    )
    response.raise_for_status()
    return response.json()

This endpoint creates a new RAG project in Eden AI. It serializes the validated request body to JSON, omitting any optional fields left unset, and forwards it to the Eden AI API.
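The exclude_none behavior is worth seeing in isolation; this plain-Python equivalent shows which keys survive serialization (the dict values are example inputs, not required settings):

```python
from typing import Any, Dict

def exclude_none(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Mimic Pydantic's exclude_none: drop keys whose value is None."""
    return {k: v for k, v in payload.items() if v is not None}

raw = {
    "project_name": "legal-assistant",
    "collection_name": "contracts",
    "db_provider": "qdrant",
    "embeddings_provider": "openai",
    "llm_provider": None,   # optional fields left unset are omitted entirely
    "chunk_size": None,
}
print(exclude_none(raw))
```

One version note: `.dict(exclude_none=True)` is the Pydantic v1 spelling; on Pydantic v2 the equivalent call is `model_dump(exclude_none=True)`.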

List and Manage Projects

Endpoints to list, retrieve, and delete projects:

@app.get("/projects")
async def list_projects(): ...

@app.get("/projects/{project_id}")
async def get_project(project_id: str): ...

@app.delete("/projects/{project_id}")
async def delete_project(project_id: str): ...

You can use these to monitor or clean up your project space.
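All three endpoints target the same Eden AI resource path, so a small URL helper keeps them consistent. This helper is illustrative, not part of the tutorial code:

```python
from typing import Optional

BASE_URL = "https://api.edenai.run/v2/agent/rag"

def project_url(project_id: Optional[str] = None) -> str:
    """Collection URL by default; a single project's URL when an id is given."""
    return f"{BASE_URL}/{project_id}" if project_id else f"{BASE_URL}/"

print(project_url())
print(project_url("abc123"))

# Each endpoint then becomes a thin wrapper (network call, needs a valid key):
# requests.get(project_url(), headers=get_headers())        # list
# requests.get(project_url(pid), headers=get_headers())     # retrieve
# requests.delete(project_url(pid), headers=get_headers())  # delete
```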

Part 5: Adding Data to the Project

Uploading Files

@app.post("/projects/{project_id}/files")
async def upload_file(...):
    ...
    # You can either upload an actual file or just provide a file_url

This triggers OCR (if needed), followed by chunking and embedding generation via Eden AI.

Adding Text or URLs

@app.post("/projects/{project_id}/texts")
async def add_texts(...): ...
@app.post("/projects/{project_id}/urls")
async def add_urls(...): ...

This is especially useful for adding data programmatically, like injecting FAQ content or scraping URLs.
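As a sketch of the client side, a request body for the texts endpoint might be assembled like this. The field names ("texts", "metadata") are illustrative placeholders, not confirmed Eden AI schema; check the API reference for the exact shape:

```python
from typing import Any, Dict, List, Optional

def build_texts_payload(texts: List[str],
                        metadata: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Assemble a JSON body for the add-texts endpoint (field names are placeholders)."""
    payload: Dict[str, Any] = {"texts": texts}
    if metadata:
        payload["metadata"] = metadata
    return payload

body = build_texts_payload(
    ["Q: What is RAG?", "A: Retrieval-Augmented Generation."],
    metadata={"source": "faq"},
)
print(sorted(body))
```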

Part 6: Creating Bot Profiles

This is where you can define the "personality" of the chatbot or assistant:

class CreateBotProfileRequest(BaseModel):
    model: str
    name: str
    text: str
    params: Optional[Dict[str, Any]] = None

@app.post("/projects/{project_id}/bot-profile")
async def create_bot_profile(...): ...

This is your system prompt — telling the LLM how to behave. You can also pass temperature or max_tokens via params.
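Mirroring the CreateBotProfileRequest model above, building that request body might look like the sketch below. The params keys shown (temperature, max_tokens) are common LLM settings used for illustration; verify the accepted keys against Eden AI's docs:

```python
from typing import Any, Dict, Optional

def build_bot_profile(model: str, name: str, text: str,
                      params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Mirror CreateBotProfileRequest: include params only when provided."""
    body: Dict[str, Any] = {"model": model, "name": name, "text": text}
    if params is not None:
        body["params"] = params
    return body

profile = build_bot_profile(
    model="gpt-4o",
    name="legal-expert",
    text="You are a careful legal assistant. Cite the clause you rely on.",
    params={"temperature": 0.2, "max_tokens": 512},
)
print(profile["params"]["temperature"])
```

A low temperature like 0.2 is a sensible default for grounded Q&A, where you want the model to stick to the retrieved context rather than improvise.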

Part 7: Asking Questions (LLM + RAG)

class AskLLMRequest(BaseModel):
    query: str
    ...
    k: int = 3  # Number of relevant chunks to retrieve

@app.post("/projects/{project_id}/ask-llm")
async def ask_llm(...): ...

This endpoint is the core of the RAG system — it retrieves relevant chunks from the vector DB and feeds them into the LLM as context before answering your query.
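Conceptually, the k parameter controls a nearest-neighbour search like the toy version below (pure Python, for intuition only; Qdrant does this at scale with approximate indexes over real embedding vectors):

```python
import math
from typing import List, Tuple

def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec: List[float],
          chunks: List[Tuple[str, List[float]]], k: int = 3) -> List[str]:
    """Return the k chunk texts whose embeddings are most similar to the query."""
    scored = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in scored[:k]]

# Tiny 2-dimensional "embeddings" for demonstration.
chunks = [
    ("Clause 4: IP ownership...", [0.9, 0.1]),
    ("Clause 7: termination...", [0.1, 0.9]),
    ("Clause 5: IP licensing...", [0.8, 0.3]),
]
print(top_k([1.0, 0.0], chunks, k=2))
```

The retrieved texts are then prepended to the LLM prompt as context, which is what grounds the answer in your documents.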

Part 8: Managing Conversations

To support chat-based UIs, we manage conversations too:

@app.post("/projects/{project_id}/conversations")
async def create_conversation(...): ...

@app.get("/projects/{project_id}/conversations")
async def list_conversations(...): ...

You can also retrieve, delete, or continue an ongoing thread of dialogue using history.
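Chat UIs typically resend recent history with each turn; a minimal client-side history manager (illustrative, not Eden AI's conversation model) could look like this:

```python
from typing import Dict, List

class Conversation:
    """Keep an ordered list of turns, trimming the oldest beyond a window."""

    def __init__(self, max_turns: int = 20):
        self.history: List[Dict[str, str]] = []
        self.max_turns = max_turns

    def add(self, role: str, content: str) -> None:
        self.history.append({"role": role, "content": content})
        # Drop oldest turns so the request payload stays bounded.
        self.history = self.history[-self.max_turns:]

conv = Conversation(max_turns=2)
conv.add("user", "What clauses cover IP?")
conv.add("assistant", "Clauses 4 and 5.")
conv.add("user", "Summarize clause 4.")
print(len(conv.history))  # 2
```

Bounding the window matters because every prior turn you resend consumes context tokens alongside the retrieved chunks.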

Part 9: Querying and Deleting Data

Need to clean up?

@app.delete("/projects/{project_id}/chunks")
async def delete_chunks(...): ...

@app.delete("/projects/{project_id}/all-chunks")
async def delete_all_chunks(...): ...

You can also query data directly:

@app.post("/projects/{project_id}/query")
async def query_data(...): ...

Perfect for admin dashboards or sanity checks.

Real-World Example Flow

Let’s say you want to build a legal document assistant:

  1. Create a RAG project with OpenAI and Qdrant.
  2. Upload your legal docs (PDFs, text, URLs).
  3. Create a bot profile instructing the AI to behave like a legal expert.
  4. Ask questions like “What clauses apply to intellectual property in this contract?”
  5. Retrieve and chat with full context, citations, and follow-up support.

Conclusion

Retrieval-Augmented Generation is the next evolution in how we interact with AI. By combining structured knowledge retrieval with powerful LLMs, we’re giving our apps superpowers—from legal document search, to customer support chatbots, to custom research assistants.

This blog post and our YouTube tutorial give you everything you need to get started. Whether you’re building tools for work, school, or fun, a solid RAG backend like this is your foundation.

If you're hungry for more, consider expanding this into:

  • A full-stack app with a React/Vue frontend.
  • Support for audio transcription and OCR pipelines.
  • Custom metadata tagging and filtering.

Want the full walkthrough with voice, visuals, and step-by-step debugging? Watch the full tutorial on our YouTube channel.

Start Your AI Journey Today

  • Access 100+ AI APIs in a single platform.
  • Compare and deploy AI models effortlessly.
  • Pay-as-you-go with no upfront fees.
Start building FREE
