Building RAG with TrueFoundry and MongoDB Atlas

Published: July 4, 2026

Introduction

Retrieval-Augmented Generation (RAG) combines the strengths of retrieval systems and generative models to produce highly relevant, context-aware outputs. It queries external knowledge sources—like databases or search indexes—to retrieve relevant information, which is then refined by a generative model.
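
The retrieve-then-generate flow described above can be illustrated with a minimal, self-contained sketch. It is not how Cognita implements RAG: the bag-of-words "embedding" is a toy stand-in for a real embedding model, the function names are hypothetical, and the final call to a generative model is left out, so the sketch stops at the prompt that would be sent to it.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine retrieved context and the question into one prompt."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return f"Answer using only the context below.\n\nContext:\n{joined}\n\nQuestion: {query}"


documents = [
    "MongoDB Atlas supports vector search.",
    "Cognita is an open-source RAG framework.",
    "Bananas are rich in potassium.",
]
top = retrieve("What is Cognita?", documents, k=1)
prompt = build_prompt("What is Cognita?", top)
print(prompt)
```

In a real system, `embed` would call an embedding model and `prompt` would be passed to an LLM. The retrieval and generation stages stay separate, which is what makes each one independently replaceable.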

Why RAG?
  1. Highly relevant and context-aware outputs.
  2. By incorporating dynamic and up-to-date knowledge, RAG systems overcome the limitations of static pretraining in generative models, making them highly effective for applications like question answering, knowledge-intensive tasks, and personalized content generation.
  3. The modular nature of RAG allows for optimization at both retrieval and generation stages, enabling greater flexibility and scalability in system design.

Cognita by TrueFoundry: Simplifying RAG for Scalable Applications

Despite its potential, implementing RAG can be complex, involving model selection, data organization, and best practices. Existing tools simplify prototyping but lack an open-source template for scalable deployment—enter Cognita.

Cognita is an open-source RAG framework that simplifies building and deploying scalable applications. By breaking RAG into modular steps, it ensures easy maintenance, interoperability with other AI tools, customization, and compliance. Cognita balances adaptability and user-friendliness while staying scalable for future advancements.

Advantages of Cognita

  1. A central, reusable repository of parsers, loaders, embedders, and retrievers.
  2. A UI that lets non-technical users upload documents and run Q&A using modules built by the development team.
  3. Fully API-driven, which allows integration with other systems.
  4. Support for Large Language Models (LLMs), for easy interaction with generative models such as OpenAI's GPT, Hugging Face models, or other LLM APIs.
  5. Prebuilt integrations to connect to Pinecone, Weaviate, ChromaDB, or MongoDB Atlas Vector Search.

Why MongoDB for RAG?

Using MongoDB as a vector database for your Retrieval-Augmented Generation (RAG) application can be beneficial depending on your requirements. Here's why MongoDB could be a good choice:

1. Native Vector Search Support

MongoDB supports vector indexing through its Atlas Vector Search. This enables efficient similarity searches over high-dimensional data, which is central to RAG workflows. Key benefits:

  • Integration with MongoDB's Query Language: Combines vector search with traditional queries, allowing more flexible and powerful query composition.
  • High-performance search: Uses approximate nearest neighbor (ANN) algorithms like HNSW (Hierarchical Navigable Small World) for scalable and fast vector retrieval.
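
As a rough sketch of combining vector search with a traditional query condition, the function below builds an aggregation pipeline that uses the `$vectorSearch` stage with a metadata filter. The field names (`embedding`, `text`, `metadata.source`), the index name, and the function itself are assumptions made for illustration. Atlas also requires any field used in `filter` to be declared as a `filter` field in the vector search index definition.

```python
from typing import Any


def build_vector_search_pipeline(
    query_vector: list[float],
    source: str,
    index_name: str = "vector_search_index",  # assumed index name
    limit: int = 5,
) -> list[dict[str, Any]]:
    """Build a pipeline: ANN search restricted to one metadata source."""
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": "embedding",           # field holding the vectors (assumed)
                "queryVector": query_vector,
                "numCandidates": limit * 20,   # candidates considered before ranking
                "limit": limit,
                # Pre-filter on a metadata field (assumed field name).
                "filter": {"metadata.source": {"$eq": source}},
            }
        },
        {
            "$project": {
                "_id": 0,
                "text": 1,
                "metadata": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


pipeline = build_vector_search_pipeline([0.1, 0.2, 0.3], source="document_1", limit=3)
print(pipeline)
```

With a PyMongo collection object, the pipeline would be run as `collection.aggregate(pipeline)`. A larger `numCandidates` generally improves recall at the cost of latency.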

2. Unified Data Management

RAG applications often need to manage both unstructured data (e.g., text and embeddings) and structured data (e.g., metadata, user preferences).

MongoDB, being a document database, lets you store embeddings alongside their associated metadata in a single record. For example:

{
  "embedding": [0.1, 0.2, 0.3, ...],
  "text": "This is a sample document.",
  "metadata": {
    "source": "document_1",
    "timestamp": "2024-12-06T10:00:00Z"
  }
}

This avoids the complexity of managing embeddings in a separate system.
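
A minimal sketch of producing and storing such a record from Python is shown below. The helper names are hypothetical, and `store_chunk` assumes it receives a PyMongo collection object; it is defined for illustration and not called here.

```python
from datetime import datetime, timezone
from typing import Any


def make_chunk_document(text: str, embedding: list[float], source: str) -> dict[str, Any]:
    """Bundle a text chunk, its embedding, and its metadata into one record."""
    return {
        "embedding": [float(x) for x in embedding],
        "text": text,
        "metadata": {
            "source": source,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        },
    }


def store_chunk(collection: Any, document: dict[str, Any]) -> Any:
    """Insert the record; `collection` is assumed to be a PyMongo collection."""
    return collection.insert_one(document).inserted_id


doc = make_chunk_document("This is a sample document.", [0.1, 0.2, 0.3], "document_1")
print(doc)
```

Because the embedding and metadata live in the same document, a single write keeps them consistent, and a single read returns everything needed to build a prompt.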

3. Flexibility and Scalability 

Schema-less Design: MongoDB's schema flexibility makes it easy to iterate on your data model as your RAG application evolves.

Horizontal Scaling: MongoDB's sharding capability allows handling large datasets and scaling as your application grows.

Cloud-native Features: MongoDB Atlas provides fully managed services, including scaling, backups, and monitoring.

Implementing RAG with Cognita + MongoDB

Step 1: Setting up MongoDB

To see how to get a free MongoDB Atlas cluster, refer to MongoDB's video tutorial.

  1. Set up a MongoDB Atlas account by visiting the Register page if you don’t have an account yet.
  2. To set up a cluster, go to the Overview tab, click “Create”, select the cluster that fits your requirements, and click “Create Deployment”.
  3. To add the required authentication, create a database user in the “Connect to Cluster” window.
  4. Connect with the MongoDB driver:
    • Choose the Python version.
    • Copy the connection string, which contains your username and password. It is used in the next step.
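
One detail worth sketching: if the username or password contains characters such as `@`, `:`, or `/`, they need to be percent-encoded before being placed in the connection string. The helper below is a hypothetical illustration; the host and query parameters mirror the example format used later in this post, and `ping` assumes the `pymongo` package is installed and is not called here.

```python
from urllib.parse import quote_plus


def build_mongo_uri(username: str, password: str, host: str) -> str:
    """Build an Atlas-style connection string with escaped credentials."""
    user = quote_plus(username)
    pwd = quote_plus(password)
    return f"mongodb+srv://{user}:{pwd}@{host}/?retryWrites=true&w=majority"


def ping(uri: str) -> None:
    """Check connectivity; raises if the cluster cannot be reached."""
    from pymongo import MongoClient  # imported lazily; requires pymongo

    client = MongoClient(uri, serverSelectionTimeoutMS=5000)
    client.admin.command("ping")


uri = build_mongo_uri("app_user", "p@ss:word/1", "clustername.mongodb.net")
print(uri)
```

Calling `ping(uri)` with real credentials is a quick way to confirm that the database user and network access are configured before moving on.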

Step 2: Setting up Cognita to use MongoDB

  1. Clone the Cognita GitHub repository: https://github.com/truefoundry/cognita/tree/main
  2. Before starting the services, configure the model providers needed for embedding and for generating answers. To start, copy models_config.sample.yaml to models_config.yaml.

cp models_config.sample.yaml models_config.yaml

  3. Create a MongoDB collection in the newly created database, say “cognita”. This is the collection where all the chunks will be stored and used in the retrieval process.
  4. The compose file uses the compose.env file for environment variables. You can modify it as needed.
  5. Edit the “VECTOR_DB_CONFIG” key in the environment file. This config is used in the bootstrap process to ensure that MongoDB is used as the vector store at run time. The MongoDB connection string goes here. For example:

VECTOR_DB_CONFIG='{"provider":"mongo","url":"mongodb+srv://username:password@clustername.mongodb.net/?retryWrites=true&w=majority&appName=Cluster0", "config": {"database_name": "cognita"}}'
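
Since this value is JSON embedded in an environment variable, quoting mistakes are easy to make. The sketch below generates the value with `json.dumps` and performs a basic sanity check. The checks reflect only the keys visible in the example above; they are an assumption for illustration, not Cognita's own validation logic, and both function names are hypothetical.

```python
import json
from typing import Any


def make_vector_db_config(url: str, database_name: str) -> str:
    """Serialize the vector DB settings as a JSON string for compose.env."""
    config = {
        "provider": "mongo",
        "url": url,
        "config": {"database_name": database_name},
    }
    return json.dumps(config)


def check_vector_db_config(raw: str) -> dict[str, Any]:
    """Parse the JSON and confirm the keys from the example are present."""
    config = json.loads(raw)
    missing = [key for key in ("provider", "url", "config") if key not in config]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if "database_name" not in config["config"]:
        raise ValueError("config.database_name is required")
    return config


raw = make_vector_db_config(
    "mongodb+srv://username:password@clustername.mongodb.net/?retryWrites=true&w=majority",
    "cognita",
)
parsed = check_vector_db_config(raw)
# Line to paste into compose.env (single quotes keep the JSON intact).
print(f"VECTOR_DB_CONFIG='{raw}'")
```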

  6. By default, the config enables local providers, which need an Infinity server and an Ollama server to run embedding models and LLMs locally. However, if you have an OpenAI API key, you can uncomment the openai provider in models_config.yaml and update OPENAI_API_KEY in compose.env. Then run the following command to start the services:

docker-compose --env-file compose.env --profile ollama --profile infinity up

  7. The compose file starts the following services:
    • cognita-db - Postgres instance used to store metadata for collections and data sources.
    • cognita-backend - Used to start the FastAPI backend server for Cognita.
    • cognita-frontend - Used to start the frontend for Cognita.
  8. Once the services are up, you can access the frontend at http://localhost:5001.
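
A small, optional check that the frontend port is accepting connections can save some guessing while the containers start. The sketch below uses only the standard library. The function name is hypothetical, and it tests TCP reachability only, not whether the application has finished initializing.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 5001 is the frontend port mentioned above.
    print("frontend reachable:", port_is_open("localhost", 5001))
```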

Step 3: Set up a data collection in Cognita

Once Cognita is set up, the following steps show how to use the UI to query documents:

1. Create Data Source

  • Click the Data Sources tab.
  • Click + New Datasource.
  • The data source type can be files from a local directory, a web URL, a GitHub URL, or a TrueFoundry artifact FQN. For example, if Localdir is selected, upload files from your machine and click Submit.
  • The created data sources are listed in the Data Sources tab.

2. Create Collection

  • Click on Collections tab
  • Click + New Collection 
  • Enter Collection Name
  • Select Embedding Model
  • Add earlier created data source and the necessary configuration
  • Click Process to create the collection and index the data

3. Creating a new collection triggers the following steps behind the scenes:

  • A new collection is created in the configured MongoDB database. For instance, if the database is named `cognita`, a collection with the given input name is created in the `cognita` database in MongoDB.
  • Upon creating a collection, a vector search index is created using the following code snippet:
from pymongo.operations import SearchIndexModel

# Excerpt from a method: `self`, `embeddings`, and `collection_name`
# come from the surrounding code.
search_index_model = SearchIndexModel(
    definition={
        "fields": [
            {
                "type": "vector",
                "path": "embedding",
                "numDimensions": self.get_embedding_dimensions(embeddings),
                "similarity": "cosine",
            }
        ]
    },
    name="vector_search_index",
    type="vectorSearch",
)

# Create the search index
result = self.db[collection_name].create_search_index(model=search_index_model)

This ensures that the newly created collection is ready for vector search queries. Note that creating an index on MongoDB may take up to a minute.
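
Because index creation is asynchronous, code that queries right away may want to wait until the index is ready. The sketch below polls PyMongo's `list_search_indexes` and checks the `queryable` flag that Atlas reports for each search index. The function name and default timings are assumptions, and this is not part of Cognita's code.

```python
import time
from typing import Any


def wait_until_queryable(
    collection: Any,                          # a PyMongo collection object
    index_name: str = "vector_search_index",
    timeout: float = 120.0,
    interval: float = 5.0,
) -> bool:
    """Poll until the named search index reports queryable, or time out."""
    deadline = time.monotonic() + timeout
    while True:
        indexes = list(collection.list_search_indexes(index_name))
        if indexes and indexes[0].get("queryable"):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A call such as `wait_until_queryable(db["my_collection"])` returns `True` once the index can serve queries and `False` if the timeout elapses first; `my_collection` is a placeholder name.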

4. Data ingestion begins as soon as you create the collection. You can view its status by selecting your collection in the Collections tab. This step parses your files, chunks them, and adds the chunks to MongoDB. You can also add more data sources later and index them in the collection. Move to the next step once the Status is “Completed”.
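
To give a feel for the chunking part of ingestion, here is a minimal fixed-size chunker with overlap. It illustrates the general idea only: Cognita's parsers and chunking strategy may differ, and the function name and default sizes are assumptions.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once this chunk reaches the end, to avoid a redundant tail.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
```

Overlap keeps sentences that straddle a chunk boundary retrievable from either side, at the cost of storing some text twice.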

Step 4: Find the right config for your application

  1. In the DocsQA tab, use the Cognita playground to try different configurations and see what works best for your application. You can experiment with:
    • Retrieval techniques
    • LLMs, temperatures, etc.
    • LLM prompts
    • Embedding models
  2. Once you find the configuration that best suits your application, click “Create Application” for it, and an API endpoint for your application will be deployed. You can go to the Applications tab to see all your deployed applications.

Conclusion

This tutorial showed how to build a production-ready RAG application using Cognita and MongoDB in just 10 minutes. The synergy between Cognita's adaptability and ease of use and MongoDB's flexible document model with vector search provides a strong foundation for building advanced AI applications.
