RAG Integration

DevTeam Orchestrator integrates with Weaviate vector search to provide Retrieval-Augmented Generation (RAG) capabilities. This allows agent tasks to query a knowledge base before generating responses, improving factual accuracy and domain specificity.

How RAG Works in DevTeam

  Task received
       |
       v
  [RAG Step: Search Weaviate]
       |
  Retrieve relevant documents
       |
       v
  [Agent Step: Generate with context]
       |
  Model receives: prompt + retrieved docs
       |
       v
  Output with grounded references

  1. A task or plan step includes a RAG configuration.
  2. Before the model is called, the orchestrator queries Weaviate for semantically similar documents.
  3. Retrieved documents are injected into the prompt as context.
  4. The model generates a response grounded in the retrieved information.

Configuration

Weaviate Connection

const client = new DevTeamClient({
  apiUrl: 'https://devteam.marsala.dev',
  apiKey: process.env.DEVTEAM_API_KEY,
  rag: {
    provider: 'weaviate',
    endpoint: 'http://localhost:8080',
    defaultCollection: 'DocChunk',
    defaultLimit: 10,
    defaultCertainty: 0.75,
  },
});

Environment Variables

.env
WEAVIATE_URL=http://localhost:8080
WEAVIATE_DEFAULT_COLLECTION=DocChunk
WEAVIATE_CERTAINTY_THRESHOLD=0.75

Using RAG in Tasks

Simple RAG Task

const result = await client.createTask({
  prompt: 'What are the tax implications of forming an LLC in Delaware?',
  model: 'sonnet',
  rag: {
    collection: 'USIRSTaxRule',
    query: 'LLC formation Delaware tax implications',
    limit: 5,
    certainty: 0.80,
  },
});

The orchestrator will:

  1. Search USIRSTaxRule for documents matching "LLC formation Delaware tax implications"
  2. Retrieve the top 5 results with certainty >= 0.80
  3. Prepend them to the prompt as Context: [retrieved documents]
  4. Send the augmented prompt to the model
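
Step 3 above, injecting retrieved documents as context, can be sketched as a small pure function. This is an illustrative stand-in (the `augmentPrompt` name and the exact formatting are assumptions, not the orchestrator's internal code):

```typescript
// Illustrative sketch of step 3: prepend retrieved documents to the prompt.
// The orchestrator's actual formatting may differ.
interface RetrievedDoc {
  content: string;
  certainty: number;
}

function augmentPrompt(prompt: string, docs: RetrievedDoc[]): string {
  if (docs.length === 0) return prompt;
  const context = docs
    .map((d, i) => `[${i + 1}] (certainty ${d.certainty.toFixed(2)}) ${d.content}`)
    .join('\n');
  return `Context:\n${context}\n\n${prompt}`;
}
```

Numbering each document makes it easy for the model to cite sources by index in its answer.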

RAG in Plan Steps

steps:
  - id: research
    prompt: |
      Based on the following context documents, answer:
      {{input.question}}
 
      Context will be provided automatically via RAG.
    model: sonnet
    rag:
      collection: DocChunk
      query: "{{input.question}}"
      limit: 10
      certainty: 0.75
      filters:
        - path: ["project"]
          operator: Equal
          valueText: "legal-templates"
 
  - id: synthesize
    prompt: |
      Using the research findings, provide a comprehensive answer:
      Research: {{research.output}}
      Question: {{input.question}}
    model: opus
    dependsOn: [research]
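
Placeholders such as {{input.question}} and {{research.output}} are resolved before each step runs. A minimal sketch of that substitution, assuming simple dotted-path lookup (`renderTemplate` is a hypothetical helper, not the orchestrator's actual template engine):

```typescript
// Minimal {{path.to.value}} substitution (illustrative only).
// Unresolvable placeholders are left intact rather than replaced with "undefined".
function renderTemplate(template: string, vars: Record<string, any>): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match, path: string) => {
    let value: any = vars;
    for (const key of path.split('.')) {
      value = value == null ? undefined : value[key];
    }
    return value === undefined ? match : String(value);
  });
}
```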

Advanced RAG Patterns

Multi-Collection Search

Query multiple collections and merge results:

const result = await client.createTask({
  prompt: 'Analyze the legal and tax implications of this corporate structure.',
  model: 'opus',
  rag: {
    searches: [
      {
        collection: 'USIRSTaxRule',
        query: 'corporate structure tax implications',
        limit: 5,
      },
      {
        collection: 'USCourtOpinion',
        query: 'corporate veil piercing liability',
        limit: 3,
      },
      {
        collection: 'SGRegulation',
        query: 'Singapore company compliance requirements',
        limit: 3,
      },
    ],
    mergeStrategy: 'interleave', // 'interleave' | 'sequential' | 'deduplicate'
  },
});
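
How 'interleave' orders the merged results can be illustrated with a small helper, a sketch of one plausible round-robin implementation (`interleave` here is a hypothetical function, not the orchestrator's actual merge code):

```typescript
// Round-robin interleave of per-collection result lists (illustrative).
// Takes one result from each list in turn until all lists are exhausted.
function interleave<T>(lists: T[][]): T[] {
  const merged: T[] = [];
  const longest = Math.max(0, ...lists.map((l) => l.length));
  for (let i = 0; i < longest; i++) {
    for (const list of lists) {
      if (i < list.length) merged.push(list[i]);
    }
  }
  return merged;
}
```

Interleaving keeps the top hit from every collection near the front of the context, rather than letting one collection's results dominate as 'sequential' would.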

Hybrid Search

Combine vector similarity with keyword matching:

rag: {
  collection: 'DocChunk',
  query: 'contract indemnification clause',
  limit: 10,
  searchType: 'hybrid',
  hybridConfig: {
    alpha: 0.75,  // 0 = pure keyword, 1 = pure vector
  },
}
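
Conceptually, alpha weights the vector signal against the keyword (BM25) signal. A simplified linear blend illustrates the weighting; note this is a mental model only, as Weaviate's actual hybrid ranking uses normalized rank/score fusion rather than this direct formula:

```typescript
// Simplified illustration of alpha weighting between the two signals.
// alpha = 1 ignores keywords entirely; alpha = 0 ignores vectors entirely.
function hybridScore(vectorScore: number, keywordScore: number, alpha: number): number {
  return alpha * vectorScore + (1 - alpha) * keywordScore;
}
```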

Filtered Search

Apply metadata filters to narrow results:

rag: {
  collection: 'DocChunk',
  query: 'risk assessment methodology',
  limit: 10,
  filters: [
    { path: ['project'], operator: 'Equal', valueText: 'legal-templates' },
    { path: ['category'], operator: 'Equal', valueText: 'risk-management' },
    { path: ['createdAt'], operator: 'GreaterThan', valueDate: '2025-01-01' },
  ],
}
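
Each filter is effectively a predicate on a document's metadata. The sketch below shows the semantics of Equal and GreaterThan client-side for intuition; in practice Weaviate evaluates these server-side, and `matchesFilters` is a hypothetical helper (nested paths are ignored here for brevity):

```typescript
// Illustrative client-side evaluation of the filter shape used above.
// Weaviate applies these server-side; this only demonstrates the semantics.
interface MetadataFilter {
  path: string[];
  operator: 'Equal' | 'GreaterThan';
  valueText?: string;
  valueDate?: string;
}

function matchesFilters(metadata: Record<string, string>, filters: MetadataFilter[]): boolean {
  return filters.every((f) => {
    const actual = metadata[f.path[0]]; // only top-level paths handled here
    if (actual === undefined) return false;
    switch (f.operator) {
      case 'Equal':
        return actual === f.valueText;
      case 'GreaterThan':
        // ISO-8601 date strings compare correctly as plain strings.
        return actual > (f.valueDate ?? f.valueText ?? '');
      default:
        return false;
    }
  });
}
```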

RAG Pipeline Template

A reusable template for RAG-powered Q&A:

rag-qa-v1.yaml
name: RAG Question Answering
description: Answer questions using vector-searched knowledge base context.
version: "1.0.0"
industry: custom
tags: [rag, qa, knowledge-base]
 
inputSchema:
  type: object
  required: [question, collection]
  properties:
    question:
      type: string
    collection:
      type: string
      description: Weaviate collection to search
    filters:
      type: object
      description: Optional metadata filters
 
steps:
  - id: retrieve
    prompt: |
      Search the knowledge base and return the most relevant context
      for answering: {{input.question}}
    model: haiku
    rag:
      collection: "{{input.collection}}"
      query: "{{input.question}}"
      limit: 10
      certainty: 0.75
 
  - id: answer
    prompt: |
      Answer the following question using ONLY the provided context.
      If the context does not contain enough information, say so explicitly.
 
      Question: {{input.question}}
 
      Context:
      {{retrieve.output}}
 
      Rules:
      1. Cite specific sources from the context
      2. Do not fabricate information
      3. If uncertain, state your confidence level
    model: sonnet
    dependsOn: [retrieve]
    temperature: 0.3
 
  - id: verify
    prompt: |
      Verify the answer against the context. Check for:
      1. Claims not supported by the context (hallucination)
      2. Missing information that the context contains
      3. Accuracy of citations
 
      Answer: {{answer.output}}
      Context: {{retrieve.output}}
 
      Return JSON: { "verified": true/false, "issues": [...], "confidence": 0.0-1.0 }
    model: haiku
    dependsOn: [answer, retrieve]
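
The verify step asks for JSON, but model output often wraps it in prose or code fences. A defensive parser for the verification payload might look like this; `parseVerification` is an assumed helper name, with field names taken from the template's contract:

```typescript
// Extract the verification JSON from potentially noisy model output.
interface Verification {
  verified: boolean;
  issues: string[];
  confidence: number;
}

function parseVerification(output: string): Verification | null {
  // Grab the first {...} span; models sometimes surround JSON with prose.
  const match = output.match(/\{[\s\S]*\}/);
  if (!match) return null;
  try {
    const parsed = JSON.parse(match[0]);
    if (typeof parsed.verified !== 'boolean' || !Array.isArray(parsed.issues)) return null;
    return parsed as Verification;
  } catch {
    return null;
  }
}
```

Returning null rather than throwing lets the caller decide whether to retry the verify step.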

Collection Management

List Collections

const collections = await client.rag.listCollections();
collections.forEach((c) => {
  console.log(`${c.name}: ${c.objectCount} objects, vectorizer: ${c.vectorizer}`);
});

Ingest Documents

await client.rag.ingest({
  collection: 'DocChunk',
  documents: [
    {
      content: 'Full text of the document...',
      metadata: {
        title: 'Contract Template - NDA',
        project: 'legal-templates',
        category: 'contracts',
        source: 'internal',
      },
    },
  ],
  chunkSize: 500,
  chunkOverlap: 50,
});
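
The chunkSize and chunkOverlap parameters control how documents are split before embedding. A simplified sketch of overlapping chunking, splitting on whitespace tokens (`chunkText` is an illustrative helper; the actual ingestion service may use a model tokenizer instead):

```typescript
// Split text into overlapping chunks of `chunkSize` whitespace tokens.
// The overlap preserves context that would otherwise be cut at boundaries.
function chunkText(text: string, chunkSize: number, chunkOverlap: number): string[] {
  const tokens = text.split(/\s+/).filter(Boolean);
  const step = Math.max(1, chunkSize - chunkOverlap);
  const chunks: string[] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize).join(' '));
    if (start + chunkSize >= tokens.length) break; // last chunk reached the end
  }
  return chunks;
}
```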

CLI Ingestion

# Ingest a file
devteam rag ingest --collection DocChunk --file ./document.pdf --project legal
 
# Ingest a directory
devteam rag ingest --collection DocChunk --dir ./documents/ --project legal --recursive
 
# Search
devteam rag search --collection DocChunk --query "indemnification clause" --limit 5

For large-scale ingestion (1000+ documents), use the batch ingestion API, which processes documents in parallel and reports progress. See the API Reference for the /rag/ingest/batch endpoint.

Performance Tuning

  limit: Start with 5-10. More context improves accuracy but increases token cost.
  certainty: 0.75 for broad search, 0.85+ for precise matching.
  chunkSize: 300-500 tokens for most use cases. Smaller for Q&A, larger for analysis.
  chunkOverlap: 10-20% of chunk size to maintain context across boundaries.
  alpha (hybrid): 0.75 for semantic-heavy, 0.25 for keyword-heavy queries.

Next Steps