The MCP Server exposes a set of tools that allow agents to read, write, and query data within the DataDot ecosystem.

Discovery

search_tools

Dynamically discovers available tools based on a query.
  • Args: query (str), limit (int, default: 5)
  • Use Case: “What tools can I use to manage files?”
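
As a rough illustration of how a client might invoke this tool, the sketch below builds a JSON-RPC 2.0 request using the `tools/call` method defined by the Model Context Protocol. The helper name `build_tool_call` is hypothetical, and the transport (stdio, HTTP, or other) is not covered here because this page does not specify it.

```python
import json
from itertools import count

# JSON-RPC request ids only need to be unique per connection.
_request_ids = count(1)


def build_tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request body (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names follow the Args line above; the query text is illustrative.
request = build_tool_call("search_tools", {"query": "manage files", "limit": 5})
print(json.dumps(request, indent=2))
```

The same envelope should apply to every tool on this page; only `name` and `arguments` change.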

RAG & Retrieval

These tools allow the agent to tap into the knowledge base.

query_content

The main retrieval tool. Performs a semantic search across all documents in the workspace and returns context-aware answers.
  • Features: Supports streaming reasoning, source citation, and relevance scoring.
  • Args: query (str), mode (“query” or “chat”), top_n (int)
  • Use Case: “What does the architecture document say about authentication?”
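
Because `mode` accepts only two values, a client may want to validate arguments before sending them. The following is a minimal sketch of such a check; the function name `query_content_args`, the default of "query" for `mode`, and the rule that `top_n` must be positive are assumptions, not documented server behavior.

```python
from typing import Optional

# The two modes named in the Args line above.
ALLOWED_MODES = ("query", "chat")


def query_content_args(
    query: str, mode: str = "query", top_n: Optional[int] = None
) -> dict:
    """Validate and assemble arguments for query_content (hypothetical helper)."""
    if not query.strip():
        raise ValueError("query must be a non-empty string")
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {ALLOWED_MODES}, got {mode!r}")
    args = {"query": query, "mode": mode}
    if top_n is not None:
        # Assumption: top_n counts results, so it must be at least 1.
        if top_n < 1:
            raise ValueError("top_n must be a positive integer")
        args["top_n"] = top_n
    return args


args = query_content_args(
    "What does the architecture document say about authentication?", top_n=3
)
print(args)
```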

search_files

Performs a semantic or keyword search to find relevant files without retrieving their full content.
  • Args: query (str)
  • Use Case: “Find all files related to ‘onboarding’.”

get_citation

Retrieves citation metadata for a given document ID so that the agent can attribute information correctly.
  • Args: doc_id (str)
  • Use Case: “Get the publication date for document ‘doc-123’.”

get_embeddings

Retrieves vector embeddings and metadata for a specific document or all documents in the workspace.
  • Args: document_id (optional str)
  • Use Case: “Check how many vectors are stored for this file.”

Document Management

Tools for modifying the workspace and file system.

process_file

Triggers the embedding and indexing process for a specific document. This performs text extraction, chunking, and vector storage.
  • Args: document_id (str), use_cache (bool)
  • Use Case: “Process the uploaded PDF.”

delete_file

Removes a file from the workspace index and database.
  • Args: document_id (str)
  • Use Case: “Remove the outdated policy document.”

get_file_content

Reads the raw text content of a file.
  • Args: document_id (str), limit_bytes (int)
  • Use Case: “Read the content of ‘config.json’.”

get_source_file

Lists all files currently available in the workspace, along with their metadata. Despite the singular name, it returns the full list rather than a single file.
  • Use Case: “Show me all files in the current workspace.”
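
On the client side, an MCP tool result arrives as a list of content blocks plus an optional `isError` flag. The sketch below pulls the text out of such a result. The sample payload is hypothetical: this page does not document the response shape of get_source_file, so the JSON list and its `name` field are assumptions made for illustration.

```python
import json


def extract_text(result: dict) -> str:
    """Concatenate the text blocks of an MCP tool result, raising on errors."""
    if result.get("isError"):
        raise RuntimeError("tool call reported an error")
    return "".join(
        block["text"]
        for block in result.get("content", [])
        if block.get("type") == "text"
    )


# Hypothetical result: a JSON-encoded file list inside a single text block.
sample_result = {
    "content": [
        {
            "type": "text",
            "text": json.dumps([{"name": "config.json"}, {"name": "policy.pdf"}]),
        }
    ],
    "isError": False,
}

files = json.loads(extract_text(sample_result))
print([f["name"] for f in files])
```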

System & Maintenance

sync_files

Synchronizes a local directory with the workspace, automatically uploading and embedding supported files.
  • Args: source_path (optional str), auto_embed (bool)
  • Use Case: “Sync the latest code changes to the knowledge base.”

check_embeddings

Validates the integrity of embeddings in the workspace, identifying missing or stale vectors.
  • Use Case: “Verify if all documents are correctly indexed.”
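
One way to think about "missing" and "stale" vectors is to compare a hash of each document's current content with the hash recorded when it was last embedded. The sketch below illustrates that idea only; the server's actual integrity check is not described on this page, and `find_embedding_problems` and both dictionaries are hypothetical.

```python
import hashlib
from typing import Dict, List


def content_hash(text: str) -> str:
    """SHA-256 hex digest of the document text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_embedding_problems(
    documents: Dict[str, str], indexed_hashes: Dict[str, str]
) -> Dict[str, List[str]]:
    """Classify documents as missing (never indexed) or stale (content changed)."""
    missing = sorted(d for d in documents if d not in indexed_hashes)
    stale = sorted(
        d
        for d, text in documents.items()
        if d in indexed_hashes and indexed_hashes[d] != content_hash(text)
    )
    return {"missing": missing, "stale": stale}


# Illustrative data: doc-2 changed after indexing, doc-3 was never indexed.
documents = {"doc-1": "alpha", "doc-2": "beta v2", "doc-3": "gamma"}
indexed_hashes = {"doc-1": content_hash("alpha"), "doc-2": content_hash("beta v1")}
report = find_embedding_problems(documents, indexed_hashes)
print(report)
```

Under this model, a stale document would be a candidate for refresh_embeddings and a missing one for process_file.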

refresh_embeddings

Forces a re-index for a specific document, clearing old vectors and re-generating them.
  • Args: document_id (str), clear_cache (bool)
  • Use Case: “The file changed, update its embeddings.”