Creating & Using Data Sources
This guide covers uploading and managing document-based data sources, then querying them for relevant AI context.
MindStudio allows you to create internal data sources directly within your projects. These data sources are ideal for uploading documents—like support guides or product manuals—that your AI agents can reference to generate accurate, contextual responses.
MindStudio supports several types of data sources:
Integration data sources: External services like Google Docs or Sheets, brought in via integration blocks.
Internal databases: Custom backends or structured tables, supported via advanced connections.
Document-based project data sources: Files uploaded directly into your project’s "Data Sources" folder—this is the focus of this guide.
To demonstrate how document-based data sources work, we'll create a support bot that answers questions about MindStudio using uploaded documentation.
Begin your AI agent with a user input block. This block captures the user's question and stores it in a variable, typically called query.
Navigate to the Data Sources section on the left-hand panel. Click the plus button to create a new data source:
Name it (e.g., Mind Studio Docs)
Add a description
Upload documents (up to 150 files, each ≤50MB)
Tip: Use a free PDF compression service if your documents are too large.
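Before uploading a large batch, it can help to check the files against the limits above. The following is a minimal local sketch, not part of MindStudio: the function names and the folder path are hypothetical, and it assumes "50MB" means 50 × 1024 × 1024 bytes.

```python
from pathlib import Path

MAX_FILES = 150                 # file-count limit stated in this guide
MAX_BYTES = 50 * 1024 * 1024    # assumption: "50MB" read as 50 MiB


def find_upload_problems(sizes_by_name: dict[str, int]) -> list[str]:
    """Return readable problems for a batch of files (name -> size in bytes)."""
    problems = []
    if len(sizes_by_name) > MAX_FILES:
        problems.append(
            f"{len(sizes_by_name)} files exceeds the limit of {MAX_FILES}"
        )
    for name, size in sorted(sizes_by_name.items()):
        if size > MAX_BYTES:
            size_mb = size / 1024 / 1024
            problems.append(f"{name} is {size_mb:.1f} MB, over the 50 MB limit")
    return problems


def scan_folder(folder: str) -> dict[str, int]:
    """Collect file sizes from a local folder (the path is your own choice)."""
    return {p.name: p.stat().st_size for p in Path(folder).iterdir() if p.is_file()}
```

For example, find_upload_problems(scan_folder("./docs")) returns an empty list when every file in the hypothetical ./docs folder is within the limits.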
As the document uploads, it will be processed into a vector database:
You’ll see a word count and chunk count.
Review the extracted text to ensure formatting looks clean.
Check the chunk preview to understand how the document is split.
Use the index snippet to reference the full document, or query it with natural language.
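To build intuition for the word count and chunk count shown after upload, here is a rough sketch of word-based chunking. This guide does not describe how MindStudio actually splits documents, so the function name, the chunk size, and the overlap are all illustrative assumptions.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into chunks of up to chunk_size words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = " ".join(str(i) for i in range(10))
print(len(sample.split()), "words")            # word count
print(len(chunk_words(sample, 4, 1)), "chunks")  # chunk count
```

Smaller chunks tend to make retrieval more targeted, while larger chunks keep more surrounding context together. The chunk preview is the place to see what the real split looks like.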
Insert the Query Data Source block into your workflow:
Select your uploaded data source.
Set the output variable (e.g., query_result)
Use the query variable (from user input) to trigger the search.
Optionally adjust the number of chunks retrieved (default is 3, max is 5).
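Conceptually, the Query Data Source block ranks chunks by similarity to the query and returns the top few. The sketch below illustrates that idea only: it uses simple word-count vectors as a stand-in for real embeddings, and every name in it is hypothetical rather than MindStudio's implementation. The default of 3 and the maximum of 5 come from this guide.

```python
import math
from collections import Counter

DEFAULT_CHUNKS = 3   # default stated in this guide
MAX_CHUNKS = 5       # maximum stated in this guide


def _vector(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(text.lower().split())


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def query_chunks(query: str, chunks: list[str], k: int = DEFAULT_CHUNKS) -> list[str]:
    """Return the k chunks most similar to the query, with k clamped to 1..MAX_CHUNKS."""
    k = max(1, min(k, MAX_CHUNKS))
    q = _vector(query)
    ranked = sorted(chunks, key=lambda c: _cosine(q, _vector(c)), reverse=True)
    return ranked[:k]
```

Retrieving more chunks gives the model more context to work with, at the cost of a longer prompt and a greater chance of including loosely related text.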
Use a Generate Text block to create your AI’s response, referencing both the query variable and the query_result variable in its prompt.
This setup ensures the AI receives relevant context before answering.
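The general shape of such a prompt can be sketched outside MindStudio as follows. The function name and the instruction wording are hypothetical. In MindStudio itself, the equivalent text would go in the Generate Text block's prompt, with the query and query_result variables inserted where the placeholders appear here.

```python
def build_prompt(query: str, query_result: str) -> str:
    """Combine retrieved context and the user's question into one prompt string."""
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{query_result.strip()}\n\n"
        f"Question: {query.strip()}"
    )


print(build_prompt(
    query="How do I create a data source?",
    query_result="Click the plus button in the Data Sources section.",
))
```

Placing the retrieved context before the question, and telling the model to stay within it, is one common way to keep answers grounded in the uploaded documentation.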
Use the Draft Agent preview to test your support bot. As users ask questions, the system:
Queries the vectorized document.
Retrieves relevant text chunks.
Uses those chunks as context to generate an answer.
If your model has a large enough context window (e.g., Claude 3.5 Haiku supports 200k tokens), you can pass the entire document to the AI using the index snippet.
Caution: Passing full documents may reduce performance or make the AI less precise. Use only when full context is necessary.
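A quick estimate can indicate whether a full document is likely to fit in the model's context window. This sketch assumes roughly four characters per token, a coarse heuristic that varies by tokenizer and language, and it reserves a hypothetical 8,000 tokens for the prompt and the response. The 200k window is the example figure from this guide.

```python
CONTEXT_WINDOW_TOKENS = 200_000   # example figure from this guide
CHARS_PER_TOKEN = 4               # assumption: rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(
    text: str,
    reserve_tokens: int = 8_000,          # assumption: room for prompt and answer
    window: int = CONTEXT_WINDOW_TOKENS,
) -> bool:
    """Return True if the estimated tokens plus the reserve fit in the window."""
    return estimate_tokens(text) + reserve_tokens <= window
```

If fits_in_context returns False, or the estimate is close to the limit, querying chunks is likely the safer option.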
Data sources in MindStudio allow AI agents to query long-form documents with natural language.
Use them to build agents like knowledge bases, support bots, or product Q&A tools.
Choose between querying small chunks for relevance or referencing full documents for completeness.
Always validate uploaded files by checking the extracted text for formatting issues.
Data sources are a powerful way to give your AI agents domain-specific expertise—using the same documentation your team already relies on.