Collections
As a fundamental component of DataGyro’s Search vertical, Collections are logical groupings of data that allow you to organize your information for natural language search and LLM retrieval.
What are Collections?
Collections serve as an abstraction layer over your data sources: they are the queryable interface that lets you search your data using natural language. Currently, collections have a 1:1 mapping with data sources, so each collection is created from a single data source. While data sources provide the connection to your data, collections transform that data into a format optimized for LLM retrieval, handling the chunking, embedding, and indexing automatically.
Creating Collections
To create a collection:
- Navigate to the “Collections” section in your project
- Click “New Collection”
- Enter a name for your collection
- Select the data source you want to use
- Click “Create Collection”
Collection Processing Time
After you create a collection, our systems need time to process your data. During processing, we:
- Sync the data from your database (for SQL connection strings) or import your SQL dump file
- Analyze and chunk your data for retrieval
- Generate embeddings using models selected for your data type
- Create hybrid search indexes that combine semantic and keyword search capabilities
- Optimize the result for fast, accurate retrieval
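To make the chunking and hybrid search steps above more concrete, here is a minimal sketch in Python. It is not DataGyro's implementation, which this page does not describe. The function names `chunk_words` and `reciprocal_rank_fusion` are hypothetical, the chunking is a simple overlapping word window, and the merge uses reciprocal rank fusion, one common way to combine a semantic ranking with a keyword ranking.

```python
from collections import defaultdict


def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows of `size` words (illustrative only)."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks: list[str] = []
    start = 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of chunk IDs into one hybrid ranking.

    Each list contributes 1 / (k + rank) to a chunk's score, so chunks that
    rank well in both the semantic and the keyword list rise to the top.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable result.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


if __name__ == "__main__":
    # Hypothetical rankings, as if returned by a vector index and a keyword index.
    semantic = ["c2", "c1", "c3"]
    keyword = ["c1", "c4", "c2"]
    print(reciprocal_rank_fusion([semantic, keyword]))
```

In this sketch, a chunk that appears near the top of both lists outranks one that appears in only one of them, which is the general intent of a hybrid index.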
Automatic Updates for SQL Databases
For collections connected to SQL databases via connection strings, any changes you make to your database are automatically pulled and applied to your collection. This keeps your collection in sync with your source data.
To enable automatic updates for SQL databases, you’ll need to configure your database by running the following SQL commands as a database administrator. These commands set up logical replication and create a dedicated user with the necessary permissions for DataGyro to monitor changes:
Replace the placeholders:
- Replace xxxxxxxx with a secure password for the DataGyro user
- Replace <DB_NAME> with your actual database name
- Replace <SCHEMA_NAME> with your schema name (repeat the schema commands for each schema you want to sync)
After running these commands, provide the database connection details, including the datagyro user credentials, when setting up your data source in DataGyro.
Collection Schema
Each collection inherits its schema from the underlying data source:
- Fields: The columns or attributes from your data source
- Data Types: The type of data each field contains (string, number, date, etc.)
- Primary Keys: Fields that uniquely identify records in your collection
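As an illustration of how these three schema parts relate, the following Python sketch models a schema as a small data structure. The class names `SchemaField` and `CollectionSchema` and the method `record_key` are hypothetical and do not correspond to a documented DataGyro API. The sketch only shows that primary keys must refer to existing fields and can be used to identify a record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaField:
    name: str       # column or attribute name from the data source
    data_type: str  # e.g. "string", "number", "date"


@dataclass(frozen=True)
class CollectionSchema:
    fields: tuple[SchemaField, ...]
    primary_keys: tuple[str, ...]

    def __post_init__(self) -> None:
        names = [f.name for f in self.fields]
        if len(set(names)) != len(names):
            raise ValueError("duplicate field names in schema")
        missing = [key for key in self.primary_keys if key not in names]
        if missing:
            raise ValueError(f"primary keys not found in fields: {missing}")

    def record_key(self, record: dict) -> tuple:
        """Return the values that uniquely identify a record."""
        return tuple(record[key] for key in self.primary_keys)


# Hypothetical schema for a table with an "id" primary key.
schema = CollectionSchema(
    fields=(
        SchemaField("id", "number"),
        SchemaField("title", "string"),
        SchemaField("published_at", "date"),
    ),
    primary_keys=("id",),
)
```

Calling `schema.record_key(...)` on a record returns a tuple of its primary key values, which is what distinguishes one record from another within the collection.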
Managing Collections
Viewing Collection Details
To view details about a collection:
- Go to the “Collections” section
- Click on the collection name
- Review the schema and data preview
Deleting Collections
To delete a collection:
- Go to the “Collections” section
- Find the collection you want to delete
- Click the “Delete” button
- Confirm the deletion