Querying Data in DataGyro

Natural language querying is the core capability of DataGyro’s Search vertical, and it is designed specifically for LLM-powered applications. Skip complex query languages and ask questions in plain English.

Natural Language Querying

With DataGyro, you can query your collections using natural language questions or statements, for example:
  • “Find all customer complaints from March 2025”
  • “What were our top-selling products last quarter?”
  • “Show me documents about machine learning”
  • “Get information about API rate limiting from our docs”
This approach makes retrieval intuitive and eliminates the need to write complex queries, especially when working with LLMs.

API Integration

DataGyro’s querying is designed to be easily integrated with your LLM-powered applications through our streaming API. The query endpoint returns real-time updates via Server-Sent Events (SSE):
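// Example SSE integration for querying a collection.
// The native EventSource API only issues GET requests and cannot send custom
// headers or a body, so this example reads the event stream with fetch().
async function runQuery() {
  const response = await fetch('https://platform.datagyro.com/v1/query', {
    method: 'POST',
    headers: {
      'apikey': 'YOUR_API_KEY',
      'Content-Type': 'application/json',
      'Accept': 'text/event-stream'
    },
    body: JSON.stringify({
      query_string: "What were our sales figures for Q1 2025?",
      dataset_id: "118",
      limit: 10,
      use_smaller_model: false
    })
  });
  if (!response.ok) {
    throw new Error(`Query failed with status ${response.status}`);
  }

  let queryResults = null;
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Normalize line endings; SSE events are separated by a blank line
    buffer = (buffer + decoder.decode(value, { stream: true })).replace(/\r\n/g, '\n');
    const blocks = buffer.split('\n\n');
    buffer = blocks.pop(); // keep any incomplete event for the next read

    for (const block of blocks) {
      // Join the data: lines of one event into a single JSON payload
      const payload = block
        .split('\n')
        .filter(line => line.startsWith('data:'))
        .map(line => line.slice(5).trimStart())
        .join('\n');
      if (!payload) continue;
      const data = JSON.parse(payload);

      switch (data.type) {
        case 'thoughts':
          console.log('Query processing insights:', data.data);
          break;
        case 'sql':
          console.log('Generated SQL:', data.data.sql);
          break;
        case 'results':
          queryResults = data.data.items;
          console.log('Query results received');
          break;
        case 'close':
          console.log('Query completed');
          await reader.cancel();
          // Now use queryResults in your LLM prompt
          return queryResults;
      }
    }
  }
  return queryResults;
}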

Understanding Query Results

When you query a collection, DataGyro streams responses in real time through Server-Sent Events. The final results arrive in the results event, in a structure optimized for LLMs.

Stream Events

The query process returns four types of events:
  1. Thoughts Event: Shows query processing insights including filters, traits, and key phrases
  2. SQL Event: Contains the generated SQL query used to retrieve data
  3. Results Event: Delivers the actual query results
  4. Close Event: Indicates the stream has completed
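
As a rough sketch of how the four event types fit together, the helper below folds a list of already-parsed events into one summary object. The function name collectQueryEvents and the sample payloads are hypothetical; in particular, the shape of the thoughts payload is assumed for illustration and is not taken from the API.

```javascript
// Folds a sequence of already-parsed stream events into one summary object.
// collectQueryEvents is a hypothetical helper, not part of the DataGyro API.
function collectQueryEvents(events) {
  const summary = { thoughts: [], sql: null, items: [], closed: false };
  for (const event of events) {
    switch (event.type) {
      case 'thoughts':
        summary.thoughts.push(event.data);
        break;
      case 'sql':
        summary.sql = event.data.sql;
        break;
      case 'results':
        summary.items = event.data.items;
        break;
      case 'close':
        summary.closed = true;
        break;
      default:
        break; // ignore event types this sketch does not know about
    }
  }
  return summary;
}

// Illustrative events only; the thoughts payload shape is an assumption.
const sampleEvents = [
  { type: 'thoughts', data: { key_phrases: ['sales figures', 'Q1 2025'] } },
  { type: 'sql', data: { sql: 'SELECT 1' } },
  { type: 'results', data: { items: [{ id: 'abc', item: { '001_headline': 'Q1 2025 Financial Results' } }] } },
  { type: 'close' }
];

const summary = collectQueryEvents(sampleEvents);
console.log(summary.closed, summary.items.length);
```

Keeping the fold separate from the network code makes it easy to unit-test the event handling without opening a stream.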

Results Structure

The results event contains an array of relevant items:
{
  "type": "results",
  "data": {
    "items": [
      {
        "id": "c0cc058f-0ef0-4ef5-bfea-fe0468d3814f",
        "item": {
          "001_headline": "Q1 2025 Financial Results",
          "002_current_title": "Sales Report",
          "003_bio": "In Q1 2025, total sales reached $4.2M, representing a 15% increase year-over-year compared to Q1 2024 ($3.65M).",
          "004_name": "Quarterly Report",
          "005_linkedin_url": "https://company.com/reports/q1-2025",
          "006_follower_count": 2588,
          "007_connections_count": 2482,
          "008_city": "New York",
          "009_country_code": "us"
        }
      }
      // Additional results...
    ]
  },
  "created": 1748020553784
}
Each result item includes:
  • id: A unique identifier for the result
  • item: The structured data from your dataset, including all available fields for the matched record
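
Field names in the sample above carry a numeric prefix such as 001_. If you prefer plain names in your prompts, a small helper can strip it. This is a minimal sketch that assumes the prefix is always digits followed by an underscore; stripFieldPrefixes is a hypothetical name, and if two fields differ only by prefix the later one would overwrite the earlier.

```javascript
// Returns a copy of a result item with leading "NNN_" prefixes removed from keys.
// Assumption: prefixes are digits followed by an underscore, as in the sample response.
function stripFieldPrefixes(item) {
  const cleaned = {};
  for (const [key, value] of Object.entries(item)) {
    cleaned[key.replace(/^\d+_/, '')] = value;
  }
  return cleaned;
}

// Fields copied from the sample results event
const sampleItem = {
  '001_headline': 'Q1 2025 Financial Results',
  '002_current_title': 'Sales Report',
  '008_city': 'New York'
};

const cleanedItem = stripFieldPrefixes(sampleItem);
console.log(Object.keys(cleanedItem).join(', '));
```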

Optimizing Your Queries

While DataGyro handles most of the complexity automatically, here are some tips for getting the best results:

Be Specific

More specific queries typically yield better results:
  • Less effective: “Tell me about our customers”
  • More effective: “What are the most common customer complaints in the last month?”

Include Context

Providing context helps DataGyro understand exactly what you’re looking for:
  • Less effective: “What are our policies?”
  • More effective: “What are our return policies for damaged electronics?”

Query Parameters

You can refine your queries with these parameters:
  • query_string: The natural language query or question
  • dataset_id: The ID of the collection you want to query
  • limit: Maximum number of results to return (default: 10)
  • use_smaller_model: Option to use a smaller model for faster processing (default: false)
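
One way to apply these defaults consistently is to build the request body in a single place. The sketch below is illustrative only: buildQueryBody is a hypothetical helper, the validation rules are assumptions rather than documented server behavior, and dataset_id is sent as a string only because the examples on this page do so.

```javascript
// Builds a request body for the query endpoint, filling in the documented defaults.
// The validation rules here are assumptions for illustration, not API guarantees.
function buildQueryBody({ query_string, dataset_id, limit = 10, use_smaller_model = false }) {
  if (typeof query_string !== 'string' || query_string.trim() === '') {
    throw new TypeError('query_string must be a non-empty string');
  }
  if (dataset_id === undefined || dataset_id === null || String(dataset_id) === '') {
    throw new TypeError('dataset_id is required');
  }
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError('limit must be a positive integer');
  }
  return {
    query_string: query_string.trim(),
    dataset_id: String(dataset_id), // the examples on this page pass the ID as a string
    limit,
    use_smaller_model: Boolean(use_smaller_model)
  };
}

const queryBody = buildQueryBody({
  query_string: '  What are our return policies for damaged electronics?  ',
  dataset_id: 118
});
console.log(JSON.stringify(queryBody));
```

The returned object can be passed to JSON.stringify and used as the body of the POST request.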

Using Results with LLMs

The results from DataGyro queries are designed to be easily fed into LLM prompts. Because the API streams over SSE, collect the items from the results event, then build the prompt once the close event arrives:
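// Example of using DataGyro SSE results with an LLM.
// fetch() is used because the native EventSource API cannot send POST
// requests, custom headers, or a body.
async function answerWithContext() {
  const response = await fetch('https://platform.datagyro.com/v1/query', {
    method: 'POST',
    headers: {
      'apikey': 'YOUR_API_KEY',
      'Content-Type': 'application/json',
      'Accept': 'text/event-stream'
    },
    body: JSON.stringify({
      query_string: "What is our refund policy for subscription services?",
      dataset_id: "118",
      limit: 5
    })
  });
  if (!response.ok) {
    throw new Error(`Query failed with status ${response.status}`);
  }

  // Read the stream until the close event, keeping the items from the results event
  let queryResults = [];
  let closed = false;
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (!closed) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer = (buffer + decoder.decode(value, { stream: true })).replace(/\r\n/g, '\n');
    const blocks = buffer.split('\n\n'); // SSE events are separated by a blank line
    buffer = blocks.pop(); // keep any incomplete event for the next read

    for (const block of blocks) {
      const payload = block
        .split('\n')
        .filter(line => line.startsWith('data:'))
        .map(line => line.slice(5).trimStart())
        .join('\n');
      if (!payload) continue;
      const data = JSON.parse(payload);

      if (data.type === 'results') {
        queryResults = data.data.items;
      } else if (data.type === 'close') {
        closed = true;
      }
    }
  }
  if (closed) await reader.cancel();

  // Prepare context for the LLM using the retrieved data
  let context = "Here's information about our refund policy:\n\n";
  queryResults.forEach(result => {
    // Access the item data from the result
    const item = result.item;
    context += `${item['003_bio'] || item.content || ''}\n`;
    if (item['004_name']) context += `Source: ${item['004_name']}\n`;
    context += '\n';
  });

  // Send to your LLM with the retrieved context.
  // callYourLLM is a placeholder for your own LLM client.
  const llmResponse = await callYourLLM({
    messages: [
      { role: "system", content: "You are a helpful customer support assistant." },
      { role: "user", content: "Can you explain your refund policy for subscriptions?" },
      { role: "assistant", content: "Let me check our policies for you." },
      { role: "system", content: context }
    ]
  });
  return llmResponse;
}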
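
When a query returns many items, it can help to cap how much text goes into the prompt. The following sketch serializes each item's non-empty fields and stops once a character budget would be exceeded. formatResultsForPrompt and the 2000-character default are hypothetical choices, and a character count is only a crude stand-in for a real token count.

```javascript
// Serializes result items into a prompt-friendly string within a character budget.
// Hypothetical helper; the budget is measured in characters, not tokens.
function formatResultsForPrompt(items, maxChars = 2000) {
  const blocks = [];
  let used = 0;
  for (const { id, item } of items) {
    const lines = Object.entries(item)
      .filter(([, value]) => value !== null && value !== undefined && value !== '')
      .map(([key, value]) => `${key}: ${value}`);
    const block = `[${id}]\n${lines.join('\n')}`;
    if (used + block.length > maxChars) break; // stop before exceeding the budget
    blocks.push(block);
    used += block.length + 2; // account for the blank line between blocks
  }
  return blocks.join('\n\n');
}

// Illustrative items in the same { id, item } shape as the results event
const exampleItems = [
  { id: 'a', item: { name: 'A', note: '' } },
  { id: 'b', item: { name: 'B' } }
];

const fullContext = formatResultsForPrompt(exampleItems);
const cappedContext = formatResultsForPrompt(exampleItems, 15);
console.log(fullContext);
console.log(cappedContext);
```

Because items arrive in the order the API returns them, truncating from the end keeps the earliest results.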

Advanced Features

DataGyro automatically uses semantic search to understand the meaning behind your queries rather than just matching keywords. This enables more intuitive retrieval:
  • Users can ask questions in different ways but get consistent results
  • Queries can match conceptually related content, not just exact phrase matches
  • Results are ranked by semantic relevance, not just keyword frequency
For optimal results, DataGyro combines semantic search with traditional keyword search:
  • Get the precision of keyword matching when needed
  • Benefit from the flexibility of semantic understanding
  • Automatically balanced for best results based on your query
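
This page does not describe how DataGyro balances the two signals internally. Purely to illustrate the general idea of hybrid ranking, the sketch below merges a semantic ranking and a keyword ranking with reciprocal rank fusion, a common technique that is not claimed to be DataGyro's method. The function name, the constant k = 60, and the sample IDs are all assumptions.

```javascript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per document,
// and documents are ordered by their summed score.
// Illustration of hybrid ranking in general, not DataGyro's implementation.
function reciprocalRankFusion(rankings, k = 60) {
  const scores = new Map();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks start at 1
      scores.set(id, (scores.get(id) || 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical rankings from a semantic search and a keyword search
const semanticRanking = ['a', 'b', 'c'];
const keywordRanking = ['b', 'd', 'a'];

const fused = reciprocalRankFusion([semanticRanking, keywordRanking]);
console.log(fused.join(', ')); // b, a, d, c
```

Documents that rank well in both lists (here b and a) rise above documents that appear in only one.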

Query Understanding

DataGyro parses and analyzes natural language queries to extract:
  • Core information needs
  • Important entities and relationships
  • Time frames and other constraints
  • Required context for proper retrieval