Evaluate: Datasets
The “Datasets” component in DataGyro’s Evaluate vertical is your starting point for any Information Retrieval (IR) system evaluation. Its primary purpose is to take your raw data and automatically prepare it for a TREC-style evaluation by generating queries and relevance judgments (qrels).
Uploading Your Data
Currently, DataGyro supports uploading datasets in the JSONL (JSON Lines) format. Support for other formats like plain JSON and Parquet is planned for future releases.
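JSONL stores one complete JSON object per line, which makes large datasets easy to stream and parse. A minimal sketch of writing and reading such a file (the field names `id` and `text` are illustrative; check DataGyro's expected schema for your data):

```python
import json

# Hypothetical records -- the "id" and "text" fields are an assumption,
# not DataGyro's required schema.
records = [
    {"id": "doc1", "text": "DataGyro evaluates IR systems with TREC-style metrics."},
    {"id": "doc2", "text": "JSONL stores one JSON object per line."},
]

# Write one JSON object per line.
with open("dataset.jsonl", "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

# Read it back: each non-empty line parses independently.
with open("dataset.jsonl", encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f if line.strip()]

print(len(loaded))  # 2
```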
To upload a dataset:
- Navigate to the “Evaluate” section in DataGyro and select the “Datasets” tab.
- Click on “Upload Dataset” (or a similar button).
- Choose your JSONL file for upload.
Automatic Query and Qrel Generation
Once you upload your JSONL file, DataGyro works in the background to:
- Parse your data: Understand the structure and content.
- Generate Queries: Create a set of test queries based on the information within your dataset. These queries are designed to be representative of how a user might search for information within that data.
- Generate Qrels (Relevance Judgments): For each generated query, DataGyro identifies and marks the relevant documents or passages within your dataset. These form the ground truth for the evaluation.
This automated process saves significant time and effort compared to manually creating query sets and relevance judgments.
Dataset Status: “Ready”
After the background processing is complete, your dataset’s status will change to “Ready”. This indicates that the queries and qrels have been successfully generated, and the dataset can now be used in the “Benchmarks” component to evaluate an IR algorithm.
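If you want to automate the wait for this status change, a generic polling helper can wrap however you fetch the status. This is a sketch only: `get_status` is a stand-in callable, not a DataGyro API, and the timeout and interval values are arbitrary.

```python
import time

def wait_until_ready(get_status, timeout=300.0, interval=5.0):
    """Poll get_status() until it returns "Ready" or the timeout elapses.

    get_status is a placeholder for however you check the dataset's
    status (e.g. via a client of your own); it is not part of DataGyro.
    Returns True once "Ready" is seen, False if the timeout is reached.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_status() == "Ready":
            return True
        time.sleep(interval)
    return False

# Demo with a fake status source that reports "Ready" on the third call.
statuses = iter(["Processing", "Processing", "Ready"])
print(wait_until_ready(lambda: next(statuses), timeout=10.0, interval=0.0))  # True
```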
Future Enhancements
- Support for additional file formats (JSON, Parquet).
- More configuration options for the query and qrel generation process.
Next, learn how to use your “Ready” dataset in Evaluate: Benchmarks.