# Datasets
Create, manage, and populate evaluation datasets programmatically with the Brokle SDK
Datasets are collections of input/expected pairs used to evaluate AI applications systematically. Use `client.datasets` to create, populate, and manage datasets programmatically.
## Quick Start
Create a dataset, insert items, and iterate over them in under a minute:
```python
from brokle import Brokle

client = Brokle(api_key="bk_...")

# Create a dataset
dataset = client.datasets.create(
    name="qa-pairs",
    description="Question-answer test cases",
)

# Insert items
dataset.insert([
    {"input": {"question": "What is 2+2?"}, "expected": {"answer": "4"}},
    {"input": {"question": "Capital of France?"}, "expected": {"answer": "Paris"}},
])

# Iterate over items
for item in dataset:
    print(item.input, item.expected)
```

```typescript
import { Brokle } from 'brokle';

const client = new Brokle({ apiKey: 'bk_...' });

// Create a dataset
const dataset = await client.datasets.create({
  name: "qa-pairs",
  description: "Question-answer test cases",
});

// Insert items
await dataset.insert([
  { input: { question: "What is 2+2?" }, expected: { answer: "4" } },
  { input: { question: "Capital of France?" }, expected: { answer: "Paris" } },
]);

// Iterate over items (auto-paginated)
for await (const item of dataset) {
  console.log(item.input, item.expected);
}
```

## Creating Datasets
```python
dataset = client.datasets.create(
    name="my-dataset",
    description="Optional description",
    metadata={"team": "ml", "version": "1.0"},
)
print(dataset.id, dataset.name)
```

```typescript
const dataset = await client.datasets.create({
  name: "my-dataset",
  description: "Optional description",
  metadata: { team: "ml", version: "1.0" },
});
console.log(dataset.id, dataset.name);
```

## Retrieving Datasets
```python
# Get by ID
dataset = client.datasets.get("01HXYZ...")

# List all datasets (paginated)
datasets = client.datasets.list(limit=10, page=1)
for ds in datasets:
    print(ds.name)
```

```typescript
// Get by ID
const dataset = await client.datasets.get("01HXYZ...");

// List all datasets (paginated)
const datasets = await client.datasets.list({ limit: 10, page: 1 });
for (const ds of datasets) {
  console.log(ds.name);
}
```

## Updating & Deleting
```python
# Update metadata
updated = client.datasets.update(
    "01HXYZ...",
    name="renamed-dataset",
    description="Updated description",
)

# Delete
client.datasets.delete("01HXYZ...")
```

```typescript
// Update metadata
const updated = await client.datasets.update("01HXYZ...", {
  name: "renamed-dataset",
  description: "Updated description",
});

// Delete
await client.datasets.delete("01HXYZ...");
```

## Working with Items
### Insert Items
Each item must have an `input` field; `expected` and `metadata` are optional.
```python
count = dataset.insert([
    {"input": {"prompt": "Summarize this article"}, "expected": {"summary": "..."}},
    {"input": {"prompt": "Translate to French"}, "metadata": {"language": "fr"}},
])
print(f"Inserted {count} items")
```

```typescript
const count = await dataset.insert([
  { input: { prompt: "Summarize this article" }, expected: { summary: "..." } },
  { input: { prompt: "Translate to French" }, metadata: { language: "fr" } },
]);
console.log(`Inserted ${count} items`);
```

### Iterate Items
Both SDKs support auto-paginated iteration that transparently fetches pages as needed.
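The mechanism can be pictured with a standalone sketch, not the SDK's actual internals: a generator pulls pages from a fetch function until a short (final) page signals the end. The `fetch_page` callable here is a hypothetical stand-in for the SDK's internal page request.

```python
from typing import Callable, Iterator, List


def iter_items(fetch_page: Callable[[int, int], List[dict]], limit: int = 2) -> Iterator[dict]:
    """Yield items one by one, fetching pages lazily as needed."""
    page = 1
    while True:
        batch = fetch_page(page, limit)
        yield from batch
        if len(batch) < limit:  # a short page means there are no more items
            return
        page += 1


# Stubbed backend holding five items, served in pages of two
DATA = [{"input": {"q": str(i)}} for i in range(5)]

def fake_fetch(page: int, limit: int) -> List[dict]:
    start = (page - 1) * limit
    return DATA[start:start + limit]


items = list(iter_items(fake_fetch, limit=2))
print(len(items))  # 5
```

The consumer never sees page boundaries, which is the same property the SDK iterators provide.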
```python
# Auto-paginated iteration
for item in dataset:
    print(item.input, item.expected)

# Manual pagination
items = dataset.get_items(limit=10, page=1)

# Get total count
total = dataset.count()
```

```typescript
// Auto-paginated async iteration
for await (const item of dataset) {
  console.log(item.input, item.expected);
}

// Manual pagination
const items = await dataset.getItems({ limit: 10, page: 1 });

// Get total count
const total = await dataset.count();
```

### Import from Files
Import items from JSON, JSONL, or CSV files.
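The file shapes below are assumptions inferred from the item fields and the CSV column mapping shown in this section, not a published schema: JSONL carries one item object per line, and CSV carries flat columns that the mapping later assigns to `input`, `expected`, and `metadata`. A minimal stdlib sketch that writes and re-reads both:

```python
import csv
import json
from pathlib import Path

items = [
    {"input": {"question": "What is 2+2?"}, "expected": {"answer": "4"}},
    {"input": {"question": "Capital of France?"}, "expected": {"answer": "Paris"}},
]

# JSONL: one {"input": ..., "expected": ...} object per line
jsonl_path = Path("data.jsonl")
jsonl_path.write_text("\n".join(json.dumps(item) for item in items))
loaded = [json.loads(line) for line in jsonl_path.read_text().splitlines()]

# CSV: flat columns; a column mapping decides which become
# input, expected, and metadata fields on import
csv_path = Path("qa_pairs.csv")
with csv_path.open("w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["question", "answer", "category"])
    writer.writeheader()
    writer.writerow({"question": "What is 2+2?", "answer": "4", "category": "math"})

with csv_path.open(newline="") as f:
    rows = list(csv.DictReader(f))

print(len(loaded), rows[0]["answer"])  # 2 4
```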
```python
# Import from JSON or JSONL file
result = dataset.insert_from_json("./data.json")
print(f"Created: {result['created']}, Skipped: {result['skipped']}")

# Import from CSV
result = dataset.insert_from_csv(
    "./qa_pairs.csv",
    column_mapping={
        "input_column": "question",
        "expected_column": "answer",
        "metadata_columns": ["category", "difficulty"],
    },
)
```

```typescript
// Import from JSON or JSONL file
const result = await dataset.insertFromJson("./data.json");
console.log(`Created: ${result.created}, Skipped: ${result.skipped}`);

// Import from CSV
const csvResult = await dataset.insertFromCsv("./qa_pairs.csv", {
  inputColumn: "question",
  expectedColumn: "answer",
  metadataColumns: ["category", "difficulty"],
});
```

### Import from Production Data
Create dataset items directly from production traces or spans, with no manual data collection needed.
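Conceptually, a trace's captured input becomes the item's `input` and its recorded output becomes the item's `expected` value. A standalone sketch of that mapping, using an entirely hypothetical trace shape (the real trace schema may differ):

```python
# Hypothetical trace records; actual trace fields may differ.
traces = [
    {"trace_id": "trace_id_1", "input": {"question": "What is 2+2?"}, "output": {"answer": "4"}},
    {"trace_id": "trace_id_2", "input": {"question": "Capital of France?"}, "output": {"answer": "Paris"}},
]


def trace_to_item(trace: dict) -> dict:
    # The trace's output becomes the item's expected value;
    # the trace ID is kept in metadata for provenance.
    return {
        "input": trace["input"],
        "expected": trace["output"],
        "metadata": {"source_trace_id": trace["trace_id"]},
    }


items = [trace_to_item(t) for t in traces]
print(items[0]["metadata"])  # {'source_trace_id': 'trace_id_1'}
```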
```python
# Create items from production traces
result = dataset.from_traces(["trace_id_1", "trace_id_2"])
print(f"Created {result['created']} items from traces")

# Create items from spans
result = dataset.from_spans(["span_id_1", "span_id_2"])
```

```typescript
// Create items from production traces
const result = await dataset.fromTraces(["trace_id_1", "trace_id_2"]);
console.log(`Created ${result.created} items from traces`);

// Create items from spans
const spanResult = await dataset.fromSpans(["span_id_1", "span_id_2"]);
```

## Dataset Versioning
Versions create point-in-time snapshots of your dataset for reproducible evaluations.
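The semantics can be pictured with a toy model, not the SDK implementation: a snapshot is a frozen copy of the live items, and pinning switches reads to that copy.

```python
import copy


class MiniDataset:
    """Toy model of versioning: snapshots are frozen copies of live items."""

    def __init__(self):
        self.items = []
        self.versions = {}  # version number -> frozen snapshot
        self.pinned = None  # version number, or None for live items

    def create_version(self) -> int:
        version = len(self.versions) + 1
        self.versions[version] = copy.deepcopy(self.items)
        return version

    def current_items(self):
        if self.pinned is not None:
            return self.versions[self.pinned]
        return self.items


ds = MiniDataset()
ds.items.append({"input": {"q": "2+2?"}})
v1 = ds.create_version()
ds.items.append({"input": {"q": "Capital of France?"}})

ds.pinned = v1
print(len(ds.current_items()))  # 1  (frozen snapshot)
ds.pinned = None
print(len(ds.current_items()))  # 2  (live items)
```

Pinning is what makes an evaluation reproducible: reads come from the snapshot even as new items land in the live dataset.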
```python
# Create a version snapshot
version = dataset.create_version(
    description="Pre-training dataset v1",
    metadata={"experiment_id": "exp_123"},
)
print(f"Version {version['version']} with {version['item_count']} items")

# List all versions
versions = dataset.list_versions()

# Get items from a specific version
result = dataset.get_version_items("version_id", limit=50, offset=0)

# Pin dataset to a version (ensures reproducibility)
dataset.pin_version(version_id="version_id")

# Unpin to return to live items
dataset.pin_version(version_id=None)
```

```typescript
// Create a version snapshot
const version = await dataset.createVersion({
  description: "Pre-training dataset v1",
  metadata: { experiment_id: "exp_123" },
});
console.log(`Version ${version.version} with ${version.item_count} items`);

// List all versions
const versions = await dataset.listVersions();

// Get items from a specific version
const { items, total } = await dataset.getVersionItems("version_id", {
  limit: 50,
  offset: 0,
});

// Pin dataset to a version (ensures reproducibility)
await dataset.pinVersion({ versionId: "version_id" });

// Unpin to return to live items
await dataset.pinVersion({ versionId: null });
```

## DatasetsManager Reference
| Method | Parameters | Returns | Description |
|---|---|---|---|
| `create` | `name`, `description?`, `metadata?` | `Dataset` | Create a new dataset |
| `get` | `datasetId` | `Dataset` | Fetch a dataset by ID |
| `list` | `limit?` (default: 50), `page?` (default: 1) | `Dataset[]` | Paginated list of datasets |
| `update` | `datasetId`, `name?`, `description?`, `metadata?` | `Dataset` | Update dataset metadata |
| `delete` | `datasetId` | `void` | Delete a dataset |
## Dataset Object Reference
| Method | Parameters | Returns | Description |
|---|---|---|---|
| `insert` | `items[]` of `{input, expected?, metadata?}` | `number` | Insert items; returns count created |
| `getItems` | `limit?`, `page?` | `DatasetItem[]` | Fetch items with pagination |
| `count` | — | `number` | Get total item count |
| `insertFromJson` | `filePath`, `options?` | `BulkImportResult` | Import from JSON/JSONL file |
| `insertFromCsv` | `filePath`, `columnMapping`, `options?` | `BulkImportResult` | Import from CSV file |
| `fromTraces` | `traceIds[]`, `options?` | `BulkImportResult` | Create items from production traces |
| `fromSpans` | `spanIds[]`, `options?` | `BulkImportResult` | Create items from production spans |
| `createVersion` | `description?`, `metadata?` | `DatasetVersion` | Create a version snapshot |
| `listVersions` | — | `DatasetVersion[]` | List all versions |
| `getVersion` | `versionId` | `DatasetVersion` | Get a version by ID |
| `getVersionItems` | `versionId`, `limit?`, `offset?` | `{items, total}` | Get items for a version |
| `pinVersion` | `versionId` (or `null` to unpin) | `DatasetWithVersionInfo` | Pin dataset to a version |
| `getInfo` | — | `DatasetWithVersionInfo` | Get dataset with version details |
| `toJson` | `filePath` | `void` | Export items to a JSON file |
| `export` | — | `DatasetItem[]` | Export all items as an array |
## Dataset Properties
| Property | Type | Description |
|---|---|---|
| `id` | `string` | Dataset ID (ULID) |
| `name` | `string` | Dataset name |
| `description` | `string?` | Optional description |
| `metadata` | `object?` | Optional metadata |
| `createdAt` | `string` | ISO timestamp |
| `updatedAt` | `string` | ISO timestamp |
## Related
- Evaluation & Experiments — Run experiments against datasets
- Span Query — Query production spans to populate datasets
- Evaluation Concepts — Conceptual overview of evaluation workflows