
# Overview

> Use datasets to store prompt test cases, production examples, and regression coverage

A dataset is a structured collection of test cases for your prompts. Each row is one case. Each column holds a value that the prompt, an evaluator, or a review workflow can use.

Use datasets when you want to test a prompt against more than a single hand-picked Playground run. Rows can come from golden examples, CSV imports, multimodal inputs, production logs, generated cases, and known regressions.

<img src="https://mintcdn.com/adaline/6qZ1-Sm8NeEttI_w/images/evaluate/dataset-manual-entry.png?fit=max&auto=format&n=6qZ1-Sm8NeEttI_w&q=85&s=4520c1753ea6f82d18f4c709d03c61ac" alt="Dataset table showing manually entered rows and columns" title="Dataset table in Adaline" style={{ width: "100%" }} width="3134" height="1191" data-path="images/evaluate/dataset-manual-entry.png" />

## What Belongs In A Dataset

| Case type               | Use it for                                                                   |
| ----------------------- | ---------------------------------------------------------------------------- |
| **Golden examples**     | Canonical inputs that should keep passing.                                   |
| **Edge cases**          | Ambiguous, long, malformed, unsafe, or uncommon requests.                    |
| **Production examples** | Real spans copied from Monitor when users expose a useful case.              |
| **Regression cases**    | Failures that should be tested before the next release.                      |
| **Synthetic cases**     | Generated variants that broaden coverage around a Behavior or Improve cycle. |

Keep rows specific. A dataset row should make it clear what input is being tested and what good output looks like.

## How Datasets Work

Dataset columns usually map to prompt variables by name. If a prompt expects `{{request_genre}}`, the dataset should have a `request_genre` column. Extra columns can hold expected output, labels, notes, IDs, or evaluator context.
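
The name matching described above can be checked with a few lines of Python before you run an evaluation. This is a minimal sketch, not an Adaline API: `missing_columns`, the `{{name}}` regular expression, and the `reader_age` variable are assumptions made for illustration.

```python
import re

# Matches {{variable_name}} placeholders, allowing optional inner whitespace.
# The exact placeholder grammar is an assumption for this sketch.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def missing_columns(prompt_template: str, columns: list[str]) -> list[str]:
    """Return prompt variables that have no matching dataset column."""
    # dict.fromkeys de-duplicates while keeping first-seen order.
    variables = dict.fromkeys(PLACEHOLDER.findall(prompt_template))
    available = set(columns)
    return [name for name in variables if name not in available]


# Hypothetical prompt and dataset header.
template = "Recommend a {{request_genre}} book for a reader aged {{reader_age}}."
columns = ["request_genre", "expected_output", "notes"]

print(missing_columns(template, columns))  # ['reader_age']
```

Extra columns such as `expected_output` and `notes` are ignored by the check, which matches the idea that they exist for evaluators and reviewers rather than for the prompt itself.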

Columns can be:

| Column type        | Use it when                                                     |
| ------------------ | --------------------------------------------------------------- |
| **Static**         | The value is typed, imported, or copied into the dataset.       |
| **Dynamic API**    | The value should be fetched from your API per row.              |
| **Dynamic prompt** | The value should be generated by another prompt in the project. |
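
One way to picture the difference between the column types is to treat a static column as a stored value and a dynamic column as a function evaluated once per row. The sketch below is only a mental model under that assumption, not how the platform implements dynamic columns; `resolve_row`, `fake_api_lookup`, and `fake_prompt` are hypothetical stand-ins that make no network or model calls.

```python
from typing import Callable, Union

Row = dict[str, str]
# A column is either a fixed value or a function of the row's static values.
Column = Union[str, Callable[[Row], str]]


def resolve_row(columns: dict[str, Column]) -> Row:
    """Fill static values first, then compute dynamic ones from them."""
    row: Row = {name: value for name, value in columns.items() if isinstance(value, str)}
    static = dict(row)  # dynamic columns only see the static values
    for name, value in columns.items():
        if callable(value):
            row[name] = value(static)
    return row


def fake_api_lookup(row: Row) -> str:
    # Stand-in for a per-row API call (hypothetical).
    return f"profile-for-{row['user_id']}"


def fake_prompt(row: Row) -> str:
    # Stand-in for the output of another prompt (hypothetical).
    return f"Summary of: {row['request']}"


resolved = resolve_row({
    "user_id": "u-17",                # static
    "request": "refund my order",     # static
    "user_profile": fake_api_lookup,  # dynamic API
    "request_summary": fake_prompt,   # dynamic prompt
})
print(resolved["user_profile"])     # profile-for-u-17
print(resolved["request_summary"])  # Summary of: refund my order
```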

Dataset cells can hold text, images, or PDFs. Use multimodal cells when your prompt consumes files or visual context.

## Common Workflows

<Steps>
  <Step title="Create or import rows">
    Add rows manually, upload a CSV, or copy a useful production span from Monitor.
  </Step>

  <Step title="Match columns to prompt variables">
    Make sure required prompt variables have matching dataset columns.
  </Step>

  <Step title="Add expected output or labels">
    Store what the evaluator should check, what the reviewer should notice, or why the case matters.
  </Step>

  <Step title="Run evaluations">
    Use the dataset with one or more evaluators to score prompt output.
  </Step>

  <Step title="Keep useful failures">
    Promote important production failures or Improve evidence into long-lived regression coverage.
  </Step>
</Steps>
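
The steps above can be sketched end to end in plain Python. This is an illustrative outline under stated assumptions, not the platform's evaluation engine: `run_prompt` is a hypothetical stand-in for a model call, `exact_match` is a deliberately simple evaluator, and the CSV content is made up for the example.

```python
import csv
import io

# Step 1: import rows (an inline CSV stands in for an uploaded file).
CSV_TEXT = """request_genre,expected_output
mystery,MYSTERY
romance,ROMANCE
poetry,VERSE
"""


def run_prompt(row: dict) -> str:
    # Hypothetical stand-in for a model call: upper-cases the genre.
    return row["request_genre"].upper()


def exact_match(output: str, row: dict) -> bool:
    # Step 3: the expected_output column tells the evaluator what to check.
    return output == row["expected_output"]


def evaluate(rows: list[dict]) -> list[dict]:
    # Step 4: run the prompt on every row and score the output.
    results = []
    for row in rows:
        output = run_prompt(row)
        results.append({**row, "output": output, "passed": exact_match(output, row)})
    return results


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
results = evaluate(rows)

# Step 5: keep the failures as candidate regression cases.
regressions = [r for r in results if not r["passed"]]
print(f"{len(results) - len(regressions)}/{len(results)} passed")  # 2/3 passed
print([r["request_genre"] for r in regressions])                   # ['poetry']
```

Step 2, matching columns to prompt variables, is implicit here because `run_prompt` reads the `request_genre` column directly.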

## Where Datasets Fit

Datasets connect the rest of the Platform:

* **Prompts** use dataset rows as repeatable inputs.
* **Evaluators** score prompt responses against dataset cases.
* **Monitor** turns real production spans into dataset rows.
* **Behaviors** reveal repeated patterns worth preserving as coverage.
* **Improve** uses linked and generated datasets to compare prompt candidates before review.

The goal is not to build the biggest dataset. The goal is to keep the examples that make release decisions clearer.

## Next Steps

<CardGroup cols={2}>
  <Card title="Set up a dataset" icon="database" href="/evaluate/setup-dataset">
    Create a dataset, add rows, and map columns to prompt variables.
  </Card>

  <Card title="Import CSV into dataset" icon="upload" href="/evaluate/import-csv-into-dataset">
    Bulk-import text, image, or PDF test cases.
  </Card>

  <Card title="Use multimodal cells" icon="images" href="/evaluate/different-modalities-in-dataset">
    Add text, image, and PDF values to dataset rows.
  </Card>

  <Card title="Use dynamic columns" icon="zap" href="/evaluate/dynamic-columns-in-dataset">
    Fetch row values from APIs or other prompts.
  </Card>

  <Card title="Build datasets from logs" icon="list-tree" href="/monitor/build-logs-from-dataset">
    Preserve useful production spans as test cases.
  </Card>

  <Card title="Evaluate prompts" icon="play" href="/evaluate/evaluate-prompts">
    Run prompts against datasets and evaluators.
  </Card>
</CardGroup>
