> ## Documentation Index
> Fetch the complete documentation index at: https://www.adaline.ai/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# Overview

> Use datasets to store prompt test cases, production examples, and regression coverage

Datasets are structured test cases for your prompts. Each row is one case. Each column is a value the prompt, evaluator, or review workflow can use.

Use datasets when you want to test a prompt against more than one hand-picked Playground run: golden examples, CSV imports, multimodal inputs, production logs, generated cases, and known regressions.

Dataset table showing manually entered rows and columns

## What Belongs In A Dataset

| Case type               | Use it for                                                                   |
| ----------------------- | ---------------------------------------------------------------------------- |
| **Golden examples**     | Canonical inputs that should keep passing.                                   |
| **Edge cases**          | Ambiguous, long, malformed, unsafe, or uncommon requests.                    |
| **Production examples** | Real spans copied from Monitor when users expose a useful case.              |
| **Regression cases**    | Failures that should be tested before the next release.                      |
| **Synthetic cases**     | Generated variants that broaden coverage around a Behavior or Improve cycle. |

Keep rows specific. A dataset row should make it clear what input is being tested and what good output looks like.

## How Datasets Work

Dataset columns usually map to prompt variables. If a prompt expects `{{request_genre}}`, the dataset should have a `request_genre` column. Extra columns can hold expected output, labels, notes, IDs, or evaluator context.

Columns can be:

| Column type        | Use it when                                                     |
| ------------------ | --------------------------------------------------------------- |
| **Static**         | The value is typed, imported, or copied into the dataset.       |
| **Dynamic API**    | The value should be fetched from your API per row.              |
| **Dynamic prompt** | The value should be generated by another prompt in the project. |

Datasets can also contain text, images, and PDFs. Use multimodal cells when your prompt consumes files or visual context.

## Common Workflows

1. Add rows manually, upload a CSV, or copy a useful production span from Monitor.
2. Make sure required prompt variables have matching dataset columns.
3. Store what the evaluator should check, what the reviewer should notice, or why the case matters.
4. Use the dataset with one or more evaluators to score prompt output.
5. Promote important production failures or Improve evidence into long-lived regression coverage.

## Where Datasets Fit

Datasets connect the rest of the Platform:

* **Prompts** use dataset rows as repeatable inputs.
* **Evaluators** score prompt responses against dataset cases.
* **Monitor** turns real production spans into dataset rows.
* **Behaviors** reveal repeated patterns worth preserving as coverage.
* **Improve** uses linked and generated datasets to compare prompt candidates before review.

The goal is not to build the biggest dataset. The goal is to keep the examples that make release decisions clearer.

## Next Steps

* Create a dataset, add rows, and map columns to prompt variables.
* Bulk-import text, image, or PDF test cases.
* Add text, image, and PDF values to dataset rows.
* Fetch row values from APIs or other prompts.
* Preserve useful production spans as test cases.
* Run prompts against datasets and evaluators.
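
As a supplementary sketch of the column-to-variable mapping described under How Datasets Work, the snippet below checks a CSV header against the `{{variable}}` placeholders in a prompt before the file is uploaded. It is a local pre-flight check only and does not call any Adaline API. The helper names (`prompt_variables`, `missing_columns`), the sample template, the `last_read` variable, and the assumption that placeholders are simple identifiers inside double braces are all illustrative.

```python
import csv
import io
import re

# Assumption: placeholders look like {{name}}, where name is a simple identifier.
VARIABLE_PATTERN = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def prompt_variables(template: str) -> set[str]:
    """Return the set of {{variable}} names found in a prompt template."""
    return set(VARIABLE_PATTERN.findall(template))


def missing_columns(template: str, csv_text: str) -> set[str]:
    """Return prompt variables that have no matching column in the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = set(reader.fieldnames or [])
    return prompt_variables(template) - columns


# Hypothetical prompt and dataset, for illustration only.
template = "Recommend a {{request_genre}} book for a reader who liked {{last_read}}."
csv_text = (
    "request_genre,expected_output,notes\n"
    "mystery,A classic whodunit,golden example\n"
)

print(sorted(missing_columns(template, csv_text)))  # prints ['last_read']
```

Here `expected_output` and `notes` are extra columns, which the check ignores; only prompt variables without a matching column are reported.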