Managing Tests

Tests are LLM prompts used to evaluate model behavior. They are at the core of the OWL evaluation pipeline (see Evaluating Model Behavior for how this works) and are also available for prompting (see Workbench – Prompting).

Publicly available tests are accessible via the TESTS link in the top menu, which takes you to the Public Tests page. To create and manage your own tests, click My Tests, which opens the My Tests page.

To create a test, click + Create New Test on the My Tests page and provide a title, a prompt, and (optionally) sources. If you don't want the test to be available to other users, uncheck the "Make this public" box. Then click Create Test. (OWL implements a review and approval process for public tests, so a new public test is available to you right away, but there is a short delay before it becomes available to others.)

To edit or delete a test, select it on the My Tests page, then choose the appropriate option.

Tests can be about anything you wish, for example: