A Testset is a collection of Testcases used to evaluate the performance of an AI system across various inputs and scenarios. Each Testset is organized around a central theme, such as “Core Functionality”, “Edge Cases”, or “Customer Support Queries”. A Testcase is an individual test data point containing inputs, expected outputs, and metadata used for evaluation.

Testset details page showing eight Testcases for a "Message tone rewriter" LLM system.

Create a Testset

Go to the Testsets page in your project and click the “New Testset” button to create a new, empty Testset.
Checking “Add example Testcases” will automatically generate three sample Testcases with AI, based on your Testset’s description. This gives you a starting point for your Testset.

Create testset modal with AI generation option

Testset schema

Each Testset has a schema, which defines which fields a Testcase has, the type of each field, and the role each field plays in evaluation.
  • Input fields are sent to your AI system.
  • Expected fields hold the expected or ideal outputs that metrics compare your AI system’s output against.
  • Metadata fields are additional context for analysis, not used by evaluation or your system.
You can update the schema of a Testset by clicking the “Edit Schema” button in the Testset actions menu. This allows you to add or remove fields, modify field types, and update field descriptions. Existing Testcases are not modified, but are validated against the new schema.

Testset schema editor.

With the SDK

You can also create and update Testsets with the Scorecard SDK. You define a Testset’s schema using the JSON Schema format. For example, here’s a schema for a customer support system:
{
  "type": "object",
  "title": "Customer Support Schema",
  "properties": {
    "userQuery": {
      "type": "string",
      "description": "The customer's question or request"
    },
    "context": {
      "type": "string", 
      "description": "Additional context about the customer"
    },
    "ideal": {
      "type": "string",
      "description": "The ideal response from support"
    },
    "expectedSentiment": {
      "type": "string",
      "description": "The expected predicted sentiment of the user query."
    },
    "difficulty": {
      "type": "number",
      "description": "How difficult the customer support request is to solve (1-10)"
    }
  },
  "required": ["userQuery", "ideal"]
}
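To illustrate what this schema implies for individual Testcases, here is a minimal stdlib-only sketch of schema validation. This is not Scorecard’s actual validation logic, and the `validate_testcase` helper is hypothetical; it only demonstrates the `required` and type constraints from the schema above.

```python
# Illustrative sketch: checking a Testcase against the customer support
# schema above. Not Scorecard's validator; names here are hypothetical.

SCHEMA = {
    "required": ["userQuery", "ideal"],
    "properties": {
        "userQuery": str,
        "context": str,
        "ideal": str,
        "expectedSentiment": str,
        "difficulty": (int, float),  # JSON "number" covers ints and floats
    },
}

def validate_testcase(testcase: dict) -> list[str]:
    """Return a list of validation errors (an empty list means valid)."""
    errors = []
    for field in SCHEMA["required"]:
        if field not in testcase:
            errors.append(f"missing required field: {field}")
    for field, value in testcase.items():
        expected_type = SCHEMA["properties"].get(field)
        if expected_type and not isinstance(value, expected_type):
            errors.append(f"wrong type for {field}: {type(value).__name__}")
    return errors

print(validate_testcase({"userQuery": "How do I cancel?", "ideal": "..."}))  # []
print(validate_testcase({"userQuery": "Hi"}))  # ['missing required field: ideal']
```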
Supported data types
  • string: Text content
  • number: Numeric values (integers or floats)
  • boolean: True or false values
  • object: Nested JSON objects
  • array: Lists of JSON values
You also need to define the field mapping when creating a Testset with the SDK. A field mapping categorizes schema fields by their role in evaluation. For example, here’s a field mapping for the customer support schema above:
{
  "inputs": ["userQuery", "context"],
  "expected": ["ideal", "expectedSentiment"], 
  "metadata": ["difficulty"]
}
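To make the roles concrete, here is a small sketch (the `split_by_role` helper is illustrative, not part of the SDK) showing how the field mapping above groups a Testcase’s fields by their role in evaluation:

```python
# Sketch: grouping a Testcase's fields by evaluation role, using the
# field mapping for the customer support schema above.
FIELD_MAPPING = {
    "inputs": ["userQuery", "context"],
    "expected": ["ideal", "expectedSentiment"],
    "metadata": ["difficulty"],
}

def split_by_role(testcase: dict) -> dict:
    """Return the Testcase's fields grouped as inputs/expected/metadata."""
    return {
        role: {f: testcase[f] for f in fields if f in testcase}
        for role, fields in FIELD_MAPPING.items()
    }

testcase = {
    "userQuery": "How do I cancel my order?",
    "ideal": "You can cancel orders within 2 hours...",
    "difficulty": 3,
}
print(split_by_role(testcase))
```

Only the `inputs` fields are sent to your AI system; the `expected` fields are handed to metrics, and `metadata` is carried along for analysis.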

Create and edit Testcases

From the UI

Click the “New Testcase” button to create a new Testcase matching your Testset’s schema.

Testcase creation modal.

You can edit a particular Testcase’s fields from the Testcase table, or from the Testcase details page.

Testcase details page.

Importing from a file

The Scorecard UI supports importing Testcases in CSV, TSV, JSON, and JSONL formats. Scorecard automatically maps your file’s columns to the Testset’s schema fields and validates the data against the schema.
userQuery,context,ideal,category
"How do I cancel my order?","Order placed 1 hour ago","You can cancel orders within 2 hours...","cancellation"
"Where is my package?","Order shipped yesterday","Track your package using the link...","tracking"
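For reference, the header row of a CSV like the one above supplies the field names, and each subsequent row becomes one Testcase. A stdlib-only sketch of that mapping (not Scorecard’s importer):

```python
import csv
import io

# Sketch: how rows of the CSV example above map to Testcase dicts.
# This mirrors the header-to-field mapping, not Scorecard's import code.
csv_data = '''userQuery,context,ideal,category
"How do I cancel my order?","Order placed 1 hour ago","You can cancel orders within 2 hours...","cancellation"
"Where is my package?","Order shipped yesterday","Track your package using the link...","tracking"
'''

testcases = list(csv.DictReader(io.StringIO(csv_data)))
print(len(testcases))             # 2
print(testcases[0]["userQuery"])  # How do I cancel my order?
```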

Using the API

With our SDKs, you can create, update, and delete Testcases.

Export Testset

You can export a Testset’s Testcases to a CSV file by clicking the Export as CSV button in the Testset’s details page.

Other Testset features

Testset tags

You can add custom tags to your Testsets to categorize them, for example `regression` or `edge-cases`.

Duplicate Testset

You can create a copy of a Testset by clicking the “Duplicate” button in the Testset actions menu. This preserves the original field mapping, so it’s useful for creating variants of your Testsets without having to recreate the schema.

Best practices

Testset strategy by use case
Remember that testcase data may contain sensitive information. Follow your organization’s data handling policies and avoid including PII, secrets, or confidential data in Testsets.