To develop controllable and safe LLM applications, you can integrate Scorecard with the open-source toolkits NeMo Guardrails and Guardrails AI.
This guide assumes you have created a Metric with evalType = heuristic and outputType = boolean in your Project, and that you have a RUN_ID for grouping related interactions. When your guardrail fires (in NeMo Guardrails or Guardrails AI), create a Record for the event and upsert a boolean Score.
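The flow described here can be sketched as follows. This is a minimal illustration, not the Scorecard SDK: `FakeScorecardClient`, `create_record`, `upsert_score`, `blocks_prompt`, the IDs, and the banned-phrase check are all hypothetical stand-ins. In a real integration the check would be a NeMo Guardrails or Guardrails AI rail, and the two client calls would go to Scorecard's record-creation and scores.upsert endpoints. The sketch also assumes that every interaction is logged and that binaryScore is true when the interaction passes the guardrail.

```python
from dataclasses import dataclass, field
from itertools import count

RUN_ID = "run-123"                # assumption: ID of an existing Run
GUARDRAIL_METRIC_ID = "metric-1"  # assumption: ID of the heuristic/boolean Metric


@dataclass
class FakeScorecardClient:
    """Hypothetical in-memory stand-in for a Scorecard client (not the real SDK)."""
    records: dict = field(default_factory=dict)
    scores: dict = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def create_record(self, run_id, inputs, outputs):
        # One record per logged interaction, grouped under the Run.
        record_id = f"rec-{next(self._ids)}"
        self.records[record_id] = {"runId": run_id, "inputs": inputs, "outputs": outputs}
        return record_id

    def upsert_score(self, record_id, metric_config_id, score):
        # Keyed by (recordId, metricConfigId), so repeating the call overwrites.
        self.scores[(record_id, metric_config_id)] = score


def blocks_prompt(prompt):
    """Toy guardrail: fires when the prompt contains a banned phrase."""
    return "ignore previous instructions" in prompt.lower()


def log_guardrail_event(client, prompt, response):
    """Create a record for the interaction and attach a boolean score."""
    fired = blocks_prompt(prompt)
    record_id = client.create_record(RUN_ID, {"prompt": prompt}, {"response": response})
    client.upsert_score(record_id, GUARDRAIL_METRIC_ID, {
        "binaryScore": not fired,  # assumption: True means the guardrail passed
        "reasoning": "guardrail fired" if fired else "guardrail passed",
    })
    return record_id
```

Calling `log_guardrail_event(FakeScorecardClient(), prompt, response)` once per interaction leaves one record and one boolean score for it in the stand-in client.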
If a Record already exists for the interaction, reuse its record.id and only call scores.upsert. A Score is uniquely keyed by (recordId, metricConfigId) and is safe to upsert repeatedly.
Prefer one Testrecord per interaction and multiple Metrics/Scores per record (one per guardrail), instead of creating multiple records for the same interaction.
A Run groups many Testrecords (each interaction you log creates one record).
For each Testrecord, you can have multiple Scores, one per Metric attached to the Run.
Depending on the Metric, Scores may be created automatically; otherwise, upsert the Score yourself, like in the example above.
Boolean metrics should use { binaryScore: true|false, reasoning: string }; integer metrics should use { intScore: number, reasoning: string } so pass/fail and aggregations work out of the box.
Types of AI Guardrails with NeMo Guardrails
Examples of AI Guardrails with Guardrails AI (Guardrails AI, 2024)
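The keying and payload rules described earlier can be illustrated with a small sketch. It models a score store as a dictionary keyed by (recordId, metricConfigId), so repeated upserts overwrite rather than duplicate, and a single record carries one score per guardrail metric. The helper names (`boolean_score`, `integer_score`, `upsert`) and all IDs are hypothetical; they mirror the shapes named in the text and are not Scorecard SDK calls.

```python
def boolean_score(passed: bool, reasoning: str) -> dict:
    """Payload shape for boolean metrics."""
    return {"binaryScore": bool(passed), "reasoning": reasoning}


def integer_score(value: int, reasoning: str) -> dict:
    """Payload shape for integer metrics."""
    return {"intScore": int(value), "reasoning": reasoning}


def upsert(store: dict, record_id: str, metric_config_id: str, score: dict) -> None:
    """Insert or overwrite the score for this (record, metric) pair."""
    store[(record_id, metric_config_id)] = score


store = {}
# Same record and metric twice: the second call replaces the first.
upsert(store, "rec-1", "toxicity-rail", boolean_score(False, "first check"))
upsert(store, "rec-1", "toxicity-rail", boolean_score(True, "re-checked"))
# Same record, different metrics: one score per guardrail.
upsert(store, "rec-1", "pii-rail", boolean_score(True, "no PII found"))
upsert(store, "rec-1", "severity", integer_score(2, "minor issue"))
```

After these calls the store holds three scores for the single record `rec-1`, one per metric, which is the one-record-many-scores layout recommended above.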