Evaluators
- class EvaluatorsClient(*, sdk_config: SDKConfiguration, generated_client: ApiClient)
Bases: object
Client for managing Arize evaluators and evaluator versions.
This class is primarily intended for internal use within the SDK. Users are strongly encouraged to access resource-specific functionality via arize.ArizeClient.
The evaluators client is a thin wrapper around the generated REST API client, using the shared generated API client owned by arize.config.SDKConfiguration.
- Parameters:
sdk_config (SDKConfiguration) – Resolved SDK configuration.
generated_client (ApiClient) – Shared generated API client instance.
- list(*, name: str | None = None, space: str | None = None, limit: int = DEFAULT_LIST_LIMIT, cursor: str | None = None) -> EvaluatorListResponse
List evaluators the user has access to.
Results are sorted by update date (most recent first). This endpoint supports cursor-based pagination. When space is provided, results are limited to that space; otherwise evaluators from all permitted spaces are returned.
- Parameters:
name (str | None) – Optional case-insensitive substring filter on the evaluator name.
space (str | None) – Optional space filter. If the value is a base64-encoded resource ID it is treated as a space ID; otherwise it is used as a case-insensitive substring filter on the space name.
limit (int) – Maximum number of evaluators to return (1-100).
cursor (str | None) – Opaque pagination cursor from a previous response.
- Returns:
A paginated evaluator list response from the Arize REST API.
- Raises:
ApiException – If the API request fails.
- Return type:
EvaluatorListResponse
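The cursor-based pagination described above can be sketched as a simple loop. This is a minimal illustration only: `FakePage`, its `items` and `next_cursor` fields, and `fake_list` are hypothetical stand-ins, since the attribute names on `EvaluatorListResponse` are not shown in this reference. Only the `limit` and `cursor` keyword arguments come from the signature above.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass
class FakePage:
    """Hypothetical stand-in for a paginated response (field names are assumptions)."""
    items: List[str]
    next_cursor: Optional[str]


def iter_all(fetch_page: Callable[..., FakePage], limit: int = 2) -> Iterator[str]:
    """Yield every item by passing each returned cursor into the next call."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(limit=limit, cursor=cursor)
        yield from page.items
        cursor = page.next_cursor
        if cursor is None:  # no further pages
            break


# Hypothetical in-memory backend standing in for a real list(...) call.
_DATA = ["eval-a", "eval-b", "eval-c", "eval-d", "eval-e"]


def fake_list(*, limit: int, cursor: Optional[str] = None) -> FakePage:
    """Return one page; the cursor here is just a stringified offset."""
    start = int(cursor) if cursor is not None else 0
    end = start + limit
    next_cursor = str(end) if end < len(_DATA) else None
    return FakePage(items=_DATA[start:end], next_cursor=next_cursor)


all_names = list(iter_all(fake_list, limit=2))
```

With a real client, the callable passed to `iter_all` would wrap the `list` method, and the cursor should be treated as opaque rather than parsed as it is in this fake backend.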
- get(*, evaluator: str, space: str | None = None, version_id: str | None = None) -> EvaluatorWithVersion
Get an evaluator by name or ID, with its resolved version.
By default, the latest version is returned. Pass version_id to resolve a specific version instead.
- Parameters:
evaluator (str)
space (str | None)
version_id (str | None)
- Returns:
The evaluator with its resolved version.
- Raises:
ApiException – If the API request fails (for example, evaluator not found).
- Return type:
EvaluatorWithVersion
- create_template_evaluator(*, name: str, space: str, commit_message: str, template_config: TemplateConfig, description: str | None = None) -> EvaluatorWithVersion
Create a new template evaluator with an initial version.
The evaluator name must be unique within the given space.
- Parameters:
name (str) – Evaluator name (must be unique within the space).
space (str) – Space name or ID to create the evaluator in.
commit_message (str) – Commit message for the initial version.
template_config (TemplateConfig) – Template configuration for the evaluator. Build with arize.evaluators.types.TemplateConfig.
Required fields:
  - name: eval column name; must match ^[a-zA-Z0-9_\s\-&()]+$.
  - template: prompt template string with {variable} placeholders referencing span/trace attributes.
  - include_explanations: whether the LLM should include a reasoning explanation alongside the score.
  - use_function_calling_if_available: prefer structured function-call output over free-text parsing when the model supports it.
  - llm_config: arize.evaluators.types.EvaluatorLlmConfig specifying the model provider, model name, and API key.
Optional fields: classification_choices, direction, data_granularity.
description (str | None) – Optional human-readable description of the evaluator.
- Returns:
The created evaluator with its initial version.
- Raises:
ApiException – If the API request fails (for example, name conflict or invalid payload).
- Return type:
EvaluatorWithVersion
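The constraints on the template configuration's name and template fields can be checked locally before calling the API. The sketch below uses only the standard library. It assumes the doubled backslashes in the documented pattern are docstring escaping, so the intended pattern is `^[a-zA-Z0-9_\s\-&()]+$`, and that `{variable}` placeholders follow Python's format-string syntax. The helper names `is_valid_eval_name` and `template_variables` are hypothetical and not part of the SDK.

```python
import re
from string import Formatter
from typing import List

# Assumed un-escaped form of the documented name pattern.
NAME_PATTERN = re.compile(r"^[a-zA-Z0-9_\s\-&()]+$")


def is_valid_eval_name(name: str) -> bool:
    """Return True if the eval column name matches the documented pattern."""
    return NAME_PATTERN.match(name) is not None


def template_variables(template: str) -> List[str]:
    """Return the distinct {variable} placeholder names, in order of first use."""
    seen: List[str] = []
    for _literal, field_name, _spec, _conv in Formatter().parse(template):
        if field_name and field_name not in seen:
            seen.append(field_name)
    return seen


ok = is_valid_eval_name("Relevance (v2) & tone")
bad = is_valid_eval_name("bad/name")
variables = template_variables("Is {output} relevant to {input}? Input: {input}")
```

A check like this only catches obvious mistakes early; the server remains the authority on what it accepts.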
- create_code_evaluator(*, name: str, space: str, commit_message: str, code_config: CodeConfig | CustomCodeConfig | ManagedCodeConfig | dict, description: str | None = None) -> EvaluatorWithVersion
Create a new code evaluator with an initial version.
The evaluator name must be unique within the given space.
- Parameters:
name (str) – Evaluator name (must be unique within the space).
space (str) – Space name or ID to create the evaluator in.
commit_message (str) – Commit message for the initial version.
code_config (CodeConfig | CustomCodeConfig | ManagedCodeConfig | dict) – Code configuration for the evaluator. Accepts an arize.evaluators.types.CodeConfig wrapper, an unwrapped arize.evaluators.types.ManagedCodeConfig or arize.evaluators.types.CustomCodeConfig, or a plain dict matching one of those schemas.
description (str | None) – Optional human-readable description of the evaluator.
- Returns:
The created evaluator with its initial version.
- Raises:
ApiException – If the API request fails (for example, name conflict or invalid payload).
- Return type:
EvaluatorWithVersion
- update(*, evaluator: str, space: str | None = None, name: str | None = None, description: str | None = None) -> Evaluator
Update an evaluator’s metadata.
- Parameters:
evaluator (str) – Evaluator name or identifier (base64) to update.
space (str | None) – Optional space name or ID. Required when evaluator is a name rather than an ID.
name (str | None) – New evaluator name (must be unique within its space).
description (str | None) – New description for the evaluator.
- Returns:
The updated evaluator.
- Raises:
ApiException – If the API request fails.
- Return type:
Evaluator
- delete(*, evaluator: str, space: str | None = None) -> None
Delete an evaluator and all its versions.
This operation is irreversible.
- list_versions(*, evaluator: str, space: str | None = None, limit: int = DEFAULT_LIST_LIMIT, cursor: str | None = None) -> EvaluatorVersionListResponse
List all versions of an evaluator.
Results are returned with cursor-based pagination.
- Parameters:
evaluator (str) – Evaluator name or identifier (base64) to list versions for.
space (str | None) – Optional space name or ID. Required when evaluator is a name rather than an ID.
limit (int) – Maximum number of versions to return (1-100).
cursor (str | None) – Opaque pagination cursor from a previous response.
- Returns:
A paginated evaluator version list response.
- Raises:
ApiException – If the API request fails.
- Return type:
EvaluatorVersionListResponse
- get_version(*, version_id: str) -> EvaluatorVersionCode | EvaluatorVersionTemplate
Get a specific evaluator version by its global ID.
- Parameters:
version_id (str) – Evaluator version identifier (base64).
- Returns:
The evaluator version: an EvaluatorVersionCode for code evaluators (with code_config already unwrapped), or an EvaluatorVersionTemplate for template evaluators.
- Raises:
ApiException – If the API request fails (for example, version not found).
- Return type:
EvaluatorVersionCode | EvaluatorVersionTemplate
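Because the return value is a union, callers typically branch on the concrete type. The following sketch shows one way to do that with `isinstance`. The two dataclasses are local stand-ins defined only so the example runs on its own; they are not the SDK classes. The `id` and `template_config` fields are assumptions, while `code_config` is the only field named in the description above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class EvaluatorVersionCode:
    """Local stand-in for the SDK type (fields are assumptions)."""
    id: str
    code_config: dict


@dataclass
class EvaluatorVersionTemplate:
    """Local stand-in for the SDK type (fields are assumptions)."""
    id: str
    template_config: dict


def describe(version: Union[EvaluatorVersionCode, EvaluatorVersionTemplate]) -> str:
    """Branch on the concrete version type returned by a get_version-style call."""
    if isinstance(version, EvaluatorVersionCode):
        return f"code version {version.id}"
    if isinstance(version, EvaluatorVersionTemplate):
        return f"template version {version.id}"
    raise TypeError(f"unexpected version type: {type(version).__name__}")


code_label = describe(EvaluatorVersionCode(id="v1", code_config={}))
template_label = describe(EvaluatorVersionTemplate(id="v2", template_config={}))
```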
- create_template_version(*, evaluator: str, space: str | None = None, commit_message: str, template_config: TemplateConfig) -> EvaluatorVersionTemplate
Create a new template version of an existing evaluator.
The new version becomes the latest version immediately (versioning is append-only). Versions are immutable once created; to change the configuration, create a new version.
- Parameters:
evaluator (str) – Evaluator name or identifier (base64) to add a version to.
space (str | None) – Optional space name or ID. Required when evaluator is a name rather than an ID.
commit_message (str) – Commit message describing the changes in this version.
template_config (TemplateConfig) – Updated template configuration for this version. Build with arize.evaluators.types.TemplateConfig.
- Returns:
The newly created evaluator version.
- Raises:
ApiException – If the API request fails.
- Return type:
EvaluatorVersionTemplate
- create_code_version(*, evaluator: str, space: str | None = None, commit_message: str, code_config: CodeConfig | CustomCodeConfig | ManagedCodeConfig | dict) -> EvaluatorVersionCode
Create a new code version of an existing evaluator.
The new version becomes the latest version immediately (versioning is append-only). Versions are immutable once created; to change the configuration, create a new version.
- Parameters:
evaluator (str) – Evaluator name or identifier (base64) to add a version to.
space (str | None) – Optional space name or ID. Required when evaluator is a name rather than an ID.
commit_message (str) – Commit message describing the changes in this version.
code_config (CodeConfig | CustomCodeConfig | ManagedCodeConfig | dict) – Updated code configuration for this version. Accepts an arize.evaluators.types.CodeConfig wrapper, an unwrapped arize.evaluators.types.ManagedCodeConfig or arize.evaluators.types.CustomCodeConfig, or a plain dict matching one of those schemas.
- Returns:
The newly created evaluator version.
- Raises:
ApiException – If the API request fails.
- Return type:
EvaluatorVersionCode
Response Types
- class Evaluator(*, id: Annotated[str, Strict(strict=True)], name: Annotated[str, Strict(strict=True)], description: Annotated[str, Strict(strict=True)] | None = None, type: EvaluatorType, space_id: Annotated[str, Strict(strict=True)], created_at: datetime, updated_at: datetime, created_by_user_id: Annotated[str, Strict(strict=True)] | None)
Bases: BaseModel
An evaluator defines reusable evaluation logic that can be attached to evaluation tasks. The type field determines the kind of evaluation: template (LLM-based template evaluation) or code (custom code evaluation).
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
- id: StrictStr
- name: StrictStr
- type: EvaluatorType
- space_id: StrictStr
- created_at: datetime
- updated_at: datetime
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- classmethod from_json(json_str: str) -> Self | None
Create an instance of Evaluator from a JSON string.
- to_dict() -> Dict[str, Any]
Return the dictionary representation of the model using alias.
This has the following differences from calling pydantic’s self.model_dump(by_alias=True):
None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.
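The None-handling rule above can be illustrated without pydantic. This sketch is not the SDK's implementation: `to_dict_sketch` is a hypothetical helper that emulates the described behavior on a plain dict, where `fields_set` plays the role of the fields passed at initialization and `nullable` is an assumed set of nullable field names.

```python
from typing import Any, Dict, Set


def to_dict_sketch(
    values: Dict[str, Any],
    fields_set: Set[str],
    nullable: Set[str],
) -> Dict[str, Any]:
    """Drop None values, except for nullable fields that were explicitly set."""
    # Start by ignoring every field whose value is None.
    out = {key: value for key, value in values.items() if value is not None}
    # Re-add None only for nullable fields that were set at initialization.
    for key, value in values.items():
        if value is None and key in nullable and key in fields_set:
            out[key] = None
    return out


result = to_dict_sketch(
    values={"id": "abc", "description": None, "created_by_user_id": None},
    fields_set={"id", "created_by_user_id"},  # description was never passed
    nullable={"created_by_user_id"},          # assumption for this example
)
```

Here `description` is dropped because it was never set, while `created_by_user_id` keeps its explicit None.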
- class EvaluatorLlmConfig(*, ai_integration_id: Annotated[str, Strict(strict=True)], model_name: Annotated[str, Strict(strict=True)], invocation_parameters: InvocationParams, provider_parameters: ProviderParams)
Bases: BaseModel
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
invocation_parameters (InvocationParams)
provider_parameters (ProviderParams)
- ai_integration_id: StrictStr
- model_name: StrictStr
- invocation_parameters: InvocationParams
- provider_parameters: ProviderParams
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- classmethod from_json(json_str: str) -> Self | None
Create an instance of EvaluatorLlmConfig from a JSON string.
- to_dict() -> Dict[str, Any]
Return the dictionary representation of the model using alias.
This has the following differences from calling pydantic’s self.model_dump(by_alias=True):
None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.
- class EvaluatorWithVersion(*, id: str, name: str, description: str | None = None, type: EvaluatorType, space_id: str, created_at: datetime, updated_at: datetime, created_by_user_id: str | None = None, version: EvaluatorVersionCode | EvaluatorVersionTemplate)
Bases: BaseModel
SDK view of the generated EvaluatorWithVersion with version unwrapped.
The version field holds the concrete inner type (EvaluatorVersionCode for code evaluators, or EvaluatorVersionTemplate for template evaluators) instead of the oneOf wrapper.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
- type: EvaluatorType
- version: EvaluatorVersionCode | EvaluatorVersionTemplate
- model_config: ClassVar[ConfigDict] = {'from_attributes': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class TemplateConfig(*, name: Annotated[str, Strict(strict=True)], template: Annotated[str, Strict(strict=True)], include_explanations: Annotated[bool, Strict(strict=True)], use_function_calling_if_available: Annotated[bool, Strict(strict=True)], classification_choices: Dict[str, Annotated[float, Strict(strict=True)] | Annotated[int, Strict(strict=True)]] | None = None, direction: OptimizationDirection | None = None, data_granularity: DataGranularity | None = None, llm_config: EvaluatorLlmConfig)
Bases: BaseModel
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
use_function_calling_if_available (Annotated[bool, Strict(strict=True)])
classification_choices (Dict[str, Annotated[float, Strict(strict=True)] | Annotated[int, Strict(strict=True)]] | None)
direction (OptimizationDirection | None)
data_granularity (DataGranularity | None)
llm_config (EvaluatorLlmConfig)
- name: StrictStr
- template: StrictStr
- include_explanations: StrictBool
- use_function_calling_if_available: StrictBool
- direction: OptimizationDirection | None
- llm_config: EvaluatorLlmConfig
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- classmethod from_json(json_str: str) -> Self | None
Create an instance of TemplateConfig from a JSON string.
- to_dict() -> Dict[str, Any]
Return the dictionary representation of the model using alias.
This has the following differences from calling pydantic’s self.model_dump(by_alias=True):
None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.