Evaluators#
- class EvaluatorsClient(*, sdk_config: SDKConfiguration, generated_client: ApiClient)[source]#
Bases: object
Client for managing Arize evaluators and evaluator versions.
This class is primarily intended for internal use within the SDK. Users are highly encouraged to access resource-specific functionality via arize.ArizeClient.
The evaluators client is a thin wrapper around the generated REST API client, using the shared generated API client owned by arize.config.SDKConfiguration.
- Parameters:
sdk_config (SDKConfiguration) – Resolved SDK configuration.
generated_client (ApiClient) – Shared generated API client instance.
- list(*, name: str | None = None, space: str | None = None, limit: int = 100, cursor: str | None = None) EvaluatorsList200Response[source]#
List evaluators the user has access to.
Results are sorted by update date (most recent first). This endpoint supports cursor-based pagination. When space is provided, results are limited to that space; otherwise evaluators from all permitted spaces are returned.
- Parameters:
name (str | None) – Optional case-insensitive substring filter on the evaluator name.
space (str | None) – Optional space filter. If the value is a base64-encoded resource ID it is treated as a space ID; otherwise it is used as a case-insensitive substring filter on the space name.
limit (int) – Maximum number of evaluators to return (1-100).
cursor (str | None) – Opaque pagination cursor from a previous response.
- Returns:
A paginated evaluator list response from the Arize REST API.
- Raises:
ApiException – If the API request fails.
- Return type:
EvaluatorsList200Response
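The cursor-based pagination described for list() can be drained with a small loop. What follows is a minimal sketch rather than SDK code: iter_evaluators is a hypothetical helper, and the next_cursor attribute on the pagination metadata is an assumption (check PaginationMetadata for the actual field name). FakeClient only stands in for EvaluatorsClient so the snippet runs without network access.

```python
from types import SimpleNamespace
from typing import Any, Iterator, Optional


def iter_evaluators(client: Any, *, name: Optional[str] = None,
                    space: Optional[str] = None, limit: int = 100) -> Iterator[Any]:
    """Yield every evaluator by following pagination cursors.

    ASSUMPTION: the response exposes `pagination.next_cursor`, which is
    None (or empty) on the last page. That field name is hypothetical.
    """
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    cursor: Optional[str] = None
    while True:
        page = client.list(name=name, space=space, limit=limit, cursor=cursor)
        yield from page.evaluators
        cursor = getattr(page.pagination, "next_cursor", None)
        if not cursor:
            break


class FakeClient:
    """Offline stand-in for EvaluatorsClient; serves items in pages."""

    def __init__(self, items):
        self._items = list(items)

    def list(self, *, name=None, space=None, limit=100, cursor=None):
        start = int(cursor) if cursor else 0
        end = start + limit
        next_cursor = str(end) if end < len(self._items) else None
        return SimpleNamespace(
            evaluators=self._items[start:end],
            pagination=SimpleNamespace(next_cursor=next_cursor),
        )


names = list(iter_evaluators(FakeClient(["a", "b", "c", "d", "e"]), limit=2))
print(names)  # ['a', 'b', 'c', 'd', 'e']
```

Treating the cursor as opaque, as the parameter description asks, means the loop only passes it back unchanged and never inspects it; only the fake client above gives it a meaning.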
- get(*, evaluator: str, space: str | None = None, version_id: str | None = None) EvaluatorWithVersion[source]#
Get an evaluator by name or ID, with its resolved version.
By default, the latest version is returned. Pass version_id to resolve a specific version instead.
- Parameters:
evaluator (str)
space (str | None)
version_id (str | None)
- Returns:
The evaluator with its resolved version.
- Raises:
ApiException – If the API request fails (for example, evaluator not found).
- Return type:
EvaluatorWithVersion
- create_template_evaluator(*, name: str, space: str, commit_message: str, template_config: TemplateConfig, description: str | None = None) EvaluatorWithVersion[source]#
Create a new template evaluator with an initial version.
The evaluator name must be unique within the given space.
- Parameters:
name (str) – Evaluator name (must be unique within the space).
space (str) – Space name or ID to create the evaluator in.
commit_message (str) – Commit message for the initial version.
template_config (TemplateConfig) – Template configuration for the evaluator. Build with arize.evaluators.types.TemplateConfig.
Required fields:
name: eval column name; must match ^[a-zA-Z0-9_\s\-&()]+$.
template: prompt template string with {variable} placeholders referencing span/trace attributes.
include_explanations: whether the LLM should include a reasoning explanation alongside the score.
use_function_calling_if_available: prefer structured function-call output over free-text parsing when the model supports it.
llm_config: arize.evaluators.types.EvaluatorLlmConfig specifying the model provider, model name, and API key.
Optional fields: classification_choices, direction, data_granularity.
description (str | None) – Optional human-readable description of the evaluator.
- Returns:
The created evaluator with its initial version.
- Raises:
ApiException – If the API request fails (for example, name conflict or invalid payload).
- Return type:
EvaluatorWithVersion
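Two of the template_config constraints above, the eval column name pattern and the {variable} placeholders, can be checked locally before calling the API. The sketch below uses only the standard library. It reads the documented pattern with single backslashes, and it assumes the placeholder syntax is compatible with Python's str.format fields; both are assumptions, and is_valid_eval_name and template_variables are hypothetical helper names, not SDK functions.

```python
import re
from string import Formatter
from typing import List

# Pattern quoted in the template_config description (single-backslash reading).
EVAL_NAME_RE = re.compile(r"^[a-zA-Z0-9_\s\-&()]+$")


def is_valid_eval_name(name: str) -> bool:
    """True if the eval column name contains only the allowed characters."""
    return EVAL_NAME_RE.fullmatch(name) is not None


def template_variables(template: str) -> List[str]:
    """Return distinct {variable} names in order of first appearance.

    ASSUMPTION: placeholders follow str.format field syntax.
    """
    seen: List[str] = []
    for _literal, field, _spec, _conversion in Formatter().parse(template):
        if field and field not in seen:
            seen.append(field)
    return seen


print(is_valid_eval_name("Relevance (v2) & tone"))  # True
print(is_valid_eval_name("bad/name"))               # False
print(template_variables("Question: {input}\nAnswer: {output}\nIs {output} correct?"))
# ['input', 'output']
```

A check like this only catches obvious mistakes early; the API remains the authority and may still reject a payload with an ApiException.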
- create_code_evaluator(*, name: str, space: str, commit_message: str, code_config: CodeConfig, description: str | None = None) EvaluatorWithVersion[source]#
Create a new code evaluator with an initial version.
The evaluator name must be unique within the given space.
- Parameters:
name (str) – Evaluator name (must be unique within the space).
space (str) – Space name or ID to create the evaluator in.
commit_message (str) – Commit message for the initial version.
code_config (CodeConfig) – Code configuration for the evaluator. Build with arize.evaluators.types.ManagedCodeConfig (for built-in evaluators) or arize.evaluators.types.CustomCodeConfig (for custom Python code). Wrap in arize.evaluators.types.CodeConfig.
description (str | None) – Optional human-readable description of the evaluator.
- Returns:
The created evaluator with its initial version.
- Raises:
ApiException – If the API request fails (for example, name conflict or invalid payload).
- Return type:
EvaluatorWithVersion
- update(*, evaluator: str, space: str | None = None, name: str | None = None, description: str | None = None) Evaluator[source]#
Update an evaluator’s metadata.
- Parameters:
evaluator (str) – Evaluator name or global ID (base64) to update.
space (str | None) – Optional space name or ID. Required when evaluator is a name rather than an ID.
name (str | None) – New evaluator name (must be unique within its space).
description (str | None) – New description for the evaluator.
- Returns:
The updated evaluator.
- Raises:
ApiException – If the API request fails.
- Return type:
Evaluator
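Several methods accept either an evaluator name or a base64 global ID, and require space when a name is given. A client-side guard for that rule might look like the sketch below. It is a heuristic under stated assumptions: the server decides how a value is interpreted, a short name such as "test" also happens to be valid base64, and looks_like_base64_id and lookup_kwargs are hypothetical helpers. The ID built in the usage lines is made up for illustration.

```python
import base64
import binascii
from typing import Dict, Optional


def looks_like_base64_id(value: str) -> bool:
    """Heuristic only: True if `value` is non-empty, valid standard base64."""
    if not value:
        return False
    try:
        base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


def lookup_kwargs(evaluator: str, space: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Build keyword arguments, failing early when a name lacks a space."""
    if space is None and not looks_like_base64_id(evaluator):
        raise ValueError("space is required when evaluator is a name, not an ID")
    return {"evaluator": evaluator, "space": space}


# Made-up ID for illustration; real global IDs come from the API.
evaluator_id = base64.b64encode(b"Evaluator:123").decode()
print(lookup_kwargs(evaluator_id))
print(lookup_kwargs("my-evaluator", space="my-space"))
```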
- delete(*, evaluator: str, space: str | None = None) None[source]#
Delete an evaluator and all its versions.
This operation is irreversible.
- list_versions(*, evaluator: str, space: str | None = None, limit: int = 100, cursor: str | None = None) EvaluatorVersionsList200Response[source]#
List all versions of an evaluator.
Results are returned with cursor-based pagination.
- Parameters:
evaluator (str) – Evaluator name or global ID (base64) to list versions for.
space (str | None) – Optional space name or ID. Required when evaluator is a name rather than an ID.
limit (int) – Maximum number of versions to return (1-100).
cursor (str | None) – Opaque pagination cursor from a previous response.
- Returns:
A paginated evaluator version list response.
- Raises:
ApiException – If the API request fails.
- Return type:
EvaluatorVersionsList200Response
- get_version(*, version_id: str) EvaluatorVersion[source]#
Get a specific evaluator version by its global ID.
- Parameters:
version_id (str) – Evaluator version global ID (base64).
- Returns:
The evaluator version.
- Raises:
ApiException – If the API request fails (for example, version not found).
- Return type:
EvaluatorVersion
- create_template_version(*, evaluator: str, space: str | None = None, commit_message: str, template_config: TemplateConfig) EvaluatorVersion[source]#
Create a new template version of an existing evaluator.
The new version becomes the latest version immediately (versioning is append-only). Versions are immutable once created; to change the configuration, create a new version.
- Parameters:
evaluator (str) – Evaluator name or global ID (base64) to add a version to.
space (str | None) – Optional space name or ID. Required when evaluator is a name rather than an ID.
commit_message (str) – Commit message describing the changes in this version.
template_config (TemplateConfig) – Updated template configuration for this version. Build with arize.evaluators.types.TemplateConfig.
- Returns:
The newly created evaluator version.
- Raises:
ApiException – If the API request fails.
- Return type:
EvaluatorVersion
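The append-only, immutable versioning described for create_template_version can be pictured with a small in-memory model. This is an illustrative sketch only: Version, VersionHistory, and the integer number field are hypothetical and do not mirror the SDK's EvaluatorVersion, which is identified by a global ID.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from typing import List


@dataclass(frozen=True)
class Version:
    """Immutable snapshot; frozen=True rejects attribute assignment."""
    number: int
    commit_message: str
    template: str


@dataclass
class VersionHistory:
    """Append-only list of versions; the last one is always the latest."""
    versions: List[Version] = field(default_factory=list)

    def create_version(self, commit_message: str, template: str) -> Version:
        version = Version(len(self.versions) + 1, commit_message, template)
        self.versions.append(version)
        return version

    @property
    def latest(self) -> Version:
        return self.versions[-1]


history = VersionHistory()
v1 = history.create_version("initial", "Is {output} relevant to {input}?")
v2 = history.create_version("tighten wording", "Does {output} answer {input}?")
print(history.latest is v2)  # True

try:
    v1.template = "edited in place"
except FrozenInstanceError:
    print("versions are immutable; create a new one instead")
```

The same pattern applies to code versions: changing configuration means adding a version, never editing an existing one.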
- create_code_version(*, evaluator: str, space: str | None = None, commit_message: str, code_config: CodeConfig) EvaluatorVersion[source]#
Create a new code version of an existing evaluator.
The new version becomes the latest version immediately (versioning is append-only). Versions are immutable once created; to change the configuration, create a new version.
- Parameters:
evaluator (str) – Evaluator name or global ID (base64) to add a version to.
space (str | None) – Optional space name or ID. Required when evaluator is a name rather than an ID.
commit_message (str) – Commit message describing the changes in this version.
code_config (CodeConfig) – Updated code configuration for this version. Build with arize.evaluators.types.ManagedCodeConfig or arize.evaluators.types.CustomCodeConfig. Wrap in arize.evaluators.types.CodeConfig.
- Returns:
The newly created evaluator version.
- Raises:
ApiException – If the API request fails.
- Return type:
EvaluatorVersion
Response Types#
- class Evaluator(*, id: Annotated[str, Strict(strict=True)], name: Annotated[str, Strict(strict=True)], description: Annotated[str, Strict(strict=True)] | None = None, type: EvaluatorType, space_id: Annotated[str, Strict(strict=True)], created_at: datetime, updated_at: datetime, created_by_user_id: Annotated[str, Strict(strict=True)] | None)[source]#
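Bases: BaseModel
An evaluator defines reusable evaluation logic that can be attached to evaluation tasks. The type field determines the kind of evaluation: template (LLM-based template evaluation) or code (custom code evaluation).
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
id (Annotated[str, Strict(strict=True)])
name (Annotated[str, Strict(strict=True)])
description (Annotated[str, Strict(strict=True)] | None)
type (EvaluatorType)
space_id (Annotated[str, Strict(strict=True)])
created_at (datetime)
updated_at (datetime)
created_by_user_id (Annotated[str, Strict(strict=True)] | None)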
- id: StrictStr#
- name: StrictStr#
- type: EvaluatorType#
- space_id: StrictStr#
- created_at: datetime#
- updated_at: datetime#
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- classmethod from_json(json_str: str) Self | None[source]#
Create an instance of Evaluator from a JSON string
- to_dict() Dict[str, Any][source]#
Return the dictionary representation of the model using alias.
This has the following differences from calling pydantic’s self.model_dump(by_alias=True):
None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.
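The None-handling rule of to_dict() is easier to see on a concrete case. The function below is a pure-Python approximation of the rule as stated, not the generated implementation. Which fields the generated model treats as nullable is an assumption here: created_by_user_id is taken as nullable because its signature has no default, while description is taken as merely optional.

```python
from typing import Any, Dict, Set


def to_dict_sketch(values: Dict[str, Any], nullable: Set[str],
                   fields_set: Set[str]) -> Dict[str, Any]:
    """Approximate the documented rule: keep None only for nullable
    fields that were explicitly set; drop every other None."""
    out: Dict[str, Any] = {}
    for key, value in values.items():
        if value is not None:
            out[key] = value
        elif key in nullable and key in fields_set:
            # Nullable field the caller explicitly set to None: keep it.
            out[key] = None
    return out


values = {
    "id": "abc",
    "name": "relevance",
    "description": None,         # optional, never set
    "created_by_user_id": None,  # nullable, explicitly set to None
}
result = to_dict_sketch(
    values,
    nullable={"created_by_user_id"},
    fields_set={"id", "name", "created_by_user_id"},
)
print(result)
# {'id': 'abc', 'name': 'relevance', 'created_by_user_id': None}
```

By contrast, model_dump(by_alias=True) without exclude options would keep both None entries.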
- class EvaluatorLlmConfig(*, ai_integration_id: Annotated[str, Strict(strict=True)], model_name: Annotated[str, Strict(strict=True)], invocation_parameters: InvocationParams, provider_parameters: ProviderParams)[source]#
Bases: BaseModel
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
ai_integration_id (Annotated[str, Strict(strict=True)])
model_name (Annotated[str, Strict(strict=True)])
invocation_parameters (InvocationParams)
provider_parameters (ProviderParams)
- ai_integration_id: StrictStr#
- model_name: StrictStr#
- invocation_parameters: InvocationParams#
- provider_parameters: ProviderParams#
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- classmethod from_json(json_str: str) Self | None[source]#
Create an instance of EvaluatorLlmConfig from a JSON string
- to_dict() Dict[str, Any][source]#
Return the dictionary representation of the model using alias.
This has the following differences from calling pydantic’s self.model_dump(by_alias=True):
None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.
- class EvaluatorWithVersion(*, id: Annotated[str, Strict(strict=True)], name: Annotated[str, Strict(strict=True)], description: Annotated[str, Strict(strict=True)] | None = None, type: EvaluatorType, space_id: Annotated[str, Strict(strict=True)], created_at: datetime, updated_at: datetime, created_by_user_id: Annotated[str, Strict(strict=True)] | None, version: EvaluatorVersion)[source]#
Bases: BaseModel
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
id (Annotated[str, Strict(strict=True)])
name (Annotated[str, Strict(strict=True)])
description (Annotated[str, Strict(strict=True)] | None)
type (EvaluatorType)
space_id (Annotated[str, Strict(strict=True)])
created_at (datetime)
updated_at (datetime)
created_by_user_id (Annotated[str, Strict(strict=True)] | None)
version (EvaluatorVersion)
- id: StrictStr#
- name: StrictStr#
- type: EvaluatorType#
- space_id: StrictStr#
- created_at: datetime#
- updated_at: datetime#
- version: EvaluatorVersion#
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- classmethod from_json(json_str: str) Self | None[source]#
Create an instance of EvaluatorWithVersion from a JSON string
- to_dict() Dict[str, Any][source]#
Return the dictionary representation of the model using alias.
This has the following differences from calling pydantic’s self.model_dump(by_alias=True):
None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.
- class EvaluatorVersion(*args, oneof_schema_1_validator: EvaluatorVersionTemplate | None = None, oneof_schema_2_validator: EvaluatorVersionCode | None = None, actual_instance: EvaluatorVersionCode | EvaluatorVersionTemplate | None = None, one_of_schemas: Set[str] = {'EvaluatorVersionCode', 'EvaluatorVersionTemplate'}, discriminator_value_class_map: Dict[str, str] = {})[source]#
Bases: BaseModel
A versioned snapshot of an evaluator’s configuration. Exactly one of template_config or code_config is present. The type field discriminates the branch and matches the parent evaluator’s type.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
oneof_schema_1_validator (EvaluatorVersionTemplate | None)
oneof_schema_2_validator (EvaluatorVersionCode | None)
actual_instance (EvaluatorVersionCode | EvaluatorVersionTemplate | None)
one_of_schemas (Set[str])
discriminator_value_class_map (Dict[str, str])
- model_config: ClassVar[ConfigDict] = {'protected_namespaces': (), 'validate_assignment': True}#
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- classmethod from_json(json_str: str) Self[source]#
Returns the object represented by the json string
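The "exactly one of template_config or code_config" rule for EvaluatorVersion can be expressed as a small check on a plain dict. This is a sketch of the stated invariant, assuming the payload keys are type, template_config, and code_config and that type takes the values template or code as described for Evaluator. resolve_version_branch is a hypothetical helper, not part of the generated model.

```python
from typing import Any, Dict, Tuple


def resolve_version_branch(payload: Dict[str, Any]) -> Tuple[str, Any]:
    """Return (branch, config) after checking the one-of invariant."""
    has_template = payload.get("template_config") is not None
    has_code = payload.get("code_config") is not None
    if has_template == has_code:
        # Both present or both missing violates "exactly one".
        raise ValueError("exactly one of template_config or code_config must be present")
    branch = "template" if has_template else "code"
    if payload.get("type") != branch:
        raise ValueError(f"type {payload.get('type')!r} does not match the {branch} branch")
    return branch, payload[f"{branch}_config"]


branch, config = resolve_version_branch(
    {"type": "template", "template_config": {"name": "relevance"}}
)
print(branch, config)  # template {'name': 'relevance'}
```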
- class EvaluatorsList200Response(*, evaluators: List[Evaluator], pagination: PaginationMetadata)[source]#
Bases: BaseModel
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
evaluators (List[Evaluator])
pagination (PaginationMetadata)
- pagination: PaginationMetadata#
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- classmethod from_json(json_str: str) Self | None[source]#
Create an instance of EvaluatorsList200Response from a JSON string
- to_dict() Dict[str, Any][source]#
Return the dictionary representation of the model using alias.
This has the following differences from calling pydantic’s self.model_dump(by_alias=True):
None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.
- classmethod from_dict(obj: Dict[str, Any] | None) Self | None[source]#
Create an instance of EvaluatorsList200Response from a dict
- to_df(by_alias: bool = False, exclude_none: str | bool = True, json_normalize: bool = False, convert_dtypes: bool = True, expand_field: str = 'additional_properties', expand_prefix: str = '') pd.DataFrame#
Convert a list of objects to a pandas.DataFrame.
- Behavior:
If an item is a Pydantic v2 model, use .model_dump(by_alias=…).
If an item is a mapping (dict-like), use it as-is.
Otherwise, raise a ValueError (unsupported row type).
- Parameters:
self (object) – The object instance containing the field to convert.
by_alias (bool) – Use field aliases when dumping Pydantic models.
exclude_none (str | bool) – Control None/NaN column dropping:
False: keep Nones as-is.
“all”: drop columns where all values are None/NaN.
“any”: drop columns where any value is None/NaN.
True: alias for “all”.
json_normalize (bool) – If True, flatten nested dicts via pandas.json_normalize.
convert_dtypes (bool) – If True, call DataFrame.convert_dtypes() at the end.
expand_field (str) – If set, look for this field in each row and expand its keys into top-level columns.
expand_prefix (str) – If set, prefix expanded column names with this string.
- Returns:
The converted DataFrame.
- Return type:
pd.DataFrame
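The exclude_none modes of to_df() differ only in which columns are dropped. The pure-Python sketch below applies the documented rule to a list of row dicts so it runs without pandas. It is an approximation for illustration, with drop_none_columns as a hypothetical name; the real method works on a DataFrame and also handles by_alias, json_normalize, and expand_field.

```python
from typing import Any, Dict, List, Union


def _is_missing(value: Any) -> bool:
    """True for None and float NaN (NaN is the only float unequal to itself)."""
    return value is None or (isinstance(value, float) and value != value)


def drop_none_columns(
    rows: List[Dict[str, Any]], exclude_none: Union[str, bool] = True
) -> List[Dict[str, Any]]:
    """Apply the exclude_none rule to a list of row dicts."""
    if exclude_none is False:
        return [dict(row) for row in rows]
    mode = "all" if exclude_none is True else exclude_none
    if mode not in ("all", "any"):
        raise ValueError("exclude_none must be False, True, 'all', or 'any'")

    # Collect column names in order of first appearance.
    columns: List[str] = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)

    check = all if mode == "all" else any
    dropped = {
        col for col in columns if check(_is_missing(row.get(col)) for row in rows)
    }
    return [{k: v for k, v in row.items() if k not in dropped} for row in rows]


rows = [
    {"id": "a", "description": None, "score": 1.0},
    {"id": "b", "description": None, "score": None},
]
print(drop_none_columns(rows, "all"))
# [{'id': 'a', 'score': 1.0}, {'id': 'b', 'score': None}]
print(drop_none_columns(rows, "any"))
# [{'id': 'a'}, {'id': 'b'}]
```

With the default exclude_none=True, the result matches the “all” case: only the entirely empty description column disappears.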
- class EvaluatorVersionsList200Response(*, evaluator_versions: List[EvaluatorVersion], pagination: PaginationMetadata)[source]#
Bases: BaseModel
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
evaluator_versions (List[EvaluatorVersion])
pagination (PaginationMetadata)
- evaluator_versions: List[EvaluatorVersion]#
- pagination: PaginationMetadata#
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- classmethod from_json(json_str: str) Self | None[source]#
Create an instance of EvaluatorVersionsList200Response from a JSON string
- to_dict() Dict[str, Any][source]#
Return the dictionary representation of the model using alias.
This has the following differences from calling pydantic’s self.model_dump(by_alias=True):
None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.
- classmethod from_dict(obj: Dict[str, Any] | None) Self | None[source]#
Create an instance of EvaluatorVersionsList200Response from a dict
- to_df(by_alias: bool = False, exclude_none: str | bool = True, json_normalize: bool = False, convert_dtypes: bool = True, expand_field: str = 'additional_properties', expand_prefix: str = '') pd.DataFrame#
Convert a list of objects to a pandas.DataFrame.
- Behavior:
If an item is a Pydantic v2 model, use .model_dump(by_alias=…).
If an item is a mapping (dict-like), use it as-is.
Otherwise, raise a ValueError (unsupported row type).
- Parameters:
self (object) – The object instance containing the field to convert.
by_alias (bool) – Use field aliases when dumping Pydantic models.
exclude_none (str | bool) – Control None/NaN column dropping:
False: keep Nones as-is.
“all”: drop columns where all values are None/NaN.
“any”: drop columns where any value is None/NaN.
True: alias for “all”.
json_normalize (bool) – If True, flatten nested dicts via pandas.json_normalize.
convert_dtypes (bool) – If True, call DataFrame.convert_dtypes() at the end.
expand_field (str) – If set, look for this field in each row and expand its keys into top-level columns.
expand_prefix (str) – If set, prefix expanded column names with this string.
- Returns:
The converted DataFrame.
- Return type:
pd.DataFrame
- class TemplateConfig(*, name: Annotated[str, Strict(strict=True)], template: Annotated[str, Strict(strict=True)], include_explanations: Annotated[bool, Strict(strict=True)], use_function_calling_if_available: Annotated[bool, Strict(strict=True)], classification_choices: Dict[str, Annotated[float, Strict(strict=True)] | Annotated[int, Strict(strict=True)]] | None = None, direction: OptimizationDirection | None = None, data_granularity: Annotated[str, Strict(strict=True)] | None = None, llm_config: EvaluatorLlmConfig)[source]#
Bases: BaseModel
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
name (Annotated[str, Strict(strict=True)])
template (Annotated[str, Strict(strict=True)])
include_explanations (Annotated[bool, Strict(strict=True)])
use_function_calling_if_available (Annotated[bool, Strict(strict=True)])
classification_choices (Dict[str, Annotated[float, Strict(strict=True)] | Annotated[int, Strict(strict=True)]] | None)
direction (OptimizationDirection | None)
data_granularity (Annotated[str, Strict(strict=True)] | None)
llm_config (EvaluatorLlmConfig)
- name: StrictStr#
- template: StrictStr#
- include_explanations: StrictBool#
- use_function_calling_if_available: StrictBool#
- direction: OptimizationDirection | None#
- llm_config: EvaluatorLlmConfig#
- model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- classmethod from_json(json_str: str) Self | None[source]#
Create an instance of TemplateConfig from a JSON string
- to_dict() Dict[str, Any][source]#
Return the dictionary representation of the model using alias.
This has the following differences from calling pydantic’s self.model_dump(by_alias=True):
None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.