Evaluators#

class EvaluatorsClient(*, sdk_config: SDKConfiguration, generated_client: ApiClient)[source]#

Bases: object

Client for managing Arize evaluators and evaluator versions.

This class is primarily intended for internal use within the SDK. Users are strongly encouraged to access resource-specific functionality through arize.ArizeClient rather than instantiating this class directly.

The evaluators client is a thin wrapper around the generated REST API client, using the shared generated API client owned by arize.config.SDKConfiguration.

Parameters:
  • sdk_config (SDKConfiguration) – Resolved SDK configuration.

  • generated_client (ApiClient) – Shared generated API client instance.

list(*, name: str | None = None, space: str | None = None, limit: int = 100, cursor: str | None = None) EvaluatorsList200Response[source]#

List evaluators the user has access to.

Results are sorted by update date (most recent first). This endpoint supports cursor-based pagination. When space is provided, results are limited to that space; otherwise evaluators from all permitted spaces are returned.

Parameters:
  • name (str | None) – Optional case-insensitive substring filter on the evaluator name.

  • space (str | None) – Optional space filter. If the value is a base64-encoded resource ID it is treated as a space ID; otherwise it is used as a case-insensitive substring filter on the space name.

  • limit (int) – Maximum number of evaluators to return (1-100).

  • cursor (str | None) – Opaque pagination cursor from a previous response.

Returns:

A paginated evaluator list response from the Arize REST API.

Raises:

ApiException – If the API request fails.

Return type:

EvaluatorsList200Response
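The cursor-based pagination described above can be drained with a small loop. This is a minimal sketch, not part of the SDK: it assumes the response's `pagination` object exposes a `next_cursor` attribute that is empty on the last page (the fields of `PaginationMetadata` are not documented on this page, so verify the name against your SDK version). `iter_evaluators` is a hypothetical helper, and `evaluators_client` stands for whatever object exposes `list()`.

```python
from typing import Any, Iterator, Optional


def iter_evaluators(
    evaluators_client: Any,
    *,
    name: Optional[str] = None,
    space: Optional[str] = None,
    page_size: int = 100,
) -> Iterator[Any]:
    """Yield every evaluator visible to the caller, one page at a time."""
    # The documented range for `limit` is 1-100.
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")

    cursor: Optional[str] = None
    while True:
        page = evaluators_client.list(
            name=name, space=space, limit=page_size, cursor=cursor
        )
        yield from page.evaluators
        # ASSUMPTION: `next_cursor` is the attribute name on PaginationMetadata.
        cursor = getattr(page.pagination, "next_cursor", None)
        if not cursor:
            break
```

Because the helper is a generator, callers can stop early without fetching the remaining pages.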

get(*, evaluator: str, space: str | None = None, version_id: str | None = None) EvaluatorWithVersion[source]#

Get an evaluator by name or ID, with its resolved version.

By default, the latest version is returned. Pass version_id to resolve a specific version instead.

Parameters:
  • evaluator (str) – Evaluator name or global ID (base64) to retrieve.

  • space (str | None) – Optional space name or ID. Required when evaluator is a name rather than an ID.

  • version_id (str | None) – Optional version global ID (base64). If omitted, the latest version is returned.

Returns:

The evaluator with its resolved version.

Raises:

ApiException – If the API request fails (for example, evaluator not found).

Return type:

EvaluatorWithVersion
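Since a missing evaluator surfaces as an exception, callers that treat "not found" as a normal outcome may want a wrapper. The sketch below is a hypothetical helper, not an SDK function. It assumes the raised ApiException carries the HTTP status code in a `status` attribute, which is common for OpenAPI-generated clients but is not confirmed on this page. It catches `Exception` and inspects that attribute so that it does not depend on the exception's import path.

```python
from typing import Any, Optional


def get_evaluator_or_none(
    evaluators_client: Any,
    evaluator: str,
    *,
    space: Optional[str] = None,
    version_id: Optional[str] = None,
) -> Optional[Any]:
    """Return the evaluator, or None when the API reports 404."""
    try:
        return evaluators_client.get(
            evaluator=evaluator, space=space, version_id=version_id
        )
    except Exception as exc:
        # ASSUMPTION: the SDK's ApiException exposes the HTTP code as `.status`.
        if getattr(exc, "status", None) == 404:
            return None
        # Anything else (auth failure, server error, bug) is re-raised untouched.
        raise
```

Remember to pass `space` when `evaluator` is a name rather than a global ID.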

create_template_evaluator(*, name: str, space: str, commit_message: str, template_config: TemplateConfig, description: str | None = None) EvaluatorWithVersion[source]#

Create a new template evaluator with an initial version.

The evaluator name must be unique within the given space.

Parameters:
  • name (str) – Evaluator name (must be unique within the space).

  • space (str) – Space name or ID to create the evaluator in.

  • commit_message (str) – Commit message for the initial version.

  • template_config (TemplateConfig) –

    Template configuration for the evaluator. Build with arize.evaluators.types.TemplateConfig. Required fields:

    • name – eval column name; must match ^[a-zA-Z0-9_\s\-&()]+$.

    • template – prompt template string with {variable} placeholders referencing span/trace attributes.

    • include_explanations – whether the LLM should include a reasoning explanation alongside the score.

    • use_function_calling_if_available – prefer structured function-call output over free-text parsing when the model supports it.

    • llm_config – an arize.evaluators.types.EvaluatorLlmConfig specifying the AI integration, model name, and invocation and provider parameters.

    Optional fields: classification_choices, direction, data_granularity.

  • description (str | None) – Optional human-readable description of the evaluator.

Returns:

The created evaluator with its initial version.

Raises:

ApiException – If the API request fails (for example, name conflict or invalid payload).

Return type:

EvaluatorWithVersion
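Two of the template_config constraints can be checked locally before calling the API. The sketch below uses only the standard library. The name pattern is the one quoted in the parameter description (with single backslashes). The placeholder scan assumes that `{variable}` placeholders parse like Python `str.format` fields, which this page does not confirm. Both function names are hypothetical.

```python
import re
from string import Formatter
from typing import List

# Pattern quoted in the template_config description for the eval column name.
EVAL_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9_\s\-&()]+$")


def is_valid_eval_name(name: str) -> bool:
    """Check an eval column name against the documented pattern."""
    return EVAL_NAME_PATTERN.fullmatch(name) is not None


def template_variables(template: str) -> List[str]:
    """List the {variable} placeholders in order of first appearance.

    ASSUMPTION: placeholders follow str.format field syntax. A template with
    unbalanced braces (for example, inline JSON) raises ValueError here.
    """
    seen: List[str] = []
    for _literal, field, _spec, _conversion in Formatter().parse(template):
        if field and field not in seen:
            seen.append(field)
    return seen
```

Listing the placeholders up front makes it easier to confirm that each one corresponds to a span or trace attribute you actually record.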

create_code_evaluator(*, name: str, space: str, commit_message: str, code_config: CodeConfig, description: str | None = None) EvaluatorWithVersion[source]#

Create a new code evaluator with an initial version.

The evaluator name must be unique within the given space.

Parameters:
  • name (str) – Evaluator name (must be unique within the space).

  • space (str) – Space name or ID to create the evaluator in.

  • commit_message (str) – Commit message for the initial version.

  • code_config (CodeConfig) – Code configuration for the evaluator. Build an arize.evaluators.types.ManagedCodeConfig (for built-in evaluators) or an arize.evaluators.types.CustomCodeConfig (for custom Python code), then wrap it in arize.evaluators.types.CodeConfig.

  • description (str | None) – Optional human-readable description of the evaluator.

Returns:

The created evaluator with its initial version.

Raises:

ApiException – If the API request fails (for example, name conflict or invalid payload).

Return type:

EvaluatorWithVersion

update(*, evaluator: str, space: str | None = None, name: str | None = None, description: str | None = None) Evaluator[source]#

Update an evaluator’s metadata.

Parameters:
  • evaluator (str) – Evaluator name or global ID (base64) to update.

  • space (str | None) – Optional space name or ID. Required when evaluator is a name rather than an ID.

  • name (str | None) – New evaluator name (must be unique within its space).

  • description (str | None) – New description for the evaluator.

Returns:

The updated evaluator.

Raises:

ApiException – If the API request fails.

Return type:

Evaluator

delete(*, evaluator: str, space: str | None = None) None[source]#

Delete an evaluator and all its versions.

This operation is irreversible.

Parameters:
  • evaluator (str) – Evaluator name or global ID (base64) to delete.

  • space (str | None) – Optional space name or ID. Required when evaluator is a name rather than an ID.

Returns:

None.

Raises:

ApiException – If the API request fails (for example, evaluator not found).

Return type:

None

list_versions(*, evaluator: str, space: str | None = None, limit: int = 100, cursor: str | None = None) EvaluatorVersionsList200Response[source]#

List all versions of an evaluator.

Results are returned with cursor-based pagination.

Parameters:
  • evaluator (str) – Evaluator name or global ID (base64) to list versions for.

  • space (str | None) – Optional space name or ID. Required when evaluator is a name rather than an ID.

  • limit (int) – Maximum number of versions to return (1-100).

  • cursor (str | None) – Opaque pagination cursor from a previous response.

Returns:

A paginated evaluator version list response.

Raises:

ApiException – If the API request fails.

Return type:

EvaluatorVersionsList200Response

get_version(*, version_id: str) EvaluatorVersion[source]#

Get a specific evaluator version by its global ID.

Parameters:

version_id (str) – Evaluator version global ID (base64).

Returns:

The evaluator version.

Raises:

ApiException – If the API request fails (for example, version not found).

Return type:

EvaluatorVersion

create_template_version(*, evaluator: str, space: str | None = None, commit_message: str, template_config: TemplateConfig) EvaluatorVersion[source]#

Create a new template version of an existing evaluator.

The new version becomes the latest version immediately (versioning is append-only). Versions are immutable once created; to change the configuration, create a new version.

Parameters:
  • evaluator (str) – Evaluator name or global ID (base64) to add a version to.

  • space (str | None) – Optional space name or ID. Required when evaluator is a name rather than an ID.

  • commit_message (str) – Commit message describing the changes in this version.

  • template_config (TemplateConfig) – Updated template configuration for this version. Build with arize.evaluators.types.TemplateConfig.

Returns:

The newly created evaluator version.

Raises:

ApiException – If the API request fails.

Return type:

EvaluatorVersion

create_code_version(*, evaluator: str, space: str | None = None, commit_message: str, code_config: CodeConfig) EvaluatorVersion[source]#

Create a new code version of an existing evaluator.

The new version becomes the latest version immediately (versioning is append-only). Versions are immutable once created; to change the configuration, create a new version.

Parameters:
  • evaluator (str) – Evaluator name or global ID (base64) to add a version to.

  • space (str | None) – Optional space name or ID. Required when evaluator is a name rather than an ID.

  • commit_message (str) – Commit message describing the changes in this version.

  • code_config (CodeConfig) – Updated code configuration for this version. Build an arize.evaluators.types.ManagedCodeConfig or an arize.evaluators.types.CustomCodeConfig, then wrap it in arize.evaluators.types.CodeConfig.

Returns:

The newly created evaluator version.

Raises:

ApiException – If the API request fails.

Return type:

EvaluatorVersion

Response Types#

class Evaluator(*, id: Annotated[str, Strict(strict=True)], name: Annotated[str, Strict(strict=True)], description: Annotated[str, Strict(strict=True)] | None = None, type: EvaluatorType, space_id: Annotated[str, Strict(strict=True)], created_at: datetime, updated_at: datetime, created_by_user_id: Annotated[str, Strict(strict=True)] | None)[source]#

Bases: BaseModel

An evaluator defines reusable evaluation logic that can be attached to evaluation tasks. The type field determines the kind of evaluation: template (LLM-based template evaluation) or code (custom code evaluation).

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
id: StrictStr#
name: StrictStr#
description: StrictStr | None#
type: EvaluatorType#
space_id: StrictStr#
created_at: datetime#
updated_at: datetime#
created_by_user_id: StrictStr | None#
model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

to_str() str[source]#

Returns the string representation of the model using alias

Return type:

str

to_json() str[source]#

Returns the JSON representation of the model using alias

Return type:

str

classmethod from_json(json_str: str) Self | None[source]#

Create an instance of Evaluator from a JSON string

Parameters:

json_str (str)

Return type:

Self | None

to_dict() Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

  • None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:

Dict[str, Any]
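The None-handling rule above can be illustrated without pydantic. This is a plain-Python sketch of the documented behavior, not the generated model code. `to_dict_like`, `fields_set`, and `nullable_fields` are hypothetical names standing in for the model's values, the fields passed at initialization, and the fields declared nullable.

```python
from typing import Any, Dict, Set


def to_dict_like(
    values: Dict[str, Any],
    fields_set: Set[str],
    nullable_fields: Set[str],
) -> Dict[str, Any]:
    """Mimic the documented None-handling rule using plain dicts."""
    # Start by dropping every None value.
    out = {key: value for key, value in values.items() if value is not None}
    # Re-add None only for nullable fields that were explicitly set.
    for field in nullable_fields:
        if field in values and values[field] is None and field in fields_set:
            out[field] = None
    return out


values = {"id": "e1", "description": None, "created_by_user_id": None}
result = to_dict_like(
    values,
    fields_set={"id", "created_by_user_id"},
    nullable_fields={"description", "created_by_user_id"},
)
```

Here `created_by_user_id` survives as an explicit None because it was set at initialization, while `description` is omitted because it was left at its default.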

classmethod from_dict(obj: Dict[str, Any] | None) Self | None[source]#

Create an instance of Evaluator from a dict

Parameters:

obj (Dict[str, Any] | None)

Return type:

Self | None

class EvaluatorLlmConfig(*, ai_integration_id: Annotated[str, Strict(strict=True)], model_name: Annotated[str, Strict(strict=True)], invocation_parameters: InvocationParams, provider_parameters: ProviderParams)[source]#

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
ai_integration_id: StrictStr#
model_name: StrictStr#
invocation_parameters: InvocationParams#
provider_parameters: ProviderParams#
model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

to_str() str[source]#

Returns the string representation of the model using alias

Return type:

str

to_json() str[source]#

Returns the JSON representation of the model using alias

Return type:

str

classmethod from_json(json_str: str) Self | None[source]#

Create an instance of EvaluatorLlmConfig from a JSON string

Parameters:

json_str (str)

Return type:

Self | None

to_dict() Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

  • None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:

Dict[str, Any]

classmethod from_dict(obj: Dict[str, Any] | None) Self | None[source]#

Create an instance of EvaluatorLlmConfig from a dict

Parameters:

obj (Dict[str, Any] | None)

Return type:

Self | None

class EvaluatorWithVersion(*, id: Annotated[str, Strict(strict=True)], name: Annotated[str, Strict(strict=True)], description: Annotated[str, Strict(strict=True)] | None = None, type: EvaluatorType, space_id: Annotated[str, Strict(strict=True)], created_at: datetime, updated_at: datetime, created_by_user_id: Annotated[str, Strict(strict=True)] | None, version: EvaluatorVersion)[source]#

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
id: StrictStr#
name: StrictStr#
description: StrictStr | None#
type: EvaluatorType#
space_id: StrictStr#
created_at: datetime#
updated_at: datetime#
created_by_user_id: StrictStr | None#
version: EvaluatorVersion#
model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

to_str() str[source]#

Returns the string representation of the model using alias

Return type:

str

to_json() str[source]#

Returns the JSON representation of the model using alias

Return type:

str

classmethod from_json(json_str: str) Self | None[source]#

Create an instance of EvaluatorWithVersion from a JSON string

Parameters:

json_str (str)

Return type:

Self | None

to_dict() Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

  • None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:

Dict[str, Any]

classmethod from_dict(obj: Dict[str, Any] | None) Self | None[source]#

Create an instance of EvaluatorWithVersion from a dict

Parameters:

obj (Dict[str, Any] | None)

Return type:

Self | None

class EvaluatorVersion(*args, oneof_schema_1_validator: EvaluatorVersionTemplate | None = None, oneof_schema_2_validator: EvaluatorVersionCode | None = None, actual_instance: EvaluatorVersionCode | EvaluatorVersionTemplate | None = None, one_of_schemas: Set[str] = {'EvaluatorVersionCode', 'EvaluatorVersionTemplate'}, discriminator_value_class_map: Dict[str, str] = {})[source]#

Bases: BaseModel

A versioned snapshot of an evaluator’s configuration. Exactly one of template_config or code_config is present. The type field discriminates the branch and matches the parent evaluator’s type.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • oneof_schema_1_validator (EvaluatorVersionTemplate | None)

  • oneof_schema_2_validator (EvaluatorVersionCode | None)

  • actual_instance (EvaluatorVersionCode | EvaluatorVersionTemplate | None)

  • one_of_schemas (Set[str])

  • discriminator_value_class_map (Dict[str, str])

oneof_schema_1_validator: EvaluatorVersionTemplate | None#
oneof_schema_2_validator: EvaluatorVersionCode | None#
actual_instance: EvaluatorVersionCode | EvaluatorVersionTemplate | None#
one_of_schemas: Set[str]#
model_config: ClassVar[ConfigDict] = {'protected_namespaces': (), 'validate_assignment': True}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

discriminator_value_class_map: Dict[str, str]#
classmethod actual_instance_must_validate_oneof(v)[source]#

classmethod from_dict(obj: str | Dict[str, Any]) Self[source]#

Parameters:

obj (str | Dict[str, Any])

Return type:

Self

classmethod from_json(json_str: str) Self[source]#

Returns the object represented by the json string

Parameters:

json_str (str)

Return type:

Self

to_json() str[source]#

Returns the JSON representation of the actual instance

Return type:

str

to_dict() Dict[str, Any] | EvaluatorVersionCode | EvaluatorVersionTemplate | None[source]#

Returns the dict representation of the actual instance

Return type:

Dict[str, Any] | EvaluatorVersionCode | EvaluatorVersionTemplate | None

to_str() str[source]#

Returns the string representation of the actual instance

Return type:

str
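Because EvaluatorVersion is a oneOf wrapper, the concrete data lives on `actual_instance`. The sketch below shows one way to branch on it. It assumes, from the class description only, that the wrapped EvaluatorVersionTemplate exposes `template_config` and the wrapped EvaluatorVersionCode exposes `code_config` as attributes. `unwrap_version` is a hypothetical helper, not part of the SDK.

```python
from typing import Any, Tuple


def unwrap_version(version: Any) -> Tuple[str, Any]:
    """Return ("template" | "code", config) for an EvaluatorVersion-like object."""
    instance = getattr(version, "actual_instance", None)
    if instance is None:
        raise ValueError("version has no actual_instance")

    # ASSUMPTION: the attribute names match the fields named in the description.
    template_config = getattr(instance, "template_config", None)
    if template_config is not None:
        return "template", template_config

    code_config = getattr(instance, "code_config", None)
    if code_config is not None:
        return "code", code_config

    raise ValueError("neither template_config nor code_config is set")
```

Using `getattr` with a default keeps the helper tolerant of either branch without importing the generated classes.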

class EvaluatorsList200Response(*, evaluators: List[Evaluator], pagination: PaginationMetadata)[source]#

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
evaluators: List[Evaluator]#
pagination: PaginationMetadata#
model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

to_str() str[source]#

Returns the string representation of the model using alias

Return type:

str

to_json() str[source]#

Returns the JSON representation of the model using alias

Return type:

str

classmethod from_json(json_str: str) Self | None[source]#

Create an instance of EvaluatorsList200Response from a JSON string

Parameters:

json_str (str)

Return type:

Self | None

to_dict() Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

  • None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:

Dict[str, Any]

classmethod from_dict(obj: Dict[str, Any] | None) Self | None[source]#

Create an instance of EvaluatorsList200Response from a dict

Parameters:

obj (Dict[str, Any] | None)

Return type:

Self | None

to_df(by_alias: bool = False, exclude_none: str | bool = True, json_normalize: bool = False, convert_dtypes: bool = True, expand_field: str = 'additional_properties', expand_prefix: str = '') pd.DataFrame#

Convert a list of objects to a pandas.DataFrame.

Behavior:
  • If an item is a Pydantic v2 model, use .model_dump(by_alias=…).

  • If an item is a mapping (dict-like), use it as-is.

  • Otherwise, raise a ValueError (unsupported row type).

Parameters:
  • self (object) – The object instance containing the field to convert.

  • by_alias (bool) – Use field aliases when dumping Pydantic models.

  • exclude_none (str | bool) – Control None/NaN column dropping:

    • False: keep Nones as-is

    • “all”: drop columns where all values are None/NaN

    • “any”: drop columns where any value is None/NaN

    • True: alias for “all”

  • json_normalize (bool) – If True, flatten nested dicts via pandas.json_normalize.

  • convert_dtypes (bool) – If True, call DataFrame.convert_dtypes() at the end.

  • expand_field (str) – If set, look for this field in each row and expand its keys into top-level columns.

  • expand_prefix (str) – If set, prefix expanded column names with this string.

Returns:

The converted DataFrame.

Return type:

pandas.DataFrame
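The exclude_none modes are easiest to compare on a tiny dataset. The sketch below reproduces the documented column-dropping semantics over a list of dicts in plain Python. It is an illustration of the rule only, not the SDK's implementation (which operates on a pandas.DataFrame), and `drop_none_columns` is a hypothetical name.

```python
from typing import Any, Dict, List, Union


def _is_missing(value: Any) -> bool:
    # None, or a float NaN (NaN is the only value that is not equal to itself).
    return value is None or (isinstance(value, float) and value != value)


def drop_none_columns(
    rows: List[Dict[str, Any]], exclude_none: Union[str, bool] = True
) -> List[Dict[str, Any]]:
    """Drop columns according to the documented exclude_none modes."""
    if exclude_none is False:
        return [dict(row) for row in rows]
    mode = "all" if exclude_none is True else exclude_none
    if mode not in ("all", "any"):
        raise ValueError("exclude_none must be False, True, 'all', or 'any'")

    # Collect column names in order of first appearance.
    columns: List[str] = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)

    test = all if mode == "all" else any
    dropped = {c for c in columns if test(_is_missing(r.get(c)) for r in rows)}
    return [{k: v for k, v in row.items() if k not in dropped} for row in rows]


rows = [
    {"id": "a", "description": None, "score": 1.0},
    {"id": "b", "description": None, "score": None},
]
```

With these rows, “all” removes only `description`, while “any” also removes `score` because one of its values is missing.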

class EvaluatorVersionsList200Response(*, evaluator_versions: List[EvaluatorVersion], pagination: PaginationMetadata)[source]#

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
evaluator_versions: List[EvaluatorVersion]#
pagination: PaginationMetadata#
model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

to_str() str[source]#

Returns the string representation of the model using alias

Return type:

str

to_json() str[source]#

Returns the JSON representation of the model using alias

Return type:

str

classmethod from_json(json_str: str) Self | None[source]#

Create an instance of EvaluatorVersionsList200Response from a JSON string

Parameters:

json_str (str)

Return type:

Self | None

to_dict() Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

  • None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:

Dict[str, Any]

classmethod from_dict(obj: Dict[str, Any] | None) Self | None[source]#

Create an instance of EvaluatorVersionsList200Response from a dict

Parameters:

obj (Dict[str, Any] | None)

Return type:

Self | None

to_df(by_alias: bool = False, exclude_none: str | bool = True, json_normalize: bool = False, convert_dtypes: bool = True, expand_field: str = 'additional_properties', expand_prefix: str = '') pd.DataFrame#

Convert a list of objects to a pandas.DataFrame.

Behavior:
  • If an item is a Pydantic v2 model, use .model_dump(by_alias=…).

  • If an item is a mapping (dict-like), use it as-is.

  • Otherwise, raise a ValueError (unsupported row type).

Parameters:
  • self (object) – The object instance containing the field to convert.

  • by_alias (bool) – Use field aliases when dumping Pydantic models.

  • exclude_none (str | bool) – Control None/NaN column dropping:

    • False: keep Nones as-is

    • “all”: drop columns where all values are None/NaN

    • “any”: drop columns where any value is None/NaN

    • True: alias for “all”

  • json_normalize (bool) – If True, flatten nested dicts via pandas.json_normalize.

  • convert_dtypes (bool) – If True, call DataFrame.convert_dtypes() at the end.

  • expand_field (str) – If set, look for this field in each row and expand its keys into top-level columns.

  • expand_prefix (str) – If set, prefix expanded column names with this string.

Returns:

The converted DataFrame.

Return type:

pandas.DataFrame

class TemplateConfig(*, name: Annotated[str, Strict(strict=True)], template: Annotated[str, Strict(strict=True)], include_explanations: Annotated[bool, Strict(strict=True)], use_function_calling_if_available: Annotated[bool, Strict(strict=True)], classification_choices: Dict[str, Annotated[float, Strict(strict=True)] | Annotated[int, Strict(strict=True)]] | None = None, direction: OptimizationDirection | None = None, data_granularity: Annotated[str, Strict(strict=True)] | None = None, llm_config: EvaluatorLlmConfig)[source]#

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
name: StrictStr#
template: StrictStr#
include_explanations: StrictBool#
use_function_calling_if_available: StrictBool#
classification_choices: Dict[str, StrictFloat | StrictInt] | None#
direction: OptimizationDirection | None#
data_granularity: StrictStr | None#
llm_config: EvaluatorLlmConfig#
classmethod data_granularity_validate_enum(value)[source]#

Validates the enum

model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

to_str() str[source]#

Returns the string representation of the model using alias

Return type:

str

to_json() str[source]#

Returns the JSON representation of the model using alias

Return type:

str

classmethod from_json(json_str: str) Self | None[source]#

Create an instance of TemplateConfig from a JSON string

Parameters:

json_str (str)

Return type:

Self | None

to_dict() Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

  • None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:

Dict[str, Any]

classmethod from_dict(obj: Dict[str, Any] | None) Self | None[source]#

Create an instance of TemplateConfig from a dict

Parameters:

obj (Dict[str, Any] | None)

Return type:

Self | None