Tasks

class TasksClient(*, sdk_config: SDKConfiguration, generated_client: ApiClient)[source]#

Bases: object

Client for managing Arize tasks and task runs.

This class is primarily intended for internal use within the SDK. Users are highly encouraged to access resource-specific functionality via arize.ArizeClient.

The tasks client is a thin wrapper around the generated REST API client, using the shared generated API client owned by arize.config.SDKConfiguration.

Parameters:
  • sdk_config (SDKConfiguration) – Resolved SDK configuration.

  • generated_client (ApiClient) – Shared generated API client instance.

list(*, name: str | None = None, project: str | None = None, dataset: str | None = None, space: str | None = None, task_type: TaskType | None = None, limit: int = 100, cursor: str | None = None) TasksList200Response[source]#

List tasks the user has access to.

Results support cursor-based pagination. Optionally filter by name, space, project, dataset, or task type.

Parameters:
  • name (str | None) – Optional case-insensitive substring filter on the task name.

  • project (str | None) – Optional project name or global ID (base64) to filter results. If the value is a name, space must also be provided.

  • dataset (str | None) – Optional dataset name or global ID (base64) to filter results. If the value is a name, space must also be provided.

  • space (str | None) – Optional space name or ID used to disambiguate name-based resolution for project and dataset. If the value is a base64-encoded resource ID it is treated as a space ID; otherwise it is used as a case-insensitive substring filter on the space name.

  • task_type (TaskType | None) – Optional task type filter. One of "template_evaluation" or "code_evaluation".

  • limit (int) – Maximum number of tasks to return (1-100).

  • cursor (str | None) – Opaque pagination cursor from a previous response.

Returns:

A paginated task list response from the Arize REST API.

Raises:

ApiException – If the API request fails.

Return type:

TasksList200Response
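The cursor-based pagination described above can be sketched as a small generator that keeps calling list until no cursor is returned. This is a minimal illustration, not the SDK's own helper: the next_cursor attribute on the pagination object is an assumption (the PaginationMetadata fields are not documented in this section), and FakeTasksClient is a hypothetical in-memory stand-in so the example runs without network access.

```python
from types import SimpleNamespace
from typing import Any, Iterator


def iter_tasks(tasks_client: Any, *, page_size: int = 100, **filters: Any) -> Iterator[Any]:
    """Yield every task matching ``filters``, following cursors page by page."""
    cursor = None
    while True:
        page = tasks_client.list(limit=page_size, cursor=cursor, **filters)
        yield from page.tasks
        # ASSUMPTION: the pagination metadata exposes `next_cursor`;
        # adjust this to the real field name in your SDK version.
        cursor = getattr(page.pagination, "next_cursor", None)
        if not cursor:
            break


class FakeTasksClient:
    """Hypothetical in-memory stand-in for the tasks client."""

    def __init__(self, names):
        self._names = list(names)

    def list(self, *, limit=100, cursor=None, **_filters):
        start = int(cursor or 0)
        end = start + limit
        next_cursor = str(end) if end < len(self._names) else None
        return SimpleNamespace(
            tasks=[SimpleNamespace(name=n) for n in self._names[start:end]],
            pagination=SimpleNamespace(next_cursor=next_cursor),
        )


fake = FakeTasksClient(["a", "b", "c", "d", "e"])
task_names = [t.name for t in iter_tasks(fake, page_size=2)]
print(task_names)
```

With a real client, the same generator would be called with the documented filters, for example iter_tasks(client, space="my-space", task_type="template_evaluation"), where the space name is a placeholder.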

get(*, task: str, space: str | None = None) Task[source]#

Get a task by name or ID.

Parameters:
  • task (str) – Task name or global ID (base64). If the value looks like an ID it is used directly; otherwise it is resolved by name.

  • space (str | None) – Optional space name or ID used to disambiguate the task lookup. Recommended when resolving by name.

Returns:

The task with its full configuration.

Raises:

ApiException – If the API request fails (for example, task not found).

Return type:

Task

create_evaluation_task(*, name: str, task_type: TaskType, evaluators: builtins.list[BaseEvaluationTaskRequestEvaluatorsInner], project: str | None = None, dataset: str | None = None, space: str | None = None, experiment_ids: builtins.list[str] | None = None, sampling_rate: float | None = None, is_continuous: bool | None = None, query_filter: str | None = None) Task[source]#

Create a new evaluation task.

A typed convenience wrapper around the internal task-creation logic for "template_evaluation" and "code_evaluation" task types. Prefer this method when creating evaluation tasks for a cleaner, narrowly-typed signature.

Parameters:
  • name (str) – Task name (must be unique within the space).

  • task_type (TaskType) – Task type: "template_evaluation" or "code_evaluation".

  • evaluators (builtins.list[BaseEvaluationTaskRequestEvaluatorsInner]) –

    List of evaluators to attach (at least one required). Each entry is an arize.tasks.types.BaseEvaluationTaskRequestEvaluatorsInner with the following fields:

    • evaluator_id — Evaluator global ID (base64). Required.

    • query_filter — Per-evaluator filter. Optional.

    • column_mappings — Template variable name mappings. Optional.

  • project (str | None) – Project name or global ID (base64). Required when dataset is not provided.

  • dataset (str | None) – Dataset name or global ID (base64). Required when project is not provided.

  • space (str | None) – Optional space name or ID used to disambiguate name-based resolution for project and dataset.

  • experiment_ids (builtins.list[str] | None) – Experiment global IDs (base64). Required (at least one) when dataset is provided.

  • sampling_rate (float | None) – Fraction of data to evaluate (0-1). Only valid for project-based tasks.

  • is_continuous (bool | None) – Whether to run the task continuously. Only valid for project-based tasks.

  • query_filter (str | None) – Task-level query filter applied to all evaluators.

Returns:

The newly created task.

Raises:
  • ValueError – If required fields are missing or mutually exclusive fields are combined.

  • ApiException – If the API request fails.

Return type:

Task
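The argument constraints listed for create_evaluation_task can be summarised as a standalone check. The sketch below only mirrors the documented rules and is not the SDK's implementation; check_evaluation_task_args is a hypothetical helper, and treating project and dataset as mutually exclusive is an assumption drawn from the "mutually exclusive fields" wording of the ValueError description.

```python
from typing import Optional, Sequence


def check_evaluation_task_args(
    *,
    evaluators: Sequence[object],
    project: Optional[str] = None,
    dataset: Optional[str] = None,
    experiment_ids: Optional[Sequence[str]] = None,
    sampling_rate: Optional[float] = None,
    is_continuous: Optional[bool] = None,
) -> str:
    """Return "project" or "dataset" for a consistent combination, else raise ValueError."""
    if not evaluators:
        raise ValueError("at least one evaluator is required")
    # ASSUMPTION: exactly one of project / dataset may be given.
    if (project is None) == (dataset is None):
        raise ValueError("provide exactly one of project or dataset")
    if dataset is not None:
        if not experiment_ids:
            raise ValueError("experiment_ids is required when dataset is provided")
        if sampling_rate is not None or is_continuous is not None:
            raise ValueError(
                "sampling_rate and is_continuous are only valid for project-based tasks"
            )
        return "dataset"
    if sampling_rate is not None and not 0 <= sampling_rate <= 1:
        raise ValueError("sampling_rate must be between 0 and 1")
    return "project"


# Placeholder values only; real calls would pass evaluator objects and real names.
print(check_evaluation_task_args(evaluators=["ev"], project="my-project", sampling_rate=0.1))
```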

create_run_experiment_task(*, name: str, dataset: str, run_configuration: RunConfiguration, space: str | None = None) Task[source]#

Create a new run_experiment task.

A typed convenience wrapper around the internal task-creation logic for the "run_experiment" task type. The server drives all LLM calls using the AI integration specified in run_configuration; no local callable is required.

To create and immediately trigger a run in one call, use create_and_run_experiment_task (available separately).

Parameters:
  • name (str)

  • dataset (str)

  • run_configuration (RunConfiguration)

  • space (str | None)

Returns:

The newly created task.

Raises:

ApiException – If the API request fails.

Return type:

Task

update(*, task: str, space: str | None = None, name: str | _Missing = _MISSING, sampling_rate: float | _Missing = _MISSING, is_continuous: bool | _Missing = _MISSING, query_filter: str | None | _Missing = _MISSING, evaluators: builtins.list[BaseEvaluationTaskRequestEvaluatorsInner] | _Missing = _MISSING, run_configuration: RunConfiguration | _Missing = _MISSING) Task[source]#

Update mutable fields on an existing task.

Dispatches on the task's type: the task is first resolved by ID or name, then fetched with a GET to determine whether it is an evaluation task or a run_experiment task, and the matching PATCH body is built.

At least one mutable field must be provided. Pass None to query_filter to clear the existing filter; omit the argument to leave it unchanged.

For evaluation tasks (template_evaluation / code_evaluation):

  • Valid fields: name, sampling_rate, is_continuous, query_filter, evaluators.

  • run_configuration must not be provided.

For run_experiment tasks:

  • Valid fields: name, run_configuration.

  • Evaluation-only fields (sampling_rate, is_continuous, query_filter, evaluators) must not be provided.

Parameters:
  • task (str) – Task name or global ID (base64). Names are resolved within the space when space is provided.

  • space (str | None) – Optional space name or ID used to disambiguate task name resolution.

  • name (str | _Missing) – New display name for the task.

  • sampling_rate (float | _Missing) – Fraction of data to evaluate (0-1). Project-based evaluation tasks only.

  • is_continuous (bool | _Missing) – Whether the task runs continuously. Evaluation tasks only.

  • query_filter (str | None | _Missing) – Task-level query filter, or None to clear the filter. Evaluation tasks only.

  • evaluators (builtins.list[BaseEvaluationTaskRequestEvaluatorsInner] | _Missing) – Full replacement list of evaluators (at least one when provided). Evaluation tasks only.

  • run_configuration (RunConfiguration | _Missing) – Replacement run configuration. When provided the entire stored config is atomically replaced. run_experiment tasks only.

Returns:

The updated task.

Raises:
  • ValueError – If no update fields were provided, or if a field is not valid for the resolved task type.

  • ApiException – If the API request fails.

Return type:

Task
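The distinction between omitting an argument and passing None relies on a sentinel default. The following is a minimal sketch of that pattern under stated assumptions: the _Missing class and MISSING object here are local stand-ins (the SDK's own _MISSING is private), and build_patch_body is a hypothetical helper that handles only three of the documented fields.

```python
from typing import Any, Dict


class _Missing:
    """Type of the sentinel that marks 'argument not supplied'."""

    def __repr__(self) -> str:
        return "MISSING"


MISSING = _Missing()


def build_patch_body(
    *,
    name: Any = MISSING,
    sampling_rate: Any = MISSING,
    query_filter: Any = MISSING,
) -> Dict[str, Any]:
    """Keep only the fields the caller actually supplied, including explicit None."""
    candidates = {
        "name": name,
        "sampling_rate": sampling_rate,
        "query_filter": query_filter,
    }
    body = {key: value for key, value in candidates.items() if value is not MISSING}
    if not body:
        raise ValueError("at least one mutable field must be provided")
    return body


print(build_patch_body(query_filter=None))  # explicit None is kept, so the filter is cleared
print(build_patch_body(name="renamed"))     # query_filter is omitted, so it stays unchanged
```

Because the comparison uses identity (is not MISSING), None remains available as a meaningful value for nullable fields such as query_filter.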

delete(*, task: str, space: str | None = None) None[source]#

Delete a task and its associated configuration.

Parameters:
  • task (str) – Task name or global ID (base64).

  • space (str | None) – Optional space name or ID used when resolving by task name.

Raises:

ApiException – If the API request fails.

Return type:

None

trigger_run(*, task: str, space: str | None = None, data_start_time: datetime | None = None, data_end_time: datetime | None = None, max_spans: int | None = None, override_evaluations: bool | None = None, experiment_ids: builtins.list[str] | None = None, experiment_name: str | None = None, dataset_version_id: str | None = None, max_examples: int | None = None, tracing_metadata: dict[str, Any] | None = None) TaskRun[source]#

Trigger an on-demand run for a task.

Dispatches on the task's type: the task is first resolved by ID or name, then fetched with a GET to determine whether it is an evaluation task or a run_experiment task, and the matching trigger body is built.

For evaluation tasks (template_evaluation / code_evaluation):

  • Valid fields: data_start_time, data_end_time, max_spans, override_evaluations, experiment_ids.

  • All fields are optional; an empty trigger body uses server defaults.

For run_experiment tasks:

  • Valid fields: experiment_name (required), dataset_version_id, max_examples, tracing_metadata.

  • experiment_name is the display name for the new experiment that will be created for this run.

Parameters:
  • task (str) – Task name or global ID (base64) to trigger a run for.

  • space (str | None) – Optional space name or ID used to disambiguate the task lookup. Recommended when resolving by name.

  • data_start_time (datetime | None) – Start of the data window to evaluate. Evaluation tasks only.

  • data_end_time (datetime | None) – End of the data window to evaluate. Defaults to now when omitted. Evaluation tasks only.

  • max_spans (int | None) – Maximum number of spans to process (default 10,000). Evaluation tasks only.

  • override_evaluations (bool | None) – Whether to re-evaluate data that already has evaluation labels. Defaults to False. Evaluation tasks only.

  • experiment_ids (builtins.list[str] | None) – Experiment global IDs (base64) to run against. Only applicable for dataset-based evaluation tasks.

  • experiment_name (str | None) – Display name for the experiment to be created. Must be unique within the dataset. Required for run_experiment tasks.

  • dataset_version_id (str | None) – Dataset version global ID (base64). Defaults to the latest version when omitted. run_experiment tasks only.

  • max_examples (int | None) – Maximum number of examples to run. Mutually exclusive with example_ids (not yet exposed). When omitted, all examples are used. run_experiment tasks only.

  • tracing_metadata (dict[str, Any] | None) – Arbitrary key-value metadata attached to the run’s traces. run_experiment tasks only.

Returns:

The newly created task run (initially in "pending" status).

Raises:
  • ValueError – If a field is not valid for the resolved task type, or if experiment_name is missing for a run_experiment task.

  • ApiException – If the API request fails.

Return type:

TaskRun
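The per-type field rules for trigger_run can be illustrated with a small validator. This is a sketch of the documented behaviour only, not the SDK's code: build_trigger_body is a hypothetical helper, and it assumes that a field left as None counts as "not provided".

```python
from typing import Any, Dict

EVALUATION_FIELDS = {
    "data_start_time",
    "data_end_time",
    "max_spans",
    "override_evaluations",
    "experiment_ids",
}
RUN_EXPERIMENT_FIELDS = {
    "experiment_name",
    "dataset_version_id",
    "max_examples",
    "tracing_metadata",
}


def build_trigger_body(task_type: str, **fields: Any) -> Dict[str, Any]:
    """Validate trigger fields against the task type and return the supplied ones."""
    # ASSUMPTION: None means the caller did not provide the field.
    supplied = {key: value for key, value in fields.items() if value is not None}
    if task_type == "run_experiment":
        allowed = RUN_EXPERIMENT_FIELDS
        if "experiment_name" not in supplied:
            raise ValueError("experiment_name is required for run_experiment tasks")
    elif task_type in ("template_evaluation", "code_evaluation"):
        allowed = EVALUATION_FIELDS
    else:
        raise ValueError(f"unknown task type: {task_type!r}")
    invalid = sorted(set(supplied) - allowed)
    if invalid:
        raise ValueError(f"fields not valid for {task_type}: {', '.join(invalid)}")
    return supplied


print(build_trigger_body("code_evaluation", max_spans=500))
print(build_trigger_body("run_experiment", experiment_name="baseline", max_examples=20))
```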

list_runs(*, task: str, space: str | None = None, status: RunStatus | None = None, limit: int = 100, cursor: str | None = None) TasksListRuns200Response[source]#

List runs for a task.

Results support cursor-based pagination. Optionally filter by run status.

Parameters:
  • task (str) – Task name or global ID (base64) to list runs for.

  • space (str | None) – Optional space name or ID used to disambiguate the task lookup. Recommended when resolving by name.

  • status (RunStatus | None) – Optional run status filter. One of "pending", "running", "completed", "failed", or "cancelled".

  • limit (int) – Maximum number of runs to return (1-100).

  • cursor (str | None) – Opaque pagination cursor from a previous response.

Returns:

A paginated task run list response from the Arize REST API.

Raises:

ApiException – If the API request fails.

Return type:

TasksListRuns200Response

get_run(*, run_id: str) TaskRun[source]#

Get a task run by its global ID.

Parameters:

run_id (str) – Task run global ID (base64) to retrieve.

Returns:

The task run with its current status and statistics.

Raises:

ApiException – If the API request fails (for example, run not found).

Return type:

TaskRun

cancel_run(*, run_id: str) TaskRun[source]#

Cancel a task run.

Only valid when the run’s current status is "pending" or "running".

Parameters:

run_id (str) – Task run global ID (base64) to cancel.

Returns:

The updated task run with status "cancelled".

Raises:

ApiException – If the API request fails (for example, run not found or already in terminal state).

Return type:

TaskRun

wait_for_run(*, run_id: str, poll_interval: float = _DEFAULT_POLL_INTERVAL, timeout: float = _DEFAULT_TIMEOUT) TaskRun[source]#

Poll a task run until it reaches a terminal state.

Repeatedly calls get_run at poll_interval-second intervals until the run’s status is one of "completed", "failed", or "cancelled", or until timeout seconds have elapsed.

Parameters:
  • run_id (str) – Task run global ID (base64) to wait for.

  • poll_interval (float) – Seconds between polling attempts. Defaults to 5.

  • timeout (float) – Maximum seconds to wait before raising TimeoutError. Defaults to 600.

Returns:

The task run in its terminal state.

Raises:
  • ValueError – If timeout or poll_interval is not positive.

  • TimeoutError – If the run does not reach a terminal state within timeout seconds.

  • ApiException – If any polling request fails.

Return type:

TaskRun
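The polling behaviour described for wait_for_run can be sketched as a plain loop over any get_run-like callable. This is an illustration under stated assumptions rather than the SDK's implementation: wait_for_terminal and fake_get_run are hypothetical names, and checking the deadline after each poll (before sleeping) is one reasonable choice that the real method may not share.

```python
import time
from types import SimpleNamespace
from typing import Any, Callable

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_terminal(
    get_run: Callable[[str], Any],
    run_id: str,
    *,
    poll_interval: float = 5.0,
    timeout: float = 600.0,
) -> Any:
    """Poll ``get_run`` until the run is terminal or ``timeout`` seconds elapse."""
    if poll_interval <= 0 or timeout <= 0:
        raise ValueError("poll_interval and timeout must be positive")
    deadline = time.monotonic() + timeout
    while True:
        run = get_run(run_id)
        if run.status in TERMINAL_STATUSES:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} not terminal after {timeout} seconds")
        time.sleep(poll_interval)


# Hypothetical stand-in for get_run that walks through three statuses.
_statuses = iter(["pending", "running", "completed"])


def fake_get_run(run_id: str) -> Any:
    return SimpleNamespace(id=run_id, status=next(_statuses))


final_run = wait_for_terminal(fake_get_run, "run-123", poll_interval=0.01, timeout=1.0)
print(final_run.status)
```

time.monotonic is used instead of time.time so that the deadline is unaffected by system clock adjustments.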

Response Types

class Task(*, id: Annotated[str, Strict(strict=True)], name: Annotated[str, Strict(strict=True)], type: Annotated[str, Strict(strict=True)], project_id: Annotated[str, Strict(strict=True)] | None = None, dataset_id: Annotated[str, Strict(strict=True)] | None = None, sampling_rate: Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0), Le(le=1)])] | Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0), Le(le=1)])] | None = None, is_continuous: Annotated[bool, Strict(strict=True)], query_filter: Annotated[str, Strict(strict=True)] | None, evaluators: List[TaskEvaluator], experiment_ids: List[Annotated[str, Strict(strict=True)]], run_configuration: RunConfiguration | None = None, last_run_at: datetime | None, created_at: datetime, updated_at: datetime, created_by_user_id: Annotated[str, Strict(strict=True)] | None)[source]#

Bases: BaseModel

A task is a typed, configurable unit of work that ties one or more evaluators to a data source (project or dataset). run_experiment tasks additionally carry a run_configuration that defines the LLM or evaluator settings for each triggered run.

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

id: StrictStr#
name: StrictStr#
type: StrictStr#
project_id: StrictStr | None#
dataset_id: StrictStr | None#
sampling_rate: Annotated[float, Field(le=1, strict=True, ge=0)] | Annotated[int, Field(le=1, strict=True, ge=0)] | None#
is_continuous: StrictBool#
query_filter: StrictStr | None#
evaluators: List[TaskEvaluator]#
experiment_ids: List[StrictStr]#
run_configuration: RunConfiguration | None#
last_run_at: datetime | None#
created_at: datetime#
updated_at: datetime#
created_by_user_id: StrictStr | None#
classmethod type_validate_enum(value)[source]#

Validates the enum

model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

to_str() str[source]#

Returns the string representation of the model using alias

Return type:

str

to_json() str[source]#

Returns the JSON representation of the model using alias

Return type:

str

classmethod from_json(json_str: str) Self | None[source]#

Create an instance of Task from a JSON string

Parameters:

json_str (str)

Return type:

Self | None

to_dict() Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

  • None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:

Dict[str, Any]

classmethod from_dict(obj: Dict[str, Any] | None) Self | None[source]#

Create an instance of Task from a dict

Parameters:

obj (Dict[str, Any] | None)

Return type:

Self | None

class TaskRun(*, id: Annotated[str, Strict(strict=True)], task_id: Annotated[str, Strict(strict=True)], experiment_id: Annotated[str, Strict(strict=True)] | None = None, status: Annotated[str, Strict(strict=True)], run_started_at: datetime | None, run_finished_at: datetime | None, data_start_time: datetime | None, data_end_time: datetime | None, num_successes: Annotated[int, Strict(strict=True)], num_errors: Annotated[int, Strict(strict=True)], num_skipped: Annotated[int, Strict(strict=True)], created_at: datetime, created_by_user_id: Annotated[str, Strict(strict=True)] | None)[source]#

Bases: BaseModel

A task run is an async job that executes the work defined on a task. Runs are created by triggering an existing task (POST /v2/tasks/{task_id}/trigger). For run_experiment tasks, experiment_id is populated after the experiment is provisioned; poll GET /v2/task-runs/{run_id} until status reaches a terminal state.

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

id: StrictStr#
task_id: StrictStr#
experiment_id: StrictStr | None#
status: StrictStr#
run_started_at: datetime | None#
run_finished_at: datetime | None#
data_start_time: datetime | None#
data_end_time: datetime | None#
num_successes: StrictInt#
num_errors: StrictInt#
num_skipped: StrictInt#
created_at: datetime#
created_by_user_id: StrictStr | None#
classmethod status_validate_enum(value)[source]#

Validates the enum

model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

to_str() str[source]#

Returns the string representation of the model using alias

Return type:

str

to_json() str[source]#

Returns the JSON representation of the model using alias

Return type:

str

classmethod from_json(json_str: str) Self | None[source]#

Create an instance of TaskRun from a JSON string

Parameters:

json_str (str)

Return type:

Self | None

to_dict() Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

  • None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:

Dict[str, Any]

classmethod from_dict(obj: Dict[str, Any] | None) Self | None[source]#

Create an instance of TaskRun from a dict

Parameters:

obj (Dict[str, Any] | None)

Return type:

Self | None

class TasksList200Response(*, tasks: List[Task], pagination: PaginationMetadata)[source]#

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • tasks (List[Task])

  • pagination (PaginationMetadata)

tasks: List[Task]#
pagination: PaginationMetadata#
model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

to_str() str[source]#

Returns the string representation of the model using alias

Return type:

str

to_json() str[source]#

Returns the JSON representation of the model using alias

Return type:

str

classmethod from_json(json_str: str) Self | None[source]#

Create an instance of TasksList200Response from a JSON string

Parameters:

json_str (str)

Return type:

Self | None

to_dict() Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

  • None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:

Dict[str, Any]

classmethod from_dict(obj: Dict[str, Any] | None) Self | None[source]#

Create an instance of TasksList200Response from a dict

Parameters:

obj (Dict[str, Any] | None)

Return type:

Self | None

to_df(by_alias: bool = False, exclude_none: str | bool = True, json_normalize: bool = False, convert_dtypes: bool = True, expand_field: str = 'additional_properties', expand_prefix: str = '') pd.DataFrame#

Convert a list of objects to a pandas.DataFrame.

Behavior:
  • If an item is a Pydantic v2 model, use .model_dump(by_alias=…).

  • If an item is a mapping (dict-like), use it as-is.

  • Otherwise, raise a ValueError (unsupported row type).

Parameters:
  • self (object) – The object instance containing the field to convert.

  • by_alias (bool) – Use field aliases when dumping Pydantic models.

  • exclude_none (str | bool) – Control None/NaN column dropping:

    • False: keep Nones as-is

    • "all": drop columns where all values are None/NaN

    • "any": drop columns where any value is None/NaN

    • True: alias for "all"

  • json_normalize (bool) – If True, flatten nested dicts via pandas.json_normalize.

  • convert_dtypes (bool) – If True, call DataFrame.convert_dtypes() at the end.

  • expand_field (str) – If set, look for this field in each row and expand its keys into top-level columns.

  • expand_prefix (str) – If set, prefix expanded column names with this string.

Returns:

The converted DataFrame.

Return type:

pandas.DataFrame
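The exclude_none modes are easiest to see on a small example. The sketch below reproduces the documented column-dropping semantics on plain lists of dicts using only the standard library; it is not how to_df is implemented (that method returns a pandas.DataFrame), and drop_none_columns and the sample rows are hypothetical.

```python
import math
from typing import Any, Dict, List, Union


def _is_missing(value: Any) -> bool:
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_none_columns(
    rows: List[Dict[str, Any]], exclude_none: Union[str, bool] = True
) -> List[Dict[str, Any]]:
    """Drop keys ("columns") according to the documented exclude_none modes."""
    if exclude_none is False:
        return [dict(row) for row in rows]
    mode = "all" if exclude_none is True else exclude_none
    if mode not in ("all", "any"):
        raise ValueError(f"unsupported exclude_none value: {exclude_none!r}")
    columns: List[str] = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    test = all if mode == "all" else any
    dropped = {c for c in columns if test(_is_missing(row.get(c)) for row in rows)}
    return [{k: v for k, v in row.items() if k not in dropped} for row in rows]


rows = [
    {"id": "r1", "experiment_id": None, "num_errors": 0, "run_finished_at": None},
    {"id": "r2", "experiment_id": None, "num_errors": 2, "run_finished_at": "t1"},
]
print(drop_none_columns(rows, "all"))  # experiment_id is dropped
print(drop_none_columns(rows, "any"))  # experiment_id and run_finished_at are dropped
```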

class TasksListRuns200Response(*, task_runs: List[TaskRun], pagination: PaginationMetadata)[source]#

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • task_runs (List[TaskRun])

  • pagination (PaginationMetadata)

task_runs: List[TaskRun]#
pagination: PaginationMetadata#
model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

to_str() str[source]#

Returns the string representation of the model using alias

Return type:

str

to_json() str[source]#

Returns the JSON representation of the model using alias

Return type:

str

classmethod from_json(json_str: str) Self | None[source]#

Create an instance of TasksListRuns200Response from a JSON string

Parameters:

json_str (str)

Return type:

Self | None

to_dict() Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

  • None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:

Dict[str, Any]

classmethod from_dict(obj: Dict[str, Any] | None) Self | None[source]#

Create an instance of TasksListRuns200Response from a dict

Parameters:

obj (Dict[str, Any] | None)

Return type:

Self | None

to_df(by_alias: bool = False, exclude_none: str | bool = True, json_normalize: bool = False, convert_dtypes: bool = True, expand_field: str = 'additional_properties', expand_prefix: str = '') pd.DataFrame#

Convert a list of objects to a pandas.DataFrame.

Behavior:
  • If an item is a Pydantic v2 model, use .model_dump(by_alias=…).

  • If an item is a mapping (dict-like), use it as-is.

  • Otherwise, raise a ValueError (unsupported row type).

Parameters:
  • self (object) – The object instance containing the field to convert.

  • by_alias (bool) – Use field aliases when dumping Pydantic models.

  • exclude_none (str | bool) – Control None/NaN column dropping:

    • False: keep Nones as-is

    • "all": drop columns where all values are None/NaN

    • "any": drop columns where any value is None/NaN

    • True: alias for "all"

  • json_normalize (bool) – If True, flatten nested dicts via pandas.json_normalize.

  • convert_dtypes (bool) – If True, call DataFrame.convert_dtypes() at the end.

  • expand_field (str) – If set, look for this field in each row and expand its keys into top-level columns.

  • expand_prefix (str) – If set, prefix expanded column names with this string.

Returns:

The converted DataFrame.

Return type:

pandas.DataFrame

class BaseEvaluationTaskRequestEvaluatorsInner(*, evaluator_id: Annotated[str, Strict(strict=True)], query_filter: Annotated[str, Strict(strict=True)] | None = None, column_mappings: Dict[str, Annotated[str, Strict(strict=True)]] | None = None)[source]#

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

evaluator_id: StrictStr#
query_filter: StrictStr | None#
column_mappings: Dict[str, StrictStr] | None#
model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

to_str() str[source]#

Returns the string representation of the model using alias

Return type:

str

to_json() str[source]#

Returns the JSON representation of the model using alias

Return type:

str

classmethod from_json(json_str: str) Self | None[source]#

Create an instance of BaseEvaluationTaskRequestEvaluatorsInner from a JSON string

Parameters:

json_str (str)

Return type:

Self | None

to_dict() Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

  • None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:

Dict[str, Any]

classmethod from_dict(obj: Dict[str, Any] | None) Self | None[source]#

Create an instance of BaseEvaluationTaskRequestEvaluatorsInner from a dict

Parameters:

obj (Dict[str, Any] | None)

Return type:

Self | None

class LlmGenerationRunConfig(*, experiment_type: Annotated[str, Strict(strict=True)], ai_integration_id: Annotated[str, Strict(strict=True)], model_name: Annotated[str, Strict(strict=True)] | None = None, messages: Annotated[List[LLMMessage], MinLen(min_length=1)], input_variable_format: InputVariableFormat, invocation_parameters: InvocationParams | None = None, provider_parameters: Dict[str, Any] | None = None, tool_config: ToolConfig | None = None, prompt_version_id: Annotated[str, Strict(strict=True)] | None = None)[source]#

Bases: BaseModel

Configuration for running an LLM prompt against each dataset example.

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

experiment_type: StrictStr#
ai_integration_id: StrictStr#
model_name: StrictStr | None#
messages: Annotated[List[LLMMessage], Field(min_length=1)]#
input_variable_format: InputVariableFormat#
invocation_parameters: InvocationParams | None#
provider_parameters: Dict[str, Any] | None#
tool_config: ToolConfig | None#
prompt_version_id: StrictStr | None#
classmethod experiment_type_validate_enum(value)[source]#

Validates the enum

model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

to_str() str[source]#

Returns the string representation of the model using alias

Return type:

str

to_json() str[source]#

Returns the JSON representation of the model using alias

Return type:

str

classmethod from_json(json_str: str) Self | None[source]#

Create an instance of LlmGenerationRunConfig from a JSON string

Parameters:

json_str (str)

Return type:

Self | None

to_dict() Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

  • None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:

Dict[str, Any]

classmethod from_dict(obj: Dict[str, Any] | None) Self | None[source]#

Create an instance of LlmGenerationRunConfig from a dict

Parameters:

obj (Dict[str, Any] | None)

Return type:

Self | None

class TemplateEvaluationRunConfig(*, experiment_type: Annotated[str, Strict(strict=True)], ai_integration_id: Annotated[str, Strict(strict=True)], model_name: Annotated[str, Strict(strict=True)] | None = None, template: Annotated[str, Strict(strict=True), MinLen(min_length=1)], provide_explanation: Annotated[bool, Strict(strict=True)], classification_choices: Dict[str, Annotated[float, Strict(strict=True)] | Annotated[int, Strict(strict=True)]] | None = None, column_mapping: Dict[str, Annotated[str, Strict(strict=True)]] | None = None, evaluator_version_id: Annotated[str, Strict(strict=True)] | None = None, invocation_parameters: InvocationParams | None = None, provider_parameters: Dict[str, Any] | None = None)[source]#

Bases: BaseModel

Configuration for running a template-based LLM evaluator against each dataset example.

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

experiment_type: StrictStr#
ai_integration_id: StrictStr#
model_name: StrictStr | None#
template: Annotated[str, Field(min_length=1, strict=True)]#
provide_explanation: StrictBool#
classification_choices: Dict[str, StrictFloat | StrictInt] | None#
column_mapping: Dict[str, StrictStr] | None#
evaluator_version_id: StrictStr | None#
invocation_parameters: InvocationParams | None#
provider_parameters: Dict[str, Any] | None#
classmethod experiment_type_validate_enum(value)[source]#

Validates the enum

model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

to_str() str[source]#

Returns the string representation of the model using alias

Return type:

str

to_json() str[source]#

Returns the JSON representation of the model using alias

Return type:

str

classmethod from_json(json_str: str) Self | None[source]#

Create an instance of TemplateEvaluationRunConfig from a JSON string

Parameters:

json_str (str)

Return type:

Self | None

to_dict() Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

  • None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:

Dict[str, Any]

classmethod from_dict(obj: Dict[str, Any] | None) Self | None[source]#

Create an instance of TemplateEvaluationRunConfig from a dict

Parameters:

obj (Dict[str, Any] | None)

Return type:

Self | None

class RunConfiguration(*args, oneof_schema_1_validator: LlmGenerationRunConfig | None = None, oneof_schema_2_validator: TemplateEvaluationRunConfig | None = None, actual_instance: LlmGenerationRunConfig | TemplateEvaluationRunConfig | None = None, one_of_schemas: Set[str] = {'LlmGenerationRunConfig', 'TemplateEvaluationRunConfig'}, discriminator_value_class_map: Dict[str, str] = {})[source]#

Bases: BaseModel

Experiment execution configuration for a run_experiment task. Exactly one variant must be supplied, identified by experiment_type. All fields sit at the top level alongside experiment_type (flat — no wrapper sub-object).

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

oneof_schema_1_validator: LlmGenerationRunConfig | None#
oneof_schema_2_validator: TemplateEvaluationRunConfig | None#
actual_instance: LlmGenerationRunConfig | TemplateEvaluationRunConfig | None#
one_of_schemas: Set[str]#
model_config: ClassVar[ConfigDict] = {'protected_namespaces': (), 'validate_assignment': True}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

discriminator_value_class_map: Dict[str, str]#
classmethod actual_instance_must_validate_oneof(v)[source]#
classmethod from_dict(obj: str | Dict[str, Any]) Self[source]#
Parameters:

obj (str | Dict[str, Any])

Return type:

Self

classmethod from_json(json_str: str) Self[source]#

Returns the object represented by the json string

Parameters:

json_str (str)

Return type:

Self

to_json() str[source]#

Returns the JSON representation of the actual instance

Return type:

str

to_dict() Dict[str, Any] | LlmGenerationRunConfig | TemplateEvaluationRunConfig | None[source]#

Returns the dict representation of the actual instance

Return type:

Dict[str, Any] | LlmGenerationRunConfig | TemplateEvaluationRunConfig | None

to_str() str[source]#

Returns the string representation of the actual instance

Return type:

str