Tasks#

class TasksClient(*, sdk_config: SDKConfiguration, generated_client: ApiClient)[source]#

Bases: object

Client for managing Arize tasks and task runs.

This class is primarily intended for internal use within the SDK. Users are strongly encouraged to access resource-specific functionality through arize.ArizeClient.

The tasks client is a thin wrapper around the generated REST API client, using the shared generated API client owned by arize.config.SDKConfiguration.

Parameters:

sdk_config (SDKConfiguration) – Resolved SDK configuration.
generated_client (ApiClient) – Shared generated API client instance.

List tasks the user has access to.

Results support cursor-based pagination. Optionally filter by name, space, project, dataset, or task type.

Parameters:

name (str | None) – Optional case-insensitive substring filter on the task name.
project (str | None) – Optional project name or global ID (base64) to filter results. If the value is a name, space must also be provided.
dataset (str | None) – Optional dataset name or global ID (base64) to filter results. If the value is a name, space must also be provided.
space (str | None) – Optional space name or ID used to disambiguate name-based resolution for project and dataset. If the value is a base64-encoded resource ID it is treated as a space ID; otherwise it is used as a case-insensitive substring filter on the space name.
task_type (TaskType | None) – Optional task type filter. One of "template_evaluation" or "code_evaluation".
limit (int) – Maximum number of tasks to return (1-100).
cursor (str | None) – Opaque pagination cursor from a previous response.

Returns:

A paginated task list response from the Arize REST API.

Raises:

ApiException – If the API request fails.

Return type:

TasksList200Response

get(*, task: str, space: str | None = None) → Task[source]#

Get a task by name or ID.

Parameters:

task (str) – Task name or global ID (base64). If the value looks like an ID it is used directly; otherwise it is resolved by name.
space (str | None) – Optional space name or ID used to disambiguate the task lookup. Recommended when resolving by name.

Returns:

The task with its full configuration.

Raises:

ApiException – If the API request fails (for example, task not found).

Return type:

Task

create_evaluation_task(*, name: str, task_type: TaskType, evaluators: builtins.list[BaseEvaluationTaskRequestEvaluatorsInner], project: str | None = None, dataset: str | None = None, space: str | None = None, experiment_ids: builtins.list[str] | None = None, sampling_rate: float | None = None, is_continuous: bool | None = None, query_filter: str | None = None) → Task[source]#

Create a new evaluation task.

A typed convenience wrapper around the internal task-creation logic for "template_evaluation" and "code_evaluation" task types. Prefer this method when creating evaluation tasks for a cleaner, narrowly-typed signature.

Parameters:

name (str) – Task name (must be unique within the space).
task_type (TaskType) – Task type: "template_evaluation" or "code_evaluation".
evaluators (builtins.list[BaseEvaluationTaskRequestEvaluatorsInner]) –
List of evaluators to attach (at least one required). Each entry is an arize.tasks.types.BaseEvaluationTaskRequestEvaluatorsInner with the following fields:
- evaluator_id — Evaluator global ID (base64). Required.
- query_filter — Per-evaluator filter. Optional.
- column_mappings — Template variable name mappings. Optional.
project (str | None) – Project name or global ID (base64). Required when dataset is not provided.
dataset (str | None) – Dataset name or global ID (base64). Required when project is not provided.
space (str | None) – Optional space name or ID used to disambiguate name-based resolution for project and dataset.
experiment_ids (builtins.list[str] | None) – Experiment global IDs (base64). Required (at least one) when dataset is provided.
sampling_rate (float | None) – Fraction of data to evaluate (0-1). Only valid for project-based tasks.
is_continuous (bool | None) – Whether to run the task continuously. Only valid for project-based tasks.
query_filter (str | None) – Task-level query filter applied to all evaluators.

Returns:

The newly created task.

Raises:

ValueError – If required fields are missing or mutually exclusive fields are combined.
ApiException – If the API request fails.

Return type:

Task
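The argument constraints listed above can be checked locally before a request is sent. The following is a minimal sketch of such a pre-flight check, not the SDK's own validation: the function name check_evaluation_task_args is hypothetical, and treating project and dataset as the mutually exclusive pair is an assumption inferred from the ValueError description.

```python
from __future__ import annotations

from typing import Any, Sequence


def check_evaluation_task_args(
    *,
    evaluators: Sequence[Any],
    project: str | None = None,
    dataset: str | None = None,
    experiment_ids: Sequence[str] | None = None,
    sampling_rate: float | None = None,
    is_continuous: bool | None = None,
) -> None:
    """Raise ValueError if the arguments break a documented constraint."""
    if not evaluators:
        raise ValueError("at least one evaluator is required")
    if project is None and dataset is None:
        raise ValueError("either project or dataset must be provided")
    # Assumption: project and dataset are the mutually exclusive pair.
    if project is not None and dataset is not None:
        raise ValueError("project and dataset cannot be combined")
    if dataset is not None:
        if not experiment_ids:
            raise ValueError("experiment_ids is required when dataset is provided")
        if sampling_rate is not None or is_continuous is not None:
            raise ValueError(
                "sampling_rate and is_continuous are only valid for project-based tasks"
            )
    if sampling_rate is not None and not 0 <= sampling_rate <= 1:
        raise ValueError("sampling_rate must be between 0 and 1")


# A project-based task with a 50% sampling rate passes the check.
check_evaluation_task_args(evaluators=["evaluator-id"], project="my-project", sampling_rate=0.5)
```

Running the check first surfaces configuration mistakes without a network round trip; the server remains the source of truth for what is accepted.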

create_run_experiment_task(*, name: str, dataset: str, run_configuration: RunConfiguration, space: str | None = None) → Task[source]#

Create a new run_experiment task.

A typed convenience wrapper around the internal task-creation logic for "run_experiment" task types. The server drives all LLM calls using the AI integration specified in run_configuration — no local callable is required.

To create and immediately trigger a run in one call, use create_and_run_experiment_task (available separately).

Parameters:

name (str) – Task name (must be unique within the space).
dataset (str) – Dataset name or global ID (base64) to run the experiment against.
run_configuration (RunConfiguration) – Discriminated experiment configuration. Use arize.tasks.types.LlmGenerationRunConfig or arize.tasks.types.TemplateEvaluationRunConfig wrapped in arize.tasks.types.RunConfiguration.
space (str | None) – Optional space name or ID used to resolve dataset by name.

Returns:

The newly created task.

Raises:

ApiException – If the API request fails.

Return type:

Task

update(*, task: str, space: str | None = None, name: str | _Missing = _MISSING, sampling_rate: float | _Missing = _MISSING, is_continuous: bool | _Missing = _MISSING, query_filter: str | None | _Missing = _MISSING, evaluators: builtins.list[BaseEvaluationTaskRequestEvaluatorsInner] | _Missing = _MISSING, run_configuration: RunConfiguration | _Missing = _MISSING) → Task[source]#

Update mutable fields on an existing task.

Dispatches based on the task’s type — resolves the task by ID or name first, then GETs it to determine whether it is an evaluation task or a run_experiment task, and builds the appropriate PATCH body.

At least one mutable field must be provided. Pass None to query_filter to clear the existing filter; omit the argument to leave it unchanged.

For evaluation tasks (template_evaluation / code_evaluation):

Valid fields: name, sampling_rate, is_continuous, query_filter, evaluators.
run_configuration must not be provided.

For run_experiment tasks:

Valid fields: name, run_configuration.
Evaluation-only fields (sampling_rate, is_continuous, query_filter, evaluators) must not be provided.

Parameters:

task (str) – Task name or global ID (base64). Names are resolved within the space when space is provided.
space (str | None) – Optional space name or ID used to disambiguate task name resolution.
name (str | _Missing) – New display name for the task.
sampling_rate (float | _Missing) – Fraction of data to evaluate (0-1). Project-based evaluation tasks only.
is_continuous (bool | _Missing) – Whether the task runs continuously. Evaluation tasks only.
query_filter (str | None | _Missing) – Task-level query filter, or None to clear the filter. Evaluation tasks only.
evaluators (builtins.list[BaseEvaluationTaskRequestEvaluatorsInner] | _Missing) – Full replacement list of evaluators (at least one when provided). Evaluation tasks only.
run_configuration (RunConfiguration | _Missing) – Replacement run configuration. When provided the entire stored config is atomically replaced. run_experiment tasks only.

Returns:

The updated task.

Raises:

ValueError – If no update fields were provided, or if a field is not valid for the resolved task type.
ApiException – If the API request fails.

Return type:

Task

delete(*, task: str, space: str | None = None) → None[source]#

Delete a task and its associated configuration.

Parameters:

task (str) – Task name or global ID (base64).
space (str | None) – Optional space name or ID used when resolving by task name.

Raises:

ApiException – If the API request fails.

Return type:

None

Trigger an on-demand run for a task.

Dispatches based on the task’s type — resolves the task by ID or name first, then GETs it to determine whether it is an evaluation task or a run_experiment task, and builds the appropriate trigger body.

For evaluation tasks (template_evaluation / code_evaluation):

Valid fields: data_start_time, data_end_time, max_spans, override_evaluations, experiment_ids.
All fields are optional; an empty trigger body uses server defaults.

For run_experiment tasks:

Valid fields: experiment_name (required), dataset_version_id, max_examples, tracing_metadata.
experiment_name is the display name for the new experiment that will be created for this run.

Parameters:

task (str) – Task name or global ID (base64) to trigger a run for.
space (str | None) – Optional space name or ID used to disambiguate the task lookup. Recommended when resolving by name.
data_start_time (datetime | None) – Start of the data window to evaluate. Evaluation tasks only.
data_end_time (datetime | None) – End of the data window to evaluate. Defaults to now when omitted. Evaluation tasks only.
max_spans (int | None) – Maximum number of spans to process (default 10,000). Evaluation tasks only.
override_evaluations (bool | None) – Whether to re-evaluate data that already has evaluation labels. Defaults to False. Evaluation tasks only.
experiment_ids (builtins.list[str] | None) – Experiment global IDs (base64) to run against. Only applicable for dataset-based evaluation tasks.
experiment_name (str | None) – Display name for the experiment to be created. Must be unique within the dataset. Required for run_experiment tasks.
dataset_version_id (str | None) – Dataset version global ID (base64). Defaults to the latest version when omitted. run_experiment tasks only.
max_examples (int | None) – Maximum number of examples to run. Mutually exclusive with example_ids (not yet exposed). When omitted, all examples are used. run_experiment tasks only.
tracing_metadata (dict[str, Any] | None) – Arbitrary key-value metadata attached to the run’s traces. run_experiment tasks only.

Returns:

The newly created task run (initially in "pending" status).

Raises:

ValueError – If a field is not valid for the resolved task type, or if experiment_name is missing for a run_experiment task.
ApiException – If the API request fails.

Return type:

TaskRun
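The field rules for triggering a run can be summarized in code. This is a rough sketch of the documented dispatch, assuming that None means "not provided" because every trigger parameter defaults to None; build_trigger_body is a hypothetical name and the SDK's real request-building logic may differ.

```python
from __future__ import annotations

from typing import Any

EVALUATION_TRIGGER_FIELDS = {
    "data_start_time",
    "data_end_time",
    "max_spans",
    "override_evaluations",
    "experiment_ids",
}
RUN_EXPERIMENT_TRIGGER_FIELDS = {
    "experiment_name",
    "dataset_version_id",
    "max_examples",
    "tracing_metadata",
}


def build_trigger_body(task_type: str, **fields: Any) -> dict[str, Any]:
    """Return the trigger fields that apply to task_type, or raise ValueError."""
    # Trigger parameters default to None, so None is treated as "not provided".
    provided = {key: value for key, value in fields.items() if value is not None}
    if task_type == "run_experiment":
        allowed = RUN_EXPERIMENT_TRIGGER_FIELDS
        if "experiment_name" not in provided:
            raise ValueError("experiment_name is required for run_experiment tasks")
    else:
        allowed = EVALUATION_TRIGGER_FIELDS
    invalid = sorted(set(provided) - allowed)
    if invalid:
        raise ValueError(f"fields not valid for {task_type} tasks: {invalid}")
    return provided


# An evaluation task can be triggered with an empty body (server defaults apply).
empty_body = build_trigger_body("template_evaluation")
experiment_body = build_trigger_body("run_experiment", experiment_name="nightly", max_examples=50)
```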

list_runs(*, task: str, space: str | None = None, status: RunStatus | None = None, limit: int = 100, cursor: str | None = None) → TasksListRuns200Response[source]#

List runs for a task.

Results support cursor-based pagination. Optionally filter by run status.

Parameters:

task (str) – Task name or global ID (base64) to list runs for.
space (str | None) – Optional space name or ID used to disambiguate the task lookup. Recommended when resolving by name.
status (RunStatus | None) – Optional run status filter. One of "pending", "running", "completed", "failed", or "cancelled".
limit (int) – Maximum number of runs to return (1-100).
cursor (str | None) – Opaque pagination cursor from a previous response.

Returns:

A paginated task run list response from the Arize REST API.

Raises:

ApiException – If the API request fails.

Return type:

TasksListRuns200Response
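Cursor-based pagination is usually consumed with a loop that feeds each response's cursor into the next call. The sketch below shows one way to do that with list_runs. The attribute name next_cursor on the pagination metadata is an assumption (PaginationMetadata is not documented in this section), and FakeTasksClient is a stand-in so the example runs without network access.

```python
from __future__ import annotations

from types import SimpleNamespace
from typing import Any, Iterator


def iter_task_runs(
    tasks_client: Any,
    *,
    task: str,
    space: str | None = None,
    status: str | None = None,
    page_size: int = 100,
) -> Iterator[Any]:
    """Yield every run for a task, following cursors until none is returned."""
    cursor = None
    while True:
        page = tasks_client.list_runs(
            task=task, space=space, status=status, limit=page_size, cursor=cursor
        )
        yield from page.task_runs
        # Assumption: the pagination metadata exposes the next cursor as next_cursor.
        cursor = getattr(page.pagination, "next_cursor", None)
        if not cursor:
            break


class FakeTasksClient:
    """In-memory stand-in for the tasks client; maps a cursor to one page."""

    def __init__(self, pages: dict) -> None:
        self._pages = pages

    def list_runs(self, *, task, space=None, status=None, limit=100, cursor=None):
        runs, next_cursor = self._pages[cursor]
        return SimpleNamespace(
            task_runs=runs, pagination=SimpleNamespace(next_cursor=next_cursor)
        )


client = FakeTasksClient({None: (["run-1", "run-2"], "c1"), "c1": (["run-3"], None)})
all_runs = list(iter_task_runs(client, task="my-task"))
```

Writing the loop as a generator lets callers stop early without fetching the remaining pages.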

get_run(*, run_id: str) → TaskRun[source]#

Get a task run by its global ID.

Parameters:: run_id (str) – Task run global ID (base64) to retrieve.
Returns:: The task run with its current status and statistics.
Raises:: ApiException – If the API request fails (for example, run not found).
Return type:: TaskRun

cancel_run(*, run_id: str) → TaskRun[source]#

Cancel a task run.

Only valid when the run’s current status is "pending" or "running".

Parameters:: run_id (str) – Task run global ID (base64) to cancel.
Returns:: The updated task run with status "cancelled".
Raises:: ApiException – If the API request fails (for example, run not found or already in terminal state).
Return type:: TaskRun

wait_for_run(*, run_id: str, poll_interval: float = _DEFAULT_POLL_INTERVAL, timeout: float = _DEFAULT_TIMEOUT) → TaskRun[source]#

Poll a task run until it reaches a terminal state.

Repeatedly calls get_run at poll_interval-second intervals until the run’s status is one of "completed", "failed", or "cancelled", or until timeout seconds have elapsed.

Parameters:

run_id (str) – Task run global ID (base64) to wait for.
poll_interval (float) – Seconds between polling attempts. Defaults to 5.
timeout (float) – Maximum seconds to wait before raising TimeoutError. Defaults to 600.

Returns:

The task run in its terminal state.

Raises:

ValueError – If timeout or poll_interval is not positive.
TimeoutError – If the run does not reach a terminal state within timeout seconds.
ApiException – If any polling request fails.

Return type:

TaskRun
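The polling behaviour described above can be sketched as a plain loop. This is an illustration of the technique under stated assumptions, not the SDK's source: poll_until_terminal and its injectable sleep parameter are hypothetical, and fake_get_run stands in for get_run so the example runs offline.

```python
from __future__ import annotations

import time
from types import SimpleNamespace
from typing import Any, Callable

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def poll_until_terminal(
    get_run: Callable[..., Any],
    run_id: str,
    *,
    poll_interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call get_run repeatedly until the run is terminal or the timeout elapses."""
    if poll_interval <= 0 or timeout <= 0:
        raise ValueError("poll_interval and timeout must be positive")
    # A monotonic clock is unaffected by system clock adjustments.
    deadline = time.monotonic() + timeout
    while True:
        run = get_run(run_id=run_id)
        if run.status in TERMINAL_STATUSES:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} not terminal after {timeout} seconds")
        sleep(poll_interval)


# Stand-in for get_run that walks through three statuses.
_statuses = iter(["pending", "running", "completed"])


def fake_get_run(*, run_id: str) -> SimpleNamespace:
    return SimpleNamespace(id=run_id, status=next(_statuses))


final_run = poll_until_terminal(fake_get_run, "run-123", poll_interval=0.01, sleep=lambda _: None)
```

A returned run may still have status "failed" or "cancelled", so callers should inspect status rather than assume success.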

Response Types#

class Task(*, id: Annotated[str, Strict(strict=True)], name: Annotated[str, Strict(strict=True)], type: Annotated[str, Strict(strict=True)], project_id: Annotated[str, Strict(strict=True)] | None = None, dataset_id: Annotated[str, Strict(strict=True)] | None = None, sampling_rate: Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0), Le(le=1)])] | Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0), Le(le=1)])] | None = None, is_continuous: Annotated[bool, Strict(strict=True)], query_filter: Annotated[str, Strict(strict=True)] | None, evaluators: List[TaskEvaluator], experiment_ids: List[Annotated[str, Strict(strict=True)]], run_configuration: RunConfiguration | None = None, last_run_at: datetime | None, created_at: datetime, updated_at: datetime, created_by_user_id: Annotated[str, Strict(strict=True)] | None)[source]#

Bases: BaseModel

A task is a typed, configurable unit of work that ties one or more evaluators to a data source (project or dataset). run_experiment tasks additionally carry a run_configuration that defines the LLM or evaluator settings for each triggered run.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

id (Annotated[str, Strict(strict=True)])
name (Annotated[str, Strict(strict=True)])
type (Annotated[str, Strict(strict=True)])
project_id (Annotated[str, Strict(strict=True)] | None)
dataset_id (Annotated[str, Strict(strict=True)] | None)
sampling_rate (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0), Le(le=1)])] | Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0), Le(le=1)])] | None)
is_continuous (Annotated[bool, Strict(strict=True)])
query_filter (Annotated[str, Strict(strict=True)] | None)
evaluators (List[TaskEvaluator])
experiment_ids (List[Annotated[str, Strict(strict=True)]])
run_configuration (RunConfiguration | None)
last_run_at (datetime | None)
created_at (datetime)
updated_at (datetime)
created_by_user_id (Annotated[str, Strict(strict=True)] | None)

id: StrictStr#

name: StrictStr#

type: StrictStr#

project_id: StrictStr | None#

dataset_id: StrictStr | None#

sampling_rate: Annotated[float, Field(le=1, strict=True, ge=0)] | Annotated[int, Field(le=1, strict=True, ge=0)] | None#

is_continuous: StrictBool#

query_filter: StrictStr | None#

evaluators: List[TaskEvaluator]#

experiment_ids: List[StrictStr]#

run_configuration: RunConfiguration | None#

last_run_at: datetime | None#

created_at: datetime#

updated_at: datetime#

created_by_user_id: StrictStr | None#

classmethod type_validate_enum(value)[source]#: Validates the enum

model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#: Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

to_str() → str[source]#

Returns the string representation of the model using alias

Return type:: str

to_json() → str[source]#

Returns the JSON representation of the model using alias

Return type:: str

classmethod from_json(json_str: str) → Self | None[source]#

Create an instance of Task from a JSON string

Parameters:: json_str (str)
Return type:: Self | None

to_dict() → Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:: Dict[str, Any]
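The None-handling rule above can be illustrated without pydantic. In this sketch, fields_set plays the role of the set of fields passed at initialization, and NULLABLE is a hypothetical list of nullable field names chosen for the example; it is not the generated implementation.

```python
from __future__ import annotations

from typing import Any

# Hypothetical set of fields declared nullable in the schema.
NULLABLE = {"query_filter", "last_run_at"}


def to_dict_sketch(values: dict[str, Any], fields_set: set[str]) -> dict[str, Any]:
    """Drop None values, except for nullable fields that were set explicitly."""
    # Start by dropping every None value.
    result = {key: value for key, value in values.items() if value is not None}
    # Re-add None only for nullable fields that were passed at initialization.
    for key in NULLABLE:
        if key in fields_set and key in values and values[key] is None:
            result[key] = None
    return result


values = {"id": "task-1", "project_id": None, "query_filter": None, "last_run_at": None}
dumped = to_dict_sketch(values, fields_set={"id", "query_filter"})
```

In this example query_filter survives as None because it is nullable and was set, while project_id (not nullable) and last_run_at (nullable but never set) are omitted.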

classmethod from_dict(obj: Dict[str, Any] | None) → Self | None[source]#

Create an instance of Task from a dict

Parameters:: obj (Dict[str, Any] | None)
Return type:: Self | None

class TaskRun(*, id: Annotated[str, Strict(strict=True)], task_id: Annotated[str, Strict(strict=True)], experiment_id: Annotated[str, Strict(strict=True)] | None = None, status: Annotated[str, Strict(strict=True)], run_started_at: datetime | None, run_finished_at: datetime | None, data_start_time: datetime | None, data_end_time: datetime | None, num_successes: Annotated[int, Strict(strict=True)], num_errors: Annotated[int, Strict(strict=True)], num_skipped: Annotated[int, Strict(strict=True)], created_at: datetime, created_by_user_id: Annotated[str, Strict(strict=True)] | None)[source]#

Bases: BaseModel

A task run is an async job that executes the work defined on a task. Runs are created by triggering an existing task (POST /v2/tasks/{task_id}/trigger). For run_experiment tasks, experiment_id is populated after the experiment is provisioned; poll GET /v2/task-runs/{run_id} until status reaches a terminal state.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

id (Annotated[str, Strict(strict=True)])
task_id (Annotated[str, Strict(strict=True)])
experiment_id (Annotated[str, Strict(strict=True)] | None)
status (Annotated[str, Strict(strict=True)])
run_started_at (datetime | None)
run_finished_at (datetime | None)
data_start_time (datetime | None)
data_end_time (datetime | None)
num_successes (Annotated[int, Strict(strict=True)])
num_errors (Annotated[int, Strict(strict=True)])
num_skipped (Annotated[int, Strict(strict=True)])
created_at (datetime)
created_by_user_id (Annotated[str, Strict(strict=True)] | None)

id: StrictStr#

task_id: StrictStr#

experiment_id: StrictStr | None#

status: StrictStr#

run_started_at: datetime | None#

run_finished_at: datetime | None#

data_start_time: datetime | None#

data_end_time: datetime | None#

num_successes: StrictInt#

num_errors: StrictInt#

num_skipped: StrictInt#

created_at: datetime#

created_by_user_id: StrictStr | None#

classmethod status_validate_enum(value)[source]#: Validates the enum

model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#: Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

to_str() → str[source]#

Returns the string representation of the model using alias

Return type:: str

to_json() → str[source]#

Returns the JSON representation of the model using alias

Return type:: str

classmethod from_json(json_str: str) → Self | None[source]#

Create an instance of TaskRun from a JSON string

Parameters:: json_str (str)
Return type:: Self | None

to_dict() → Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:: Dict[str, Any]

classmethod from_dict(obj: Dict[str, Any] | None) → Self | None[source]#

Create an instance of TaskRun from a dict

Parameters:: obj (Dict[str, Any] | None)
Return type:: Self | None

class TasksList200Response(*, tasks: List[Task], pagination: PaginationMetadata)[source]#

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

tasks (List[Task])
pagination (PaginationMetadata)

tasks: List[Task]#

pagination: PaginationMetadata#

model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#: Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

to_str() → str[source]#

Returns the string representation of the model using alias

Return type:: str

to_json() → str[source]#

Returns the JSON representation of the model using alias

Return type:: str

classmethod from_json(json_str: str) → Self | None[source]#

Create an instance of TasksList200Response from a JSON string

Parameters:: json_str (str)
Return type:: Self | None

to_dict() → Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:: Dict[str, Any]

classmethod from_dict(obj: Dict[str, Any] | None) → Self | None[source]#

Create an instance of TasksList200Response from a dict

Parameters:: obj (Dict[str, Any] | None)
Return type:: Self | None

to_df(by_alias: bool = False, exclude_none: str | bool = True, json_normalize: bool = False, convert_dtypes: bool = True, expand_field: str = 'additional_properties', expand_prefix: str = '') → pd.DataFrame#

Convert a list of objects to a pandas.DataFrame.

Behavior:

If an item is a Pydantic v2 model, use .model_dump(by_alias=…).
If an item is a mapping (dict-like), use it as-is.
Otherwise, raise a ValueError (unsupported row type).

Parameters:

self (object) – The object instance containing the field to convert.
by_alias (bool) – Use field aliases when dumping Pydantic models.
exclude_none (str | bool) – Control None/NaN column dropping:
- False: keep Nones as-is
- "all": drop columns where all values are None/NaN
- "any": drop columns where any value is None/NaN
- True: alias for "all"
json_normalize (bool) – If True, flatten nested dicts via pandas.json_normalize.
convert_dtypes (bool) – If True, call DataFrame.convert_dtypes() at the end.
expand_field (str) – If set, look for this field in each row and expand its keys into top-level columns.
expand_prefix (str) – If set, prefix expanded column names with this string.

Returns:

The converted DataFrame.

Return type:

pandas.DataFrame
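The exclude_none modes are easiest to compare on a small example. The following pure-Python sketch mimics the column-dropping rule on a list of dicts rather than a DataFrame; drop_none_columns is a hypothetical helper, it only checks for None (not NaN), and it is not how to_df is implemented.

```python
from __future__ import annotations

from typing import Any


def drop_none_columns(
    rows: list[dict[str, Any]], exclude_none: str | bool = True
) -> list[dict[str, Any]]:
    """Drop columns whose values are None in all rows ("all") or in any row ("any")."""
    if exclude_none is False:
        return [dict(row) for row in rows]
    # True is documented as an alias for "all".
    mode = "all" if exclude_none is True else exclude_none
    if mode not in ("all", "any"):
        raise ValueError(f"unsupported exclude_none value: {exclude_none!r}")
    columns = {key for row in rows for key in row}
    check = all if mode == "all" else any
    # A key absent from a row counts as None for that row.
    dropped = {col for col in columns if check(row.get(col) is None for row in rows)}
    return [{k: v for k, v in row.items() if k not in dropped} for row in rows]


rows = [
    {"id": "a", "query_filter": None, "sampling_rate": 0.5},
    {"id": "b", "query_filter": None, "sampling_rate": None},
]
kept_all = drop_none_columns(rows, "all")
kept_any = drop_none_columns(rows, "any")
```

With "all", only query_filter is dropped because sampling_rate has a value in one row; with "any", sampling_rate is dropped as well.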

class TasksListRuns200Response(*, task_runs: List[TaskRun], pagination: PaginationMetadata)[source]#

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

task_runs (List[TaskRun])
pagination (PaginationMetadata)

task_runs: List[TaskRun]#

pagination: PaginationMetadata#

model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#: Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

to_str() → str[source]#

Returns the string representation of the model using alias

Return type:: str

to_json() → str[source]#

Returns the JSON representation of the model using alias

Return type:: str

classmethod from_json(json_str: str) → Self | None[source]#

Create an instance of TasksListRuns200Response from a JSON string

Parameters:: json_str (str)
Return type:: Self | None

to_dict() → Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:: Dict[str, Any]

classmethod from_dict(obj: Dict[str, Any] | None) → Self | None[source]#

Create an instance of TasksListRuns200Response from a dict

Parameters:: obj (Dict[str, Any] | None)
Return type:: Self | None

to_df(by_alias: bool = False, exclude_none: str | bool = True, json_normalize: bool = False, convert_dtypes: bool = True, expand_field: str = 'additional_properties', expand_prefix: str = '') → pd.DataFrame#

Convert a list of objects to a pandas.DataFrame.

Behavior:

If an item is a Pydantic v2 model, use .model_dump(by_alias=…).
If an item is a mapping (dict-like), use it as-is.
Otherwise, raise a ValueError (unsupported row type).

Parameters:

self (object) – The object instance containing the field to convert.
by_alias (bool) – Use field aliases when dumping Pydantic models.
exclude_none (str | bool) – Control None/NaN column dropping:
- False: keep Nones as-is
- "all": drop columns where all values are None/NaN
- "any": drop columns where any value is None/NaN
- True: alias for "all"
json_normalize (bool) – If True, flatten nested dicts via pandas.json_normalize.
convert_dtypes (bool) – If True, call DataFrame.convert_dtypes() at the end.
expand_field (str) – If set, look for this field in each row and expand its keys into top-level columns.
expand_prefix (str) – If set, prefix expanded column names with this string.

Returns:

The converted DataFrame.

Return type:

pandas.DataFrame

class BaseEvaluationTaskRequestEvaluatorsInner(*, evaluator_id: Annotated[str, Strict(strict=True)], query_filter: Annotated[str, Strict(strict=True)] | None = None, column_mappings: Dict[str, Annotated[str, Strict(strict=True)]] | None = None)[source]#

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

evaluator_id (Annotated[str, Strict(strict=True)])
query_filter (Annotated[str, Strict(strict=True)] | None)
column_mappings (Dict[str, Annotated[str, Strict(strict=True)]] | None)

evaluator_id: StrictStr#

query_filter: StrictStr | None#

column_mappings: Dict[str, StrictStr] | None#

model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#: Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

to_str() → str[source]#

Returns the string representation of the model using alias

Return type:: str

to_json() → str[source]#

Returns the JSON representation of the model using alias

Return type:: str

classmethod from_json(json_str: str) → Self | None[source]#

Create an instance of BaseEvaluationTaskRequestEvaluatorsInner from a JSON string

Parameters:: json_str (str)
Return type:: Self | None

to_dict() → Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:: Dict[str, Any]

classmethod from_dict(obj: Dict[str, Any] | None) → Self | None[source]#

Create an instance of BaseEvaluationTaskRequestEvaluatorsInner from a dict

Parameters:: obj (Dict[str, Any] | None)
Return type:: Self | None

class LlmGenerationRunConfig(*, experiment_type: Annotated[str, Strict(strict=True)], ai_integration_id: Annotated[str, Strict(strict=True)], model_name: Annotated[str, Strict(strict=True)] | None = None, messages: Annotated[List[LLMMessage], MinLen(min_length=1)], input_variable_format: InputVariableFormat, invocation_parameters: InvocationParams | None = None, provider_parameters: Dict[str, Any] | None = None, tool_config: ToolConfig | None = None, prompt_version_id: Annotated[str, Strict(strict=True)] | None = None)[source]#

Bases: BaseModel

Configuration for running an LLM prompt against each dataset example.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

experiment_type (Annotated[str, Strict(strict=True)])
ai_integration_id (Annotated[str, Strict(strict=True)])
model_name (Annotated[str, Strict(strict=True)] | None)
messages (Annotated[List[LLMMessage], FieldInfo(annotation=NoneType, required=True, metadata=[MinLen(min_length=1)])])
input_variable_format (InputVariableFormat)
invocation_parameters (InvocationParams | None)
provider_parameters (Dict[str, Any] | None)
tool_config (ToolConfig | None)
prompt_version_id (Annotated[str, Strict(strict=True)] | None)

experiment_type: StrictStr#

ai_integration_id: StrictStr#

model_name: StrictStr | None#

messages: Annotated[List[LLMMessage], Field(min_length=1)]#

input_variable_format: InputVariableFormat#

invocation_parameters: InvocationParams | None#

provider_parameters: Dict[str, Any] | None#

tool_config: ToolConfig | None#

prompt_version_id: StrictStr | None#

classmethod experiment_type_validate_enum(value)[source]#: Validates the enum

model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#: Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

to_str() → str[source]#

Returns the string representation of the model using alias

Return type:: str

to_json() → str[source]#

Returns the JSON representation of the model using alias

Return type:: str

classmethod from_json(json_str: str) → Self | None[source]#

Create an instance of LlmGenerationRunConfig from a JSON string

Parameters:: json_str (str)
Return type:: Self | None

to_dict() → Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type:: Dict[str, Any]

classmethod from_dict(obj: Dict[str, Any] | None) → Self | None[source]#

Create an instance of LlmGenerationRunConfig from a dict

Parameters:: obj (Dict[str, Any] | None)
Return type:: Self | None

class TemplateEvaluationRunConfig(*, experiment_type: Annotated[str, Strict(strict=True)], ai_integration_id: Annotated[str, Strict(strict=True)], model_name: Annotated[str, Strict(strict=True)] | None = None, template: Annotated[str, Strict(strict=True), MinLen(min_length=1)], provide_explanation: Annotated[bool, Strict(strict=True)], classification_choices: Dict[str, Annotated[float, Strict(strict=True)] | Annotated[int, Strict(strict=True)]] | None = None, column_mapping: Dict[str, Annotated[str, Strict(strict=True)]] | None = None, evaluator_version_id: Annotated[str, Strict(strict=True)] | None = None, invocation_parameters: InvocationParams | None = None, provider_parameters: Dict[str, Any] | None = None)[source]#

Bases: BaseModel

Configuration for running a template-based LLM evaluator against each dataset example.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

experiment_type (Annotated[str, Strict(strict=True)])
ai_integration_id (Annotated[str, Strict(strict=True)])
model_name (Annotated[str, Strict(strict=True)] | None)
template (Annotated[str, Strict(strict=True), MinLen(min_length=1)])
provide_explanation (Annotated[bool, Strict(strict=True)])
classification_choices (Dict[str, Annotated[float, Strict(strict=True)] | Annotated[int, Strict(strict=True)]] | None)
column_mapping (Dict[str, Annotated[str, Strict(strict=True)]] | None)
evaluator_version_id (Annotated[str, Strict(strict=True)] | None)
invocation_parameters (InvocationParams | None)
provider_parameters (Dict[str, Any] | None)
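The constraints in the signature above (strict strings, a non-empty template, a strict boolean, and label-to-number classification choices) can be illustrated with a plain-Python pre-check. This is a minimal sketch, not the SDK's validation: check_template_eval_payload is a hypothetical helper, the placeholder values are not real identifiers, and the accepted experiment_type values are not listed in this section, so the sketch only checks that the field is a string.

```python
from typing import Any, Dict, List


def check_template_eval_payload(payload: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems: List[str] = []

    # Required strict strings: no coercion from other types.
    for key in ("experiment_type", "ai_integration_id", "template"):
        if not isinstance(payload.get(key), str):
            problems.append(f"{key} must be a str")

    # template additionally carries MinLen(min_length=1).
    if payload.get("template") == "":
        problems.append("template must not be empty")

    # Strict bool: the int 1 or the string "true" does not qualify.
    if not isinstance(payload.get("provide_explanation"), bool):
        problems.append("provide_explanation must be a bool")

    # Optional mapping of label -> numeric score (float or int, not bool).
    choices = payload.get("classification_choices")
    if choices is not None:
        if not isinstance(choices, dict):
            problems.append("classification_choices must be a dict")
        else:
            for label, score in choices.items():
                is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
                if not isinstance(label, str) or not is_number:
                    problems.append(f"classification_choices[{label!r}] must map str to a number")

    return problems


# Placeholder values only; substitute real identifiers from your workspace.
payload = {
    "experiment_type": "<experiment-type>",
    "ai_integration_id": "<integration-id>",
    "template": "Is the answer correct?",
    "provide_explanation": True,
    "classification_choices": {"correct": 1, "incorrect": 0},
}

ok = check_template_eval_payload(payload)
bad = check_template_eval_payload({**payload, "template": "", "provide_explanation": "yes"})
```

A check like this can surface obvious mistakes before a payload reaches the model constructor, which raises a ValidationError on invalid input.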

experiment_type: StrictStr#

ai_integration_id: StrictStr#

model_name: StrictStr | None#

template: Annotated[str, Field(min_length=1, strict=True)]#

provide_explanation: StrictBool#

classification_choices: Dict[str, StrictFloat | StrictInt] | None#

column_mapping: Dict[str, StrictStr] | None#

evaluator_version_id: StrictStr | None#

invocation_parameters: InvocationParams | None#

provider_parameters: Dict[str, Any] | None#

classmethod experiment_type_validate_enum(value)[source]#: Validates the enum

model_config: ClassVar[ConfigDict] = {'populate_by_name': True, 'protected_namespaces': (), 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True}#: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

to_str() → str[source]#

Returns the string representation of the model using alias

Return type: str

to_json() → str[source]#

Returns the JSON representation of the model using alias

Return type: str

classmethod from_json(json_str: str) → Self | None[source]#

Create an instance of TemplateEvaluationRunConfig from a JSON string

Parameters: json_str (str)
Return type: Self | None

to_dict() → Dict[str, Any][source]#

Return the dictionary representation of the model using alias.

This has the following differences from calling pydantic’s self.model_dump(by_alias=True):

None is only added to the output dict for nullable fields that were set at model initialization. Other fields with value None are ignored.

Return type: Dict[str, Any]
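The None-handling rule described for to_dict() can be sketched without the SDK. The snippet below is an illustration of the stated behavior only: to_dict_sketch and NULLABLE_FIELDS are hypothetical names, and treating model_name as a nullable field is an assumption made for the example, not something this section confirms.

```python
from typing import Any, Dict, Set

# Assumption for illustration: which fields the schema marks as nullable.
NULLABLE_FIELDS: Set[str] = {"model_name"}


def to_dict_sketch(values: Dict[str, Any], fields_set: Set[str]) -> Dict[str, Any]:
    """Drop None values, except nullable fields that were explicitly set."""
    out: Dict[str, Any] = {}
    for name, value in values.items():
        if value is not None:
            out[name] = value
        elif name in NULLABLE_FIELDS and name in fields_set:
            # An explicit None on a nullable field is kept in the output.
            out[name] = None
    return out


values = {"ai_integration_id": "abc", "model_name": None, "column_mapping": None}

# model_name was passed explicitly as None at initialization.
explicit = to_dict_sketch(values, fields_set={"ai_integration_id", "model_name"})

# model_name was left at its default and never set.
implicit = to_dict_sketch(values, fields_set={"ai_integration_id"})
```

Here explicit keeps the model_name key with a None value, while implicit omits it; column_mapping is dropped in both cases because it is None and not in the assumed nullable set.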

classmethod from_dict(obj: Dict[str, Any] | None) → Self | None[source]#

Create an instance of TemplateEvaluationRunConfig from a dict

Parameters: obj (Dict[str, Any] | None)
Return type: Self | None

class RunConfiguration(*args, oneof_schema_1_validator: LlmGenerationRunConfig | None = None, oneof_schema_2_validator: TemplateEvaluationRunConfig | None = None, actual_instance: LlmGenerationRunConfig | TemplateEvaluationRunConfig | None = None, one_of_schemas: Set[str] = {'LlmGenerationRunConfig', 'TemplateEvaluationRunConfig'}, discriminator_value_class_map: Dict[str, str] = {})[source]#

Bases: BaseModel

Experiment execution configuration for a run_experiment task. Exactly one variant must be supplied, identified by experiment_type. All fields sit at the top level alongside experiment_type (flat, with no wrapper sub-object).

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

oneof_schema_1_validator (LlmGenerationRunConfig | None)
oneof_schema_2_validator (TemplateEvaluationRunConfig | None)
actual_instance (LlmGenerationRunConfig | TemplateEvaluationRunConfig | None)
one_of_schemas (Set[str])
discriminator_value_class_map (Dict[str, str])

oneof_schema_1_validator: LlmGenerationRunConfig | None#

oneof_schema_2_validator: TemplateEvaluationRunConfig | None#

actual_instance: LlmGenerationRunConfig | TemplateEvaluationRunConfig | None#

one_of_schemas: Set[str]#

model_config: ClassVar[ConfigDict] = {'protected_namespaces': (), 'validate_assignment': True}#: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

discriminator_value_class_map: Dict[str, str]#

classmethod actual_instance_must_validate_oneof(v)[source]#
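The "exactly one variant" rule stated in the class description can be sketched in plain Python. This is an illustration of oneOf selection in general, not the validator's actual implementation: pick_one_of is a hypothetical helper, and "variant_a" / "variant_b" are made-up experiment_type values because the real enum values are not listed in this section.

```python
from typing import Any, Callable, Dict, List

Validator = Callable[[Dict[str, Any]], bool]


def pick_one_of(payload: Dict[str, Any], validators: Dict[str, Validator]) -> str:
    """Return the name of the single schema that accepts the payload."""
    matches: List[str] = [name for name, accepts in validators.items() if accepts(payload)]
    if len(matches) != 1:
        # Zero matches and multiple matches are both errors under oneOf.
        raise ValueError(f"expected exactly one matching schema, got {matches}")
    return matches[0]


# Hypothetical discriminator values, used only to make the sketch runnable.
validators: Dict[str, Validator] = {
    "LlmGenerationRunConfig": lambda p: p.get("experiment_type") == "variant_a",
    "TemplateEvaluationRunConfig": lambda p: p.get("experiment_type") == "variant_b",
}

chosen = pick_one_of({"experiment_type": "variant_b"}, validators)
```

With these stand-in validators, chosen is "TemplateEvaluationRunConfig"; a payload whose experiment_type matches neither variant raises ValueError.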

classmethod from_dict(obj: str | Dict[str, Any]) → Self[source]#

Parameters: obj (str | Dict[str, Any])
Return type: Self

classmethod from_json(json_str: str) → Self[source]#

Returns the object represented by the json string

Parameters: json_str (str)
Return type: Self

to_json() → str[source]#

Returns the JSON representation of the actual instance

Return type: str

to_dict() → Dict[str, Any] | LlmGenerationRunConfig | TemplateEvaluationRunConfig | None[source]#

Returns the dict representation of the actual instance

Return type: Dict[str, Any] | LlmGenerationRunConfig | TemplateEvaluationRunConfig | None

to_str() → str[source]#

Returns the string representation of the actual instance

Return type: str