evaluators
Use this module to define evaluators as classes. See our docs for more information.
To import evaluators, use the following:
from arize.experimental.datasets.experiments.evaluators.base import ...
- class Evaluator(*args, **kwargs)
Bases:
ABC
A helper superclass that guides the implementation of an Evaluator object. Subclasses must implement either the evaluate or the async_evaluate method; implementing both is recommended but not required.
This class is intended to be subclassed and should not be instantiated directly.
- async async_evaluate(*, dataset_row=None, input=MappingProxyType({}), output=None, experiment_output=None, dataset_output=MappingProxyType({}), metadata=MappingProxyType({}), **kwargs)
Asynchronously evaluate the given inputs and produce an evaluation result. Subclasses should implement this method to perform the actual evaluation logic. Implementing both this asynchronous method and the synchronous evaluate method is recommended but not required.
- Parameters:
output (Optional[TaskOutput]) – The output produced by the task.
expected (Optional[ExampleOutput]) – The expected output for comparison.
dataset_row (Optional[Mapping[str, JSONSerializable]]) – A row from the dataset.
metadata (ExampleMetadata) – Metadata associated with the example.
input (ExampleInput) – The input provided for evaluation.
**kwargs (Any) – Additional keyword arguments.
- Returns:
The result of the evaluation.
- Return type:
- Raises:
NotImplementedError – If the method is not implemented by the subclass.
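As a rough illustration of overriding async_evaluate, the sketch below defines a hypothetical LengthLimit evaluator. To stay runnable without the package installed, it uses a local stand-in for Evaluator that copies only the signature documented above; in real code the base class would come from the import path shown at the top of this page. The max_chars metadata key and the dict-shaped result are assumptions for illustration, since the actual return type is not listed in this section.

```python
import asyncio
from abc import ABC
from types import MappingProxyType


# Stand-in for the documented base class so the sketch runs on its own.
# In real use, import Evaluator from
# arize.experimental.datasets.experiments.evaluators.base instead.
class Evaluator(ABC):
    async def async_evaluate(
        self,
        *,
        dataset_row=None,
        input=MappingProxyType({}),
        output=None,
        experiment_output=None,
        dataset_output=MappingProxyType({}),
        metadata=MappingProxyType({}),
        **kwargs,
    ):
        raise NotImplementedError


class LengthLimit(Evaluator):
    """Hypothetical evaluator: checks that the output fits a character budget."""

    async def async_evaluate(self, *, output=None, metadata=MappingProxyType({}), **kwargs):
        # Placeholder for real awaited work, such as a call to a remote judge.
        await asyncio.sleep(0)
        # "max_chars" is an assumed metadata key, not part of the documented API.
        limit = metadata.get("max_chars", 100)
        text = "" if output is None else str(output)
        ok = len(text) <= limit
        # A plain dict is a placeholder for the library's own result type.
        return {"score": float(ok), "label": "within_limit" if ok else "too_long"}


result = asyncio.run(
    LengthLimit().async_evaluate(output="hello", metadata={"max_chars": 10})
)
print(result)
```

The override names only the keyword arguments it needs and lets **kwargs absorb the rest, which keeps it compatible with the keyword-only signature above.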
- evaluate(*, dataset_row=None, input=MappingProxyType({}), output=None, experiment_output=None, dataset_output=MappingProxyType({}), metadata=MappingProxyType({}), **kwargs)
Evaluate the given inputs and produce an evaluation result. Subclasses should implement this method to perform the actual evaluation logic. Implementing both this synchronous method and the asynchronous async_evaluate method is recommended but not required.
- Parameters:
output (Optional[TaskOutput]) – The output produced by the task.
expected (Optional[ExampleOutput]) – The expected output for comparison.
dataset_row (Optional[Mapping[str, JSONSerializable]]) – A row from the dataset.
metadata (ExampleMetadata) – Metadata associated with the example.
input (ExampleInput) – The input provided for evaluation.
**kwargs (Any) – Additional keyword arguments.
- Raises:
NotImplementedError – If the method is not implemented by the subclass.
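A synchronous evaluator might look like the following minimal sketch. The ExactMatch class, the "expected" column name, and the dict-shaped result are all hypothetical; the local Evaluator stand-in only mirrors the evaluate signature documented above so the example runs without the package installed.

```python
from abc import ABC
from types import MappingProxyType


# Stand-in for the documented base class so the sketch runs on its own.
# In real use, import Evaluator from
# arize.experimental.datasets.experiments.evaluators.base instead.
class Evaluator(ABC):
    def evaluate(
        self,
        *,
        dataset_row=None,
        input=MappingProxyType({}),
        output=None,
        experiment_output=None,
        dataset_output=MappingProxyType({}),
        metadata=MappingProxyType({}),
        **kwargs,
    ):
        raise NotImplementedError


class ExactMatch(Evaluator):
    """Hypothetical evaluator: compares the task output to a reference value."""

    def evaluate(self, *, output=None, dataset_row=None, **kwargs):
        # "expected" is an assumed column name in the dataset row.
        expected = (dataset_row or {}).get("expected")
        matched = output == expected
        # A plain dict is a placeholder for the library's own result type.
        return {
            "score": 1.0 if matched else 0.0,
            "label": "match" if matched else "mismatch",
        }


result = ExactMatch().evaluate(output="Paris", dataset_row={"expected": "Paris"})
print(result)
```

Calling the stand-in base class's evaluate directly raises NotImplementedError, matching the Raises entry above.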