exporter
Use this to export data from Arize. Read this guide for more information.
To use in your code, import the following:
from arize.exporter import ArizeExportClient
- class ArizeExportClient(api_key=None, arize_profile='default', arize_config_path='/home/docs/.arize', host='flight.arize.com', port=443, scheme='grpc+tls')
Bases: object
Arize’s Export Client.
- Parameters:
api_key (str, optional) – Arize-provided personal API key associated with your user profile, located on the API Explorer page. An API key is required to initiate a new client; it can be passed in explicitly, set as an environment variable, or stored in a profile file.
arize_profile (str, optional) – profile name for ArizeExportClient credentials and endpoint.
arize_config_path (str, optional) – path to the config file that stores ArizeExportClient credentials and endpoint. Defaults to ‘~/.arize’.
host (str, optional) – Host of the Arize AI endpoint that export requests are sent to.
port (int, optional) – Port of the Arize AI endpoint that export requests are sent to.
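As a minimal sketch of the credential options described above, the helper below passes an explicit API key when one is given and otherwise falls back to a named profile. The helper names `client_kwargs` and `make_client` are hypothetical and not part of the Arize SDK; only the `api_key` and `arize_profile` parameters come from the signature above.

```python
def client_kwargs(api_key=None, profile="default"):
    """Choose constructor arguments: an explicit key if given, else a profile name."""
    if api_key:
        return {"api_key": api_key}
    return {"arize_profile": profile}


def make_client(api_key=None, profile="default"):
    """Create an ArizeExportClient (hypothetical convenience wrapper)."""
    # Imported lazily so client_kwargs can be used without the arize package installed.
    from arize.exporter import ArizeExportClient

    return ArizeExportClient(**client_kwargs(api_key, profile))
```

Calling `make_client()` with no arguments relies on the key being available from the environment or the profile file, as the parameter description states.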
- export_model_to_df(space_id, model_id, environment, start_time, end_time, include_actuals=False, model_version=None, batch_id=None, where=None, similarity_search_params=None, columns=None)
Exports data of a specific model in the Arize platform to a pandas dataframe for a defined time interval and model environment, optionally by model version and/or batch id.
- Parameters:
space_id (str) – The ID of the space to export models from. It can be retrieved from the URL of the Space Overview page in the Arize UI.
model_id (str) – The name of the model to export. It can be found in the Model Overview tab in the Arize UI.
environment (Environments) – The environment for the model to export (can be Production, Training, or Validation).
start_time (datetime) – The start time of the data to export for the model. The start time is inclusive. The time interval has hourly granularity.
end_time (datetime) – The end time of the data to export for the model. The end time is exclusive. The time interval has hourly granularity.
include_actuals (bool, optional) – Whether to include actuals / ground truth in the exported data. include_actuals only applies to the Production environment and defaults to False.
model_version (str, optional) – The version of the model to export. Model versions for all model environments can be found in the Datasets tab on the model page in the Arize UI. Defaults to None.
batch_id (str, optional) – The batch name of the model to export. Batches only apply to the Validation environment, and can be found in the Datasets tab on the model page in the Arize UI. Defaults to None.
where (str, optional) – A SQL-like WHERE clause that filters a subset of records from the model, e.g. “age > 50 AND state='CA'”. Defaults to None.
similarity_search_params (SimilaritySearchParams, optional) – Parameters for embedding similarity search using cosine similarity. It includes ‘references’, a list of reference embeddings for comparison; ‘search_column_name’, specifying the column that contains the embeddings to search within; and ‘threshold’, which sets the cosine similarity threshold required for embeddings to be considered similar.
columns (list, optional) – Specifies the columns to include from the model data during export. If not provided, all columns will be exported.
- Returns:
A pandas dataframe
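Because the time interval has hourly granularity, with an inclusive start and an exclusive end, it can help to align both bounds to whole hours before calling export_model_to_df. The sketch below does that and then exports the last 24 hours of Production data. The helpers `hourly_window` and `export_last_day` are hypothetical, and the import path `arize.utils.types` for `Environments` is an assumption, since this page does not state where that type lives.

```python
from datetime import datetime, timedelta, timezone


def hourly_window(end, hours):
    """Return (start, end) truncated to whole hours: start inclusive, end exclusive."""
    end = end.replace(minute=0, second=0, microsecond=0)
    return end - timedelta(hours=hours), end


def export_last_day(client, space_id, model_id):
    """Export the last 24 full hours of Production data as a pandas DataFrame."""
    # Assumed import path; imported lazily so hourly_window works without the SDK.
    from arize.utils.types import Environments

    start, end = hourly_window(datetime.now(timezone.utc), 24)
    return client.export_model_to_df(
        space_id=space_id,
        model_id=model_id,
        environment=Environments.PRODUCTION,
        start_time=start,
        end_time=end,
        include_actuals=True,  # only applies to the Production environment
        where="age > 50 AND state='CA'",  # filter taken from the example above
    )
```

Here `client` is an existing ArizeExportClient instance; `space_id` and `model_id` are the values described in the parameter list.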
- export_model_to_parquet(path, space_id, model_id, environment, start_time, end_time, include_actuals=False, model_version=None, batch_id=None, where=None, similarity_search_params=None, columns=None)
Exports data of a specific model in the Arize platform to a parquet file for a defined time interval and model environment, optionally by model version and/or batch id.
- Parameters:
path (str) – Path to the file in which to store the exported data. The file must be in Parquet format and have a ‘.parquet’ extension.
space_id (str) – The ID of the space to export models from. It can be retrieved from the URL of the Space Overview page in the Arize UI.
model_id (str) – The name of the model to export. It can be found in the Model Overview tab in the Arize UI.
environment (Environments) – The environment for the model to export (can be Production, Training, or Validation).
start_time (datetime) – The start time of the data to export for the model. The start time is inclusive. The time interval has hourly granularity.
end_time (datetime) – The end time of the data to export for the model. The end time is exclusive. The time interval has hourly granularity.
include_actuals (bool, optional) – Whether to include actuals / ground truth in the exported data. include_actuals only applies to the Production environment and defaults to False.
model_version (str, optional) – The version of the model to export. Model versions for all model environments can be found in the Datasets tab on the model page in the Arize UI. Defaults to None.
batch_id (str, optional) – The batch name of the model to export. Batches only apply to the Validation environment, and can be found in the Datasets tab on the model page in the Arize UI. Defaults to None.
where (str, optional) – A SQL-like WHERE clause that filters a subset of records from the model, e.g. “age > 50 AND state='CA'”. Defaults to None.
similarity_search_params (SimilaritySearchParams, optional) – Parameters for embedding similarity search using cosine similarity. It includes ‘references’, a list of reference embeddings for comparison; ‘search_column_name’, specifying the column that contains the embeddings to search within; and ‘threshold’, which sets the cosine similarity threshold required for embeddings to be considered similar.
columns (list, optional) – Specifies the columns to include from the model data during export. If not provided, all columns will be exported.
- Returns:
None
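Since export_model_to_parquet returns None and requires a path with a ‘.parquet’ extension, one possible pattern is to normalise the output path first and hand it back to the caller after the export. The sketch below exports one Validation batch this way. The helpers `parquet_path` and `export_batch` are hypothetical, and the import path `arize.utils.types` for `Environments` is an assumption not confirmed by this page.

```python
from pathlib import Path


def parquet_path(directory, name):
    """Build an output path, appending '.parquet' if the name lacks it."""
    if not name.endswith(".parquet"):
        name += ".parquet"
    return str(Path(directory) / name)


def export_batch(client, space_id, model_id, batch_id, start_time, end_time, directory="."):
    """Write one Validation batch to a Parquet file and return the file path."""
    # Assumed import path; imported lazily so parquet_path works without the SDK.
    from arize.utils.types import Environments

    path = parquet_path(directory, f"{model_id}-{batch_id}")
    client.export_model_to_parquet(
        path=path,
        space_id=space_id,
        model_id=model_id,
        environment=Environments.VALIDATION,  # batches only apply to Validation
        start_time=start_time,
        end_time=end_time,
        batch_id=batch_id,
    )
    return path
```

The returned path can then be read back with any Parquet reader, for example `pandas.read_parquet`.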