Validation API

This is the API reference for all functions designed to be used for validation. You can find usage examples here.

hh.create_error_report

create_error_report(
    df: DataFrame, Model: Any, df_name: str
) -> pd.DataFrame

Uses a Pydantic model to validate each row of a DataFrame and generates an error report. Returns the original DataFrame with three new columns: 'val_error_count', 'val_error_details', and 'validation_status'.

Parameters:

Name     Type       Description                                              Default
df       DataFrame  The DataFrame you want to validate.                      required
Model    Any        Your Pydantic model. Should be a subclass of BaseModel.  required
df_name  str        Name of your DataFrame for logging purposes.             required

Raises:

Type            Description
ImportError     Raised if pydantic is not installed.
TypeError       Raised if df is not a DataFrame or Model is not a Pydantic BaseModel class.
AttributeError  Raised if Model does not have a 'model_validate' method (ensures you are using Pydantic v2).
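
The checks behind these exceptions can be pictured with a small guard function. This is only a sketch of the documented behaviour, not the library's actual code: the name check_inputs and the error messages are hypothetical, and it assumes pandas and Pydantic v2 are installed.

```python
import pandas as pd


def check_inputs(df, Model):
    """Hypothetical guard that mirrors the documented exceptions."""
    # ImportError: pydantic must be importable.
    try:
        from pydantic import BaseModel
    except ImportError as exc:
        raise ImportError("pydantic is required for validation") from exc

    # TypeError: df must be a DataFrame, Model must be a BaseModel subclass.
    if not isinstance(df, pd.DataFrame):
        raise TypeError("df must be a pandas DataFrame")
    if not (isinstance(Model, type) and issubclass(Model, BaseModel)):
        raise TypeError("Model must be a Pydantic BaseModel class")

    # AttributeError: 'model_validate' only exists on Pydantic v2 models.
    if not hasattr(Model, "model_validate"):
        raise AttributeError("Model has no 'model_validate'; Pydantic v2 is required")
```

Passing a list instead of a DataFrame, or a plain class such as dict instead of a BaseModel subclass, would raise TypeError in this sketch.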

Returns:

Type       Description
DataFrame  Your original DataFrame with three new columns:
           'val_error_count': number of validation errors in the row (0 if valid)
           'val_error_details': a string summarising the validation errors (None if valid)
           'validation_status': "Valid" or "Invalid"
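
To make the shape of the returned report concrete, here is a minimal sketch of row-wise validation with a Pydantic v2 model. It is a stand-in written for illustration, not the implementation of create_error_report: the function error_report_sketch, the Person model, the sample data, and the format of the details string are all assumptions. It also returns a copy rather than modifying the input, which may differ from the library.

```python
import pandas as pd
from pydantic import BaseModel, ValidationError


class Person(BaseModel):  # hypothetical model for illustration
    name: str
    age: int


def error_report_sketch(df, Model):
    """Validate each row with Model and append the three report columns."""
    counts, details = [], []
    for record in df.to_dict(orient="records"):
        try:
            Model.model_validate(record)
            counts.append(0)
            details.append(None)
        except ValidationError as exc:
            errors = exc.errors()
            counts.append(len(errors))
            # Assumed summary format: "<field>: <message>" joined by "; ".
            details.append("; ".join(
                f"{'.'.join(str(p) for p in e['loc'])}: {e['msg']}"
                for e in errors
            ))

    out = df.copy()  # the sketch leaves the input DataFrame untouched
    out["val_error_count"] = counts
    out["val_error_details"] = details
    out["validation_status"] = ["Valid" if c == 0 else "Invalid" for c in counts]
    return out


# Sample data: the second row has a non-integer age.
people = pd.DataFrame({"name": ["Ada", "Bob"], "age": [36, "unknown"]})
report = error_report_sketch(people, Person)
print(report[["val_error_count", "validation_status"]])
```

In this sketch the first row is marked "Valid" with zero errors, and the second is marked "Invalid" with one error reported against the age field.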