Validation API
This is the API reference for all validation functions. You can find usage examples here.
hh.create_error_report
create_error_report(
df: DataFrame, Model: Any, df_name: str
) -> pd.DataFrame
Uses a Pydantic model to validate each row of a DataFrame and generates an error report. Returns the original DataFrame with three new columns: 'val_error_count', 'val_error_details', and 'validation_status'.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | The DataFrame you want to validate. | required |
| Model | Any | Your pydantic model. Should be a subclass of BaseModel. | required |
| df_name | str | Name of your DataFrame for logging purposes. | required |
Raises:
| Type | Description |
|---|---|
| ImportError | Raised if pydantic is not installed. |
| TypeError | Raised if df is not a DataFrame or Model is not a Pydantic BaseModel class. |
| AttributeError | Raised if Model does not have 'model_validate' method (ensures you are using Pydantic v2). |
Returns:
| Type | Description |
|---|---|
| DataFrame | Your original DataFrame with three new columns: 'val_error_count': number of validation errors in the row (0 if valid); 'val_error_details': a string summarising the validation errors (None if valid); 'validation_status': "Valid" or "Invalid". |
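To make the meaning of the three added columns concrete, here is a minimal sketch of row-wise validation with a Pydantic v2 model. It is an illustrative stand-in and not the library's implementation: the function name `sketch_error_report`, the `Person` model, the sample data, and the exact format of the details string are all assumptions made for this example. The real function also takes `df_name` for logging, which this sketch omits.

```python
from typing import Optional

import pandas as pd
from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    """Hypothetical model used only for this example."""
    name: str
    age: int


def sketch_error_report(df: pd.DataFrame, model: type[BaseModel]) -> pd.DataFrame:
    """Illustrative stand-in that mimics the documented output columns."""
    counts: list[int] = []
    details: list[Optional[str]] = []

    # Validate each row as a plain dict with Pydantic v2's model_validate.
    for record in df.to_dict(orient="records"):
        try:
            model.model_validate(record)
            counts.append(0)
            details.append(None)
        except ValidationError as exc:
            counts.append(exc.error_count())
            # Assumed summary format: "<field>: <message>" joined by "; ".
            details.append(
                "; ".join(
                    f"{'.'.join(str(part) for part in err['loc'])}: {err['msg']}"
                    for err in exc.errors()
                )
            )

    # Return a copy so the caller's DataFrame is left unchanged.
    out = df.copy()
    out["val_error_count"] = counts
    out["val_error_details"] = details
    out["validation_status"] = ["Valid" if c == 0 else "Invalid" for c in counts]
    return out


# Sample data (made up): the second row has an age that cannot be parsed as int.
people = pd.DataFrame({"name": ["Ada", "Bob"], "age": [36, "not a number"]})
report = sketch_error_report(people, Person)
print(report[["name", "val_error_count", "validation_status"]])
```

In this sketch the first row is marked "Valid" with zero errors, and the second row is marked "Invalid" with one error whose details mention the `age` field. Filtering on `validation_status` is then a simple way to separate clean rows from rows that need attention.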