File Processing API
This is the API reference for all functions designed to process files. You can find usage examples here.
hh.get_excel_filepaths_in_folder
get_excel_filepaths_in_folder(
input_dir: str, print_to_terminal: bool = False
) -> list[str]
Returns a list of filepaths to Excel files (with the extension .xlsx or .xls) in a given folder. Note: Only searches the top-level of input_dir; does not recursively search subdirectories.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_dir | str | The directory you want to get the filepaths from. | required |
| print_to_terminal | bool | Set to True if you want the terminal to print messages about the file processing. | False |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | Raised if the directory does not exist. |
Returns:
| Type | Description |
|---|---|
| list[str] | A list of filepaths from the specified folder. Returns an empty list if there are no Excel files in the folder. |
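The behaviour described above (top-level search only, `.xlsx` and `.xls` extensions, `FileNotFoundError` for a missing directory, empty list when nothing matches) can be illustrated with a small standard-library sketch. This is not the library's implementation: `list_excel_files_sketch` is a hypothetical name, and the sorted output and case-insensitive extension matching are assumptions made for the example.

```python
import tempfile
from pathlib import Path

EXCEL_SUFFIXES = {".xlsx", ".xls"}


def list_excel_files_sketch(input_dir: str) -> list[str]:
    """Hypothetical stand-in that mirrors the documented behaviour."""
    folder = Path(input_dir)
    if not folder.is_dir():
        # Documented behaviour: a missing directory raises FileNotFoundError.
        raise FileNotFoundError(f"Directory not found: {input_dir}")
    # iterdir() lists only the top level, so subdirectories are not searched.
    # Assumption: extensions are matched case-insensitively and results sorted.
    return sorted(
        str(path)
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in EXCEL_SUFFIXES
    )


# Small demonstration in a throwaway folder.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("report.xlsx", "legacy.xls", "notes.csv"):
        (Path(tmp) / name).touch()
    # A nested Excel file that should be ignored (no recursive search).
    (Path(tmp) / "archive").mkdir()
    (Path(tmp) / "archive" / "old.xlsx").touch()

    found = [Path(p).name for p in list_excel_files_sketch(tmp)]
    print(found)  # ['legacy.xls', 'report.xlsx']
```

The CSV file and the nested `archive/old.xlsx` are both left out, which matches the note that only the top level of `input_dir` is searched.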
hh.convert_col_snake_case
convert_col_snake_case(df: pd.DataFrame) -> pd.DataFrame
Converts the column names of a DataFrame to snake case. This function also checks for duplicate columns before and after conversion, and raises a ColumnsNotUnique error if duplicates are found.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | The DataFrame whose column names you want to convert to snake case. | required |
Raises:
| Type | Description |
|---|---|
| TypeError | Raised if the input is not a DataFrame. |
| ColumnsNotUnique | Raised if there are duplicate columns in the original DataFrame or if the snake-case conversion results in duplicate column names. The error message specifies which columns are duplicated. |
Returns:
| Type | Description |
|---|---|
| DataFrame | A new DataFrame with the same data but with column names converted to snake case. |
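The two-stage duplicate check described above (once on the original names, once on the converted names) can be sketched on a plain list of column names using only the standard library. Everything here is illustrative: the conversion rules in `to_snake_case` are an assumption and may differ from the library's, `snake_case_columns_sketch` is a hypothetical name, and the `ColumnsNotUnique` class is a local stand-in for the error named in this reference.

```python
import re
from collections import Counter


class ColumnsNotUnique(Exception):
    """Local stand-in for the error named in the reference above."""


def to_snake_case(name: str) -> str:
    # Assumed rules: split camelCase boundaries, replace runs of
    # non-alphanumeric characters with one underscore, then lowercase.
    split = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name.strip())
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", split)
    return cleaned.strip("_").lower()


def _raise_if_duplicated(names: list[str], stage: str) -> None:
    duplicates = sorted(n for n, count in Counter(names).items() if count > 1)
    if duplicates:
        # Name the offending columns so the caller can fix them.
        raise ColumnsNotUnique(
            f"Duplicate columns {stage} conversion: {duplicates}"
        )


def snake_case_columns_sketch(columns: list[str]) -> list[str]:
    """Hypothetical helper: convert names, checking uniqueness twice."""
    _raise_if_duplicated(columns, "before")
    converted = [to_snake_case(c) for c in columns]
    _raise_if_duplicated(converted, "after")
    return converted


print(snake_case_columns_sketch(["First Name", "orderID", "Total Sales ($)"]))
# ['first_name', 'order_id', 'total_sales']

try:
    # Distinct originally, but both become "first_name" after conversion.
    snake_case_columns_sketch(["First Name", "first_name"])
except ColumnsNotUnique as err:
    print(err)
```

The second check matters because names that are distinct before conversion, such as `First Name` and `first_name`, can collide afterwards. With pandas, a list produced this way could be assigned to the `columns` attribute of a copy of the DataFrame.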