tab_err
Submodules
Classes
ErrorMechanism | Error Mechanism Abstract Base Class.
ErrorModel | Combines an error mechanism and an error type, and defines what share of the column should be perturbed.
ErrorType | Error Type Abstract Base Class.
Package Contents
- class tab_err.ErrorMechanism(condition_to_column: int | str | None = None, seed: int | None = None)
Bases: abc.ABC
Error Mechanism Abstract Base Class.
- sample(data: pandas.DataFrame, column: str | int, error_rate: float, error_mask: pandas.DataFrame | None = None) -> pandas.DataFrame
Returns an error mask for locations to introduce errors in a pandas DataFrame.
- Description:
Validates the arguments on behalf of the abstract method ‘_sample’, assigns the _random_generator attribute, and then calls the subclass’s _sample method.
- Parameters:
data (pd.DataFrame) – DataFrame containing the column to add errors to
column (str | int) – The column of ‘data’ to create an error mask for
error_rate (float) – Fraction of rows to be affected by errors, in the range [0, 1].
error_mask (pd.DataFrame | None, optional) – An existing error mask to add more errors to in the case of the mid-/high-level APIs. Defaults to None.
- Raises:
ValueError – If error_rate lies outside the interval [0, 1].
TypeError – If ‘data’ is not a pandas DataFrame or is empty.
ValueError – If the mechanism requires it and ‘data’ does not have 2 columns.
- Returns:
Updated dataframe with the generated error mask
- Return type:
pd.DataFrame
- condition_to_column = None
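The contract described for sample (validate the inputs, then return a boolean mask marking the cells to perturb) can be illustrated with a small stand-alone sketch. The function below is hypothetical and is not tab_err's implementation: it assumes rows are chosen uniformly at random, that ‘column’ is a column label, and that the mask has the same shape as ‘data’.

```python
import numpy as np
import pandas as pd


def sample_mask(data, column, error_rate, error_mask=None, seed=None):
    """Hypothetical stand-in for a concrete mechanism's sampling logic."""
    if not isinstance(data, pd.DataFrame) or data.empty:
        raise TypeError("data must be a non-empty pandas DataFrame")
    if not 0 <= error_rate <= 1:
        raise ValueError("error_rate must lie in [0, 1]")

    # Start from an all-False mask unless an existing one is passed in.
    if error_mask is None:
        error_mask = pd.DataFrame(False, index=data.index, columns=data.columns)
    else:
        error_mask = error_mask.copy()

    rng = np.random.default_rng(seed)
    n_errors = round(error_rate * len(data))
    # Pick row positions uniformly at random, without replacement.
    # Simplification: cells already marked in a passed-in mask may be picked again.
    rows = rng.choice(len(data), size=n_errors, replace=False)
    error_mask.iloc[rows, data.columns.get_loc(column)] = True
    return error_mask


df = pd.DataFrame({"a": range(10), "b": list("abcdefghij")})
mask = sample_mask(df, "a", error_rate=0.5, seed=0)
print(int(mask["a"].sum()))  # 5
```

Only the requested column is marked; every other column of the mask stays False.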
- class tab_err.ErrorModel
Combines an error mechanism and an error type, and defines what share of the column should be perturbed.
- error_mechanism
Instance of an ErrorMechanism that will be applied.
- Type:
tab_err.ErrorMechanism
- error_rate
Defines what share of the column should be perturbed.
- Type:
float
- apply(data: pandas.DataFrame, column: str | int) -> tuple[pandas.DataFrame, pandas.DataFrame]
Applies the defined ErrorModel to the given column of a pandas DataFrame.
- Parameters:
data (pd.DataFrame) – The pandas DataFrame to create errors in.
column (str | int) – The column to create errors in.
- Returns:
The first element is a copy of ‘data’ with errors.
The second element is the associated error mask.
- Return type:
tuple[pd.DataFrame, pd.DataFrame]
- error_mechanism: tab_err.ErrorMechanism
- error_rate: float
- error_type: tab_err.ErrorType
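The shape of the value returned by ErrorModel.apply (a perturbed copy plus its error mask) can be sketched without the library. The helper below is hypothetical: it assumes uniform random row selection as the mechanism and uses "replace with a missing value" as a stand-in error type, neither of which is confirmed by this page.

```python
import numpy as np
import pandas as pd


def apply_error_model(data, column, error_rate, seed=None):
    """Hypothetical sketch: returns (perturbed copy, error mask)."""
    rng = np.random.default_rng(seed)
    n_errors = round(error_rate * len(data))
    rows = rng.choice(len(data), size=n_errors, replace=False)

    # Mechanism step: mark which cells of `column` receive an error.
    mask = pd.DataFrame(False, index=data.index, columns=data.columns)
    mask.iloc[rows, data.columns.get_loc(column)] = True

    # Error-type step (stand-in): blank out the marked cells in a copy,
    # so the caller's DataFrame is left untouched.
    perturbed = data.copy()
    perturbed.loc[mask[column], column] = np.nan
    return perturbed, mask


df = pd.DataFrame({"price": [1.0, 2.0, 3.0, 4.0], "qty": [1, 2, 3, 4]})
dirty, mask = apply_error_model(df, "price", error_rate=0.5, seed=0)
print(int(dirty["price"].isna().sum()))  # 2
```

Returning the mask alongside the data lets downstream code check exactly which cells were altered.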
- class tab_err.ErrorType(config: tab_err.error_type._config.ErrorTypeConfig | dict | None = None, seed: int | None = None)
Bases: abc.ABC
Error Type Abstract Base Class.
- apply(data: pandas.DataFrame, error_mask: pandas.DataFrame, column: str | int) -> pandas.Series
Applies an ErrorType to a column of ‘data’. Performs type and shape checks and creates a random number generator.
- Parameters:
data (pd.DataFrame) – The Pandas DataFrame containing the column where errors are to be introduced.
error_mask (pd.DataFrame) – The Pandas DataFrame containing the error mask for ‘column’.
column (str | int) – The index in the ‘data’ and ‘error_mask’ DataFrames where errors are to be introduced.
- Returns:
The data column, ‘column’, after errors of ErrorType at the locations specified by ‘error_mask’ are introduced.
- Return type:
pd.Series
- classmethod from_dict(data: dict[str, Any]) -> ErrorType
Deserialize an ErrorType object from a dictionary.
- Parameters:
data (dict[str, Any]) – A dictionary representation of the ErrorType object.
- Returns:
An ErrorType object deserialized from the dictionary.
- Return type:
ErrorType
- get_valid_columns(data: pandas.DataFrame) -> list[str | int]
Finds the valid columns to which the error type can be applied. Wrapper around _get_valid_columns.
- to_dict() -> dict[str, Any]
Serializes the ErrorType object into a dictionary.
- Returns:
A dictionary representation of the ErrorType object.
- Return type:
dict[str, Any]
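The to_dict / from_dict pair describes a serialization round trip. A minimal sketch of that pattern follows, using a hypothetical ToyErrorType class; the field names ‘name’ and ‘config’ are assumptions made for illustration and do not reflect the dictionary layout tab_err actually produces.

```python
from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class ToyErrorType:
    """Hypothetical class for illustration; not part of tab_err."""

    name: str
    config: dict[str, Any]

    def to_dict(self) -> dict[str, Any]:
        # Serialize the object into a plain dictionary.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "ToyErrorType":
        # Rebuild the object from its dictionary representation.
        return cls(name=data["name"], config=dict(data["config"]))


original = ToyErrorType(name="missing_value", config={"placeholder": None})
restored = ToyErrorType.from_dict(original.to_dict())
print(restored == original)  # True
```

A round trip that reproduces an equal object is a convenient check that nothing is lost during serialization.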