tab_err.error_mechanism
Classes
Package Contents
- class tab_err.error_mechanism.EAR
Bases: tab_err.error_mechanism._error_mechanism.ErrorMechanism
ErrorMechanism subclass implementing the ‘Erroneous At Random’ error mechanism.
- Description:
Errors are assumed to depend on other variables in the data.
- class tab_err.error_mechanism.ECAR
Bases: tab_err.error_mechanism._error_mechanism.ErrorMechanism
ErrorMechanism subclass implementing the ‘Erroneous Completely At Random’ error mechanism.
- Description:
Errors are assumed to be completely independent of the data distribution.
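To make "completely independent of the data distribution" concrete, the sketch below picks error locations uniformly at random with plain NumPy and pandas. It is not tab_err's implementation: the function name `ecar_mask_sketch` and its arguments are hypothetical and chosen only for illustration.

```python
from typing import Optional

import numpy as np
import pandas as pd


def ecar_mask_sketch(
    data: pd.DataFrame, column: str, error_rate: float, seed: Optional[int] = None
) -> pd.DataFrame:
    """Hypothetical helper: flag rows of `column` uniformly at random."""
    rng = np.random.default_rng(seed)
    # Start from an all-False mask with the same shape and labels as the data.
    mask = pd.DataFrame(False, index=data.index, columns=data.columns)
    n_errors = round(error_rate * len(data))
    # The row choice never looks at the values in `data`,
    # so it is independent of the data distribution.
    chosen = rng.choice(len(data), size=n_errors, replace=False)
    mask.iloc[chosen, mask.columns.get_loc(column)] = True
    return mask


df = pd.DataFrame(
    {"age": [23, 35, 41, 58, 62, 19, 77, 30, 45, 51], "city": list("ABCDEFGHIJ")}
)
ecar_mask = ecar_mask_sketch(df, "age", error_rate=0.3, seed=0)
```

Which rows are flagged depends only on the seed, not on the values of `age` or `city`; only the requested column is ever marked.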
- class tab_err.error_mechanism.ENAR(condition_to_column: int | str | None = None, seed: int | None = None)
Bases: tab_err.error_mechanism._error_mechanism.ErrorMechanism
ErrorMechanism subclass implementing the ‘Erroneous Not At Random’ error mechanism.
- Description:
Errors are assumed to depend on either other variables, the incorrect data itself, or both.
- sample(data: pandas.DataFrame, column: str | int, error_rate: float, error_mask: pandas.DataFrame | None = None) → pandas.DataFrame
Returns an error mask marking the locations in a pandas DataFrame where errors should be introduced.
- Description:
Validates the arguments, assigns the _random_generator attribute, and then calls the subclass implementation of the abstract method ‘_sample’.
- Parameters:
data (pd.DataFrame) – DataFrame containing the column to add errors to
column (str | int) – The column of ‘data’ to create an error mask for
error_rate (float) – Fraction of rows to be affected by errors, in the range [0, 1].
error_mask (pd.DataFrame | None, optional) – An existing error mask to add more errors to in the case of the mid-/high-level APIs. Defaults to None.
- Raises:
ValueError – If error_rate lies outside the interval [0, 1].
TypeError – If the ‘data’ argument is not a pandas DataFrame or is empty.
ValueError – If the mechanism requires at least 2 columns and ‘data’ has fewer.
- Returns:
The error mask as a DataFrame, updated with the newly generated error locations.
- Return type:
pd.DataFrame
- condition_to_column = None
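The contract documented for sample (validate the inputs, then return a boolean mask, optionally extending an existing one) can be sketched without the library as follows. This is a hypothetical stand-in, not tab_err's code: the name `enar_sample_sketch` is invented, and the selection rule (flagging the rows with the largest values, so that errors depend on the column itself) is an assumption made purely to illustrate a "not at random" dependency. Going by the signatures above, the corresponding library call would presumably look like `ENAR(seed=0).sample(data, column, error_rate)`; that call is not executed here.

```python
import pandas as pd


def enar_sample_sketch(data, column, error_rate, error_mask=None):
    """Hypothetical stand-in that follows the documented contract of `sample`."""
    # Input checks mirroring the documented Raises section.
    if not isinstance(data, pd.DataFrame) or data.empty:
        raise TypeError("'data' must be a non-empty pandas DataFrame.")
    if not 0 <= error_rate <= 1:
        raise ValueError("'error_rate' must lie in [0, 1].")

    # Start a fresh mask, or extend a copy of the one passed in.
    if error_mask is None:
        error_mask = pd.DataFrame(False, index=data.index, columns=data.columns)
    else:
        error_mask = error_mask.copy()

    # Accept a column label or a positional index, as the signature allows.
    name = data.columns[column] if isinstance(column, int) else column
    n_errors = round(error_rate * len(data))

    # Assumption for illustration: rows with the largest values receive errors,
    # so the error locations depend on the (soon to be incorrect) data itself.
    target_rows = data[name].sort_values(ascending=False).index[:n_errors]
    error_mask.loc[target_rows, name] = True
    return error_mask


incomes = pd.DataFrame({"income": [10, 50, 30, 20, 40]})
enar_mask = enar_sample_sketch(incomes, "income", error_rate=0.4)
```

With five rows and an error rate of 0.4, two cells are flagged; under the assumed rule these are the rows holding the two largest incomes.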