algocomponents.utils
Contents
- class algocomponents.utils.ABTools
Bases: object
A collection of methods useful in A/B tests.
ABTools contains a number of methods that are useful for conducting A/B tests and reporting their results.
- get_control_group_ratio_constant(N: int, N_adj: int) → float
Returns the control group ratio constant k.
- Parameters:
N – The minimal sample size (for both the test and control groups).
N_adj – The adjusted sample size. Must be larger than N.
- Returns:
The control group ratio constant.
- get_min_sample_size_binomial(p1: float, p2: float, power: float = 0.8, sig_level: float = 0.05, tail: str = 'two_sided') → int
Returns the minimum sample size to set up an A/B test for a binomial metric.
We assume the same sample size for both test and control group.
- Parameters:
p1 – probability of success for target group.
p2 – probability of success for control group, sometimes referred to as baseline conversion rate.
power – probability of rejecting the null hypothesis when the null hypothesis is false, typically 0.8.
sig_level – significance level often denoted as alpha, typically 0.05.
tail – “one_sided” or “two_sided” test.
- Returns:
minimum sample size required in each group.
- Return type:
min_N
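The method above likely follows the textbook two-proportion power calculation. A minimal, self-contained sketch (hypothetical; the library’s exact formula may differ, e.g. pooled vs. unpooled variance):

```python
import math
from statistics import NormalDist

def min_sample_size_binomial(p1: float, p2: float,
                             power: float = 0.8,
                             sig_level: float = 0.05,
                             tail: str = "two_sided") -> int:
    # Critical values from the standard normal distribution
    alpha = sig_level / 2 if tail == "two_sided" else sig_level
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    # Unpooled variance of the difference in proportions
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)
```

For example, detecting a lift from a 10% to a 12% conversion rate at the defaults requires a few thousand users per group; a one-sided test needs fewer.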
- get_min_sample_size_continuous(u1: float, u2: float, var: float, power: float = 0.8, sig_level: float = 0.05, tail: str = 'two_sided') → int
Returns the minimum sample size to set up an A/B test for a continuous metric.
We assume the same sample size for both test and control group.
- Parameters:
u1 – expected mean for the target group.
u2 – estimated mean for the control group.
var – estimated variance for control group.
power – probability of rejecting the null hypothesis when the null hypothesis is false, typically 0.8.
sig_level – significance level often denoted as alpha, typically 0.05.
tail – “one_sided” or “two_sided” test.
- Returns:
minimum sample size required in each group.
- Return type:
min_N
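For a continuous metric the analogous textbook formula uses a common variance estimate for both groups. A hedged sketch of what this method plausibly computes:

```python
import math
from statistics import NormalDist

def min_sample_size_continuous(u1: float, u2: float, var: float,
                               power: float = 0.8,
                               sig_level: float = 0.05,
                               tail: str = "two_sided") -> int:
    alpha = sig_level / 2 if tail == "two_sided" else sig_level
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    # Two equal-size groups, common variance estimate `var`
    n = 2 * var * (z_alpha + z_beta) ** 2 / (u1 - u2) ** 2
    return math.ceil(n)
```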
- get_unequal_sample_size(N: int, N_adj: int) → Tuple[int, int]
Returns the minimum sample sizes of the control and test groups when the groups have unequal size.
When the control and target groups are of different sizes, the total sample size needs to be adjusted because unequal groups produce higher variance in the test statistic.
- Parameters:
N – The minimal sample size (for both the test and control groups).
N_adj – The adjusted sample size. Must be larger than N.
- Returns:
control_sample_size – Minimum sample size of the control group.
test_sample_size – Minimum sample size of the test group.
- Return type:
Tuple[int, int]
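One plausible reading of N and N_adj (hypothetical; the library’s actual formula may differ): the test group is enlarged to N_adj, and the control group is resized so that the variance of the difference, proportional to 1/n_test + 1/n_control, stays at its equal-allocation value 2/N:

```python
import math

def unequal_sample_sizes(N: int, N_adj: int):
    """Sketch: solve 1/N_adj + 1/n_c = 2/N for the control-group size n_c."""
    assert N_adj > N, "N_adj needs to be bigger than N"
    n_c = math.ceil(N * N_adj / (2 * N_adj - N))
    return n_c, N_adj  # (control_sample_size, test_sample_size)
```

For instance, with N = 1000 per group, growing the test group to 1500 lets the control group shrink to 750 while preserving the standard error.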
- is_significant_binomial(n1: int, n2: int, p1: float, p2: float, sig_level: float = 0.05, tail: str = 'two_sided') → Tuple[float, float]
Tests the null hypothesis against the given alternative for significance:
H0: p1=p2 vs HA: p1!=p2
using Z-test with the pooled standard error.
- Parameters:
n1 – target group size.
n2 – control group size.
p1 – probability of success for target group.
p2 – probability of success for control group, sometimes referred to as baseline conversion rate.
sig_level – significance level often denoted as alpha, typically 0.05.
tail – “one_sided” or “two_sided” test.
- Returns:
z – z-score value.
p – p-value.
- Return type:
Tuple[float, float]
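The pooled two-proportion z-test described above can be sketched as follows (an illustrative implementation, not necessarily the library’s):

```python
import math
from statistics import NormalDist

def z_test_binomial(n1: int, n2: int, p1: float, p2: float,
                    tail: str = "two_sided"):
    # Pooled success probability and pooled standard error
    p_pool = (n1 * p1 + n2 * p2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p = 1 - NormalDist().cdf(abs(z))   # one-sided tail probability
    if tail == "two_sided":
        p *= 2
    return z, p
```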
- is_significant_continuous(n1, n2, x1, x2, var1, var2, sig_level=0.05, tail='two_sided') → Tuple[float, float]
Welch’s t-test: unequal variances, equal or unequal sample sizes.
We test the null hypothesis against the given alternative:
H0: u1=u2 vs HA: u1!=u2
using Welch’s t-test with the unpooled standard error.
- Parameters:
n1 – target group size.
n2 – control group size.
x1 – observed value for target group.
x2 – observed value for control group.
var1 – variance of target group.
var2 – variance of control group.
sig_level – significance level often denoted as alpha, typically 0.05.
tail – “one_sided” or “two_sided” test.
- Returns:
t – t-score value.
p – p-value.
- Return type:
Tuple[float, float]
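The Welch statistic and its degrees of freedom follow directly from the unpooled standard error. A sketch of the core computation (converting t to a p-value additionally needs a Student-t survival function, e.g. scipy.stats.t.sf, which the standard library lacks):

```python
import math

def welch_statistic(n1, n2, x1, x2, var1, var2):
    # Unpooled standard error of the difference in means
    se2 = var1 / n1 + var2 / n2
    t = (x1 - x2) / math.sqrt(se2)
    # Welch–Satterthwaite degrees of freedom
    df = se2 ** 2 / ((var1 / n1) ** 2 / (n1 - 1)
                     + (var2 / n2) ** 2 / (n2 - 1))
    return t, df
```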
- class algocomponents.utils.LoggieDoggie
Bases: object
Your loyal LoggieDoggie that fetches loggers.
LoggieDoggie fetches a logger by name. If a logger is fetched for the first time, it is set up with handlers. If the logger already has handlers, it is treated as already set up and returned as-is.
- Properties:
log_file_name: Name of log file to create or append to.
logger_format: How log messages will be formatted.
date_format: The format to use for the date in the log format.
log_levels: Allowed log levels.
_default_log_level: Log level to use when no log level is given.
_default_log_to_file: Whether to log to file or not when the setting is not present in config.
- date_format = '%Y-%m-%d %H:%M:%S'
- fetch_logger(logger_name: str, config: Dict = None) → Logger
Fetch a logger by logger name, or create one if one did not exist.
- Parameters:
logger_name – Name of the logger to either create or fetch.
config – A dictionary of settings, out of which these are read:
log_level: At what level to log.
log_to_file: Whether log should also output to file.
- Returns:
An instantiated Logger-object.
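The “set up once, then reuse” pattern described above can be sketched with the standard logging module, using the class’s documented logger_format and date_format values (the handler choice here is an assumption; the real method may also add a file handler):

```python
import logging

LOGGER_FORMAT = "%(asctime)s [%(filename)s:%(lineno)d] %(levelname)s: %(message)s"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"

def fetch_logger(logger_name: str, log_level: str = "INFO") -> logging.Logger:
    logger = logging.getLogger(logger_name)
    if logger.handlers:
        # Already set up on a previous fetch -- return as-is
        return logger
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(LOGGER_FORMAT, datefmt=DATE_FORMAT))
    logger.addHandler(handler)
    logger.setLevel(log_level)
    return logger
```

Because logging.getLogger caches by name, two fetches with the same name return the same object, and the handler check prevents duplicate handlers.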
- log_file_name = 'log.log'
- log_levels = {'CRITICAL': 50, 'DEBUG': 10, 'ERROR': 40, 'INFO': 20, 'NOTSET': 0, 'WARNING': 30}
- logger_format = '%(asctime)s [%(filename)s:%(lineno)d] %(levelname)s: %(message)s'
- algocomponents.utils.config_to_str(config) → str
Convert a ConfigParser object to a string.
This is used to log the contents of Task configs.
- Parameters:
config – A ConfigParser object.
- Returns:
A string representation of the ConfigParser.
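ConfigParser can serialise itself to any writable stream, so a likely implementation (a sketch, not necessarily the library’s code) routes it through an in-memory buffer:

```python
import configparser
import io

def config_to_str(config: configparser.ConfigParser) -> str:
    buf = io.StringIO()
    config.write(buf)          # ConfigParser writes itself in INI format
    return buf.getvalue()
```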
- algocomponents.utils.launch_task(task_file_name: str, section: str, adapter_type: str, **task_kwargs)
Find a task by file name and start it with its .start() method.
- Parameters:
task_file_name – Name of the python module (file.py) declaring the Task.
section – What part of config to use.
adapter_type – Which adapter to give the Task.
**task_kwargs – Any keyword argument not matched is passed on to the Task.
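A hypothetical sketch of the load-and-start flow: the real function matches Tasks by file name, while this simplified version resolves a module path directly and assumes the module exposes a Task class with the listed constructor arguments:

```python
import importlib

def launch_task(task_module_name: str, section: str,
                adapter_type: str, **task_kwargs):
    # Resolve the module, build its Task, and start it
    module = importlib.import_module(task_module_name)
    task = module.Task(section=section, adapter_type=adapter_type,
                       **task_kwargs)
    return task.start()
```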
- algocomponents.utils.load_model(model_path: str, metadata: Dict)
- algocomponents.utils.merge_configs(merge_this, into_this, overwrite: bool = False)
Merge two ConfigParser objects.
When ConfigParser reads a second config, it always overwrites. This method was created to merge two configs without overwriting.
- Parameters:
merge_this – The ConfigParser to put into another ConfigParser.
into_this – The ConfigParser you wish to update.
overwrite – Whether values from merge_this should overwrite existing values in into_this.
Examples
merge_this = {DEFAULT: {"a": 1, "b": 2, "c": 3}}
into_this = {DEFAULT: {"a": 1, "b": 99, "d": 4}}
overwrite = False
result = {DEFAULT: {"a": 1, "b": 99, "c": 3, "d": 4}}

merge_this = {DEFAULT: {"a": 1, "b": 2, "c": 3}}
into_this = {DEFAULT: {"a": 1, "b": 99, "d": 4}}
overwrite = True
result = {DEFAULT: {"a": 1, "b": 2, "c": 3, "d": 4}}
- Returns:
The ConfigParser object which is the result of the merge.
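The examples above can be reproduced with a simplified sketch (it iterates the DEFAULT section plus named sections; the real method may handle ConfigParser default-value inheritance more carefully):

```python
import configparser

def merge_configs(merge_this, into_this, overwrite=False):
    for section in ["DEFAULT"] + merge_this.sections():
        if section != "DEFAULT" and not into_this.has_section(section):
            into_this.add_section(section)
        for key, value in merge_this.items(section):
            # Only copy a key that is missing, unless overwrite is set
            if overwrite or not into_this.has_option(section, key):
                into_this.set(section, key, value)
    return into_this
```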
- algocomponents.utils.pre_process_df(df: DataFrame, model_path: str, metadata: Dict) → DataFrame
- algocomponents.utils.predict_with_model(model_path: str, metadata: Dict, df: DataFrame, prediction_column: str = 'score', predict_probabilities: bool = False) → DataFrame
- algocomponents.utils.require_connection(func)
Wrapper which ensures that the adapter is connected.
In order to use it, place @require_connection above your function definition. As long as the adapter is connected, the function is run normally. If the adapter is not connected, an exception is raised.
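A hedged sketch of such a guard decorator, assuming the adapter exposes an is_connected flag and raises a hypothetical NotConnectedError (the real attribute and exception names may differ):

```python
import functools

class NotConnectedError(Exception):
    """Raised when an adapter method is called without a connection."""

def require_connection(func):
    # Hypothetical: check a boolean `is_connected` attribute on the adapter
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if not getattr(self, "is_connected", False):
            raise NotConnectedError(
                f"{func.__name__}() requires a connected adapter")
        return func(self, *args, **kwargs)
    return wrapper
```

functools.wraps preserves the wrapped method’s name and docstring, which keeps logs and tracebacks readable.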