This page was generated from doc/source/methods/ksdrift.ipynb.

Kolmogorov-Smirnov

Overview

The drift detector applies feature-wise two-sample Kolmogorov-Smirnov (K-S) tests. For multivariate data, the p-values obtained for each feature are aggregated via either the Bonferroni or the False Discovery Rate (FDR) correction. The Bonferroni correction is more conservative and controls the probability of at least one false positive. The FDR correction, on the other hand, allows an expected fraction of false positives to occur.
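The procedure described above can be sketched with NumPy and SciPy. This is a minimal illustration, not the library's implementation: the function name `ks_drift` and its return values are assumptions made for this example, and the FDR branch uses the Benjamini-Hochberg procedure as one common choice of FDR correction.

```python
import numpy as np
from scipy.stats import ks_2samp


def ks_drift(x_ref, x, p_val=0.05, correction="bonferroni"):
    """Feature-wise two-sample K-S tests followed by a multivariate correction.

    Hypothetical helper for illustration; x_ref and x have shape (n, n_features).
    """
    n_features = x_ref.shape[1]
    # One univariate two-sample K-S test per feature (column).
    p_vals = np.array(
        [ks_2samp(x_ref[:, f], x[:, f]).pvalue for f in range(n_features)]
    )
    if correction == "bonferroni":
        # Drift is flagged if any p-value falls below p_val / n_features.
        threshold = p_val / n_features
        is_drift = bool((p_vals < threshold).any())
    elif correction == "fdr":
        # Benjamini-Hochberg: compare the i-th smallest p-value with i * q / n_features.
        p_sorted = np.sort(p_vals)
        cutoffs = np.arange(1, n_features + 1) * p_val / n_features
        below = p_sorted < cutoffs
        is_drift = bool(below.any())
        # Report the largest cutoff that is met, or the smallest cutoff if none is met.
        threshold = cutoffs[np.nonzero(below)[0].max()] if is_drift else cutoffs[0]
    else:
        raise ValueError("correction must be 'bonferroni' or 'fdr'")
    return is_drift, p_vals, threshold
```

With five features and `p_val=0.05`, the Bonferroni branch compares every p-value against 0.01, whereas the FDR branch compares the sorted p-values against an increasing sequence of cutoffs, which makes it less conservative.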

For high-dimensional data, we typically want to reduce the dimensionality before computing the feature-wise univariate K-S tests and aggregating those via the chosen correction method. Following suggestions in Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift, we incorporate Untrained AutoEncoders (UAE), black-box shift detection using the classifier’s softmax outputs (BBSDs) and PCA as out-of-the-box preprocessing methods. Preprocessing methods which do not rely on the classifier will usually pick up drift in the input data, while BBSDs focuses on label shift. The adversarial detector which is part of the library can also be transformed into a drift detector that picks up drift which reduces the performance of the classification model. We can therefore combine different preprocessing techniques to determine whether there is drift that hurts the model performance, and whether this drift can be classified as input drift or label shift.
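As an illustration of dimensionality reduction before the univariate tests, the sketch below fits PCA on the reference data with plain NumPy and applies the same projection to new data. The helper names `fit_pca` and `apply_pca` are assumptions made for this example and are not part of the library.

```python
import numpy as np


def fit_pca(x_ref, n_components):
    """Fit PCA on the reference data; return the feature mean and projection matrix."""
    mean = x_ref.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(x_ref - mean, full_matrices=False)
    return mean, vt[:n_components].T


def apply_pca(x, mean, components):
    """Project instances onto the components fitted on the reference data."""
    return (x - mean) @ components


# Synthetic stand-in data: 50 input features reduced to 3 components.
rng = np.random.default_rng(0)
x_ref = rng.normal(size=(200, 50))
x_new = rng.normal(size=(80, 50))

mean, components = fit_pca(x_ref, n_components=3)
x_ref_low = apply_pca(x_ref, mean, components)  # shape (200, 3)
x_new_low = apply_pca(x_new, mean, components)  # shape (80, 3)
```

The projection is fitted once on the reference data and reused for every new batch, so both samples are compared in the same low-dimensional space; the feature-wise K-S tests then run on the 3 reduced features instead of the original 50.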

Detecting input data drift (covariate shift) \(\Delta p(x)\) for text data requires a custom preprocessing step. We can pick up changes in the semantics of the input by extracting (contextual) embeddings and detecting drift on those. Strictly speaking, we are no longer detecting \(\Delta p(x)\), since the whole training procedure (objective function, training data, etc.) for the (pre)trained embeddings has an impact on the embeddings we extract. The library contains functionality to leverage pre-trained embeddings from HuggingFace’s transformers package but also allows you to easily use your own embeddings of choice. Both options are illustrated with examples in the Text drift detection on IMDB movie reviews notebook.

Usage

Initialize

Parameters:

  • p_val: p-value threshold used for the significance of the K-S test for each feature. If the FDR correction method is used, this corresponds to the acceptable q-value.

  • X_ref: Data used as reference distribution.

  • preprocess_X_ref: Whether to already apply the (optional) preprocessing step to the reference data at initialization and store the preprocessed data. Dependent on the preprocessing step, this can reduce the computation time for the predict step significantly, especially when the reference dataset is large. Defaults to True.

  • update_X_ref: Reference data can optionally be updated to the last N instances seen by the detector or via reservoir sampling with size N. For the former, the parameter equals {'last': N} while for reservoir sampling {'reservoir_sampling': N} is passed.

  • preprocess_fn: Function to preprocess the data before computing the data drift metrics. Typically a dimensionality reduction technique.

  • preprocess_kwargs: Keyword arguments for preprocess_fn. The built-in UAE, BBSDs or text-specific preprocessing steps are passed here as well. See the notebooks for image and text data for concrete, detailed examples, and see below for a brief example.

  • correction: Correction type for multivariate data. Either ‘bonferroni’ or ‘fdr’ (False Discovery Rate).

  • alternative: Defines the alternative hypothesis. Options are ‘two-sided’ (default), ‘less’ or ‘greater’.

  • n_features: Number of features used in the K-S test. No need to pass it if no preprocessing takes place. In case of a preprocessing step, this can also be inferred automatically but could be more expensive to compute.

  • n_infer: If the number of features needs to be inferred after the preprocessing step, this sets the number of instances used to infer it, since the cost of doing so can depend on the specific preprocessing step.

  • data_type: Optionally specifies the data type added to the metadata, e.g. ‘tabular’ or ‘image’.
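The two update strategies mentioned for update_X_ref can be sketched as follows. This is an illustrative sketch under stated assumptions, not the library's code: the function names are hypothetical, the reservoir variant uses the classic Algorithm R, and it assumes the reference set never holds more than n instances.

```python
import numpy as np


def update_last(x_ref, x, n):
    """Keep only the last n instances seen, including the new batch x."""
    return np.concatenate([x_ref, x], axis=0)[-n:]


def update_reservoir(x_ref, x, n, n_seen, rng):
    """Reservoir sampling (Algorithm R) with reservoir size n.

    n_seen is the number of instances seen before this batch, including the
    initial reference data. Every instance seen so far ends up in the
    reference set with equal probability.
    """
    x_ref = x_ref.copy()
    for item in x:
        n_seen += 1
        if len(x_ref) < n:
            # Reservoir not full yet: always keep the new instance.
            x_ref = np.concatenate([x_ref, item[None]], axis=0)
        else:
            # Replace a stored instance with probability n / n_seen.
            j = rng.integers(0, n_seen)
            if j < n:
                x_ref[j] = item
    return x_ref, n_seen
```

The 'last' strategy tracks the most recent data and therefore follows slow drift, while reservoir sampling keeps a uniform sample over everything seen so far, so older reference instances are not systematically discarded.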

Initialized drift detector example:

import tensorflow as tf
from tensorflow.keras.layers import Conv2D, Dense, Flatten, InputLayer

from alibi_detect.cd import KSDrift
from alibi_detect.cd.preprocess import UAE  # Untrained AutoEncoder

# Randomly initialized encoder mapping 32x32x3 inputs to a 32-dimensional encoding
encoder_net = tf.keras.Sequential(
    [
        InputLayer(input_shape=(32, 32, 3)),
        Conv2D(64, 4, strides=2, padding='same', activation=tf.nn.relu),
        Conv2D(128, 4, strides=2, padding='same', activation=tf.nn.relu),
        Conv2D(512, 4, strides=2, padding='same', activation=tf.nn.relu),
        Flatten(),
        Dense(32,)
    ]
)
uae = UAE(encoder_net=encoder_net)

cd = KSDrift(
    p_val=0.05,
    X_ref=X_ref,
    preprocess_X_ref=True,
    preprocess_kwargs={'model': uae, 'batch_size': 128},
    alternative='two-sided',
    correction='bonferroni'
)

Detect Drift

We detect data drift by calling predict on a batch of instances X. Setting return_p_val to True returns the feature-wise p-values before the multivariate correction, together with the threshold used by the detector (either for the univariate case or after the multivariate correction). Drift can also be detected at the feature level by setting drift_type to ‘feature’. In that case no multivariate correction takes place, since we return the output of n_features univariate tests. For drift detection on all the features combined with the correction, use ‘batch’.

The prediction takes the form of a dictionary with meta and data keys. meta contains the detector’s metadata while data is also a dictionary which contains the actual predictions stored in the following keys:

  • is_drift: 1 if the sample tested has drifted from the reference data and 0 otherwise.

  • p_val: contains feature-level p-values if return_p_val equals True.

  • threshold: for feature-level drift detection the threshold equals the p-value used for the significance of the K-S test. Otherwise the threshold after the multivariate correction (either bonferroni or fdr) is returned.

  • distance: feature-wise K-S statistics between the reference data and the new batch if return_distance equals True.

preds_drift = cd.predict(X, drift_type='batch', return_p_val=True, return_distance=True)
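To show how the returned dictionary might be inspected, the sketch below compares the feature-level p-values with the returned threshold. It relies only on the keys described above. The helper `drifted_features` is hypothetical, and the dictionary is a hand-written stand-in with made-up values rather than real detector output.

```python
import numpy as np


def drifted_features(preds):
    """Return the indices of features whose p-value falls below the threshold."""
    data = preds["data"]
    p_val = np.asarray(data["p_val"])
    return np.nonzero(p_val < data["threshold"])[0]


# Hand-written stand-in for a prediction on 3 features (illustrative values only).
example_preds = {
    "meta": {},  # detector metadata would be stored here
    "data": {
        "is_drift": 1,
        "p_val": np.array([0.2, 0.001, 0.6]),
        "threshold": 0.05 / 3,  # e.g. a Bonferroni-corrected threshold
    },
}

print(drifted_features(example_preds))
```

With a Bonferroni-corrected threshold, the features selected this way are the ones responsible for the batch being flagged as drifted.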

Saving and loading

The drift detectors can be saved and loaded in the same way as other detectors when using the built-in preprocessing steps (alibi_detect.cd.preprocess.UAE and alibi_detect.cd.preprocess.HiddenOutput) or no preprocessing at all:

from alibi_detect.utils.saving import save_detector, load_detector

filepath = 'my_path'
save_detector(cd, filepath)
cd = load_detector(filepath)

When a custom preprocessing step is used, its keyword arguments can be passed when loading the detector:

cd = load_detector(filepath, preprocess_kwargs=preprocess_kwargs)