When we published the GHOST paper on shifting the decision boundary to improve the predictive performance of classification models built on imbalanced datasets, we only considered binary classifiers (e.g. active/inactive, soluble/insoluble, etc.). I was recently asked if the method could be extended to ternary (three-class) classifiers. This post is about doing that.
The code here isn't set up for easy re-use at the moment. It will eventually find its way into the open-source ghostml package once we've had a chance to review and test it more thoroughly.
Aside: the ghostml package is now pip installable. Run
    python -m pip install ghostml
to install it in your environment.
In order for this to make sense, I think I should start with some explanation of the way I've approached the problem:
Things are a bit more complicated here than with binary classifiers. For the binary case we just have a single threshold which determines whether an instance is predicted to be in class 0 or 1. So, assuming that we optimized based on the probability of class 1, we can formulate the decision as:
    if probabilities[1] >= threshold:
        prediction = 1
    else:
        prediction = 0
Before doing any optimization, threshold is equal to 0.5.
For ternary predictions we have two different decision boundaries and there's no longer a simple threshold; instead the default decision rule can be expressed as:
prediction = argmax(probabilities)
i.e., the prediction is the class which has the highest predicted probability.
Aside: the same decision rule can be used for a binary classifier with the default threshold. It's just easier to explain using the threshold of 0.5.
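To make the aside concrete, here is a minimal sketch comparing the two binary rules on a few made-up probability pairs. The function names are hypothetical and only used for illustration. The two rules agree everywhere except on an exact 0.5/0.5 tie, where the `>=` comparison picks class 1 while an argmax that returns the first maximum (as `np.argmax` does) picks class 0.

```python
def predict_with_threshold(probabilities, threshold=0.5):
    """Binary rule from the text: threshold the class-1 probability."""
    return 1 if probabilities[1] >= threshold else 0


def predict_with_argmax(probabilities):
    """Index of the largest probability; the first index wins ties."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


# made-up probability pairs (class 0, class 1), not taken from any model
examples = [(0.9, 0.1), (0.6, 0.4), (0.3, 0.7), (0.5, 0.5)]
for probs in examples:
    print(probs, predict_with_threshold(probs), predict_with_argmax(probs))
```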
If we want to introduce two thresholds for the ternary classifier, and assuming that we optimize the thresholds for classes 0 and 2, we have to use a more complex decision rule:
    if probabilities[0]>=thresholds[0]:
        # we might still be in class 2 if the relative probability of that
        # is larger than the probability of class 0
        if (probabilities[2]-thresholds[1])>(probabilities[0]-thresholds[0]):
            prediction = 2
        else:
            prediction = 0
    elif probabilities[2]>=thresholds[1]:
        prediction = 2
    else:
        prediction = 1
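As a worked example, the sketch below applies a two-threshold rule of this kind to a few hand-picked probability vectors. The function name `ternary_decision` and the probability values are made up for illustration; the thresholds (0.25, 0.2) are just one plausible pair for an imbalanced problem. An instance lands in class 2 whenever it clears the class-2 threshold and class 0 does not clear its own, or when both clear and class 2 clears by the larger margin.

```python
def ternary_decision(probabilities, thresholds):
    """Two-threshold rule; thresholds = (class-0 threshold, class-2 threshold)."""
    if probabilities[0] >= thresholds[0]:
        # class 0 clears its threshold, but class 2 still wins
        # if it clears its own threshold by a larger margin
        if (probabilities[2] - thresholds[1]) > (probabilities[0] - thresholds[0]):
            return 2
        return 0
    if probabilities[2] >= thresholds[1]:
        return 2
    return 1


thresholds = (0.25, 0.2)
examples = [
    (0.30, 0.60, 0.10),  # argmax says 1, the shifted rule says 0
    (0.26, 0.34, 0.40),  # both minority classes clear; class 2 by more
    (0.10, 0.65, 0.25),  # argmax says 1, the shifted rule says 2
    (0.15, 0.75, 0.10),  # neither threshold is reached, so class 1
]
for probs in examples:
    print(probs, ternary_decision(probs, thresholds))
```

One thing worth noticing: unlike the binary case, setting both thresholds to 0.5 does not reproduce the argmax rule. For probabilities (0.40, 0.35, 0.25) argmax picks class 0, but neither threshold is reached, so this rule returns class 1.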
For the sake of this post let's assume that we're optimizing the thresholds for classes 0 and 2. We could also do 0 and 1, or 1 and 2, and the results should still be the same.
In this post I explore two different approaches for optimizing these thresholds.
The first approach is greedy: I optimize the two thresholds independently of each other by constructing two binary classification problems and optimizing the thresholds for those problems. Here's the process:
1. Set the y values to 1 if the original value is 0 and to 0 otherwise.
2. Use the ghostml approach with that binary classification data and the predicted probability of each training point being in class 0 in order to set threshold0, the threshold for the predicted probability of being 0.
3. Set the y values to 1 if the original value is 2 and to 0 otherwise.
4. Use the ghostml approach with that binary classification data and the predicted probability of each training point being in class 2 in order to set threshold2, the threshold for the predicted probability of being 2.
Since the current ghostml code doesn't support using balanced accuracy for optimization, I just use kappa for the greedy optimization.
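The steps above can be sketched in plain Python. This is a simplified stand-in for the ghostml call, not its implementation: `cohen_kappa`, `pick_threshold`, and `greedy_ternary_thresholds` are hypothetical helpers, the toy labels and probabilities are invented, and returning 0.0 for the degenerate single-class case is my own convention.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa for two equal-length label sequences."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 0.0  # assumption: treat the single-class case as "no agreement beyond chance"
    return (observed - expected) / (1 - expected)


def pick_threshold(binary_labels, class_probs, grid):
    """Return the first grid value whose thresholded predictions maximize kappa."""
    best_score, best_threshold = None, None
    for threshold in grid:
        preds = [1 if p >= threshold else 0 for p in class_probs]
        score = cohen_kappa(binary_labels, preds)
        if best_score is None or score > best_score:
            best_score, best_threshold = score, threshold
    return best_threshold


def greedy_ternary_thresholds(labels, probs, grid):
    """Optimize the class-0 and class-2 thresholds one at a time."""
    thresholds = []
    for cls in (0, 2):
        binary_labels = [1 if y == cls else 0 for y in labels]  # one-vs-rest labels
        class_probs = [p[cls] for p in probs]                   # probability of that class
        thresholds.append(pick_threshold(binary_labels, class_probs, grid))
    return thresholds


# invented toy data: 2 / 6 / 2 class split with the majority class in the middle
labels = [0, 0, 1, 1, 1, 1, 1, 1, 2, 2]
probs = [(0.40, 0.50, 0.10), (0.30, 0.60, 0.10), (0.10, 0.80, 0.10),
         (0.05, 0.90, 0.05), (0.15, 0.75, 0.10), (0.10, 0.70, 0.20),
         (0.05, 0.85, 0.10), (0.20, 0.70, 0.10), (0.10, 0.55, 0.35),
         (0.05, 0.50, 0.45)]
grid = [round(0.05 * i, 2) for i in range(1, 20)]
print(greedy_ternary_thresholds(labels, probs, grid))
```

Because each threshold is chosen from a one-dimensional scan, the cost grows linearly with the grid size rather than quadratically, which is the reason a greedy search can be faster than a search over all pairs.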
The second approach is a global optimization: explore the full grid of possible (threshold0, threshold2) pairs and pick the one which produces the optimal Cohen's kappa value. I also try a variant of this which optimizes balanced accuracy instead of Cohen's kappa.
Both approaches work well with both simulated data and a couple of datasets from ChEMBL. There doesn't seem to be a large or consistent difference in the quality of the results generated with the two different methods. The greedy optimization approach is, however, quite a bit faster.
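Since all of the scores used in this post can be derived from a confusion matrix, here is a small sketch of how accuracy, balanced accuracy, and Cohen's kappa are computed from one. The helper name `metrics_from_confusion` is hypothetical; the matrix is the first "original" confusion matrix from the simulated-data output further down (rows are true classes, columns are predicted classes), and the result matches the values scikit-learn printed for it.

```python
def metrics_from_confusion(confusion):
    """Return (accuracy, balanced accuracy, kappa) for a square confusion matrix
    with true classes in rows and predicted classes in columns."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(row[j] for row in confusion) for j in range(n_classes)]
    correct = sum(confusion[i][i] for i in range(n_classes))

    accuracy = correct / total
    # balanced accuracy: the mean of the per-class recalls
    balanced = sum(confusion[i][i] / row_sums[i] for i in range(n_classes)) / n_classes
    # kappa: agreement beyond what the row and column totals give by chance
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total**2
    kappa = (accuracy - expected) / (1 - expected)
    return accuracy, balanced, kappa


confusion = [[54, 69, 1],
             [1, 950, 1],
             [4, 86, 34]]
accuracy, balanced, kappa = metrics_from_confusion(confusion)
print(f'accuracy: {accuracy:.3f} balanced accuracy: {balanced:.3f} kappa: {kappa:.3f}')
# prints: accuracy: 0.865 balanced accuracy: 0.569 kappa: 0.483
```

The example also shows why overall accuracy is a poor guide here: the model gets 86.5% of the test points right while recovering fewer than half of the class-0 instances and about a quarter of the class-2 instances.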
Here's the improvement in three scoring metrics (kappa, balanced accuracy, and overall accuracy) when using the greedy optimization procedure on 50 simulated datasets with a 10-80-10 class split; the threshold shift improves both kappa and balanced accuracy on all datasets:
And here's the same plot for 20 different random stratified train/test splits with target CHEMBL205 (carbonic anhydrase II) with activity thresholds chosen to give a 19-72-9 class split. Once again, the threshold shift improves predictive performance:
Note: the original version of this notebook and the two ChEMBL data files (file1, file2) are all on GitHub in the older rdkit blog repo.
I put some thought into figuring out how to extend this to the general multi-class prediction case, but that turned out to be more difficult than I'd anticipated. If you have suggestions, ideally suggestions accompanied by code, please let me know in the comments!
And now onto the code and more detailed exploration
from rdkit import Chem
from rdkit.Chem import rdMolDescriptors
from rdkit.Chem import rdFingerprintGenerator
from rdkit.Chem import PandasTools
# note that you can install ghost using pip: python -m pip install ghostml
import ghostml
import pandas as pd
from sklearn import metrics
import numpy as np
%pylab inline
Populating the interactive namespace from numpy and matplotlib
def ternary_rebin(probs,thresholds):
    ''' returns a list of classifications based on the provided predicted probabilities and thresholds '''
    res = []
    for prob in probs:
        if prob[0]>=thresholds[0]:
            # we might still be in class 2 if the relative probability of that
            # is larger than the probability of class 0
            if (prob[2]-thresholds[1])>(prob[0]-thresholds[0]):
                res.append(2)
            else:
                res.append(0)
        elif prob[2]>=thresholds[1]:
            res.append(2)
        else:
            res.append(1)
    return res
def run_ternary_oob_optimization(oob_probs, labels_train, thresholds, ThOpt_metrics = 'Kappa'):
    ''' does a grid search to optimize the decision thresholds for a ternary problem '''
    tscores = []
    for t1 in thresholds:
        for t2 in thresholds:
            preds = ternary_rebin(oob_probs,(t1,t2))
            if ThOpt_metrics == 'Kappa':
                tgt = metrics.cohen_kappa_score(labels_train,preds)
            elif ThOpt_metrics == 'BalancedAccuracy':
                tgt = metrics.balanced_accuracy_score(labels_train,preds)
            elif ThOpt_metrics == 'F1':
                # f1_score needs an explicit average for multi-class labels;
                # macro averaging is an assumption, the original call did not specify one
                tgt = metrics.f1_score(labels_train,preds,average='macro')
            else:
                raise ValueError(f'unknown ThOpt_metrics: {ThOpt_metrics}')
            tscores.append((np.round(tgt,3),(t1,t2)))
    # highest score first; ties are broken in favor of the larger thresholds
    tscores.sort(reverse=True)
    thresh = tscores[0][-1]
    return thresh
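A detail of the grid search that is easy to miss: the winner is picked by sorting (score, (t1, t2)) tuples in descending order, and the scores are rounded to three decimals, so ties are quite possible. Python compares tuples element by element, which means that among equally scored pairs the one with the largest t1 (and then the largest t2) ends up first. The sketch below shows this with made-up scores and compares it with `max()`, which keeps the first best entry it encounters instead.

```python
# hypothetical (score, (t1, t2)) entries in the order a grid search might produce them
tscores = [
    (0.682, (0.25, 0.2)),
    (0.682, (0.3, 0.2)),
    (0.65, (0.2, 0.25)),
    (0.682, (0.25, 0.25)),
]

# max() with a key on the score alone returns the first entry with the best score
first_best = max(tscores, key=lambda item: item[0])[-1]

# sorting the full tuples in reverse order lets the thresholds break the tie
tscores.sort(reverse=True)
best = tscores[0][-1]

print(first_best)  # (0.25, 0.2)
print(best)        # (0.3, 0.2)
```

Either choice is defensible; the point is only that the selected pair depends on the tie-breaking rule when several pairs share the same rounded score.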
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
def run_ternary_experiment(X,y,accum,random_state=0):
    ''' experiment wrapper for the ternary bounds optimization '''
    n_classes = max(y)+1
    local = {}
    # --------------------
    # Train - test split
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, stratify = y,
                                                        random_state=random_state)
    # --------------------
    # Train a RF classifier
    cls = RandomForestClassifier(n_estimators=500,max_depth=10,oob_score=True,n_jobs=8)
    cls.fit(X_train, y_train)
    # --------------------
    # Calculate the baseline accuracy values
    test_preds = cls.predict(X_test)
    test_probs = cls.predict_proba(X_test)
    kappa = metrics.cohen_kappa_score(y_test,test_preds)
    balanced = metrics.balanced_accuracy_score(y_test,test_preds)
    accuracy = metrics.accuracy_score(y_test,test_preds)
    confusion = metrics.confusion_matrix(y_test,test_preds,labels=list(set(y_test)))
    print('original')
    print(f'accuracy: {accuracy:.3f} balanced accuracy: {balanced:.3f} kappa: {kappa:.3f}')
    print(confusion)
    local['orig-accuracy'] = accuracy
    local['orig-balanced'] = balanced
    local['orig-kappa'] = kappa
    local['orig-confusion'] = confusion
    # --------------------
    # optimize the two thresholds individually
    thresholds = [0]*(n_classes-1)
    for i,clsv in enumerate((0,2)):
        d_tform = [1 if y==clsv else 0 for y in y_train]
        d_probs = [x[clsv] for x in cls.oob_decision_function_]
        thresholds[i] = ghostml.optimize_threshold_from_oob_predictions(d_tform,d_probs,thresholds=np.arange(0.05,1.0,0.05))
    local['thresholds'] = thresholds
    # calculate the accuracy values for those thresholds:
    test_preds = ternary_rebin(test_probs,thresholds)
    kappa = metrics.cohen_kappa_score(y_test,test_preds)
    balanced = metrics.balanced_accuracy_score(y_test,test_preds)
    accuracy = metrics.accuracy_score(y_test,test_preds)
    confusion = metrics.confusion_matrix(y_test,test_preds,labels=list(set(y_test)))
    print('rebalanced')
    print(f'thresholds: {thresholds}')
    print(f'accuracy: {accuracy:.3f} balanced accuracy: {balanced:.3f} kappa: {kappa:.3f}')
    print(confusion)
    local['shift-accuracy'] = accuracy
    local['shift-balanced'] = balanced
    local['shift-kappa'] = kappa
    local['shift-confusion'] = confusion
    # --------------------
    # grid-search optimization of the threshold values based on kappa
    thresholds = run_ternary_oob_optimization(cls.oob_decision_function_,y_train,
                                              thresholds=np.arange(0.05,1.00,0.05),
                                              ThOpt_metrics = 'Kappa')
    test_preds = ternary_rebin(test_probs,thresholds)
    kappa = metrics.cohen_kappa_score(y_test,test_preds)
    balanced = metrics.balanced_accuracy_score(y_test,test_preds)
    accuracy = metrics.accuracy_score(y_test,test_preds)
    confusion = metrics.confusion_matrix(y_test,test_preds,labels=list(set(y_test)))
    print('global kappa rebalanced')
    print(f'thresholds: {thresholds}')
    print(f'accuracy: {accuracy:.3f} balanced accuracy: {balanced:.3f} kappa: {kappa:.3f}')
    print(confusion)
    local['global-k-shift-accuracy'] = accuracy
    local['global-k-shift-balanced'] = balanced
    local['global-k-shift-kappa'] = kappa
    local['global-k-shift-confusion'] = confusion
    # --------------------
    # grid-search optimization of the threshold values based on the balanced accuracy
    thresholds = run_ternary_oob_optimization(cls.oob_decision_function_,y_train,
                                              thresholds=np.arange(0.05,1.00,0.05),
                                              ThOpt_metrics = 'BalancedAccuracy')
    test_preds = ternary_rebin(test_probs,thresholds)
    kappa = metrics.cohen_kappa_score(y_test,test_preds)
    balanced = metrics.balanced_accuracy_score(y_test,test_preds)
    accuracy = metrics.accuracy_score(y_test,test_preds)
    confusion = metrics.confusion_matrix(y_test,test_preds,labels=list(set(y_test)))
    print('global balanced_accuracy rebalanced')
    print(f'thresholds: {thresholds}')
    print(f'accuracy: {accuracy:.3f} balanced accuracy: {balanced:.3f} kappa: {kappa:.3f}')
    print(confusion)
    local['global-ba-shift-accuracy'] = accuracy
    local['global-ba-shift-balanced'] = balanced
    local['global-ba-shift-kappa'] = kappa
    local['global-ba-shift-confusion'] = confusion
    accum.append(local)
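A side note on the threshold grid: values like 0.15000000000000002 show up in the printed thresholds further down because the grid is built with floating-point arithmetic. The sketch below reproduces the effect in plain Python, under the assumption that each grid point is computed as start + i*step, and shows one way to get clean values by rounding. With NumPy the equivalent would be wrapping the `np.arange` call in `np.round(..., 2)`. This is purely cosmetic; the selected thresholds are the same to within rounding error.

```python
start, step = 0.05, 0.05

# grid points computed as start + i*step pick up rounding error
accumulated = [start + i * step for i in range(19)]

# rounding each point to two decimals gives the values one would expect to see
rounded = [round(0.05 * i, 2) for i in range(1, 20)]

print(accumulated[2])  # 0.15000000000000002
print(rounded[2])      # 0.15
```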
I will try out a couple of real datasets below, but I want to start by verifying that the process works with some synthetic datasets. Scikit-learn's make_classification() function makes this really easy.
I will test this with multiple different forms of imbalance, just to be sure that it generalizes. Let's start with an example where the majority class is in the middle:
from sklearn.datasets import make_classification
accum_10_80_10 = []
for rep in range(50):
    print('--------------')
    # Generate a ternary imbalanced classification problem
    X, y = make_classification(n_samples=6000, n_features=20,
                               n_informative=10, n_redundant=0, n_classes=3,
                               random_state=0xf00d+rep, shuffle=False, weights = [0.1, 0.8, 0.1])
    run_ternary_experiment(X,y,accum_10_80_10)
-------------- original accuracy: 0.865 balanced accuracy: 0.569 kappa: 0.483 [[ 54 69 1] [ 1 950 1] [ 4 86 34]] rebalanced thresholds: [0.25, 0.2] accuracy: 0.893 balanced accuracy: 0.763 kappa: 0.682 [[ 77 42 5] [ 14 906 32] [ 7 28 89]] global kappa rebalanced thresholds: (0.25, 0.2) accuracy: 0.893 balanced accuracy: 0.763 kappa: 0.682 [[ 77 42 5] [ 14 906 32] [ 7 28 89]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.844 balanced accuracy: 0.820 kappa: 0.619 [[100 19 5] [ 66 814 72] [ 10 15 99]] -------------- original accuracy: 0.885 balanced accuracy: 0.628 kappa: 0.584 [[ 49 67 7] [ 1 953 0] [ 7 56 60]] rebalanced thresholds: [0.25, 0.3] accuracy: 0.920 balanced accuracy: 0.775 kappa: 0.750 [[ 93 26 4] [ 11 939 4] [ 16 35 72]] global kappa rebalanced thresholds: (0.2, 0.25) accuracy: 0.918 balanced accuracy: 0.798 kappa: 0.756 [[101 18 4] [ 22 927 5] [ 19 30 74]] global balanced_accuracy rebalanced thresholds: (0.2, 0.15000000000000002) accuracy: 0.913 balanced accuracy: 0.827 kappa: 0.753 [[ 94 18 11] [ 21 908 25] [ 9 20 94]] -------------- original accuracy: 0.866 balanced accuracy: 0.565 kappa: 0.476 [[ 61 61 0] [ 2 954 0] [ 4 94 24]] rebalanced thresholds: [0.25, 0.2] accuracy: 0.899 balanced accuracy: 0.753 kappa: 0.690 [[ 81 29 12] [ 19 921 16] [ 5 40 77]] global kappa rebalanced thresholds: (0.25, 0.2) accuracy: 0.899 balanced accuracy: 0.753 kappa: 0.690 [[ 81 29 12] [ 19 921 16] [ 5 40 77]] global balanced_accuracy rebalanced thresholds: (0.1, 0.15000000000000002) accuracy: 0.821 balanced accuracy: 0.792 kappa: 0.571 [[102 9 11] [126 797 33] [ 12 24 86]] -------------- original accuracy: 0.878 balanced accuracy: 0.603 kappa: 0.542 [[ 67 51 4] [ 1 955 0] [ 4 86 32]] rebalanced thresholds: [0.25, 0.2] accuracy: 0.893 balanced accuracy: 0.803 kappa: 0.700 [[ 95 14 13] [ 13 892 51] [ 6 31 85]] global kappa rebalanced thresholds: (0.25, 0.2) accuracy: 0.893 balanced accuracy: 0.803 kappa: 0.700 [[ 
95 14 13] [ 13 892 51] [ 6 31 85]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.845 balanced accuracy: 0.823 kappa: 0.624 [[109 2 11] [ 55 817 84] [ 14 20 88]] -------------- original accuracy: 0.868 balanced accuracy: 0.570 kappa: 0.484 [[ 65 57 0] [ 1 954 1] [ 3 97 22]] rebalanced thresholds: [0.25, 0.25] accuracy: 0.910 balanced accuracy: 0.765 kappa: 0.715 [[ 95 23 4] [ 17 931 8] [ 5 51 66]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.910 balanced accuracy: 0.765 kappa: 0.715 [[ 95 23 4] [ 17 931 8] [ 5 51 66]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.858 balanced accuracy: 0.833 kappa: 0.645 [[106 8 8] [ 54 831 71] [ 5 24 93]] -------------- original accuracy: 0.868 balanced accuracy: 0.574 kappa: 0.503 [[ 24 87 12] [ 0 953 1] [ 2 56 65]] rebalanced thresholds: [0.2, 0.3] accuracy: 0.885 balanced accuracy: 0.699 kappa: 0.637 [[ 64 49 10] [ 26 923 5] [ 13 35 75]] global kappa rebalanced thresholds: (0.2, 0.25) accuracy: 0.886 balanced accuracy: 0.718 kappa: 0.649 [[ 60 47 16] [ 26 916 12] [ 8 28 87]] global balanced_accuracy rebalanced thresholds: (0.1, 0.15000000000000002) accuracy: 0.775 balanced accuracy: 0.785 kappa: 0.515 [[ 93 11 19] [168 735 51] [ 15 6 102]] -------------- original accuracy: 0.874 balanced accuracy: 0.601 kappa: 0.533 [[ 57 67 0] [ 1 949 2] [ 6 75 43]] rebalanced thresholds: [0.3, 0.2] accuracy: 0.897 balanced accuracy: 0.751 kappa: 0.686 [[ 74 43 7] [ 14 917 21] [ 7 31 86]] global kappa rebalanced thresholds: (0.3, 0.2) accuracy: 0.897 balanced accuracy: 0.751 kappa: 0.686 [[ 74 43 7] [ 14 917 21] [ 7 31 86]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.864 balanced accuracy: 0.798 kappa: 0.647 [[ 96 23 5] [ 63 851 38] [ 17 17 90]] -------------- original accuracy: 0.851 balanced accuracy: 0.530 kappa: 0.423 [[ 32 81 10] [ 1 948 7] [ 4 76 41]] 
rebalanced thresholds: [0.25, 0.25] accuracy: 0.877 balanced accuracy: 0.694 kappa: 0.613 [[ 61 48 14] [ 7 916 33] [ 4 41 76]] global kappa rebalanced thresholds: (0.2, 0.25) accuracy: 0.877 balanced accuracy: 0.722 kappa: 0.630 [[ 74 37 12] [ 19 904 33] [ 8 38 75]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.844 balanced accuracy: 0.744 kappa: 0.583 [[ 79 28 16] [ 37 849 70] [ 7 29 85]] -------------- original accuracy: 0.885 balanced accuracy: 0.632 kappa: 0.582 [[ 64 58 2] [ 2 951 0] [ 5 71 47]] rebalanced thresholds: [0.3, 0.25] accuracy: 0.909 balanced accuracy: 0.758 kappa: 0.715 [[ 77 40 7] [ 4 931 18] [ 7 33 83]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.906 balanced accuracy: 0.758 kappa: 0.709 [[ 81 36 7] [ 9 926 18] [ 10 33 80]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.848 balanced accuracy: 0.807 kappa: 0.623 [[ 92 19 13] [ 43 825 85] [ 12 11 100]] -------------- original accuracy: 0.895 balanced accuracy: 0.671 kappa: 0.628 [[ 67 54 1] [ 6 949 0] [ 3 62 58]] rebalanced thresholds: [0.3, 0.3] accuracy: 0.924 balanced accuracy: 0.812 kappa: 0.767 [[ 96 24 2] [ 20 930 5] [ 4 36 83]] global kappa rebalanced thresholds: (0.3, 0.3) accuracy: 0.924 balanced accuracy: 0.812 kappa: 0.767 [[ 96 24 2] [ 20 930 5] [ 4 36 83]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.885 balanced accuracy: 0.900 kappa: 0.715 [[112 7 3] [ 62 839 54] [ 4 8 111]] -------------- original accuracy: 0.867 balanced accuracy: 0.570 kappa: 0.488 [[ 52 71 1] [ 1 952 0] [ 5 82 36]] rebalanced thresholds: [0.25, 0.2] accuracy: 0.887 balanced accuracy: 0.739 kappa: 0.656 [[ 81 39 4] [ 20 909 24] [ 6 42 75]] global kappa rebalanced thresholds: (0.25, 0.2) accuracy: 0.887 balanced accuracy: 0.739 kappa: 0.656 [[ 81 39 4] [ 20 909 24] [ 6 42 75]] global balanced_accuracy rebalanced thresholds: (0.1, 
0.1) accuracy: 0.728 balanced accuracy: 0.780 kappa: 0.452 [[106 12 6] [137 672 144] [ 15 12 96]] -------------- original accuracy: 0.884 balanced accuracy: 0.629 kappa: 0.577 [[ 57 64 3] [ 1 951 1] [ 3 67 53]] rebalanced thresholds: [0.25, 0.3] accuracy: 0.907 balanced accuracy: 0.778 kappa: 0.715 [[ 86 37 1] [ 18 919 16] [ 8 32 83]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.889 balanced accuracy: 0.780 kappa: 0.680 [[ 83 36 5] [ 18 894 41] [ 6 27 90]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.809 balanced accuracy: 0.800 kappa: 0.554 [[ 94 25 5] [ 72 775 106] [ 7 14 102]] -------------- original accuracy: 0.866 balanced accuracy: 0.573 kappa: 0.489 [[ 50 70 3] [ 2 950 2] [ 4 80 39]] rebalanced thresholds: [0.25, 0.25] accuracy: 0.888 balanced accuracy: 0.724 kappa: 0.655 [[ 82 33 8] [ 23 917 14] [ 11 45 67]] global kappa rebalanced thresholds: (0.3, 0.2) accuracy: 0.882 balanced accuracy: 0.724 kappa: 0.642 [[ 68 41 14] [ 9 909 36] [ 4 37 82]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.831 balanced accuracy: 0.780 kappa: 0.583 [[ 99 15 9] [ 70 814 70] [ 15 24 84]] -------------- original accuracy: 0.881 balanced accuracy: 0.621 kappa: 0.558 [[ 61 60 2] [ 1 951 4] [ 1 75 45]] rebalanced thresholds: [0.25, 0.25] accuracy: 0.907 balanced accuracy: 0.796 kappa: 0.720 [[ 88 31 4] [ 25 913 18] [ 3 31 87]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.907 balanced accuracy: 0.796 kappa: 0.720 [[ 88 31 4] [ 25 913 18] [ 3 31 87]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.847 balanced accuracy: 0.819 kappa: 0.621 [[ 96 14 13] [ 73 821 62] [ 5 17 99]] -------------- original accuracy: 0.877 balanced accuracy: 0.606 kappa: 0.546 [[ 52 67 4] [ 0 952 2] [ 5 69 49]] rebalanced thresholds: [0.3, 0.25] accuracy: 0.898 balanced accuracy: 0.773 kappa: 0.695 [[ 76 40 7] 
[ 18 910 26] [ 4 27 92]] global kappa rebalanced thresholds: (0.3, 0.25) accuracy: 0.898 balanced accuracy: 0.773 kappa: 0.695 [[ 76 40 7] [ 18 910 26] [ 4 27 92]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.828 balanced accuracy: 0.821 kappa: 0.595 [[ 98 14 11] [ 73 792 89] [ 8 12 103]] -------------- original accuracy: 0.858 balanced accuracy: 0.542 kappa: 0.440 [[ 36 83 4] [ 1 953 0] [ 1 81 41]] rebalanced thresholds: [0.2, 0.25] accuracy: 0.885 balanced accuracy: 0.730 kappa: 0.649 [[ 80 38 5] [ 38 910 6] [ 12 39 72]] global kappa rebalanced thresholds: (0.2, 0.2) accuracy: 0.888 balanced accuracy: 0.753 kappa: 0.669 [[ 78 37 8] [ 37 905 12] [ 11 29 83]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.730 balanced accuracy: 0.790 kappa: 0.459 [[106 8 9] [191 671 92] [ 13 11 99]] -------------- original accuracy: 0.892 balanced accuracy: 0.657 kappa: 0.622 [[ 74 43 6] [ 2 951 0] [ 9 69 46]] rebalanced thresholds: [0.3, 0.25] accuracy: 0.917 balanced accuracy: 0.787 kappa: 0.749 [[ 92 22 9] [ 6 929 18] [ 15 30 79]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.916 balanced accuracy: 0.796 kappa: 0.751 [[ 98 18 7] [ 11 924 18] [ 18 29 77]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.873 balanced accuracy: 0.825 kappa: 0.677 [[109 3 11] [ 32 852 69] [ 21 17 86]] -------------- original accuracy: 0.879 balanced accuracy: 0.609 kappa: 0.544 [[ 42 79 1] [ 1 953 0] [ 0 64 60]] rebalanced thresholds: [0.25, 0.25] accuracy: 0.925 balanced accuracy: 0.787 kappa: 0.761 [[ 80 37 5] [ 7 941 6] [ 2 33 89]] global kappa rebalanced thresholds: (0.2, 0.2) accuracy: 0.932 balanced accuracy: 0.835 kappa: 0.796 [[ 90 26 6] [ 12 931 11] [ 2 24 98]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.895 balanced accuracy: 0.850 kappa: 0.720 [[ 96 19 7] [ 37 873 44] [ 5 14 105]] 
-------------- original accuracy: 0.874 balanced accuracy: 0.601 kappa: 0.531 [[ 50 67 5] [ 2 950 3] [ 2 72 49]] rebalanced thresholds: [0.25, 0.25] accuracy: 0.900 balanced accuracy: 0.787 kappa: 0.706 [[ 86 28 8] [ 14 907 34] [ 6 30 87]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.900 balanced accuracy: 0.787 kappa: 0.706 [[ 86 28 8] [ 14 907 34] [ 6 30 87]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.858 balanced accuracy: 0.843 kappa: 0.649 [[ 98 15 9] [ 50 825 80] [ 6 11 106]] -------------- original accuracy: 0.902 balanced accuracy: 0.692 kappa: 0.661 [[ 75 45 3] [ 3 951 2] [ 5 59 57]] rebalanced thresholds: [0.25, 0.3] accuracy: 0.931 balanced accuracy: 0.813 kappa: 0.786 [[103 17 3] [ 12 939 5] [ 8 38 75]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.928 balanced accuracy: 0.831 kappa: 0.785 [[103 17 3] [ 12 928 16] [ 7 31 83]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.890 balanced accuracy: 0.875 kappa: 0.718 [[115 3 5] [ 50 857 49] [ 8 17 96]] -------------- original accuracy: 0.865 balanced accuracy: 0.574 kappa: 0.497 [[ 59 57 8] [ 2 948 2] [ 5 88 31]] rebalanced thresholds: [0.3, 0.25] accuracy: 0.889 balanced accuracy: 0.715 kappa: 0.653 [[ 81 31 12] [ 8 921 23] [ 7 52 65]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.889 balanced accuracy: 0.727 kappa: 0.663 [[ 88 24 12] [ 14 916 22] [ 13 48 63]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.829 balanced accuracy: 0.769 kappa: 0.580 [[ 95 13 16] [ 52 815 85] [ 15 24 85]] -------------- original accuracy: 0.886 balanced accuracy: 0.632 kappa: 0.583 [[ 35 79 8] [ 1 953 1] [ 1 47 75]] rebalanced thresholds: [0.25, 0.3] accuracy: 0.920 balanced accuracy: 0.796 kappa: 0.756 [[ 79 33 10] [ 13 931 11] [ 6 23 94]] global kappa rebalanced thresholds: (0.2, 0.25) accuracy: 0.912 balanced 
accuracy: 0.826 kappa: 0.749 [[ 87 25 10] [ 24 908 23] [ 7 16 100]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.2) accuracy: 0.873 balanced accuracy: 0.842 kappa: 0.677 [[ 98 14 10] [ 62 846 47] [ 10 10 103]] -------------- original accuracy: 0.855 balanced accuracy: 0.531 kappa: 0.416 [[ 21 103 1] [ 0 953 0] [ 0 70 52]] rebalanced thresholds: [0.2, 0.25] accuracy: 0.893 balanced accuracy: 0.746 kappa: 0.676 [[ 81 39 5] [ 26 914 13] [ 13 32 77]] global kappa rebalanced thresholds: (0.2, 0.25) accuracy: 0.893 balanced accuracy: 0.746 kappa: 0.676 [[ 81 39 5] [ 26 914 13] [ 13 32 77]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.764 balanced accuracy: 0.791 kappa: 0.500 [[ 99 14 12] [124 717 112] [ 14 7 101]] -------------- original accuracy: 0.906 balanced accuracy: 0.691 kappa: 0.673 [[ 75 43 5] [ 0 956 0] [ 10 55 56]] rebalanced thresholds: [0.25, 0.25] accuracy: 0.929 balanced accuracy: 0.818 kappa: 0.787 [[ 96 17 10] [ 5 935 16] [ 13 24 84]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.929 balanced accuracy: 0.818 kappa: 0.787 [[ 96 17 10] [ 5 935 16] [ 13 24 84]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.888 balanced accuracy: 0.824 kappa: 0.703 [[101 8 14] [ 26 876 54] [ 14 18 89]] -------------- original accuracy: 0.866 balanced accuracy: 0.569 kappa: 0.485 [[ 39 79 6] [ 0 952 2] [ 1 73 48]] rebalanced thresholds: [0.3, 0.25] accuracy: 0.892 balanced accuracy: 0.698 kappa: 0.645 [[ 56 55 13] [ 6 933 15] [ 3 38 81]] global kappa rebalanced thresholds: (0.2, 0.25) accuracy: 0.887 balanced accuracy: 0.745 kappa: 0.662 [[ 81 35 8] [ 33 906 15] [ 11 34 77]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.853 balanced accuracy: 0.802 kappa: 0.627 [[ 89 22 13] [ 57 836 61] [ 7 16 99]] -------------- original accuracy: 0.906 balanced accuracy: 0.700 kappa: 0.676 [[ 67 45 9] [ 1 952 2] 
[ 3 53 68]] rebalanced thresholds: [0.3, 0.25] accuracy: 0.925 balanced accuracy: 0.807 kappa: 0.773 [[ 79 26 16] [ 4 933 18] [ 4 22 98]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.926 balanced accuracy: 0.814 kappa: 0.778 [[ 83 23 15] [ 8 931 16] [ 5 22 97]] global balanced_accuracy rebalanced thresholds: (0.1, 0.15000000000000002) accuracy: 0.849 balanced accuracy: 0.825 kappa: 0.632 [[ 95 13 13] [ 74 821 60] [ 11 10 103]] -------------- original accuracy: 0.928 balanced accuracy: 0.767 kappa: 0.763 [[ 81 38 4] [ 0 954 0] [ 4 40 79]] rebalanced thresholds: [0.25, 0.25] accuracy: 0.939 balanced accuracy: 0.847 kappa: 0.818 [[ 93 22 8] [ 12 935 7] [ 6 18 99]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.939 balanced accuracy: 0.847 kappa: 0.818 [[ 93 22 8] [ 12 935 7] [ 6 18 99]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.900 balanced accuracy: 0.871 kappa: 0.740 [[102 10 11] [ 39 871 44] [ 7 9 107]] -------------- original accuracy: 0.862 balanced accuracy: 0.556 kappa: 0.468 [[ 59 63 2] [ 3 951 0] [ 7 91 24]] rebalanced thresholds: [0.25, 0.25] accuracy: 0.891 balanced accuracy: 0.710 kappa: 0.650 [[ 91 29 4] [ 21 926 7] [ 11 59 52]] global kappa rebalanced thresholds: (0.25, 0.2) accuracy: 0.893 balanced accuracy: 0.738 kappa: 0.671 [[ 89 28 7] [ 20 918 16] [ 9 48 65]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.1) accuracy: 0.782 balanced accuracy: 0.791 kappa: 0.521 [[ 96 10 18] [ 54 742 158] [ 9 13 100]] -------------- original accuracy: 0.866 balanced accuracy: 0.570 kappa: 0.488 [[ 41 76 5] [ 2 952 2] [ 5 71 46]] rebalanced thresholds: [0.2, 0.25] accuracy: 0.906 balanced accuracy: 0.794 kappa: 0.723 [[ 92 22 8] [ 12 913 31] [ 11 29 82]] global kappa rebalanced thresholds: (0.2, 0.25) accuracy: 0.906 balanced accuracy: 0.794 kappa: 0.723 [[ 92 22 8] [ 12 913 31] [ 11 29 82]] global balanced_accuracy rebalanced thresholds: 
(0.15000000000000002, 0.15000000000000002) accuracy: 0.863 balanced accuracy: 0.824 kappa: 0.654 [[ 95 14 13] [ 27 842 87] [ 10 13 99]] -------------- original accuracy: 0.887 balanced accuracy: 0.632 kappa: 0.582 [[ 38 80 3] [ 0 954 2] [ 2 49 72]] rebalanced thresholds: [0.25, 0.25] accuracy: 0.899 balanced accuracy: 0.805 kappa: 0.708 [[ 82 35 4] [ 38 899 19] [ 4 21 98]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.899 balanced accuracy: 0.805 kappa: 0.708 [[ 82 35 4] [ 38 899 19] [ 4 21 98]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.833 balanced accuracy: 0.840 kappa: 0.605 [[ 98 17 6] [104 794 58] [ 4 11 108]] -------------- original accuracy: 0.898 balanced accuracy: 0.686 kappa: 0.657 [[ 56 53 14] [ 2 947 5] [ 6 42 75]] rebalanced thresholds: [0.25, 0.3] accuracy: 0.917 balanced accuracy: 0.795 kappa: 0.754 [[ 84 25 14] [ 10 927 17] [ 11 22 90]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.917 balanced accuracy: 0.802 kappa: 0.757 [[ 79 23 21] [ 10 923 21] [ 9 16 98]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.874 balanced accuracy: 0.808 kappa: 0.675 [[ 84 13 26] [ 33 862 59] [ 11 9 103]] -------------- original accuracy: 0.893 balanced accuracy: 0.660 kappa: 0.619 [[ 72 48 2] [ 3 952 1] [ 6 68 48]] rebalanced thresholds: [0.25, 0.3] accuracy: 0.924 balanced accuracy: 0.820 kappa: 0.773 [[110 9 3] [ 19 927 10] [ 13 37 72]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.924 balanced accuracy: 0.842 kappa: 0.781 [[107 8 7] [ 19 918 19] [ 10 28 84]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.874 balanced accuracy: 0.869 kappa: 0.689 [[113 1 8] [ 58 838 60] [ 11 13 98]] -------------- original accuracy: 0.884 balanced accuracy: 0.630 kappa: 0.574 [[ 60 61 2] [ 1 951 2] [ 1 72 50]] rebalanced thresholds: [0.25, 0.2] accuracy: 0.908 balanced 
accuracy: 0.761 kappa: 0.714 [[ 84 27 12] [ 14 929 11] [ 3 43 77]] global kappa rebalanced thresholds: (0.25, 0.2) accuracy: 0.908 balanced accuracy: 0.761 kappa: 0.714 [[ 84 27 12] [ 14 929 11] [ 3 43 77]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.1) accuracy: 0.825 balanced accuracy: 0.820 kappa: 0.590 [[ 96 13 14] [ 54 789 111] [ 4 14 105]] -------------- original accuracy: 0.914 balanced accuracy: 0.729 kappa: 0.713 [[ 79 31 12] [ 0 951 4] [ 2 54 67]] rebalanced thresholds: [0.3, 0.3] accuracy: 0.936 balanced accuracy: 0.807 kappa: 0.799 [[ 90 19 13] [ 2 948 5] [ 4 34 85]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.939 balanced accuracy: 0.839 kappa: 0.817 [[ 94 15 13] [ 3 939 13] [ 4 25 94]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.908 balanced accuracy: 0.855 kappa: 0.751 [[100 8 14] [ 30 890 35] [ 5 18 100]] -------------- original accuracy: 0.867 balanced accuracy: 0.568 kappa: 0.487 [[ 53 62 7] [ 2 954 0] [ 3 86 33]] rebalanced thresholds: [0.25, 0.2] accuracy: 0.897 balanced accuracy: 0.730 kappa: 0.676 [[ 80 28 14] [ 16 927 13] [ 8 45 69]] global kappa rebalanced thresholds: (0.25, 0.2) accuracy: 0.897 balanced accuracy: 0.730 kappa: 0.676 [[ 80 28 14] [ 16 927 13] [ 8 45 69]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.859 balanced accuracy: 0.800 kappa: 0.638 [[ 98 10 14] [ 72 846 38] [ 12 23 87]] -------------- original accuracy: 0.882 balanced accuracy: 0.620 kappa: 0.560 [[ 36 86 1] [ 0 953 1] [ 0 53 70]] rebalanced thresholds: [0.2, 0.3] accuracy: 0.892 balanced accuracy: 0.750 kappa: 0.671 [[ 76 44 3] [ 26 912 16] [ 7 33 83]] global kappa rebalanced thresholds: (0.2, 0.3) accuracy: 0.892 balanced accuracy: 0.750 kappa: 0.671 [[ 76 44 3] [ 26 912 16] [ 7 33 83]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.835 balanced 
accuracy: 0.810 kappa: 0.595 [[ 88 28 7] [ 72 807 75] [ 2 14 107]] -------------- original accuracy: 0.899 balanced accuracy: 0.674 kappa: 0.644 [[ 72 48 3] [ 1 953 0] [ 5 64 54]] rebalanced thresholds: [0.3, 0.3] accuracy: 0.925 balanced accuracy: 0.803 kappa: 0.769 [[ 97 21 5] [ 12 934 8] [ 7 37 79]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.919 balanced accuracy: 0.846 kappa: 0.770 [[107 11 5] [ 22 908 24] [ 7 28 88]] global balanced_accuracy rebalanced thresholds: (0.2, 0.15000000000000002) accuracy: 0.875 balanced accuracy: 0.877 kappa: 0.693 [[109 4 10] [ 31 834 89] [ 5 11 107]] -------------- original accuracy: 0.895 balanced accuracy: 0.666 kappa: 0.629 [[ 61 61 2] [ 2 951 1] [ 7 53 62]] rebalanced thresholds: [0.25, 0.25] accuracy: 0.933 balanced accuracy: 0.819 kappa: 0.796 [[ 92 29 3] [ 11 939 4] [ 13 20 89]] global kappa rebalanced thresholds: (0.25, 0.2) accuracy: 0.923 balanced accuracy: 0.831 kappa: 0.776 [[ 91 28 5] [ 10 920 24] [ 11 14 97]] global balanced_accuracy rebalanced thresholds: (0.2, 0.15000000000000002) accuracy: 0.894 balanced accuracy: 0.826 kappa: 0.712 [[ 92 25 7] [ 23 882 49] [ 12 11 99]] -------------- original accuracy: 0.858 balanced accuracy: 0.542 kappa: 0.435 [[ 50 72 0] [ 0 953 1] [ 1 96 27]] rebalanced thresholds: [0.25, 0.25] accuracy: 0.902 balanced accuracy: 0.758 kappa: 0.697 [[ 88 30 4] [ 19 921 14] [ 9 42 73]] global kappa rebalanced thresholds: (0.2, 0.25) accuracy: 0.889 balanced accuracy: 0.763 kappa: 0.675 [[ 98 20 4] [ 38 902 14] [ 15 42 67]] global balanced_accuracy rebalanced thresholds: (0.2, 0.15000000000000002) accuracy: 0.847 balanced accuracy: 0.832 kappa: 0.627 [[ 94 15 13] [ 37 814 103] [ 3 13 108]] -------------- original accuracy: 0.863 balanced accuracy: 0.554 kappa: 0.458 [[ 35 86 2] [ 1 954 0] [ 2 74 46]] rebalanced thresholds: [0.25, 0.25] accuracy: 0.895 balanced accuracy: 0.776 kappa: 0.689 [[ 87 34 2] [ 30 905 20] [ 9 31 82]] global kappa rebalanced thresholds: (0.25, 0.25) 
accuracy: 0.895 balanced accuracy: 0.776 kappa: 0.689 [[ 87 34 2] [ 30 905 20] [ 9 31 82]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.814 balanced accuracy: 0.818 kappa: 0.570 [[106 13 4] [121 776 58] [ 10 17 95]] -------------- original accuracy: 0.906 balanced accuracy: 0.703 kappa: 0.670 [[ 89 33 0] [ 3 951 2] [ 0 75 47]] rebalanced thresholds: [0.3, 0.25] accuracy: 0.927 balanced accuracy: 0.810 kappa: 0.773 [[101 18 3] [ 8 935 13] [ 6 40 76]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.921 balanced accuracy: 0.810 kappa: 0.760 [[103 16 3] [ 16 927 13] [ 9 38 75]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.874 balanced accuracy: 0.838 kappa: 0.676 [[107 10 5] [ 47 851 58] [ 11 20 91]] -------------- original accuracy: 0.855 balanced accuracy: 0.538 kappa: 0.429 [[ 51 70 2] [ 4 950 0] [ 3 95 25]] rebalanced thresholds: [0.25, 0.25] accuracy: 0.892 balanced accuracy: 0.730 kappa: 0.660 [[ 86 34 3] [ 20 919 15] [ 8 50 65]] global kappa rebalanced thresholds: (0.25, 0.2) accuracy: 0.890 balanced accuracy: 0.756 kappa: 0.670 [[ 84 33 6] [ 20 906 28] [ 6 39 78]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.861 balanced accuracy: 0.828 kappa: 0.649 [[109 10 4] [ 57 835 62] [ 11 23 89]] -------------- original accuracy: 0.882 balanced accuracy: 0.618 kappa: 0.564 [[ 48 70 5] [ 2 953 0] [ 4 61 57]] rebalanced thresholds: [0.25, 0.3] accuracy: 0.900 balanced accuracy: 0.735 kappa: 0.684 [[ 81 36 6] [ 22 929 4] [ 11 41 70]] global kappa rebalanced thresholds: (0.25, 0.25) accuracy: 0.902 balanced accuracy: 0.750 kappa: 0.695 [[ 77 35 11] [ 22 925 8] [ 7 35 80]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.784 balanced accuracy: 0.789 kappa: 0.524 [[ 98 9 16] [125 747 83] [ 13 13 96]] -------------- original accuracy: 0.888 balanced accuracy: 0.638 kappa: 
0.588 [[ 53 67 3] [ 0 954 1] [ 0 63 59]] rebalanced thresholds: [0.25, 0.3] accuracy: 0.919 balanced accuracy: 0.769 kappa: 0.740 [[ 83 37 3] [ 10 941 4] [ 5 38 79]] global kappa rebalanced thresholds: (0.25, 0.3) accuracy: 0.919 balanced accuracy: 0.769 kappa: 0.740 [[ 83 37 3] [ 10 941 4] [ 5 38 79]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.867 balanced accuracy: 0.833 kappa: 0.660 [[103 11 9] [ 56 842 57] [ 6 21 95]] -------------- original accuracy: 0.848 balanced accuracy: 0.509 kappa: 0.385 [[ 35 86 1] [ 1 953 0] [ 5 89 30]] rebalanced thresholds: [0.2, 0.2] accuracy: 0.883 balanced accuracy: 0.729 kappa: 0.642 [[ 80 39 3] [ 27 908 19] [ 9 43 72]] global kappa rebalanced thresholds: (0.2, 0.2) accuracy: 0.883 balanced accuracy: 0.729 kappa: 0.642 [[ 80 39 3] [ 27 908 19] [ 9 43 72]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.835 balanced accuracy: 0.775 kappa: 0.580 [[ 94 24 4] [ 87 822 45] [ 9 29 86]] -------------- original accuracy: 0.868 balanced accuracy: 0.579 kappa: 0.500 [[ 36 84 3] [ 1 951 2] [ 4 64 55]] rebalanced thresholds: [0.2, 0.25] accuracy: 0.897 balanced accuracy: 0.763 kappa: 0.687 [[ 74 48 1] [ 23 912 19] [ 9 24 90]] global kappa rebalanced thresholds: (0.2, 0.25) accuracy: 0.897 balanced accuracy: 0.763 kappa: 0.687 [[ 74 48 1] [ 23 912 19] [ 9 24 90]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.860 balanced accuracy: 0.802 kappa: 0.635 [[ 79 38 6] [ 42 845 67] [ 8 7 108]] -------------- original accuracy: 0.856 balanced accuracy: 0.530 kappa: 0.417 [[ 24 95 3] [ 1 955 0] [ 0 74 48]] rebalanced thresholds: [0.25, 0.25] accuracy: 0.901 balanced accuracy: 0.725 kappa: 0.678 [[ 66 48 8] [ 9 935 12] [ 6 36 80]] global kappa rebalanced thresholds: (0.2, 0.25) accuracy: 0.902 balanced accuracy: 0.761 kappa: 0.697 [[ 81 37 4] [ 25 921 10] [ 10 32 80]] global 
balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.857 balanced accuracy: 0.802 kappa: 0.634 [[ 89 19 14] [ 47 842 67] [ 11 14 97]] -------------- original accuracy: 0.866 balanced accuracy: 0.564 kappa: 0.482 [[ 24 92 6] [ 0 954 1] [ 3 59 61]] rebalanced thresholds: [0.2, 0.25] accuracy: 0.907 balanced accuracy: 0.760 kappa: 0.715 [[ 76 36 10] [ 14 929 12] [ 12 27 84]] global kappa rebalanced thresholds: (0.2, 0.25) accuracy: 0.907 balanced accuracy: 0.760 kappa: 0.715 [[ 76 36 10] [ 14 929 12] [ 12 27 84]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.1) accuracy: 0.827 balanced accuracy: 0.806 kappa: 0.590 [[ 87 13 22] [ 44 798 113] [ 6 10 107]] -------------- original accuracy: 0.888 balanced accuracy: 0.639 kappa: 0.592 [[ 42 78 3] [ 1 953 0] [ 2 50 71]] rebalanced thresholds: [0.25, 0.3] accuracy: 0.910 balanced accuracy: 0.747 kappa: 0.709 [[ 73 47 3] [ 11 937 6] [ 6 35 82]] global kappa rebalanced thresholds: (0.25, 0.2) accuracy: 0.905 balanced accuracy: 0.769 kappa: 0.708 [[ 69 46 8] [ 11 921 22] [ 3 24 96]] global balanced_accuracy rebalanced thresholds: (0.1, 0.15000000000000002) accuracy: 0.811 balanced accuracy: 0.812 kappa: 0.565 [[105 15 3] [147 773 34] [ 14 14 95]] -------------- original accuracy: 0.913 balanced accuracy: 0.722 kappa: 0.704 [[ 74 43 6] [ 1 953 1] [ 3 50 69]] rebalanced thresholds: [0.3, 0.3] accuracy: 0.927 balanced accuracy: 0.792 kappa: 0.771 [[ 89 28 6] [ 6 943 6] [ 7 34 81]] global kappa rebalanced thresholds: (0.3, 0.25) accuracy: 0.927 balanced accuracy: 0.804 kappa: 0.776 [[ 87 27 9] [ 6 938 11] [ 5 29 88]] global balanced_accuracy rebalanced thresholds: (0.2, 0.15000000000000002) accuracy: 0.915 balanced accuracy: 0.855 kappa: 0.766 [[ 97 14 12] [ 16 899 40] [ 7 13 102]]
Start by comparing the model-performance metrics kappa, balanced accuracy, and accuracy between the model with the greedy threshold shift based on kappa and the model with "default thresholds".
accum = accum_10_80_10
figsize(9,6)
scatter([x['orig-kappa'] for x in accum],[x['shift-kappa'] for x in accum],label='kappa');
scatter([x['orig-balanced'] for x in accum],[x['shift-balanced'] for x in accum],label='balanced accuracy');
scatter([x['orig-accuracy'] for x in accum],[x['shift-accuracy'] for x in accum],label='accuracy');
plot([.4,1],[.4,1]);
legend();
xlabel('orig')
ylabel('greedy shift');
title('10-80-10');
The shift improves all three metrics for every dataset.
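As a cross-check on the printed output, the three metrics can be recomputed directly from one of the confusion matrices. The sketch below is not code from the experiments; `metrics_from_confusion` is a hypothetical helper, and it assumes the matrices are printed with true classes in rows and predicted classes in columns (the row sums of roughly 120/955/120 are consistent with that). The example matrix is the "original" result from one of the 10-80-10 runs above.

```python
def metrics_from_confusion(cm):
    """Return (accuracy, balanced accuracy, Cohen's kappa) for a square
    confusion matrix given as a list of rows.

    Assumption: rows are true classes, columns are predicted classes.
    """
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]
    col_sums = [sum(cm[i][j] for i in range(n_classes)) for j in range(n_classes)]
    correct = sum(cm[i][i] for i in range(n_classes))

    # fraction of all instances on the diagonal
    accuracy = correct / total
    # balanced accuracy is the mean of the per-class recalls
    balanced = sum(cm[i][i] / row_sums[i] for i in range(n_classes)) / n_classes
    # agreement expected by chance given the row and column totals
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total**2
    kappa = (accuracy - expected) / (1 - expected)
    return accuracy, balanced, kappa


# "original" confusion matrix from one of the 10-80-10 runs
cm = [[79, 31, 12],
      [0, 951, 4],
      [2, 54, 67]]
acc, bal, kappa = metrics_from_confusion(cm)
print(f"accuracy: {acc:.3f} balanced accuracy: {bal:.3f} kappa: {kappa:.3f}")
# accuracy: 0.914 balanced accuracy: 0.729 kappa: 0.713
```

The high accuracy next to a much lower balanced accuracy shows why accuracy alone is a poor guide here: the majority class is predicted almost perfectly while nearly half of each minority class is missed.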
Now compare the results from a grid search based on Cohen's kappa to the greedy-shift results:
scatter([x['shift-kappa'] for x in accum],[x['global-k-shift-kappa'] for x in accum],label='kappa');
scatter([x['shift-balanced'] for x in accum],[x['global-k-shift-balanced'] for x in accum],label='balanced accuracy');
scatter([x['shift-accuracy'] for x in accum],[x['global-k-shift-accuracy'] for x in accum],label='accuracy');
plot([.4,1],[.4,1]);
legend();
xlabel('greedy shift')
ylabel('grid-kappa');
title('10-80-10');
Here the changes are reasonably small, but they do tend to slightly favor the results of the grid search.
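When the points sit this close to the diagonal, a few summary numbers can be easier to read than the scatter plot. The following is a minimal sketch, not part of the original analysis: `compare_runs` is a hypothetical helper, and `toy_accum` holds made-up values that only mimic the structure of `accum` (a list of dicts keyed by names such as `'shift-kappa'` and `'global-k-shift-kappa'`); they are not results from this post.

```python
def compare_runs(accum, key_a, key_b, tol=1e-9):
    """Summarize per-dataset differences (key_b minus key_a).

    accum is a list of dicts, one per dataset, as in the plotting code.
    """
    diffs = [x[key_b] - x[key_a] for x in accum]
    return {
        'mean_diff': sum(diffs) / len(diffs),
        'b_better': sum(d > tol for d in diffs),
        'ties': sum(abs(d) <= tol for d in diffs),
        'a_better': sum(d < -tol for d in diffs),
    }


# made-up values for illustration only
toy_accum = [
    {'shift-kappa': 0.70, 'global-k-shift-kappa': 0.72},
    {'shift-kappa': 0.65, 'global-k-shift-kappa': 0.65},
    {'shift-kappa': 0.68, 'global-k-shift-kappa': 0.67},
    {'shift-kappa': 0.60, 'global-k-shift-kappa': 0.63},
]
summary = compare_runs(toy_accum, 'shift-kappa', 'global-k-shift-kappa')
print(summary)
```

With the real data this would be called as `compare_runs(accum, 'shift-kappa', 'global-k-shift-kappa')`, and the same helper works for the balanced accuracy and accuracy keys.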
Finally, do the equivalent plot comparing the result from using balanced accuracy in the grid search to the results from the greedy shift:
scatter([x['shift-kappa'] for x in accum],[x['global-ba-shift-kappa'] for x in accum],label='kappa');
scatter([x['shift-balanced'] for x in accum],[x['global-ba-shift-balanced'] for x in accum],label='balanced accuracy');
scatter([x['shift-accuracy'] for x in accum],[x['global-ba-shift-accuracy'] for x in accum],label='accuracy');
plot([.4,1],[.4,1]);
legend();
xlabel('greedy shift')
ylabel('grid-balanced');
title('10-80-10');
That plot makes it look like doing the threshold shifts using balanced accuracy doesn't improve kappa, but it's important to remember that this is comparing the balanced accuracy shift with the kappa shift, not with the original model.
Using balanced accuracy to do the shift instead of kappa does actually help kappa too, as this plot shows:
scatter([x['orig-kappa'] for x in accum],[x['global-ba-shift-kappa'] for x in accum],label='kappa');
scatter([x['orig-balanced'] for x in accum],[x['global-ba-shift-balanced'] for x in accum],label='balanced accuracy');
scatter([x['orig-accuracy'] for x in accum],[x['global-ba-shift-accuracy'] for x in accum],label='accuracy');
plot([.4,1],[.4,1]);
legend();
xlabel('orig')
ylabel('grid-balanced');
title('10-80-10');
Still, with these datasets it looks like optimizing the threshold with kappa instead of balanced accuracy is a better idea.
Now let's make sure that the code doesn't have some "feature" which causes it to only work when the middle class is the majority:
accum_80_10_10 = []
for rep in range(50):
    print('--------------')
    # Generate a ternary imbalanced classification problem
    X, y = make_classification(n_samples=6000, n_features=20,
                               n_informative=10, n_redundant=0, n_classes=3,
                               random_state=0xf00d+rep, shuffle=False,
                               weights=[0.8, 0.1, 0.1])
    run_ternary_experiment(X, y, accum_80_10_10)
-------------- original accuracy: 0.883 balanced accuracy: 0.623 kappa: 0.571 [[953 0 1] [ 64 54 4] [ 67 4 53]] rebalanced thresholds: [0.6000000000000001, 0.3] accuracy: 0.911 balanced accuracy: 0.745 kappa: 0.723 [[939 10 5] [ 30 76 16] [ 29 17 78]] global kappa rebalanced thresholds: (0.6000000000000001, 0.3) accuracy: 0.911 balanced accuracy: 0.745 kappa: 0.723 [[939 10 5] [ 30 76 16] [ 29 17 78]] global balanced_accuracy rebalanced thresholds: (0.8, 0.3) accuracy: 0.812 balanced accuracy: 0.756 kappa: 0.556 [[798 151 5] [ 9 97 16] [ 13 32 79]] -------------- original accuracy: 0.875 balanced accuracy: 0.596 kappa: 0.526 [[953 1 0] [ 71 51 1] [ 74 3 46]] rebalanced thresholds: [0.7000000000000001, 0.25] accuracy: 0.904 balanced accuracy: 0.778 kappa: 0.714 [[916 30 8] [ 21 94 8] [ 37 11 75]] global kappa rebalanced thresholds: (0.7000000000000001, 0.25) accuracy: 0.904 balanced accuracy: 0.778 kappa: 0.714 [[916 30 8] [ 21 94 8] [ 37 11 75]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.25) accuracy: 0.891 balanced accuracy: 0.796 kappa: 0.698 [[890 55 9] [ 12 103 8] [ 26 21 76]] -------------- original accuracy: 0.879 balanced accuracy: 0.607 kappa: 0.549 [[954 0 0] [ 69 52 2] [ 68 6 49]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.905 balanced accuracy: 0.762 kappa: 0.710 [[924 23 7] [ 32 79 12] [ 32 8 83]] global kappa rebalanced thresholds: (0.6000000000000001, 0.25) accuracy: 0.906 balanced accuracy: 0.736 kappa: 0.699 [[936 13 5] [ 41 71 11] [ 37 6 80]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.25) accuracy: 0.868 balanced accuracy: 0.772 kappa: 0.645 [[869 78 7] [ 23 88 12] [ 20 18 85]] -------------- original accuracy: 0.892 balanced accuracy: 0.646 kappa: 0.608 [[956 1 0] [ 51 64 7] [ 66 5 50]] rebalanced thresholds: [0.7000000000000001, 0.25] accuracy: 0.900 balanced accuracy: 0.760 kappa: 0.708 [[920 29 8] [ 17 76 29] [ 22 15 84]] global kappa rebalanced thresholds: 
(0.6500000000000001, 0.25) accuracy: 0.905 balanced accuracy: 0.747 kappa: 0.711 [[932 17 8] [ 22 71 29] [ 29 9 83]] global balanced_accuracy rebalanced thresholds: (0.8, 0.3) accuracy: 0.841 balanced accuracy: 0.746 kappa: 0.597 [[844 108 5] [ 11 93 18] [ 14 35 72]] -------------- original accuracy: 0.895 balanced accuracy: 0.674 kappa: 0.635 [[948 4 3] [ 52 66 5] [ 55 7 60]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.906 balanced accuracy: 0.799 kappa: 0.731 [[910 26 19] [ 18 88 17] [ 19 14 89]] global kappa rebalanced thresholds: (0.6500000000000001, 0.25) accuracy: 0.906 balanced accuracy: 0.799 kappa: 0.731 [[910 26 19] [ 18 88 17] [ 19 14 89]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.25) accuracy: 0.873 balanced accuracy: 0.807 kappa: 0.672 [[861 74 20] [ 11 95 17] [ 8 23 91]] -------------- original accuracy: 0.832 balanced accuracy: 0.453 kappa: 0.271 [[954 1 0] [100 21 1] [100 0 23]] rebalanced thresholds: [0.7000000000000001, 0.25] accuracy: 0.856 balanced accuracy: 0.664 kappa: 0.554 [[898 33 24] [ 55 57 10] [ 41 10 72]] global kappa rebalanced thresholds: (0.7000000000000001, 0.25) accuracy: 0.856 balanced accuracy: 0.664 kappa: 0.554 [[898 33 24] [ 55 57 10] [ 41 10 72]] global balanced_accuracy rebalanced thresholds: (0.8, 0.15000000000000002) accuracy: 0.749 balanced accuracy: 0.736 kappa: 0.450 [[721 138 96] [ 28 74 20] [ 15 4 104]] -------------- original accuracy: 0.868 balanced accuracy: 0.589 kappa: 0.513 [[945 5 2] [ 71 51 3] [ 70 8 45]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.875 balanced accuracy: 0.693 kappa: 0.617 [[911 24 17] [ 43 63 19] [ 38 9 76]] global kappa rebalanced thresholds: (0.6500000000000001, 0.25) accuracy: 0.875 balanced accuracy: 0.693 kappa: 0.617 [[911 24 17] [ 43 63 19] [ 38 9 76]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.25) accuracy: 0.843 balanced accuracy: 0.714 kappa: 0.577 [[857 77 18] [ 28 78 19] [ 27 20 76]] 
-------------- original accuracy: 0.876 balanced accuracy: 0.596 kappa: 0.528 [[955 1 0] [ 78 42 3] [ 64 3 54]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.912 balanced accuracy: 0.782 kappa: 0.732 [[926 23 7] [ 35 76 12] [ 22 7 92]] global kappa rebalanced thresholds: (0.6500000000000001, 0.25) accuracy: 0.912 balanced accuracy: 0.782 kappa: 0.732 [[926 23 7] [ 35 76 12] [ 22 7 92]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.25) accuracy: 0.875 balanced accuracy: 0.805 kappa: 0.670 [[866 81 9] [ 20 91 12] [ 11 17 93]] -------------- original accuracy: 0.874 balanced accuracy: 0.596 kappa: 0.531 [[952 2 0] [ 79 36 8] [ 59 3 61]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.899 balanced accuracy: 0.771 kappa: 0.701 [[912 28 14] [ 27 77 19] [ 32 1 90]] global kappa rebalanced thresholds: (0.6000000000000001, 0.25) accuracy: 0.902 balanced accuracy: 0.740 kappa: 0.694 [[930 12 12] [ 39 65 19] [ 34 1 88]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.25) accuracy: 0.863 balanced accuracy: 0.794 kappa: 0.647 [[852 86 16] [ 13 91 19] [ 16 15 92]] -------------- original accuracy: 0.930 balanced accuracy: 0.792 kappa: 0.775 [[947 2 7] [ 35 81 5] [ 31 4 88]] rebalanced thresholds: [0.6500000000000001, 0.3] accuracy: 0.918 balanced accuracy: 0.830 kappa: 0.765 [[915 32 9] [ 22 84 15] [ 14 6 103]] global kappa rebalanced thresholds: (0.6000000000000001, 0.3) accuracy: 0.923 balanced accuracy: 0.820 kappa: 0.773 [[926 21 9] [ 27 79 15] [ 16 4 103]] global balanced_accuracy rebalanced thresholds: (0.7000000000000001, 0.3) accuracy: 0.902 balanced accuracy: 0.828 kappa: 0.731 [[894 53 9] [ 20 86 15] [ 11 9 103]] -------------- original accuracy: 0.882 balanced accuracy: 0.627 kappa: 0.583 [[950 3 1] [ 64 54 5] [ 53 15 55]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.894 balanced accuracy: 0.738 kappa: 0.687 [[919 27 8] [ 28 66 29] [ 23 12 88]] global kappa rebalanced 
thresholds: (0.6500000000000001, 0.3) accuracy: 0.899 balanced accuracy: 0.755 kappa: 0.701 [[919 29 6] [ 28 80 15] [ 23 20 80]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.35000000000000003) accuracy: 0.869 balanced accuracy: 0.747 kappa: 0.646 [[881 70 3] [ 20 93 10] [ 14 40 69]] -------------- original accuracy: 0.877 balanced accuracy: 0.609 kappa: 0.547 [[951 2 2] [ 76 42 4] [ 58 5 60]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.898 balanced accuracy: 0.753 kappa: 0.692 [[919 25 11] [ 35 70 17] [ 29 5 89]] global kappa rebalanced thresholds: (0.6500000000000001, 0.25) accuracy: 0.898 balanced accuracy: 0.753 kappa: 0.692 [[919 25 11] [ 35 70 17] [ 29 5 89]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.2) accuracy: 0.875 balanced accuracy: 0.786 kappa: 0.661 [[873 53 29] [ 22 80 20] [ 21 5 97]] -------------- original accuracy: 0.868 balanced accuracy: 0.581 kappa: 0.504 [[950 2 2] [ 72 48 3] [ 73 6 44]] rebalanced thresholds: [0.6000000000000001, 0.3] accuracy: 0.891 balanced accuracy: 0.704 kappa: 0.653 [[929 15 10] [ 39 74 10] [ 42 15 66]] global kappa rebalanced thresholds: (0.6000000000000001, 0.3) accuracy: 0.891 balanced accuracy: 0.704 kappa: 0.653 [[929 15 10] [ 39 74 10] [ 42 15 66]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.3) accuracy: 0.862 balanced accuracy: 0.741 kappa: 0.626 [[873 71 10] [ 19 94 10] [ 22 34 67]] -------------- original accuracy: 0.876 balanced accuracy: 0.595 kappa: 0.532 [[955 1 0] [ 63 53 7] [ 75 3 43]] rebalanced thresholds: [0.7000000000000001, 0.25] accuracy: 0.914 balanced accuracy: 0.828 kappa: 0.757 [[910 35 11] [ 11 95 17] [ 20 9 92]] global kappa rebalanced thresholds: (0.6500000000000001, 0.25) accuracy: 0.930 balanced accuracy: 0.825 kappa: 0.792 [[933 13 10] [ 15 91 17] [ 23 6 92]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.2) accuracy: 0.897 balanced accuracy: 0.836 kappa: 0.725 [[883 44 
29] [ 9 89 25] [ 10 7 104]] -------------- original accuracy: 0.875 balanced accuracy: 0.613 kappa: 0.542 [[946 8 1] [ 69 52 1] [ 66 5 52]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.882 balanced accuracy: 0.739 kappa: 0.647 [[902 36 17] [ 41 70 11] [ 31 6 86]] global kappa rebalanced thresholds: (0.6500000000000001, 0.25) accuracy: 0.882 balanced accuracy: 0.739 kappa: 0.647 [[902 36 17] [ 41 70 11] [ 31 6 86]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.2) accuracy: 0.847 balanced accuracy: 0.767 kappa: 0.605 [[842 74 39] [ 25 77 20] [ 14 12 97]] -------------- original accuracy: 0.858 balanced accuracy: 0.537 kappa: 0.430 [[955 0 0] [ 83 38 1] [ 85 1 37]] rebalanced thresholds: [0.7000000000000001, 0.2] accuracy: 0.887 balanced accuracy: 0.742 kappa: 0.667 [[909 35 11] [ 30 75 17] [ 30 12 81]] global kappa rebalanced thresholds: (0.6500000000000001, 0.2) accuracy: 0.894 balanced accuracy: 0.716 kappa: 0.665 [[929 16 10] [ 38 67 17] [ 39 7 77]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.2) accuracy: 0.873 balanced accuracy: 0.759 kappa: 0.649 [[881 61 13] [ 23 82 17] [ 21 18 84]] -------------- original accuracy: 0.893 balanced accuracy: 0.663 kappa: 0.626 [[950 1 3] [ 36 80 6] [ 75 7 42]] rebalanced thresholds: [0.6500000000000001, 0.3] accuracy: 0.903 balanced accuracy: 0.760 kappa: 0.708 [[923 16 15] [ 8 104 10] [ 51 16 57]] global kappa rebalanced thresholds: (0.6000000000000001, 0.3) accuracy: 0.907 balanced accuracy: 0.742 kappa: 0.703 [[935 5 14] [ 16 96 10] [ 59 8 57]] global balanced_accuracy rebalanced thresholds: (0.6500000000000001, 0.3) accuracy: 0.903 balanced accuracy: 0.760 kappa: 0.708 [[923 16 15] [ 8 104 10] [ 51 16 57]] -------------- original accuracy: 0.873 balanced accuracy: 0.592 kappa: 0.529 [[951 1 2] [ 75 38 9] [ 60 6 58]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.909 balanced accuracy: 0.796 kappa: 0.738 [[915 20 19] [ 26 75 21] [ 13 10 101]] 
global kappa rebalanced thresholds: (0.6500000000000001, 0.25) accuracy: 0.909 balanced accuracy: 0.796 kappa: 0.738 [[915 20 19] [ 26 75 21] [ 13 10 101]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.25) accuracy: 0.864 balanced accuracy: 0.820 kappa: 0.662 [[843 89 22] [ 9 92 21] [ 7 15 102]] -------------- original accuracy: 0.890 balanced accuracy: 0.646 kappa: 0.603 [[953 0 3] [ 73 42 6] [ 47 3 73]] rebalanced thresholds: [0.6500000000000001, 0.3] accuracy: 0.905 balanced accuracy: 0.776 kappa: 0.715 [[919 22 15] [ 34 75 12] [ 23 8 92]] global kappa rebalanced thresholds: (0.6500000000000001, 0.25) accuracy: 0.902 balanced accuracy: 0.768 kappa: 0.705 [[918 18 20] [ 34 68 19] [ 23 4 96]] global balanced_accuracy rebalanced thresholds: (0.8, 0.2) accuracy: 0.838 balanced accuracy: 0.777 kappa: 0.600 [[827 94 35] [ 16 77 28] [ 11 10 102]] -------------- original accuracy: 0.914 balanced accuracy: 0.721 kappa: 0.709 [[955 0 1] [ 42 76 5] [ 45 10 66]] rebalanced thresholds: [0.6500000000000001, 0.3] accuracy: 0.937 balanced accuracy: 0.830 kappa: 0.810 [[940 12 4] [ 13 100 10] [ 24 13 84]] global kappa rebalanced thresholds: (0.6500000000000001, 0.3) accuracy: 0.937 balanced accuracy: 0.830 kappa: 0.810 [[940 12 4] [ 13 100 10] [ 24 13 84]] global balanced_accuracy rebalanced thresholds: (0.7000000000000001, 0.3) accuracy: 0.931 balanced accuracy: 0.840 kappa: 0.799 [[928 24 4] [ 8 105 10] [ 21 16 84]] -------------- original accuracy: 0.887 balanced accuracy: 0.650 kappa: 0.607 [[946 5 3] [ 53 59 10] [ 56 9 59]] rebalanced thresholds: [0.6000000000000001, 0.3] accuracy: 0.902 balanced accuracy: 0.742 kappa: 0.699 [[929 17 8] [ 31 70 21] [ 33 7 84]] global kappa rebalanced thresholds: (0.6000000000000001, 0.35000000000000003) accuracy: 0.900 balanced accuracy: 0.734 kappa: 0.691 [[929 18 7] [ 31 78 13] [ 34 17 73]] global balanced_accuracy rebalanced thresholds: (0.8, 0.3) accuracy: 0.812 balanced accuracy: 0.758 kappa: 0.557 [[797 148 9] 
[ 10 91 21] [ 11 27 86]] -------------- original accuracy: 0.852 balanced accuracy: 0.518 kappa: 0.395 [[954 1 0] [ 93 27 2] [ 82 0 41]] rebalanced thresholds: [0.7000000000000001, 0.25] accuracy: 0.901 balanced accuracy: 0.768 kappa: 0.701 [[916 27 12] [ 40 73 9] [ 23 8 92]] global kappa rebalanced thresholds: (0.7000000000000001, 0.25) accuracy: 0.901 balanced accuracy: 0.768 kappa: 0.701 [[916 27 12] [ 40 73 9] [ 23 8 92]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.2) accuracy: 0.888 balanced accuracy: 0.803 kappa: 0.692 [[884 42 29] [ 29 77 16] [ 12 6 105]] -------------- original accuracy: 0.855 balanced accuracy: 0.525 kappa: 0.408 [[956 0 0] [ 91 30 1] [ 81 1 40]] rebalanced thresholds: [0.7000000000000001, 0.2] accuracy: 0.886 balanced accuracy: 0.711 kappa: 0.645 [[920 17 19] [ 42 61 19] [ 34 6 82]] global kappa rebalanced thresholds: (0.7000000000000001, 0.25) accuracy: 0.892 balanced accuracy: 0.731 kappa: 0.664 [[921 23 12] [ 42 72 8] [ 36 8 78]] global balanced_accuracy rebalanced thresholds: (0.8, 0.15000000000000002) accuracy: 0.822 balanced accuracy: 0.744 kappa: 0.564 [[818 93 45] [ 14 76 32] [ 13 17 92]] -------------- original accuracy: 0.912 balanced accuracy: 0.720 kappa: 0.701 [[953 2 1] [ 45 75 3] [ 48 6 67]] rebalanced thresholds: [0.6000000000000001, 0.3] accuracy: 0.924 balanced accuracy: 0.804 kappa: 0.768 [[934 15 7] [ 23 92 8] [ 30 8 83]] global kappa rebalanced thresholds: (0.6000000000000001, 0.3) accuracy: 0.924 balanced accuracy: 0.804 kappa: 0.768 [[934 15 7] [ 23 92 8] [ 30 8 83]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.3) accuracy: 0.893 balanced accuracy: 0.824 kappa: 0.714 [[883 66 7] [ 10 105 8] [ 15 22 84]] -------------- original accuracy: 0.863 balanced accuracy: 0.552 kappa: 0.456 [[954 0 0] [ 67 57 0] [ 96 2 24]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.904 balanced accuracy: 0.740 kappa: 0.695 [[932 19 3] [ 32 87 5] [ 47 9 66]] global kappa 
rebalanced thresholds: (0.6500000000000001, 0.25) accuracy: 0.904 balanced accuracy: 0.740 kappa: 0.695 [[932 19 3] [ 32 87 5] [ 47 9 66]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.2) accuracy: 0.881 balanced accuracy: 0.787 kappa: 0.672 [[880 57 17] [ 16 94 14] [ 30 9 83]] -------------- original accuracy: 0.859 balanced accuracy: 0.557 kappa: 0.468 [[947 3 2] [ 61 55 8] [ 91 4 29]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.874 balanced accuracy: 0.720 kappa: 0.635 [[898 35 19] [ 27 67 30] [ 35 5 84]] global kappa rebalanced thresholds: (0.6500000000000001, 0.25) accuracy: 0.874 balanced accuracy: 0.720 kappa: 0.635 [[898 35 19] [ 27 67 30] [ 35 5 84]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.2) accuracy: 0.823 balanced accuracy: 0.745 kappa: 0.566 [[816 89 47] [ 16 72 36] [ 16 9 99]] -------------- original accuracy: 0.899 balanced accuracy: 0.683 kappa: 0.647 [[950 4 1] [ 36 83 3] [ 73 4 46]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.917 balanced accuracy: 0.835 kappa: 0.765 [[911 20 24] [ 8 103 11] [ 26 10 87]] global kappa rebalanced thresholds: (0.6000000000000001, 0.25) accuracy: 0.922 balanced accuracy: 0.823 kappa: 0.770 [[922 12 21] [ 12 99 11] [ 29 9 85]] global balanced_accuracy rebalanced thresholds: (0.7000000000000001, 0.25) accuracy: 0.904 balanced accuracy: 0.837 kappa: 0.739 [[892 36 27] [ 7 104 11] [ 19 15 89]] -------------- original accuracy: 0.877 balanced accuracy: 0.597 kappa: 0.543 [[955 0 0] [ 67 51 5] [ 64 12 46]] rebalanced thresholds: [0.7000000000000001, 0.25] accuracy: 0.907 balanced accuracy: 0.765 kappa: 0.722 [[927 27 1] [ 25 74 24] [ 25 9 88]] global kappa rebalanced thresholds: (0.6500000000000001, 0.3) accuracy: 0.910 balanced accuracy: 0.754 kappa: 0.722 [[935 19 1] [ 30 76 17] [ 29 12 81]] global balanced_accuracy rebalanced thresholds: (0.8, 0.25) accuracy: 0.861 balanced accuracy: 0.783 kappa: 0.643 [[855 99 1] [ 9 90 24] [ 14 
20 88]] -------------- original accuracy: 0.869 balanced accuracy: 0.578 kappa: 0.493 [[953 0 2] [ 79 43 1] [ 75 0 47]] rebalanced thresholds: [0.7000000000000001, 0.25] accuracy: 0.894 balanced accuracy: 0.775 kappa: 0.691 [[904 36 15] [ 22 94 7] [ 35 12 75]] global kappa rebalanced thresholds: (0.7000000000000001, 0.2) accuracy: 0.893 balanced accuracy: 0.775 kappa: 0.690 [[903 25 27] [ 22 87 14] [ 33 7 82]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.2) accuracy: 0.874 balanced accuracy: 0.788 kappa: 0.661 [[871 56 28] [ 12 96 15] [ 29 11 82]] -------------- original accuracy: 0.900 balanced accuracy: 0.677 kappa: 0.649 [[953 1 0] [ 73 40 10] [ 35 1 87]] rebalanced thresholds: [0.6000000000000001, 0.35000000000000003] accuracy: 0.910 balanced accuracy: 0.747 kappa: 0.719 [[937 15 2] [ 45 62 16] [ 19 11 93]] global kappa rebalanced thresholds: (0.55, 0.3) accuracy: 0.906 balanced accuracy: 0.724 kappa: 0.698 [[941 6 7] [ 52 49 22] [ 22 4 97]] global balanced_accuracy rebalanced thresholds: (0.8, 0.3) accuracy: 0.849 balanced accuracy: 0.793 kappa: 0.627 [[834 111 9] [ 17 84 22] [ 4 18 101]] -------------- original accuracy: 0.866 balanced accuracy: 0.571 kappa: 0.481 [[951 2 1] [ 71 50 2] [ 85 0 38]] rebalanced thresholds: [0.7000000000000001, 0.25] accuracy: 0.911 balanced accuracy: 0.783 kappa: 0.730 [[923 24 7] [ 21 94 8] [ 39 8 76]] global kappa rebalanced thresholds: (0.6500000000000001, 0.2) accuracy: 0.913 balanced accuracy: 0.761 kappa: 0.726 [[936 10 8] [ 30 81 12] [ 42 2 79]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.2) accuracy: 0.889 balanced accuracy: 0.812 kappa: 0.698 [[881 51 22] [ 10 101 12] [ 29 9 85]] -------------- original accuracy: 0.873 balanced accuracy: 0.588 kappa: 0.518 [[954 0 0] [ 67 56 1] [ 79 5 38]] rebalanced thresholds: [0.7000000000000001, 0.2] accuracy: 0.915 balanced accuracy: 0.799 kappa: 0.750 [[922 21 11] [ 16 82 26] [ 27 1 94]] global kappa rebalanced thresholds: 
(0.7000000000000001, 0.2) accuracy: 0.915 balanced accuracy: 0.799 kappa: 0.750 [[922 21 11] [ 16 82 26] [ 27 1 94]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.2) accuracy: 0.882 balanced accuracy: 0.795 kappa: 0.683 [[879 60 15] [ 12 86 26] [ 22 6 94]] -------------- original accuracy: 0.879 balanced accuracy: 0.611 kappa: 0.549 [[953 0 3] [ 65 56 1] [ 70 6 46]] rebalanced thresholds: [0.6500000000000001, 0.3] accuracy: 0.899 balanced accuracy: 0.753 kappa: 0.700 [[921 26 9] [ 23 91 8] [ 27 28 67]] global kappa rebalanced thresholds: (0.6500000000000001, 0.25) accuracy: 0.906 balanced accuracy: 0.775 kappa: 0.720 [[921 20 15] [ 23 86 13] [ 26 16 80]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.25) accuracy: 0.863 balanced accuracy: 0.790 kappa: 0.648 [[855 86 15] [ 10 99 13] [ 11 30 81]] -------------- original accuracy: 0.887 balanced accuracy: 0.638 kappa: 0.590 [[952 2 1] [ 43 79 0] [ 80 10 33]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.912 balanced accuracy: 0.766 kappa: 0.726 [[932 16 7] [ 19 90 13] [ 44 7 72]] global kappa rebalanced thresholds: (0.6000000000000001, 0.3) accuracy: 0.901 balanced accuracy: 0.712 kappa: 0.678 [[940 12 3] [ 26 89 7] [ 55 16 52]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.25) accuracy: 0.882 balanced accuracy: 0.782 kappa: 0.675 [[884 61 10] [ 13 96 13] [ 27 18 78]] -------------- original accuracy: 0.872 balanced accuracy: 0.583 kappa: 0.511 [[954 0 1] [ 77 43 3] [ 68 5 49]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.887 balanced accuracy: 0.725 kappa: 0.660 [[916 31 8] [ 30 70 23] [ 34 9 79]] global kappa rebalanced thresholds: (0.6500000000000001, 0.25) accuracy: 0.887 balanced accuracy: 0.725 kappa: 0.660 [[916 31 8] [ 30 70 23] [ 34 9 79]] global balanced_accuracy rebalanced thresholds: (0.8, 0.25) accuracy: 0.807 balanced accuracy: 0.746 kappa: 0.541 [[797 150 8] [ 9 91 23] [ 21 20 81]] -------------- 
original accuracy: 0.853 balanced accuracy: 0.528 kappa: 0.410 [[952 1 2] [114 6 2] [ 57 0 66]] rebalanced thresholds: [0.7000000000000001, 0.25] accuracy: 0.869 balanced accuracy: 0.703 kappa: 0.597 [[900 39 16] [ 64 54 4] [ 27 7 89]] global kappa rebalanced thresholds: (0.7000000000000001, 0.2) accuracy: 0.870 balanced accuracy: 0.715 kappa: 0.605 [[896 34 25] [ 64 52 6] [ 24 3 96]] global balanced_accuracy rebalanced thresholds: (0.8, 0.2) accuracy: 0.812 balanced accuracy: 0.803 kappa: 0.562 [[780 146 29] [ 20 96 6] [ 13 11 99]] -------------- original accuracy: 0.903 balanced accuracy: 0.684 kappa: 0.661 [[955 0 0] [ 50 67 5] [ 55 6 62]] rebalanced thresholds: [0.7000000000000001, 0.25] accuracy: 0.922 balanced accuracy: 0.789 kappa: 0.764 [[937 8 10] [ 28 72 22] [ 20 5 98]] global kappa rebalanced thresholds: (0.6500000000000001, 0.25) accuracy: 0.921 balanced accuracy: 0.770 kappa: 0.753 [[943 3 9] [ 33 67 22] [ 23 5 95]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.25) accuracy: 0.906 balanced accuracy: 0.794 kappa: 0.731 [[912 31 12] [ 23 77 22] [ 13 12 98]] -------------- original accuracy: 0.879 balanced accuracy: 0.612 kappa: 0.550 [[952 3 0] [ 60 59 4] [ 77 1 44]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.898 balanced accuracy: 0.744 kappa: 0.687 [[923 20 12] [ 24 86 13] [ 44 9 69]] global kappa rebalanced thresholds: (0.6500000000000001, 0.25) accuracy: 0.898 balanced accuracy: 0.744 kappa: 0.687 [[923 20 12] [ 24 86 13] [ 44 9 69]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.2) accuracy: 0.871 balanced accuracy: 0.775 kappa: 0.648 [[872 63 20] [ 14 88 21] [ 32 5 85]] -------------- original accuracy: 0.877 balanced accuracy: 0.604 kappa: 0.542 [[953 0 0] [ 57 62 4] [ 83 3 38]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.908 balanced accuracy: 0.741 kappa: 0.716 [[937 9 7] [ 25 74 24] [ 35 10 79]] global kappa rebalanced thresholds: (0.6500000000000001, 0.25) 
accuracy: 0.908 balanced accuracy: 0.741 kappa: 0.716 [[937 9 7] [ 25 74 24] [ 35 10 79]] global balanced_accuracy rebalanced thresholds: (0.7000000000000001, 0.25) accuracy: 0.897 balanced accuracy: 0.753 kappa: 0.700 [[917 29 7] [ 19 80 24] [ 28 16 80]] -------------- original accuracy: 0.887 balanced accuracy: 0.636 kappa: 0.590 [[953 1 0] [ 62 59 3] [ 65 4 53]] rebalanced thresholds: [0.6500000000000001, 0.3] accuracy: 0.916 balanced accuracy: 0.774 kappa: 0.740 [[934 17 3] [ 33 81 10] [ 28 10 84]] global kappa rebalanced thresholds: (0.6500000000000001, 0.3) accuracy: 0.916 balanced accuracy: 0.774 kappa: 0.740 [[934 17 3] [ 33 81 10] [ 28 10 84]] global balanced_accuracy rebalanced thresholds: (0.8, 0.25) accuracy: 0.860 balanced accuracy: 0.781 kappa: 0.637 [[854 92 8] [ 20 88 16] [ 13 19 90]] -------------- original accuracy: 0.906 balanced accuracy: 0.703 kappa: 0.679 [[951 3 2] [ 41 78 3] [ 52 12 58]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.901 balanced accuracy: 0.765 kappa: 0.706 [[918 31 7] [ 20 84 18] [ 30 13 79]] global kappa rebalanced thresholds: (0.6000000000000001, 0.25) accuracy: 0.912 balanced accuracy: 0.763 kappa: 0.728 [[934 15 7] [ 22 82 18] [ 34 10 78]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.25) accuracy: 0.873 balanced accuracy: 0.775 kappa: 0.661 [[876 73 7] [ 12 92 18] [ 17 25 80]] -------------- original accuracy: 0.908 balanced accuracy: 0.707 kappa: 0.686 [[952 0 2] [ 55 64 4] [ 43 6 74]] rebalanced thresholds: [0.7000000000000001, 0.3] accuracy: 0.922 balanced accuracy: 0.811 kappa: 0.770 [[926 24 4] [ 19 94 10] [ 21 16 86]] global kappa rebalanced thresholds: (0.6000000000000001, 0.3) accuracy: 0.931 balanced accuracy: 0.796 kappa: 0.785 [[945 5 4] [ 27 86 10] [ 26 11 86]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.3) accuracy: 0.902 balanced accuracy: 0.812 kappa: 0.727 [[898 52 4] [ 15 98 10] [ 18 19 86]] -------------- original accuracy: 0.864 
balanced accuracy: 0.567 kappa: 0.475 [[951 5 0] [ 64 56 2] [ 89 3 30]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.890 balanced accuracy: 0.716 kappa: 0.652 [[924 28 4] [ 34 80 8] [ 48 10 64]] global kappa rebalanced thresholds: (0.6500000000000001, 0.25) accuracy: 0.890 balanced accuracy: 0.716 kappa: 0.652 [[924 28 4] [ 34 80 8] [ 48 10 64]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.2) accuracy: 0.839 balanced accuracy: 0.718 kappa: 0.565 [[853 75 28] [ 26 78 18] [ 33 13 76]] -------------- original accuracy: 0.894 balanced accuracy: 0.659 kappa: 0.622 [[953 0 2] [ 59 60 3] [ 57 6 60]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.907 balanced accuracy: 0.788 kappa: 0.725 [[917 23 15] [ 33 77 12] [ 22 6 95]] global kappa rebalanced thresholds: (0.6500000000000001, 0.25) accuracy: 0.907 balanced accuracy: 0.788 kappa: 0.725 [[917 23 15] [ 33 77 12] [ 22 6 95]] global balanced_accuracy rebalanced thresholds: (0.8, 0.25) accuracy: 0.845 balanced accuracy: 0.814 kappa: 0.625 [[820 119 16] [ 12 98 12] [ 7 20 96]] -------------- original accuracy: 0.859 balanced accuracy: 0.542 kappa: 0.444 [[954 0 0] [ 70 47 5] [ 92 2 30]] rebalanced thresholds: [0.7000000000000001, 0.2] accuracy: 0.884 balanced accuracy: 0.710 kappa: 0.645 [[917 21 16] [ 31 66 25] [ 41 5 78]] global kappa rebalanced thresholds: (0.7000000000000001, 0.25) accuracy: 0.882 balanced accuracy: 0.700 kappa: 0.637 [[918 29 7] [ 31 74 17] [ 41 17 66]] global balanced_accuracy rebalanced thresholds: (0.8, 0.2) accuracy: 0.808 balanced accuracy: 0.717 kappa: 0.530 [[810 125 19] [ 18 79 25] [ 20 23 81]] -------------- original accuracy: 0.848 balanced accuracy: 0.502 kappa: 0.369 [[955 0 0] [105 14 3] [ 75 0 48]] rebalanced thresholds: [0.7000000000000001, 0.25] accuracy: 0.895 balanced accuracy: 0.738 kappa: 0.676 [[921 27 7] [ 38 74 10] [ 33 11 79]] global kappa rebalanced thresholds: (0.7000000000000001, 0.2) accuracy: 0.896 balanced accuracy: 
0.745 kappa: 0.681 [[919 20 16] [ 38 68 16] [ 31 4 88]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.2) accuracy: 0.872 balanced accuracy: 0.775 kappa: 0.650 [[873 61 21] [ 23 83 16] [ 23 10 90]] -------------- original accuracy: 0.856 balanced accuracy: 0.532 kappa: 0.420 [[954 1 0] [ 83 37 3] [ 86 0 36]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.891 balanced accuracy: 0.717 kappa: 0.657 [[924 17 14] [ 47 66 10] [ 31 12 79]] global kappa rebalanced thresholds: (0.6500000000000001, 0.25) accuracy: 0.891 balanced accuracy: 0.717 kappa: 0.657 [[924 17 14] [ 47 66 10] [ 31 12 79]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.2) accuracy: 0.848 balanced accuracy: 0.749 kappa: 0.595 [[851 73 31] [ 33 74 16] [ 18 12 92]] -------------- original accuracy: 0.846 balanced accuracy: 0.508 kappa: 0.390 [[950 0 2] [ 96 22 7] [ 74 6 43]] rebalanced thresholds: [0.7000000000000001, 0.25] accuracy: 0.863 balanced accuracy: 0.707 kappa: 0.599 [[888 44 20] [ 46 62 17] [ 29 9 85]] global kappa rebalanced thresholds: (0.7000000000000001, 0.2) accuracy: 0.861 balanced accuracy: 0.711 kappa: 0.597 [[884 38 30] [ 46 58 21] [ 29 3 91]] global balanced_accuracy rebalanced thresholds: (0.8, 0.2) accuracy: 0.795 balanced accuracy: 0.734 kappa: 0.513 [[783 135 34] [ 24 80 21] [ 20 12 91]] -------------- original accuracy: 0.858 balanced accuracy: 0.558 kappa: 0.454 [[946 2 5] [ 88 36 0] [ 73 2 48]] rebalanced thresholds: [0.7000000000000001, 0.3] accuracy: 0.895 balanced accuracy: 0.749 kappa: 0.688 [[915 29 9] [ 36 83 5] [ 22 25 76]] global kappa rebalanced thresholds: (0.6500000000000001, 0.3) accuracy: 0.902 balanced accuracy: 0.738 kappa: 0.695 [[930 14 9] [ 42 77 5] [ 31 16 76]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.2) accuracy: 0.887 balanced accuracy: 0.781 kappa: 0.683 [[890 39 24] [ 29 81 14] [ 18 12 93]] -------------- original accuracy: 0.887 balanced accuracy: 0.644 kappa: 
0.588 [[949 4 2] [ 56 67 0] [ 72 2 48]] rebalanced thresholds: [0.6500000000000001, 0.25] accuracy: 0.912 balanced accuracy: 0.800 kappa: 0.737 [[919 21 15] [ 28 87 8] [ 31 2 89]] global kappa rebalanced thresholds: (0.6000000000000001, 0.25) accuracy: 0.909 balanced accuracy: 0.768 kappa: 0.716 [[928 13 14] [ 39 76 8] [ 33 2 87]] global balanced_accuracy rebalanced thresholds: (0.7500000000000001, 0.2) accuracy: 0.890 balanced accuracy: 0.828 kappa: 0.704 [[876 47 32] [ 13 97 13] [ 21 6 95]]
accum = accum_80_10_10
figsize(9,6)

# Plot 1: original predictions vs. greedy threshold shift
scatter([x['orig-kappa'] for x in accum],[x['shift-kappa'] for x in accum],label='kappa');
scatter([x['orig-balanced'] for x in accum],[x['shift-balanced'] for x in accum],label='balanced accuracy');
scatter([x['orig-accuracy'] for x in accum],[x['shift-accuracy'] for x in accum],label='accuracy');
plot([.4,1],[.4,1]);
legend();
xlabel('orig')
ylabel('greedy shift');
title('80-10-10');

# Plot 2: greedy threshold shift vs. grid search optimizing kappa
scatter([x['shift-kappa'] for x in accum],[x['global-k-shift-kappa'] for x in accum],label='kappa');
scatter([x['shift-balanced'] for x in accum],[x['global-k-shift-balanced'] for x in accum],label='balanced accuracy');
scatter([x['shift-accuracy'] for x in accum],[x['global-k-shift-accuracy'] for x in accum],label='accuracy');
plot([.4,1],[.4,1]);
legend();
xlabel('greedy shift')
ylabel('grid-kappa');
title('80-10-10');

# Plot 3: greedy threshold shift vs. grid search optimizing balanced accuracy
scatter([x['shift-kappa'] for x in accum],[x['global-ba-shift-kappa'] for x in accum],label='kappa');
scatter([x['shift-balanced'] for x in accum],[x['global-ba-shift-balanced'] for x in accum],label='balanced accuracy');
scatter([x['shift-accuracy'] for x in accum],[x['global-ba-shift-accuracy'] for x in accum],label='accuracy');
plot([.4,1],[.4,1]);
legend();
xlabel('greedy shift')
ylabel('grid-balanced');
title('80-10-10');
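To put numbers on what the scatter plots show, one option is to average the change in each metric across the runs. This is only a minimal sketch: the helper name summarize_accum is hypothetical, and it assumes that each entry of accum is a dict holding the keys used in the plotting code above. The demo values are copied from the last two runs of the 80-10-10 output above.

```python
import numpy as np

def summarize_accum(accum,
                    prefixes=('shift', 'global-k-shift', 'global-ba-shift'),
                    metrics=('kappa', 'balanced', 'accuracy')):
    """Mean change in each metric relative to the original predictions."""
    summary = {}
    for prefix in prefixes:
        for metric in metrics:
            orig = np.array([x[f'orig-{metric}'] for x in accum])
            new = np.array([x[f'{prefix}-{metric}'] for x in accum])
            summary[(prefix, metric)] = float(np.mean(new - orig))
    return summary

# Values copied from the last two runs of the 80-10-10 output;
# the dict layout is an assumption based on the keys used for plotting.
demo_accum = [
    {'orig-accuracy': 0.858, 'orig-balanced': 0.558, 'orig-kappa': 0.454,
     'shift-accuracy': 0.895, 'shift-balanced': 0.749, 'shift-kappa': 0.688,
     'global-k-shift-accuracy': 0.902, 'global-k-shift-balanced': 0.738,
     'global-k-shift-kappa': 0.695,
     'global-ba-shift-accuracy': 0.887, 'global-ba-shift-balanced': 0.781,
     'global-ba-shift-kappa': 0.683},
    {'orig-accuracy': 0.887, 'orig-balanced': 0.644, 'orig-kappa': 0.588,
     'shift-accuracy': 0.912, 'shift-balanced': 0.800, 'shift-kappa': 0.737,
     'global-k-shift-accuracy': 0.909, 'global-k-shift-balanced': 0.768,
     'global-k-shift-kappa': 0.716,
     'global-ba-shift-accuracy': 0.890, 'global-ba-shift-balanced': 0.828,
     'global-ba-shift-kappa': 0.704},
]

summary = summarize_accum(demo_accum)
for (prefix, metric), delta in summary.items():
    print(f'{prefix:16s} {metric:9s} mean change: {delta:+.3f}')
```

Passing the full accum_80_10_10 list instead of demo_accum would give one number per method and metric, which is easier to compare across experiments than the plots alone.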
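Same conclusions as before (a good thing!). The next experiment flips the class balance so that class 2 is the majority, with weights of 0.1, 0.1, and 0.8: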
accum_10_10_80 = []
for rep in range(50):
    print('--------------')
    # Generate a ternary imbalanced classification problem
    X, y = make_classification(n_samples=6000, n_features=20,
                               n_informative=10, n_redundant=0, n_classes=3,
                               random_state=0xf00d+rep, shuffle=False, weights = [0.1, 0.1, 0.8])
    run_ternary_experiment(X,y,accum_10_10_80)
-------------- original accuracy: 0.877 balanced accuracy: 0.604 kappa: 0.534 [[ 54 1 69] [ 1 46 74] [ 3 0 952]] rebalanced thresholds: [0.3, 0.7000000000000001] accuracy: 0.893 balanced accuracy: 0.742 kappa: 0.681 [[ 74 25 25] [ 8 81 32] [ 10 28 917]] global kappa rebalanced thresholds: (0.25, 0.7000000000000001) accuracy: 0.897 balanced accuracy: 0.755 kappa: 0.694 [[ 84 15 25] [ 13 76 32] [ 16 22 917]] global balanced_accuracy rebalanced thresholds: (0.2, 0.8) accuracy: 0.840 balanced accuracy: 0.767 kappa: 0.597 [[ 95 16 13] [ 21 80 20] [ 34 88 833]] -------------- original accuracy: 0.857 balanced accuracy: 0.534 kappa: 0.425 [[ 47 0 76] [ 3 27 93] [ 0 0 954]] rebalanced thresholds: [0.25, 0.7000000000000001] accuracy: 0.897 balanced accuracy: 0.791 kappa: 0.705 [[ 79 10 34] [ 11 97 15] [ 11 43 900]] global kappa rebalanced thresholds: (0.25, 0.6500000000000001) accuracy: 0.906 balanced accuracy: 0.767 kappa: 0.712 [[ 78 6 39] [ 11 86 26] [ 10 21 923]] global balanced_accuracy rebalanced thresholds: (0.2, 0.7500000000000001) accuracy: 0.870 balanced accuracy: 0.797 kappa: 0.658 [[ 89 9 25] [ 18 94 11] [ 27 66 861]] -------------- original accuracy: 0.869 balanced accuracy: 0.580 kappa: 0.497 [[ 54 0 67] [ 2 37 86] [ 1 1 952]] rebalanced thresholds: [0.2, 0.7000000000000001] accuracy: 0.898 balanced accuracy: 0.781 kappa: 0.705 [[ 93 4 24] [ 20 78 27] [ 21 26 907]] global kappa rebalanced thresholds: (0.2, 0.7000000000000001) accuracy: 0.898 balanced accuracy: 0.781 kappa: 0.705 [[ 93 4 24] [ 20 78 27] [ 21 26 907]] global balanced_accuracy rebalanced thresholds: (0.2, 0.8) accuracy: 0.839 balanced accuracy: 0.798 kappa: 0.610 [[ 95 14 12] [ 20 94 11] [ 23 113 818]] -------------- original accuracy: 0.878 balanced accuracy: 0.610 kappa: 0.543 [[ 50 1 72] [ 1 52 69] [ 2 1 952]] rebalanced thresholds: [0.3, 0.6500000000000001] accuracy: 0.903 balanced accuracy: 0.763 kappa: 0.701 [[ 78 9 36] [ 2 84 36] [ 15 18 922]] global kappa rebalanced thresholds: (0.25, 
0.6500000000000001) accuracy: 0.904 balanced accuracy: 0.775 kappa: 0.707 [[ 86 3 34] [ 5 81 36] [ 25 12 918]] global balanced_accuracy rebalanced thresholds: (0.25, 0.7500000000000001) accuracy: 0.863 balanced accuracy: 0.791 kappa: 0.637 [[ 87 12 24] [ 5 94 23] [ 30 70 855]] -------------- original accuracy: 0.870 balanced accuracy: 0.583 kappa: 0.503 [[ 61 1 60] [ 3 31 89] [ 2 1 952]] rebalanced thresholds: [0.3, 0.6500000000000001] accuracy: 0.893 balanced accuracy: 0.732 kappa: 0.669 [[ 77 15 30] [ 4 74 45] [ 8 26 921]] global kappa rebalanced thresholds: (0.3, 0.6500000000000001) accuracy: 0.893 balanced accuracy: 0.732 kappa: 0.669 [[ 77 15 30] [ 4 74 45] [ 8 26 921]] global balanced_accuracy rebalanced thresholds: (0.2, 0.8) accuracy: 0.840 balanced accuracy: 0.788 kappa: 0.604 [[ 98 15 9] [ 16 86 21] [ 30 101 824]] -------------- original accuracy: 0.882 balanced accuracy: 0.617 kappa: 0.569 [[ 35 11 77] [ 3 69 50] [ 0 0 955]] rebalanced thresholds: [0.25, 0.7000000000000001] accuracy: 0.881 balanced accuracy: 0.701 kappa: 0.641 [[ 72 10 41] [ 33 68 21] [ 9 29 917]] global kappa rebalanced thresholds: (0.3, 0.6000000000000001) accuracy: 0.892 balanced accuracy: 0.682 kappa: 0.646 [[ 56 13 54] [ 18 74 30] [ 3 12 940]] global balanced_accuracy rebalanced thresholds: (0.3, 0.8) accuracy: 0.812 balanced accuracy: 0.697 kappa: 0.531 [[ 58 40 25] [ 18 92 12] [ 4 126 825]] -------------- original accuracy: 0.859 balanced accuracy: 0.552 kappa: 0.451 [[ 53 0 71] [ 1 29 95] [ 0 2 949]] rebalanced thresholds: [0.25, 0.7000000000000001] accuracy: 0.887 balanced accuracy: 0.762 kappa: 0.677 [[ 83 16 25] [ 13 84 28] [ 13 41 897]] global kappa rebalanced thresholds: (0.25, 0.7000000000000001) accuracy: 0.887 balanced accuracy: 0.762 kappa: 0.677 [[ 83 16 25] [ 13 84 28] [ 13 41 897]] global balanced_accuracy rebalanced thresholds: (0.25, 0.8) accuracy: 0.823 balanced accuracy: 0.772 kappa: 0.579 [[ 83 30 11] [ 13 100 12] [ 13 134 804]] -------------- original accuracy: 
0.865 balanced accuracy: 0.566 kappa: 0.484 [[ 40 6 77] [ 4 46 73] [ 0 2 952]] rebalanced thresholds: [0.25, 0.7000000000000001] accuracy: 0.877 balanced accuracy: 0.734 kappa: 0.650 [[ 70 22 31] [ 19 85 19] [ 9 47 898]] global kappa rebalanced thresholds: (0.3, 0.6500000000000001) accuracy: 0.886 balanced accuracy: 0.716 kappa: 0.654 [[ 61 21 41] [ 13 85 25] [ 4 33 917]] global balanced_accuracy rebalanced thresholds: (0.25, 0.7500000000000001) accuracy: 0.851 balanced accuracy: 0.727 kappa: 0.603 [[ 70 30 23] [ 19 87 17] [ 9 81 864]] -------------- original accuracy: 0.881 balanced accuracy: 0.613 kappa: 0.559 [[ 49 7 68] [ 3 54 65] [ 0 0 954]] rebalanced thresholds: [0.25, 0.6500000000000001] accuracy: 0.896 balanced accuracy: 0.739 kappa: 0.682 [[ 78 12 34] [ 13 76 33] [ 13 20 921]] global kappa rebalanced thresholds: (0.25, 0.6500000000000001) accuracy: 0.896 balanced accuracy: 0.739 kappa: 0.682 [[ 78 12 34] [ 13 76 33] [ 13 20 921]] global balanced_accuracy rebalanced thresholds: (0.25, 0.7500000000000001) accuracy: 0.863 balanced accuracy: 0.756 kappa: 0.632 [[ 78 27 19] [ 13 89 20] [ 16 70 868]] -------------- original accuracy: 0.922 balanced accuracy: 0.747 kappa: 0.738 [[ 87 2 33] [ 7 64 50] [ 1 0 956]] rebalanced thresholds: [0.3, 0.7000000000000001] accuracy: 0.948 balanced accuracy: 0.868 kappa: 0.847 [[104 11 7] [ 9 93 19] [ 5 11 941]] global kappa rebalanced thresholds: (0.3, 0.7000000000000001) accuracy: 0.948 balanced accuracy: 0.868 kappa: 0.847 [[104 11 7] [ 9 93 19] [ 5 11 941]] global balanced_accuracy rebalanced thresholds: (0.3, 0.8) accuracy: 0.899 balanced accuracy: 0.869 kappa: 0.737 [[104 13 5] [ 9 102 10] [ 5 79 873]] -------------- original accuracy: 0.858 balanced accuracy: 0.545 kappa: 0.446 [[ 48 4 72] [ 1 31 92] [ 1 0 951]] rebalanced thresholds: [0.3, 0.7000000000000001] accuracy: 0.879 balanced accuracy: 0.729 kappa: 0.645 [[ 83 25 16] [ 3 71 50] [ 9 42 901]] global kappa rebalanced thresholds: (0.25, 0.7000000000000001) 
accuracy: 0.883 balanced accuracy: 0.745 kappa: 0.658 [[ 96 12 16] [ 10 64 50] [ 19 33 900]] global balanced_accuracy rebalanced thresholds: (0.2, 0.8) accuracy: 0.830 balanced accuracy: 0.807 kappa: 0.597 [[109 7 8] [ 19 87 18] [ 46 106 800]] -------------- original accuracy: 0.907 balanced accuracy: 0.705 kappa: 0.686 [[ 79 11 34] [ 6 59 58] [ 1 2 950]] rebalanced thresholds: [0.35000000000000003, 0.6500000000000001] accuracy: 0.907 balanced accuracy: 0.773 kappa: 0.724 [[ 84 18 22] [ 8 83 32] [ 2 29 922]] global kappa rebalanced thresholds: (0.35000000000000003, 0.55) accuracy: 0.916 balanced accuracy: 0.753 kappa: 0.734 [[ 83 16 25] [ 8 74 41] [ 2 9 942]] global balanced_accuracy rebalanced thresholds: (0.35000000000000003, 0.7500000000000001) accuracy: 0.868 balanced accuracy: 0.769 kappa: 0.645 [[ 84 24 16] [ 8 88 27] [ 2 81 870]] -------------- original accuracy: 0.864 balanced accuracy: 0.565 kappa: 0.475 [[ 45 1 77] [ 3 41 79] [ 3 0 951]] rebalanced thresholds: [0.3, 0.7000000000000001] accuracy: 0.872 balanced accuracy: 0.724 kappa: 0.627 [[ 68 24 31] [ 5 84 34] [ 12 48 894]] global kappa rebalanced thresholds: (0.2, 0.6500000000000001) accuracy: 0.882 balanced accuracy: 0.719 kappa: 0.640 [[ 86 3 34] [ 19 62 42] [ 31 13 910]] global balanced_accuracy rebalanced thresholds: (0.2, 0.7500000000000001) accuracy: 0.858 balanced accuracy: 0.754 kappa: 0.617 [[ 92 11 20] [ 19 75 29] [ 38 54 862]] -------------- original accuracy: 0.880 balanced accuracy: 0.612 kappa: 0.558 [[ 59 3 61] [ 8 44 71] [ 1 0 953]] rebalanced thresholds: [0.25, 0.6500000000000001] accuracy: 0.901 balanced accuracy: 0.779 kappa: 0.708 [[ 86 7 30] [ 14 84 25] [ 16 27 911]] global kappa rebalanced thresholds: (0.25, 0.6500000000000001) accuracy: 0.901 balanced accuracy: 0.779 kappa: 0.708 [[ 86 7 30] [ 14 84 25] [ 16 27 911]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.8) accuracy: 0.832 balanced accuracy: 0.790 kappa: 0.595 [[104 4 15] [ 31 83 9] [ 64 79 811]] 
-------------- original accuracy: 0.881 balanced accuracy: 0.611 kappa: 0.558 [[ 51 4 68] [ 7 51 64] [ 0 0 955]] rebalanced thresholds: [0.25, 0.6500000000000001] accuracy: 0.901 balanced accuracy: 0.742 kappa: 0.691 [[ 87 7 29] [ 13 67 42] [ 4 24 927]] global kappa rebalanced thresholds: (0.25, 0.6000000000000001) accuracy: 0.907 balanced accuracy: 0.731 kappa: 0.699 [[ 85 4 34] [ 13 63 46] [ 2 12 941]] global balanced_accuracy rebalanced thresholds: (0.2, 0.7500000000000001) accuracy: 0.863 balanced accuracy: 0.755 kappa: 0.627 [[ 96 7 20] [ 25 70 27] [ 20 66 869]] -------------- original accuracy: 0.859 balanced accuracy: 0.540 kappa: 0.440 [[ 36 1 86] [ 5 40 77] [ 0 0 955]] rebalanced thresholds: [0.25, 0.7000000000000001] accuracy: 0.884 balanced accuracy: 0.764 kappa: 0.673 [[ 82 19 22] [ 15 84 23] [ 22 38 895]] global kappa rebalanced thresholds: (0.25, 0.6500000000000001) accuracy: 0.897 balanced accuracy: 0.739 kappa: 0.687 [[ 82 14 27] [ 15 71 36] [ 15 16 924]] global balanced_accuracy rebalanced thresholds: (0.25, 0.7500000000000001) accuracy: 0.846 balanced accuracy: 0.762 kappa: 0.607 [[ 82 26 15] [ 15 90 17] [ 22 90 843]] -------------- original accuracy: 0.877 balanced accuracy: 0.623 kappa: 0.549 [[ 53 1 69] [ 0 55 67] [ 4 7 944]] rebalanced thresholds: [0.3, 0.7000000000000001] accuracy: 0.860 balanced accuracy: 0.728 kappa: 0.598 [[ 77 14 32] [ 2 78 42] [ 21 57 877]] global kappa rebalanced thresholds: (0.25, 0.6000000000000001) accuracy: 0.875 balanced accuracy: 0.708 kappa: 0.609 [[ 78 4 41] [ 3 66 53] [ 23 26 906]] global balanced_accuracy rebalanced thresholds: (0.2, 0.7500000000000001) accuracy: 0.840 balanced accuracy: 0.774 kappa: 0.585 [[ 95 7 21] [ 3 83 36] [ 56 69 830]] -------------- original accuracy: 0.886 balanced accuracy: 0.632 kappa: 0.583 [[ 60 5 57] [ 3 50 70] [ 2 0 953]] rebalanced thresholds: [0.25, 0.6500000000000001] accuracy: 0.912 balanced accuracy: 0.766 kappa: 0.728 [[ 91 3 28] [ 20 71 32] [ 7 16 932]] global kappa 
rebalanced thresholds: (0.25, 0.6500000000000001) accuracy: 0.912 balanced accuracy: 0.766 kappa: 0.728 [[ 91 3 28] [ 20 71 32] [ 7 16 932]] global balanced_accuracy rebalanced thresholds: (0.25, 0.7500000000000001) accuracy: 0.885 balanced accuracy: 0.784 kappa: 0.682 [[ 92 11 19] [ 20 82 21] [ 8 59 888]] -------------- original accuracy: 0.898 balanced accuracy: 0.675 kappa: 0.643 [[ 75 6 41] [ 7 50 64] [ 2 2 953]] rebalanced thresholds: [0.3, 0.6500000000000001] accuracy: 0.916 balanced accuracy: 0.795 kappa: 0.748 [[ 94 13 15] [ 12 78 31] [ 9 21 927]] global kappa rebalanced thresholds: (0.3, 0.6000000000000001) accuracy: 0.919 balanced accuracy: 0.774 kappa: 0.747 [[ 94 9 19] [ 12 69 40] [ 8 9 940]] global balanced_accuracy rebalanced thresholds: (0.25, 0.7000000000000001) accuracy: 0.907 balanced accuracy: 0.807 kappa: 0.732 [[103 4 15] [ 20 76 25] [ 21 27 909]] -------------- original accuracy: 0.905 balanced accuracy: 0.712 kappa: 0.676 [[ 65 0 57] [ 5 75 42] [ 2 8 946]] rebalanced thresholds: [0.3, 0.6000000000000001] accuracy: 0.918 balanced accuracy: 0.789 kappa: 0.748 [[ 85 5 32] [ 9 85 28] [ 6 18 932]] global kappa rebalanced thresholds: (0.3, 0.6000000000000001) accuracy: 0.918 balanced accuracy: 0.789 kappa: 0.748 [[ 85 5 32] [ 9 85 28] [ 6 18 932]] global balanced_accuracy rebalanced thresholds: (0.25, 0.7500000000000001) accuracy: 0.881 balanced accuracy: 0.817 kappa: 0.684 [[ 94 12 16] [ 13 94 15] [ 17 70 869]] -------------- original accuracy: 0.865 balanced accuracy: 0.568 kappa: 0.478 [[ 53 1 70] [ 2 34 86] [ 2 1 951]] rebalanced thresholds: [0.3, 0.6500000000000001] accuracy: 0.889 balanced accuracy: 0.722 kappa: 0.654 [[ 77 9 38] [ 8 71 43] [ 12 23 919]] global kappa rebalanced thresholds: (0.25, 0.6500000000000001) accuracy: 0.894 balanced accuracy: 0.741 kappa: 0.671 [[ 84 4 36] [ 8 71 43] [ 20 16 918]] global balanced_accuracy rebalanced thresholds: (0.25, 0.7500000000000001) accuracy: 0.849 balanced accuracy: 0.767 kappa: 0.605 [[ 87 16 
21] [ 8 87 27] [ 23 86 845]] -------------- original accuracy: 0.902 balanced accuracy: 0.683 kappa: 0.656 [[ 55 6 61] [ 4 73 45] [ 1 0 955]] rebalanced thresholds: [0.25, 0.7000000000000001] accuracy: 0.910 balanced accuracy: 0.767 kappa: 0.726 [[ 86 11 25] [ 18 76 28] [ 8 18 930]] global kappa rebalanced thresholds: (0.3, 0.6000000000000001) accuracy: 0.915 balanced accuracy: 0.745 kappa: 0.727 [[ 75 12 35] [ 12 77 33] [ 5 5 946]] global balanced_accuracy rebalanced thresholds: (0.2, 0.7500000000000001) accuracy: 0.894 balanced accuracy: 0.763 kappa: 0.694 [[ 95 8 19] [ 29 68 25] [ 15 31 910]] -------------- original accuracy: 0.859 balanced accuracy: 0.548 kappa: 0.450 [[ 40 3 82] [ 2 40 81] [ 1 0 951]] rebalanced thresholds: [0.3, 0.6500000000000001] accuracy: 0.881 balanced accuracy: 0.716 kappa: 0.641 [[ 66 18 41] [ 8 82 33] [ 7 36 909]] global kappa rebalanced thresholds: (0.3, 0.6000000000000001) accuracy: 0.890 balanced accuracy: 0.697 kappa: 0.648 [[ 66 16 43] [ 8 72 43] [ 7 15 930]] global balanced_accuracy rebalanced thresholds: (0.3, 0.8) accuracy: 0.769 balanced accuracy: 0.710 kappa: 0.478 [[ 66 43 16] [ 8 99 16] [ 7 187 758]] -------------- original accuracy: 0.892 balanced accuracy: 0.652 kappa: 0.607 [[ 65 1 57] [ 1 53 69] [ 1 1 952]] rebalanced thresholds: [0.25, 0.6500000000000001] accuracy: 0.909 balanced accuracy: 0.790 kappa: 0.728 [[ 93 5 25] [ 10 80 33] [ 15 21 918]] global kappa rebalanced thresholds: (0.25, 0.6500000000000001) accuracy: 0.909 balanced accuracy: 0.790 kappa: 0.728 [[ 93 5 25] [ 10 80 33] [ 15 21 918]] global balanced_accuracy rebalanced thresholds: (0.25, 0.7500000000000001) accuracy: 0.875 balanced accuracy: 0.822 kappa: 0.677 [[ 95 15 13] [ 10 98 15] [ 16 81 857]] -------------- original accuracy: 0.866 balanced accuracy: 0.567 kappa: 0.485 [[ 30 6 88] [ 1 57 66] [ 0 0 952]] rebalanced thresholds: [0.2, 0.6500000000000001] accuracy: 0.883 balanced accuracy: 0.691 kappa: 0.629 [[ 65 3 56] [ 18 72 34] [ 16 13 923]] global 
kappa rebalanced thresholds: (0.2, 0.6500000000000001) accuracy: 0.883 balanced accuracy: 0.691 kappa: 0.629 [[ 65 3 56] [ 18 72 34] [ 16 13 923]] global balanced_accuracy rebalanced thresholds: (0.2, 0.8) accuracy: 0.834 balanced accuracy: 0.741 kappa: 0.578 [[ 72 21 31] [ 18 95 11] [ 22 96 834]] -------------- original accuracy: 0.892 balanced accuracy: 0.650 kappa: 0.606 [[ 47 2 72] [ 1 70 53] [ 0 2 953]] rebalanced thresholds: [0.2, 0.7000000000000001] accuracy: 0.894 balanced accuracy: 0.790 kappa: 0.697 [[ 85 7 29] [ 13 90 21] [ 30 27 898]] global kappa rebalanced thresholds: (0.2, 0.6500000000000001) accuracy: 0.898 balanced accuracy: 0.765 kappa: 0.695 [[ 81 5 35] [ 13 83 28] [ 26 15 914]] global balanced_accuracy rebalanced thresholds: (0.2, 0.8) accuracy: 0.841 balanced accuracy: 0.805 kappa: 0.612 [[ 86 15 20] [ 13 105 6] [ 34 103 818]] -------------- original accuracy: 0.896 balanced accuracy: 0.674 kappa: 0.634 [[ 57 4 62] [ 3 69 50] [ 1 5 949]] rebalanced thresholds: [0.2, 0.6500000000000001] accuracy: 0.918 balanced accuracy: 0.800 kappa: 0.754 [[ 85 5 33] [ 12 90 20] [ 13 15 927]] global kappa rebalanced thresholds: (0.3, 0.6500000000000001) accuracy: 0.915 balanced accuracy: 0.784 kappa: 0.741 [[ 75 11 37] [ 8 94 20] [ 4 22 929]] global balanced_accuracy rebalanced thresholds: (0.2, 0.7500000000000001) accuracy: 0.899 balanced accuracy: 0.830 kappa: 0.726 [[ 90 14 19] [ 12 101 9] [ 18 49 888]] -------------- original accuracy: 0.839 balanced accuracy: 0.479 kappa: 0.326 [[ 41 0 83] [ 3 13 107] [ 0 0 953]] rebalanced thresholds: [0.25, 0.7000000000000001] accuracy: 0.879 balanced accuracy: 0.705 kappa: 0.623 [[ 76 9 39] [ 6 67 50] [ 13 28 912]] global kappa rebalanced thresholds: (0.2, 0.7000000000000001) accuracy: 0.880 balanced accuracy: 0.710 kappa: 0.627 [[ 83 3 38] [ 11 62 50] [ 22 20 911]] global balanced_accuracy rebalanced thresholds: (0.2, 0.8) accuracy: 0.800 balanced accuracy: 0.740 kappa: 0.521 [[ 88 20 16] [ 11 84 28] [ 24 141 788]] 
-------------- original accuracy: 0.887 balanced accuracy: 0.634 kappa: 0.581 [[ 52 1 70] [ 1 59 63] [ 1 0 953]] rebalanced thresholds: [0.25, 0.6500000000000001] accuracy: 0.913 balanced accuracy: 0.789 kappa: 0.740 [[ 93 8 22] [ 13 79 31] [ 12 18 924]] global kappa rebalanced thresholds: (0.3, 0.6000000000000001) accuracy: 0.907 balanced accuracy: 0.739 kappa: 0.705 [[ 74 12 37] [ 8 78 37] [ 7 10 937]] global balanced_accuracy rebalanced thresholds: (0.2, 0.7500000000000001) accuracy: 0.876 balanced accuracy: 0.799 kappa: 0.675 [[104 11 8] [ 26 79 18] [ 31 55 868]] -------------- original accuracy: 0.899 balanced accuracy: 0.674 kappa: 0.646 [[ 63 3 55] [ 11 62 50] [ 1 1 954]] rebalanced thresholds: [0.3, 0.6500000000000001] accuracy: 0.938 balanced accuracy: 0.826 kappa: 0.817 [[ 87 21 13] [ 16 95 12] [ 1 11 944]] global kappa rebalanced thresholds: (0.3, 0.6500000000000001) accuracy: 0.938 balanced accuracy: 0.826 kappa: 0.817 [[ 87 21 13] [ 16 95 12] [ 1 11 944]] global balanced_accuracy rebalanced thresholds: (0.3, 0.7500000000000001) accuracy: 0.916 balanced accuracy: 0.831 kappa: 0.768 [[ 87 26 8] [ 16 101 6] [ 1 44 911]] -------------- original accuracy: 0.895 balanced accuracy: 0.662 kappa: 0.623 [[ 43 2 78] [ 3 78 41] [ 2 0 953]] rebalanced thresholds: [0.25, 0.7000000000000001] accuracy: 0.903 balanced accuracy: 0.813 kappa: 0.728 [[ 80 14 29] [ 11 103 8] [ 22 32 901]] global kappa rebalanced thresholds: (0.25, 0.6000000000000001) accuracy: 0.913 balanced accuracy: 0.774 kappa: 0.731 [[ 77 6 40] [ 10 88 24] [ 15 9 931]] global balanced_accuracy rebalanced thresholds: (0.2, 0.7500000000000001) accuracy: 0.880 balanced accuracy: 0.808 kappa: 0.681 [[ 87 11 25] [ 18 98 6] [ 41 43 871]] -------------- original accuracy: 0.870 balanced accuracy: 0.578 kappa: 0.497 [[ 28 2 92] [ 1 62 60] [ 0 1 954]] rebalanced thresholds: [0.25, 0.7000000000000001] accuracy: 0.894 balanced accuracy: 0.773 kappa: 0.690 [[ 80 10 32] [ 10 88 25] [ 17 33 905]] global kappa 
rebalanced thresholds: (0.25, 0.6500000000000001) accuracy: 0.908 balanced accuracy: 0.762 kappa: 0.716 [[ 77 8 37] [ 10 84 29] [ 13 13 929]] global balanced_accuracy rebalanced thresholds: (0.2, 0.7500000000000001) accuracy: 0.874 balanced accuracy: 0.796 kappa: 0.664 [[100 5 17] [ 21 81 21] [ 40 47 868]] -------------- original accuracy: 0.882 balanced accuracy: 0.621 kappa: 0.562 [[ 61 2 60] [ 1 45 76] [ 0 2 953]] rebalanced thresholds: [0.25, 0.6500000000000001] accuracy: 0.904 balanced accuracy: 0.796 kappa: 0.725 [[ 91 14 18] [ 14 85 23] [ 3 43 909]] global kappa rebalanced thresholds: (0.25, 0.6000000000000001) accuracy: 0.920 balanced accuracy: 0.788 kappa: 0.756 [[ 91 10 22] [ 14 79 29] [ 1 20 934]] global balanced_accuracy rebalanced thresholds: (0.2, 0.7500000000000001) accuracy: 0.868 balanced accuracy: 0.840 kappa: 0.669 [[102 8 13] [ 15 99 8] [ 9 106 840]] -------------- original accuracy: 0.861 balanced accuracy: 0.571 kappa: 0.472 [[ 33 0 89] [ 3 56 64] [ 1 10 944]] rebalanced thresholds: [0.2, 0.6500000000000001] accuracy: 0.876 balanced accuracy: 0.730 kappa: 0.627 [[ 67 2 53] [ 10 86 27] [ 25 32 898]] global kappa rebalanced thresholds: (0.25, 0.6500000000000001) accuracy: 0.881 balanced accuracy: 0.727 kappa: 0.634 [[ 63 2 57] [ 8 88 27] [ 14 35 906]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.8) accuracy: 0.801 balanced accuracy: 0.765 kappa: 0.533 [[ 91 5 26] [ 23 90 10] [ 65 110 780]] -------------- original accuracy: 0.864 balanced accuracy: 0.561 kappa: 0.474 [[ 48 1 73] [ 6 36 82] [ 1 0 953]] rebalanced thresholds: [0.25, 0.6500000000000001] accuracy: 0.892 balanced accuracy: 0.722 kappa: 0.663 [[ 80 7 35] [ 13 67 44] [ 13 17 924]] global kappa rebalanced thresholds: (0.25, 0.6500000000000001) accuracy: 0.892 balanced accuracy: 0.722 kappa: 0.663 [[ 80 7 35] [ 13 67 44] [ 13 17 924]] global balanced_accuracy rebalanced thresholds: (0.25, 0.7500000000000001) accuracy: 0.851 balanced accuracy: 0.734 kappa: 0.594 
[[ 80 16 26] [ 13 80 31] [ 14 79 861]] -------------- original accuracy: 0.892 balanced accuracy: 0.650 kappa: 0.614 [[ 48 7 68] [ 6 69 48] [ 0 1 953]] rebalanced thresholds: [0.3, 0.6000000000000001] accuracy: 0.902 balanced accuracy: 0.709 kappa: 0.691 [[ 70 18 35] [ 23 70 30] [ 0 11 943]] global kappa rebalanced thresholds: (0.3, 0.6000000000000001) accuracy: 0.902 balanced accuracy: 0.709 kappa: 0.691 [[ 70 18 35] [ 23 70 30] [ 0 11 943]] global balanced_accuracy rebalanced thresholds: (0.3, 0.7000000000000001) accuracy: 0.894 balanced accuracy: 0.748 kappa: 0.697 [[ 71 30 22] [ 23 87 13] [ 0 39 915]] -------------- original accuracy: 0.885 balanced accuracy: 0.629 kappa: 0.576 [[ 56 0 67] [ 5 53 64] [ 1 1 953]] rebalanced thresholds: [0.3, 0.6500000000000001] accuracy: 0.912 balanced accuracy: 0.764 kappa: 0.727 [[ 84 8 31] [ 11 77 34] [ 7 14 934]] global kappa rebalanced thresholds: (0.25, 0.6000000000000001) accuracy: 0.916 balanced accuracy: 0.753 kappa: 0.729 [[ 87 1 35] [ 16 69 37] [ 7 5 943]] global balanced_accuracy rebalanced thresholds: (0.2, 0.7500000000000001) accuracy: 0.896 balanced accuracy: 0.809 kappa: 0.714 [[102 6 15] [ 24 81 17] [ 19 44 892]] -------------- original accuracy: 0.879 balanced accuracy: 0.609 kappa: 0.550 [[ 37 5 82] [ 2 65 56] [ 0 0 953]] rebalanced thresholds: [0.25, 0.6500000000000001] accuracy: 0.913 balanced accuracy: 0.797 kappa: 0.745 [[ 83 8 33] [ 15 93 15] [ 13 20 920]] global kappa rebalanced thresholds: (0.25, 0.6500000000000001) accuracy: 0.913 balanced accuracy: 0.797 kappa: 0.745 [[ 83 8 33] [ 15 93 15] [ 13 20 920]] global balanced_accuracy rebalanced thresholds: (0.2, 0.7500000000000001) accuracy: 0.879 balanced accuracy: 0.822 kappa: 0.687 [[ 97 9 18] [ 19 96 8] [ 28 63 862]] -------------- original accuracy: 0.865 balanced accuracy: 0.564 kappa: 0.469 [[ 47 0 75] [ 0 38 85] [ 1 1 953]] rebalanced thresholds: [0.25, 0.7000000000000001] accuracy: 0.888 balanced accuracy: 0.735 kappa: 0.659 [[ 82 9 31] [ 10 71 
42] [ 11 31 913]] global kappa rebalanced thresholds: (0.25, 0.6500000000000001) accuracy: 0.900 balanced accuracy: 0.728 kappa: 0.677 [[ 80 2 40] [ 10 68 45] [ 11 12 932]] global balanced_accuracy rebalanced thresholds: (0.2, 0.7500000000000001) accuracy: 0.860 balanced accuracy: 0.747 kappa: 0.615 [[ 92 8 22] [ 19 71 33] [ 27 59 869]] -------------- original accuracy: 0.860 balanced accuracy: 0.559 kappa: 0.459 [[ 34 2 87] [ 1 50 72] [ 3 3 948]] rebalanced thresholds: [0.25, 0.6500000000000001] accuracy: 0.882 balanced accuracy: 0.747 kappa: 0.651 [[ 87 7 29] [ 10 73 40] [ 24 32 898]] global kappa rebalanced thresholds: (0.25, 0.6000000000000001) accuracy: 0.887 balanced accuracy: 0.726 kappa: 0.649 [[ 83 5 35] [ 10 67 46] [ 18 22 914]] global balanced_accuracy rebalanced thresholds: (0.2, 0.7500000000000001) accuracy: 0.823 balanced accuracy: 0.756 kappa: 0.560 [[ 95 10 18] [ 19 79 25] [ 52 89 813]] -------------- original accuracy: 0.902 balanced accuracy: 0.685 kappa: 0.651 [[ 78 2 42] [ 1 51 70] [ 2 1 953]] rebalanced thresholds: [0.3, 0.6500000000000001] accuracy: 0.923 balanced accuracy: 0.818 kappa: 0.768 [[101 6 15] [ 4 80 38] [ 11 18 927]] global kappa rebalanced thresholds: (0.25, 0.6500000000000001) accuracy: 0.922 balanced accuracy: 0.815 kappa: 0.766 [[103 5 14] [ 7 77 38] [ 15 14 927]] global balanced_accuracy rebalanced thresholds: (0.2, 0.8) accuracy: 0.851 balanced accuracy: 0.842 kappa: 0.641 [[108 11 3] [ 12 96 14] [ 28 111 817]] -------------- original accuracy: 0.860 balanced accuracy: 0.547 kappa: 0.449 [[ 54 0 69] [ 5 25 93] [ 1 0 953]] rebalanced thresholds: [0.25, 0.7000000000000001] accuracy: 0.901 balanced accuracy: 0.772 kappa: 0.706 [[101 10 12] [ 13 66 44] [ 17 23 914]] global kappa rebalanced thresholds: (0.25, 0.6500000000000001) accuracy: 0.902 balanced accuracy: 0.737 kappa: 0.693 [[101 7 15] [ 13 51 59] [ 14 9 931]] global balanced_accuracy rebalanced thresholds: (0.2, 0.8) accuracy: 0.819 balanced accuracy: 0.797 kappa: 0.573 
[[111 5 7] [ 18 81 24] [ 35 128 791]] -------------- original accuracy: 0.851 balanced accuracy: 0.518 kappa: 0.396 [[ 29 2 92] [ 2 39 81] [ 1 1 953]] rebalanced thresholds: [0.2, 0.7000000000000001] accuracy: 0.861 balanced accuracy: 0.702 kappa: 0.600 [[ 76 10 37] [ 28 68 26] [ 33 33 889]] global kappa rebalanced thresholds: (0.25, 0.7000000000000001) accuracy: 0.871 balanced accuracy: 0.725 kappa: 0.623 [[ 69 13 41] [ 13 83 26] [ 19 43 893]] global balanced_accuracy rebalanced thresholds: (0.2, 0.7500000000000001) accuracy: 0.846 balanced accuracy: 0.724 kappa: 0.587 [[ 79 12 32] [ 28 77 17] [ 35 61 859]] -------------- original accuracy: 0.876 balanced accuracy: 0.604 kappa: 0.540 [[ 58 3 62] [ 7 42 73] [ 2 2 951]] rebalanced thresholds: [0.3, 0.6500000000000001] accuracy: 0.901 balanced accuracy: 0.773 kappa: 0.707 [[ 89 12 22] [ 12 78 32] [ 10 31 914]] global kappa rebalanced thresholds: (0.3, 0.6500000000000001) accuracy: 0.901 balanced accuracy: 0.773 kappa: 0.707 [[ 89 12 22] [ 12 78 32] [ 10 31 914]] global balanced_accuracy rebalanced thresholds: (0.3, 0.7000000000000001) accuracy: 0.892 balanced accuracy: 0.784 kappa: 0.695 [[ 89 19 15] [ 12 84 26] [ 10 48 897]] -------------- original accuracy: 0.833 balanced accuracy: 0.451 kappa: 0.275 [[ 21 2 99] [ 3 22 97] [ 0 0 956]] rebalanced thresholds: [0.2, 0.7000000000000001] accuracy: 0.899 balanced accuracy: 0.731 kappa: 0.692 [[ 86 10 26] [ 30 63 29] [ 9 17 930]] global kappa rebalanced thresholds: (0.2, 0.7000000000000001) accuracy: 0.899 balanced accuracy: 0.731 kappa: 0.692 [[ 86 10 26] [ 30 63 29] [ 9 17 930]] global balanced_accuracy rebalanced thresholds: (0.15000000000000002, 0.8) accuracy: 0.817 balanced accuracy: 0.728 kappa: 0.557 [[101 11 10] [ 53 61 8] [ 34 104 818]] -------------- original accuracy: 0.882 balanced accuracy: 0.623 kappa: 0.580 [[ 44 14 65] [ 6 63 54] [ 1 1 952]] rebalanced thresholds: [0.3, 0.6500000000000001] accuracy: 0.909 balanced accuracy: 0.759 kappa: 0.726 [[ 72 22 29] 
[ 14 88 21] [ 7 16 931]] global kappa rebalanced thresholds: (0.3, 0.6500000000000001) accuracy: 0.909 balanced accuracy: 0.759 kappa: 0.726 [[ 72 22 29] [ 14 88 21] [ 7 16 931]] global balanced_accuracy rebalanced thresholds: (0.3, 0.7500000000000001) accuracy: 0.876 balanced accuracy: 0.766 kappa: 0.667 [[ 73 36 14] [ 14 96 13] [ 8 64 882]] -------------- original accuracy: 0.857 balanced accuracy: 0.534 kappa: 0.436 [[ 37 3 82] [ 8 37 79] [ 0 0 954]] rebalanced thresholds: [0.2, 0.7000000000000001] accuracy: 0.881 balanced accuracy: 0.700 kappa: 0.640 [[ 78 14 30] [ 28 62 34] [ 16 21 917]] global kappa rebalanced thresholds: (0.2, 0.7000000000000001) accuracy: 0.881 balanced accuracy: 0.700 kappa: 0.640 [[ 78 14 30] [ 28 62 34] [ 16 21 917]] global balanced_accuracy rebalanced thresholds: (0.2, 0.7500000000000001) accuracy: 0.863 balanced accuracy: 0.721 kappa: 0.620 [[ 81 21 20] [ 28 71 25] [ 18 53 883]] -------------- original accuracy: 0.886 balanced accuracy: 0.633 kappa: 0.588 [[ 46 7 68] [ 5 65 55] [ 2 0 952]] rebalanced thresholds: [0.3, 0.6500000000000001] accuracy: 0.895 balanced accuracy: 0.731 kappa: 0.684 [[ 71 17 33] [ 21 80 24] [ 8 23 923]] global kappa rebalanced thresholds: (0.3, 0.6000000000000001) accuracy: 0.902 balanced accuracy: 0.720 kappa: 0.691 [[ 71 13 37] [ 21 74 30] [ 6 11 937]] global balanced_accuracy rebalanced thresholds: (0.3, 0.7500000000000001) accuracy: 0.871 balanced accuracy: 0.740 kappa: 0.646 [[ 71 29 21] [ 21 88 16] [ 8 60 886]] -------------- original accuracy: 0.882 balanced accuracy: 0.625 kappa: 0.569 [[ 46 4 73] [ 2 63 60] [ 1 2 949]] rebalanced thresholds: [0.3, 0.6500000000000001] accuracy: 0.902 balanced accuracy: 0.776 kappa: 0.714 [[ 76 26 21] [ 3 94 28] [ 6 34 912]] global kappa rebalanced thresholds: (0.3, 0.6000000000000001) accuracy: 0.907 balanced accuracy: 0.762 kappa: 0.715 [[ 75 17 31] [ 3 88 34] [ 5 22 925]] global balanced_accuracy rebalanced thresholds: (0.3, 0.7500000000000001) accuracy: 0.869 
balanced accuracy: 0.799 kappa: 0.663 [[ 76 31 16] [ 3 110 12] [ 6 89 857]] -------------- original accuracy: 0.892 balanced accuracy: 0.671 kappa: 0.620 [[ 54 3 66] [ 0 72 51] [ 8 2 944]] rebalanced thresholds: [0.25, 0.6500000000000001] accuracy: 0.907 balanced accuracy: 0.796 kappa: 0.725 [[ 84 8 31] [ 7 92 24] [ 20 22 912]] global kappa rebalanced thresholds: (0.25, 0.6500000000000001) accuracy: 0.907 balanced accuracy: 0.796 kappa: 0.725 [[ 84 8 31] [ 7 92 24] [ 20 22 912]] global balanced_accuracy rebalanced thresholds: (0.25, 0.7500000000000001) accuracy: 0.883 balanced accuracy: 0.814 kappa: 0.690 [[ 88 19 16] [ 7 100 16] [ 21 61 872]]
accum = accum_10_10_80
figsize(9,6)
scatter([x['orig-kappa'] for x in accum],[x['shift-kappa'] for x in accum],label='kappa');
scatter([x['orig-balanced'] for x in accum],[x['shift-balanced'] for x in accum],label='balanced accuracy');
scatter([x['orig-accuracy'] for x in accum],[x['shift-accuracy'] for x in accum],label='accuracy');
plot([.4,1],[.4,1]);
legend();
xlabel('orig')
ylabel('greedy shift');
title('10-10-80');
scatter([x['shift-kappa'] for x in accum],[x['global-k-shift-kappa'] for x in accum],label='kappa');
scatter([x['shift-balanced'] for x in accum],[x['global-k-shift-balanced'] for x in accum],label='balanced accuracy');
scatter([x['shift-accuracy'] for x in accum],[x['global-k-shift-accuracy'] for x in accum],label='accuracy');
plot([.4,1],[.4,1]);
legend();
xlabel('greedy shift')
ylabel('grid-kappa');
title('10-10-80');
scatter([x['shift-kappa'] for x in accum],[x['global-ba-shift-kappa'] for x in accum],label='kappa');
scatter([x['shift-balanced'] for x in accum],[x['global-ba-shift-balanced'] for x in accum],label='balanced accuracy');
scatter([x['shift-accuracy'] for x in accum],[x['global-ba-shift-accuracy'] for x in accum],label='accuracy');
plot([.4,1],[.4,1]);
legend();
xlabel('greedy shift')
ylabel('grid-balanced');
title('10-10-80');
Same conclusions as before (good thing!)
Let's just be sure that this approach works with bioactivity data too. I don't think it's necessary to do a comprehensive evaluation here, but I want to show a couple of examples. I didn't cherry-pick these.
data = pd.read_csv('../data/target_CHEMBL205.csv.gz')
PandasTools.AddMoleculeColumnToFrame(data,smilesCol='canonical_smiles')
data['pKi'] = [-math.log10(x*1e-9) for x in data['standard_value']]
data.head()
 | compound_chembl_id | canonical_smiles | standard_value | standard_units | standard_relation | standard_type | year | ROMol | pKi |
---|---|---|---|---|---|---|---|---|---|
0 | CHEMBL1054 | NS(=O)(=O)c1cc2c(cc1Cl)NC(C(Cl)Cl)NS2(=O)=O | 91.0 | nM | = | Ki | 2009 | | 7.040959 |
1 | CHEMBL1055 | NS(=O)(=O)c1cc(C2(O)NC(=O)c3ccccc32)ccc1Cl | 138.0 | nM | = | Ki | 2009 | | 6.860121 |
2 | CHEMBL1060 | O=P([O-])([O-])O.[Na+].[Na+] | 13200000.0 | nM | = | Ki | 2004 | | 1.879426 |
3 | CHEMBL106848 | NS(=O)(=O)c1ccc(SCCO)cc1 | 21.0 | nM | = | Ki | 2013 | | 7.677781 |
4 | CHEMBL107217 | CCN(CC)C(=S)[S-].[Na+] | 3100.0 | nM | = | Ki | 2009 | | 5.508638 |
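As a quick sanity check of the pKi conversion, here is a minimal sketch that recomputes two of the values in the table above. The helper name `pki_from_nm` is only for illustration and is not part of the notebook; it assumes, as the code above does, that `standard_value` is a Ki in nM.

```python
import math

def pki_from_nm(ki_nm):
    """Convert a Ki given in nM to pKi = -log10(Ki in mol/L)."""
    return -math.log10(ki_nm * 1e-9)

# Ki values taken from rows 0 and 2 of the table above
for ki_nm in (91.0, 13200000.0):
    print(f"{ki_nm:>12.1f} nM -> pKi {pki_from_nm(ki_nm):.6f}")
```

The same relationship gives a feel for the bin edges used below: a pKi of 5 corresponds to a Ki of 10000 nM (10 µM).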
Pick two pKi values for binning
def binner(act,bins=(5,8.5)):
    # map a pKi value to a class: 0 if pKi <= 5, 1 if 5 < pKi <= 8.5, 2 otherwise
    for i,bin in enumerate(bins):
        if act<=bin:
            return i
    return len(bins)
data['activity'] = [binner(x) for x in data.pKi]
data.groupby('activity').describe()
standard_value | year | pKi | |||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | mean | std | min | 25% | 50% | 75% | max | count | mean | ... | 75% | max | count | mean | std | min | 25% | 50% | 75% | max | |
activity | |||||||||||||||||||||
0 | 968.0 | 1.242009e+18 | 3.864224e+19 | 10000.000 | 10000.0000 | 50000.0 | 196700.000 | 1.202264e+21 | 968.0 | 2012.994835 | ... | 2016.0 | 2020.0 | 968.0 | 4.069107 | 1.200449 | -12.080000 | 3.706216 | 4.301030 | 5.000000 | 5.00000 |
1 | 3582.0 | 7.292523e+02 | 1.778519e+03 | 3.200 | 13.5000 | 73.4 | 417.750 | 9.900000e+03 | 3582.0 | 2013.261307 | ... | 2017.0 | 2020.0 | 3582.0 | 7.050231 | 0.915651 | 5.004365 | 6.379084 | 7.134306 | 7.869666 | 8.49485 |
2 | 427.0 | 1.309327e+00 | 8.709364e-01 | 0.008 | 0.6355 | 1.0 | 2.035 | 3.100000e+00 | 427.0 | 2014.962529 | ... | 2017.0 | 2020.0 | 427.0 | 9.050659 | 0.500779 | 8.508638 | 8.691437 | 9.000000 | 9.196895 | 11.09691 |
3 rows × 24 columns
Ok, that's imbalanced :-)
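To put a number on "imbalanced", here is a small sketch that applies the same binning rule to a few pKi values and then turns the class counts from the `describe()` output above into fractions. Note that `np.digitize` is a vectorized stand-in for the `binner` function, not what the notebook itself uses; with `right=True` it assigns values equal to a bin edge to the lower class, which matches the `act<=bin` test.

```python
import numpy as np

bins = (5, 8.5)

# a few pKi values, including both bin edges
pki = np.array([1.879426, 5.0, 7.040959, 8.5, 9.0])
classes = np.digitize(pki, bins, right=True)
print(dict(zip(pki.tolist(), classes.tolist())))

# class counts copied from the describe() output above
counts = {0: 968, 1: 3582, 2: 427}
total = sum(counts.values())
fractions = {cls: n / total for cls, n in counts.items()}
for cls, frac in fractions.items():
    print(f"class {cls}: {counts[cls]:5d} compounds ({frac:.1%})")
```

So roughly 72% of the compounds fall in the middle class and fewer than 9% in the most active one.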
Generate fingerprints:
from rdkit.Chem import SaltRemover
sr = SaltRemover.SaltRemover()
stripped = [sr.StripMol(m) for m in data.ROMol]
fpgen = rdFingerprintGenerator.GetMorganGenerator(radius=2)
fps = [fpgen.GetFingerprint(m) for m in stripped]
And now run the experiment with 20 random splits:
accum_chembl205 = []
for i in range(20):
    run_ternary_experiment(fps,data.activity,accum_chembl205,random_state=0xf00d+i)
original accuracy: 0.833 balanced accuracy: 0.547 kappa: 0.530 [[126 68 0] [ 14 703 0] [ 0 84 1]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.851 balanced accuracy: 0.719 kappa: 0.647 [[154 38 2] [ 39 656 22] [ 0 47 38]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.851 balanced accuracy: 0.719 kappa: 0.647 [[154 38 2] [ 39 656 22] [ 0 47 38]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.712 balanced accuracy: 0.776 kappa: 0.493 [[177 12 5] [ 96 467 154] [ 3 17 65]] original accuracy: 0.822 balanced accuracy: 0.538 kappa: 0.497 [[117 77 0] [ 17 699 1] [ 0 82 3]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.837 balanced accuracy: 0.723 kappa: 0.622 [[149 45 0] [ 41 642 34] [ 1 41 43]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.837 balanced accuracy: 0.723 kappa: 0.622 [[149 45 0] [ 41 642 34] [ 1 41 43]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.745 balanced accuracy: 0.785 kappa: 0.528 [[172 18 4] [ 93 505 119] [ 2 18 65]] original accuracy: 0.810 balanced accuracy: 0.518 kappa: 0.463 [[114 80 0] [ 23 693 1] [ 1 84 0]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.827 balanced accuracy: 0.704 kappa: 0.596 [[146 48 0] [ 49 638 30] [ 0 45 40]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.827 balanced accuracy: 0.704 kappa: 0.596 [[146 48 0] [ 49 638 30] [ 0 45 40]] global balanced_accuracy rebalanced thresholds: (0.35000000000000003, 0.1) accuracy: 0.728 balanced accuracy: 0.758 kappa: 0.494 [[162 28 4] [ 83 500 134] [ 0 22 63]] original accuracy: 0.820 balanced accuracy: 0.534 kappa: 0.491 [[115 79 0] [ 17 699 1] [ 1 81 3]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.841 balanced accuracy: 0.729 kappa: 0.631 [[152 42 0] [ 48 643 26] [ 1 41 43]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.841 balanced accuracy: 0.729 kappa: 
0.631 [[152 42 0] [ 48 643 26] [ 1 41 43]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.735 balanced accuracy: 0.780 kappa: 0.518 [[177 12 5] [ 96 492 129] [ 1 21 63]] original accuracy: 0.834 balanced accuracy: 0.549 kappa: 0.535 [[127 67 0] [ 14 703 0] [ 2 82 1]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.852 balanced accuracy: 0.737 kappa: 0.652 [[149 45 0] [ 34 655 28] [ 1 39 45]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.852 balanced accuracy: 0.737 kappa: 0.652 [[149 45 0] [ 34 655 28] [ 1 39 45]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.772 balanced accuracy: 0.792 kappa: 0.566 [[176 15 3] [ 75 531 111] [ 2 21 62]] original accuracy: 0.814 balanced accuracy: 0.511 kappa: 0.462 [[107 87 0] [ 13 704 0] [ 3 82 0]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.842 balanced accuracy: 0.707 kappa: 0.619 [[140 53 1] [ 37 658 22] [ 0 44 41]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.842 balanced accuracy: 0.707 kappa: 0.619 [[140 53 1] [ 37 658 22] [ 0 44 41]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.749 balanced accuracy: 0.783 kappa: 0.533 [[174 17 3] [ 81 509 127] [ 1 21 63]] original accuracy: 0.820 balanced accuracy: 0.530 kappa: 0.494 [[120 74 0] [ 20 697 0] [ 2 83 0]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.832 balanced accuracy: 0.703 kappa: 0.607 [[151 42 1] [ 54 641 22] [ 0 48 37]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.832 balanced accuracy: 0.703 kappa: 0.607 [[151 42 1] [ 54 641 22] [ 0 48 37]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.747 balanced accuracy: 0.769 kappa: 0.526 [[175 15 4] [108 510 99] [ 3 23 59]] original accuracy: 0.825 balanced accuracy: 0.539 kappa: 0.507 [[122 72 0] [ 18 699 0] [ 0 84 1]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.835 balanced 
accuracy: 0.690 kappa: 0.605 [[145 48 1] [ 45 652 20] [ 1 49 35]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.835 balanced accuracy: 0.690 kappa: 0.605 [[145 48 1] [ 45 652 20] [ 1 49 35]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.750 balanced accuracy: 0.788 kappa: 0.536 [[172 17 5] [ 93 510 114] [ 3 17 65]] original accuracy: 0.829 balanced accuracy: 0.539 kappa: 0.514 [[121 73 0] [ 13 704 0] [ 0 84 1]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.843 balanced accuracy: 0.695 kappa: 0.624 [[149 44 1] [ 31 657 29] [ 1 50 34]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.843 balanced accuracy: 0.695 kappa: 0.624 [[149 44 1] [ 31 657 29] [ 1 50 34]] global balanced_accuracy rebalanced thresholds: (0.35000000000000003, 0.1) accuracy: 0.794 balanced accuracy: 0.777 kappa: 0.586 [[164 26 4] [ 51 568 98] [ 1 25 59]] original accuracy: 0.826 balanced accuracy: 0.536 kappa: 0.509 [[122 72 0] [ 15 701 1] [ 1 84 0]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.846 balanced accuracy: 0.717 kappa: 0.637 [[152 41 1] [ 37 652 28] [ 0 46 39]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.846 balanced accuracy: 0.717 kappa: 0.637 [[152 41 1] [ 37 652 28] [ 0 46 39]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.735 balanced accuracy: 0.766 kappa: 0.512 [[177 13 4] [107 496 114] [ 0 26 59]] original accuracy: 0.820 balanced accuracy: 0.530 kappa: 0.490 [[117 77 0] [ 18 699 0] [ 1 83 1]] rebalanced thresholds: [0.45, 0.15000000000000002] accuracy: 0.831 balanced accuracy: 0.696 kappa: 0.592 [[132 62 0] [ 30 654 33] [ 1 42 42]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.834 balanced accuracy: 0.720 kappa: 0.616 [[150 44 0] [ 45 639 33] [ 1 42 42]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.715 balanced accuracy: 0.768 kappa: 0.490 [[175 16 3] [101 
474 142] [ 1 21 63]] original accuracy: 0.827 balanced accuracy: 0.537 kappa: 0.509 [[120 74 0] [ 14 703 0] [ 1 83 1]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.838 balanced accuracy: 0.700 kappa: 0.616 [[152 42 0] [ 47 648 22] [ 0 50 35]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.838 balanced accuracy: 0.700 kappa: 0.616 [[152 42 0] [ 47 648 22] [ 0 50 35]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.746 balanced accuracy: 0.784 kappa: 0.531 [[176 15 3] [ 96 504 117] [ 2 20 63]] original accuracy: 0.828 balanced accuracy: 0.538 kappa: 0.514 [[123 71 0] [ 15 702 0] [ 1 84 0]] rebalanced thresholds: [0.45, 0.15000000000000002] accuracy: 0.845 balanced accuracy: 0.711 kappa: 0.623 [[134 60 0] [ 24 664 29] [ 0 41 44]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.833 balanced accuracy: 0.725 kappa: 0.614 [[149 45 0] [ 51 637 29] [ 0 41 44]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.736 balanced accuracy: 0.778 kappa: 0.518 [[181 12 1] [102 491 124] [ 1 23 61]] original accuracy: 0.819 balanced accuracy: 0.525 kappa: 0.485 [[116 78 0] [ 17 700 0] [ 1 84 0]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.820 balanced accuracy: 0.685 kappa: 0.577 [[144 50 0] [ 47 637 33] [ 0 49 36]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.820 balanced accuracy: 0.685 kappa: 0.577 [[144 50 0] [ 47 637 33] [ 0 49 36]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.727 balanced accuracy: 0.757 kappa: 0.495 [[170 21 3] [100 494 123] [ 0 25 60]] original accuracy: 0.822 balanced accuracy: 0.534 kappa: 0.502 [[122 72 0] [ 20 697 0] [ 3 82 0]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.827 balanced accuracy: 0.701 kappa: 0.598 [[149 45 0] [ 53 637 27] [ 2 45 38]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.827 balanced accuracy: 0.701 
kappa: 0.598 [[149 45 0] [ 53 637 27] [ 2 45 38]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.757 balanced accuracy: 0.791 kappa: 0.543 [[175 19 0] [ 98 515 104] [ 2 19 64]] original accuracy: 0.827 balanced accuracy: 0.541 kappa: 0.517 [[126 68 0] [ 19 698 0] [ 2 83 0]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.831 balanced accuracy: 0.709 kappa: 0.614 [[159 34 1] [ 54 633 30] [ 3 46 36]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.831 balanced accuracy: 0.709 kappa: 0.614 [[159 34 1] [ 54 633 30] [ 3 46 36]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.746 balanced accuracy: 0.781 kappa: 0.529 [[174 17 3] [101 506 110] [ 4 18 63]] original accuracy: 0.826 balanced accuracy: 0.533 kappa: 0.505 [[117 77 0] [ 12 705 0] [ 4 80 1]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.812 balanced accuracy: 0.663 kappa: 0.556 [[141 53 0] [ 51 636 30] [ 3 50 32]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.812 balanced accuracy: 0.663 kappa: 0.556 [[141 53 0] [ 51 636 30] [ 3 50 32]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.739 balanced accuracy: 0.742 kappa: 0.506 [[170 21 3] [ 92 512 113] [ 3 28 54]] original accuracy: 0.837 balanced accuracy: 0.564 kappa: 0.557 [[141 53 0] [ 24 693 0] [ 0 85 0]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.853 balanced accuracy: 0.735 kappa: 0.662 [[169 25 0] [ 54 644 19] [ 0 48 37]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.853 balanced accuracy: 0.735 kappa: 0.662 [[169 25 0] [ 54 644 19] [ 0 48 37]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.754 balanced accuracy: 0.810 kappa: 0.552 [[186 6 2] [102 499 116] [ 0 19 66]] original accuracy: 0.809 balanced accuracy: 0.506 kappa: 0.447 [[105 89 0] [ 16 701 0] [ 2 83 0]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.813 
balanced accuracy: 0.671 kappa: 0.557 [[138 56 0] [ 48 637 32] [ 2 48 35]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.813 balanced accuracy: 0.671 kappa: 0.557 [[138 56 0] [ 48 637 32] [ 2 48 35]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.730 balanced accuracy: 0.758 kappa: 0.502 [[170 20 4] [ 83 497 137] [ 2 23 60]] original accuracy: 0.821 balanced accuracy: 0.538 kappa: 0.499 [[120 74 0] [ 20 696 1] [ 1 82 2]] rebalanced thresholds: [0.4, 0.15000000000000002] accuracy: 0.838 balanced accuracy: 0.716 kappa: 0.623 [[154 40 0] [ 42 642 33] [ 0 46 39]] global kappa rebalanced thresholds: (0.4, 0.15000000000000002) accuracy: 0.838 balanced accuracy: 0.716 kappa: 0.623 [[154 40 0] [ 42 642 33] [ 0 46 39]] global balanced_accuracy rebalanced thresholds: (0.3, 0.1) accuracy: 0.771 balanced accuracy: 0.814 kappa: 0.572 [[180 14 0] [ 78 521 118] [ 1 17 67]]
accum = accum_chembl205
figsize(9,6)
scatter([x['orig-kappa'] for x in accum],[x['shift-kappa'] for x in accum],label='kappa');
scatter([x['orig-balanced'] for x in accum],[x['shift-balanced'] for x in accum],label='balanced accuracy');
scatter([x['orig-accuracy'] for x in accum],[x['shift-accuracy'] for x in accum],label='accuracy');
plot([.4,1],[.4,1]);
legend();
xlabel('orig')
ylabel('greedy shift');
title('CHEMBL205');
scatter([x['shift-kappa'] for x in accum],[x['global-k-shift-kappa'] for x in accum],label='kappa');
scatter([x['shift-balanced'] for x in accum],[x['global-k-shift-balanced'] for x in accum],label='balanced accuracy');
scatter([x['shift-accuracy'] for x in accum],[x['global-k-shift-accuracy'] for x in accum],label='accuracy');
plot([.4,1],[.4,1]);
legend();
xlabel('greedy shift')
ylabel('grid-kappa');
title('CHEMBL205');
scatter([x['shift-kappa'] for x in accum],[x['global-ba-shift-kappa'] for x in accum],label='kappa');
scatter([x['shift-balanced'] for x in accum],[x['global-ba-shift-balanced'] for x in accum],label='balanced accuracy');
scatter([x['shift-accuracy'] for x in accum],[x['global-ba-shift-accuracy'] for x in accum],label='accuracy');
plot([.4,1],[.4,1]);
legend();
xlabel('greedy shift')
ylabel('grid-balanced');
title('CHEMBL205');
We see the same behavior as before: shifting the decision thresholds using either the greedy approach or the grid-based approach improves balanced accuracy over the default decision thresholds, and the greedy and kappa-optimized grid searches improve kappa as well.
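To make it easier to see where the improvement comes from, here is a minimal sketch that recomputes the three reported metrics directly from two of the confusion matrices printed for the first CHEMBL205 split. The function name `metrics_from_confusion` is hypothetical. It assumes rows are the true classes and columns the predictions (the scikit-learn convention); under that assumption it reproduces the numbers printed above.

```python
import numpy as np

def metrics_from_confusion(cm):
    """Accuracy, balanced accuracy, and Cohen's kappa from a confusion matrix.

    Assumes rows are true classes and columns are predicted classes.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    accuracy = np.trace(cm) / total
    # balanced accuracy: mean of the per-class recalls
    balanced = np.mean(np.diag(cm) / cm.sum(axis=1))
    # kappa: observed agreement corrected for the agreement expected
    # by chance from the row and column marginals
    expected = np.dot(cm.sum(axis=1), cm.sum(axis=0)) / total**2
    kappa = (accuracy - expected) / (1 - expected)
    return accuracy, balanced, kappa

# confusion matrices copied from the first CHEMBL205 split above
original = [[126, 68, 0], [14, 703, 0], [0, 84, 1]]
shifted = [[154, 38, 2], [39, 656, 22], [0, 47, 38]]
for label, cm in (("original", original), ("greedy shift", shifted)):
    acc, bal, kap = metrics_from_confusion(cm)
    print(f"{label}: accuracy {acc:.3f} balanced accuracy {bal:.3f} kappa {kap:.3f}")
```

Most of the gain in balanced accuracy comes from the smallest class: with the default decision rule only 1 of the 85 class-2 compounds is predicted correctly, while after the greedy shift 38 of them are.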
data = pd.read_csv('../data/target_CHEMBL217.csv.gz')
PandasTools.AddMoleculeColumnToFrame(data,smilesCol='canonical_smiles')
data['pKi'] = [-math.log10(x*1e-9) for x in data['standard_value']]
def binner(act,bins=(5,8)):
    # map a pKi value to a class: 0 if pKi <= 5, 1 if 5 < pKi <= 8, 2 otherwise
    for i,bin in enumerate(bins):
        if act<=bin:
            return i
    return len(bins)
data['activity'] = [binner(x) for x in data.pKi]
data.groupby('activity').describe()
standard_value | year | pKi | |||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | mean | std | min | 25% | 50% | 75% | max | count | mean | ... | 75% | max | count | mean | std | min | 25% | 50% | 75% | max | |
activity | |||||||||||||||||||||
0 | 356.0 | 143415.189354 | 781194.668326 | 10000.000 | 10000.0000 | 10000.00 | 24234.5 | 10000000.00 | 356.0 | 2011.679775 | ... | 2017.0 | 2019.0 | 356.0 | 4.672916 | 0.581865 | 2.000000 | 4.615626 | 5.000000 | 5.000000 | 5.000000 |
1 | 4014.0 | 830.546163 | 1471.610125 | 10.000 | 63.1875 | 238.51 | 931.0 | 9906.00 | 4014.0 | 2011.100648 | ... | 2015.0 | 2020.0 | 4014.0 | 6.620074 | 0.724919 | 5.004102 | 6.031050 | 6.622494 | 7.199370 | 8.000000 |
2 | 607.0 | 3.715942 | 2.786155 | 0.027 | 1.2150 | 3.00 | 5.9 | 9.86 | 607.0 | 2011.957166 | ... | 2016.0 | 2019.0 | 607.0 | 8.614671 | 0.475862 | 8.006123 | 8.229148 | 8.522879 | 8.915457 | 10.568636 |
3 rows × 24 columns
from rdkit.Chem import SaltRemover
sr = SaltRemover.SaltRemover()
stripped = [sr.StripMol(m) for m in data.ROMol]
fpgen = rdFingerprintGenerator.GetMorganGenerator(radius=2)
fps = [fpgen.GetFingerprint(m) for m in stripped]
accum_chembl217 = []
for i in range(20):
    run_ternary_experiment(fps,data.activity,accum_chembl217,random_state=0xf00d+i)
original accuracy: 0.832 balanced accuracy: 0.436 kappa: 0.239 [[ 9 62 0] [ 0 797 6] [ 0 99 23]] rebalanced thresholds: [0.1, 0.15000000000000002] accuracy: 0.810 balanced accuracy: 0.657 kappa: 0.453 [[ 33 38 0] [ 33 696 74] [ 0 44 78]] global kappa rebalanced thresholds: (0.15000000000000002, 0.15000000000000002) accuracy: 0.819 balanced accuracy: 0.601 kappa: 0.434 [[ 19 52 0] [ 10 719 74] [ 0 44 78]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.706 balanced accuracy: 0.665 kappa: 0.353 [[ 33 29 9] [ 31 570 202] [ 0 22 100]] original accuracy: 0.832 balanced accuracy: 0.426 kappa: 0.234 [[ 5 66 0] [ 3 798 2] [ 0 96 26]] rebalanced thresholds: [0.1, 0.2] accuracy: 0.845 balanced accuracy: 0.693 kappa: 0.524 [[ 43 28 0] [ 36 730 37] [ 0 53 69]] global kappa rebalanced thresholds: (0.1, 0.2) accuracy: 0.845 balanced accuracy: 0.693 kappa: 0.524 [[ 43 28 0] [ 36 730 37] [ 0 53 69]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.752 balanced accuracy: 0.725 kappa: 0.429 [[ 42 24 5] [ 35 606 162] [ 0 21 101]] original accuracy: 0.838 balanced accuracy: 0.452 kappa: 0.276 [[ 10 61 0] [ 2 798 3] [ 0 95 27]] rebalanced thresholds: [0.1, 0.2] accuracy: 0.832 balanced accuracy: 0.701 kappa: 0.507 [[ 45 26 0] [ 41 713 49] [ 1 50 71]] global kappa rebalanced thresholds: (0.15000000000000002, 0.2) accuracy: 0.850 balanced accuracy: 0.645 kappa: 0.508 [[ 30 41 0] [ 8 746 49] [ 0 51 71]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.747 balanced accuracy: 0.747 kappa: 0.440 [[ 45 19 7] [ 41 593 169] [ 1 15 106]] original accuracy: 0.841 balanced accuracy: 0.447 kappa: 0.282 [[ 7 64 0] [ 1 801 1] [ 0 92 30]] rebalanced thresholds: [0.1, 0.2] accuracy: 0.845 balanced accuracy: 0.700 kappa: 0.538 [[ 38 33 0] [ 36 723 44] [ 1 40 81]] global kappa rebalanced thresholds: (0.15000000000000002, 0.2) accuracy: 0.849 balanced accuracy: 0.633 kappa: 0.510 [[ 22 49 0] [ 16 743 44] [ 0 41 81]] global 
balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.714 balanced accuracy: 0.701 kappa: 0.379 [[ 38 28 5] [ 34 568 201] [ 1 16 105]] original accuracy: 0.838 balanced accuracy: 0.448 kappa: 0.271 [[ 9 62 0] [ 0 799 4] [ 0 95 27]] rebalanced thresholds: [0.1, 0.2] accuracy: 0.811 balanced accuracy: 0.650 kappa: 0.446 [[ 35 36 0] [ 54 702 47] [ 0 51 71]] global kappa rebalanced thresholds: (0.1, 0.2) accuracy: 0.811 balanced accuracy: 0.650 kappa: 0.446 [[ 35 36 0] [ 54 702 47] [ 0 51 71]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.705 balanced accuracy: 0.689 kappa: 0.365 [[ 35 32 4] [ 48 560 195] [ 0 15 107]] original accuracy: 0.840 balanced accuracy: 0.469 kappa: 0.290 [[ 16 55 0] [ 0 798 5] [ 0 99 23]] rebalanced thresholds: [0.1, 0.2] accuracy: 0.838 balanced accuracy: 0.703 kappa: 0.514 [[ 47 24 0] [ 42 721 40] [ 0 55 67]] global kappa rebalanced thresholds: (0.1, 0.15000000000000002) accuracy: 0.837 balanced accuracy: 0.747 kappa: 0.548 [[ 47 24 0] [ 42 701 60] [ 0 36 86]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.746 balanced accuracy: 0.760 kappa: 0.442 [[ 47 20 4] [ 39 588 176] [ 0 14 108]] original accuracy: 0.842 balanced accuracy: 0.450 kappa: 0.293 [[ 6 65 0] [ 1 800 2] [ 0 89 33]] rebalanced thresholds: [0.1, 0.2] accuracy: 0.853 balanced accuracy: 0.699 kappa: 0.551 [[ 38 33 0] [ 25 733 45] [ 1 42 79]] global kappa rebalanced thresholds: (0.1, 0.2) accuracy: 0.853 balanced accuracy: 0.699 kappa: 0.551 [[ 38 33 0] [ 25 733 45] [ 1 42 79]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.742 balanced accuracy: 0.713 kappa: 0.413 [[ 38 30 3] [ 23 596 184] [ 0 17 105]] original accuracy: 0.827 balanced accuracy: 0.426 kappa: 0.223 [[ 6 65 0] [ 2 793 8] [ 0 97 25]] rebalanced thresholds: [0.1, 0.2] accuracy: 0.833 balanced accuracy: 0.683 kappa: 0.492 [[ 45 26 0] [ 39 722 42] [ 0 59 63]] global kappa rebalanced thresholds: (0.1, 0.2) accuracy: 0.833 balanced 
accuracy: 0.683 kappa: 0.492 [[ 45 26 0] [ 39 722 42] [ 0 59 63]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.753 balanced accuracy: 0.745 kappa: 0.440 [[ 45 23 3] [ 37 601 165] [ 0 18 104]] original accuracy: 0.845 balanced accuracy: 0.457 kappa: 0.313 [[ 5 66 0] [ 0 800 3] [ 0 85 37]] rebalanced thresholds: [0.1, 0.2] accuracy: 0.840 balanced accuracy: 0.691 kappa: 0.521 [[ 38 33 0] [ 32 721 50] [ 0 44 78]] global kappa rebalanced thresholds: (0.1, 0.2) accuracy: 0.840 balanced accuracy: 0.691 kappa: 0.521 [[ 38 33 0] [ 32 721 50] [ 0 44 78]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.724 balanced accuracy: 0.703 kappa: 0.391 [[ 37 29 5] [ 28 578 197] [ 0 16 106]] original accuracy: 0.846 balanced accuracy: 0.476 kappa: 0.329 [[ 11 60 0] [ 3 798 2] [ 0 88 34]] rebalanced thresholds: [0.1, 0.2] accuracy: 0.835 balanced accuracy: 0.715 kappa: 0.518 [[ 48 23 0] [ 39 713 51] [ 0 51 71]] global kappa rebalanced thresholds: (0.1, 0.2) accuracy: 0.835 balanced accuracy: 0.715 kappa: 0.518 [[ 48 23 0] [ 39 713 51] [ 0 51 71]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.740 balanced accuracy: 0.755 kappa: 0.428 [[ 48 21 2] [ 36 584 183] [ 0 17 105]] original accuracy: 0.832 balanced accuracy: 0.423 kappa: 0.220 [[ 7 64 0] [ 2 801 0] [ 0 101 21]] rebalanced thresholds: [0.1, 0.2] accuracy: 0.839 balanced accuracy: 0.663 kappa: 0.495 [[ 38 33 0] [ 40 732 31] [ 0 56 66]] global kappa rebalanced thresholds: (0.1, 0.2) accuracy: 0.839 balanced accuracy: 0.663 kappa: 0.495 [[ 38 33 0] [ 40 732 31] [ 0 56 66]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.750 balanced accuracy: 0.697 kappa: 0.410 [[ 38 30 3] [ 38 612 153] [ 0 25 97]] original accuracy: 0.842 balanced accuracy: 0.456 kappa: 0.294 [[ 9 62 0] [ 1 800 2] [ 0 92 30]] rebalanced thresholds: [0.1, 0.25] accuracy: 0.829 balanced accuracy: 0.648 kappa: 0.461 [[ 40 31 0] [ 39 728 36] [ 0 64 58]] global kappa 
rebalanced thresholds: (0.1, 0.15000000000000002) accuracy: 0.814 balanced accuracy: 0.693 kappa: 0.480 [[ 40 31 0] [ 39 691 73] [ 0 42 80]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.725 balanced accuracy: 0.712 kappa: 0.395 [[ 40 26 5] [ 39 578 186] [ 0 18 104]] original accuracy: 0.846 balanced accuracy: 0.486 kappa: 0.337 [[ 14 57 0] [ 1 796 6] [ 0 89 33]] rebalanced thresholds: [0.1, 0.2] accuracy: 0.845 balanced accuracy: 0.712 kappa: 0.542 [[ 42 29 0] [ 43 721 39] [ 0 43 79]] global kappa rebalanced thresholds: (0.1, 0.2) accuracy: 0.845 balanced accuracy: 0.712 kappa: 0.542 [[ 42 29 0] [ 43 721 39] [ 0 43 79]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.732 balanced accuracy: 0.712 kappa: 0.400 [[ 41 26 4] [ 41 587 175] [ 0 21 101]] original accuracy: 0.836 balanced accuracy: 0.439 kappa: 0.262 [[ 6 65 0] [ 2 798 3] [ 0 93 29]] rebalanced thresholds: [0.1, 0.2] accuracy: 0.835 balanced accuracy: 0.669 kappa: 0.497 [[ 36 35 0] [ 43 723 37] [ 0 49 73]] global kappa rebalanced thresholds: (0.1, 0.2) accuracy: 0.835 balanced accuracy: 0.669 kappa: 0.497 [[ 36 35 0] [ 43 723 37] [ 0 49 73]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.747 balanced accuracy: 0.699 kappa: 0.415 [[ 36 28 7] [ 40 606 157] [ 0 20 102]] original accuracy: 0.839 balanced accuracy: 0.457 kappa: 0.291 [[ 9 62 0] [ 2 796 5] [ 0 91 31]] rebalanced thresholds: [0.1, 0.15000000000000002] accuracy: 0.829 balanced accuracy: 0.725 kappa: 0.527 [[ 41 30 0] [ 36 696 71] [ 1 32 89]] global kappa rebalanced thresholds: (0.1, 0.15000000000000002) accuracy: 0.829 balanced accuracy: 0.725 kappa: 0.527 [[ 41 30 0] [ 36 696 71] [ 1 32 89]] global balanced_accuracy rebalanced thresholds: (0.1, 0.15000000000000002) accuracy: 0.829 balanced accuracy: 0.725 kappa: 0.527 [[ 41 30 0] [ 36 696 71] [ 1 32 89]] original accuracy: 0.837 balanced accuracy: 0.452 kappa: 0.274 [[ 10 61 0] [ 2 797 4] [ 0 95 27]] rebalanced thresholds: 
[0.1, 0.2] accuracy: 0.846 balanced accuracy: 0.697 kappa: 0.534 [[ 41 30 0] [ 32 728 43] [ 2 46 74]] global kappa rebalanced thresholds: (0.1, 0.2) accuracy: 0.846 balanced accuracy: 0.697 kappa: 0.534 [[ 41 30 0] [ 32 728 43] [ 2 46 74]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.730 balanced accuracy: 0.709 kappa: 0.395 [[ 41 28 2] [ 31 586 186] [ 2 20 100]] original accuracy: 0.840 balanced accuracy: 0.465 kappa: 0.290 [[ 14 57 0] [ 2 798 3] [ 0 97 25]] rebalanced thresholds: [0.1, 0.2] accuracy: 0.836 balanced accuracy: 0.669 kappa: 0.497 [[ 37 34 0] [ 41 725 37] [ 0 51 71]] global kappa rebalanced thresholds: (0.15000000000000002, 0.2) accuracy: 0.855 balanced accuracy: 0.638 kappa: 0.515 [[ 28 43 0] [ 13 753 37] [ 0 51 71]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.725 balanced accuracy: 0.694 kappa: 0.385 [[ 37 29 5] [ 39 583 181] [ 0 20 102]] original accuracy: 0.827 balanced accuracy: 0.417 kappa: 0.213 [[ 4 67 0] [ 3 795 5] [ 0 97 25]] rebalanced thresholds: [0.15000000000000002, 0.2] accuracy: 0.858 balanced accuracy: 0.649 kappa: 0.530 [[ 28 43 0] [ 11 752 40] [ 0 47 75]] global kappa rebalanced thresholds: (0.15000000000000002, 0.2) accuracy: 0.858 balanced accuracy: 0.649 kappa: 0.530 [[ 28 43 0] [ 11 752 40] [ 0 47 75]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.734 balanced accuracy: 0.726 kappa: 0.411 [[ 43 23 5] [ 36 585 182] [ 1 18 103]] original accuracy: 0.839 balanced accuracy: 0.448 kappa: 0.277 [[ 8 63 0] [ 0 799 4] [ 0 93 29]] rebalanced thresholds: [0.1, 0.15000000000000002] accuracy: 0.819 balanced accuracy: 0.699 kappa: 0.490 [[ 42 29 0] [ 42 696 65] [ 1 43 78]] global kappa rebalanced thresholds: (0.1, 0.15000000000000002) accuracy: 0.819 balanced accuracy: 0.699 kappa: 0.490 [[ 42 29 0] [ 42 696 65] [ 1 43 78]] global balanced_accuracy rebalanced thresholds: (0.1, 0.15000000000000002) accuracy: 0.819 balanced accuracy: 0.699 kappa: 0.490 [[ 42 29 0] [ 
42 696 65] [ 1 43 78]] original accuracy: 0.830 balanced accuracy: 0.436 kappa: 0.226 [[ 12 59 0] [ 2 797 4] [ 0 104 18]] rebalanced thresholds: [0.1, 0.2] accuracy: 0.837 balanced accuracy: 0.692 kappa: 0.511 [[ 43 28 0] [ 38 721 44] [ 1 51 70]] global kappa rebalanced thresholds: (0.1, 0.15000000000000002) accuracy: 0.826 balanced accuracy: 0.718 kappa: 0.516 [[ 43 27 1] [ 38 697 68] [ 1 38 83]] global balanced_accuracy rebalanced thresholds: (0.1, 0.1) accuracy: 0.729 balanced accuracy: 0.724 kappa: 0.403 [[ 43 24 4] [ 37 580 186] [ 0 19 103]]
accum = accum_chembl217
figsize(9,6)
scatter([x['orig-kappa'] for x in accum],[x['shift-kappa'] for x in accum],label='kappa');
scatter([x['orig-balanced'] for x in accum],[x['shift-balanced'] for x in accum],label='balanced accuracy');
scatter([x['orig-accuracy'] for x in accum],[x['shift-accuracy'] for x in accum],label='accuracy');
plot([.2,1],[.2,1]);
legend();
xlabel('orig')
ylabel('greedy shift');
title('CHEMBL217');
scatter([x['shift-kappa'] for x in accum],[x['global-k-shift-kappa'] for x in accum],label='kappa');
scatter([x['shift-balanced'] for x in accum],[x['global-k-shift-balanced'] for x in accum],label='balanced accuracy');
scatter([x['shift-accuracy'] for x in accum],[x['global-k-shift-accuracy'] for x in accum],label='accuracy');
plot([.4,1],[.4,1]);
legend();
xlabel('greedy shift')
ylabel('grid-kappa');
title('CHEMBL217');
scatter([x['shift-kappa'] for x in accum],[x['global-ba-shift-kappa'] for x in accum],label='kappa');
scatter([x['shift-balanced'] for x in accum],[x['global-ba-shift-balanced'] for x in accum],label='balanced accuracy');
scatter([x['shift-accuracy'] for x in accum],[x['global-ba-shift-accuracy'] for x in accum],label='accuracy');
plot([.4,1],[.4,1]);
legend();
xlabel('greedy shift')
ylabel('grid-balanced');
title('CHEMBL217');
Again, the same conclusions hold here.
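If you'd rather have numbers than scatter plots, one option is to average the per-split change in each metric. The sketch below assumes that each entry of an `accum` list is a dict with numeric values under the keys used in the plotting code above (`'orig-kappa'`, `'shift-kappa'`, and so on). The helper `mean_change` is hypothetical, and the two demo records are just the values printed for the first two CHEMBL217 splits, not the full set of 20.

```python
from statistics import mean

def mean_change(accum, before, after):
    """Mean per-split difference between two metric keys of the accum records."""
    return mean(rec[after] - rec[before] for rec in accum)

# values copied from the first two CHEMBL217 splits printed above
demo_accum = [
    {'orig-accuracy': 0.832, 'shift-accuracy': 0.810,
     'orig-balanced': 0.436, 'shift-balanced': 0.657,
     'orig-kappa': 0.239, 'shift-kappa': 0.453},
    {'orig-accuracy': 0.832, 'shift-accuracy': 0.845,
     'orig-balanced': 0.426, 'shift-balanced': 0.693,
     'orig-kappa': 0.234, 'shift-kappa': 0.524},
]

for metric in ('accuracy', 'balanced', 'kappa'):
    delta = mean_change(demo_accum, f'orig-{metric}', f'shift-{metric}')
    print(f"{metric}: mean change after greedy shift {delta:+.3f}")
```

Passing `accum_chembl217` (or any of the other `accum_*` lists) instead of `demo_accum` should give the corresponding averages over all of the splits, provided the records have the structure assumed here.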