clustering_metrics.monte_carlo.predictions module

class clustering_metrics.monte_carlo.predictions.Grid(seed=None)

Bases: object

best_clustering_by_score(score, flip_sign=False)
compare(others, scores, dtype=<type 'numpy.float16'>, plot=False)
compute(scores, show_progress=False, dtype=<type 'numpy.float16'>)
corrplot(compute_result, save_to, symmetric=False, **kwargs)
describe_matrices()
fill_clusters(n=None, size=None, max_classes=None)
fill_matrices(max_counts=None, n=None)
fill_sim_clusters(n=None, size=None, **kwargs)
find_highest(score, flip_sign=False)
find_matching_matrix(matches)
iter_clusters()
iter_grid()
iter_matrices()
static matrix_from_labels(*args)
static matrix_from_matrices(*args)
static plot(pairs, xlim=None, ylim=None, title=None, dither=0.0002, marker='.', s=0.01, color='black', alpha=1.0, save_to=None, label=None, xlabel=None, ylabel=None, **kwargs)
show_cluster(idx, inverse=False)
show_matrix(idx, inverse=False)
classmethod with_clusters(n=1000, size=200, max_classes=5, seed=None)
classmethod with_matrices(n=1000, max_counts=100, seed=None)
classmethod with_sim_clusters(n=1000, size=200, seed=None, **kwargs)
clustering_metrics.monte_carlo.predictions.auc_xscaled(xs, ys)

AUC score scaled to fill the x interval.
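
The description above can be read in more than one way. One plausible reading, sketched below, rescales the x values to span [0, 1] and then applies the trapezoidal rule. This is an assumption rather than the library's confirmed behavior, and the name auc_xscaled_sketch is hypothetical. The sketch is written for Python 3.

```python
def auc_xscaled_sketch(xs, ys):
    """Trapezoidal AUC after rescaling xs to [0, 1].

    Illustrative only: the rescaling step is an assumed interpretation
    of "scaled to fill x interval", not the library's implementation.
    """
    pairs = sorted(zip(xs, ys))
    if not pairs:
        raise ValueError("xs and ys must not be empty")
    xmin, xmax = pairs[0][0], pairs[-1][0]
    span = float(xmax - xmin)
    if span == 0.0:
        raise ValueError("xs must span a non-zero interval")
    area = 0.0
    # Sum trapezoids between consecutive points, with widths
    # expressed as a fraction of the full x range.
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        area += (x1 - x0) / span * (y0 + y1) / 2.0
    return area
```

Under this reading, a curve with y fixed at 1 scores 1.0 regardless of how wide the x range is, which makes scores comparable across curves with different x ranges.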

clustering_metrics.monte_carlo.predictions.create_plots(args, df)
clustering_metrics.monte_carlo.predictions.do_mapper(args)
clustering_metrics.monte_carlo.predictions.do_reducer(args)
clustering_metrics.monte_carlo.predictions.get_conf(obj)
clustering_metrics.monte_carlo.predictions.join_clusters(clusters)

Halve the number of clusters by joining them.
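
As a minimal sketch of the idea, assuming clusters is a list of lists of class labels and that neighboring clusters are merged in pairs (the pairing rule is an assumption, and join_clusters_sketch is a hypothetical name, not the library function):

```python
def join_clusters_sketch(clusters):
    """Merge clusters pairwise: (0, 1), (2, 3), and so on.

    Illustrative only. With an odd number of clusters the last
    one has no partner and is carried over unchanged.
    """
    joined = []
    for i in range(0, len(clusters), 2):
        merged = list(clusters[i])
        if i + 1 < len(clusters):
            merged.extend(clusters[i + 1])
        joined.append(merged)
    return joined
```

For example, five input clusters would yield three output clusters under this pairing rule, so the reduction is exactly 2x only when the input count is even.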

clustering_metrics.monte_carlo.predictions.parse_args(args=None)
clustering_metrics.monte_carlo.predictions.relabel_negatives(clusters)

Place each negative label in its own class
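
A rough illustration follows, assuming labels are integers and that any label less than or equal to zero marks a negative. Both points are assumptions, since this page does not say how negatives are encoded, and relabel_negatives_sketch is a hypothetical name.

```python
def relabel_negatives_sketch(clusters):
    """Give every negative item a unique class label.

    Illustrative only. Assumes a label <= 0 marks a negative;
    each such item is assigned a fresh label -1, -2, -3, ...
    so that no two negatives share a class.
    """
    next_label = -1
    relabeled = []
    for cluster in clusters:
        new_cluster = []
        for label in cluster:
            if label <= 0:
                new_cluster.append(next_label)
                next_label -= 1
            else:
                new_cluster.append(label)
        relabeled.append(new_cluster)
    return relabeled
```

The cluster structure is left unchanged; only the class labels of the negatives differ afterwards.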

clustering_metrics.monte_carlo.predictions.run(args)
clustering_metrics.monte_carlo.predictions.sample_with_error(label, error_distribution, null_distribution)

Return a label, given the error-probability and null distributions.

error_distribution must be of the form {False: 1.0 - p_err, True: p_err}.
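
The following Python 3 sketch shows one way such a sampler could work. It assumes that when an error is drawn, the returned label comes from null_distribution (taken here to be a mapping of label to probability), and that otherwise the true label is returned unchanged. The helper discrete_sample_sketch, the rng parameter, and the function name are hypothetical and not part of the documented API.

```python
import random


def discrete_sample_sketch(distribution, rng=random):
    """Draw one key from a {outcome: probability} mapping."""
    outcomes = list(distribution)
    weights = [distribution[outcome] for outcome in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def sample_with_error_sketch(label, error_distribution,
                             null_distribution, rng=random):
    """Return `label`, or a draw from the null distribution on error.

    Illustrative only. error_distribution has the documented form
    {False: 1.0 - p_err, True: p_err}.
    """
    if discrete_sample_sketch(error_distribution, rng):
        # An error occurred: ignore the true label and sample from
        # the null distribution instead.
        return discrete_sample_sketch(null_distribution, rng)
    return label
```

Passing a seeded random.Random instance as rng makes the draws reproducible, for example sample_with_error_sketch(3, {False: 0.95, True: 0.05}, {0: 0.8, 1: 0.2}, rng=random.Random(42)).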

clustering_metrics.monte_carlo.predictions.simulate_clustering(galpha=2, gbeta=10, nclusters=20, pos_ratio=0.2, p_err=0.05, population_size=2000, split_join=0, join_negatives=False, with_warnings=True)
clustering_metrics.monte_carlo.predictions.simulate_labeling(sample_size=2000, **kwargs)
clustering_metrics.monte_carlo.predictions.split_clusters(clusters)

Double the number of clusters by splitting them.
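
A minimal sketch of the idea, assuming clusters is a list of lists and that each cluster is split by sending alternating members to the two halves (the splitting rule is an assumption, and split_clusters_sketch is a hypothetical name):

```python
def split_clusters_sketch(clusters):
    """Split every cluster into two by alternating its members.

    Illustrative only. Empty halves are dropped, so a singleton
    cluster stays as one cluster and the total can fall short
    of exactly 2x.
    """
    result = []
    for cluster in clusters:
        for part in (cluster[0::2], cluster[1::2]):
            if part:
                result.append(list(part))
    return result
```

Alternating members keeps the two halves within one element of each other in size, which is one simple way to avoid producing very unbalanced splits.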