clustering_metrics.entropy module
-
clustering_metrics.entropy.assignment_cost()
-
clustering_metrics.entropy.centropy()
Entropy of an iterable of counts (integers)
Assumes every entry in the list belongs to a different class. The resulting value is not normalized by N. Also note that the entropy is calculated using the natural logarithm, which may not be what you want; to express it in another base, divide the result by log(base).
The ‘counts’ parameter is expected to be a list or tuple-like iterable. For convenience, it can also be a dict/mapping type, in which case its values will be used to calculate entropy.
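As a rough illustration of the behavior described above, here is a minimal pure-Python sketch. It assumes that "not normalized by N" means the result is N times the usual Shannon entropy, that is, N log N minus the sum of c log c over the counts. The name `unnormalized_entropy` and its handling of zero counts are assumptions made for this example, not the module's implementation.

```python
from collections.abc import Mapping
from math import log


def unnormalized_entropy(counts):
    """Hypothetical sketch: N * H(counts), in nats."""
    # Assumption: for mappings, only the values (the counts) matter.
    if isinstance(counts, Mapping):
        counts = counts.values()
    total = 0
    sum_c_log_c = 0.0
    for c in counts:
        if c > 0:  # zero counts contribute nothing to the entropy
            total += c
            sum_c_log_c += c * log(c)
    if total == 0:
        return 0.0
    return total * log(total) - sum_c_log_c


counts = {"a": 2, "b": 2}
n = sum(counts.values())
nats = unnormalized_entropy(counts) / n  # normalize by N
bits = nats / log(2)                     # convert natural base to base 2
```

Dividing by N and then by log(2) recovers the familiar entropy in bits, which for two equally sized classes is approximately 1.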
-
clustering_metrics.entropy.cnum_pairs()
Binomial coefficient for k=2 (integer)
For non-vectorized computation, this is faster than calling scipy.misc.comb(x, 2) or scipy.special.binom(x, 2). Unlike with those two, the domain here extends into negative integers.
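A minimal sketch of how a k=2 binomial coefficient can be computed with plain integer arithmetic, assuming the closed form x(x - 1)/2 is what is meant. The name `num_pairs` is hypothetical, and the module's actual implementation may differ.

```python
def num_pairs(x):
    """Hypothetical sketch of 'x choose 2' for any integer x."""
    # x and x - 1 are consecutive integers, so their product is always
    # even and the floor division below is exact, also for negative x.
    return x * (x - 1) // 2


pairs_in_four = num_pairs(4)       # 6 unordered pairs among 4 items
pairs_negative = num_pairs(-2)     # closed form still defined: (-2)(-3)/2 = 3
```

The closed form is a polynomial in x, which is why it stays well defined for negative integers.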
-
clustering_metrics.entropy.csum_pairs()
Count sum of possible pairs (integer)
Applies n choose 2 to each input value and sums the results to get the total number of possible pairs.
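For illustration, summing n choose 2 over a collection of counts could look like the sketch below. The name `sum_pairs` is hypothetical, and the input is assumed to be an iterable of integers such as cluster sizes.

```python
def sum_pairs(counts):
    """Hypothetical sketch: total of 'n choose 2' over all counts."""
    # c * (c - 1) is always even, so the floor division is exact.
    return sum(c * (c - 1) // 2 for c in counts)


# Clusters of sizes 2, 3 and 4 contain 1 + 3 + 6 within-cluster pairs.
within_cluster_pairs = sum_pairs([2, 3, 4])
```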
-
clustering_metrics.entropy.emi_from_margins()
Calculate Expected Mutual Information given margins of RxC table
For the sake of numeric precision, the resulting value is not normalized by N.
License: BSD 3 clause
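The following sketch shows one common way to compute expected mutual information from the row and column margins of a contingency table, using the hypergeometric (fixed-margins) model. It is not the module's code: the name `unnormalized_emi` is hypothetical, the inputs are assumed to be non-negative integer margins with the same positive total, and "not normalized by N" is assumed to mean that each term is weighted by the cell count n_ij rather than by n_ij / N.

```python
from math import exp, lgamma, log


def unnormalized_emi(row_margins, col_margins):
    """Hypothetical sketch: N * EMI (in nats) under fixed margins."""
    n = sum(row_margins)
    if n != sum(col_margins) or n <= 0:
        raise ValueError("margins must share the same positive total")
    log_n = log(n)
    total = 0.0
    for a in row_margins:
        for b in col_margins:
            # Feasible cell counts; n_ij = 0 contributes nothing, so start at 1.
            for nij in range(max(1, a + b - n), min(a, b) + 1):
                # Log of the hypergeometric probability of observing nij.
                log_prob = (
                    lgamma(a + 1) + lgamma(b + 1)
                    + lgamma(n - a + 1) + lgamma(n - b + 1)
                    - lgamma(n + 1) - lgamma(nij + 1)
                    - lgamma(a - nij + 1) - lgamma(b - nij + 1)
                    - lgamma(n - a - b + nij + 1)
                )
                # n_ij * log(N * n_ij / (a * b)), weighted by its probability.
                total += nij * (log_n + log(nij) - log(a) - log(b)) * exp(log_prob)
    return total


value = unnormalized_emi([1, 1], [1, 1])
emi_in_nats = value / 2  # divide by N = 2 to recover the normalized value
```

Working in log space with `lgamma` avoids overflowing factorials for large margins.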
-
clustering_metrics.entropy.fentropy()
Entropy of an iterable of frequencies (floating point)
Assumes every entry in the list belongs to a different class. The resulting value is not normalized by N. Also note that the entropy is calculated using the natural logarithm, which may not be what you want; to express it in another base, divide the result by log(base).
The ‘freqs’ parameter is expected to be a list or tuple-like iterable. For convenience, it can also be a dict/mapping type, in which case its values will be used to calculate entropy.
-
clustering_metrics.entropy.fnum_pairs()
Binomial coefficient for k=2 (floating point)
For non-vectorized computation, this is faster than calling scipy.misc.comb(x, 2) or scipy.special.binom(x, 2). Unlike with those two, the domain here extends into negative integers.
-
clustering_metrics.entropy.fsum_pairs()
Count sum of possible pairs (floating point)
Applies n choose 2 to each input value and sums the results to get the total number of possible pairs.
-
clustering_metrics.entropy.lgamma()
Log of gamma function for scalar double x
This is a scalar-only replacement for scipy.special.gammaln. On scalar values, this method is ~10x faster than the corresponding SciPy one. On large arrays, however, even when vectorized using np.vectorize, this method is slower than the SciPy one, so use gammaln in those cases.
This function is borrowed verbatim from Scikit-Learn.
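To show what a scalar log-gamma is typically used for, the sketch below uses Python's standard-library `math.lgamma` as a stand-in, since the module's own function is not reproduced here. The helper names `log_factorial` and `log_binomial` are hypothetical.

```python
from math import lgamma, log


def log_factorial(n):
    """log(n!) via the identity n! = gamma(n + 1)."""
    return lgamma(n + 1)


def log_binomial(n, k):
    """log of 'n choose k' without forming large factorials."""
    return log_factorial(n) - log_factorial(k) - log_factorial(n - k)


log_120 = log_factorial(5)      # 5! = 120
log_10 = log_binomial(5, 2)     # 5 choose 2 = 10
```

Because everything stays in log space, these helpers remain usable for arguments whose factorials would overflow a double.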
-
clustering_metrics.entropy.ndarray_from_iter()
Create NumPy arrays from different object types
In addition to standard np.asarray casting functionality, this function handles conversion from the following types: collections.Mapping, collections.Iterator.
If the input object is an instance of collections.Mapping, assumes that we are interested in creating a NumPy array from the values.
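The dispatch described above might be sketched as follows. This is an illustration under assumptions, not the module's code: the name `array_from_any` and its `dtype` parameter are hypothetical, and the abstract base classes are imported from `collections.abc`, which is where current Python versions keep them.

```python
from collections.abc import Iterator, Mapping

import numpy as np


def array_from_any(obj, dtype=None):
    """Hypothetical sketch: build a NumPy array from several input types."""
    if isinstance(obj, Mapping):
        # For mappings, only the values are of interest.
        obj = list(obj.values())
    elif isinstance(obj, Iterator):
        # Iterators are consumed into a list before conversion.
        obj = list(obj)
    return np.asarray(obj, dtype=dtype)


from_mapping = array_from_any({"a": 1, "b": 2})
from_iterator = array_from_any(iter([3, 4]))
from_list = array_from_any([5.0, 6.0])
```

Lists, tuples and existing arrays fall through to `np.asarray` unchanged, so only the two extra types need special handling.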