clustering_metrics.utils module

clustering_metrics.utils.fill_with_last(lst, k)

Extend a list to length k by duplicating its last item.

>>> fill_with_last([1, 2, 3], 5)
[1, 2, 3, 3, 3]
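
The doctest above pins down the padding behavior but not the edge cases. A minimal sketch of a function with the same contract follows; it is a stand-in written from the description, not the library's source, and its handling of empty lists and of lists already longer than k is an assumption.

```python
def fill_with_last(lst, k):
    """Stand-in: pad lst to length k by repeating its last item."""
    # Number of copies of the last item still needed; never negative.
    missing = max(k - len(lst), 0)
    # lst[-1:] is empty for an empty input, so an empty list stays empty
    # (assumed behavior). Lists longer than k are returned unchanged
    # rather than truncated (also an assumption).
    return list(lst) + list(lst[-1:]) * missing


print(fill_with_last([1, 2, 3], 5))  # [1, 2, 3, 3, 3]
```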

clustering_metrics.utils.gapply(n, func, *args, **kwargs)

Apply a generating function n times to the argument list

Parameters:	n (integer) – number of times to apply a function
	func (instancemethod) – a function to apply
Return type:	collections.iterable
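
One plausible reading of this description is that func is called n times with the same arguments and each result is yielded lazily. The sketch below assumes that reading. The function is a stand-in, not the library's source, and the dice-rolling usage is purely illustrative.

```python
import random


def gapply(n, func, *args, **kwargs):
    """Stand-in: lazily yield func(*args, **kwargs) n times."""
    for _ in range(n):
        yield func(*args, **kwargs)


# Illustrative usage: three dice rolls from a seeded generator.
rng = random.Random(0)
rolls = gapply(3, rng.randint, 1, 6)  # nothing is called yet
print(list(rolls))                    # three integers between 1 and 6
```

Because the result is a generator, func is not invoked until the values are consumed.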

clustering_metrics.utils.get_df_subset(df, fields)

Give a subset of a pandas.DataFrame instance

clustering_metrics.utils.getpropval(obj)

Returns:	a generator of properties and their values
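
The description does not say which attributes count as properties. The sketch below assumes they are the public, non-callable attributes of obj, yielded as (name, value) pairs. The function is a stand-in, not the library's source, and the Point class is a hypothetical example.

```python
class Point:
    """Hypothetical example class."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def norm1(self):
        return abs(self.x) + abs(self.y)


def getpropval(obj):
    """Stand-in: yield (name, value) for public, non-callable attributes."""
    for name in dir(obj):
        if name.startswith("_"):
            continue  # skip private and dunder attributes (assumption)
        value = getattr(obj, name)
        if not callable(value):
            yield name, value  # methods such as norm1 are skipped


print(dict(getpropval(Point(1, 2))))  # {'x': 1, 'y': 2}
```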

clustering_metrics.utils.lapply(n, func, *args, **kwargs)

Same as gapply, except returns a list

Parameters:	n (integer) – number of times to apply a function
	func (instancemethod) – a function to apply
Return type:	list
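
An eager counterpart can be sketched in a few lines, under the assumption that func is simply called n times with the same arguments. This is a stand-in rather than the library's source.

```python
import itertools


def lapply(n, func, *args, **kwargs):
    """Stand-in: call func(*args, **kwargs) n times and collect the results."""
    return [func(*args, **kwargs) for _ in range(n)]


# Illustrative usage: pull the first three values from a counter.
counter = itertools.count()
print(lapply(3, next, counter))  # [0, 1, 2]
```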

clustering_metrics.utils.random_string(length, alphabet='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz')

Generate a random string

Parameters:	length (int) – length of the string
	alphabet (str) – alphabet to draw letters from
Returns:	random string of specified length
Return type:	str
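
A function with this signature could be sketched as follows, assuming each character is drawn independently and uniformly from the alphabet (the page does not state the sampling scheme). This is a stand-in, not the library's source.

```python
import random
import string

# Same characters as the documented default alphabet.
DEFAULT_ALPHABET = string.ascii_uppercase + string.ascii_lowercase


def random_string(length, alphabet=DEFAULT_ALPHABET):
    """Stand-in: build a string of `length` characters drawn from `alphabet`."""
    # Sampling is with replacement, so characters may repeat (assumption).
    return "".join(random.choice(alphabet) for _ in range(length))


print(random_string(8))         # eight random ASCII letters
print(random_string(4, "01"))   # four random binary digits
```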

clustering_metrics.utils.randset(value_range=(0, 10), sample_range=(5, 20))

Return a randomly sampled set of integers

Returns:	a list of integers
Return type:	tuple

clustering_metrics.utils.sigsim(x, y, dim)

Return the similarity of the two signatures

Parameters:	x (object) – signature 1
	y (object) – signature 2
	dim (int) – number of dimensions
Returns:	similarity between two signatures
Return type:	float
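
The page does not define the similarity measure. A common choice for fixed-length signatures (for example, MinHash) is the fraction of positions at which the two signatures agree. The sketch below assumes that definition and treats signatures as equal-length sequences. It is a stand-in, not the library's source.

```python
def sigsim(x, y, dim):
    """Stand-in: fraction of the dim positions at which x and y agree."""
    # Count positions holding equal values in both signatures.
    matches = sum(a == b for a, b in zip(x, y))
    return matches / float(dim)


# Three of the four positions agree.
print(sigsim((1, 2, 3, 4), (1, 2, 0, 4), 4))  # 0.75
```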

clustering_metrics.utils.sort_by_length(els, reverse=True)

Sort the elements of the list els by len(), in descending order by default. Returns a generator.

Parameters:	els (list) – input list
	reverse (bool) – whether to sort in descending order (the default)
Return type:	collections.iterable
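
The behavior described here can be approximated with the built-in sorted() and key=len. The sketch below is a stand-in, not the library's source. How ties between equal-length elements are ordered is an assumption: sorted() is stable, so they keep their input order.

```python
def sort_by_length(els, reverse=True):
    """Stand-in: yield the elements of els ordered by len()."""
    # The sort itself runs on first iteration; the results are then
    # handed out one by one.
    return (el for el in sorted(els, key=len, reverse=reverse))


words = ["ab", "abcd", "a", "abc"]
print(list(sort_by_length(words)))                 # ['abcd', 'abc', 'ab', 'a']
print(list(sort_by_length(words, reverse=False)))  # ['a', 'ab', 'abc', 'abcd']
```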

clustering_metrics.utils.wrap_scalar(a)

If scalar, convert to tuple
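
What counts as a scalar is not spelled out. The sketch below assumes that strings, bytes, and any non-iterable value are scalars, and that everything else is returned unchanged. It is a stand-in, not the library's source.

```python
from collections.abc import Iterable


def wrap_scalar(a):
    """Stand-in: wrap scalars in a one-element tuple, pass iterables through."""
    # Strings are iterable but are treated as scalars here (assumption).
    if isinstance(a, (str, bytes)) or not isinstance(a, Iterable):
        return (a,)
    return a


print(wrap_scalar(5))       # (5,)
print(wrap_scalar("ab"))    # ('ab',)
print(wrap_scalar([1, 2]))  # [1, 2]
```

A helper like this lets calling code iterate over an argument without first checking whether a single value or a collection was passed.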