clustering_metrics.hungarian module¶
- 
clustering_metrics.hungarian.linear_sum_assignment(cost_matrix)[source]¶
- Solve the linear sum assignment problem. - The linear sum assignment problem is also known as minimum weight matching in bipartite graphs. A problem instance is described by a matrix C, where each C[i,j] is the cost of matching vertex i of the first partite set (a “worker”) and vertex j of the second set (a “job”). The goal is to find a complete assignment of workers to jobs of minimal cost. - Formally, let X be a boolean matrix where \(X[i,j] = 1\) iff row i is assigned to column j. Then the optimal assignment has cost \[\min \sum_i \sum_j C_{i,j} X_{i,j}\]- s.t. each row is assignment to at most one column, and each column to at most one row. - This function can also solve a generalization of the classic assignment problem where the cost matrix is rectangular. If it has more rows than columns, then not every row needs to be assigned to a column, and vice versa. - The method used is the Hungarian algorithm, also known as the Munkres or Kuhn-Munkres algorithm. - Parameters: - cost_matrix : array - The cost matrix of the bipartite graph. - Returns: - row_ind, col_ind : array - An array of row indices and one of corresponding column indices giving the optimal assignment. The cost of the assignment can be computed as - cost_matrix[row_ind, col_ind].sum(). The row indices will be sorted; in the case of a square cost matrix they will be equal to- numpy.arange(cost_matrix.shape[0]).- Notes - New in version 0.17.0. - References - [991] - R.A. Pilgrim’s page on Munkres’ Assignemnt Algorithm - [992] - Harold W. Kuhn. The Hungarian Method for the assignment problem. Naval Research Logistics Quarterly, 2:83-97, 1955. - [993] - Harold W. Kuhn. Variants of the Hungarian method for assignment problems. Naval Research Logistics Quarterly, 3: 253-258, 1956. - [994] - Munkres, J. Algorithms for the Assignment and Transportation Problems. J. SIAM, 5(1):32-38, March, 1957. - [995] - Wikipedia entry for the Hungarian algorithm - Examples - >>> cost = np.array([[4, 1, 3], [2, 0, 5], [3, 2, 2]]) >>> row_ind, col_ind = linear_sum_assignment(cost) >>> col_ind array([1, 0, 2]) >>> cost[row_ind, col_ind].sum() 5