mlpack
master
|
An algorithm for an exact Lloyd iteration which simply uses dual-tree nearest-neighbor search to find the nearest centroid for each point in the dataset. More...
Public Types | |
template<typename TreeMetricType , typename IgnoredStatType , typename TreeMatType > | |
using | NNSTreeType = TreeType< TreeMetricType, DualTreeKMeansStatistic, TreeMatType > |
typedef TreeType< MetricType, DualTreeKMeansStatistic, MatType > | Tree |
Convenience typedef. More... | |
Public Member Functions | |
DualTreeKMeans (const MatType &dataset, MetricType &metric) | |
Construct the DualTreeKMeans object, which will construct a tree on the points. More... | |
~DualTreeKMeans () | |
Delete the tree constructed by the DualTreeKMeans object. More... | |
size_t | DistanceCalculations () const |
Return the number of distance calculations. More... | |
size_t & | DistanceCalculations () |
Modify the number of distance calculations. More... | |
double | Iterate (const arma::mat ¢roids, arma::mat &newCentroids, arma::Col< size_t > &counts) |
Run a single iteration of the dual-tree nearest neighbor algorithm for k-means, updating the given centroids into the newCentroids matrix. More... | |
Private Member Functions | |
void | CoalesceTree (Tree &node, const size_t child=0) |
void | DecoalesceTree (Tree &node) |
void | ExtractCentroids (Tree &node, arma::mat &newCentroids, arma::Col< size_t > &newCounts, const arma::mat ¢roids) |
Extract the centroids of the clusters. More... | |
void | UpdateTree (Tree &node, const arma::mat ¢roids, const double parentUpperBound=0.0, const double adjustedParentUpperBound=DBL_MAX, const double parentLowerBound=DBL_MAX, const double adjustedParentLowerBound=0.0) |
Update the bounds in the tree before the next iteration. More... | |
Private Attributes | |
arma::Row< size_t > | assignments |
arma::vec | clusterDistances |
const MatType & | dataset |
The dataset we are using. More... | |
const MatType & | datasetOrig |
The original dataset reference. More... | |
size_t | distanceCalculations |
Track distance calculations. More... | |
arma::mat | interclusterDistances |
size_t | iteration |
Track iteration number. More... | |
arma::mat | lastIterationCentroids |
arma::vec | lowerBounds |
Lower bounds on second closest cluster distance for each point. More... | |
MetricType | metric |
The metric. More... | |
std::vector< bool > | prunedPoints |
Indicator of whether or not the point is pruned. More... | |
Tree * | tree |
The tree built on the points. More... | |
arma::vec | upperBounds |
Upper bounds on nearest centroid. More... | |
std::vector< bool > | visited |
An algorithm for an exact Lloyd iteration which simply uses dual-tree nearest-neighbor search to find the nearest centroid for each point in the dataset.
The conditions under which this will perform best are probably limited to the case where k is close to the number of points in the dataset, and the number of iterations of the k-means algorithm will be few.
Definition at line 41 of file dual_tree_kmeans.hpp.
using mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::NNSTreeType = TreeType<TreeMetricType, DualTreeKMeansStatistic, TreeMatType> |
Definition at line 51 of file dual_tree_kmeans.hpp.
typedef TreeType<MetricType, DualTreeKMeansStatistic, MatType> mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::Tree |
Convenience typedef.
Definition at line 45 of file dual_tree_kmeans.hpp.
mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::DualTreeKMeans | ( | const MatType & | dataset, |
MetricType & | metric | ||
) |
Construct the DualTreeKMeans object, which will construct a tree on the points.
mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::~DualTreeKMeans | ( | ) |
Delete the tree constructed by the DualTreeKMeans object.
|
private |
|
private |
|
inline |
Return the number of distance calculations.
Definition at line 77 of file dual_tree_kmeans.hpp.
References mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::distanceCalculations.
|
inline |
Modify the number of distance calculations.
Definition at line 79 of file dual_tree_kmeans.hpp.
References mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::distanceCalculations.
|
private |
Extract the centroids of the clusters.
double mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::Iterate | ( | const arma::mat & | centroids, |
arma::mat & | newCentroids, | ||
arma::Col< size_t > & | counts | ||
) |
Run a single iteration of the dual-tree nearest neighbor algorithm for k-means, updating the given centroids into the newCentroids matrix.
centroids | Current cluster centroids. |
newCentroids | New cluster centroids. |
counts | Current counts, to be overwritten with new counts. |
|
private |
Update the bounds in the tree before the next iteration.
centroids is the current (not yet searched) centroids.
|
private |
Definition at line 103 of file dual_tree_kmeans.hpp.
|
private |
Definition at line 109 of file dual_tree_kmeans.hpp.
|
private |
The dataset we are using.
Definition at line 87 of file dual_tree_kmeans.hpp.
|
private |
The original dataset reference.
Definition at line 83 of file dual_tree_kmeans.hpp.
|
private |
Track distance calculations.
Definition at line 92 of file dual_tree_kmeans.hpp.
Referenced by mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::DistanceCalculations().
|
private |
Definition at line 111 of file dual_tree_kmeans.hpp.
|
private |
Track iteration number.
Definition at line 94 of file dual_tree_kmeans.hpp.
|
private |
Definition at line 107 of file dual_tree_kmeans.hpp.
|
private |
Lower bounds on second closest cluster distance for each point.
Definition at line 99 of file dual_tree_kmeans.hpp.
|
private |
The metric.
Definition at line 89 of file dual_tree_kmeans.hpp.
|
private |
Indicator of whether or not the point is pruned.
Definition at line 101 of file dual_tree_kmeans.hpp.
|
private |
The tree built on the points.
Definition at line 85 of file dual_tree_kmeans.hpp.
|
private |
Upper bounds on nearest centroid.
Definition at line 97 of file dual_tree_kmeans.hpp.
|
private |
Definition at line 105 of file dual_tree_kmeans.hpp.