mlpack  master
Public Types | Public Member Functions | Private Member Functions | Private Attributes | List of all members
mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType > Class Template Reference

An algorithm for an exact Lloyd iteration which simply uses dual-tree nearest-neighbor search to find the nearest centroid for each point in the dataset. More...

Public Types

template<typename TreeMetricType , typename IgnoredStatType , typename TreeMatType >
using NNSTreeType = TreeType< TreeMetricType, DualTreeKMeansStatistic, TreeMatType >
 
typedef TreeType< MetricType, DualTreeKMeansStatistic, MatType > Tree
 Convenience typedef. More...
 

Public Member Functions

 DualTreeKMeans (const MatType &dataset, MetricType &metric)
 Construct the DualTreeKMeans object, which will construct a tree on the points. More...
 
 ~DualTreeKMeans ()
 Delete the tree constructed by the DualTreeKMeans object. More...
 
size_t DistanceCalculations () const
 Return the number of distance calculations. More...
 
size_t & DistanceCalculations ()
 Modify the number of distance calculations. More...
 
double Iterate (const arma::mat &centroids, arma::mat &newCentroids, arma::Col< size_t > &counts)
 Run a single iteration of the dual-tree nearest neighbor algorithm for k-means, updating the given centroids into the newCentroids matrix. More...
 

Private Member Functions

void CoalesceTree (Tree &node, const size_t child=0)
 
void DecoalesceTree (Tree &node)
 
void ExtractCentroids (Tree &node, arma::mat &newCentroids, arma::Col< size_t > &newCounts, const arma::mat &centroids)
 Extract the centroids of the clusters. More...
 
void UpdateTree (Tree &node, const arma::mat &centroids, const double parentUpperBound=0.0, const double adjustedParentUpperBound=DBL_MAX, const double parentLowerBound=DBL_MAX, const double adjustedParentLowerBound=0.0)
 Update the bounds in the tree before the next iteration. More...
 

Private Attributes

arma::Row< size_t > assignments
 
arma::vec clusterDistances
 
const MatType & dataset
 The dataset we are using. More...
 
const MatType & datasetOrig
 The original dataset reference. More...
 
size_t distanceCalculations
 Track distance calculations. More...
 
arma::mat interclusterDistances
 
size_t iteration
 Track iteration number. More...
 
arma::mat lastIterationCentroids
 
arma::vec lowerBounds
 Lower bounds on second closest cluster distance for each point. More...
 
MetricType metric
 The metric. More...
 
std::vector< bool > prunedPoints
 Indicator of whether or not the point is pruned. More...
 
Treetree
 The tree built on the points. More...
 
arma::vec upperBounds
 Upper bounds on nearest centroid. More...
 
std::vector< bool > visited
 

Detailed Description

template<typename MetricType, typename MatType, template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
class mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >

An algorithm for an exact Lloyd iteration which simply uses dual-tree nearest-neighbor search to find the nearest centroid for each point in the dataset.

The conditions under which this will perform best are probably limited to the case where k is close to the number of points in the dataset, and the number of iterations of the k-means algorithm will be few.

Definition at line 41 of file dual_tree_kmeans.hpp.

Member Typedef Documentation

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
template<typename TreeMetricType , typename IgnoredStatType , typename TreeMatType >
using mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::NNSTreeType = TreeType<TreeMetricType, DualTreeKMeansStatistic, TreeMatType>

Definition at line 51 of file dual_tree_kmeans.hpp.

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
typedef TreeType<MetricType, DualTreeKMeansStatistic, MatType> mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::Tree

Convenience typedef.

Definition at line 45 of file dual_tree_kmeans.hpp.

Constructor & Destructor Documentation

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::DualTreeKMeans ( const MatType &  dataset,
MetricType &  metric 
)

Construct the DualTreeKMeans object, which will construct a tree on the points.

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::~DualTreeKMeans ( )

Delete the tree constructed by the DualTreeKMeans object.

Member Function Documentation

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
void mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::CoalesceTree ( Tree node,
const size_t  child = 0 
)
private
template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
void mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::DecoalesceTree ( Tree node)
private
template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
size_t mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::DistanceCalculations ( ) const
inline

Return the number of distance calculations.

Definition at line 77 of file dual_tree_kmeans.hpp.

References mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::distanceCalculations.

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
size_t& mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::DistanceCalculations ( )
inline

Modify the number of distance calculations.

Definition at line 79 of file dual_tree_kmeans.hpp.

References mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::distanceCalculations.

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
void mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::ExtractCentroids ( Tree node,
arma::mat &  newCentroids,
arma::Col< size_t > &  newCounts,
const arma::mat &  centroids 
)
private

Extract the centroids of the clusters.

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
double mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::Iterate ( const arma::mat &  centroids,
arma::mat &  newCentroids,
arma::Col< size_t > &  counts 
)

Run a single iteration of the dual-tree nearest neighbor algorithm for k-means, updating the given centroids into the newCentroids matrix.

Parameters
centroidsCurrent cluster centroids.
newCentroidsNew cluster centroids.
countsCurrent counts, to be overwritten with new counts.
template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
void mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::UpdateTree ( Tree node,
const arma::mat &  centroids,
const double  parentUpperBound = 0.0,
const double  adjustedParentUpperBound = DBL_MAX,
const double  parentLowerBound = DBL_MAX,
const double  adjustedParentLowerBound = 0.0 
)
private

Update the bounds in the tree before the next iteration.

centroids is the current (not yet searched) centroids.

Member Data Documentation

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
arma::Row<size_t> mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::assignments
private

Definition at line 103 of file dual_tree_kmeans.hpp.

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
arma::vec mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::clusterDistances
private

Definition at line 109 of file dual_tree_kmeans.hpp.

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
const MatType& mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::dataset
private

The dataset we are using.

Definition at line 87 of file dual_tree_kmeans.hpp.

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
const MatType& mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::datasetOrig
private

The original dataset reference.

Definition at line 83 of file dual_tree_kmeans.hpp.

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
size_t mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::distanceCalculations
private

Track distance calculations.

Definition at line 92 of file dual_tree_kmeans.hpp.

Referenced by mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::DistanceCalculations().

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
arma::mat mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::interclusterDistances
private

Definition at line 111 of file dual_tree_kmeans.hpp.

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
size_t mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::iteration
private

Track iteration number.

Definition at line 94 of file dual_tree_kmeans.hpp.

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
arma::mat mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::lastIterationCentroids
private

Definition at line 107 of file dual_tree_kmeans.hpp.

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
arma::vec mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::lowerBounds
private

Lower bounds on second closest cluster distance for each point.

Definition at line 99 of file dual_tree_kmeans.hpp.

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
MetricType mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::metric
private

The metric.

Definition at line 89 of file dual_tree_kmeans.hpp.

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
std::vector<bool> mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::prunedPoints
private

Indicator of whether or not the point is pruned.

Definition at line 101 of file dual_tree_kmeans.hpp.

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
Tree* mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::tree
private

The tree built on the points.

Definition at line 85 of file dual_tree_kmeans.hpp.

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
arma::vec mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::upperBounds
private

Upper bounds on nearest centroid.

Definition at line 97 of file dual_tree_kmeans.hpp.

template<typename MetricType , typename MatType , template< typename TreeMetricType, typename TreeStatType, typename TreeMatType > class TreeType = tree::KDTree>
std::vector<bool> mlpack::kmeans::DualTreeKMeans< MetricType, MatType, TreeType >::visited
private

Definition at line 105 of file dual_tree_kmeans.hpp.


The documentation for this class was generated from the following file: