mlpack  master
mlpack::dbscan::DBSCAN< RangeSearchType, PointSelectionPolicy > Class Template Reference

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a clustering technique described by Ester et al. (KDD '96).

Public Member Functions

 DBSCAN (const double epsilon, const size_t minPoints, RangeSearchType rangeSearch=RangeSearchType(), PointSelectionPolicy pointSelector=PointSelectionPolicy())
 Construct the DBSCAN object with the given parameters.
 
template<typename MatType >
size_t Cluster (const MatType &data, arma::mat &centroids)
 Performs DBSCAN clustering on the data, returning the number of clusters and the centroid of each cluster.
 
template<typename MatType >
size_t Cluster (const MatType &data, arma::Row< size_t > &assignments)
 Performs DBSCAN clustering on the data, returning the number of clusters and the list of cluster assignments.
 
template<typename MatType >
size_t Cluster (const MatType &data, arma::Row< size_t > &assignments, arma::mat &centroids)
 Performs DBSCAN clustering on the data, returning the number of clusters, the centroid of each cluster, and the list of cluster assignments.
 

Private Member Functions

template<typename MatType >
size_t ProcessPoint (const MatType &data, boost::dynamic_bitset<> &unvisited, const size_t index, arma::Row< size_t > &assignments, const size_t currentCluster, const std::vector< std::vector< size_t >> &neighbors, const std::vector< std::vector< double >> &distances, const bool topLevel=true)
 This function processes the point at index.
 

Private Attributes

double epsilon
 Maximum distance between two points for them to be part of the same cluster.
 
size_t minPoints
 Minimum number of points in the epsilon-neighborhood (including the point itself) for the point to be a core point.
 
PointSelectionPolicy pointSelector
 Instantiated point selection policy.
 
RangeSearchType rangeSearch
 Instantiated range search policy.
 

Detailed Description

template<typename RangeSearchType = range::RangeSearch<>, typename PointSelectionPolicy = RandomPointSelection>
class mlpack::dbscan::DBSCAN< RangeSearchType, PointSelectionPolicy >

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a clustering technique described in the following paper:

@inproceedings{ester1996density,
title={A density-based algorithm for discovering clusters in large spatial
databases with noise.},
author={Ester, M. and Kriegel, H.-P. and Sander, J. and Xu, X.},
booktitle={Proceedings of the Second International Conference on Knowledge
Discovery and Data Mining (KDD '96)},
pages={226--231},
year={1996}
}

The DBSCAN algorithm iteratively clusters points using range searches with a specified radius parameter. This implementation allows configuration of the range search technique used and the point selection strategy by means of template parameters.

Template Parameters
RangeSearchType       Class to use for range searching.
PointSelectionPolicy  Strategy for selecting the next point to cluster with.

Definition at line 46 of file dbscan.hpp.
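
The template-parameter design described above can be illustrated with a minimal, standard-library-only sketch. It is not mlpack's implementation: `SketchDBSCAN`, `BruteForceRangeSearch`, `OrderedPointSelection`, and `kNoise` are hypothetical names, the data is a `std::vector` of points rather than an Armadillo matrix, the selection policy is deterministic instead of random, and the noise sentinel is an assumption made for this sketch only.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::vector<double>;

// Sentinel for points in no cluster (an assumption of this sketch).
constexpr std::size_t kNoise = std::numeric_limits<std::size_t>::max();

// Hypothetical range search policy: brute-force scan over all points.
struct BruteForceRangeSearch
{
  // Indices of all points within epsilon of data[query], query included.
  std::vector<std::size_t> Search(const std::vector<Point>& data,
                                  const std::size_t query,
                                  const double epsilon) const
  {
    std::vector<std::size_t> result;
    for (std::size_t j = 0; j < data.size(); ++j)
    {
      double squared = 0.0;
      for (std::size_t d = 0; d < data[query].size(); ++d)
      {
        const double diff = data[query][d] - data[j][d];
        squared += diff * diff;
      }
      if (std::sqrt(squared) <= epsilon)
        result.push_back(j);
    }
    return result;
  }
};

// Hypothetical point selection policy: lowest-index unvisited point.
struct OrderedPointSelection
{
  std::size_t Select(const std::vector<bool>& unvisited) const
  {
    std::size_t i = 0;
    while (i < unvisited.size() && !unvisited[i])
      ++i;
    return i;
  }
};

template<typename RangeSearchPolicy = BruteForceRangeSearch,
         typename SelectionPolicy = OrderedPointSelection>
class SketchDBSCAN
{
 public:
  SketchDBSCAN(const double eps, const std::size_t minPts,
               RangeSearchPolicy search = RangeSearchPolicy(),
               SelectionPolicy selector = SelectionPolicy()) :
      epsilon(eps), minPoints(minPts), rangeSearch(search),
      pointSelector(selector) { }

  // Stores a cluster index or kNoise per point; returns the cluster count.
  std::size_t Cluster(const std::vector<Point>& data,
                      std::vector<std::size_t>& assignments) const
  {
    assignments.assign(data.size(), kNoise);
    std::vector<bool> unvisited(data.size(), true);
    std::size_t clusters = 0;
    for (std::size_t remaining = data.size(); remaining > 0; )
    {
      const std::size_t seed = pointSelector.Select(unvisited);
      assert(seed < data.size());
      unvisited[seed] = false;
      --remaining;
      std::vector<std::size_t> frontier =
          rangeSearch.Search(data, seed, epsilon);
      if (frontier.size() < minPoints)
        continue; // Not a core point; stays noise unless claimed later.
      assignments[seed] = clusters;
      while (!frontier.empty())
      {
        const std::size_t p = frontier.back();
        frontier.pop_back();
        if (assignments[p] == kNoise)
          assignments[p] = clusters; // Core or border point joins the cluster.
        if (!unvisited[p])
          continue;
        unvisited[p] = false;
        --remaining;
        const std::vector<std::size_t> found =
            rangeSearch.Search(data, p, epsilon);
        if (found.size() >= minPoints) // Only core points extend the frontier.
          frontier.insert(frontier.end(), found.begin(), found.end());
      }
      ++clusters;
    }
    return clusters;
  }

 private:
  double epsilon;
  std::size_t minPoints;
  RangeSearchPolicy rangeSearch;
  SelectionPolicy pointSelector;
};
```

In this sketch, swapping the range search or the selection order only requires passing a different type with a matching `Search` or `Select` member; the clustering loop itself does not change. For the six points (0,0), (1,0), (2,0), (10,0), (11,0), (50,0) with `SketchDBSCAN<> db(1.5, 2)`, `db.Cluster(data, assignments)` returns 2 and the last point keeps the `kNoise` label.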

Constructor & Destructor Documentation

template<typename RangeSearchType = range::RangeSearch<>, typename PointSelectionPolicy = RandomPointSelection>
mlpack::dbscan::DBSCAN< RangeSearchType, PointSelectionPolicy >::DBSCAN ( const double  epsilon,
const size_t  minPoints,
RangeSearchType  rangeSearch = RangeSearchType(),
PointSelectionPolicy  pointSelector = PointSelectionPolicy() 
)

Construct the DBSCAN object with the given parameters.

Parameters
epsilon        Size of range query.
minPoints      Minimum number of points for each cluster.
rangeSearch    Optional instantiated RangeSearch object.
pointSelector  Optional instantiated PointSelectionPolicy object.

Member Function Documentation

template<typename RangeSearchType = range::RangeSearch<>, typename PointSelectionPolicy = RandomPointSelection>
template<typename MatType >
size_t mlpack::dbscan::DBSCAN< RangeSearchType, PointSelectionPolicy >::Cluster ( const MatType &  data,
arma::mat &  centroids 
)

Performs DBSCAN clustering on the data, returning the number of clusters and the centroid of each cluster.

Template Parameters
MatType  Type of matrix (arma::mat or arma::sp_mat).

Parameters
data       Dataset to cluster.
centroids  Matrix in which centroids are stored.

template<typename RangeSearchType = range::RangeSearch<>, typename PointSelectionPolicy = RandomPointSelection>
template<typename MatType >
size_t mlpack::dbscan::DBSCAN< RangeSearchType, PointSelectionPolicy >::Cluster ( const MatType &  data,
arma::Row< size_t > &  assignments 
)

Performs DBSCAN clustering on the data, returning the number of clusters and the list of cluster assignments.

Template Parameters
MatType  Type of matrix (arma::mat or arma::sp_mat).

Parameters
data         Dataset to cluster.
assignments  Vector to store cluster assignments.

template<typename RangeSearchType = range::RangeSearch<>, typename PointSelectionPolicy = RandomPointSelection>
template<typename MatType >
size_t mlpack::dbscan::DBSCAN< RangeSearchType, PointSelectionPolicy >::Cluster ( const MatType &  data,
arma::Row< size_t > &  assignments,
arma::mat &  centroids 
)

Performs DBSCAN clustering on the data, returning the number of clusters, the centroid of each cluster, and the list of cluster assignments.

If assignments[i] == assignments.n_elem - 1, then the point is considered "noise".

Template Parameters
MatType  Type of matrix (arma::mat or arma::sp_mat).

Parameters
data         Dataset to cluster.
assignments  Vector to store cluster assignments.
centroids    Matrix in which centroids are stored.
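
How centroids relate to assignments can be sketched without mlpack or Armadillo: each centroid is the mean of the points assigned to that cluster, and noise points contribute to none. `ComputeCentroids` and its `noiseLabel` parameter are hypothetical; the label that marks noise is supplied by the caller here rather than assumed, and points are stored as `std::vector<double>` instead of matrix columns.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Point = std::vector<double>;

// Hypothetical helper: returns one centroid per cluster, computed as the mean
// of the points assigned to it. Points labeled noiseLabel (or with an
// out-of-range label) are skipped.
std::vector<Point> ComputeCentroids(const std::vector<Point>& data,
                                    const std::vector<std::size_t>& assignments,
                                    const std::size_t numClusters,
                                    const std::size_t noiseLabel)
{
  assert(data.size() == assignments.size());
  const std::size_t dim = data.empty() ? 0 : data[0].size();

  std::vector<Point> centroids(numClusters, Point(dim, 0.0));
  std::vector<std::size_t> counts(numClusters, 0);

  // Accumulate per-cluster coordinate sums and member counts.
  for (std::size_t i = 0; i < data.size(); ++i)
  {
    const std::size_t c = assignments[i];
    if (c == noiseLabel || c >= numClusters)
      continue;
    for (std::size_t d = 0; d < dim; ++d)
      centroids[c][d] += data[i][d];
    ++counts[c];
  }

  // Turn sums into means; an empty cluster keeps an all-zero centroid.
  for (std::size_t c = 0; c < numClusters; ++c)
    if (counts[c] > 0)
      for (std::size_t d = 0; d < dim; ++d)
        centroids[c][d] /= static_cast<double>(counts[c]);

  return centroids;
}
```

Skipping noise matters in this sketch because a single far-away outlier would otherwise pull a centroid away from the dense region it is meant to summarize.
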
template<typename RangeSearchType = range::RangeSearch<>, typename PointSelectionPolicy = RandomPointSelection>
template<typename MatType >
size_t mlpack::dbscan::DBSCAN< RangeSearchType, PointSelectionPolicy >::ProcessPoint ( const MatType &  data,
boost::dynamic_bitset<> &  unvisited,
const size_t  index,
arma::Row< size_t > &  assignments,
const size_t  currentCluster,
const std::vector< std::vector< size_t >> &  neighbors,
const std::vector< std::vector< double >> &  distances,
const bool  topLevel = true 
)
private

This function processes the point at index.

It marks the point as visited and checks whether it is a core or non-core point. If it is a core point, the function expands the cluster; otherwise it returns.

Template Parameters
MatType  Type of matrix (arma::mat or arma::sp_mat).

Parameters
data            Dataset to cluster.
unvisited       Remembers whether each point has been visited.
index           Index of the point to be visited now.
assignments     Vector to store cluster assignments.
currentCluster  Index of the cluster that will be assigned to points in the current cluster.
neighbors       For each point, the list of neighbors that fall in its epsilon-neighborhood.
distances       For each point, the list of distances to the neighbors that fall in its epsilon-neighborhood.
topLevel        If true, the current point is the first point in the current cluster; this helps in detecting noise.
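
The visit, check, and expand steps described for this function can be sketched as a recursive routine over precomputed neighbor lists. This is an illustration under stated assumptions, not the library's code: `ProcessPointSketch`, `ClusterSketch`, and the `unassigned` label are hypothetical, `std::vector<bool>` stands in for `boost::dynamic_bitset<>`, and the `data`, `distances`, and `topLevel` arguments are omitted because the sketch does not need them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using NeighborLists = std::vector<std::vector<std::size_t>>;

// Marks index as visited. If it is a core point, labels it and its neighbors
// with currentCluster and recurses into unvisited neighbors. Returns true if
// index was a core point.
bool ProcessPointSketch(std::vector<bool>& unvisited,
                        const std::size_t index,
                        std::vector<std::size_t>& assignments,
                        const std::size_t currentCluster,
                        const NeighborLists& neighbors,
                        const std::size_t minPoints,
                        const std::size_t unassigned)
{
  assert(index < neighbors.size());
  unvisited[index] = false;

  // Non-core point: nothing to expand from here.
  if (neighbors[index].size() < minPoints)
    return false;

  assignments[index] = currentCluster;
  for (const std::size_t n : neighbors[index])
  {
    // Claim only points that no earlier cluster has labeled.
    if (assignments[n] == unassigned)
      assignments[n] = currentCluster;
    if (unvisited[n])
      ProcessPointSketch(unvisited, n, assignments, currentCluster, neighbors,
                         minPoints, unassigned);
  }
  return true;
}

// Hypothetical driver: visits points in index order and starts a new cluster
// at every unvisited core point. Returns the number of clusters.
std::size_t ClusterSketch(const NeighborLists& neighbors,
                          const std::size_t minPoints,
                          std::vector<std::size_t>& assignments,
                          const std::size_t unassigned)
{
  assignments.assign(neighbors.size(), unassigned);
  std::vector<bool> unvisited(neighbors.size(), true);
  std::size_t cluster = 0;
  for (std::size_t i = 0; i < neighbors.size(); ++i)
    if (unvisited[i] && ProcessPointSketch(unvisited, i, assignments, cluster,
                                           neighbors, minPoints, unassigned))
      ++cluster;
  return cluster;
}
```

Points still labeled `unassigned` after `ClusterSketch` returns are the noise points of this sketch. Because the recursion depth can grow with cluster size, a production version would likely use an explicit stack or queue instead.
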

Member Data Documentation

template<typename RangeSearchType = range::RangeSearch<>, typename PointSelectionPolicy = RandomPointSelection>
double mlpack::dbscan::DBSCAN< RangeSearchType, PointSelectionPolicy >::epsilon
private

Maximum distance between two points for them to be part of the same cluster.

Definition at line 104 of file dbscan.hpp.

template<typename RangeSearchType = range::RangeSearch<>, typename PointSelectionPolicy = RandomPointSelection>
size_t mlpack::dbscan::DBSCAN< RangeSearchType, PointSelectionPolicy >::minPoints
private

Minimum number of points in the epsilon-neighborhood (including the point itself) for the point to be a core point.

Definition at line 108 of file dbscan.hpp.
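
Because the count includes the point itself, the core-point test is a common place for off-by-one mistakes. The following sketch shows the test on one-dimensional data; `EpsilonNeighborhoodSize` and `IsCorePoint` are hypothetical helpers, and treating the epsilon bound as inclusive is an assumption of the sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Counts the points within epsilon of data[index]. The point itself is at
// distance 0, so it is always counted.
std::size_t EpsilonNeighborhoodSize(const std::vector<double>& data,
                                    const std::size_t index,
                                    const double epsilon)
{
  assert(index < data.size());
  std::size_t count = 0;
  for (const double value : data)
    if (std::abs(value - data[index]) <= epsilon)
      ++count;
  return count;
}

// A point is a core point when its epsilon-neighborhood, itself included,
// holds at least minPoints points.
bool IsCorePoint(const std::vector<double>& data,
                 const std::size_t index,
                 const double epsilon,
                 const std::size_t minPoints)
{
  return EpsilonNeighborhoodSize(data, index, epsilon) >= minPoints;
}
```

Under this definition, a minimum of 1 makes every point a core point, since each point always lies in its own neighborhood.
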

template<typename RangeSearchType = range::RangeSearch<>, typename PointSelectionPolicy = RandomPointSelection>
PointSelectionPolicy mlpack::dbscan::DBSCAN< RangeSearchType, PointSelectionPolicy >::pointSelector
private

Instantiated point selection policy.

Definition at line 114 of file dbscan.hpp.

template<typename RangeSearchType = range::RangeSearch<>, typename PointSelectionPolicy = RandomPointSelection>
RangeSearchType mlpack::dbscan::DBSCAN< RangeSearchType, PointSelectionPolicy >::rangeSearch
private

Instantiated range search policy.

Definition at line 111 of file dbscan.hpp.


The documentation for this class was generated from the following file: