| mlpack master |
The HoeffdingNumericSplit class implements the numeric feature splitting strategy alluded to by Domingos and Hulten.
| Public Types | |
| typedef NumericSplitInfo< ObservationType > | SplitInfo |
| The splitting information type required by the HoeffdingNumericSplit. | |
| Public Member Functions | |
| HoeffdingNumericSplit (const size_t numClasses, const size_t bins=10, const size_t observationsBeforeBinning=100) | |
| Create the HoeffdingNumericSplit class, and specify some basic parameters about how the binning should take place. | |
| HoeffdingNumericSplit (const size_t numClasses, const HoeffdingNumericSplit &other) | |
| Create the HoeffdingNumericSplit class, using the parameters from the given other split object. | |
| size_t | Bins () const |
| Return the number of bins. | |
| void | EvaluateFitnessFunction (double &bestFitness, double &secondBestFitness) const |
| Evaluate the fitness function given what has been calculated so far. | |
| size_t | MajorityClass () const |
| Return the majority class. | |
| double | MajorityProbability () const |
| Return the probability of the majority class. | |
| size_t | NumChildren () const |
| Return the number of children if this node splits on this feature. | |
| template<typename Archive > | |
| void | Serialize (Archive &ar, const unsigned int) |
| Serialize the object. | |
| void | Split (arma::Col< size_t > &childMajorities, SplitInfo &splitInfo) const |
| Return the majority class of each child to be created, if a split on this dimension was performed. | |
| void | Train (ObservationType value, const size_t label) |
| Train the HoeffdingNumericSplit on the given observed value (remember that this object only cares about the information for a single feature, not an entire point). | |
| Private Attributes | |
| size_t | bins |
| The number of bins. | |
| arma::Col< size_t > | labels |
| This holds the labels of the points before binning. | |
| arma::Col< ObservationType > | observations |
| Before binning, this holds the points we have seen so far. | |
| size_t | observationsBeforeBinning |
| The number of observations we must see before binning. | |
| size_t | samplesSeen |
| The number of samples we have seen so far. | |
| arma::Col< ObservationType > | splitPoints |
| The split points for the binning (length bins - 1). | |
| arma::Mat< size_t > | sufficientStatistics |
| After binning, this contains the sufficient statistics. | |
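The HoeffdingNumericSplit class implements the numeric feature splitting strategy alluded to by Domingos and Hulten in the following paper: P. Domingos and G. Hulten, "Mining High-Speed Data Streams," Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '00), pp. 71-80, 2000.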
The strategy alluded to is very simple: we discretize the numeric features that we see. But in this case, we don't know in advance how many bins we have, which makes things a little difficult. This class only makes binary splits, and has a maximum number of bins. The binning strategy is simple: the split caches the minimum and maximum value of the points seen so far, and when the number of points hits a predefined threshold, the cached minimum-maximum range is split into equal-width bins, and splitting proceeds in the same way as with the categorical splits. This is a deliberately simple strategy, so don't expect it to be the best possible thing you can do.
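The equal-width binning described above can be sketched as follows. This is a minimal, self-contained illustration that uses std::vector rather than Armadillo; the helper names EqualWidthSplitPoints and BinIndex are hypothetical and are not part of mlpack, and the handling of values that land exactly on a split point (they go to the lower bin here) is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not mlpack API): compute the bins - 1 equal-width
// split points over the [min, max] range of the cached observations.
std::vector<double> EqualWidthSplitPoints(
    const std::vector<double>& observations,
    const std::size_t bins)
{
  assert(bins >= 1 && !observations.empty());

  const auto minMax =
      std::minmax_element(observations.begin(), observations.end());
  const double minValue = *minMax.first;
  const double maxValue = *minMax.second;
  const double binWidth = (maxValue - minValue) / static_cast<double>(bins);

  std::vector<double> splitPoints(bins - 1);
  for (std::size_t i = 0; i < splitPoints.size(); ++i)
    splitPoints[i] = minValue + static_cast<double>(i + 1) * binWidth;

  return splitPoints;
}

// Hypothetical helper (not mlpack API): index of the bin a value falls into.
// Assumption: a value equal to a split point is placed in the lower bin, and
// values outside the original range go to the first or last bin.
std::size_t BinIndex(const std::vector<double>& splitPoints,
                     const double value)
{
  // Count the split points that are strictly less than the value.
  return static_cast<std::size_t>(
      std::lower_bound(splitPoints.begin(), splitPoints.end(), value) -
      splitPoints.begin());
}
```

With four bins over the range [0, 10], this sketch yields three split points, consistent with the "length bins - 1" note on splitPoints.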
| Template Parameters | |
| FitnessFunction | Fitness function to use for calculating gain. |
| ObservationType | Type of observations in this dimension. |
Definition at line 53 of file hoeffding_numeric_split.hpp.
| typedef NumericSplitInfo<ObservationType> mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::SplitInfo | 
The splitting information type required by the HoeffdingNumericSplit.
Definition at line 57 of file hoeffding_numeric_split.hpp.
| mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::HoeffdingNumericSplit(const size_t numClasses, const size_t bins = 10, const size_t observationsBeforeBinning = 100) |
Create the HoeffdingNumericSplit class, and specify some basic parameters about how the binning should take place.
| Parameters | |
| numClasses | Number of classes. |
| bins | Number of bins. |
| observationsBeforeBinning | Number of points to see before binning is performed. |
| mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::HoeffdingNumericSplit(const size_t numClasses, const HoeffdingNumericSplit< FitnessFunction, ObservationType > &other) |
Create the HoeffdingNumericSplit class, using the parameters from the given other split object.
| size_t mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::Bins() const | inline |
Return the number of bins.
Definition at line 120 of file hoeffding_numeric_split.hpp.
References mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::bins, and mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::Serialize().
| void mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::EvaluateFitnessFunction(double &bestFitness, double &secondBestFitness) const |
Evaluate the fitness function given what has been calculated so far.
In this case, if binning has not yet been performed, bestFitness is set to 0 (i.e., no gain). Because this split can only split one possible way, secondBestFitness (the fitness function for the second best possible split) is always set to 0.
| Parameters | |
| bestFitness | Value of the fitness function for the best possible split. |
| secondBestFitness | Value of the fitness function for the second best possible split (always 0 for this split). |
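To make the behaviour described above concrete, here is a minimal, self-contained sketch of how a single gain value could be computed from per-bin class counts. It assumes Gini impurity as the fitness function and uses std::vector in place of Armadillo; the names CountMatrix, GiniImpurityOf and EvaluateFitnessSketch are hypothetical and are not part of mlpack.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// counts[c][b]: number of training points of class c that fell into bin b.
// (Hypothetical stand-in for the sufficient statistics matrix.)
using CountMatrix = std::vector<std::vector<std::size_t>>;

// Gini impurity of a vector of class counts: 1 - sum_c p_c^2.
double GiniImpurityOf(const std::vector<std::size_t>& classCounts)
{
  std::size_t total = 0;
  for (const std::size_t n : classCounts)
    total += n;
  if (total == 0)
    return 0.0;

  double impurity = 1.0;
  for (const std::size_t n : classCounts)
  {
    const double p = static_cast<double>(n) / static_cast<double>(total);
    impurity -= p * p;
  }
  return impurity;
}

// Sketch only: the gain is the parent impurity minus the size-weighted
// impurity of each bin. Before binning, both outputs are 0.
void EvaluateFitnessSketch(const CountMatrix& counts,
                           const bool binned,
                           double& bestFitness,
                           double& secondBestFitness)
{
  bestFitness = 0.0;
  secondBestFitness = 0.0; // Only one way to split, so no second-best split.
  if (!binned || counts.empty())
    return;

  const std::size_t numClasses = counts.size();
  const std::size_t bins = counts[0].size();

  // Class totals over all bins give the counts for the unsplit node.
  std::vector<std::size_t> parent(numClasses, 0);
  std::size_t total = 0;
  for (std::size_t c = 0; c < numClasses; ++c)
  {
    assert(counts[c].size() == bins);
    for (std::size_t b = 0; b < bins; ++b)
    {
      parent[c] += counts[c][b];
      total += counts[c][b];
    }
  }
  if (total == 0)
    return;

  double gain = GiniImpurityOf(parent);
  for (std::size_t b = 0; b < bins; ++b)
  {
    std::vector<std::size_t> child(numClasses);
    std::size_t childTotal = 0;
    for (std::size_t c = 0; c < numClasses; ++c)
    {
      child[c] = counts[c][b];
      childTotal += counts[c][b];
    }
    gain -= (static_cast<double>(childTotal) / static_cast<double>(total)) *
        GiniImpurityOf(child);
  }
  bestFitness = gain;
}
```

A different FitnessFunction (for example, information gain) would replace GiniImpurityOf while leaving the rest of the sketch unchanged.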
| size_t mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::MajorityClass() const |
Return the majority class.
Referenced by mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::NumChildren().
| double mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::MajorityProbability() const |
Return the probability of the majority class.
Referenced by mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::NumChildren().
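| size_t mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::NumChildren() const | inline |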
Return the number of children if this node splits on this feature.
Definition at line 106 of file hoeffding_numeric_split.hpp.
References mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::bins, mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::MajorityClass(), mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::MajorityProbability(), and mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::Split().
| template<typename Archive > void mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::Serialize(Archive &ar, const unsigned int) |
Serialize the object.
Referenced by mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::Bins().
| void mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::Split(arma::Col< size_t > &childMajorities, SplitInfo &splitInfo) const |
Return the majority class of each child to be created, if a split on this dimension was performed.
This also fills in splitInfo, the split object.
Referenced by mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::NumChildren().
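As an illustration of what "the majority class of each child" means, the following self-contained sketch picks the most frequent class in each column of a per-bin count matrix. It assumes one child per column and breaks ties in favor of the lowest class index; CountMatrix and ChildMajoritiesSketch are hypothetical names, not mlpack API, and the construction of the SplitInfo object is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// counts[c][b]: number of training points of class c that fell into bin b.
// (Hypothetical stand-in for the sufficient statistics matrix.)
using CountMatrix = std::vector<std::vector<std::size_t>>;

// Sketch only: return, for each bin (child), the class with the largest count.
// Assumption: ties and empty bins resolve to the lowest class index.
std::vector<std::size_t> ChildMajoritiesSketch(const CountMatrix& counts)
{
  assert(!counts.empty());
  const std::size_t bins = counts[0].size();

  std::vector<std::size_t> childMajorities(bins, 0);
  for (std::size_t b = 0; b < bins; ++b)
  {
    std::size_t bestCount = 0;
    for (std::size_t c = 0; c < counts.size(); ++c)
    {
      assert(counts[c].size() == bins);
      if (counts[c][b] > bestCount)
      {
        bestCount = counts[c][b];
        childMajorities[b] = c;
      }
    }
  }
  return childMajorities;
}
```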
| void mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::Train(ObservationType value, const size_t label) |
Train the HoeffdingNumericSplit on the given observed value (remember that this object only cares about the information for a single feature, not an entire point).
| Parameters | |
| value | Value in the dimension that this HoeffdingNumericSplit refers to. |
| label | Label of the given point. |
| size_t mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::bins | private |
The number of bins.
Definition at line 135 of file hoeffding_numeric_split.hpp.
Referenced by mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::Bins(), and mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::NumChildren().
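| arma::Col< size_t > mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::labels | private |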
This holds the labels of the points before binning.
Definition at line 130 of file hoeffding_numeric_split.hpp.
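| arma::Col< ObservationType > mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::observations | private |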
Before binning, this holds the points we have seen so far.
Definition at line 128 of file hoeffding_numeric_split.hpp.
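| size_t mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::observationsBeforeBinning | private |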
The number of observations we must see before binning.
Definition at line 137 of file hoeffding_numeric_split.hpp.
| size_t mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::samplesSeen | private |
The number of samples we have seen so far.
Definition at line 139 of file hoeffding_numeric_split.hpp.
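| arma::Col< ObservationType > mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::splitPoints | private |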
The split points for the binning (length bins - 1).
Definition at line 133 of file hoeffding_numeric_split.hpp.
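| arma::Mat< size_t > mlpack::tree::HoeffdingNumericSplit< FitnessFunction, ObservationType >::sufficientStatistics | private |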
After binning, this contains the sufficient statistics.
Definition at line 142 of file hoeffding_numeric_split.hpp.
 1.8.11