mlpack
master
The simple Naive Bayes classifier.
Public Member Functions
NaiveBayesClassifier (const MatType &data, const arma::Row< size_t > &labels, const size_t classes, const bool incrementalVariance=false) | Initializes the classifier as per the input and then trains it by calculating the sample mean and variances.
NaiveBayesClassifier (const size_t dimensionality=0, const size_t classes=0) | Initialize the Naive Bayes classifier without performing training.
void Classify (const MatType &data, arma::Row< size_t > &results) | Given a bunch of data points, this function evaluates the class of each of those data points, and puts it in the vector 'results'.
const MatType & Means () const | Get the sample means for each class.
MatType & Means () | Modify the sample means for each class.
const arma::vec & Probabilities () const | Get the prior probabilities for each class.
arma::vec & Probabilities () | Modify the prior probabilities for each class.
template<typename Archive > void Serialize (Archive &ar, const unsigned int) | Serialize the classifier.
void Train (const MatType &data, const arma::Row< size_t > &labels, const bool incremental=true) | Train the Naive Bayes classifier on the given dataset.
template<typename VecType > void Train (const VecType &point, const size_t label) | Train the Naive Bayes classifier on the given point.
const MatType & Variances () const | Get the sample variances for each class.
MatType & Variances () | Modify the sample variances for each class.
Private Attributes
MatType means | Sample mean for each class.
arma::vec probabilities | Class probabilities.
size_t trainingPoints | Number of training points seen so far.
MatType variances | Sample variances for each class.
The simple Naive Bayes classifier.
This class trains on the data by calculating the sample mean and variance of each feature with respect to each label, along with the class probabilities. The class labels are assumed to be non-negative integers (starting with 0) and are passed to the constructor as a separate row vector, with one label per data point.
Mathematically, it computes P(X_i = x_i | Y = y_j) for each feature X_i and each label y_j. Along with this, it also computes the class probabilities P(Y = y_j).
To classify a data point (x_1, x_2, ..., x_n), it computes the following: arg max_y (P(Y = y) * P(X_1 = x_1 | Y = y) * ... * P(X_n = x_n | Y = y))
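The arg max rule can be sketched without mlpack or Armadillo. The block below is a minimal illustration, not the library's implementation: it assumes each feature is modeled as an independent Gaussian with the per-class sample mean and variance, and it sums logarithms instead of multiplying probabilities to avoid underflow. The names ClassModel, LogGaussian, and ClassifyPoint are hypothetical and do not appear in mlpack.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical per-class model: the prior P(Y = y) plus one mean and one
// variance per feature. The Gaussian likelihood is an assumption of this
// sketch.
struct ClassModel
{
  double prior;
  std::vector<double> means;
  std::vector<double> variances;
};

// Log of the Gaussian density N(x; mean, variance). The variance is assumed
// to be strictly positive.
double LogGaussian(const double x, const double mean, const double variance)
{
  const double pi = 3.14159265358979323846;
  const double diff = x - mean;
  return -0.5 * std::log(2.0 * pi * variance) - diff * diff / (2.0 * variance);
}

// Returns arg max_y [ log P(Y = y) + sum_i log P(X_i = x_i | Y = y) ], which
// selects the same class as the product form because log is monotonic.
std::size_t ClassifyPoint(const std::vector<ClassModel>& model,
                          const std::vector<double>& point)
{
  assert(!model.empty());
  std::size_t best = 0;
  double bestScore = -std::numeric_limits<double>::infinity();
  for (std::size_t y = 0; y < model.size(); ++y)
  {
    assert(model[y].means.size() == point.size());
    assert(model[y].variances.size() == point.size());
    double score = std::log(model[y].prior);
    for (std::size_t i = 0; i < point.size(); ++i)
      score += LogGaussian(point[i], model[y].means[i], model[y].variances[i]);
    if (score > bestScore)
    {
      bestScore = score;
      best = y;
    }
  }
  return best;
}
```

With two classes whose means are far apart and whose priors are equal, a point is assigned to the class whose mean it lies closest to, as measured in units of that class's variance.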
Definition at line 47 of file naive_bayes_classifier.hpp.
mlpack::naive_bayes::NaiveBayesClassifier< MatType >::NaiveBayesClassifier (const MatType &data, const arma::Row< size_t > &labels, const size_t classes, const bool incrementalVariance=false)
Initializes the classifier as per the input and then trains it by calculating the sample mean and variances.
data | Training data points. |
labels | Labels corresponding to training data points. |
classes | Number of classes in this classifier. |
incrementalVariance | If true, an incremental algorithm is used to calculate the variance; this can prevent loss of precision in some cases, but will be somewhat slower to calculate. |
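The precision remark for incrementalVariance can be illustrated with Welford's algorithm, a well-known incremental method for the mean and variance. This is a sketch under that assumption; this page does not state which incremental update mlpack uses. RunningStats and SumOfSquaresVariance are hypothetical names, and the second function is included only as a contrast: it subtracts two large, nearly equal sums, which is where precision can be lost when the values are large compared with their spread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Running mean and unbiased sample variance via Welford's algorithm.
struct RunningStats
{
  std::size_t count = 0;
  double mean = 0.0;
  double m2 = 0.0; // Sum of squared deviations from the current mean.

  // Fold one observation into the running statistics.
  void Add(const double x)
  {
    ++count;
    const double delta = x - mean;
    mean += delta / static_cast<double>(count);
    m2 += delta * (x - mean); // Uses the updated mean.
  }

  // Unbiased sample variance; defined as 0 for fewer than two points.
  double Variance() const
  {
    return (count > 1) ? m2 / static_cast<double>(count - 1) : 0.0;
  }
};

// Sum-of-squares formula, shown for contrast. It is cheaper per point but
// computes the variance as a difference of two large sums.
double SumOfSquaresVariance(const std::vector<double>& values)
{
  assert(values.size() > 1);
  double sum = 0.0;
  double sumSq = 0.0;
  for (const double v : values)
  {
    sum += v;
    sumSq += v * v;
  }
  const double n = static_cast<double>(values.size());
  return (sumSq - sum * sum / n) / (n - 1.0);
}
```

Feeding the same values to both routines, once as given and once shifted by a large constant, is a simple way to observe the difference: the variance is unchanged by the shift in exact arithmetic, so any change in the result is rounding error.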
mlpack::naive_bayes::NaiveBayesClassifier< MatType >::NaiveBayesClassifier (const size_t dimensionality=0, const size_t classes=0)
Initialize the Naive Bayes classifier without performing training.
All of the parameters of the model will be initialized to zero. Be sure to use Train() before calling Classify(), otherwise the results may be meaningless.
void mlpack::naive_bayes::NaiveBayesClassifier< MatType >::Classify (const MatType &data, arma::Row< size_t > &results)
Given a bunch of data points, this function evaluates the class of each of those data points, and puts it in the vector 'results'.
data | List of data points. |
results | Vector that class predictions will be placed into. |
const MatType & mlpack::naive_bayes::NaiveBayesClassifier< MatType >::Means () const [inline]
Get the sample means for each class.
Definition at line 129 of file naive_bayes_classifier.hpp.
References mlpack::naive_bayes::NaiveBayesClassifier< MatType >::means.
MatType & mlpack::naive_bayes::NaiveBayesClassifier< MatType >::Means () [inline]
Modify the sample means for each class.
Definition at line 131 of file naive_bayes_classifier.hpp.
References mlpack::naive_bayes::NaiveBayesClassifier< MatType >::means.
const arma::vec & mlpack::naive_bayes::NaiveBayesClassifier< MatType >::Probabilities () const [inline]
Get the prior probabilities for each class.
Definition at line 139 of file naive_bayes_classifier.hpp.
References mlpack::naive_bayes::NaiveBayesClassifier< MatType >::probabilities.
arma::vec & mlpack::naive_bayes::NaiveBayesClassifier< MatType >::Probabilities () [inline]
Modify the prior probabilities for each class.
Definition at line 141 of file naive_bayes_classifier.hpp.
References mlpack::naive_bayes::NaiveBayesClassifier< MatType >::probabilities, and mlpack::naive_bayes::NaiveBayesClassifier< MatType >::Serialize().
template<typename Archive > void mlpack::naive_bayes::NaiveBayesClassifier< MatType >::Serialize (Archive &ar, const unsigned int)
Serialize the classifier.
Referenced by mlpack::naive_bayes::NaiveBayesClassifier< MatType >::Probabilities().
void mlpack::naive_bayes::NaiveBayesClassifier< MatType >::Train (const MatType &data, const arma::Row< size_t > &labels, const bool incremental=true)
Train the Naive Bayes classifier on the given dataset.
If the incremental algorithm is used, the current model is used as a starting point (this is the default). If the incremental algorithm is not used, then the current model is ignored and the new model will be trained only on the given data. Note that even if the incremental algorithm is not used, the data must have the same dimensionality and number of classes that the model was initialized with. If you want to change the dimensionality or number of classes, either re-initialize or call Means(), Variances(), and Probabilities() individually to set them to the right size.
data | The dataset to train on. |
labels | Labels corresponding to the points in the dataset. |
incremental | Whether or not to use the incremental algorithm for training. |
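The difference between the two training modes can be sketched on the prior probabilities alone. The block below is a hypothetical stand-in, not mlpack code: PriorModel tracks only per-class label counts and leaves out the means and variances. It assumes that non-incremental training is equivalent to resetting the model before accumulating, which matches the description above of the current model being ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model holding only what is needed for the class priors.
class PriorModel
{
 public:
  // The number of classes is fixed at construction time.
  explicit PriorModel(const std::size_t classes) :
      counts(classes, 0), total(0) { }

  // With incremental == true the existing counts are the starting point;
  // otherwise they are discarded and only the given labels are used.
  void Train(const std::vector<std::size_t>& labels,
             const bool incremental = true)
  {
    if (!incremental)
    {
      counts.assign(counts.size(), 0);
      total = 0;
    }
    for (const std::size_t label : labels)
      TrainPoint(label);
  }

  // Training on a single point always updates the current model.
  void TrainPoint(const std::size_t label)
  {
    assert(label < counts.size());
    ++counts[label];
    ++total;
  }

  // Estimated P(Y = label); 0 if nothing has been seen yet.
  double Probability(const std::size_t label) const
  {
    assert(label < counts.size());
    return (total == 0) ? 0.0 :
        static_cast<double>(counts[label]) / static_cast<double>(total);
  }

 private:
  std::vector<std::size_t> counts; // Points seen per class.
  std::size_t total;               // Training points seen so far.
};
```

Calling Train() twice with incremental set to true gives the same priors as one call on the concatenated labels, while a call with incremental set to false yields priors that depend only on the labels passed to that call.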
template<typename VecType > void mlpack::naive_bayes::NaiveBayesClassifier< MatType >::Train (const VecType &point, const size_t label)
Train the Naive Bayes classifier on the given point.
This will use the incremental algorithm for updating the model parameters. The data must be the same dimensionality as the existing model parameters.
point | Data point to train on. |
label | Label of data point. |
const MatType & mlpack::naive_bayes::NaiveBayesClassifier< MatType >::Variances () const [inline]
Get the sample variances for each class.
Definition at line 134 of file naive_bayes_classifier.hpp.
References mlpack::naive_bayes::NaiveBayesClassifier< MatType >::variances.
MatType & mlpack::naive_bayes::NaiveBayesClassifier< MatType >::Variances () [inline]
Modify the sample variances for each class.
Definition at line 136 of file naive_bayes_classifier.hpp.
References mlpack::naive_bayes::NaiveBayesClassifier< MatType >::variances.
MatType mlpack::naive_bayes::NaiveBayesClassifier< MatType >::means [private]
Sample mean for each class.
Definition at line 149 of file naive_bayes_classifier.hpp.
Referenced by mlpack::naive_bayes::NaiveBayesClassifier< MatType >::Means().
arma::vec mlpack::naive_bayes::NaiveBayesClassifier< MatType >::probabilities [private]
Class probabilities.
Definition at line 153 of file naive_bayes_classifier.hpp.
Referenced by mlpack::naive_bayes::NaiveBayesClassifier< MatType >::Probabilities().
size_t mlpack::naive_bayes::NaiveBayesClassifier< MatType >::trainingPoints [private]
Number of training points seen so far.
Definition at line 155 of file naive_bayes_classifier.hpp.
MatType mlpack::naive_bayes::NaiveBayesClassifier< MatType >::variances [private]
Sample variances for each class.
Definition at line 151 of file naive_bayes_classifier.hpp.
Referenced by mlpack::naive_bayes::NaiveBayesClassifier< MatType >::Variances().