A class that represents a Hidden Markov Model with an arbitrary type of emission distribution.

Public Member Functions
	HMM (const size_t states=0, const Distribution emissions=Distribution(), const double tolerance=1e-5)
	Create the Hidden Markov Model with the given number of hidden states and the given default distribution for emissions.

	HMM (const arma::vec &initial, const arma::mat &transition, const std::vector< Distribution > &emission, const double tolerance=1e-5)
	Create the Hidden Markov Model with the given initial probability vector, the given transition matrix, and the given emission distributions.

size_t	Dimensionality () const
	Get the dimensionality of observations.

size_t &	Dimensionality ()
	Set the dimensionality of observations.

const std::vector< Distribution > &	Emission () const
	Return the emission distributions.

std::vector< Distribution > &	Emission ()
	Return a modifiable reference to the emission distributions.

double	Estimate (const arma::mat &dataSeq, arma::mat &stateProb, arma::mat &forwardProb, arma::mat &backwardProb, arma::vec &scales) const
	Estimate the probabilities of each hidden state at each time step for each given data observation, using the Forward-Backward algorithm.

double	Estimate (const arma::mat &dataSeq, arma::mat &stateProb) const
	Estimate the probabilities of each hidden state at each time step of each given data observation, using the Forward-Backward algorithm.

void	Filter (const arma::mat &dataSeq, arma::mat &filterSeq, size_t ahead=0) const
	HMM filtering.

void	Generate (const size_t length, arma::mat &dataSequence, arma::Row< size_t > &stateSequence, const size_t startState=0) const
	Generate a random data sequence of the given length.

const arma::vec &	Initial () const
	Return the vector of initial state probabilities.

arma::vec &	Initial ()
	Modify the vector of initial state probabilities.

double	LogLikelihood (const arma::mat &dataSeq) const
	Compute the log-likelihood of the given data sequence.

double	Predict (const arma::mat &dataSeq, arma::Row< size_t > &stateSeq) const
	Compute the most probable hidden state sequence for the given data sequence, using the Viterbi algorithm, returning the log-likelihood of the most likely state sequence.

template<typename Archive >
void	Serialize (Archive &ar, const unsigned int version)
	Serialize the object.

void	Smooth (const arma::mat &dataSeq, arma::mat &smoothSeq) const
	HMM smoothing.

double	Tolerance () const
	Get the tolerance of the Baum-Welch algorithm.

double &	Tolerance ()
	Modify the tolerance of the Baum-Welch algorithm.

void	Train (const std::vector< arma::mat > &dataSeq)
	Train the model using the Baum-Welch algorithm, with only the given unlabeled observations.

void	Train (const std::vector< arma::mat > &dataSeq, const std::vector< arma::Row< size_t > > &stateSeq)
	Train the model using the given labeled observations; the transition and emission matrices are directly estimated.

const arma::mat &	Transition () const
	Return the transition matrix.

arma::mat &	Transition ()
	Return a modifiable transition matrix reference.

Protected Member Functions
void	Backward (const arma::mat &dataSeq, const arma::vec &scales, arma::mat &backwardProb) const
	The Backward algorithm (part of the Forward-Backward algorithm).

void	Forward (const arma::mat &dataSeq, arma::vec &scales, arma::mat &forwardProb) const
	The Forward algorithm (part of the Forward-Backward algorithm).

Protected Attributes
std::vector< Distribution >	emission
	Set of emission probability distributions; one for each state.

arma::mat	transition
	Transition probability matrix.

Private Attributes
size_t	dimensionality
	Dimensionality of observations.

arma::vec	initial
	Initial state probability vector.

double	tolerance
	Tolerance of Baum-Welch algorithm.

Detailed Description

template<typename Distribution = distribution::DiscreteDistribution>
class mlpack::hmm::HMM< Distribution >

A class that represents a Hidden Markov Model with an arbitrary type of emission distribution.

This HMM class supports training (supervised and unsupervised), prediction of state sequences via the Viterbi algorithm, estimation of state probabilities, generation of random sequences, and calculation of the log-likelihood of a given sequence.

The template parameter, Distribution, specifies the distribution which the emissions follow. The class should implement the following functions:

class Distribution
{
 public:
  // The type of observation used by this distribution.
  typedef something DataType;
  // Return the probability of the given observation.
  double Probability(const DataType& observation) const;
  // Estimate the distribution based on the given observations.
  void Train(const std::vector<DataType>& observations);
  // Estimate the distribution based on the given observations, given also
  // the probability of each observation coming from this distribution.
  void Train(const std::vector<DataType>& observations,
             const std::vector<double>& probabilities);
};

See the mlpack::distribution::DiscreteDistribution class for an example. One would use the DiscreteDistribution class when the observations are non-negative integers. Other distributions could be Gaussians, a mixture of Gaussians (GMM), or any other probability distribution that provides the DataType typedef and the Probability() and Train() functions listed above.

Usage of the HMM class generally involves either training an HMM or loading an already-known HMM and taking probability measurements of sequences. Example code for supervised training of a Gaussian HMM (that is, where the emission output distribution is a single Gaussian for each hidden state) is given below.

extern arma::mat observations; // Each column is an observation.
extern arma::Row<size_t> states; // Hidden state for each observation.
// Create an untrained HMM with 5 hidden states and default (N(0, 1))
// Gaussian distributions with the dimensionality of the dataset.
HMM<GaussianDistribution> hmm(5, GaussianDistribution(observations.n_rows));
// Train() takes a vector of sequences, so wrap the single sequence and its
// labels (the labels could be omitted to perform unsupervised training).
std::vector<arma::mat> dataSeq(1, observations);
std::vector<arma::Row<size_t> > stateSeq(1, states);
hmm.Train(dataSeq, stateSeq);

Once initialized, the HMM can evaluate the log-likelihood of a given sequence (with LogLikelihood()), predict the most likely sequence of hidden states (with Predict()), generate a sequence (with Generate()), or estimate the probabilities of each state for a sequence of observations (with Estimate()).

Template Parameters

Distribution Type of emission distribution for this HMM.

Definition at line 85 of file hmm.hpp.

Constructor & Destructor Documentation

template<typename Distribution = distribution::DiscreteDistribution>

mlpack::hmm::HMM< Distribution >::HMM	(	const size_t	states = `0`,
		const Distribution	emissions = `Distribution()`,
		const double	tolerance = `1e-5`
	)

Create the Hidden Markov Model with the given number of hidden states and the given default distribution for emissions.

The dimensionality of the observations is taken from the emissions variable, so it is important that the given default emission distribution is set with the correct dimensionality. Alternatively, set the dimensionality with Dimensionality(). Optionally, the tolerance for convergence of the Baum-Welch algorithm can be set.

By default, the transition matrix and initial probability vector are set to contain equal probability for each state.

Parameters

states	Number of states.
emissions	Default distribution for emissions.
tolerance	Tolerance for convergence of training algorithm (Baum-Welch).

template<typename Distribution = distribution::DiscreteDistribution>

mlpack::hmm::HMM< Distribution >::HMM	(	const arma::vec &	initial,
		const arma::mat &	transition,
		const std::vector< Distribution > &	emission,
		const double	tolerance = `1e-5`
	)

Create the Hidden Markov Model with the given initial probability vector, the given transition matrix, and the given emission distributions.

The dimensionality of the observations of the HMM is taken from the given emission distributions. Alternatively, the dimensionality can be set with Dimensionality().

The initial state probability vector should have length equal to the number of states, and each entry represents the probability of being in the given state at time t = 0 (the beginning of a sequence).

The transition matrix should be such that T(i, j) is the probability of transition to state i from state j. The columns of the matrix should sum to 1.

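The emission vector should hold one distribution for each hidden state. For discrete emissions this corresponds to an emission matrix E in which E(i, j) is the probability of emission i while in state j, so the columns of that matrix should sum to 1.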

Optionally, the tolerance for convergence of the Baum-Welch algorithm can be set.
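The column convention for the transition matrix is easy to get backwards, so here is a small standalone sketch of it. It uses plain std::vector in place of arma::mat, and the names Matrix, ColumnsSumToOne, and Step are hypothetical helpers for illustration, not mlpack functions.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Dense matrix stored as a vector of rows: m[i][j] is row i, column j.
using Matrix = std::vector<std::vector<double>>;

// Returns true if every column of t sums to 1 (within eps).
bool ColumnsSumToOne(const Matrix& t, const double eps = 1e-9)
{
  if (t.empty())
    return false;
  for (std::size_t j = 0; j < t[0].size(); ++j)
  {
    double sum = 0.0;
    for (std::size_t i = 0; i < t.size(); ++i)
      sum += t[i][j];
    if (std::fabs(sum - 1.0) > eps)
      return false;
  }
  return true;
}

// Advances a state distribution by one step:
// next[i] = sum over j of T(i, j) * current[j].
std::vector<double> Step(const Matrix& t, const std::vector<double>& current)
{
  std::vector<double> next(t.size(), 0.0);
  for (std::size_t i = 0; i < t.size(); ++i)
    for (std::size_t j = 0; j < current.size(); ++j)
      next[i] += t[i][j] * current[j];
  return next;
}
```

With T(i, j) read as "to state i from state j", multiplying the matrix by a column of state probabilities gives the next state distribution. A matrix whose rows (rather than columns) sum to 1 would fail the check and should be transposed before being passed to the constructor.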

Parameters

initial	Initial state probabilities.
transition	Transition matrix.
emission	Emission distributions.
tolerance	Tolerance for convergence of training algorithm (Baum-Welch).

Member Function Documentation

template<typename Distribution = distribution::DiscreteDistribution>

void mlpack::hmm::HMM< Distribution >::Backward	(	const arma::mat &	dataSeq,
		const arma::vec &	scales,
		arma::mat &	backwardProb
	)		const

protected

The Backward algorithm (part of the Forward-Backward algorithm).

Computes backward probabilities for each state for each observation in the given data sequence, using the scaling factors found (presumably) by Forward(). The returned matrix has rows equal to the number of hidden states and columns equal to the number of observations.

Parameters

dataSeq	Data sequence to compute probabilities for.
scales	Vector of scaling factors.
backwardProb	Matrix in which backward probabilities will be saved.


template<typename Distribution = distribution::DiscreteDistribution>

size_t mlpack::hmm::HMM< Distribution >::Dimensionality ( ) const

inline

Get the dimensionality of observations.

Definition at line 316 of file hmm.hpp.

template<typename Distribution = distribution::DiscreteDistribution>

size_t& mlpack::hmm::HMM< Distribution >::Dimensionality ( )

inline

Set the dimensionality of observations.

Definition at line 318 of file hmm.hpp.

template<typename Distribution = distribution::DiscreteDistribution>

const std::vector<Distribution>& mlpack::hmm::HMM< Distribution >::Emission ( ) const

inline

Return the emission distributions.

Definition at line 311 of file hmm.hpp.

template<typename Distribution = distribution::DiscreteDistribution>

std::vector<Distribution>& mlpack::hmm::HMM< Distribution >::Emission ( )

inline

Return a modifiable reference to the emission distributions.

Definition at line 313 of file hmm.hpp.

template<typename Distribution = distribution::DiscreteDistribution>

double mlpack::hmm::HMM< Distribution >::Estimate	(	const arma::mat &	dataSeq,
		arma::mat &	stateProb,
		arma::mat &	forwardProb,
		arma::mat &	backwardProb,
		arma::vec &	scales
	)		const

Estimate the probabilities of each hidden state at each time step for each given data observation, using the Forward-Backward algorithm.

Each matrix which is returned has columns equal to the number of data observations, and rows equal to the number of hidden states in the model. The log-likelihood of the most probable sequence is returned.

Parameters

dataSeq	Sequence of observations.
stateProb	Matrix in which the probabilities of each state at each time interval will be stored.
forwardProb	Matrix in which the forward probabilities of each state at each time interval will be stored.
backwardProb	Matrix in which the backward probabilities of each state at each time interval will be stored.
scales	Vector in which the scaling factors at each time interval will be stored.

Returns: Log-likelihood of most likely state sequence.

template<typename Distribution = distribution::DiscreteDistribution>

double mlpack::hmm::HMM< Distribution >::Estimate	(	const arma::mat &	dataSeq,
		arma::mat &	stateProb
	)		const

Estimate the probabilities of each hidden state at each time step of each given data observation, using the Forward-Backward algorithm.

The returned matrix of state probabilities has columns equal to the number of data observations, and rows equal to the number of hidden states in the model. The log-likelihood of the most probable sequence is returned.

Parameters

dataSeq	Sequence of observations.
stateProb	Probabilities of each state at each time interval.

Returns: Log-likelihood of most likely state sequence.

template<typename Distribution = distribution::DiscreteDistribution>

void mlpack::hmm::HMM< Distribution >::Filter	(	const arma::mat &	dataSeq,
		arma::mat &	filterSeq,
		size_t	ahead = `0`
	)		const

HMM filtering.

Computes the k-step-ahead expected emission at each time step, conditioned only on the observations up to that step; that is, E{ Y[t+k] | Y[0], ..., Y[t] }. The returned matrix has columns equal to the number of observations. Note that the expectation may not be meaningful for discrete emissions.

Parameters

dataSeq	Sequence of observations.
filterSeq	Matrix in which the expected emission sequence will be stored.
ahead	Number of steps ahead (k) for expectations.

template<typename Distribution = distribution::DiscreteDistribution>

void mlpack::hmm::HMM< Distribution >::Forward	(	const arma::mat &	dataSeq,
		arma::vec &	scales,
		arma::mat &	forwardProb
	)		const

protected

The Forward algorithm (part of the Forward-Backward algorithm).

Computes forward probabilities for each state for each observation in the given data sequence. The returned matrix has rows equal to the number of hidden states and columns equal to the number of observations.

Parameters

dataSeq	Data sequence to compute probabilities for.
scales	Vector in which scaling factors will be saved.
forwardProb	Matrix in which forward probabilities will be saved.
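For orientation, a textbook scaled forward pass is sketched below using only the standard library. It is not mlpack's code: ScaledForward is a hypothetical name, the emission probabilities are assumed to be precomputed in emisProb (states by observations), and every observation is assumed to have nonzero probability under at least one reachable state so that no scaling factor is zero.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Dense matrix stored as a vector of rows: m[i][j] is row i, column j.
using Matrix = std::vector<std::vector<double>>;

// initial[i]:       probability of state i at the first time step.
// transition[i][j]: probability of moving to state i from state j.
// emisProb[i][t]:   probability of observation t under state i.
// On return, forwardProb is (states x observations) with each column
// normalized to sum to 1, and scales[t] holds the normalizer of column t.
// The return value is the sum of log(scales[t]) over all t.
double ScaledForward(const std::vector<double>& initial,
                     const Matrix& transition,
                     const Matrix& emisProb,
                     std::vector<double>& scales,
                     Matrix& forwardProb)
{
  const std::size_t n = initial.size();
  const std::size_t len = (n == 0) ? 0 : emisProb[0].size();
  forwardProb.assign(n, std::vector<double>(len, 0.0));
  scales.assign(len, 0.0);

  double logLik = 0.0;
  for (std::size_t t = 0; t < len; ++t)
  {
    for (std::size_t i = 0; i < n; ++i)
    {
      // Probability of being in state i before seeing observation t.
      double prior = initial[i];
      if (t > 0)
      {
        prior = 0.0;
        for (std::size_t j = 0; j < n; ++j)
          prior += transition[i][j] * forwardProb[j][t - 1];
      }
      forwardProb[i][t] = prior * emisProb[i][t];
      scales[t] += forwardProb[i][t];
    }
    // Rescale the column so later products do not underflow.
    for (std::size_t i = 0; i < n; ++i)
      forwardProb[i][t] /= scales[t];
    logLik += std::log(scales[t]);
  }
  return logLik;
}
```

Because each column is renormalized, the product of the scaling factors equals the likelihood of the whole observation sequence, which is why summing their logarithms gives a log-likelihood without underflow on long sequences.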


template<typename Distribution = distribution::DiscreteDistribution>

void mlpack::hmm::HMM< Distribution >::Generate	(	const size_t	length,
		arma::mat &	dataSequence,
		arma::Row< size_t > &	stateSequence,
		const size_t	startState = `0`
	)		const

Generate a random data sequence of the given length.

The data sequence is stored in the dataSequence parameter, and the state sequence is stored in the stateSequence parameter. Each column of dataSequence represents a random observation.

Parameters

length	Length of random sequence to generate.
dataSequence	Matrix in which the generated observations will be stored (one observation per column).
stateSequence	Vector to store states in.
startState	Hidden state to start sequence in (default 0).
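The sampling procedure can be sketched for the discrete case with the standard library alone. This is an illustration under stated assumptions, not mlpack's implementation: SampleColumn and GenerateSketch are hypothetical names, emissions are symbol indices drawn from a column-stochastic emission matrix, and the first observation is assumed to be emitted from startState before any transition is taken.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Dense matrix stored as a vector of rows: m[i][j] is row i, column j.
using Matrix = std::vector<std::vector<double>>;

// Draws a row index i with probability m[i][col].
std::size_t SampleColumn(const Matrix& m, const std::size_t col,
                         std::mt19937& rng)
{
  std::vector<double> weights(m.size());
  for (std::size_t i = 0; i < m.size(); ++i)
    weights[i] = m[i][col];
  std::discrete_distribution<std::size_t> dist(weights.begin(),
                                               weights.end());
  return dist(rng);
}

// transition[i][j]: probability of moving to state i from state j.
// emission[i][j]:   probability of emitting symbol i while in state j.
void GenerateSketch(const Matrix& transition,
                    const Matrix& emission,
                    const std::size_t length,
                    const std::size_t startState,
                    std::mt19937& rng,
                    std::vector<std::size_t>& dataSequence,
                    std::vector<std::size_t>& stateSequence)
{
  dataSequence.assign(length, 0);
  stateSequence.assign(length, 0);
  std::size_t state = startState;
  for (std::size_t t = 0; t < length; ++t)
  {
    // After the first step, move to a new state before emitting.
    if (t > 0)
      state = SampleColumn(transition, state, rng);
    stateSequence[t] = state;
    dataSequence[t] = SampleColumn(emission, state, rng);
  }
}
```

Passing the random engine in explicitly keeps the sketch reproducible: the same seed yields the same pair of sequences.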

template<typename Distribution = distribution::DiscreteDistribution>

const arma::vec& mlpack::hmm::HMM< Distribution >::Initial ( ) const

inline

Return the vector of initial state probabilities.

Definition at line 301 of file hmm.hpp.

template<typename Distribution = distribution::DiscreteDistribution>

arma::vec& mlpack::hmm::HMM< Distribution >::Initial ( )

inline

Modify the vector of initial state probabilities.

Definition at line 303 of file hmm.hpp.

template<typename Distribution = distribution::DiscreteDistribution>

double mlpack::hmm::HMM< Distribution >::LogLikelihood ( const arma::mat & dataSeq ) const

Compute the log-likelihood of the given data sequence.

Parameters

dataSeq Data sequence to evaluate the likelihood of.

Returns: Log-likelihood of the given sequence.

template<typename Distribution = distribution::DiscreteDistribution>

double mlpack::hmm::HMM< Distribution >::Predict	(	const arma::mat &	dataSeq,
		arma::Row< size_t > &	stateSeq
	)		const

Compute the most probable hidden state sequence for the given data sequence, using the Viterbi algorithm, returning the log-likelihood of the most likely state sequence.

Parameters

dataSeq	Sequence of observations.
stateSeq	Vector in which the most probable state sequence will be stored.

Returns: Log-likelihood of most probable state sequence.
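A compact log-space version of the Viterbi recursion is sketched below for reference. It is a textbook formulation rather than mlpack's code: ViterbiSketch is a hypothetical name, and the emission probabilities are assumed to be precomputed in emisProb (states by observations).

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Dense matrix stored as a vector of rows: m[i][j] is row i, column j.
using Matrix = std::vector<std::vector<double>>;

// initial[i]:       probability of state i at the first time step.
// transition[i][j]: probability of moving to state i from state j.
// emisProb[i][t]:   probability of observation t under state i.
// Stores the most probable state path in stateSeq and returns its
// log-probability (joint with the observations).
double ViterbiSketch(const std::vector<double>& initial,
                     const Matrix& transition,
                     const Matrix& emisProb,
                     std::vector<std::size_t>& stateSeq)
{
  const std::size_t n = initial.size();
  const std::size_t len = (n == 0) ? 0 : emisProb[0].size();
  stateSeq.assign(len, 0);
  if (len == 0)
    return 0.0;

  const double negInf = -std::numeric_limits<double>::infinity();
  Matrix score(n, std::vector<double>(len, negInf));
  std::vector<std::vector<std::size_t>> back(
      n, std::vector<std::size_t>(len, 0));

  for (std::size_t i = 0; i < n; ++i)
    score[i][0] = std::log(initial[i]) + std::log(emisProb[i][0]);

  for (std::size_t t = 1; t < len; ++t)
  {
    for (std::size_t i = 0; i < n; ++i)
    {
      // Best predecessor j for state i at time t.
      for (std::size_t j = 0; j < n; ++j)
      {
        const double s = score[j][t - 1] + std::log(transition[i][j]);
        if (s > score[i][t])
        {
          score[i][t] = s;
          back[i][t] = j;
        }
      }
      score[i][t] += std::log(emisProb[i][t]);
    }
  }

  // Pick the best final state, then follow the back-pointers.
  std::size_t best = 0;
  for (std::size_t i = 1; i < n; ++i)
    if (score[i][len - 1] > score[best][len - 1])
      best = i;
  stateSeq[len - 1] = best;
  for (std::size_t t = len - 1; t > 0; --t)
    stateSeq[t - 1] = back[stateSeq[t]][t];
  return score[best][len - 1];
}
```

Working with logarithms turns the products of probabilities into sums, which avoids underflow on long sequences; zero probabilities simply become negative infinity and are never selected as a maximum.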

template<typename Distribution = distribution::DiscreteDistribution>

template<typename Archive >

void mlpack::hmm::HMM< Distribution >::Serialize	(	Archive &	ar,
		const unsigned int	version
	)

Serialize the object.


template<typename Distribution = distribution::DiscreteDistribution>

void mlpack::hmm::HMM< Distribution >::Smooth	(	const arma::mat &	dataSeq,
		arma::mat &	smoothSeq
	)		const

HMM smoothing.

Computes the expected emission at each time step, conditioned on all observations; that is, E{ Y[t] | Y[0], ..., Y[T] }. The returned matrix has columns equal to the number of observations. Note that the expectation may not be meaningful for discrete emissions.

Parameters

dataSeq	Sequence of observations.
smoothSeq	Matrix in which the expected emission sequence will be stored.

template<typename Distribution = distribution::DiscreteDistribution>

double mlpack::hmm::HMM< Distribution >::Tolerance ( ) const

inline

Get the tolerance of the Baum-Welch algorithm.

Definition at line 321 of file hmm.hpp.

template<typename Distribution = distribution::DiscreteDistribution>

double& mlpack::hmm::HMM< Distribution >::Tolerance ( )

inline

Modify the tolerance of the Baum-Welch algorithm.

Definition at line 323 of file hmm.hpp.

template<typename Distribution = distribution::DiscreteDistribution>

void mlpack::hmm::HMM< Distribution >::Train ( const std::vector< arma::mat > & dataSeq )

Train the model using the Baum-Welch algorithm, with only the given unlabeled observations.

Any initial guess for the transition matrix and the emission distributions should be supplied through the constructor rather than here. Each matrix in the vector of data sequences holds an individual data sequence; each point in each individual data sequence should be a column in the matrix. The number of rows in each matrix should be equal to the dimensionality of the HMM (which is set in the constructor).

It is preferable to use the other overload of Train(), with labeled data. That will produce much better results. However, if labeled data is unavailable, this will work. In addition, it is possible to use Train() with labeled data first, and then continue to train the model using this overload of Train() with unlabeled data.

The tolerance of the Baum-Welch algorithm can be set either in the constructor or with the Tolerance() method. When the change in log-likelihood of the model between iterations is less than the tolerance, the Baum-Welch algorithm terminates.

Note: Train() can be called multiple times with different sequences; each time it is called, it uses the current parameters of the HMM as a starting point for training.
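The stopping rule described above can be sketched independently of the re-estimation step itself. In the sketch below, RunUntilConverged is a hypothetical helper, iterate stands in for one Baum-Welch pass that returns the new log-likelihood, and maxIterations is an extra safeguard assumed here that the text above does not mention.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <limits>

// Calls `iterate` repeatedly until the log-likelihood it returns changes by
// less than `tolerance` between consecutive calls, or until `maxIterations`
// calls have been made. Returns the number of calls performed.
std::size_t RunUntilConverged(const std::function<double()>& iterate,
                              const double tolerance,
                              const std::size_t maxIterations)
{
  double previous = -std::numeric_limits<double>::infinity();
  for (std::size_t iter = 1; iter <= maxIterations; ++iter)
  {
    const double current = iterate();
    // Stop once the improvement falls below the tolerance.
    if (std::fabs(current - previous) < tolerance)
      return iter;
    previous = current;
  }
  return maxIterations;
}
```

A smaller tolerance means more passes over the data before training stops, so the value trades training time against how closely the parameters settle.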

Parameters

dataSeq Vector of observation sequences.

template<typename Distribution = distribution::DiscreteDistribution>

void mlpack::hmm::HMM< Distribution >::Train	(	const std::vector< arma::mat > &	dataSeq,
		const std::vector< arma::Row< size_t > > &	stateSeq
	)

Train the model using the given labeled observations; the transition and emission matrices are directly estimated.

Each matrix in the vector of data sequences corresponds to a vector in the vector of state sequences. Each point in each individual data sequence should be a column in the matrix, and its state should be the corresponding element in the state sequence vector. For instance, dataSeq[0].col(3) corresponds to the fourth observation in the first data sequence, and its state is stateSeq[0][3]. The number of rows in each matrix should be equal to the dimensionality of the HMM (which is set in the constructor).

Note: Train() can be called multiple times with different sequences; each time it is called, it uses the current parameters of the HMM as a starting point for training.

Parameters

dataSeq	Vector of observation sequences.
stateSeq	Vector of state sequences, corresponding to each observation.

template<typename Distribution = distribution::DiscreteDistribution>

const arma::mat& mlpack::hmm::HMM< Distribution >::Transition ( ) const

inline

Return the transition matrix.

Definition at line 306 of file hmm.hpp.

template<typename Distribution = distribution::DiscreteDistribution>

arma::mat& mlpack::hmm::HMM< Distribution >::Transition ( )

inline

Return a modifiable transition matrix reference.

Definition at line 308 of file hmm.hpp.

Member Data Documentation

template<typename Distribution = distribution::DiscreteDistribution>

size_t mlpack::hmm::HMM< Distribution >::dimensionality

private

Dimensionality of observations.

Definition at line 373 of file hmm.hpp.

Referenced by mlpack::hmm::HMM< distribution::DiscreteDistribution >::Dimensionality().

template<typename Distribution = distribution::DiscreteDistribution>

std::vector<Distribution> mlpack::hmm::HMM< Distribution >::emission

protected

Set of emission probability distributions; one for each state.

Definition at line 363 of file hmm.hpp.

Referenced by mlpack::hmm::HMM< distribution::DiscreteDistribution >::Emission().

template<typename Distribution = distribution::DiscreteDistribution>

arma::vec mlpack::hmm::HMM< Distribution >::initial

private

Initial state probability vector.

Definition at line 370 of file hmm.hpp.

Referenced by mlpack::hmm::HMM< distribution::DiscreteDistribution >::Initial().

template<typename Distribution = distribution::DiscreteDistribution>

double mlpack::hmm::HMM< Distribution >::tolerance

private

Tolerance of Baum-Welch algorithm.

Definition at line 376 of file hmm.hpp.

Referenced by mlpack::hmm::HMM< distribution::DiscreteDistribution >::Tolerance().

template<typename Distribution = distribution::DiscreteDistribution>

arma::mat mlpack::hmm::HMM< Distribution >::transition

protected

Transition probability matrix.

Definition at line 366 of file hmm.hpp.

Referenced by mlpack::hmm::HMM< distribution::DiscreteDistribution >::Transition().

The documentation for this class was generated from the following file:

src/mlpack/methods/hmm/hmm.hpp
