---
title: "Hidden Markov Models"
author: "Carl Herrmann"
date: "27/01/2020"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE,cache=TRUE)
```
## Principle
Working with HMMs is a two-phase problem:

1. **learn** the parameters of the HMM from observed **training** data (Baum-Welch algorithm, based on expectation maximization)
2. use the learned model on new **test data** to compute the likelihood of the observed data and the most likely chain of unobserved states
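As a minimal sketch of the two phases on made-up data (the two-state "coin" model, its parameters, and the sequence lengths below are all invented for illustration):

```{r}
library(HMM)
set.seed(1)

# a made-up ground-truth model emitting heads/tails
truth = initHMM(c("S1","S2"), c("h","t"),
                transProbs    = matrix(c(0.9,0.1, 0.2,0.8), 2, byrow=TRUE),
                emissionProbs = matrix(c(0.5,0.5, 0.9,0.1), 2, byrow=TRUE))
train = simHMM(truth, 500)$observation   # simulated training data

# phase 1: learn parameters with Baum-Welch, starting from a random model
rnd = function(k) {x = matrix(runif(2*k), 2); x/rowSums(x)}
init = initHMM(c("S1","S2"), c("h","t"),
               transProbs = rnd(2), emissionProbs = rnd(2))
fit = baumWelch(init, train, 50)

# phase 2: decode the most likely state path on new data
viterbi(fit$hmm, simHMM(truth, 20)$observation)
```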
## Training data: Julius Caesar
We use a text corpus taken from Shakespeare's Julius Caesar.
```{r}
library(HMM)
library(lattice)
corpus = readChar('https://www.dropbox.com/s/plwknlglf29qjv2/ceasar.txt?dl=1',100000)
```
We remove unwanted characters such as line breaks, and map everything that is not a lowercase letter to an underscore; our set of possible outcomes is then the 26 letters of the alphabet plus the underscore symbol (27 symbols in total):
```{r}
corpus = gsub('[\n\r]+', '', corpus)                            # strip line breaks
a = unlist(strsplit(gsub("[^a-z]", "_", tolower(corpus)), ""))  # one character per entry
letter.labels = c("_", letters)
```
Let's see:
```{r}
substr(corpus,1,1000)
```
The variable `a` contains the corpus as a vector of single characters:
```{r}
head(a,n=10)
```
We define a small helper function, needed later on, which normalizes a non-negative vector so that it sums to one (i.e. turns it into a probability distribution):
```{r}
# normalize a non-negative vector into a probability distribution
prob = function(x) {x/sum(x)}
```
## Learning the parameters: Baum-Welch
We first need to specify the number of hidden states that we expect in our data; here, we start with `n = 2`, but we can try changing this parameter!
```{r}
n = 2
```
In order to learn the parameters, we initialize the emission and transition matrices, and run the Baum-Welch algorithm with a specified number of iterations.
We start by generating a random transition and emission probability matrix:
```{r}
# initialize emission probabilities; note that apply(..., 1, prob) returns
# the result transposed, so B0 is 27 x n with *columns* summing to one
B0 = apply(matrix(runif(n*27), n), 1, prob)
# initialize transition probabilities (n x n, columns summing to one)
A0 = apply(matrix(runif(n^2), n), 1, prob)
```
We now initialize an HMM object which, at this point, only contains the random matrices. Note that `initHMM` expects the transition matrix as states x states and the emission matrix as states x symbols, with each *row* summing to one; since `apply` returned `A0` and `B0` transposed (columns summing to one), we pass the transposes:
```{r}
hmm0 = initHMM(paste0("State ", 1:n), letter.labels,
               startProbs = prob(runif(n)),
               transProbs = t(A0),
               emissionProbs = t(B0))
```
We are now ready to run the Baum-Welch algorithm for a given number of iterations:
```{r}
nit = 200
hmm = baumWelch(hmm0, a, nit)
```
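The object returned by `baumWelch` also records how much the parameters changed at each iteration (`difference`); plotting it gives a quick convergence check:

```{r}
# parameter change per Baum-Welch iteration; a flat tail indicates convergence
plot(hmm$difference, type = 'l',
     xlab = 'iteration', ylab = 'parameter change')
```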
### Looking at the learned parameters
We can now check the learned emission probabilities, and compare them with the random initialized probabilities:
```{r}
## Plot the learned emission parameters
par(mfrow = c(2,1), mar = c(4,2,2,2))
barplot(t(B0)[,-1], beside = TRUE, names = letter.labels[-1], main = 'Initial',
        col = c('red','lightgreen'))
barplot(hmm$hmm$emissionProbs[,-1], beside = TRUE, main = 'Final',
        col = c('red','lightgreen'))
legend('topright', legend = c('State 1','State 2'), fill = c('red','lightgreen'))
```
Can you identify what the two states correspond to?
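One way to check a common hypothesis, namely that the two states separate vowels from consonants, is to sum the learned emission probability mass each state assigns to vowels versus consonants; a sketch (which state plays which role depends on the random initialization):

```{r}
vowels = c("a","e","i","o","u")
E = hmm$hmm$emissionProbs
rowSums(E[, vowels])                      # emission mass on vowels, per state
rowSums(E[, setdiff(letters, vowels)])    # emission mass on consonants, per state
```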
## Applying the model to new data
We can now apply the forward and Viterbi algorithms to a new sentence:
```{r}
sentence = 'this_is_a_sentence_which_should_be_analysed_by_my_hidden_markov_model'
test = unlist(strsplit(gsub("[^a-z]", "_", tolower(sentence)), ""))
```
We now apply the forward algorithm, which uses a dynamic programming approach to compute the intermediate probabilities; the output is a table of these probabilities (given as ln(p)), with one row per state and one column per position in the sentence:
```{r}
## apply forward
f = forward(hmm$hmm,test)
heatmap(f,Rowv = NA,Colv=NA)
```
```{r}
f
```
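The probability of the whole sentence under the model is the sum over states of the last column of the forward table; since the table stores ln(p), we combine the entries with a numerically stable log-sum-exp:

```{r}
last = f[, ncol(f)]                   # ln(p) of ending in each state
m = max(last)
loglik = m + log(sum(exp(last - m)))  # numerically stable log-sum-exp
loglik
```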
We can also apply the Viterbi algorithm to the new sentence to check the most likely states:
```{r}
## apply viterbi
v = viterbi(hmm$hmm,test)
data.frame(prediction=v,sentence=test)
```
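Viterbi returns the single most likely state *path*; the `posterior` function of the `HMM` package instead computes, via the forward-backward algorithm, the per-position probability of each state. The per-position most probable state can differ from the Viterbi path:

```{r}
## posterior state probabilities (forward-backward)
post = posterior(hmm$hmm, test)
data.frame(prediction = v,
           posterior_state = rownames(post)[apply(post, 2, which.max)],
           sentence = test)
```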