New representations for a semi-Markov chain and related filters

Paul Malcolm (ANU DST)

07-Jun-2022, 06:00-07:00

Abstract: It is now usual for the null hypothesis for a finite-state stochastic process to be the standard Markov chain; in the absence of any other system knowledge, this is the model most often used. Some reasons for this are: Markov chains are relatively simple, they have been well studied, and much is known about these processes. Added to this, there are now decades of history applying the standard Hidden Markov Model (HMM) to defence science, gene sequencing, health science, machine learning, artificial intelligence and many other areas. In this seminar we will briefly recall two common application domains of estimation with latent Markov processes: 1) parts-of-speech (POS) tagging in natural language processing, and 2) tracking a maneuvering object with a Jump Markov System. Semi-Markov models relax an implicit feature of every state in a first-order time-homogeneous Markov chain, namely that the sojourn random variables of such states are geometrically distributed and are therefore (uniquely) memoryless. In contrast, semi-Markov chains allow arbitrary sojourn models. Consequently, a Hidden semi-Markov Model (HsMM) offers a richer class of model, but retains the classical HMM as a special degenerate case.
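The geometric-versus-arbitrary sojourn contrast in the abstract can be illustrated with a short simulation. The distributions and parameters below are illustrative choices, not taken from the talk: a Markov-chain state with self-transition probability p has a geometric dwell time, while a semi-Markov state may attach any sojourn law (here a shifted Poisson matched to the same mean).

```python
import numpy as np

rng = np.random.default_rng(0)

# Sojourn time of a state in a first-order Markov chain with
# self-transition probability p is geometric, hence memoryless:
p = 0.8
markov_sojourns = rng.geometric(1 - p, size=100_000)  # mean 1/(1-p) = 5

# A semi-Markov chain may attach an arbitrary sojourn law to the same
# state, e.g. a shifted Poisson with the same mean of 5:
semi_markov_sojourns = 1 + rng.poisson(4.0, size=100_000)

# Same mean dwell time, but very different shape (variance 20 vs 4):
print(markov_sojourns.mean(), semi_markov_sojourns.mean())
print(markov_sojourns.var(), semi_markov_sojourns.var())
```

Matching the means while the variances differ by a factor of five is one concrete sense in which the semi-Markov family is strictly richer.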

The main task we address in this seminar concerns model calibration, or parameter estimation, of an HsMM. We develop an Expectation Maximization (EM) algorithm to compute the best-fitting (in the Maximum Likelihood sense) HsMM for a given set of observation data. There are several parts to this task: the first is to derive a recursive filter and smoother for a partially observed semi-Markov chain. The second and more challenging part is to derive filters and smoothers for various processes derived from the latent semi-Markov chain, for example, a counting process that counts the number of transitions between two distinct states labelled "i" and "j", up to and including time k. We will see that estimators for such quantities are non-trivial, largely because of the sojourn dependence in transition probabilities.
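To make the counting process concrete, the sketch below simulates a two-state semi-Markov chain and forms N_k, the number of i → j transitions up to and including time k (here i = 0, j = 1). The sojourn laws are arbitrary placeholders, not the models treated in the talk; estimating such counts from noisy observations of the chain is the hard problem the seminar addresses.

```python
import numpy as np

rng = np.random.default_rng(2)

def sojourn(state):
    # Placeholder sojourn laws: state 0 dwells ~ 1 + Poisson(3),
    # state 1 dwells ~ uniform on {1, ..., 6}.
    return 1 + rng.poisson(3.0) if state == 0 else rng.integers(1, 7)

T = 200
path, state = [], 0
while len(path) < T:
    path.extend([state] * sojourn(state))
    state = 1 - state              # with two states the jump target is forced
path = np.array(path[:T])

# Counting process: N[k] = #{t <= k : X_{t-1} = 0 and X_t = 1}
jumps_01 = (path[:-1] == 0) & (path[1:] == 1)
N = np.concatenate([[0], np.cumsum(jumps_01)])
print(N[-1], "transitions 0 -> 1 in", T, "steps")
```

Computing N from a fully observed path is trivial, as above; the talk's contribution concerns its filtered and smoothed estimates when the chain is only partially observed.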

The estimators we present are all for partially observed joint events, that is, the state of the semi-Markov chain at time "k" and the cumulative time it has remained in this state. This means we are assured of exponential forgetting of initial conditions in our estimators. Separate estimators for individual quantities such as the semi-Markov state alone are easily computed via marginalization.
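A minimal sketch of the joint idea, with all model quantities chosen for illustration only (two states, sojourns capped at D steps, a sojourn-dependent exit hazard, Gaussian-shaped emission weights): the forward recursion runs on the joint event (state, current sojourn), and the state posterior is recovered at the end by marginalizing out the sojourn.

```python
import numpy as np

n_states, D = 2, 10
rng = np.random.default_rng(1)

# hazard[i, d]: probability of leaving state i after a sojourn of exactly
# d+1 steps, given survival of d steps (sojourn-dependent by construction).
h = np.clip(0.1 + 0.08 * np.arange(D), 0.0, 0.95)
hazard = np.tile(h, (n_states, 1))
hazard[:, -1] = 1.0                       # force an exit at the sojourn cap

A = np.array([[0.0, 1.0], [1.0, 0.0]])    # jump destination, given a jump
means = np.array([-1.0, 1.0])             # illustrative emission means

def likelihood(y):
    return np.exp(-0.5 * (y - means) ** 2)

def filter_step(alpha, y):
    """One step of the joint (state, sojourn) forward recursion."""
    stay = alpha * (1.0 - hazard)         # mass surviving in its state
    leave = (alpha * hazard).sum(axis=1)  # total exit mass per state
    new = np.zeros_like(alpha)
    new[:, 1:] = stay[:, :-1]             # surviving mass: sojourn ticks up
    new[:, 0] = A.T @ leave               # jumped mass restarts its sojourn
    new *= likelihood(y)[:, None]         # fold in the new observation
    return new / new.sum()                # normalize

alpha = np.zeros((n_states, D))
alpha[:, 0] = 0.5                         # start with a fresh sojourn
for y in rng.normal(1.0, 1.0, size=50):   # synthetic observations
    alpha = filter_step(alpha, y)

state_posterior = alpha.sum(axis=1)       # marginalize out the sojourn
print(state_posterior)
```

The final line is the marginalization mentioned above: once the joint posterior over (state, sojourn) is filtered, the state-only estimate is a single sum.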

Topics: computational biology; computational engineering, finance, and science; numerical analysis; computational physics

Audience: researchers in the topic


ANU Mathematics and Computational Sciences Seminar

Organizers: Matthew Hole, Quanling Deng*
*contact for this listing
