Hidden Semi-Markov Models: Theory, Algorithms and Applications

About this ebook

Hidden semi-Markov models (HSMMs) are among the most important models in the area of artificial intelligence/machine learning. Since the first HSMM was introduced in 1980 for machine recognition of speech, three other HSMMs have been proposed, with various definitions of duration and observation distributions. Those models have different expressions, algorithms, computational complexities, and applicable areas, without explicitly interchangeable forms.

Hidden Semi-Markov Models: Theory, Algorithms and Applications provides a unified and foundational approach to HSMMs, covering the various HSMMs (such as the explicit-duration, variable-transition, and residual-time HSMMs), inference and estimation algorithms, implementation methods, and application instances. Learn new developments and state-of-the-art emerging topics as they relate to HSMMs, presented with examples drawn from medicine, engineering, and computer science.

  • Discusses the latest developments and emerging topics in the field of HSMMs
  • Includes a description of applications in various areas, including Human Activity Recognition, Handwriting Recognition, Network Traffic Characterization and Anomaly Detection, and Functional MRI Brain Mapping
  • Shows how to master the basic techniques needed for using HSMMs and how to apply them
Language: English
Release date: Oct 22, 2015
ISBN: 9780128027714
Author

Shun-Zheng Yu

Shun-Zheng Yu is a professor at the School of Information Science and Technology at Sun Yat-Sen University, China. He was a visiting scholar at Princeton University and the IBM Thomas J. Watson Research Center from 1999 to 2002. He has authored two hundred journal papers that use artificial intelligence/machine learning methods for inference and estimation, among which fifty involve hidden semi-Markov models. Professor Yu is a well-recognized expert in the field of HSMMs and their applications, and has developed new estimation algorithms for HSMMs and applied them in various fields. His papers "Hidden Semi-Markov Models" (2010), published in the Elsevier journal Artificial Intelligence; "Practical Implementation of an Efficient Forward-Backward Algorithm for an Explicit Duration Hidden Markov Model" (2006), published in IEEE Signal Processing Letters; "A Hidden Semi-Markov Model with Missing Data and Multiple Observation Sequences for Mobility Tracking" (2003), published in the Elsevier journal Signal Processing; and "An Efficient Forward-Backward Algorithm for an Explicit Duration Hidden Markov Model" (2003), published in IEEE Signal Processing Letters, have been cited by hundreds of papers.

    Hidden Semi-Markov Models

    Theory, Algorithms and Applications

    Shun-Zheng Yu

    Table of Contents

    Cover image

    Title page

    Copyright

    Preface

    Acknowledgments

    Chapter 1. Introduction

    Abstract

    1.1 Markov Renewal Process and Semi-Markov Process

    1.2 Hidden Markov Models

    1.3 Dynamic Bayesian Networks

    1.4 Conditional Random Fields

    1.5 Hidden Semi-Markov Models

    1.6 History of Hidden Semi-Markov Models

    Chapter 2. General Hidden Semi-Markov Model

    Abstract

    2.1 A General Definition of HSMM

    2.2 Forward–Backward Algorithm for HSMM

    2.3 Matrix Expression of the Forward–Backward Algorithm

    2.4 Forward-Only Algorithm for HSMM

    2.5 Viterbi Algorithm for HSMM

    2.6 Constrained-Path Algorithm for HSMM

    Chapter 3. Parameter Estimation of General HSMM

    Abstract

    3.1 EM Algorithm and Maximum-Likelihood Estimation

    3.2 Re-estimation Algorithms of Model Parameters

    3.3 Order Estimation of HSMM

    3.4 Online Update of Model Parameters

    Chapter 4. Implementation of HSMM Algorithms

    Abstract

    4.1 Heuristic Scaling

    4.2 Posterior Notation

    4.3 Logarithmic Form

    4.4 Practical Issues in Implementation

    Chapter 5. Conventional HSMMs

    Abstract

    5.1 Explicit Duration HSMM

    5.2 Variable Transition HSMM

    5.3 Variable-Transition and Explicit-Duration Combined HSMM

    5.4 Residual Time HSMM

    Chapter 6. Various Duration Distributions

    Abstract

    6.1 Exponential Family Distribution of Duration

    6.2 Discrete Coxian Distribution of Duration

    6.3 Duration Distributions for Viterbi HSMM Algorithms

    Chapter 7. Various Observation Distributions

    Abstract

    7.1 Typical Parametric Distributions of Observations

    7.2 A Mixture of Distributions of Observations

    7.3 Multispace Probability Distributions

    7.4 Segmental Model

    7.5 Event Sequence Model

    Chapter 8. Variants of HSMMs

    Abstract

    8.1 Switching HSMM

    8.2 Adaptive Factor HSMM

    8.3 Context-Dependent HSMM

    8.4 Multichannel HSMM

    8.5 Signal Model of HSMM

    8.6 Infinite HSMM and HDP-HSMM

    8.7 HSMM Versus HMM

    Chapter 9. Applications of HSMMs

    Abstract

    9.1 Speech Synthesis

    9.2 Human Activity Recognition

    9.3 Network Traffic Characterization and Anomaly Detection

    9.4 fMRI/EEG/ECG Signal Analysis

    References

    Copyright

    Elsevier

    Radarweg 29, PO Box 211, 1000 AE Amsterdam, Netherlands

    The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, UK

    225 Wyman Street, Waltham, MA 02451, USA

    Copyright © 2016 Elsevier Inc. All rights reserved.

    No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

    This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

    Notices

    Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

    Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

    To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

    ISBN: 978-0-12-802767-7

    British Library Cataloguing-in-Publication Data

    A catalogue record for this book is available from the British Library

    Library of Congress Cataloging-in-Publication Data

    A catalog record for this book is available from the Library of Congress

    For information on all Elsevier publications visit our website at http://store.elsevier.com/

    Preface

    A hidden semi-Markov model (HSMM) is a statistical model in which an observation sequence is assumed to be governed by an underlying semi-Markov process with unobserved (hidden) states. Each hidden state has a generally distributed duration, which is associated with the number of observations produced while in the state, and a probability distribution over the possible observations.
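
    As a minimal illustration of this generative view (a sketch only; the two-state model and all parameter values below are hypothetical), the following Python code samples hidden states with explicit durations and emits one observation per time step:

        import numpy as np

        rng = np.random.default_rng(0)

        pi = np.array([0.6, 0.4])            # initial state distribution
        A = np.array([[0.0, 1.0],            # state transitions (no self-transitions:
                      [1.0, 0.0]])           # durations are modeled explicitly)
        P_dur = np.array([[0.2, 0.5, 0.3],   # P(duration = 1, 2, 3 | state)
                          [0.6, 0.3, 0.1]])
        B = np.array([[0.9, 0.1],            # P(observation | state)
                      [0.2, 0.8]])

        def sample_hsmm(T):
            """Generate T observations from the hypothetical HSMM above."""
            states, obs = [], []
            s = rng.choice(2, p=pi)
            while len(obs) < T:
                d = rng.choice(3, p=P_dur[s]) + 1          # generally distributed duration
                for _ in range(min(d, T - len(obs))):      # one observation per step in s
                    states.append(int(s))
                    obs.append(int(rng.choice(2, p=B[s])))
                s = rng.choice(2, p=A[s])                  # jump to the next state
            return states, obs

        states, obs = sample_hsmm(20)

    Here P_dur plays the role of the generally distributed state duration and B the observation distribution.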

    Based on this model, the model parameters can be estimated and updated, the predicted, filtered, and smoothed probabilities of a partial observation sequence can be determined, the goodness of fit of the observation sequence to the model can be evaluated, and the best state sequence of the underlying semi-Markov process can be found.

    Because of these capabilities, the HSMM has become one of the most important models in the area of artificial intelligence/machine learning. Since the HSMM was initially introduced in 1980 for machine recognition of speech, it has been applied in more than forty scientific and engineering areas with thousands of published papers, such as speech recognition/synthesis, human activity recognition/prediction, network traffic characterization/anomaly detection, fMRI/EEG/ECG signal analysis, and equipment prognosis/diagnosis.

    Since the first HSMM was introduced in 1980, three other basic HSMMs and several variants of them have been proposed in the literature, with various definitions of duration distributions and observation distributions. Those models have different expressions, algorithms, computational complexities, and applicable areas, without explicitly interchangeable forms. A unified definition, an in-depth treatment, and a foundational approach to the HSMMs are in strong demand to explore the general issues and theories behind them.

    However, in contrast to the large number of published papers related to HSMMs, there are only a few review articles and chapters on HSMMs, and none of them aims to fill this demand. Moreover, all of the existing reviews were published several years ago, and the new developments and emerging topics that have since surfaced in this field need to be summarized.

    Therefore, this book is intended to cover the models, theory, methods, applications, and latest developments in this field. In summary, this book provides:

    • a unified definition, in-depth treatment, and foundational approach to the HSMMs;

    • a survey on the latest development and emerging topics in this field;

    • examples that help general readers, teachers, and students in computer science and engineering understand the topics;

    • a brief description of applications in various areas;

    • an extensive list of references to the HSMMs.

    For these purposes, this book presents nine chapters in three parts. In the first part, this book defines a unified model of HSMMs, and discusses the issues related to the general HSMM, which include:

    1. the forward–backward algorithms, which are the fundamental algorithms of HSMM, for evaluating the joint probabilities of a partial observation sequence (a minimal sketch follows this list);

    2. computation of the predicted/filtered/smoothed probabilities, expectations, and the likelihood function of observations, which are necessary for inference in HSMM;

    3. the maximum a posteriori estimation of states and the estimation of the best state sequence by the Viterbi HSMM algorithm;

    4. the maximum-likelihood estimation, training, and online updating of model parameters, with proof of the re-estimation algorithms by the EM algorithm;

    5. practical issues in the implementation of the forward–backward algorithms.
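
    As promised in item 1 above, here is a minimal Python sketch of the forward pass for a simplified explicit-duration HSMM (an illustration only, not the book's general algorithm; it assumes discrete observations and that the last state segment ends exactly at the final observation):

        import numpy as np

        def hsmm_forward(obs, pi, A, P_dur, B):
            """alpha[t, j] = P(o_1..o_t, state j ends at time t).
            pi: (N,) initial distribution; A: (N, N) transitions (zero diagonal);
            P_dur: (N, D) duration probabilities, P_dur[j, d-1] = P(d | j);
            B: (N, M) emission probabilities for discrete observations."""
            obs = np.asarray(obs)
            T = len(obs)
            N, D = P_dur.shape
            alpha = np.zeros((T + 1, N))                # row t = 0 is a placeholder
            for t in range(1, T + 1):
                for j in range(N):
                    for d in range(1, min(D, t) + 1):
                        emit = np.prod(B[j, obs[t - d:t]])   # emissions while in j
                        if t - d == 0:
                            start = pi[j]                     # first segment of sequence
                        else:
                            start = alpha[t - d] @ A[:, j]    # some state ended at t - d
                        alpha[t, j] += start * P_dur[j, d - 1] * emit
            return alpha, alpha[T].sum()    # likelihood under the end-at-T assumption

        # Example with made-up parameters:
        pi = np.array([0.6, 0.4])
        A = np.array([[0.0, 1.0], [1.0, 0.0]])
        P_dur = np.array([[0.2, 0.5, 0.3], [0.6, 0.3, 0.1]])
        B = np.array([[0.9, 0.1], [0.2, 0.8]])
        _, lik = hsmm_forward([0, 0, 1, 1, 1, 0], pi, A, P_dur, B)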

    By introducing certain assumptions and constraints on the state transitions, the general HSMM reduces to the conventional HSMMs, including the explicit duration HMM, the variable transition HMM, and the residual time HMM. These conventional models have different capabilities in modeling applications, with different computational complexities and memory requirements for the forward–backward algorithms and the model estimation.

    In the second part, this book discusses the state duration distributions and the observation distributions, which can be nonparametric or parametric depending on the specific preference of the applications.

    Among the parametric distributions, the most popular ones are the exponential family distributions, such as Poisson, exponential, Gaussian, and gamma. A mixture of Gaussian distributions is also widely used to express complex distributions.

    Beyond the exponential family and mixture distributions, the Coxian distribution of state duration can represent any discrete probability density function, and its underlying series–parallel network also reveals the structure of different HSMMs.
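
    As a hedged sketch of one common parameterization (generic symbols, not necessarily the book's notation): the process enters phase 1; the sojourn in phase $m$ is geometric with parameter $\mu_m$; after completing phase $m$ it terminates with probability $x_m$ (with $x_M = 1$) or proceeds to phase $m+1$ with probability $1 - x_m$. The duration $D$ then has the density

    $$P(D = d) = \sum_{m=1}^{M} \Big( x_m \prod_{k=1}^{m-1} (1 - x_k) \Big) \big( G_{\mu_1} * G_{\mu_2} * \cdots * G_{\mu_m} \big)(d), \qquad G_{\mu}(d) = (1 - \mu)^{d-1} \mu,$$

    where $*$ denotes discrete convolution; the chain of phases with parallel exits is the series–parallel network mentioned above.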

    A multispace probability distribution is applied to express a composition of observation spaces of different dimensions, or a mixture of continuous and discrete observations. A segmental model of an observation sequence is used to describe parametric trajectories that change over time. An event sequence model is used to model and handle an observation sequence with missing observations.

    In the third part, this book discusses variants and applications of HSMMs. Among the variants of HSMMs, a switching HSMM allows the model parameters to be changed in different time periods. An adaptive factor HSMM allows the model parameters to be a function of time. A context-dependent HSMM lets the model parameters be determined by a given series of contextual factors. A multichannel HSMM describes multiple interacting processes. A signal model of HSMM uses an equivalent form to express an HSMM.

    There usually exists a class of HSMMs specialized for the applications in a given area. For example, in the area of speech synthesis, speech features (observations to be obtained), instead of the model parameters, are to be determined. In the area of human activity recognition, the unobserved activity (hidden state) is to be estimated. In the area of network traffic characterization/anomaly detection, the performance/health of the entire network is to be evaluated. In the area of fMRI/EEG/ECG signal analysis, neural activation is to be detected.

    Acknowledgments

    I would like to thank Dr Yi XIE, Bai-Chao LI, and Jian-Zheng LUO, who collected a large number of papers related to HSMMs and sorted them by their relevance to the applicable theories, algorithms, and applications. Their work was instrumental in completing the book on time. I also want to express my gratitude to the reviewers who carefully read the draft and provided me with many valuable comments and suggestions. Dr Yi XIE, Bai-Chao LI, Wei-Tao WU, Qin-Liang LIN, Xiao-Fan CHEN, Yan LIU, Jian KUANG, and Guang-Rui WU proofread the chapters. Without their tremendous effort and help, it would have been extremely difficult for me to finish this book as it is now.

    Chapter 1

    Introduction

    Abstract

    A hidden semi-Markov model (HSMM) can be considered an extension of a hidden Markov model (HMM) by allowing the underlying process to be a semi-Markov process, or an extension of a semi-Markov process by allowing the states to be hidden and their emissions to be observable. The conditional dependencies among the random variables of an HSMM can be described by a directed acyclic graph (dynamic Bayesian network: DBN) or an undirected probabilistic graphical model (conditional random field: CRF). This chapter reviews these models, which are closely related to HSMMs, including the Markov renewal process, the semi-Markov process, HMMs, DBNs, and CRFs. Then the concepts and terms of HSMMs are illustrated, and the history of HSMMs is briefly introduced.

    Keywords

    Markov renewal process; semi-Markov process; hidden Markov model (HMM); dynamic Bayesian network (DBN); conditional random field (CRF)

    This chapter reviews some topics that are closely related to hidden semi-Markov models, and introduces their concepts and brief history.

    1.1 Markov Renewal Process and Semi-Markov Process

    In this section, we briefly review the Markov renewal process and the semi-Markov process, as well as the generalized semi-Markov process and the discrete-time semi-Markov process.

    1.1.1 Markov Renewal Process

    A renewal process is a generalization of a Poisson process that allows arbitrary holding times. Its applications include, for example, planning for the replacement of worn-out machinery in a factory. A Markov renewal process is a generalization of a renewal process in which the sequence of holding times is not independent and identically distributed; their distributions depend on the states of a Markov chain. Markov renewal processes were studied by Pyke (1961a, 1961b) in the 1960s. They are applied in M/G/1 queuing systems, machine repair problems, etc. (Cinlar, 1975).

    Consider a stochastic process with successive states $S_0, S_1, S_2, \ldots$ and jump times $T_0 \le T_1 \le T_2 \le \cdots$, where $S_n$ is the state entered at time $T_n$, $n \ge 0$. If

    $$P(S_{n+1}=j,\ T_{n+1}-T_n \le t \mid S_0, \ldots, S_n;\ T_0, \ldots, T_n) = P(S_{n+1}=j,\ T_{n+1}-T_n \le t \mid S_n),$$

    then the sequence $\{(S_n, T_n)\}$ is called a Markov renewal process. Define the state transition probabilities by

    $$p_{ij}(t) = P(S_{n+1}=j,\ T_{n+1}-T_n \le t \mid S_n = i),$$

    where $T_{n+1} > T_n$, that is, at any time epoch multiple transitions are not allowed. Let $N(t) = \max\{n: T_n \le t\}$ count the number of jumps in $(0, t]$. Then $\{N(t), t \ge 0\}$ is called a Markov renewal counting process.
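
    As a hedged illustration of these definitions, the following Python sketch (hypothetical parameters; for simplicity the holding-time law here depends only on the current state, although in general it may also depend on the successor state) simulates the jump times $T_n$, the states $S_n$, and the counting process $N(t)$:

        import numpy as np

        rng = np.random.default_rng(1)

        # Embedded Markov chain of a hypothetical two-state Markov renewal process.
        P = np.array([[0.3, 0.7],
                      [0.5, 0.5]])
        # Holding-time samplers for each state (illustrative choices only).
        hold = [lambda: rng.exponential(2.0),   # state 0: exponential, mean 2
                lambda: rng.gamma(3.0, 1.0)]    # state 1: gamma, mean 3

        def simulate(t_max):
            """Return the jump times T_n and states S_n up to the horizon t_max."""
            t, s = 0.0, 0
            jumps = [(0.0, 0)]                  # (T_0, S_0)
            while t < t_max:
                t += hold[s]()                  # sojourn X_{n+1} in the current state
                s = int(rng.choice(2, p=P[s]))  # next state from the embedded chain
                jumps.append((t, s))
            return jumps

        def N(t, jumps):
            """Markov renewal counting process: number of jumps in (0, t]."""
            return sum(1 for tn, _ in jumps[1:] if tn <= t)

        jumps = simulate(20.0)
        print(N(10.0, jumps))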

    1.1.2 Semi-Markov Process

    A semi-Markov process is equivalent to a Markov renewal process in many aspects, except that a state is defined at every given time in the semi-Markov process, not just at the jump times. Therefore, the semi-Markov process is an actual stochastic process that evolves over time. Semi-Markov processes were introduced by Lévy (1954) and Smith (1955) in the 1950s and are applied in queuing theory and reliability theory.

    Let $X_n = T_n - T_{n-1}$, $n \ge 1$, denote the intervals between successive jumps; the $X_n$ are the sojourn times in the states. Every transition from a state to the next state is made instantaneously at the jump times, and the state of the semi-Markov process at time $t$ is $Z(t) = S_{N(t)}$.

    For a time-homogeneous semi-Markov process, the transition density functions are

    $$q_{ij}(\tau) = \frac{d}{d\tau}\, P(S_{n+1}=j,\ T_{n+1}-T_n \le \tau \mid S_n = i).$$

    That is, $q_{ij}(\tau)$ is the probability density function that, after having entered state $i$ at time zero, the process transits to state $j$ at time $\tau$. They must satisfy

    $$\sum_{j} \int_0^{\infty} q_{ij}(\tau)\, d\tau = 1.$$

    That is, state $i$ will eventually be left. The density function of the waiting (sojourn) time in state $i$ is

    $$w_i(\tau) = \sum_{j} q_{ij}(\tau),$$

    and the probability that the process will not make a transition out of state $i$ within time $\tau$ (the survivor function of the sojourn time) is thus

    $$W_i(\tau) = 1 - \int_0^{\tau} w_i(u)\, du.$$

    Suppose the current time is $t$. Because the time already spent in the current state generally affects the remaining sojourn time, only when the sojourn times are exponentially distributed, and hence memoryless, is $\{Z(t)\}$ a continuous-time homogeneous Markov process.

    The semi-Markov process can be generated by different types of random mechanisms (Nunn and Desiderio, 1977). For instance:

    1. Upon entering state $i$, the process first selects the successor state $j$ according to the transition probabilities $p_{ij}$, and then selects the holding time according to a density $f_{ij}(\tau)$ that may depend on both $i$ and $j$. In this model,

    $$q_{ij}(\tau) = p_{ij}\, f_{ij}(\tau).$$

    2. Upon entering state $i$, the process first selects the waiting time $\tau$, and then randomly determines the successor state $j$ with a probability $p_{ij}(\tau)$ that may depend on $\tau$, so that $q_{ij}(\tau) = w_i(\tau)\, p_{ij}(\tau)$, where $w_i(\tau)$ is the density function of the waiting time for transition out of state $i$ defined by

    $$w_i(\tau) = \sum_{j} q_{ij}(\tau).$$
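
    To make the two mechanisms concrete, here is a minimal Python sketch (the three-state chain and all densities below are hypothetical and not matched to each other; each function draws one transition under the named mechanism):

        import numpy as np

        rng = np.random.default_rng(2)

        # Hypothetical embedded transition probabilities (zero diagonal).
        P = np.array([[0.0, 0.7, 0.3],
                      [0.5, 0.0, 0.5],
                      [0.4, 0.6, 0.0]])

        def next_jump_mechanism_1(i):
            """Mechanism 1: pick the successor j first, then a holding time
            from f_ij (here exponential with a mean depending on (i, j))."""
            j = int(rng.choice(3, p=P[i]))
            tau = rng.exponential(1.0 + i + j)
            return j, tau

        def next_jump_mechanism_2(i):
            """Mechanism 2: pick the waiting time from w_i first, then the
            successor j (here p_ij(tau) is constant in tau for simplicity)."""
            tau = rng.exponential([2.0, 3.0, 1.5][i])
            j = int(rng.choice(3, p=P[i]))
            return j, tau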
