Models
Overview
tMAVEN provides many built-in models that you can run on your data. Generally, they are divided into several categories. On this page, we discuss each model, and then provide timings for how long they take to run on a typical dataset.
Description of Models
Mixture Models
threshold
A threshold is applied and datapoints on each side of the threshold are clustered together.
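As a minimal sketch of this idea (not tMAVEN's own implementation), the function below assigns each datapoint to one of two clusters depending on which side of a threshold it falls. The name `threshold_cluster` and the example values are hypothetical.

```python
def threshold_cluster(data, threshold):
    """Label each datapoint 0 (below the threshold) or 1 (at or above it)."""
    return [1 if x >= threshold else 0 for x in data]


# Hypothetical signal values, for illustration only
signal = [0.12, 0.15, 0.81, 0.78, 0.20]
labels = threshold_cluster(signal, threshold=0.5)
print(labels)
```

Points exactly at the threshold are placed in the upper cluster here; that tie-breaking rule is an assumption of this sketch.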
kmeans
The K-means clustering algorithm is used to cluster the datapoints into
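For readers unfamiliar with K-means, here is a minimal one-dimensional sketch of the standard Lloyd iteration in pure Python. It is an illustration under stated assumptions, not the code tMAVEN runs: the function name `kmeans_1d`, the argument `k` (standing in for the number of clusters), the deterministic initialization, and the example data are all hypothetical.

```python
def kmeans_1d(data, k, n_iter=100):
    """Lloyd's algorithm for 1-D data; returns (centers, labels)."""
    ordered = sorted(data)
    # Deterministic initialization (an assumption of this sketch):
    # spread the initial centers across the sorted data
    centers = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    labels = [0] * len(data)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center
        labels = [min(range(k), key=lambda j: (x - centers[j]) ** 2) for x in data]
        # Update step: each center moves to the mean of its assigned points
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break  # converged: the centers no longer move
        centers = new_centers
    return centers, labels


# Hypothetical two-level signal, for illustration only
signal = [0.10, 0.12, 0.14, 0.80, 0.82, 0.84]
centers, labels = kmeans_1d(signal, k=2)
print(centers, labels)
```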
mlgmm
A maximum likelihood Gaussian mixture model (GMM) clustering algorithm is used to cluster the datapoints into
vbgmm
A variational Bayes Gaussian mixture model (GMM) clustering algorithm is used to cluster the datapoints into
Individual HMMs
mlhmm
This is a separate maximum likelihood HMM for each trajectory. No ensemble model is generated, so no statistics are provided and much of the plotting functionality is unavailable.
vbhmm
This is a separate variational Bayes HMM (i.e., vbFRET) for each trajectory. No ensemble model is generated, so no statistics are provided and much of the plotting functionality is unavailable.
Composite HMMs
Composite HMMs are created by modeling each trajectory with its own HMM. We do this by using model selection (where appropriate; see below). Thus we run the one through
kmeans_mlhmm
A maximum likelihood HMM is run on each trajectory using
kmeans_vbhmm
A variational Bayes HMM (i.e., vbFRET) is run on each trajectory using model selection from one to
vbgmm_vbhmm
A variational Bayes HMM (i.e., vbFRET) is run on each trajectory using model selection from one to
threshold_vbhmm
A variational Bayes HMM (i.e., vbFRET) is run on each trajectory using model selection from one to
threshold_vbconhmm
A global variational Bayes HMM is run on all trajectories using
Global HMMs
vbconhmm
This is a global variational Bayes HMM. It is conceptually similar to vbFRET, but all of the trajectories are assumed to be independent and identically distributed (IID). This means that they will all obey the same HMM.
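To make the IID assumption concrete: when every trajectory obeys the same HMM, the joint log-likelihood is the sum of the per-trajectory log-likelihoods under one shared set of parameters. The sketch below shows only that bookkeeping with a standard forward algorithm; it is not the variational Bayes inference that vbconhmm performs. The function names, the Gaussian emissions with a single shared `sigma`, and the example numbers are assumptions.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_loglik(obs, means, sigma, log_pi, log_A):
    """Log-likelihood of one trajectory under a Gaussian-emission HMM."""
    k = len(means)

    def log_emit(x, j):
        # Log density of a normal distribution with mean means[j] and std sigma
        return -0.5 * ((x - means[j]) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

    # alpha[j]: log-probability of the data so far, ending in state j
    alpha = [log_pi[j] + log_emit(obs[0], j) for j in range(k)]
    for x in obs[1:]:
        alpha = [
            logsumexp([alpha[i] + log_A[i][j] for i in range(k)]) + log_emit(x, j)
            for j in range(k)
        ]
    return logsumexp(alpha)


def global_loglik(trajectories, means, sigma, log_pi, log_A):
    """IID assumption: sum the log-likelihoods, sharing one set of parameters."""
    return sum(forward_loglik(t, means, sigma, log_pi, log_A) for t in trajectories)


# Hypothetical two-state parameters and trajectories, for illustration only
means = [0.1, 0.9]
log_pi = [math.log(0.5), math.log(0.5)]
log_A = [[math.log(0.9), math.log(0.1)], [math.log(0.1), math.log(0.9)]]
trajs = [[0.1, 0.2, 0.9], [0.8, 0.9, 0.1, 0.1]]
total = global_loglik(trajs, means, 0.1, log_pi, log_A)
print(total)
```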
ebhmm
This is an empirical Bayes HMM (i.e., ebFRET). The model provided is the empirical prior. This is a pseudo-global method in that it also models each trajectory individually. The idealized (Viterbi) paths in the plot are from the individual posteriors. Parameters are from the empirical prior.
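An idealized (Viterbi) path is the single most probable sequence of hidden states for a trajectory. The following is a minimal sketch of the standard Viterbi recursion for an HMM with Gaussian emissions, written in the log domain. It is not tMAVEN's code: the function name, the single shared `sigma`, and all example parameters are assumptions chosen for illustration.

```python
import math


def viterbi(obs, means, sigma, log_pi, log_A):
    """Most probable state path for a Gaussian-emission HMM (log domain)."""
    k = len(means)

    def log_emit(x, j):
        # Log density of a normal distribution with mean means[j] and std sigma
        return -0.5 * ((x - means[j]) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

    # delta[j]: best log-probability of any path that ends in state j
    delta = [log_pi[j] + log_emit(obs[0], j) for j in range(k)]
    backptr = []
    for x in obs[1:]:
        prev = delta
        # For each state j, record which previous state i gives the best path
        best = [max(range(k), key=lambda i: prev[i] + log_A[i][j]) for j in range(k)]
        delta = [prev[best[j]] + log_A[best[j]][j] + log_emit(x, j) for j in range(k)]
        backptr.append(best)
    # Trace back from the best final state
    state = max(range(k), key=lambda j: delta[j])
    path = [state]
    for best in reversed(backptr):
        state = best[state]
        path.append(state)
    return path[::-1]


# Hypothetical two-state parameters and data, for illustration only
obs = [0.1, 0.15, 0.9, 0.85, 0.1]
means = [0.1, 0.9]
log_pi = [math.log(0.5), math.log(0.5)]
log_A = [[math.log(0.9), math.log(0.1)], [math.log(0.1), math.log(0.9)]]
path = viterbi(obs, means, 0.1, log_pi, log_A)
print(path)
```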
Model Selection
These are variations of several models discussed above. Specifically for the Bayesian-based methods, we use the maximum evidence or evidence lower bound (ELBO) to identify the optimal number of states. This works by running the same type of model several times, each time using a different number of states. The variation with the largest evidence/ELBO is chosen as the best model. Generally, you want to run the one-state through at least the six-state model.
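The selection loop described above can be sketched as follows. This is a generic illustration rather than tMAVEN's API: `select_by_elbo` and `fit_fn` are hypothetical names, where `fit_fn(data, nstates)` stands in for running one model with a given number of states and returning its evidence/ELBO. The `toy_elbo` scorer is a placeholder that returns made-up scores, not a real ELBO.

```python
def select_by_elbo(fit_fn, data, max_states=6):
    """Run models with 1..max_states states; return (best_nstates, scores)."""
    scores = {n: fit_fn(data, n) for n in range(1, max_states + 1)}
    # The model with the largest evidence/ELBO is chosen
    best = max(scores, key=scores.get)
    return best, scores


def toy_elbo(data, nstates):
    # Placeholder score, NOT a real ELBO: it peaks at two states by construction
    return -float((nstates - 2) ** 2)


best, scores = select_by_elbo(toy_elbo, data=None, max_states=6)
print(best)
```

In practice `fit_fn` would wrap an actual model fit, so the total cost grows with the number of candidate state counts.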
vbgmm_modelselection
This is a variational Bayes GMM (mixture model) with model selection from one to
vbhmm_modelselection
This is a variational Bayes HMM (i.e., vbFRET) with model selection from one to
vbconhmm_modelselection
This is a global variational Bayes HMM (i.e., global vbFRET) with model selection from one to
vbgmm_vbhmm_modelselection
This is a composite variational Bayes HMM (i.e., composite vbFRET). Model selection with the vbhmm is performed from one to
Timing
Use test_timing.py to run all of the models on the test dataset (L1-tRNA; ribosomal complex with tRNA
Apple M2 Pro
Mixture | Time (s) |
---|---|
threshold | 0.107 |
kmeans | 0.167 |
mlgmm | 1.024 |
vbgmm | 0.287 |
HMM | Time (s) |
---|---|
mlhmm | 0.453 |
vbhmm | 0.508 |
Global | Time (s) |
---|---|
vbconhmm | 8.230 |
ebhmm | 6.479 |
Composite | Time (s) |
---|---|
kmeans_mlhmm | 0.529 |
kmeans_vbhmm | 1.783 |
vbgmm_vbhmm | 1.913 |
threshold_vbhmm | 1.671 |
threshold_vbconhmm | 14.339 |
w/ Model Selection (1-6) | Time (s) |
---|---|
vbgmm_modelselection | 3.663 |
vbhmm_modelselection | 13.550 |
vbconhmm_modelselection | 101.481 |
vbgmm_vbhmm_modelselection | 41.001 |
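Wall-clock timings like those in the tables above can be collected with a small helper around `time.perf_counter`. This is a generic sketch and does not reproduce the contents of test_timing.py; the helper name `time_call` is hypothetical, and the built-in `sum` stands in for a model run.

```python
import time


def time_call(fn, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call to fn."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# `sum` is only a stand-in workload for a model run
result, elapsed = time_call(sum, range(1_000_000))
print(f"sum | {elapsed:.3f} |")
```

A single call gives a rough figure; repeating the call and reporting the spread is one way to get a more stable number.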