Markov Chain Monte Carlo: New Tools for Bayesian

Markov Chain Monte Carlo: New Tools for Bayesian Modeling in Paleontology
Michael D. Karcher 07

(1),

Steve C. Wang

(1),

Peter Roopnarine

(2),

Kenneth Angielczyk

(3)

(1) Department of Mathematics and Statistics, Swarthmore College (2) Department of Invertebrate Zoology & Geology, California Academy of Sciences (3) Department of Earth Sciences, University of Bristol

Introduction

Mysterious Mass Extinction

Paleontologists often need to model complex systems
with many variables and complex relationships. In
such models, information is often characterized by
high-dimensional statistical distributions that are
difficult to analyze mathematically. In this poster, we
describe the use of Markov Chain Monte Carlo (MCMC)
in generating an approximate sample from any desired
distribution. In so doing, the sample provides
important insights into the nature of the distribution at
hand. MCMC is an iterative approach towards
simulating a sample, creating a sequence of values (or
vector of values) whose distribution more closely
approximates the desired distribution the longer the
chain is allowed to run. More specifically, we present
the Metropolis-Hastings Algorithm, a particular
implementation of MCMC, which easily adapts to highdimensional problems.

The end-Permian or Permian-Triassic extinction event
occurred approximately 251 million years ago and
nearly wiped out all life on Earth. Possibly as few as
ten percent of all species then extant survived the
event. In contrast, approximately half of the species
present 65 million years ago survived the endCretaceous extinction. However, the differences do
not end there. The causes of the end-Cretaceous
event are well-explained by clear physical evidence.
Conversely, less direct physical evidence remains of
the causes (primary extinctions) leading up to the endPermian event. There are numerous theories to
account for the mass extinction, including massive
volcanism (Figure 1) and a meteor impact scenario
similar to the end-Cretaceous event (Figure 2),

In this poster, we use MCMC to gain insight into the
primary causes of the Permian mass extinction. We
use a computer model to simulate secondary
extinctions in a Late Permian food web. Then, using
MCMC methods and Bayesian modeling, we infer the
level of primary producer shutdown needed to cause
observed levels of secondary extinction in Late
Permian Karoo basin fauna.

Food Web Collapse

During any ecological disturbance, repercussions
propagate throughout the entire food web. It is
possible to simulate these effects computationally.
The necessary framework to create these simulations
has already been explored (Roopnarine 2006,
Paleobiology 32:1, 1-19). Using a network graph to
describe the food web in terms of guilds (for
example, Large Herbivores, Small Amphibians,
etc.), the function ceg(x) (Cascading Extinction on
Graphs) takes a vector of primary extinctionsvalues
in [0,1] representing extinctions as a direct result of an
extinction eventand outputs a vector of secondary
extinctionsextinctions as a result of food web
collapse. For example, in a system with three guilds,

ceg(0.45, 0.05, 0.0) = (0.85, 0.75, 0.8),
shows that moderate extinction in one guild can cause
major effects in other guilds, mirroring reality.

Metropolis-Hastings
Algorithm
One particular MCMC technique that adapts well to
high-dimensional problems is the Metropolis-Hastings
Algorithm. It takes a Random-Walk (Figure 3)
approach towards exploring the given distribution.
Suppose we wish to sample from a given distribution
with density function f(x). First, we choose a
jumping distribution g(x|) that is easy to sample
from and depends only on one parameter (e.g. Normal
with mean ). We choose x1 from any point where f(x)
( x* ) gstep
( xt |xxt*of
) the chain, we sample
is positive, and atfeach
r
a new point x* from
f ( the
xt ) gjumping
( x* | xt ) distribution and
calculate:
And then we set:

x* with probability min(r,1)
xt 1
xt otherwise

Fig. 1

Paleontological Uses
Using the data from the simulations of food web
collapse, it is possible to treat the potential primary
extinction scenarios in the end-Permian event as a
statistical distribution and analyze them using MCMC.
In order to do so, we set up a Metropolis-Hastings chain
with points from the possible primary extinctions. We
apply the CEG function (transforming primary
extinction into secondary extinction) to each point
many times to generate a likelihood function by
counting how many of those outputs fall within a
predetermined n-cube around the observed secondary
extinction. Once the chain has sufficiently explored the
distribution, we can apply Bayes Theorem with a flat
prior to calculate the posterior distribution for the
primary extinction scenarios. Using simple statistical
techniques, we can then construct estimates and
confidence intervals about which primary extinction
scenario caused the end-Permian event.

There are several techniques to improve the sample,
including discarding the first part of the chain to reduce the
effect of choosing x1 arbitrarily and discarding all but
every kth step to reduce correlation.

Fig. 2

Markov Chain Monte Carlo
Statisticians often need to explore statistical
distributions that are difficult to analyze directly.
Inconvenient distributions often have pathological or
even no mathematical formulae, domain spaces that
are too large to fully exhaust, or other traits that make
direct analysis nearly impossible. One class of tools
statisticians possess to work through this is Markov
Chain Monte Carlo (MCMC). What binds the tools of
MCMC together is that they are all iterative and
asymptotic processes; they all proceed step-by-step
and get more accurate the longer they are run. The
Markov property plays a large role in the utility of the
techniques. As a given MCMC process iterates, each
step depends only on the last step and no other
information. It can be shown that as the length of a
chain with the Markov property increases, it converges
to some distribution, and methods exist to ensure that
it converges to any desired distribution, even
inconvenient distributions, allowing analysis to
continue.

Fig. 5

Future
Fig. 4

Example: MCMC in Action
The plots in Figure 4 were created using the
Metropolis-Hastings Algorithm in one dimension. It
was run on the food web networks from the Karoo
basin dataset and its objective was to find the true
value of the primary extinction.

Fig. 3: Metropolis-Hastings Random-Walk Markov Chain

For further information: e-mail [email protected]
[email protected]

or

The simulations produce a great deal of data, and
much work still remains in its interpretation. Running
the Algorithm on the real world data produces 24dimensional results. Figure 5 is a collection of
histograms from one of our runs summarizing the data
marginally. More useful inferences can be gathered
from analyzing the variables and their correlations.
However, this is no trivial task in 24 dimensions. A
great deal of time could be spent in future projects
analyzing and determining how to analyze the data
that is produced.

Recently Viewed Presentations

  • Tackling Turbulence with Holographic Particle Image Velocimetry

    Tackling Turbulence with Holographic Particle Image Velocimetry

    The closures being used seldom reflect what's really going in the flow and thus bring major uncertainty to the CFD results. On the other hand, flow velocimetry for decades has been limited to single-point measurement like Hot-Wire and LDV, and...
  • CHAPTER 3 VECTORS NHAA/IMK/UNIMAP INTRODUCTION Definition 3.1 a

    CHAPTER 3 VECTORS NHAA/IMK/UNIMAP INTRODUCTION Definition 3.1 a

    Definition 3.1. a . VECTOR. is a mathematical quantity that has both . MAGNITUDE AND DIRECTION. VECTOR: represented by arrow where the direction of arrow indicates the . DIRECTION. of the vector & the length of arrow indicates the ....
  • Asperger's at UNI

    Asperger's at UNI

    Hans Asperger - Pediatrician in Nazi-occupied Vienna. Studied 4 children with cluster of symptoms. Bravely argued against a Nazi law introduced for "the prevention of offspring suffering from hereditary diseases" History of Asperger'sSyndrome
  • PowerPoint-esitys

    PowerPoint-esitys

    * * Yhteiset opinnot: vieraat kielet, englanti 2 osp Kodinhoito Tehtävät 1-5 Osion kansilehti 24.3.2016 Virpi Peltoniemi Tehtäviä Tehtäviä Tehtäväsivu 24.3.2016 Virpi Peltoniemi Tehtävä 1 Yhdistä käännökset viivalla: a mess likainen messy puhdas tidy sotku tidy up puhdistaa clean sotkuinen...
  • Java Object-Oriented Programming - 國立臺灣大學

    Java Object-Oriented Programming - 國立臺灣大學

    Chapter 27 - Java Object-Oriented Programming Outline 27.1 Introduction 27.2 Superclasses and Subclasses 27.3 protected Members 27.4 Relationship between Superclass Objects and Subclass Objects 27.5 Implicit Subclass-Object-to-Superclass-Object Conversion 27.6 Software Engineering with Inheritance 27.7 Composition vs. Inheritance
  • Habitats and Adaptations - District Five Schools of ...

    Habitats and Adaptations - District Five Schools of ...

    Buttresses- these support a tall tree with shallow roots. Camouflage- this allows an animal to blend in with its environment. Coniferous- this type of tree has tough needles that help it to conserve water during winter. Deciduous- this type of...
  • Presentación de PowerPoint

    Presentación de PowerPoint

    Sin embargo, bien hicisteis en participar conmigo en mi tribulación. 15 . Y sabéis también vosotros, oh filipenses, que al principio de la predicación del evangelio, cuando partí de Macedonia, ninguna iglesia participó conmigo en razón de dar y recibir,...
  • Bio & E/E Warm-Up: 8/26/10

    Bio & E/E Warm-Up: 8/26/10

    AWAKE AT NIGHT; ASLEEP DURING DAY ?: What is an example of a human behavior influenced by circadian rhythm? TIRED AT NIGHT, MORE ALERT DURING DAY ?: What can disrupt a normal circadian rhythm? JET LAG, TIME CHANGES, SLEEPING PILLS...