Update Monte Carlo method descriptions
# Metropolis-Hastings algorithm

Another way to sample a probability distribution is to use a **Markov Chain Monte Carlo** method. In these methods, we generate a sequence (or chain) of $x$ values that sample a given probability distribution $f(x)$. Each sample $x_i$ is generated from the previous one $x_{i-1}$ in some probabilistic way, so $x_i$ depends only on the state before it, $x_{i-1}$ (this is the Markov chain part). By choosing the dependence of $x_i$ on $x_{i-1}$ in an appropriate way, we can generate a sequence $\{x_i\}$ of random samples from any given probability distribution $f(x)$.

The **Metropolis-Hastings algorithm** is a simple method to do this. To generate the next value in the chain $x_{i+1}$ given the current value $x_i$, the procedure is as follows:

1. Propose a jump from the current value $x_i=x$ to a new candidate value $x^\prime$ according to a proposal distribution $p(x^\prime|x)$. The proposal distribution should be symmetric, i.e. the probability of choosing a new location $x^\prime$ given that we are now at $x$ should be the same as the probability of choosing $x$ if we were located at $x^\prime$. For example, in 1D we could choose $x^\prime = x_i + \Delta x$ with $\Delta x$ drawn from a Gaussian distribution.

2. Evaluate the ratio $\alpha = f(x^\prime)/f(x)$. If $\alpha>1$, i.e. the new point is located in a region with larger $f(x)$, then we accept the move and set $x_{i+1} = x^\prime$. If $\alpha<1$, we accept the move with probability $\alpha$. If the move is rejected, then we set $x_{i+1} = x_i$. (Note that this entire step can be achieved by choosing a uniform deviate $u$ between 0 and 1, and accepting the move if $u<\alpha$.)

The size of the jump in step 1 is set by the width of the Gaussian used to sample the jump size $\Delta x$. The width should be chosen so that a reasonable fraction of trial jumps is accepted; typically an acceptance fraction of $\approx 0.25$ is a good choice.
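
To make these steps concrete, here is a minimal sketch of the algorithm in Python (not part of the original notes): it samples a Gaussian target $f(x)\propto e^{-x^2/2}$ purely as a stand-in for a distribution of interest, uses a Gaussian proposal for the jump $\Delta x$, and reports the acceptance fraction so the step size can be tuned toward the value quoted above. The names `f` and `metropolis` and the step-size value are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):
    # Stand-in target distribution (an unnormalized Gaussian); the
    # normalization is irrelevant because only ratios f(x')/f(x) are used.
    return np.exp(-0.5 * x**2)

def metropolis(n_steps, step_size=1.0, x0=0.0):
    chain = np.empty(n_steps)
    x = x0
    n_accept = 0
    for i in range(n_steps):
        x_new = x + step_size * rng.normal()   # symmetric Gaussian proposal
        alpha = f(x_new) / f(x)                # ratio f(x')/f(x)
        if rng.uniform() < alpha:              # accept with probability min(alpha, 1)
            x = x_new
            n_accept += 1
        chain[i] = x                           # on rejection, repeat the old value
    return chain, n_accept / n_steps

chain, acc = metropolis(10_000, step_size=2.5)
print(f"acceptance fraction = {acc:.2f}")
```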
After an initial *burn-in* phase, the chain will reach equilibrium and the samples $\{x_i\}$ will be distributed according to the probability distribution $f(x)$. The reason this works is that in equilibrium, the rate at which we move from point $x$ to $x^\prime$ should be equal to the rate at which we move in the reverse direction, from $x^\prime$ to $x$. This is the principle of *detailed balance*. Therefore

$$n(x) \times (\mathrm{rate\ of}\ x\rightarrow x^\prime) = n(x^\prime) \times (\mathrm{rate\ of}\ x^\prime\rightarrow x),$$

where $n(x)$ is the density of points at $x$.

If $x^\prime$ has a larger value of the probability, $f(x^\prime)>f(x)$, then the transition from $x$ to $x^\prime$ will always happen (we always accept), i.e. the rate is 1. The rate of the reverse transition from $x^\prime$ to $x$ is then $f(x)/f(x^\prime)<1$. Therefore

$$n(x) = n(x^\prime) f(x)/f(x^\prime)$$

or

$${n(x)\over n(x^\prime)} = {f(x)\over f(x^\prime)},$$

which shows that $n(x)$ will be distributed as $f(x)$.
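
As a quick, self-contained check of this relation (a sketch, not part of the notes; the Gaussian target and the points $x$, $x^\prime$ are arbitrary choices), note that the Metropolis acceptance rate $\min(1, f(x^\prime)/f(x))$ makes the two sides of the detailed-balance equation equal by construction when $n(x)\propto f(x)$:

```python
import numpy as np

def f(x):
    # Arbitrary (unnormalized) target distribution for the check.
    return np.exp(-0.5 * x**2)

def accept_rate(x_from, x_to):
    # Metropolis acceptance rate: min(1, f(x_to)/f(x_from)).
    return min(1.0, f(x_to) / f(x_from))

x, x_prime = 0.3, 1.7
forward  = f(x) * accept_rate(x, x_prime)        # n(x) * (rate of x -> x')
backward = f(x_prime) * accept_rate(x_prime, x)  # n(x') * (rate of x' -> x)
print(forward, backward)                         # the two sides agree
```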
```{admonition} Exercise
Code up this algorithm and generate $10^4$ samples from the distribution $f(x) = \exp(-\left|x^3\right|)$. Confirm that you are getting the correct distribution by comparing the histogram of $x$-values with the analytic expression.
```
# Monte Carlo integration

In our discussion of the rejection method, you might have wondered whether we could use the samples of $f(x)$ to calculate the integral under the curve. Indeed, for a uniform sampling of $x$ values $\left\{x_i\right\}$ between $x=a$ and $x=b$ and the corresponding function values $f(x_i)$, we can estimate the value of the integral using

$$\int_a^b dx f(x) \approx {(b-a)\over N} \sum_i f(x_i).$$

You can think of this as $\sum_i f(x_i) \Delta x$ with the mean spacing of points $\Delta x = (b-a)/N$.
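
For example, here is a short sketch in Python (not part of the notes; the integrand $\sin x$ and the limits $[0,\pi]$ are arbitrary choices for illustration) that estimates $\int_0^{\pi} \sin x\, dx = 2$ with uniformly drawn samples:

```python
import numpy as np

rng = np.random.default_rng(1)

a, b, N = 0.0, np.pi, 100_000
x = rng.uniform(a, b, size=N)               # uniform samples on [a, b]
estimate = (b - a) / N * np.sum(np.sin(x))  # (b-a)/N * sum_i f(x_i)
print(estimate)                             # close to the exact value 2
```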
Depending on the shape of the integrand, this estimate can be improved by using a different distribution for the $x$-values which concentrates the samples of $x$ in the regions where the integrand is largest (the ideal case would be to sample from $f(x)$ itself). In this case, we have to rescale the weight of a given $x$-value according to the ratio $f(x)/p(x)$. The procedure for estimating the integral is: (1) select $N$ samples of $x$ from the probability distribution $p(x)$, and (2) evaluate the integral using

$${1\over N} \sum_i {f(x_i)\over p(x_i)}.$$

The factor $1/p(x_i)$ measures the effective spacing of the samples at $x=x_i$. If $p(x)$ is large, many points will be chosen near that value of $x$, so the effective spacing between points is small in that region. In the case of a uniform distribution, $p(x) = 1/(b-a)$, we recover our original expression for uniform sampling.
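
As an illustration of this weighting (again a sketch, not part of the notes; the integrand $e^{-x}\cos x$ and the choice $p(x)=e^{-x}$ are arbitrary), the following estimates $\int_0^\infty e^{-x}\cos x\, dx = 1/2$ by drawing the samples from an exponential distribution, which concentrates points where the integrand is largest:

```python
import numpy as np

rng = np.random.default_rng(1)

# Estimate integral_0^inf exp(-x)*cos(x) dx = 1/2 by importance sampling:
# draw x from p(x) = exp(-x), so the weight f(x)/p(x) reduces to cos(x).
N = 100_000
x = rng.exponential(scale=1.0, size=N)
estimate = np.mean(np.cos(x))   # (1/N) * sum_i f(x_i)/p(x_i)
print(estimate)                 # close to the exact value 0.5
```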
```{admonition} Exercise
```