Adds material on random numbers

2025-03-04 09:11:37 -05:00 · 2023-09-21 11:58:25 -04:00 · 2023-09-21 11:58:25 -04:00 · a51f792adf
commit a51f792adf
parent 7494db9e87
5 changed files with 413 additions and 8 deletions
--- a/_toc.yml
+++ b/_toc.yml
@@ -18,11 +18,16 @@ parts:
        - file: derivatives
        - file: interpolation
        - file: integration
-    - caption: Solutions to Exercises
+        - file: function_solutions
+          sections:
+            - file: derivatives_solutions
+            - file: interpolation_solutions
+            - file: integration_solutions
+    - caption: Random numbers
      chapters:
-        - file: derivatives_solutions
-        - file: interpolation_solutions
-        - file: integration_solutions
+        - file: random
+        - file: lcg
+        - file: generating_random
    - caption: Homework
      chapters:
        - file: hw1
--- a/function_solutions.md
+++ b/function_solutions.md
@@ -0,0 +1 @@
+# Solutions to exercises
--- a/generating_random.md
+++ b/generating_random.md
@@ -0,0 +1,53 @@
+# Probability distributions
+
+Suppose we have access to a source of random numbers that are uniformly distributed. How do we generate numbers that are distributed according to some other probability distribution?
+
+## Rejection method
+
+If $f(x)$ is the probability distribution that we want, the rejection method is to generate points that are uniformly distributed in the $(x,y)$ plane and only accept points that lie below the curve $y=f(x)$. The $x$ values of the accepted points will then be distributed according to $f(x)$.
+
+For example, consider the distribution $f(x)=\sin x$ for $x=0$ to $x=\pi$ (note that this integrates to $2$ rather than unity, but the rejection method depends only on the shape of $f(x)$, so the normalization does not matter). We choose a set of $(x,y)$ pairs in which $x$ is uniformly-distributed between $0$ and $\pi$, and $y$ is uniformly-distributed between $0$ and $1$ ($1$ is the maximum value of $\sin x$). We keep only the values of $x$ for which the corresponding $y$ satisfies $y<f(x)$. These $x$-values will be distributed according to $f(x)\propto\sin x$.
+
+This method is very straightforward to implement, but has the disadvantage that we have to reject some fraction of the sampled points. The rejection fraction can be large if the probability distribution is far from uniform (e.g. very peaked). A way around this is to generate the test points from another probability distribution that lies above $f(x)$ everywhere but is shaped more like it, so that only a small fraction of points end up being rejected.
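
The $\sin x$ example above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `sample_sin_rejection`, the sample size, and the seed are arbitrary choices, and it uses only the standard library `random` module. It also reports the accepted fraction, which for this case should be close to the area under the curve divided by the area of the bounding box, $2/\pi$.

```python
import math
import random


def sample_sin_rejection(n, rng):
    """Draw n samples distributed as sin(x) on [0, pi] by rejection.

    Returns the accepted x values and the total number of trial points.
    """
    samples = []
    n_trials = 0
    while len(samples) < n:
        x = rng.uniform(0.0, math.pi)  # uniform in x over the range of f
        y = rng.uniform(0.0, 1.0)      # uniform in y up to the maximum of sin x
        n_trials += 1
        if y < math.sin(x):            # keep points that lie below the curve
            samples.append(x)
    return samples, n_trials


rng = random.Random(42)  # fixed seed (arbitrary) so the run is reproducible
samples, n_trials = sample_sin_rejection(20000, rng)

acceptance = len(samples) / n_trials
mean_x = sum(samples) / len(samples)
print(f"accepted fraction = {acceptance:.3f}, compare 2/pi = {2 / math.pi:.3f}")
print(f"mean of x = {mean_x:.3f}, compare pi/2 = {math.pi / 2:.3f}")
```

A histogram of `samples` can then be compared against $\tfrac{1}{2}\sin x$, the normalized form of the distribution.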
+
+## Transformation method
+
+In the transformation method, we make a change of variables from $x$ to $y$ in such a way that $y$ is uniformly-distributed. We can then choose a sequence of $y$ values from our generator and then transform back to get the $x$ values.
+
+An example here is the exponential distribution, written here without its normalization factor $\lambda$ (which does not affect the method):
+
+$$f(x)dx = e^{-\lambda x} dx.$$
+
+If we choose $y = -\lambda^{-1} e^{-\lambda x}$ then we have $dy/dx = e^{-\lambda x}$, so we can write 
+
+$$ f(x) dx = e^{-\lambda x} dx = dy = f(y) dy$$
+
+i.e. we have $f(y) = 1$, a uniform distribution. The range of $x$ is $0$ to $\infty$, corresponding to the range of $y$ from $-1/\lambda$ to $0$. We choose a set of $y$ values from the uniform distribution between these limits and then transform each one back to $x$ using the inverse transform:
+
+$$x = -{1\over \lambda} \ln \left(-\lambda y\right).$$
+
+The values of $x$ will be exponentially-distributed.
+
+This method has the advantage that all samples are used (there is no rejection). Note also that we have used a sampling of points in a finite range to generate a distribution that covers a semi-infinite range in $x$ ($0$ to $\infty$).
+The disadvantage of this technique is that the forward and inverse transforms may not be available analytically, or may be hard to evaluate.
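
As a rough sketch of the exponential example above, the following Python draws $y$ uniformly between $-1/\lambda$ and $0$ and applies the inverse transform $x = -\lambda^{-1}\ln(-\lambda y)$. The function name `sample_exponential_transform` and the values of `lam`, the sample size, and the seed are illustrative assumptions, not part of the original notes.

```python
import math
import random


def sample_exponential_transform(n, lam, rng):
    """Draw n exponentially-distributed samples using the transformation method."""
    samples = []
    for _ in range(n):
        # rng.random() lies in [0, 1), so y lies in [-1/lam, 0) and never hits 0,
        # which keeps the argument of the logarithm strictly positive.
        y = -(1.0 - rng.random()) / lam
        x = -math.log(-lam * y) / lam  # inverse transform back to x
        samples.append(x)
    return samples


lam = 2.0                # rate parameter (arbitrary choice for the demonstration)
rng = random.Random(1)   # fixed seed (arbitrary) for reproducibility
samples = sample_exponential_transform(20000, lam, rng)

mean_x = sum(samples) / len(samples)
print(f"mean of x = {mean_x:.3f}, compare 1/lambda = {1 / lam:.3f}")
```

Every uniform draw produces one sample, in contrast to the rejection method where some draws are discarded.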
+
+## Ratio of uniforms
+
+The ratio of uniforms is a very interesting method that also relies on selecting points in the 2D plane, but in a different way. For a probability distribution $f(x)$, the procedure is:
+
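+1. generate points $(u,v)$ that are uniformly distributed in the 2D plane (in practice, within a rectangle large enough to contain the region defined in step 2)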
+
+2. keep only those points which have 
+
+$$0\leq u \leq \sqrt{2f\left({v\over u}\right)}$$
+
+3. form $x$ values by taking the ratio of $v$ and $u$, i.e. $x=v/u$.
+
+4. the $x$ values will be distributed according to $f(x)$.
+
+In this method, $f(x)$ doesn't need to be normalized; only the shape of the function is needed. This method was introduced by [Kinderman and Monahan 1977](https://dl.acm.org/doi/pdf/10.1145/355744.355750).
+Again note how this method allows us to access the range $x=0$ to $\infty$, since the ratio $v/u\rightarrow \infty$ for $u\rightarrow 0$.
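
The procedure above can be sketched in Python as follows. To leave the exercise below open, this sketch uses an unnormalized Gaussian shape $f(x)=e^{-x^2/2}$ rather than the exponential; that choice, the function names, and the bounding rectangle are assumptions made for illustration. For this $f(x)$ the acceptance region fits inside $0\leq u\leq\sqrt{2}$ and $|v|\leq 2/\sqrt{e}$, which are the maxima of $\sqrt{2f(x)}$ and $|x|\sqrt{2f(x)}$ respectively.

```python
import math
import random


def f(x):
    """Unnormalized Gaussian shape (illustrative choice of target distribution)."""
    return math.exp(-0.5 * x * x)


def sample_ratio_of_uniforms(n, rng):
    """Draw n samples distributed as f(x) using the ratio-of-uniforms method."""
    u_max = math.sqrt(2.0)           # maximum of sqrt(2 f(x)), reached at x = 0
    v_max = 2.0 / math.sqrt(math.e)  # maximum of |x| sqrt(2 f(x)), at x = +-sqrt(2)
    samples = []
    while len(samples) < n:
        u = rng.uniform(0.0, u_max)
        v = rng.uniform(-v_max, v_max)
        if u == 0.0:
            continue                 # avoid dividing by zero
        x = v / u
        if u <= math.sqrt(2.0 * f(x)):  # acceptance condition from step 2
            samples.append(x)
    return samples


rng = random.Random(7)  # fixed seed (arbitrary) for reproducibility
samples = sample_ratio_of_uniforms(20000, rng)

mean_x = sum(samples) / len(samples)
var_x = sum((x - mean_x) ** 2 for x in samples) / len(samples)
print(f"mean = {mean_x:.3f} (compare 0), variance = {var_x:.3f} (compare 1)")
```

For a different $f(x)$ the bounds `u_max` and `v_max` would need to be recomputed so that the rectangle still contains the whole acceptance region.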
+
+```{admonition} Exercise:
+Implement these three methods for the exponential distribution and check that they work by comparing a histogram of your $x$ values with the analytic function.
+```
--- a/lcg.ipynb
+++ b/lcg.ipynb
--- a/random.md
+++ b/random.md
@@ -1,10 +1,12 @@
 # Random numbers

-## Pseudo random number generators
+Random numbers are useful in many applications in computational physics. An obvious one is to simulate noise or stochastic systems, but they also allow efficient sampling of functions for integration or for parameter searches. 

-## Rejection method
+Although there are sources of true randomness (so-called entropy sources) in operating systems, e.g. linked to thermal noise in hardware components, in general you will be dealing with **pseudo-random number generators**, which are algorithms that produce deterministic sequences of numbers that have the statistical properties of random numbers.
+
+An important concept is the random number **seed**. This provides a starting point for the random sequence, and enables the sequence to be reproduced, i.e. a given seed always gives rise to the same sequence of random numbers. This is important if you want to be able to exactly reproduce results that are based on random numbers. If you don't specify a seed, the system will generate one, and the exact sequence of numbers will then be different each time you run your code. This may or may not matter, e.g. it does not if you are interested only in the statistical properties. It can certainly be good to try different seeds and make sure your results don't depend on the choice of seed.
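
The role of the seed can be illustrated with Python's standard library `random` module; this is a small sketch, and the seed values are arbitrary. Two generators created with the same seed produce identical sequences, while a different seed gives a different sequence.

```python
import random

# Two generators with the same seed (arbitrary value) give the same sequence
rng_a = random.Random(12345)
rng_b = random.Random(12345)
seq_a = [rng_a.random() for _ in range(5)]
seq_b = [rng_b.random() for _ in range(5)]

# A different seed gives a different sequence
rng_c = random.Random(54321)
seq_c = [rng_c.random() for _ in range(5)]

# With no seed given, Python seeds the generator itself (from an operating
# system randomness source if available, otherwise the system time), so this
# sequence changes from run to run.
rng_d = random.Random()
seq_d = [rng_d.random() for _ in range(5)]

print("same seed, same sequence:", seq_a == seq_b)
print("different seed, same sequence:", seq_a == seq_c)
print("unseeded sequence:", seq_d)
```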
+
+ 

-## Transformation method

-## Ratio of uniforms