Estimating π using Monte-Carlo method
Idea
Let's compute an approximation of π using the Monte Carlo method. The idea is to draw points from the uniform distribution on a unit square and count the ratio of points that are inside the quadrant of the unit circle. Since the probably $p$ for a point to be inside the quadrant is the area of the quadrant ($= π / 4$) divided by the area of the unit square ($= 1$), we can estimate $π$ by estimating the probability $p = π / 4$:
xs = rand(10_000)
ys = rand(10_000)
p = count(xs.^2 .+ ys.^2 .< 1) / length(xs)
4 * p
3.1524
–- File:Pi 30K.gif - Wikipedia by nicoguaro is licensed under CC BY 3.0.
RNG on GPU
We try to do this computation on a GPU using FoldsCUDA.jl:
using CUDA
using FLoops
using FoldsCUDA
As of writing, CUDA.CURAND
does not provide the API usable inside the loop body (i.e., the device API). However, we can use pure-Julia pseudo number generator quite easily. In particular, we use a Counter-based random number generator (CBRNG) provided by Random123.jl (documentation).
using Random123
In this example, we use Random123.Philox2x
. This RNG gives us two UInt64
s for each counter which wraps around at typemax(UInt64)
:
rng_a = Philox2x(0)
rng_b = Philox2x(0)
set_counter!(rng_b, typemax(UInt64))
rand(rng_b, UInt64, 2)
@assert rng_a == rng_b
Using counter-based RNG
Let's create a helper function that divides UInt64(0):typemax(UInt64)
into n
equal intervals:
function counters(n)
stride = typemax(UInt64) ÷ n
return UInt64(0):stride:typemax(UInt64)-stride
end
This lets us use "independent" RNG for each ctr
-th iteration:
function monte_carlo_pi(n, m = 10_000, ex = has_cuda_gpu() ? CUDAEx() : ThreadedEx())
@floop ex for ctr in counters(n)
rng = set_counter!(Philox2x(0), ctr)
nhits = 0
for _ in 1:m
x = rand(rng)
y = rand(rng)
nhits += x^2 + y^2 < 1
end
@reduce(tot = 0 + nhits)
end
return 4 * tot / (n * m)
end
πₐₚₚᵣₒₓ = monte_carlo_pi(2^12)
3.1411634765625
This page was generated using Literate.jl.