Calculating the area under the normal curve in Ruby
Summary: Attached is a pure Ruby implementation of the AS66 algorithm (Hill 1973), ported from the Fortran code available here. It estimates the integral of the normal distribution, defaulting to the area under the right tail.
Background: I need to calculate p-values for some Z-scores I have, and started out using the Statistics2 bundle. Unfortunately, a single one of the 655 values I was calculating turned out negative (Z=8.21919181750338 was giving p=-2.22044604925031e-16, to be exact), which should be impossible. Evaluating it in IRB gives a positive number, so I’m not sure why. I decided that I couldn’t trust the numbers it was generating, and implemented the function myself.
Description: The alnorm function evaluates standardised Z-scores, meaning that the standard distribution it integrates has a mean of zero and a standard deviation of one. The upper option tells it to evaluate the right tail of the normal curve instead of the left. The identity alnorm(x, true) == 1.0-alnorm(x, false) should hold.
Code:
# Calculates the tail area of the normal curve using the AS66 algorithm (Hill, 1976)
# Ported from the Fortran available at http://users.bigpond.net.au/amiller/apstat.html
# * x => Z-score
# * upper => calculate the upper tail area instead of lower (default)
def alnorm(x, upper = true)
# Local variables
con = 1.28 ; fn_val = 0.0
# Machine dependent constants
# I arbitrarily left them untouched -- refer to the paper for more information
ltone = 7.0 ; utzero = 18.66
p = 0.398942280444 ; q = 0.39990348504 ; r = 0.398942280385
a1 = 5.75885480458 ; a2 = 2.62433121679 ; a3 = 5.92885724438
b1 = -29.8213557807 ; b2 = 48.6959930692
c1 = -3.8052E-8 ; c2 = 3.98064794E-4 ; c3 = -0.151679116635 ; c4 = 4.8385912808 ; c5 = 0.742380924027 ; c6 = 3.99019417011
d1 = 1.00000615302 ; d2 = 1.98615381364 ; d3 = 5.29330324926 ; d4 = -15.1508972451 ; d5 = 30.789933034
up = upper
z = x
if z < 0.0
up = !up
z = -z
end
if (z <= ltone or (up and z <= utzero))
y = 0.5*z*z
if (z > con)
fn_val = r*Math.exp(-y)/(z+c1+d1/(z+c2+d2/(z+c3+d3/(z+c4+d4/(z+c5+d5/(z+c6))))))
else
fn_val = 0.5 - z*(p-q*y/(y+a1+b1/(y+a2+b2/(y+a3))))
end
else
fn_val = 0.0
end
if (!up) ; fn_val = 1.0 - fn_val ; end
fn_val
end