Source Code Reference

src.language.utils

The NIAID Stats Calculator supports internationalization (i18n). It currently provides English and Spanish translations. The translation files are stored in the locales folder.

setLocale(locale)

Sets the locale to use for the language service.

Arguments
  • locale (String) – the device locale or default

getCurrentLocale()

Retrieves the device locale. Will be used to define the initial language state in the reducer.

isRTL()

Is it a right-to-left (RTL) language?

translateHeaderText()

Translates the header text for a screen. screenProps comes from react-navigation (defined in app.container.js); langKey is passed from the routes file depending on the screen. (The usage is explained in later topics.)

src.components

The React Native components that comprise the views in the application.

src.components.Navigation

The Navigation component that defines the top NavBar.

src.components.ResultsContainer

The component that will display the results of the equations.

src.components.fishersExact

The components that define the input form area and input form for the Fisher’s Exact Test.

src.components.fishersExact.FishersInputForm

The component that defines the input form for the Fisher’s Exact Test.

src.components.studentT

The components that define the input form area and input form for the Student’s t-test.

src.components.studentT.StudentTInputForm

The component that defines the input form for the Student’s t-test.

src.lib

The JavaScript code that does all of the math and related heavy lifting lives here.

src.lib.utils

A utilities file containing generic functions used by the stats package. It exports asVector and toScientific.

flattenArray(data, outputArray)

Flattens a multidimensional array or a primitive into an array.

Arguments
  • data (any) – the data to attempt to flatten

  • outputArray (Array) – the array to flatten the data into. Because arrays are passed by reference, the array object is updated directly and no return value is necessary.
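
The in-place behaviour described above can be sketched as follows. This is a minimal illustration that assumes simple recursion over nested arrays; the function body is hypothetical and is not taken from the package source.

```javascript
// Hypothetical re-implementation, for illustration only.
function flattenArray(data, outputArray) {
  if (Array.isArray(data)) {
    // Recurse into each element so that nesting of any depth is handled.
    data.forEach((item) => flattenArray(item, outputArray));
  } else {
    // Primitives (and any other non-array value) are appended as-is.
    outputArray.push(data);
  }
}

const flat = [];
flattenArray([1, [2, [3, 4]], 5], flat);
console.log(flat); // [ 1, 2, 3, 4, 5 ]
```

Because the caller owns `outputArray`, repeated calls can accumulate several inputs into the same array.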

asVector(x)

Returns the input as an Array if possible, or in its original format otherwise. Returns undefined if the input is of type Symbol. Mimics R internals.

Arguments
  • x (*) – the thing to convert

Returns

Array|Object – an array if the input is convertible, otherwise returns the input or undefined if the input was typeof ‘symbol’

round(value, decimals)

Rounds a number to the specified number of decimal places. Note that the built-in toFixed() function has known errors when rounding and returns a string, not a number.

From https://stackoverflow.com/questions/6134039/format-number-to-always-show-2-decimal-places Answer provided by Nate.

Arguments
  • value (number) – the value to round

  • decimals (number) – the specified number of decimal places
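
One common way to get this behaviour is to shift the decimal point using exponent notation. The sketch below assumes that technique; whether the package uses exactly this expression is an assumption.

```javascript
// Sketch only: rounds by shifting the decimal point with exponent notation.
// Limitation: inputs that already stringify in exponent form (e.g. 1e-7)
// are not handled by this simple version.
function round(value, decimals) {
  // "1.005e2" parses to 100.5, which Math.round takes to 101 ...
  const shifted = Math.round(Number(`${value}e${decimals}`));
  // ... and "101e-2" parses back to 1.01.
  return Number(`${shifted}e-${decimals}`);
}

console.log((1.005).toFixed(2)); // 1.00 (a string, rounded the wrong way)
console.log(round(1.005, 2));    // 1.01 (a number)
```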

toScientific(n, decimals)

Formats a number into scientific notation, sanely. Numbers whose absolute value is between 1 and 1000 are formatted in plain decimal notation with the requested number of digits after the decimal point. All others are formatted in scientific notation with the requested number of digits after the decimal point.

Arguments
  • n (number) – the number to format

  • decimals (number) – the number of decimal places to format to

Returns

number – the number either formatted to the requested decimal places or in scientific notation with the requested decimal places
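
A minimal sketch of the rule described above, assuming the built-in toFixed() and toExponential() methods do the formatting. Note that both return strings, so the return type here is an assumption that differs from the documented one.

```javascript
// Hypothetical sketch; returns a string because the built-in formatters do.
function toScientific(n, decimals) {
  const magnitude = Math.abs(n);
  if (magnitude >= 1 && magnitude <= 1000) {
    // Plain decimal notation for "ordinary" magnitudes.
    return n.toFixed(decimals);
  }
  // Scientific notation for everything else.
  return n.toExponential(decimals);
}

console.log(toScientific(12.3456, 2));  // 12.35
console.log(toScientific(0.000123, 2)); // 1.23e-4
console.log(toScientific(123456, 2));   // 1.23e+5
```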

src.lib.studentst

Calculates the Student’s t-test statistic, degrees of freedom, and p-value.

calcTStatistic(mA, mB, nA, nB, Sa, Sb)

Calculates the Student’s t-test statistic using the unpaired t-test equation:

\[t = \frac {m_A - m_B} {\sqrt{\frac {S_a^2} {n_A} + \frac {S_b^2} {n_B}}}\]

where \(m_A, m_B, n_A, n_B, S_a\), and \(S_b\) are the means, sizes, and standard deviations of groups A and B respectively. Ref. http://www.sthda.com/english/wiki/t-test-formula

Arguments
  • mA (number) – the mean of group A

  • mB (number) – the mean of group B

  • nA (number) – the size of group A

  • nB (number) – the size of group B

  • Sa (number) – the standard deviation of group A

  • Sb (number) – the standard deviation of group B

Returns

number – the Student’s t-test statistic
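
The equation above translates directly into code. The sketch below is a straightforward transcription of the formula, not necessarily the package's exact implementation.

```javascript
// Direct transcription of the unpaired t-test equation (illustrative).
function calcTStatistic(mA, mB, nA, nB, Sa, Sb) {
  // Standard error of the difference between the two means.
  const standardError = Math.sqrt((Sa * Sa) / nA + (Sb * Sb) / nB);
  return (mA - mB) / standardError;
}

// Means 10 and 8, four observations per group, standard deviation 2 in both:
// t = 2 / sqrt(1 + 1), i.e. the square root of 2.
const tExample = calcTStatistic(10, 8, 4, 4, 2, 2);
```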

calcDf(nA, nB, Sa, Sb)

Calculates the degrees of freedom using the equation:

\[df = \frac {(\frac {S_a^2} {n_A} + \frac {S_b^2} {n_B})^2} {\frac {(\frac {S_a^2} {n_A})^2} {n_A - 1} + \frac {(\frac {S_b^2} {n_B})^2} {n_B - 1}}\]

where \(S_a, S_b, n_A, n_B\) are the standard deviations and sizes of groups A and B respectively. This is the Welch–Satterthwaite equation for calculating degrees of freedom when the sample sizes and variances may be unequal. The result is identical to the standard \(n_A + n_B - 2\) calculation when both the variances and the sample sizes are equal. Ref. https://en.wikipedia.org/wiki/Student%27s_t-test

Arguments
  • nA (number) – size of group A

  • nB (number) – size of group B

  • Sa (number) – standard deviation of group A

  • Sb (number) – standard deviation of group B

Returns

number – the degrees of freedom
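
A sketch of the Welch–Satterthwaite equation as written above, again as a plain transcription of the formula. With equal sizes and equal standard deviations the result reduces to nA + nB - 2; with unequal sizes it is smaller.

```javascript
// Direct transcription of the Welch–Satterthwaite equation (illustrative).
function calcDf(nA, nB, Sa, Sb) {
  const vA = (Sa * Sa) / nA; // S_a^2 / n_A
  const vB = (Sb * Sb) / nB; // S_b^2 / n_B
  const numerator = (vA + vB) ** 2;
  const denominator = (vA * vA) / (nA - 1) + (vB * vB) / (nB - 1);
  return numerator / denominator;
}

const dfEqual = calcDf(4, 4, 2, 2);   // 6, up to floating-point rounding
const dfUnequal = calcDf(2, 3, 2, 2); // 25 / 11, about 2.27 (less than 3)
```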

calcPvalue(x, n, lowerTail=true)

Calculates the p-value for the Welch’s t-test using the formula \(I_{\frac {n} {t^2+n}}(\frac {n} {2}, \frac {1} {2})\) where \(I\) is the regularized incomplete beta function and \(t\) is the test statistic x. This code is based on the R code for the pt() function and provides corrections for extremely large values of n to avoid buffer overflow.

Arguments
  • x (number) – the test statistic

  • n (number) – the degrees of freedom

  • lowerTail (boolean) – defaults to true; set to false if calculating a right-tailed p-value

src.lib.fishers

Calculates the Fisher’s Exact two-tailed, left-tailed, and right-tailed tests using the equation

\[p= \frac {( a + b )! ( c + d )! ( a + c )! ( b + d )!} {a ! b ! c ! d ! N !}\]

where a, b, c, d are the individual frequencies and N is the total frequency

            group 1    group 2
pos         a          b          a+b
neg         c          d          c+d
            a+c        b+d        N = (a+b)+(c+d) = (a+c)+(b+d)

to calculate the probability of a given matrix. Based on the gist by Sukumaran, J. and M. T. Holder at https://gist.github.com/jeetsukumaran/2189099.

References related to the calculation of the Fisher’s Exact test and when to use it:

  • https://software.broadinstitute.org/gatk/documentation/article.php?id=8056

  • http://www.biostathandbook.com/fishers.html

  • https://onlinecourses.science.psu.edu/stat504/node/89/

  • http://vassarstats.net/tab2x2.html

binomialCoefficient(population, sample)

Returns the binomial coefficient (population choose sample).

Arguments
  • population (number) – the full size of the population the sample will be selected from

  • sample (number) – the size of the sample to select from the population

Returns

number – the number of ways of picking k unordered outcomes from n possibilities where k = sample and n = population
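
A minimal sketch of a binomial coefficient that avoids large factorials by using the multiplicative formula. The choice of algorithm is an assumption; the package may compute it differently.

```javascript
// Illustrative multiplicative formula: C(n, k) = prod_{i=1..k} (n - k + i) / i
function binomialCoefficient(population, sample) {
  if (sample < 0 || sample > population) return 0;
  // C(n, k) === C(n, n - k), so use the smaller of the two to shorten the loop.
  const k = Math.min(sample, population - sample);
  let result = 1;
  for (let i = 1; i <= k; i++) {
    // After step i, result equals C(population - k + i, i), which is an integer.
    result = (result * (population - k + i)) / i;
  }
  return Math.round(result);
}

console.log(binomialCoefficient(5, 2));  // 10
console.log(binomialCoefficient(52, 5)); // 2598960
```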

hypergeometricPMF(x, m, n, k)

Calculates the hypergeometric probability of obtaining the observed results. Used to approximate the Fisher’s Exact calculation, which cannot be calculated directly.

Given a population consisting of m items of class M and n items of class N, this returns the probability of observing x items of class M when sampling k times without replacement from the entire population (i.e., {M,N}):

\[p(x) = \frac {\left( {m \atop x} \right) \left( {n \atop k-x} \right)} {\left( {m+n \atop k} \right)}\]

Arguments
  • x (number) – the number from col0, row0 of a 2x2 table

  • m (number) – the sum of the values in row one of a 2x2 table

  • n (number) – the sum of the values in row two of a 2x2 table

  • k (number) – the sum of the first column in a 2x2 table

Returns

number – the hypergeometric probability
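
The formula above can be sketched with a small helper for the binomial coefficient. The helper name `choose` is hypothetical, and the worked table (1, 9 / 11, 3) is only an example input.

```javascript
// Hypothetical helper: binomial coefficient via the multiplicative formula.
function choose(n, k) {
  if (k < 0 || k > n) return 0;
  const j = Math.min(k, n - k);
  let result = 1;
  for (let i = 1; i <= j; i++) result = (result * (n - j + i)) / i;
  return Math.round(result);
}

// p(x) = C(m, x) * C(n, k - x) / C(m + n, k)
function hypergeometricPMF(x, m, n, k) {
  return (choose(m, x) * choose(n, k - x)) / choose(m + n, k);
}

// Table [[1, 9], [11, 3]]: x = 1, row sums m = 10 and n = 14, first column k = 12.
const pTable = hypergeometricPMF(1, 10, 14, 12); // 3640 / 2704156, about 0.001346
```

Summing the probabilities over every feasible x (here 0 through 10) gives 1, which is a quick sanity check for any implementation.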

rotateCW(table)

Rotates the table clockwise by 90 degrees.

Arguments
  • table (Array.Array.number) – a 2x2 2D array

Returns

Array.Array.number – the input matrix rotated 90 degrees clockwise
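
For a 2x2 table, a clockwise quarter turn simply moves each cell to the next corner. A minimal sketch:

```javascript
// Illustrative 90-degree clockwise rotation of a 2x2 table.
function rotateCW(table) {
  const [[a, b], [c, d]] = table;
  // The left column, read bottom to top, becomes the new top row.
  return [[c, a], [d, b]];
}

console.log(rotateCW([[1, 2], [3, 4]])); // [ [ 3, 1 ], [ 4, 2 ] ]
```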

minRotation(a, b, c, d)

Rotates the table until the smallest value is in position [0][0].

Arguments
  • a (number) – the number from col0, row0 of a 2x2 table

  • b (number) – the number from col1, row0 of a 2x2 table

  • c (number) – the number from col0, row1 of a 2x2 table

  • d (number) – the number from col1, row1 of a 2x2 table

Returns

Array.Array.number – a 2x2 2D array with the smallest number in position [0][0]
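
A sketch of the rotation loop, assuming it repeatedly applies a clockwise quarter turn until the minimum reaches the top-left cell. The local `rotateCW` helper is repeated here so the example is self-contained.

```javascript
// Illustrative 90-degree clockwise rotation of a 2x2 table.
function rotateCW([[a, b], [c, d]]) {
  return [[c, a], [d, b]];
}

function minRotation(a, b, c, d) {
  const smallest = Math.min(a, b, c, d);
  let table = [[a, b], [c, d]];
  // At most three quarter turns are needed to bring any corner to [0][0].
  for (let turns = 0; turns < 3 && table[0][0] !== smallest; turns++) {
    table = rotateCW(table);
  }
  return table;
}

console.log(minRotation(9, 1, 3, 11)); // [ [ 1, 11 ], [ 9, 3 ] ]
```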

shallowCopy(table)

Creates a shallow copy of a 2x2 matrix.

Arguments
  • table (Array.Array.number) – a 2x2 matrix (Array of Arrays)

Returns

Array.Array.number – a shallow copy of the input matrix

getLeftTailProbs(a, b, c, d)

Finds all possible matrices of non-negative integers that would be consistent with the given row and column totals with smaller observed values in position [0][0] than the minimum rotation table (the left tail).

Arguments
  • a (number) – the number from col0, row0 of a 2x2 table

  • b (number) – the number from col1, row0 of a 2x2 table

  • c (number) – the number from col0, row1 of a 2x2 table

  • d (number) – the number from col1, row1 of a 2x2 table

Returns

Array.number – an array containing all left-tail probabilities
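
The enumeration can be sketched as follows, assuming the input is already the minimum-rotation table. Moving one count out of position [0][0] forces the other three cells to change so that all margins stay fixed. The helpers `choose` and `tableProbability` are hypothetical names used only for this illustration.

```javascript
// Hypothetical helper: binomial coefficient via the multiplicative formula.
function choose(n, k) {
  if (k < 0 || k > n) return 0;
  const j = Math.min(k, n - k);
  let result = 1;
  for (let i = 1; i <= j; i++) result = (result * (n - j + i)) / i;
  return Math.round(result);
}

// Hypergeometric probability of one specific 2x2 table.
function tableProbability(a, b, c, d) {
  return (choose(a + b, a) * choose(c + d, c)) / choose(a + b + c + d, a + c);
}

function getLeftTailProbs(a, b, c, d) {
  const probs = [];
  // Decreasing [0][0] by `shift` raises b and c and lowers d by the same
  // amount, which keeps every row and column total unchanged.
  for (let shift = 1; shift <= a && shift <= d; shift++) {
    probs.push(tableProbability(a - shift, b + shift, c + shift, d - shift));
  }
  return probs;
}

// Table [[1, 9], [11, 3]] has one more extreme table on the left: [[0, 10], [12, 2]].
const leftTail = getLeftTailProbs(1, 9, 11, 3); // [91 / 2704156]
```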

getRightTailProbs(a, b, c, d)

Finds all possible matrices of non-negative integers that would be consistent with the given row and column totals with larger observed values in position [0][0] than the minimum rotation table (the right tail).

Arguments
  • a (number) – the number from col0, row0 of a 2x2 table

  • b (number) – the number from col1, row0 of a 2x2 table

  • c (number) – the number from col0, row1 of a 2x2 table

  • d (number) – the number from col1, row1 of a 2x2 table

Returns

Array.number – an array containing all right-tail probabilities

calcSingleTailP(tailProbs, tableProb)

Calculates the Fisher’s Exact one-tail test:

probabilityOfOriginalTable + sum(TailProbs)

Arguments
  • tailProbs (Array.number) – an array of the one-tail probabilities

  • tableProb (number) – the Fisher’s Exact probability of the original table

Returns

number – The one-tail probability

calcTwoTailP(tableProb, leftTailProbs, rightTailProbs)

Calculates the Fisher’s Exact two-tail test:

probabilityOfOriginalTable + sum(all probabilities less than or equal to the original)

Arguments
  • tableProb (number) – the Fisher’s Exact probability of the original table

  • leftTailProbs (Array.number) – an array of the left-tail probabilities

  • rightTailProbs (Array.number) – an array of the right-tail probabilities

Returns

number – The two-tailed probability
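
A minimal sketch of the two-tail rule stated above. The small relative tolerance used when comparing probabilities is an assumption added to guard against floating-point ties; it is not stated in this reference.

```javascript
// Illustrative two-tailed sum: original table plus every tail table that is
// no more probable than the original.
function calcTwoTailP(tableProb, leftTailProbs, rightTailProbs) {
  const cutoff = tableProb * (1 + 1e-7); // tolerance is an assumption
  return [...leftTailProbs, ...rightTailProbs]
    .filter((p) => p <= cutoff)
    .reduce((sum, p) => sum + p, tableProb);
}

// 0.1 + (0.05 + 0.01) + (0.1 + 0.02); the 0.2 entry is excluded.
const twoTail = calcTwoTailP(0.1, [0.05, 0.01], [0.2, 0.1, 0.02]);
```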

calcOddsAndCI(a, b, c, d)

Calculates the odds ratio and confidence interval of a 2x2 matrix.

Arguments
  • a (number) – value [0,0]

  • b (number) – value [0,1]

  • c (number) – value [1,0]

  • d (number) – value [1,1]

src.lib.chiSq

Calculates Pearson’s Chi-squared with Yates correction using the equation

\[X^2_{Yates} = \frac {N(|ad-bc|-\frac {N} {2})^2} {NsNfNaNb}\]

for a 2x2 table of observed values like:

        S      F
A       a      b      Na
B       c      d      Nb
        Ns     Nf     N

where \(Na = a + b, Nb = c + d, Ns = a + c, Nf = b + d, N = Na + Nb = Ns + Nf\)

The degrees of freedom, which is equal to the number of rows minus one times the number of columns minus one (i.e. \((R-1)*(C-1)\)), is fixed at 1 for a 2x2 table.

The p-value calculation is taken from https://www.codeproject.com/Articles/432194/How-to-Calculate-the-Chi-Squared-P-Value and translated into JS.

calcChiSqYatesValue(a, b, c, d)

Calculates the Chi-squared value with the Yates correction.

Arguments
  • a (number) – the number from col0, row0

  • b (number) – the number from col1, row0

  • c (number) – the number from col0, row1

  • d (number) – the number from col1, row1

Returns

number – the Chi-squared value
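
The Yates-corrected statistic follows directly from the equation and margin definitions given for this module. The sketch below is a plain transcription of that formula.

```javascript
// Direct transcription of the Yates-corrected Chi-squared equation (illustrative).
function calcChiSqYatesValue(a, b, c, d) {
  const Na = a + b; // row A total
  const Nb = c + d; // row B total
  const Ns = a + c; // column S total
  const Nf = b + d; // column F total
  const N = Na + Nb;
  const corrected = Math.abs(a * d - b * c) - N / 2;
  return (N * corrected * corrected) / (Ns * Nf * Na * Nb);
}

// For the table [[1, 9], [11, 3]]: 24 * (96 - 12)^2 / (12 * 12 * 10 * 14) = 8.4
const chiSq = calcChiSqYatesValue(1, 9, 11, 3);
```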

computePvalue(Cv, Dof)

Calculates the p-value given the Chi-squared value and the degrees of freedom.

Arguments
  • Cv (number) – the Chi-squared value

  • Dof (number) – the degrees of freedom

Returns

number – the p-value for Chi-squared
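
The upper-tail p-value equals 1 minus the lower incomplete gamma function of (Dof / 2, Cv / 2) divided by Gamma(Dof / 2). The sketch below assumes a power-series evaluation of the incomplete gamma function and a positive integer Dof; the helper names are hypothetical, and the series is only suitable for moderate Chi-squared values.

```javascript
// Lower incomplete gamma by its power series:
// gamma(s, z) = z^s * e^-z * sum_{k>=0} z^k / (s (s+1) ... (s+k))
function lowerIncompleteGamma(s, z) {
  if (z <= 0) return 0;
  let term = 1 / s;
  let sum = term;
  for (let k = 1; k < 1000; k++) {
    term *= z / (s + k);
    sum += term;
    if (term < sum * 1e-16) break; // further terms no longer change the sum
  }
  return Math.pow(z, s) * Math.exp(-z) * sum;
}

// Gamma(dof / 2) for a positive integer dof, built from Gamma(1/2) = sqrt(pi),
// Gamma(1) = 1 and the recurrence Gamma(x + 1) = x * Gamma(x).
function gammaOfHalf(dof) {
  let x = dof % 2 === 0 ? 1 : 0.5;
  let g = dof % 2 === 0 ? 1 : Math.sqrt(Math.PI);
  while (x < dof / 2) {
    g *= x;
    x += 1;
  }
  return g;
}

function computePvalue(Cv, Dof) {
  return 1 - lowerIncompleteGamma(Dof / 2, Cv / 2) / gammaOfHalf(Dof);
}

// 1.959963984540054^2 is the 95th percentile of Chi-squared with 1 degree of freedom.
const pOneDof = computePvalue(1.959963984540054 ** 2, 1); // about 0.05
```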

src.lib.stats

These libraries contain the generic math functions needed to perform the statistical tests. Many of these functions were translated from R or borrowed from the jStat library.

Exports jStat and mle modules.

src.lib.stats.jstatFunc

These are mostly generic statistical functions.

gammaln(x)

Calculates the Log-gamma function.

Arguments
  • x (number) – a positive number

Returns

number – log gamma of x
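
A sketch of a log-gamma function using the Lanczos approximation with the widely published Numerical Recipes coefficients. Whether the package uses exactly these coefficients is an assumption.

```javascript
// Lanczos approximation to ln(Gamma(x)) for x > 0 (illustrative).
function gammaln(x) {
  const cof = [
    76.18009172947146, -86.50532032941677, 24.01409824083091,
    -1.231739572450155, 0.1208650973866179e-2, -0.5395239384953e-5,
  ];
  let ser = 1.000000000190015;
  let y = x;
  let tmp = x + 5.5;
  tmp -= (x + 0.5) * Math.log(tmp);
  for (let j = 0; j < 6; j++) {
    y += 1;
    ser += cof[j] / y;
  }
  return Math.log((2.5066282746310005 * ser) / x) - tmp;
}

// Gamma(5) = 4! = 24, so gammaln(5) should be close to ln(24).
const lnGamma5 = gammaln(5);
```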

igamma(S, Z)

Calculates the incomplete Gamma function.

Arguments
  • S (number) – degrees of freedom * 0.5

  • Z (number) – Chi-squared value * 0.5

Returns

number – the incomplete Gamma function result

betaln(x, y)

Calculates the natural logarithm of the beta function.

Arguments
  • x (number) – 0 < x < Inf

  • y (number) – 0 < y < Inf

Returns

number – lnB(x, y)

betacf(x, a, b)

Evaluates the continued fraction for incomplete beta function by modified Lentz’s method.

This is directly from jStat, so there is no error checking.

Arguments
  • x (number) –

  • a (number) –

  • b (number) –

ibeta(x, a, b)

Returns the incomplete beta function I_x(a,b).

Arguments
  • x (number) – a number 0 < x < 1

  • a (number) – a number 0 < a < INF

  • b (number) – a number 0 < b < INF

Returns

number – I_x(a,b)

pnorm(x, mu, sigma, lowerTail=true)

Evaluates near-minimax approximations derived from those in “Rational Chebyshev approximations for the error function” by W. J. Cody, Math. Comp., 1969, 631-637. This transportable program uses rational functions that theoretically approximate the normal distribution function to at least 18 significant decimal digits.

Translated from R.

Arguments
  • x (number) – a finite number

  • mu (number) –

  • sigma (number) – a number > 0

  • lowerTail (boolean) – indicates which tail to calculate. Defaults to true.

Returns

number – a probability

The final two methods in jstatFunc are submethods required by the pnorm calculation.

doDel(X, temp)

Arguments
  • X (number) –

  • temp (number) –

swapTail(x, cum, ccum, lower)

Swaps ccum and cum.

src.lib.stats.mle

Defines all functions needed to calculate the Maximum Likelihood Estimate and confidence interval for the Fisher’s Exact Test. These calculations are translations from R for use in React Native. The Maximum Likelihood Estimate is used to approximate the odds ratio.

There are two distinct groupings of methods in the mle file. The first are the statistical calculations that the odds ratio and confidence interval calculations depend on:

bd0(x, np)

Evaluates the “deviance part”

\[bd0(x,M) := M * D0(\frac {x} {M}) = M*[ \frac {x} {M} * log(\frac {x} {M}) + 1 - (\frac {x} {M}) ] = x * log(\frac {x} {M}) + M - x\]

where \(M = E[X] = n*p\) (or = lambda), for \(x, M > 0\)

in a manner that should be stable (with small relative error) for all x and \(M=np\). In particular for \(\frac {x} {np}\) close to 1, direct evaluation fails, and evaluation is based on the Taylor series of \(log(\frac {(1+v)} {(1-v)})\) with \(v = \frac {(x-M)} {(x+M)} = \frac {(x-np)} {(x+np)}\). From R.

Arguments
  • x (number) –

  • np (number) –
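
The two evaluation paths described above can be sketched as follows. The 0.1 threshold and the loop bound follow the R routine as best recalled and should be treated as assumptions; the series itself follows from the Taylor expansion named in the description.

```javascript
// Illustrative bd0: x * log(x / np) + np - x, evaluated stably.
function bd0(x, np) {
  if (Math.abs(x - np) < 0.1 * (x + np)) {
    // x / np is close to 1: sum the Taylor series in v = (x - np) / (x + np).
    let v = (x - np) / (x + np);
    let s = (x - np) * v;
    let ej = 2 * x * v;
    v *= v;
    for (let j = 1; j < 1000; j++) {
      ej *= v;
      const s1 = s + ej / (2 * j + 1);
      if (s1 === s) return s1; // converged to machine precision
      s = s1;
    }
  }
  // Otherwise direct evaluation is accurate enough.
  return x * Math.log(x / np) + np - x;
}

const nearOne = bd0(10.5, 10); // series path
const farApart = bd0(20, 10);  // direct path: 20 * ln(2) - 10
```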

stirlerr(n)

Computes the log of the error term in Stirling’s formula. For n > 15, uses the series 1/12n - 1/360n^3 + … For n <=15, integers or half-integers, uses stored values. For other n < 15, uses lgamma directly (don’t use this to write lgamma!)

\[\begin{split}\begin{align} stirlerr(n) &= \log(n!) - \log\left( \sqrt{2 \pi n} \left(\frac {n} {e}\right)^n \right)\\ &= \log \Gamma(n+1) - \frac {1} {2} \left[\log(2 \pi) + \log(n)\right] - n \left[\log(n) - 1\right]\\ &= \log \Gamma(n+1) - \left(n + \frac {1} {2}\right) \log(n) + n - \frac {\log(2 \pi)} {2} \end{align}\end{split}\]

From R.

Arguments
  • n (number) –

Returns

number – the log of the error term in Stirling’s formula
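
For integer n > 15, the series can be checked against the defining identity. The sketch below covers only the series branch; the function names are hypothetical and the stored-value and lgamma branches are omitted.

```javascript
// Series branch only: 1/(12n) - 1/(360n^3) + 1/(1260n^5) - 1/(1680n^7)
function stirlerrSeries(n) {
  const nn = n * n;
  return (1 / 12 - (1 / 360 - (1 / 1260 - 1 / 1680 / nn) / nn) / nn) / n;
}

// Defining identity for integer n:
// log(n!) - (n + 1/2) * log(n) + n - log(2 * pi) / 2
function stirlerrDirect(n) {
  let logFactorial = 0;
  for (let k = 2; k <= n; k++) logFactorial += Math.log(k);
  return logFactorial - (n + 0.5) * Math.log(n) + n - Math.log(2 * Math.PI) / 2;
}

const viaSeries = stirlerrSeries(20);
const viaIdentity = stirlerrDirect(20);
```

The two values agree to many digits, but the direct form subtracts large, nearly equal quantities, which is why a series is preferred for large n.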

dbinom(x, n, p, q, giveLog=false)

Computes the binomial probability. Check for 0 <= p <= 1 and 0 <= q <= 1 or NaN’s in the calling function. This is R’s dbinom_raw.

Arguments
  • x (number) –

  • n (number) –

  • p (number) – a number between 0 and 1

  • q (number) – 1-p

  • giveLog (boolean) – indicates whether to return the log

Returns

number – the binomial probability

dhyper(x, r, b, n, giveLog)

Given a sequence of r successes and b failures, we sample \(n \le (b+r)\) items without replacement. The hypergeometric probability is the probability of x successes:

\[p(x; r,b,n) = \frac {choose(r, x) * choose(b, n-x)} {choose(r+b, n)} = \frac {dbinom(x,r,p) * dbinom(n-x,b,p)} {dbinom(n,r+b,p)}\]

for any p. For numerical stability, we take \(p = \frac {n} {(r+b)}\); with this choice, the denominator is not exponentially small.

Arguments
  • x (number) – expected successes

  • r (number) – number of successes

  • b (number) – number of failures

  • n (number) – number of samples

  • giveLog (boolean) – indicates if the log of p should be returned.

Returns

number – p(x; r,b,n)
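
The identity between the choose-based form and the binomial-based form can be checked numerically. The sketch below uses a naive binomial probability for clarity rather than a numerically hardened one, and all function names are hypothetical.

```javascript
// Hypothetical helper: binomial coefficient via the multiplicative formula.
function choose(n, k) {
  if (k < 0 || k > n) return 0;
  const j = Math.min(k, n - k);
  let result = 1;
  for (let i = 1; i <= j; i++) result = (result * (n - j + i)) / i;
  return Math.round(result);
}

// Naive binomial probability, adequate for small illustrative inputs.
function binomialProb(x, n, p) {
  return choose(n, x) * p ** x * (1 - p) ** (n - x);
}

function dhyperViaChoose(x, r, b, n) {
  return (choose(r, x) * choose(b, n - x)) / choose(r + b, n);
}

function dhyperViaBinomial(x, r, b, n) {
  const p = n / (r + b); // the numerically convenient choice of p
  return (binomialProb(x, r, p) * binomialProb(n - x, b, p)) / binomialProb(n, r + b, p);
}

const viaChoose = dhyperViaChoose(1, 10, 14, 12);
const viaBinomial = dhyperViaBinomial(1, 10, 14, 12);
```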

dnhyper(ncp, logdc, support)

Density of the non-central hypergeometric distribution. From R. Does not work for boundary values of ncp (0, Inf).

Arguments
  • ncp (number) – the non-centrality parameter. 0 < ncp < Inf

  • logdc (Array.number) – Density of the central hypergeometric distribution on its support

  • support (Array.number) – the range

Returns

Array.number – an array
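
A sketch modelled on the corresponding helper inside R's fisher.test, as best recalled: the central log-density is reweighted by ncp^x on the log scale, shifted by its maximum for stability, and normalized. Treat the details as assumptions.

```javascript
// Illustrative reweighting of the central log-density by ncp^x.
function dnhyper(ncp, logdc, support) {
  const logDensity = support.map((x, i) => logdc[i] + Math.log(ncp) * x);
  // Subtracting the maximum before exponentiating avoids overflow and underflow.
  const maxLog = Math.max(...logDensity);
  const weights = logDensity.map((v) => Math.exp(v - maxLog));
  const total = weights.reduce((sum, w) => sum + w, 0);
  return weights.map((w) => w / total);
}

// Central density for m = 2, n = 2, k = 2 on the support 0..2 is 1/6, 4/6, 1/6.
const supportExample = [0, 1, 2];
const logdcExample = [1 / 6, 4 / 6, 1 / 6].map((p) => Math.log(p));
const densityNcp2 = dnhyper(2, logdcExample, supportExample); // about [1/13, 8/13, 4/13]
```

With ncp = 1 the weights are unchanged, so the result is the central density itself.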

mnhyper(ncp, support, logdc)

Calculates the sum of the density of the central hypergeometric distribution across the range. Support function for MLE calculation from R.

Arguments
  • ncp (number) – the non-centrality parameter 0 < ncp < Inf

  • support (Array.number) – the range

  • logdc (Array.number) – the log of p(x; r,b,n) (dhyper) over the range

Returns

number – the sum of the densities

pdhyper(x, NR, NB, n, logp)

Calculate

\[log \left(\frac {phyper (x, NR, NB, n, TRUE, FALSE)} {dhyper (x, NR, NB, n, FALSE)} \right)\]

without actually calling phyper. This assumes that

\[x * (NR + NB) \le n * NR\]

Arguments
  • x (number) –

  • NR (number) –

  • NB (number) –

  • n (number) –

  • logp (number) –

phyper(x, NR, NB, n, lowerTail, logp)

Sample of n balls from NR red and NB black ones; x are red

pnhyper(q, x, m, n, k, logdc, support, ncp=1, upperTail=false)

Calculates the cumulative distribution function for a negative hypergeometric random variable.

Arguments
  • q (number) –

  • x (number) –

  • m (number) –

  • n (number) –

  • k (number) –

  • logdc (Array.number) – the log of p(x; r,b,n) (dhyper) over the range

  • support (Array.number) – the range

  • ncp (number) –

  • upperTail (boolean) –

zeroin2(ax, bx, fa, fb, f, info, Tol, Maxit)

Function ZEROIN: obtains a zero of a function within the given range.

From r-source/src/library/stats/src/zeroin.c

Algorithm

G.Forsythe, M.Malcolm, C.Moler, Computer methods for mathematical computations. M., Mir, 1980, p.180 of the Russian edition

The function makes use of the bisection procedure combined with linear or quadratic inverse interpolation. At every step the program operates on three abscissae: a, b, and c.

b - the last and the best approximation to the root
a - the last but one approximation
c - the last but one or an even earlier approximation than a, such that
1) |f(b)| <= |f(c)|
2) f(b) and f(c) have opposite signs, i.e. b and c confine the root

At every step Zeroin selects one of two new approximations, the former obtained by the bisection procedure and the latter resulting from the interpolation (if a, b, and c are all different, quadratic interpolation is used, otherwise linear). If the latter (i.e. the point obtained by interpolation) is reasonable (i.e. lies within the current interval [b,c] and is not too close to the boundaries), it is accepted. Otherwise the bisection result is used. Therefore, the range of uncertainty is guaranteed to be reduced by at least a factor of 1.6. R_zeroin2() is faster for “expensive” f(), in those typical cases where f(ax) and f(bx) are available anyway.

Arguments
  • ax (number) – Left border of the range the root is searched for in

  • bx (number) – Right border of the range the root is searched for in

  • fa (number) – f(a)

  • fb (number) – f(b)

  • f (function) – Function under investigation f(x, info)

  • info (*) – Additional info passed on to f

  • Tol (number) – Acceptable tolerance for the root value. May be specified as 0.0 to cause the program to find the root as accurate as possible

  • Maxit (number) – Max # of iterations

Returns

Array – an array containing 0) an estimate for the root with accuracy 4*EPSILON*abs(x) + tol, 1) the actual number of iterations, or -1 if maxit was reached without convergence, and 2) the estimated precision; or undefined if the input was invalid
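
The bracketing idea can be illustrated with plain bisection, which is only the fallback step of the algorithm described above and not the full interpolation scheme. The function name and the shape of the return value are hypothetical, loosely mirroring the documented array.

```javascript
// Plain bisection (NOT the full Zeroin/Brent method); illustrative only.
// Returns [root estimate, iterations used or -1, half-width of final bracket],
// or undefined when the interval does not bracket a sign change.
function bisectRoot(f, ax, bx, tol = 1e-12, maxit = 200) {
  let a = ax;
  let b = bx;
  let fa = f(a);
  const fb = f(b);
  if (fa === 0) return [a, 0, 0];
  if (fb === 0) return [b, 0, 0];
  if (fa * fb > 0) return undefined;
  for (let iter = 1; iter <= maxit; iter++) {
    const mid = (a + b) / 2;
    const fm = f(mid);
    if (fm === 0 || (b - a) / 2 < tol) return [mid, iter, (b - a) / 2];
    if (fa * fm < 0) {
      b = mid; // the sign change lies in [a, mid]
    } else {
      a = mid; // the sign change lies in [mid, b]
      fa = fm;
    }
  }
  return [(a + b) / 2, -1, (b - a) / 2];
}

const sqrtTwo = bisectRoot((x) => x * x - 2, 0, 2);
```

Bisection halves the bracket each step, whereas the interpolation steps described above usually converge in far fewer function evaluations.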

sign(x)

Computes the ‘signum(.)’ function:

\[\begin{split}\begin{align} sign(x) &= 1 \text{ if } x > 0\\ sign(x) &= 0 \text{ if } x = 0\\ sign(x) &= -1 \text{ if } x < 0 \end{align}\end{split}\]

uniroot(f, interval, args, lower, upper, fLower, fUpper, extendInt="no", checkConv=false, tol, maxIter=1000)

Searches the interval from lower to upper for a root (i.e., zero) of the function f with respect to its first argument.

Arguments
  • f (*) – a function

  • interval (Array.number) – an array containing 2 numbers, the min and max of the interval

  • args (Array.any) – an array containing any other arguments that need to be passed to f

  • lower (number) – the lower end of the interval. Will be calculated from the interval param by default.

  • upper (number) – the upper end of the interval. Will be calculated from the interval param by default.

  • fLower (number) – result of f run on lower

  • fUpper (number) – result of f run on upper

  • extendInt (string) – indicates if and how to extend the interval

  • checkConv (boolean) – indicates whether a convergence warning from the underlying root finder should be caught as an error, and whether non-convergence within maxIter iterations should be an error instead of a warning.

  • tol (number) – the machine epsilon^0.25. Calculated by default.

  • maxIter (number) – defaults to 1000

The second grouping contains the actual calculations of the odds ratio and confidence interval:

calcOddsRatio(x, m, n, k, lo, hi, support, logdc)

Calculates the odds ratio using a maximum likelihood estimate. From R.

Arguments
  • x (number) – entry [0,0] from a 2x2 table

  • m (number) – the sum of col0

  • n (number) – the sum of col1

  • k (number) – the sum of row0

  • lo (number) – max(0, k - n)

  • hi (number) – min(k, m)

  • support (Array.number) – an array containing range(lo, hi + 1)

  • logdc (Array.number) – an array containing the log results of dhyper for each value in support given m, n, and k
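
The lo, hi, and support arguments follow directly from the definitions in the list above. A minimal sketch, using a hypothetical helper name:

```javascript
// Hypothetical helper that derives lo, hi and support from the margins.
function buildSupport(m, n, k) {
  const lo = Math.max(0, k - n);
  const hi = Math.min(k, m);
  const support = [];
  // Equivalent to range(lo, hi + 1): every feasible value of entry [0,0].
  for (let x = lo; x <= hi; x++) support.push(x);
  return { lo, hi, support };
}

console.log(buildSupport(3, 2, 4)); // { lo: 2, hi: 3, support: [ 2, 3 ] }
```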

calcNCPUpper(q, alpha, x, m, n, k, hi, support, logdc)

Calculates the upper end of the Confidence Interval.

Arguments
  • q (number) –

  • alpha (number) – a number 0 <= alpha <= 1 (1 - confidence_level)

  • x (number) –

  • m (number) –

  • n (number) –

  • k (number) –

  • hi (number) – the max of the range

  • support (Array.number) – the range

  • logdc (Array.number) – the log of p(x; r,b,n) (dhyper) over the range

Returns

number – the upper value of the Confidence Interval

calcNCPLower(q, alpha, x, m, n, k, lo, support, logdc)

Calculates the lower end of the Confidence Interval.

Arguments
  • q (number) –

  • alpha (number) – a number 0 <= alpha <= 1 (1 - confidence_level)

  • x (number) –

  • m (number) –

  • n (number) –

  • k (number) –

  • lo (number) – the min of the range

  • support (Array.number) – the range

  • logdc (Array.number) – the log of p(x; r,b,n) (dhyper) over the range

Returns

number – the lower value of the Confidence Interval

src.styles

Contains the style definitions for the components.

src.styles.formContainerStyles

Defines the styles for the Container components.

src.styles.formStyles

Defines the styles for the input forms components.

src.styles.resultsContainerStyles

Defines the styles for the ResultsContainer component.

src.templates

Contains the templates used by the components.

src.templates.horizontalrow

The template used by the input forms components that allows for a side-by-side layout of the input fields.