This page describes the relationship between sample size and error in estimating **Standard Deviations (SD)** from a sample.

Most statistical calculations rest on the critical assumption that the SD of a measurement is known and valid, so the precision of an estimated SD is important.

In many clinical situations an assumed value for SD is used, taken from published figures or estimated from a small pilot study, but this often results in misleading conclusions.

In particular, in engineering and in high precision biochemical laboratories, a valid assumed SD is critical, and the calculations presented in this panel are one of the methods used to obtain it.

The error is the confidence interval of the estimated SD, expressed as a percentage of the SD value. The greater the sample size, the narrower this confidence interval.
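
The relationship can be written out as a short derivation. This is a sketch that assumes normally distributed measurements; the symbols $u$ (error as a fraction of the SD), $\nu$ (degrees of freedom), $s$ (sample SD) and $\sigma$ (true SD) are notation introduced here for illustration. The second line is the probability that the `Confidence` function in the R code on this page evaluates.

```latex
\frac{(n-1)\,s^{2}}{\sigma^{2}} \sim \chi^{2}_{\nu}, \qquad \nu = n - 1

P\bigl((1-u)\,\sigma < s < (1+u)\,\sigma\bigr)
  = P\bigl((1-u)^{2}\,\nu < \chi^{2}_{\nu} < (1+u)^{2}\,\nu\bigr)
```

For a fixed $u$ this probability rises as $\nu$ grows, which is why a larger sample gives a narrower interval at the same confidence.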

Two calculations are offered in this panel:

- At the planning stage of the study, to estimate the sample size required to establish a desired confidence interval.
- After collecting all the data, to estimate the SD and its confidence interval.

The other sub-panels for SD are:

- Javascript program to calculate sample size, and to estimate confidence intervals
- R code to do the same
- Tables of sample size and confidence intervals in the commonly used range of values

**References**

Greenwood JA and Sandomire MM (1950) Journal of the American Statistical Association 45 (250) p. 257-260

Burnett RW (1975) Accurate estimation of standard deviations for quantitative methods used in clinical chemistry. Clin. Chem. 21 (13) p. 1935-1938

This sub-panel provides R code for estimating the sample size needed, and the error of an estimated Standard Deviation. The algorithm is as described in

Burnett RW (1975) Accurate estimation of standard deviations for quantitative methods used in clinical chemistry. Clin. Chem. 21 (13) p. 1935-1938

**Section 1. Supportive subroutines**

# Upper-tail probability for a chi-square value chi with degF degrees of freedom
ChiSqToP <- function(chi, degF)
{
  return(1 - pchisq(chi, df = degF))
}
# Probability that the sample SD lies within a fraction u of the true SD,
# for df degrees of freedom; common to both SSizSD and ErrSD
Confidence <- function(u, df)
{
  x1 = (1 - u)^2 * df   # lower chi-square bound
  x2 = (1 + u)^2 * df   # upper chi-square bound
  return(ChiSqToP(x1, df) - ChiSqToP(x2, df))
}
# Calculate sample size from % confidence and % error
SSizSD <- function(Cf, Er) # Cf = percent confidence, Er = tolerable error (as % of SD found)
{
  Cf = Cf / 100  # converts percent to probability
  Er = Er / 100  # converts percent to proportion
  # bisection search on degrees of freedom for the required sample size
  nl = 0
  nr = 100000000
  nm = nr / 2
  p = Confidence(Er, round(nm))
  while (abs(nl - nr) > 1)
  {
    if (p < Cf) { nl = nm } else { nr = nm }
    nm = (nr + nl) / 2.0
    p = Confidence(Er, round(nm))
  }
  return(ceiling(nm) + 1)  # nm is df, so sample size is df + 1
}
# Calculate error (as % of SD) from % confidence and sample size
ErrSD <- function(cf, ssiz) # cf = percent confidence, ssiz = sample size
{
  cf = cf / 100   # converts percent to probability
  df = ssiz - 1   # degrees of freedom
  # bisection search for the error
  ul = 0
  ur = 1
  um = 0.5
  p = Confidence(um, df)
  while (abs(ul - ur) > 0.0001)
  {
    if (p > cf) { ur = um } else { ul = um }
    um = (ul + ur) / 2.0
    p = Confidence(um, df)
  }
  return(um * 100)  # error as a percent
}
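
The hand-written bisection in the error calculation can also be expressed with base R's `uniroot()`. The following is a minimal sketch, not the page's own method: the names `conf_within` and `err_sd_uniroot` are hypothetical, and it assumes the solution lies between 0 and 100% error, which may fail for very small samples at high confidence.

```r
# Hypothetical helper: probability that s/sigma lies within 1 +/- u,
# using the chi-square distribution with df degrees of freedom.
conf_within <- function(u, df) {
  pchisq((1 + u)^2 * df, df) - pchisq((1 - u)^2 * df, df)
}

# Hypothetical alternative to the bisection loop: solve
# conf_within(u, df) = cf for u with uniroot().
# Assumes the root lies in (0, 1).
err_sd_uniroot <- function(cf_pct, ssiz) {
  cf <- cf_pct / 100          # percent confidence to probability
  df <- ssiz - 1              # degrees of freedom
  root <- uniroot(function(u) conf_within(u, df) - cf,
                  interval = c(1e-8, 1), tol = 1e-10)
  root$root * 100             # error as a percent
}

err_90_137 <- err_sd_uniroot(90, 137)
print(err_90_137)
```

Using a library root finder removes the need to choose a stopping tolerance by hand, at the cost of an error being raised when the interval does not bracket a solution.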

**Section 2. Main Programs**
**Program 1. Sample size**

txt = ("
Cf Err
90 10
95 5
99 1
") # input data Cf=% confidence Err= error in % of SD
df <- read.table(textConnection(txt),header=TRUE)
#df # optional display of input data frame
# extract arrays from data frame
arCf <- df$Cf
arErr <- df$Err
# Create results array
arSSiz <- vector()
# Calculate
for(i in 1:nrow(df))
{
  cf = arCf[i]
  er = arErr[i]
  arSSiz <- append(arSSiz,SSizSD(cf, er)) # append sample size to result array
}
# Incorporate results into original data frame
df$SSiz <- arSSiz
# show data frame with results
df

The results are as follows:

> df
  Cf Err  SSiz
1 90  10   137
2 95   5   769
3 99   1 33175
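
As a rough cross-check of these sample sizes, a large-sample normal approximation can be used in place of the chi-square calculation. This sketch is not the algorithm used on this page; it assumes the sample SD is approximately normal with standard error SD / sqrt(2(n - 1)), and the function name `approx_ssiz_sd` is hypothetical.

```r
# Large-sample approximation (not the chi-square method of this page):
# s is roughly normal with standard error sigma / sqrt(2 * (n - 1)),
# so n is about 1 + (z / u)^2 / 2, where u is the fractional error.
approx_ssiz_sd <- function(cf_pct, err_pct) {
  z <- qnorm((1 + cf_pct / 100) / 2)  # two-sided normal quantile
  u <- err_pct / 100                  # percent error to proportion
  1 + (z / u)^2 / 2                   # approximate (non-integer) sample size
}

approx_n <- approx_ssiz_sd(c(90, 95, 99), c(10, 5, 1))
print(round(approx_n, 1))
```

For the three rows in the table, the approximation lands within about one unit of the sample sizes found by the iterative calculation, which suggests it is adequate for quick planning when the sample is large.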

**Program 2. Error** as % of SD found

txt = ("
Cf SSiz
90 137
95 769
99 33175
") # input data Cf=% confidence, SSiz = sample size
df <- read.table(textConnection(txt),header=TRUE)
#df # optional display of input data frame
# extract arrays from data frame
arCf <- df$Cf
arSSiz <- df$SSiz
# Create results array
arEr <- vector()
# Calculate
for(i in 1:nrow(df))
{
  cf = arCf[i]
  ssiz = arSSiz[i]
  arEr <- append(arEr,ErrSD(cf, ssiz)) # append error to result array
}
# Incorporate results into original data frame
df$Error <- arEr
# show data frame with results
df

The results are:

> df
  Cf  SSiz     Error
1 90   137 9.9700928
2 95   769 5.0018311
3 99 33175 0.9979248
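
One way to see what these figures mean is a small simulation. The sketch below assumes normally distributed data with a true SD of 1, and the seed and number of simulated studies are arbitrary choices made for illustration. It draws many samples of 137 values and counts how often the sample SD falls within 9.97% of the true SD, which should happen in roughly 90% of samples.

```r
set.seed(1)                # fixed seed so the run is repeatable
n      <- 137              # sample size from the first row of the table
u      <- 0.0997           # error for n = 137 at 90% confidence, as a fraction
n_sims <- 20000            # number of simulated studies (arbitrary choice)

# Sample SD of n standard normal values, repeated n_sims times (true SD = 1)
sds <- replicate(n_sims, sd(rnorm(n)))

# Proportion of simulated SDs within +/- u of the true SD
coverage <- mean(abs(sds - 1) < u)
print(coverage)
```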