# 1. Testing of hypothesis on the variance of two normal populations.

Since the statistic is the same in both cases, it doesn't matter whether we use the correction or not; either way we'll see identical results when we compare the two methods using the techniques we've already described. Since the degrees-of-freedom correction changes depending on the data, we can't simply perform the simulation and compare it to a different number of degrees of freedom. The other thing that changes when we apply the correction is the p-value that we would use to decide if there's enough evidence to reject the null hypothesis.

What is the behaviour of the p-values? While not necessarily immediately obvious, under the null hypothesis, the p-values for a statistical test with a continuous test statistic should form a uniform distribution between 0 and 1; that is, any value in the interval 0 to 1 is just as likely to occur as any other value. For a uniform distribution on that interval, the quantile function is just the identity function: a value of 0.5 is greater than 50% of the data, and a value of 0.95 is greater than 95% of the data. As a quick check of this notion, let's look at the density of probability values when the null hypothesis is true:
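One way to check the uniformity claim is a small simulation. The sketch below is an illustration, not the original author's code. It assumes a simpler setting than the corrected t-test discussed above: a two-sided one-sample z-test with known standard deviation 1, chosen so that only the Python standard library is needed. The function names, the sample size of 30, and the 2000 repetitions are hypothetical choices.

```python
import random
from statistics import NormalDist


def null_p_values(n_sims=2000, n=30, seed=0):
    """Two-sided z-test p-values for samples drawn under H0 (mean 0, sd 1)."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    p_values = []
    for _ in range(n_sims):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # z = sample mean / (sigma / sqrt(n)) with sigma assumed known and equal to 1
        z = (sum(sample) / n) * n ** 0.5
        p_values.append(2.0 * (1.0 - std_normal.cdf(abs(z))))
    return p_values


def decile_fractions(p_values):
    """Fraction of p-values in each of the ten bins [0, 0.1), ..., [0.9, 1]."""
    counts = [0] * 10
    for p in p_values:
        counts[min(int(p * 10), 9)] += 1
    return [c / len(p_values) for c in counts]


p_values = null_p_values()
for i, frac in enumerate(decile_fractions(p_values)):
    print(f"[{i / 10:.1f}, {(i + 1) / 10:.1f}): {frac:.3f}")
```

If the p-values are uniform, each of the ten bins should hold roughly 10% of them, and about 5% of the p-values should fall below 0.05.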

## Here is how the process of statistical hypothesis testing works:

### So the null hypothesis in this example is

Finally, because of the significant costs associated with defense testing, questions about how much testing to do would be better addressed by statistical decision theory than by strict hypothesis testing. Cost considerations are especially important for complex single-shot systems (e.g., missiles) with high unit costs and highly reliable electronic equipment that might require testing over long periods of time (Meth and Read, ). Voting a system up or down against some standard of performance at a given decision point does not consider the potential for further improvements to the system. A better objective is to purchase the maximum possible military value/utility given the constraints of national security requirements and the budget. This broader perspective fits naturally into a decision analysis framework. Concerns about efficient use of testing resources have also stimulated work on reliability growth modeling (see the preceding section).

### The example states a 5% level of significance, so (α = 0.05).

If the biologist set her significance level α at 0.05 and used the critical value approach to conduct her hypothesis test, she would reject the null hypothesis if her test statistic *t*\* were less than -1.6939 (determined using statistical software or a *t*-table).
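The critical value rule for this left-tailed test can be written as a one-line comparison. The sketch below takes the critical value -1.6939 from the text; the function name and the two test statistics are hypothetical values chosen only to show both outcomes.

```python
def reject_left_tailed(t_stat, critical_value):
    """Left-tailed test: reject H0 when the statistic falls below the critical value."""
    return t_stat < critical_value


CRITICAL_VALUE = -1.6939  # from the text, for alpha = 0.05

for t_stat in (-2.10, -1.50):  # hypothetical test statistics, for illustration only
    if reject_left_tailed(t_stat, CRITICAL_VALUE):
        decision = "reject H0"
    else:
        decision = "fail to reject H0"
    print(f"t* = {t_stat:+.2f}: {decision}")
```

A statistic exactly equal to the critical value does not satisfy the strict inequality, so under this rule it would not lead to rejection.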

## Testing of hypothesis problems.

The power question is fairly straightforward. Clearly there is now no single power for our test of the hypothesis, but a different power for each possible value of the binomial probability included in the alternative hypothesis. The power is a function rather than a single value. This function is described in detail in the next section for Example 4.2.
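To make the idea of power as a function concrete, the sketch below evaluates it for a one-sided binomial test. This is not the Example 4.2 referred to above, whose details are not given here. The setup is assumed for illustration: 20 trials, null value 0.5, significance level 0.05, and a test that rejects for large counts. All function names are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def upper_tail(c, n, p):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(c, n + 1))


def critical_count(n, p0, alpha):
    """Smallest c such that P(X >= c | p0) <= alpha (upper-tail test)."""
    for c in range(n + 2):  # c = n + 1 gives an empty tail, so a value is always found
        if upper_tail(c, n, p0) <= alpha:
            return c


def power(p, n, p0, alpha):
    """Probability of rejecting H0: p = p0 when the true probability is p."""
    return upper_tail(critical_count(n, p0, alpha), n, p)


# Assumed illustrative setup: n = 20 trials, H0: p = 0.5, alpha = 0.05
for p in (0.5, 0.6, 0.7, 0.8, 0.9):
    print(f"p = {p:.1f}: power = {power(p, 20, 0.5, 0.05):.4f}")
```

Each value of the binomial probability in the alternative gives its own rejection probability, and the values rise as the true probability moves further from the null value. At the null value itself the function returns the actual size of the test, which can fall below the nominal level because the binomial distribution is discrete.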