Notes on power simulation

In order to obtain power/sample size estimates by simulation, we need to understand what power is. The power of a test is the probability, assuming that the ALTERNATIVE hypothesis is true, of rejecting the null hypothesis. That means we need to simulate data under the alternative, compute the test statistic, then see if we reject the null (ie, is the p-value smaller than alpha).

Power depends on:
  - the true effect size (in our case, the true difference in means)
  - the SD of the observations
  - alpha
  - sample size n

We can only compute the power at specific alternatives (effect size) one at a time. We are also generally interested in how the power varies across changes as these 4 characteristics change. Thus, we want to obtain estimates of power for different combinations of these 4 things.

Here is a sketch of how you would do the simulation. It should be fairly straightforward to take this outline and code it in R. 

0. Work with the log rainfall, since that appears to be approximately normally distributed (you can see this by looking at normal QQ plots for each group).

1. Choose some scenarios to simulate:
  - start with a few reasonable effect sizes (for example, start with (perhaps approximately, or rounded) the observed difference in means and then choose a few other possibilities larger/smaller)
  - do the same for the SD. You could use different group SDs, or use the same if you would like to be able to match the values from power.t.test. Once you know your routine is working, it would be useful to look at different group SDs, since that seems to be more realistic
  - Use alpha sizes 0.10, 0.05, 0.01
  - You have to experiment to figure out which values of n will yield the level of power that we want (aim for power of 80%, 90% and 95%; you can use the same n for each group)

2. Suppose we have the group means specified as m11, m12, m13 and m21, m22, m23, to produce the 3 effect sizes ef1 = m11 - m21, ef2 = m12 - m22, ef3 = m13 - m23 (those 3 effect sizes should be different).

Choose 3 SDs (assume the same for each group): sd1, sd2, sd3

You have 3 alpha values: a1, a2, a3

Experiment a little with n, this will need some tuning to get it to match the desired power. Values from power.t.test can give a good starting point.

3. You will probably have a (multiply) nested loop structure to make the combinations for the multiple scenarios, it should look something like this:

set.seed(54321)
## you set the seed of the random number generator in order to make your work reproducible
## the value you input can be any integer (ie, you should pick your own value, you don't need to use 54321)

  for (m1_m2 in (combinations of m1 and m2)) {
    for(sd in sd1, sd2, sd3) {
      for(alpha in a1, a2, a3){
        for(n) {
 
          -- set power counter to 0

          for(1: number of reps (eg, 100)) {
        
            -- generate random samples of size n from each of 2 normal distributions (use rnorm),
            one with mean m1, sd and the other with mean m2, sd

            -- compute the 2-sample t-test for difference in means

            -- check whether the (1-sided) p-value is less than alpha

            -- if yes, power counter <- power counter + 1

          }
          
          power for that scenario = power counter/number of reps

        }
      }   
    }
  }

You should start smaller than the entire loop to test out whether your program is working properly. You might also prefer to put the sample size n at the outside level of the nested loops.

You will need to keep the n values that match the power you are trying to attain for that scenario.

4. Once you have all the results, you should organize them and present them using tables; graphs are also very useful for this.

5. Good luck, and ask if you have questions!