## Confidence interval for difference in proportions

Many methods have been devised for computing confidence intervals for the difference between two proportions, δ = p1 – p2. In practice, the statistician must choose which method to use depending on the sample sizes. For example, we can modify the plus-four confidence interval construction and obtain robust results. Instead we could simply calculate the exact difference. … 0's and 1's this command can be omitted.

Both of these population proportions are estimated by a sample proportion. In a similar way we can calculate a sample proportion from our second population: the estimate of p2 is p̂2. Because the two samples are independent, their variances add, so the variance of the sampling distribution of p̂1 – p̂2 is p1(1 – p1)/n1 + p2(1 – p2)/n2.

In the example, the difference between the sample proportions (females – males) is 0.53 – 0.34 = 0.19. If you had switched the males and females, you would have gotten –0.19 for this difference. Of course, there are some guys out there who wouldn't admit they'd ever seen an Elvis impersonator (although they've probably pretended to be one doing karaoke at some point).

This project was supported by the National Center for Advancing Translational Sciences, National Institutes of Health, through UCSF-CTSI Grant Numbers UL1 TR000004 and UL1 TR001872.

The following formula gives us a confidence interval for the difference of two population proportions: (p̂1 – p̂2) ± z* √[ p̂1(1 – p̂1)/n1 + p̂2(1 – p̂2)/n2 ]. The square root of the estimated variance is the standard error of the difference. (Refer to the following table for z*-values.)
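The large-sample interval described above, the estimated difference plus or minus z* times the square root of the summed variances, can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function name `wald_diff_ci`, its parameters, and the example inputs are assumptions introduced here, and z* is computed from the standard normal distribution instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist


def wald_diff_ci(p1_hat, n1, p2_hat, n2, confidence=0.95):
    """Large-sample (Wald) confidence interval for p1 - p2.

    The name and signature are illustrative assumptions, not a library API.
    """
    # z* is the critical value that leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Variances of independent sample proportions add; the standard error
    # is the square root of their sum.
    std_err = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    margin = z_star * std_err
    return diff - margin, diff + margin


# Hypothetical inputs: 50% of 200 in group 1, 40% of 200 in group 2.
low, high = wald_diff_ci(0.5, 200, 0.4, 200)
print(f"95% CI for p1 - p2: {low:.3f} to {high:.3f}")
```

Swapping the two groups in the call flips the sign of both endpoints, which is the same effect as the –0.19 obtained when males and females are switched.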
When a statistical characteristic of the two groups being compared is categorical, such as opinion on an issue (support/don't support), people want to report on the difference between the two population proportions; for example, the difference between the proportion of women and the proportion of men who support a four-day work week. When you are dealing with two population proportions, what you want is a confidence interval for the difference between them. Confidence intervals are not only used to represent a plausible range for a single parameter; they can also be constructed for an operation between parameters. In the process we will examine some of the theory behind this calculation. Often, however, the proportion is affected by covariates, and the predicted proportion is then adjusted using logistic regression.

The first population proportion is denoted by p1. It is estimated by a sample proportion, and we denote this statistic by p̂1. The mean of the sampling distribution of p̂1 is the proportion p1. The sampling distributions of the two sample proportions are each approximated by a normal distribution. We now need a few results from mathematical statistics in order to determine the sampling distribution of p̂1 – p̂2. A standard error is useful because it effectively estimates a standard deviation. The two sample proportions become the first part of our confidence interval. Notice that you could get a negative value for the difference p̂1 – p̂2.

Now we are ready to construct our confidence interval. Take 0.53 ∗ (1 – 0.53) to obtain 0.2491, then divide that by 100 to get 0.0025. Then take 0.34 ∗ (1 – 0.34) to obtain 0.2244.

Date created: 06/05/2001. Deborah J. Rumsey, PhD, is Professor of Statistics and Statistics Education Specialist at The Ohio State University.

Agresti and Caffo recommend an adjusted confidence limit that adds one success and one failure to each sample before the interval is computed.
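The Agresti-Caffo limit mentioned above can be sketched as follows. The sketch assumes the commonly cited form of the adjustment, which adds one success and one failure to each sample and then applies the usual large-sample formula to the adjusted proportions. The function name `agresti_caffo_ci` and the example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def agresti_caffo_ci(k1, n1, k2, n2, confidence=0.95):
    """Adjusted Wald (Agresti-Caffo) interval for p1 - p2.

    k1, k2 are success counts; n1, n2 are sample sizes.
    The name and signature are illustrative assumptions.
    """
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Add one success and one failure to each sample.
    p1_adj = (k1 + 1) / (n1 + 2)
    p2_adj = (k2 + 1) / (n2 + 2)
    std_err = sqrt(p1_adj * (1 - p1_adj) / (n1 + 2)
                   + p2_adj * (1 - p2_adj) / (n2 + 2))
    diff = p1_adj - p2_adj
    margin = z_star * std_err
    # A difference of proportions cannot fall outside [-1, 1].
    return max(-1.0, diff - margin), min(1.0, diff + margin)


# Hypothetical edge case: no successes in either sample of 10.
low, high = agresti_caffo_ci(0, 10, 0, 10)
print(f"95% CI for p1 - p2: {low:.3f} to {high:.3f}")
```

In this edge case the unadjusted formula would give a standard error of zero and therefore a zero-width interval, whereas the adjusted version still reports a nonzero margin.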
A confidence interval for a difference in proportions is a range of values that is likely to contain the true difference between two population proportions with a certain level of confidence. The result is called a confidence interval for the difference of two population proportions, p1 – p2. The default is the adjusted Wald (Agresti-Caffo) interval.

The parameter from the second population is p2. If the number of successes in our sample of size n2 from this population is k2, then our sample proportion is p̂2 = k2 / n2.

Your 95% confidence interval for the difference between the percentage of females who have seen an Elvis impersonator and the percentage of males who have seen one is 0.19 or 19% (which you got in Step 3), plus or minus 13%. The first of these values is the estimate for the parameter; you also need to factor in variation using the margin of error to be able to say something about the entire populations of men and women. The lower end of the interval is 0.19 – 0.13 = 0.06 or 6%; the upper end is 0.19 + 0.13 = 0.32 or 32%. Notice that all the values in this interval are positive.

To interpret these results within the context of the problem, you can say with 95% confidence that a higher percentage of females than males have seen an Elvis impersonator, and that the difference in these percentages is somewhere between 6% and 32%, based on your sample.
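The arithmetic behind the Elvis impersonator interval can be retraced with a short script. The sample proportions (0.53 and 0.34) come from the text, and the divisor of 100 is taken here to be the female sample size. The male sample size is not stated in this section, so the value of 110 used below is an assumption, as are the variable names.

```python
from math import sqrt
from statistics import NormalDist

p_female, n_female = 0.53, 100  # from the text
p_male = 0.34                   # from the text
n_male = 110                    # assumption: not given in this section

# Critical value for 95% confidence (about 1.96).
z_star = NormalDist().inv_cdf(0.975)

diff = p_female - p_male
std_err = sqrt(p_female * (1 - p_female) / n_female
               + p_male * (1 - p_male) / n_male)
margin = z_star * std_err

lower, upper = diff - margin, diff + margin
print(f"difference = {diff:.2f}, margin of error = {margin:.2f}")
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

With these inputs the script reproduces the reported result: a difference of 0.19 with a margin of error of 0.13, giving an interval from 0.06 to 0.32.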

