# Thread: p-value of an odds ratio

1. Hi, I'm really bad at this kind of statistics, so I hope someone with a better understanding of how it works can help me.

I have a genetic data sheet, and I want to calculate an odds ratio. The control group has N=48, of which 15 are positive for a certain polymorphism (heterozygote). The patient group has N=88, of which 19 are heterozygote and 2 homozygote. To keep it simple I combined hetero- and homozygotes (21).

To get the odds ratio I did the following.

pc=positives for polymorphism (control group)
N=number total
nc=negatives for polymorphism (control group)
pp=positives for polymorphism (patient group)
np=negatives for polymorphism (patient group)

OR=(((pp*nc)/np)/pc)
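
As a minimal sketch of the formula above, using the counts given in this post (the name `np_` is only an illustrative stand-in for `np`, and the negatives are assumed to be group size minus positives):

```python
# Counts taken from the post: 21 of 88 patients and 15 of 48 controls are positive.
pp, n_patients = 21, 88
pc, n_controls = 15, 48

# Negatives are assumed to be everyone in the group who is not positive.
np_ = n_patients - pp   # negatives, patient group
nc = n_controls - pc    # negatives, control group

def odds_ratio(pp, np_, pc, nc):
    """Odds of being positive among patients divided by the odds among controls."""
    return (pp * nc) / (np_ * pc)

or_hat = odds_ratio(pp, np_, pc, nc)
print(f"OR = {or_hat:.4f}")
```

This is algebraically the same as `(((pp*nc)/np)/pc)`; with these counts it comes out at about 0.69.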

But I got stuck on how to calculate the p-value. I need a high and a low value to know whether the polymorphism is a significant change. This is the first time I have ever tried this. Though I know the solution (from a pre-filled spreadsheet), I want to know the mechanics.

The solution to this is

OR=0,670
p-Low=0,306
p-High=1,490

So any help on how to get to these values will be very appreciated.

2.

3. Hmm, I really want this answered. Sorry for the "spamming", but I'm kind of stuck (mostly in my head). If you need other information, please tell me.

4. Hmm, I'm still struggling along with this problem. Although I have the solution, I still don't have the mechanism. Is this question so hard that nobody can answer it? I really need it for my report, because when I get questioned I'll need a way to defend my claims, so I'll need to know how I got to that answer.

5. Hi Zwolver

I am not familiar with the problem you are asking about here, so some background on what a p-value for an odds ratio is might help.

6. For the odds ratio
Odds ratio - Wikipedia, the free encyclopedia

For the p-value
P-value - Wikipedia, the free encyclopedia

For the credibility interval,
Credible interval - Wikipedia, the free encyclopedia

It's a problem faced with biology in creating a genological databank. The point is to get a value that is sufficient to call a mutation or polymorphism relevant for the cause of a certain disease or deficiency. We use multiple healthy and sick people in our study.

7. Hi Zwolver, I hope it's not too late, but I think I can help (not in great detail, but enough to replicate the calculations).

Originally Posted by Zwolver
I have a genetic data sheet, and I want to calculate an odds ratio. The control group has N=48, of which 15 are positive for a certain polymorphism (heterozygote). The patient group has N=88, of which 19 are heterozygote and 2 homozygote. To keep it simple I combined hetero- and homozygotes (21).
So, restating the problem, we're interested in determining whether a certain polymorphism is observed more (or less) frequently in one group (patients) than in another group (the control). From the data it is easy to calculate the odds of an individual in the total population (patients+control) being positive for the polymorphism. Across both groups there are 36 positives and 100 negatives, so if the group is not a statistically significant factor then we'd expect the odds of having the polymorphism (in either group) to be 36/100 = 0.36. A simple calculation reveals that this is not so (15/33 ≈ 0.45 for the control group, 21/67 ≈ 0.31 for the patients), but is it a statistically significant result?
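
A small sketch of that comparison, using only the counts quoted above (the variable names are illustrative):

```python
# Positives and negatives per group, from the counts in the opening post.
pos_patients, neg_patients = 21, 88 - 21
pos_controls, neg_controls = 15, 48 - 15

# Odds = positives / negatives (not positives / total, which would be a proportion).
odds_patients = pos_patients / neg_patients
odds_controls = pos_controls / neg_controls
pooled_odds = (pos_patients + pos_controls) / (neg_patients + neg_controls)

print(f"patients: {odds_patients:.3f}")
print(f"controls: {odds_controls:.3f}")
print(f"pooled:   {pooled_odds:.3f}")
```

The pooled odds sit between the two group odds, and the ratio of the patient odds to the control odds is the odds ratio discussed in the rest of the thread.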

Originally Posted by Zwolver
But I got stuck on how to calculate the p-value. I need a high and a low value to know whether the polymorphism is a significant change. This is the first time I have ever tried this. Though I know the solution (from a pre-filled spreadsheet), I want to know the mechanics.

The solution to this is

OR=0,670
p-Low=0,306
p-High=1,490
I'm not sure what you mean by high and low p-values; the values provided look more like confidence interval bounds for the Odds Ratio (OR). But I can show the mechanics of that too.

Wiki says that the natural log of the OR statistic will have an (approximately) normal distribution, i.e. $\ln(\widehat{OR}) \sim N(\ln(OR),\ \sigma^2)$, where $\widehat{OR}$ is the sample odds ratio and $OR$ is the population odds ratio, the sample odds ratio being an approximation to the population odds ratio. Knowing the distribution of the test statistic allows us to generate a confidence interval:

$$\ln(\widehat{OR}) \pm z_{\alpha/2}\sqrt{\frac{1}{\text{pp}}+\frac{1}{\text{np}}+\frac{1}{\text{pc}}+\frac{1}{\text{nc}}}$$

Note that the square root term is just the Standard Error (SE), i.e. an estimate of the standard deviation of $\ln(\widehat{OR})$ (as given by wiki).

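Now we have all of the data we need to calculate a 95% CI for Ln(OR), i.e. when z = 1.96. Note: I obtained a slightly different value for $\widehat{OR}$, mine being 0.6895 (which I've stuck with since I also don't know what level the CI in the spreadsheet uses).

$$\ln(0.6895) \pm 1.96\sqrt{\frac{1}{21}+\frac{1}{67}+\frac{1}{15}+\frac{1}{33}} \approx -0.372 \pm 0.783, \quad\text{i.e.}\quad -1.155 \le \ln(OR) \le 0.411$$

or in terms of OR,

$$0.315 \le OR \le 1.509.$$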
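
The confidence interval steps described above can be sketched in a few lines. This is only an illustration of the log-scale (Wald-type) interval from the Wikipedia formula, with the thread's counts plugged in; the function name and `np_` are made up for the example.

```python
import math

def or_confidence_interval(pp, np_, pc, nc, z=1.96):
    """Confidence interval for an odds ratio, built on the log scale."""
    or_hat = (pp * nc) / (np_ * pc)
    # Standard error of ln(OR): square root of the summed reciprocal cell counts.
    se = math.sqrt(1 / pp + 1 / np_ + 1 / pc + 1 / nc)
    log_or = math.log(or_hat)
    log_lo, log_hi = log_or - z * se, log_or + z * se
    # Exponentiate the log-scale bounds to get bounds for the OR itself.
    return or_hat, (log_lo, log_hi), (math.exp(log_lo), math.exp(log_hi))

or_hat, log_ci, or_ci = or_confidence_interval(pp=21, np_=67, pc=15, nc=33)
print(f"OR = {or_hat:.4f}")
print(f"95% CI for ln(OR): ({log_ci[0]:.3f}, {log_ci[1]:.3f})")
print(f"95% CI for OR:     ({or_ci[0]:.3f}, {or_ci[1]:.3f})")
```

With these counts the interval for the OR runs from roughly 0.32 to 1.51, so it contains 1. A different `z` (for example 1.645 for a 90% interval) can be passed if the spreadsheet uses another level.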

Now back to statistical significance. If the frequency of positives is statistically independent of which group the patient is in, then we expect the Odds Ratio to be equal to one. It isn't, and so we examine whether or not our OR value is extreme enough to provide sufficient evidence to dismiss the null hypothesis, which in this case is that the odds ratio is equal to one. Again, using the natural log of OR simplifies things greatly, since the two-sided p-value (again from wikipedia) is given by

$$p = 2\,P\!\left(Z \le -\frac{\left|\ln(\widehat{OR})\right|}{SE}\right), \qquad Z \sim N(0, 1),$$

which in this case turns out to be 0.3524. Most criteria that I've seen require a p-value of 0.05, or lower, to reject the null hypothesis. Thus by this same criterion there is insufficient evidence to reject the null hypothesis (which should not be confused with evidence in support of the null hypothesis).
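
For anyone wanting to reproduce the p-value, here is a rough sketch of the same z-test on the log odds ratio. It uses only the standard library; `math.erfc` is used for the normal tail, since for a standard normal Z, 2·P(Z > |z|) equals erfc(|z|/√2). The function name is illustrative.

```python
import math

def log_or_p_value(pp, np_, pc, nc):
    """Two-sided p-value for H0: OR = 1, via a z-test on ln(OR)."""
    log_or = math.log((pp * nc) / (np_ * pc))
    se = math.sqrt(1 / pp + 1 / np_ + 1 / pc + 1 / nc)
    z = log_or / se
    # Two-sided tail probability of a standard normal beyond |z|.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

z, p = log_or_p_value(pp=21, np_=67, pc=15, nc=33)
print(f"z = {z:.3f}, p = {p:.3f}")
```

With the thread's counts this gives a p-value of about 0.35, well above the usual 0.05 threshold.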

Let me know if this explanation is unsatisfying. The only things I don't think I could elaborate on are the reason why the log-odds ratio has a normal distribution and how to derive the standard error term.

8. Well, it has clarified a lot. I am not a mathematician, though, and I usually use spreadsheet forms for this, which automatically provide a result. Can you elaborate on the following formula?

$$\ln(\widehat{OR}) \pm z_{\alpha/2}\sqrt{\frac{1}{\text{pp}}+\frac{1}{\text{np}}+\frac{1}{\text{pc}}+\frac{1}{\text{nc}}}$$

I know how to fill it in, but I want to see the logic behind the 1/x terms inside the square root. It simply uses all of the values, and if one value is extremely low (like 1), there will be a large variation in + and -? I also understand the Z value, which in this case is indeed 1,96 from the table.

But how do I deal with the Ln(OR) (with the ^, not sure if it means anything)? Do I simply take Ln of the OR I calculated?

9. =LN(OR)-1,96*(SQRT((1/PP)+(1/PC)+(1/NP)+(1/NC)))

This is what I filled into Excel (pp, pc etc. are cell references in that formula), but it did not work. I get an answer, but it's incorrect. What am I missing?

10. Originally Posted by Zwolver
Well, it has clarified a lot. I am not a mathematician, though, and I usually use spreadsheet forms for this, which automatically provide a result. Can you elaborate on the following formula?

$$\ln(\widehat{OR}) \pm z_{\alpha/2}\sqrt{\frac{1}{\text{pp}}+\frac{1}{\text{np}}+\frac{1}{\text{pc}}+\frac{1}{\text{nc}}}$$

I know how to fill it in, but I want to see the logic behind the 1/x terms inside the square root. It simply uses all of the values, and if one value is extremely low (like 1), there will be a large variation in + and -? I also understand the Z value, which in this case is indeed 1,96 from the table.

But how do I deal with the Ln(OR) (with the ^, not sure if it means anything)? Do I simply take Ln of the OR I calculated?
The formula computes the upper and lower bounds of a level $1-\alpha$ confidence interval (CI) for the natural log of the odds ratio. Once this is done we can convert these values back into odds ratios, which with z = 1.96 gives a 95% CI for the odds ratio. I'm almost certain that the p-high and p-low values in your opening post are the upper and lower bounds of a 95% CI for the odds ratio, for the simple reason that when I repeat the calculations using the values you provided I obtain the same result.

The term $\sqrt{\frac{1}{\text{pp}}+\frac{1}{\text{np}}+\frac{1}{\text{pc}}+\frac{1}{\text{nc}}}$ in this case is the standard error (of the Ln(OR) value), and it is an estimate of the standard deviation of that statistic. How this particular standard error term is derived is something that we will have to look up.
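
On the question about the 1/x terms: a common textbook explanation (not derived in this thread, so treat it as an assumption here) is that the variance of the log of a count is roughly 1/count, and ln(OR) is a sum and difference of four such logs, so the four variances add. The sketch below only illustrates the consequence, comparing the thread's counts with a hypothetical table in which one cell is 1.

```python
import math

def log_or_se(*cells):
    """Standard error of ln(OR): sqrt of the sum of reciprocal cell counts."""
    return math.sqrt(sum(1 / c for c in cells))

# Counts from the thread: pp, np, pc, nc.
se_observed = log_or_se(21, 67, 15, 33)

# Hypothetical table where one cell is very small (pp = 1).
se_sparse = log_or_se(1, 67, 15, 33)

print(f"SE with observed counts:   {se_observed:.3f}")
print(f"SE with one cell set to 1: {se_sparse:.3f}")
print(f"CI width factor on OR scale: exp(1.96*SE) = "
      f"{math.exp(1.96 * se_observed):.2f} vs {math.exp(1.96 * se_sparse):.2f}")
```

A single very small cell dominates the sum, so the interval becomes much wider, which matches the intuition in the question.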

11. Meh, I still can't get the function to work. It gives negative values at some points, and even a low value higher than the high value of the CI.

12. Originally Posted by Zwolver
Meh, I still can't get the function to work. It gives negative values at some points, and even a low value higher than the high value of the CI.
If that's for the ln(OR) confidence interval then negative values are what you should be getting; you need to convert these values back to an OR value, i.e. if $\ln(OR) = x$ then $OR = e^{x}$. It should not be possible to get negative values for the OR confidence interval.
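
A short sketch of that back-conversion step. The log-scale bounds below are illustrative values, roughly what the thread's counts give; the helper name is made up.

```python
import math

def back_transform(log_lo, log_hi):
    """Convert CI bounds for ln(OR) into CI bounds for OR."""
    return math.exp(log_lo), math.exp(log_hi)

# Illustrative bounds on the log scale; a negative bound is perfectly valid here.
log_lo, log_hi = -1.155, 0.411
or_lo, or_hi = back_transform(log_lo, log_hi)

print(f"ln(OR) interval: ({log_lo}, {log_hi})")
print(f"OR interval:     ({or_lo:.3f}, {or_hi:.3f})")
```

Because the exponential of any real number is positive, the OR bounds can never be negative, and a negative log bound simply corresponds to an OR below 1. In a spreadsheet the same step would amount to wrapping the whole LN(...) ± 1,96*SQRT(...) expression in EXP(...).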

13. Originally Posted by wallaby
Originally Posted by Zwolver
Meh, I still can't get the function to work. It gives negative values at some points, and even a low value higher than the high value of the CI.
If that's for the ln(OR) confidence interval then negative values are what you should be getting; you need to convert these values back to an OR value, i.e. if $\ln(OR) = x$ then $OR = e^{x}$. It should not be possible to get negative values for the OR confidence interval.
I know, that's why I posted that. I'll need to recheck my values, but I think I'll just go to my supervisor and ask him to explain it, though I doubt he knows any more than I do about it.

14. I think I figured it out. I approached it from the wrong side. I actually needed the p-value of a Chi-square test, and the reliability of the OR. It seems the person who asked me to do it left out what exactly I had to do, or I wrote it down wrong. Thanks anyway wallaby, you still cleared a lot up.
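
For completeness, a rough sketch of a chi-square test on the same 2x2 table. The thread does not say which variant was wanted, so this assumes the plain Pearson chi-square without continuity correction, with 1 degree of freedom; the tail probability uses the identity P(χ²₁ > x) = erfc(√(x/2)).

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Rows: patients, controls. Columns: positive, negative (counts from the thread).
chi2, p_chi2 = chi_square_2x2(21, 67, 15, 33)
print(f"chi-square = {chi2:.3f}, p = {p_chi2:.3f}")
```

With these counts the p-value is again about 0.35, in line with the z-test on ln(OR) earlier in the thread.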

15. Originally Posted by Zwolver
I think I figured it out. I approached it from the wrong side. I actually needed the p-value of a Chi-square test, and the reliability of the OR. It seems the person who asked me to do it left out what exactly I had to do, or I wrote it down wrong. Thanks anyway wallaby, you still cleared a lot up.
This has happened to me a couple of times, glad you got it straightened out.
