Thread: Calculating statistical relevance

  1. #1 Calculating statistical relevance 
    Forum Professor Zwolver's Avatar
    Join Date
    May 2006
    Location
    Netherlands
    Posts
    1,667
    Hi all

    I'm having a statistical issue. I have a two-sided dataset in which I am trying to compare isotopes of 2 different radioactive compounds for statistical significance (I want to look at every point and be able to say whether it is within the margins or not). My question is, how do I do this for each point? (This is a picture of the points)



    I have tried so far;

    T-Test
    ChiSquared
    And, just to see if it had any effect, a normal distribution etc.

    I'm using Excel, and it has been a while since I've done any statistical math.

    I hope someone can help.


    Last edited by Zwolver; April 9th, 2014 at 05:22 AM.
    Growing up, i marveled at star-trek's science, and ignored the perfect society. Now, i try to ignore their science, and marvel at the society.

    Imagine, being able to create matter out of thin air, and not coming up with using drones for boarding hostile ships. Or using drones to defend your own ship. Heck, using drones to block energy attacks, counterattack or for surveillance. Unless, of course, they are nano-machines in your blood, which is a billion times more complex..
  3. #2  
    Forum Professor Zwolver's Avatar
    Join Date
    May 2006
    Location
    Netherlands
    Posts
    1,667
    I think i have been going at this the wrong way.

    Maybe it's as simple as checking whether the difference between the measured value and the linearly extrapolated value (the function) is more than twice the standard deviation. But then again, which model proves the statistical relevance?



  4. #3  
    Forum Radioactive Isotope MagiMaster's Avatar
    Join Date
    Jul 2006
    Posts
    3,440
    While I can't give you a complete answer, one thing I notice missing here is a confidence level. In statistics, nothing can be certain, and error bars only show you the range where you can say "I'm this certain the answer lies between these bars." (A 100% confidence interval would just be everything.) BTW, for a normal distribution, 2-sigma (two standard deviations either way) is very close to 95% confidence (about 95.4%), 1-sigma is about 68%, and 3-sigma is about 99.7%. (Those figures only hold for a normal distribution. For other distributions the coverage can be lower: Chebyshev's inequality guarantees only 75% within 2-sigma and about 89% within 3-sigma.)
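    The k-sigma coverage figures can be checked directly. This is a minimal sketch, assuming a normal distribution for the first function and using Chebyshev's inequality as the distribution-free lower bound for the second; the function names are illustrative, not from any package.

```python
import math

def normal_coverage(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

def chebyshev_lower_bound(k: float) -> float:
    """Minimum coverage within k standard deviations for any distribution with finite variance."""
    return max(0.0, 1.0 - 1.0 / k**2)

for k in (1, 2, 3):
    print(f"{k}-sigma: normal {normal_coverage(k):.4f}, "
          f"any distribution >= {chebyshev_lower_bound(k):.4f}")
```

    The gap between the two columns is why the choice of distribution matters before quoting a confidence level.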

  5. #4  
    Forum Professor Zwolver's Avatar
    Join Date
    May 2006
    Location
    Netherlands
    Posts
    1,667
    Well, I can read to a precision of 0,001, and the equipment is about 99,5% reliable. However, the fluctuations in other readings indicate a 0,38% standard deviation. But if I read at higher energy levels there seems to be a 4% standard deviation. This, however, is a low-energy variant, so only 0,38%.

    So 0,99999*0,995*0,9962?

  6. #5  
    Forum Radioactive Isotope MagiMaster's Avatar
    Join Date
    Jul 2006
    Posts
    3,440
    You just pick the confidence level you want. The higher the level, the wider the interval, and the more confident you can be that a point falling outside it is a real deviation and not just a statistical fluke. That said, yeah, it probably doesn't make much sense to pick a confidence level much higher than the uncertainties in your data. The numbers you've given work out to pretty close to 99%, so 3-sigma wouldn't be unreasonable. In that case, you just take the mean plus/minus 3 times the standard deviation as your interval.
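    The arithmetic here can be sketched as follows. Multiplying the three factors treats them as independent probabilities, which is a back-of-the-envelope assumption taken from the thread rather than a rigorous error budget. The reading of 100.0 with a standard deviation of 0.38 is a hypothetical value, used only to show the interval.

```python
# Factors from the earlier post, with decimal commas written as points.
readout = 0.99999        # reading resolution factor
reliability = 0.995      # stated equipment reliability (99.5%)
precision = 1 - 0.0038   # 0.38% standard deviation -> 0.9962

# Assumption: the three factors are independent, so they multiply.
combined = readout * reliability * precision
print(f"combined: {combined:.4f}")

def sigma_interval(mean: float, sd: float, k: float = 3.0) -> tuple[float, float]:
    """Return (mean - k*sd, mean + k*sd)."""
    return mean - k * sd, mean + k * sd

# Hypothetical reading: 100.0 with a standard deviation of 0.38.
low, high = sigma_interval(100.0, 0.38)
print(f"3-sigma interval: {low:.2f} to {high:.2f}")
```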

  7. #6  
    Forum Professor Zwolver's Avatar
    Join Date
    May 2006
    Location
    Netherlands
    Posts
    1,667
    3 times the standard deviation sounds incredibly high. I haven't calculated it yet, but with 3*SD even negative numbers will be possible and within the margins of error. Is there no way to correct for this?

  8. #7  
    Forum Professor river_rat's Avatar
    Join Date
    Jun 2006
    Location
    South Africa
    Posts
    1,509
    Quote Originally Posted by Zwolver View Post
    3 times the standard deviation sounds incredibly high. I haven't calculated it yet, but with 3*SD even negative numbers will be possible and within the margins of error. Is there no way to correct for this?
    Pick a different distribution? Lognormal?
    As is often the case with technical subjects we are presented with an unfortunate choice: an explanation that is accurate but incomprehensible, or comprehensible but wrong.

  9. #8  
    Forum Radioactive Isotope MagiMaster's Avatar
    Join Date
    Jul 2006
    Posts
    3,440
    You can pick a lower confidence level, which gives a narrower interval, so fewer data points would fall inside it. If you don't want to do that, you'd have to abandon the assumption that things are normally distributed, but then you wouldn't be able to just say plus/minus this amount gives me this much confidence. You'd have to pick a distribution that better fits what your data should be, but that's tricky, and getting a confidence interval out of it requires some number crunching (as in evaluating integrals). Also, because the normal distribution is kind of special (averages of many independent effects tend toward it), it's the usual default assumption. If you assume your data follows some other distribution, you'll have to give some reasons or some data to back that up.

    Edit: As river_rat said, a log-normal distribution might be a good place to start. I don't know exactly what you're measuring though, so I can't really suggest anything more specific. If you're doing something like counting hits on a Geiger counter, for example, you'd expect that to follow a Poisson distribution.
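    To illustrate why a log-normal assumption avoids negative bounds, here is a minimal sketch. It compares a plain mean plus/minus z*SD interval with one built on the logarithms of the values and transformed back. The readings are hypothetical, and both functions give the range expected to hold roughly 95% of values under the stated distribution, not a confidence interval for the mean.

```python
import math
import statistics

def normal_interval(values, z=1.96):
    """mean +/- z*SD on the raw values (assumes a normal distribution)."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    return mu - z * sigma, mu + z * sigma

def lognormal_interval(values, z=1.96):
    """exp(mu +/- z*sigma), with mu and sigma taken from the logs of the values."""
    logs = [math.log(v) for v in values]
    mu = statistics.mean(logs)
    sigma = statistics.stdev(logs)
    return math.exp(mu - z * sigma), math.exp(mu + z * sigma)

# Hypothetical positive, right-skewed readings (not the thread's data).
readings = [0.4, 0.9, 1.0, 1.1, 1.3, 1.5, 2.9, 6.7]

print("normal:    ", normal_interval(readings))     # lower bound is negative
print("log-normal:", lognormal_interval(readings))  # both bounds are positive
```

    Because the exponential is always positive, the back-transformed interval can never include negative values, at the cost of being asymmetric around the mean.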

  10. #9  
    Forum Professor Zwolver's Avatar
    Join Date
    May 2006
    Location
    Netherlands
    Posts
    1,667
    The problem there is I don't know how to work out, for a single value in a group of numbers, whether it deviates significantly under such a distribution. How do I do that?

  11. #10  
    Forum Radioactive Isotope MagiMaster's Avatar
    Join Date
    Jul 2006
    Posts
    3,440
    If your distribution is continuous, you can't, directly. P(x = k) = 0 if the set x is drawn from isn't countable (at least, I think I got that right). That is, the probability of getting one specific number out of a continuum is 0. So instead, you have to rephrase that as P(x >= k) or P(x <= k). That is, you can ask for the probability of getting at least (or at most) a specific number. For P(x >= k) you need the integral of the probability density function from k to infinity (or something like that, depending on the distribution). The integral of the probability density function from minus infinity up to k is called the cumulative distribution function, and you can find it already worked out on the Wiki page for most distributions. (It specifically answers P(x <= k), so you might have to rearrange things a bit if you need more than that, e.g. P(x >= k) = 1 - P(x <= k).)

    Edit: The above assumes you already have a fully specified distribution. If you're trying to work out what the distribution is, or what the parameters of the distribution are, things get more complicated.
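    As a sketch of how the cumulative function answers these range questions, the following assumes a normal distribution with a known mean and standard deviation. The mean of 1.0 and SD of 0.1 are hypothetical, and the function names are illustrative.

```python
import math

def normal_cdf(x: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """P(X <= x) for a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2))))

def prob_at_least(k: float, mean: float, sd: float) -> float:
    """P(X >= k) = 1 - P(X <= k) for a continuous distribution."""
    return 1.0 - normal_cdf(k, mean, sd)

def prob_between(a: float, b: float, mean: float, sd: float) -> float:
    """P(a <= X <= b) as a difference of two CDF values."""
    return normal_cdf(b, mean, sd) - normal_cdf(a, mean, sd)

# Hypothetical parameters: mean 1.0, standard deviation 0.1.
print(prob_at_least(1.2, 1.0, 0.1))       # chance of a value 2 SDs or more above the mean
print(prob_between(0.99, 1.01, 1.0, 0.1)) # chance of landing in a small window around 1
```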

  12. #11  
    Forum Professor Zwolver's Avatar
    Join Date
    May 2006
    Location
    Netherlands
    Posts
    1,667
    Okay, I don't exactly understand what you mean by that. However, I do think I get that it's impossible like this (as you tried to tell me).

    So I should test the statement X/Y = 1 with 99% or 95% certainty. How should I calculate this, using the Poisson distribution?

    N=24
    sd=1,2671
    mean=1,4578

    0,9105
    0,3658
    1,5779
    1,0032
    1,2967
    1,0778
    0,9956
    1,1811
    1,1452
    1,2249
    1,4056
    0,8649
    0,8521
    1,0182
    0,8987
    1,3134
    1,5165
    6,6597
    2,9059
    1,2916
    0,6927
    1,4892
    0,3258
    2,9744

    Now i do notice that some of the lower values were very different, and some of the higher values were quite the same, thus giving a relatively high SD.
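    The posted figures can be reproduced and the X/Y = 1 question put as a one-sample t-test. This is a sketch only: the t-test assumes roughly normal data, which these skewed ratios may not satisfy, and the critical value is hard-coded from standard t tables rather than computed. Decimal commas are written as points.

```python
import math
import statistics

# The 24 ratios from the post.
ratios = [
    0.9105, 0.3658, 1.5779, 1.0032, 1.2967, 1.0778, 0.9956, 1.1811,
    1.1452, 1.2249, 1.4056, 0.8649, 0.8521, 1.0182, 0.8987, 1.3134,
    1.5165, 6.6597, 2.9059, 1.2916, 0.6927, 1.4892, 0.3258, 2.9744,
]

n = len(ratios)
mean = statistics.mean(ratios)
sd = statistics.stdev(ratios)  # sample SD (n - 1 in the denominator)

# Points more than 2 sample SDs away from the mean.
outliers = [r for r in ratios if abs(r - mean) > 2 * sd]

# One-sample t statistic for the hypothesis "true mean ratio = 1".
t_stat = (mean - 1.0) / (sd / math.sqrt(n))
T_CRIT_95 = 2.069  # two-sided 95% critical value for 23 degrees of freedom (t table)
reject = abs(t_stat) > T_CRIT_95

print(f"n={n}, mean={mean:.4f}, sd={sd:.4f}")
print(f"outliers beyond 2 SD: {outliers}")
print(f"t={t_stat:.3f}, reject mean=1 at 95%: {reject}")
```

    Under these assumptions the single value 6.6597 dominates the standard deviation, which is one reason a log transform or a rank-based test may be worth considering before drawing conclusions.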

  13. #12  
    Forum Radioactive Isotope MagiMaster's Avatar
    Join Date
    Jul 2006
    Posts
    3,440
    The Poisson distribution isn't continuous, so you can get the probability of a specific point. It's also defined by a single parameter, its mean. Its standard deviation should come out to the square root of that, which is pretty close to what you wrote, so that's promising, but the results should all be integers, so that's not quite right. (I mentioned that the Poisson distribution would be appropriate for something like the number of hits on a Geiger counter (over a fixed time span), which would always be an integer value.)

    If you want to work out stuff for a Poisson distribution, the CDF is $P(X \le k) = e^{-\lambda} \sum_{i=0}^{\lfloor k \rfloor} \frac{\lambda^i}{i!}$, where $\lambda$ is the mean and $k$ is the number of hits. (See the Wiki page for more details.) You'd probably just want to stick it in a spreadsheet instead of trying to solve it directly.

    Edit: What I was saying is that if your distribution is continuous, then P(x = 1) = 0. The chance of getting exactly 1 is vanishingly small. Instead, with a continuous distribution, you have to ask questions about ranges, such as a small region around 1, say 0.99 to 1.01 (or just anything 1 or less).

  14. #13  
    Forum Professor Zwolver's Avatar
    Join Date
    May 2006
    Location
    Netherlands
    Posts
    1,667
    Yeah, the values I have are in becquerel, so they're not integers (they're real values). However, I don't understand your formula. I'm really not good at math. I know what most parts mean, but I have no idea how to calculate with it. Like the k in brackets, or the limit i = 0, or why it is e^-lambda.

  15. #14  
    Forum Radioactive Isotope MagiMaster's Avatar
    Join Date
    Jul 2006
    Posts
    3,440
    It is what it is. There's no point in worrying about why. (You can look up the details on the Wiki page if you really want though.) The brackets around the k are the floor function (largest integer less than or equal to k). There's no limit there, though; it's a sum. It means start at i = 0, go up to floor(k), and add up the terms.
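    The sum described above can be sketched in a few lines. The function names are illustrative, and the mean of 3.0 in the example is hypothetical.

```python
import math

def poisson_pmf(i: int, lam: float) -> float:
    """P(X = i) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam**i / math.factorial(i)

def poisson_cdf(k: float, lam: float) -> float:
    """P(X <= k): add up the terms for i = 0, 1, ..., floor(k)."""
    if k < 0:
        return 0.0
    return sum(poisson_pmf(i, lam) for i in range(math.floor(k) + 1))

# Hypothetical mean of 3 hits per counting interval.
print(poisson_cdf(2, 3.0))    # chance of 2 hits or fewer
print(poisson_cdf(2.7, 3.0))  # same value, because floor(2.7) = 2
```

    The factor e^-lambda is the same in every term, which is why the formula writes it once in front of the sum.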

  16. #15  
    Forum Freshman Anathema's Avatar
    Join Date
    Mar 2014
    Posts
    31
    It's been ages since I've done this, but I think you're over-thinking this, if I understand your goal correctly.

    You have a set of observed data points, x and y values. You have a linear model fit to those data points, with an R-squared value. You did this in Excel.

    So you have a formula that represents a continuous function for your data: you have a theoretical construct. You can plug your x values into your formula (which Excel has so graciously provided) and calculate what the theoretical y value would be. You can then calculate the difference between each observed y and its theoretical y value (the residual), and the variance of those differences, which means you can construct whatever error bound you want and identify which of your data points fall outside it.

    You can also make Excel plot error bars for you, since you already have a linear trend line plotted. Just click on the trend line to select it, then go to the "Chart Tools" menu. Under Layout, in the Analysis section, you should see that the "Error Bars" option is now available to you. It has several canned options, or you can choose "More options" and set it up how you want. It just depends how fancy and detailed you need to be.
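    Outside Excel, the same residual check can be sketched as below. It fits a least-squares line, measures how far each observed y is from the fitted y, and flags points more than k residual standard deviations away. The data are hypothetical (a line with small noise and one deliberately shifted point), and the names are illustrative.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def flag_outliers(xs, ys, k=2.0):
    """Return the (x, y) points whose residual exceeds k residual SDs."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    # Residual SD uses n - 2 because two parameters were fitted.
    s = math.sqrt(sum(r * r for r in residuals) / (len(xs) - 2))
    return [(x, y) for x, y, r in zip(xs, ys, residuals) if abs(r) > k * s]

# Hypothetical data: roughly y = 2x + 1, with the point at x = 5 shifted up.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [3.1, 4.9, 7.0, 9.1, 16.0, 12.9, 15.1, 17.0, 18.9, 21.1]

flagged = flag_outliers(xs, ys)
print(flagged)
```

    Note that a large outlier inflates the residual SD it is judged against, so with few points a threshold of 3 SDs may flag nothing at all.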

  17. #16  
    Forum Professor Zwolver's Avatar
    Join Date
    May 2006
    Location
    Netherlands
    Posts
    1,667
    I have been playing with these numbers, but I was looking for a definitive answer: first whether they were statistically similar, and second whether each point was statistically plausible. However, single points are always statistically plausible. So now it's just a matter of finding a way to compare them all and say whether both isotope concentrations are connected.
