-
We now have two 10k groups. The test group from which the sensitivity and specificity are learned ("the same test group you used to measure the sensitivity/specificity"). And a second 10k (which may or may not be the same as the first, but in your example they are). This second group is having the test applied to them. The important thing is that these are two different things which I think we both agree on.
Yep.
The first group is used to judge and measure the test. Sensitivity and specificity are learned from this group. Whatever you do with the test after this, sensitivity and specificity do not change (unless they were wrong to start).
Yep.
The second group is the one the test is being used on. In this case, we can say that the second 10,000 people in your example are being used as an analogue for the general population, no? The results will depend on the makeup of these 10,000 people. They will not mimic reality (that is, the tests are not perfect).
I used the same group of people again to highlight the fact that even if you use the same group of people you can get weird looking results for the accuracy of a specific outcome (not the test in general).
In my example above the overall accuracy was 95%. I don't doubt that. I've been talking about the accuracy of a positive result of the test. With the numbers above, if you get a positive result then there's only a 50:50 chance that it's accurate, despite the test having an overall accuracy of 95%.
So even if you apply your own test to the same test group you used to measure the sensitivity and specificity of the test you find that a positive result is not as accurate as you expected.
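To make that arithmetic concrete, here's a rough Python sketch. The inputs are assumptions chosen to reproduce the figures quoted here (95% overall, 50:50 for a positive): 10,000 people, 5% prevalence, and a test with 95% sensitivity and 95% specificity. `confusion_counts` is just an illustrative helper name, not anything standard.

```python
def confusion_counts(population, prevalence, sensitivity, specificity):
    """Return (TP, FN, FP, TN) as whole-person counts."""
    with_condition = round(population * prevalence)
    without_condition = population - with_condition
    tp = round(with_condition * sensitivity)      # correctly flagged positive
    fn = with_condition - tp                      # missed cases
    tn = round(without_condition * specificity)   # correctly cleared
    fp = without_condition - tn                   # false alarms
    return tp, fn, fp, tn


# Assumed inputs: 10,000 people, 5% prevalence, 95% sensitivity/specificity.
tp, fn, fp, tn = confusion_counts(10_000, 0.05, 0.95, 0.95)

overall_accuracy = (tp + tn) / (tp + fn + fp + tn)  # all correct results
ppv = tp / (tp + fp)  # chance a positive result is right
npv = tn / (tn + fn)  # chance a negative result is right

print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
print(f"overall accuracy: {overall_accuracy:.1%}")
print(f"accuracy of a positive result: {ppv:.1%}")
print(f"accuracy of a negative result: {npv:.1%}")
```

Under those assumptions there are as many false positives as true positives (475 each), which is where the 50:50 comes from, while a negative result is right well over 99% of the time.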
The tests are not 100% accurate; this was never a point of confusion.
Yes, but they're not 95% accurate for people who get a positive result.
So again, the sensitivity and specificity, which are learned from the first group, are what I thought at the very beginning you were claiming shift,
No, I was calculating the accuracy of the test per outcome and showing how it differs massively for a negative and a positive result.
depending on prevalence in the population (this is what I quoted in my reply to Chalfie, and if you remember, the first thing I said to you after you replied was: "Ah, sorry. So the raw number of false positives/false negatives will shift depending on how many true negatives/true positives there are. Okay - I mistook your "-ve" and "+ve" to be analogues for specificity/sensitivity.")
The point is that the 95% figure only applies to you if you already know your true status (in which case the test is pointless).
When you get your test result the only thing you know is your result, so you have to look at the estimated/calculated accuracy for that individual result, and that is where it can be skewed well away from the expected 95%.
(I know you're not a moron. I'm probably using the wrong words/terms all over the place, apologies if I am.)
-
When you get your test result the only thing you know is your result, so you have to look at the estimated/calculated accuracy for that individual result, and that is where it can be skewed well away from the expected 95%.
Absolutely. I never disagreed with this and I think we've just been talking at each other because we've picked up on particular things which stood out as odd to us (or maybe I just did this). It's hard to talk these things through on the internet when you're generally doing something else. Much better suited to a pub.
We now have two 10k groups. The test group from which the sensitivity and specificity are learned ("the same test group you used to measure the sensitivity/specificity"). And a second 10k (which may or may not be the same as the first, but in your example they are). This second group is having the test applied to them. The important thing is that these are two different things which I think we both agree on.
The first group is used to judge and measure the test. Sensitivity and specificity are learned from this group. Whatever you do with the test after this, sensitivity and specificity do not change (unless they were wrong to start).
The second group is the one the test is being used on. In this case, we can say that the second 10,000 people in your example are being used as an analogue for the general population, no? The results will depend on the makeup of these 10,000 people. They will not mimic reality (that is, the tests are not perfect).
The tests are not 100% accurate; this was never a point of confusion.
So again, the sensitivity and specificity, which are learned from the first group, are what I thought at the very beginning you were claiming shift depending on prevalence in the population (this is what I quoted in my reply to Chalfie, and if you remember, the first thing I said to you after you replied was: "Ah, sorry. So the raw number of false positives/false negatives will shift depending on how many true negatives/true positives there are. Okay - I mistook your "-ve" and "+ve" to be analogues for specificity/sensitivity.")
The results in the second group (be it a sample or the general population) are dependent on the makeup of that population and the sensitivity/specificity of the test. I agree with this and always have.
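As a sketch of that dependence: hold the sensitivity and specificity fixed and vary only the makeup of the second group. The 95%/95% test and the prevalence values below are illustrative assumptions, not figures from the thread, and `positive_predictive_value` is a made-up helper name.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Fraction of positive results that are true positives (Bayes' rule)."""
    true_pos = sensitivity * prevalence                # has it, tests positive
    false_pos = (1 - specificity) * (1 - prevalence)   # doesn't, tests positive
    return true_pos / (true_pos + false_pos)


# Same test every time (assumed 95% sensitivity, 95% specificity);
# only the share of the group that has the condition changes.
for prevalence in (0.01, 0.05, 0.20, 0.50):
    ppv = positive_predictive_value(prevalence, 0.95, 0.95)
    print(f"prevalence {prevalence:.0%}: positive result correct {ppv:.1%} of the time")
```

The test's own numbers never move, but the share of positive results that are correct climbs as the condition becomes more common in the group being tested.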
If I'm still misunderstanding you we can either take this off public chat or you can rest assured that you tried and I'm a moron.