Any maths buffs?

  • So, forget the dice.

    You're interested in two outcomes: A and B (and there are another 9 outcomes that we're not interested in).

    One of them has a probability of 1/36 and the other 1/18.

    How many outcomes do you need to observe to work out which is which?

    Obviously observing just one outcome is not enough, nor is two. Observing an infinite number will be enough. So the answer is somewhere between those two.

    My point is that the question needs to be bounded with a confidence interval, i.e.

    How many outcomes do you need to observe to determine which is which at a confidence level of at least 50%/75%/90%/95%?

  • (hangs head in shame)

  • FWIW, with a Monte Carlo simulation (each experiment is run 10,000 times) I get:

    (My test is simply to see whether the 1/18 outcome occurs more times than the 1/36 outcome.)

    50% confidence: ~19 rolls
    75% confidence: ~80 rolls
    90% confidence: ~200 rolls
    95% confidence: ~310 rolls
    98% confidence: ~475 rolls
    99% confidence: ~600 rolls

    With 1000 rolls there are still 0.1% (about 10 times) where the 1/36 outcome occurs more times than the 1/18 outcome. (I've not bothered to check how good the PRNG I'm using is or determine if I've exhausted the entropy pool.)
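
A rough sketch of the kind of simulation described above, assuming the test is a strict "more times than" comparison (ties count as a failure). The function names and the seeded generator are illustrative choices, not the poster's actual code.

```python
import random

P_RARE = 1 / 36    # the outcome with probability 1/36
P_COMMON = 1 / 18  # the outcome with probability 1/18


def common_beats_rare(n_rolls, rng):
    """Simulate n_rolls observations; True if the 1/18 outcome occurred strictly more often."""
    rare = common = 0
    for _ in range(n_rolls):
        r = rng.random()
        if r < P_RARE:
            rare += 1
        elif r < P_RARE + P_COMMON:
            common += 1
        # anything else is one of the outcomes we are not interested in
    return common > rare


def estimate_confidence(n_rolls, trials=10_000, seed=0):
    """Fraction of simulated experiments in which the 1/18 outcome comes out ahead."""
    rng = random.Random(seed)
    wins = sum(common_beats_rare(n_rolls, rng) for _ in range(trials))
    return wins / trials
```

Calling `estimate_confidence(310)` should land in the same region as the 95% row above, give or take simulation noise.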

  • The way I calculated it was to take a sequence of rolls and calculate the likelihood of that series of rolls given that 11 and 12 are equally likely. If it was less than 0.05 then I would end the sequence there and count the number of rolls. Do that a bunch of times and then take the average. That gave me an answer of 28 rolls with value 11 or 12, which means about 336 rolls in total (an 11 or 12 comes up on 1 roll in 12).

    But I guess you can take any confidence value you want; the calculation will presumably be the same.

    @GreatSince78 this was before probability so Leibniz couldn't do that calculation. He thought that they were equally likely based on intuition. The question is, under the assumption that they're equally likely, how long would it take before you've proven empirically that they're not?
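
One possible reading of the stopping rule described above, sketched in code. It assumes "likelihood" means an exact two-sided binomial p-value under the 50/50 hypothesis, re-checked after every roll that shows 11 or 12; the original calculation may well have differed in detail. All names here are made up for illustration.

```python
import math
import random


def two_sided_p_value(elevens, twelves):
    """Exact two-sided binomial p-value for a 50/50 split between 11s and 12s."""
    m = elevens + twelves
    hi = max(elevens, twelves)
    tail = sum(math.comb(m, k) for k in range(hi, m + 1)) / 2 ** m
    return min(1.0, 2.0 * tail)


def hits_until_rejected(rng, alpha=0.05, max_hits=10_000):
    """Count 11-or-12 results until 'equally likely' is rejected at level alpha."""
    elevens = twelves = 0
    while elevens + twelves < max_hits:
        # Given that a roll is an 11 or a 12, it is an 11 with probability 2/3.
        if rng.random() < 2 / 3:
            elevens += 1
        else:
            twelves += 1
        if two_sided_p_value(elevens, twelves) < alpha:
            break
    return elevens + twelves


def average_stopping_count(runs=2_000, seed=0):
    """Average number of 11-or-12 results needed over many simulated sequences."""
    rng = random.Random(seed)
    return sum(hits_until_rejected(rng) for _ in range(runs)) / runs
```

Multiplying the result of `average_stopping_count()` by 12 converts it to total 2d6 rolls, since an 11 or 12 turns up on 3 of the 36 combinations.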

  • So I can go from n = number of rolls to a confidence interval by way of simulation, but (to answer the original question), I've got no idea how to go from a confidence interval to a number of rolls by calculation.

    Never enjoyed stats anyway.
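
For going the other way (confidence level to number of rolls) without simulating, one hedged option is a normal approximation to the difference between the two counts. Per roll that difference has mean 1/18 - 1/36 and variance 1/18 + 1/36 - (1/18 - 1/36)^2, so the required n works out to about 107 z^2, where z is the normal quantile for the chosen confidence. This ignores ties and the discreteness of the counts, so it is only a sketch and is poor at low confidence levels.

```python
import math
from statistics import NormalDist

P_COMMON = 1 / 18
P_RARE = 1 / 36


def rolls_needed(confidence):
    """Approximate rolls needed for the 1/18 outcome to lead with the given probability."""
    z = NormalDist().inv_cdf(confidence)
    mean_per_roll = P_COMMON - P_RARE
    var_per_roll = P_COMMON + P_RARE - mean_per_roll ** 2
    # Solve z = sqrt(n) * mean / sqrt(var) for n.
    return math.ceil(z * z * var_per_roll / mean_per_roll ** 2)
```

`rolls_needed(0.95)` comes out at about 290, the same ballpark as the simulated figure.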

  • It'll need Bayes' theorem and probably some kind of amalgamation formula.

    I hated stats at school and university. Loved pure and mechanics.

  • Pretty reassuring that my method and yours both produce an answer of about ~300 for 95% confidence!

    I think for an analytical answer it's going to be some sort of Bayesian thing; a posterior probability distribution that allows you to calculate a confidence interval and a uniform prior distribution?

  • You are correct, but Leibniz, despite being a mathematical genius, got it wrong and assumed 6,5 was the same outcome as 5,6 and therefore equally likely to get as 6,6.

    In his defence probability was a new field at the time, and we only understand it better with the benefit of centuries of history on our side.

    The question asks

    "how many 2d6 rolls is Liebniz expected to need before he can say they're not [equally likely]?"

    But it doesn't say what degree of certainty you'd need to have in order to say that.

    Consider the question

    how many throws of a die would you need before you could say that it was a loaded die?

    This is a similar kind of question where you compare your hypothesis with the observations.

    So you'd need some way of comparing your observed results against the expected results for that many throws.

    If, after n throws, our outcome is x instances of 11 and y instances of 12, does that fall within the realms of chance, or do we need to revise our model?

    edit: This is a slow response to a comment far upthread...

  • Yes, an extremely clear statement of the problem.

    It occurs to me now that the question is a bit nonsensical; ironically Leibniz wouldn't have been aware of confidence intervals, certainty, etc., so wouldn't actually have been able to prove anything empirically anyway! But it's still an interesting question.

  • Dices.

    Grammar buffs thread >>>>>>>>>>>>>>

    Diocese.

    Mosque thread >>>>>>>>>>>>>>

  • Wondering if anyone here might have some thoughts on how to look at a problem I have...

    I'm trying to understand how I can deduplicate reach across channels. For example, I'm targeting an audience in the UK.

    On FB, there are 12m of them and I've reached 50% (or 6m people)
    On LinkedIn, there are 4m of them and I've reached 75% (or 3m people)
    On Twitter, there are 6m of them and I've reached 40% (or 2.4m people)

    Is there a way of calculating the probability of the maximum reach I can achieve when deduplicating the data sets, ie, not knowing the crossover in data between them, how many total people I may have reached? It won't be cumulative (6+3+2.4) and it won't be 100% duplication (6m people) but somewhere in the middle.

    I think, for two channels, I can calculate as follows:

    P(FB)=6/12
    P(LinkedIn)=3/12 - Using largest audience size as a base...

    P(FB)xP(LinkedIn) = 6/12x3/12 = 0.125

    Total possible reach is therefore 6+3 = 9m
    with de-duped reach being 9/12 - 0.125 = 62.5%, i.e. 12m x 62.5% = 7.5m people

    Problem is, I have no idea how to scale this to multiple channels.
    @Sam_w - I guess you probably deal with this a lot...

  • Not sure you can do any better than a min/max range on this without knowing at least something about the crossover in accounts, e.g. the 12m Facebook users could be completely distinct from the 4m LinkedIn users, or one group could be a subset of the others.

    Worst case scenario is that all your reached people are the same i.e. the 3m LI and 2.4m Twitter are just subsets of the 6m FB and all unreached users are distinct: you've reached 6/(12+1+3.6) = 36.1%. Best case scenario: the reverse, all your reached people are distinct and the overlap in accounts is 100%, i.e. (6+3+2.4)/12 = 95%.

    Other than that you have to make assumptions - in your calculation it looks like you've assumed all LinkedIn users have a Facebook account and your probability of reaching them on one platform is independent of whether you reached them on the other.
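
The worst-case and best-case bounds above generalise to any number of channels. A minimal sketch, assuming each channel is given as an (audience, reached) pair in millions; `reach_bounds` is just an illustrative name.

```python
def reach_bounds(channels):
    """Return (worst, best) de-duplicated reach as fractions of the combined audience.

    channels: list of (audience, reached) pairs, both in millions.
    """
    reached = [r for _, r in channels]
    unreached = [a - r for a, r in channels]
    # Worst case: every reached person sits inside the largest reached group,
    # and all the unreached people are distinct from one another.
    worst = max(reached) / (max(reached) + sum(unreached))
    # Best case: no reached person is counted twice, and every account
    # belongs to someone in the largest audience.
    best = min(1.0, sum(reached) / max(a for a, _ in channels))
    return worst, best


# FB, LinkedIn, Twitter figures from the thread
worst, best = reach_bounds([(12, 6), (4, 3), (6, 2.4)])
```

With the thread's figures this gives 6/16.6 (about 36.1%) for the worst case and 11.4/12 (95%) for the best case.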

  • Thanks.

    As you say, a range is the only way of providing 'certainty' but I think assumptions can be made as I have done in my previous post. It's safe to say that (most) adults in the UK have a FB account and so this total available audience can be used as a proxy for total addressable audience size. I guess this would be better if we were running TV as who doesn't have one of those but we're not...

    If we do make that assumption that everyone who has a Twitter & LinkedIn account (and if we add more channels like newspapers, magazines etc etc) has a FB account then can I narrow down the range for more than two channels?

  • Worst case is easy. Find the largest single population % * effectiveness and assume that it dominates all the others.

    Best case is similarly easy. Assume no overlap and just keep summing the population * effectiveness until you reach the total population.

    For the tough bit I made a guess, and just messing about in Excel (attached) quickly...

    I've assumed that the users are randomly spread (which is unlikely).

    Starting out
    universe population: 30,000,000
    effectiveness of no advertising: 0%

    so you have 100% of the population untouched

    next step
    facebook: population 12m (40%)
    effectiveness: 50%
    so you'd expect someone at random to have a 20% chance of being hit by facebook
    so given you have the entire population left, your chance of them being hit by nothing or facebook is 0% + 20%

    next, Linkedin
    pop: 4m (13.33%)
    effectiveness: 75%
    random hit by linkedin: 10%
    pop hit by linkedin after being missed by the prev lot. (100-20%) * 10% = 8%
    total hit population 28%

    Twitter
    pop: 6m (20%)
    effectiveness: 40%
    random hit by twitter: 8%
    pop hit by twitter after being missed by the prev lot. (100-28%) * 8% = 5.76%
    total hit population 33.76%
    etc.
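
The step-by-step working above can be sketched as a short loop, under the same assumption that users are randomly spread (so being hit on one platform is independent of being hit on another). The 30m universe and the function name are taken from, or invented for, this example only.

```python
def cumulative_hit(universe, channels):
    """Running share of the universe hit at least once, channel by channel.

    channels: list of (name, platform_population, effectiveness) tuples.
    """
    total = 0.0
    steps = []
    for name, population, effectiveness in channels:
        # Chance that a random person in the universe is hit by this channel.
        hit = population / universe * effectiveness
        # Only people missed by all previous channels add to the total.
        total += (1.0 - total) * hit
        steps.append((name, total))
    return steps


steps = cumulative_hit(
    30_000_000,
    [
        ("facebook", 12_000_000, 0.50),
        ("linkedin", 4_000_000, 0.75),
        ("twitter", 6_000_000, 0.40),
    ],
)
```

The running totals in `steps` match the 20%, 28% and 33.76% worked out by hand, and further channels can simply be appended to the list.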



  • That's super useful - thanks.

    Question then, if I know UK addressable audience for FB (ie total number of adults with accounts) and I know the number of adults in the UK, I can use that as the effectiveness?

    I'll still be using the audience population and platform population for the specific audience (ie 18-50 yr old men who like football - not my actual audience but you get the point).

    For example (made up data):

    1. I know there are 25m adult men in the UK
    2. I know that there are 20m adult men with FB accounts (80% effectiveness of FB)
    3. I know there are 15m people in the UK who match m+football criteria (my universe in your sheet)
    4. I know I can find 13m people on FB who match m+football criteria (my platform population)
  • I don't think so. But I'm also not in the industry so might be making up terms

    I've assumed you have 30m adults in the UK

    So FB has 12m adults
    I.e. its reach is 40% of the UK population. maybe a better term is saturation?

    then the 'effectiveness' % I guessed was that, given someone is on facebook, the chance they saw the ad was 50%

    i.e. hit = reach * effectiveness

  • If we do make that assumption that everyone who has a Twitter & LinkedIn account (and if we add more channels like newspapers, magazines etc etc) has a FB account then can I narrow down the range for more than two channels?

    As in your post if you assume that probability of hitting a person on one platform is independent of whether you hit them on another platform, it's just like a binomial distribution with multiple p values. I'm out walking so I don't have a pen or anything but I think in your example you'd have P(X ≥ 1) = 1 - P(X = 0) = 1 - (6/12) × (9/12) × (9.6/12) = 70% where X is the number of platforms you reach someone on.
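
A "binomial with multiple p values" is usually called a Poisson binomial distribution. A small sketch that builds the full distribution of X (the number of platforms a person is reached on), assuming independence between platforms and using the 12m base from the post; the function name is illustrative.

```python
def platform_count_distribution(probabilities):
    """Return [P(X=0), P(X=1), ...] for independent events with the given probabilities."""
    dist = [1.0]  # before any platform is considered, X = 0 with certainty
    for p in probabilities:
        updated = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            updated[k] += mass * (1.0 - p)  # missed on this platform
            updated[k + 1] += mass * p      # reached on this platform
        dist = updated
    return dist


# Reach probabilities on a 12m base: FB 6/12, LinkedIn 3/12, Twitter 2.4/12
dist = platform_count_distribution([6 / 12, 3 / 12, 2.4 / 12])
reached_at_least_once = 1.0 - dist[0]
```

Here `dist[0]` is the 30% missed everywhere, so `reached_at_least_once` is the 70% quoted above; the other entries give the share reached on exactly one, two or three platforms.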

  • Do your planning tools (assuming big agency) not have anything about overlap of reach of your targeted audience in each platform?

  • Not for the super niche audiences we use - they work with TV buying audiences etc but that's about it. When you're targeting fund managers or HR in enterprise companies, the planning tools shit the bed.

    They're all designed for B2C audiences.

  • A friend sent me this and I’m wondering if anyone here can help. I don’t think it’s possible but would appreciate confirmation!

    If x = (1 - v^n)/ä

    where:

    v = 1/(1 + i)

    and

    ä = ln(1 + i)

    Is there a way to write a formula for i in terms of n and x?

    Edit: just can’t see a way to free the i from the ln without locking the other up with an e…

  • I miss algebra

    It's crazy how quickly it evaporated from my rotting mind.

  • I've been banging my head against this for a while. I don't think there is, as i, n and x are all dependent. Wolfram Alpha helped a bit:

    https://www.wolframalpha.com/input?i=x+%3D+%281+-+vn%29%2Fa%2C+v%3D1%2F%281%2Bi%29%2C+a%3Dln%281%2Bi%29+&assumption=%22i%22+-%3E+%22Variable%22

  • Cheers - that is useful. Should have thought of wolfram alpha.

    I’m so used to doing stuff that I know has a solution it felt weird not knowing for sure if I was on the right track. Thankfully, I agree with WA's real solution so I’m happy with that!

  • Is ä the second derivative, and if so, the second derivative with respect to what?

  • Recasting it slightly (letting 1+i = y) shows that it can be rearranged, but only if you use a numerical-only function (the ProductLog function).
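
If a closed form is not essential, i can be found numerically. The sketch below assumes the "vn" in the question is v to the power n (the superscript looks to have been lost), takes ä = ln(1 + i) as given, and restricts itself to 0 < x < n so that the root is a positive i. On that reading x falls steadily from n towards 0 as i grows, so plain bisection is enough. Function names are made up for illustration.

```python
import math


def annuity_value(i, n):
    """x = (1 - v**n) / ln(1 + i) with v = 1 / (1 + i)."""
    return (1.0 - (1.0 + i) ** (-n)) / math.log1p(i)


def solve_rate(x, n, tol=1e-12):
    """Find i > 0 with annuity_value(i, n) == x by bisection (requires 0 < x < n)."""
    if not 0.0 < x < n:
        raise ValueError("this sketch only handles 0 < x < n")
    lo, hi = 1e-9, 1.0
    # Grow the upper bound until the value there drops below the target.
    while annuity_value(hi, n) > x:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if annuity_value(mid, n) > x:
            lo = mid  # value still too high, so the rate must be larger
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

A quick round trip such as `solve_rate(annuity_value(0.05, 10), 10)` should hand back roughly 0.05.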


Posted by Avatar for deleted @deleted
