Any maths buffs?

Posted on Sun 3rd, January 2010


• #351

Greenbank in reply to @GreatSince78

So, forget the dice.

You're interested in two outcomes: A and B (and there are another 9 outcomes that we're not interested in).

One of them has a probability of 1/36 and the other 1/18.

How many outcomes do you need to observe to work out which is which?

Obviously observing just one outcome is not enough, nor is two. Observing an infinite number will be enough. So the answer is somewhere between those two.

My point is that the question needs to be bounded with a confidence interval, i.e.

How many outcomes do you need to observe to determine which is which at a confidence level of at least 50%/75%/90%/95%?
• #352

GreatSince78 in reply to @Brun

(hangs head in shame)
• #353

Greenbank

FWIW, with a Monte Carlo simulation (each experiment is run 10,000 times) I get:-

(My test is simply to see whether the 1/18 outcome occurs more times than the 1/36 outcome.)

50% confidence: ~19 rolls
75% confidence: ~80 rolls
90% confidence: ~200 rolls
95% confidence: ~310 rolls
98% confidence: ~475 rolls
99% confidence: ~600 rolls

With 1000 rolls there are still 0.1% (about 10 times) where the 1/36 outcome occurs more times than the 1/18 outcome. (I've not bothered to check how good the PRNG I'm using is or determine if I've exhausted the entropy pool.)
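
A minimal sketch of the kind of simulation described above, in Python. The function names, the seed and the use of Python's `random` module are illustrative assumptions, not the original poster's code; the test is the one stated in the post (does the 1/18 outcome occur strictly more often than the 1/36 outcome?).

```python
import random

P_RARE = 1 / 36    # e.g. rolling 12 on 2d6
P_COMMON = 1 / 18  # e.g. rolling 11 on 2d6


def common_wins(n_rolls, rng):
    """Run one experiment of n_rolls; True if the 1/18 outcome beats the 1/36 one."""
    rare = common = 0
    for _ in range(n_rolls):
        r = rng.random()
        if r < P_RARE:
            rare += 1
        elif r < P_RARE + P_COMMON:
            common += 1
        # every other value of r is one of the outcomes we ignore
    return common > rare


def confidence(n_rolls, experiments=10_000, seed=0):
    """Fraction of experiments in which the more likely outcome came out ahead."""
    rng = random.Random(seed)
    wins = sum(common_wins(n_rolls, rng) for _ in range(experiments))
    return wins / experiments


if __name__ == "__main__":
    for n in (19, 80, 200, 310):
        print(n, confidence(n, experiments=2000))
```

Ties count as failures here because the comparison is strict, which is one reading of "occurs more times".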
• #354

frankenbike in reply to @Greenbank

The way I calculated it was to take a sequence of rolls and calculate the likelihood of that series of rolls given that 11 and 12 are equally likely. If it was less than 0.05 then I would end the sequence there and count the number of rolls. Do that a bunch of times and then take the average. That gave me an answer of 28 rolls with value 11/12 which means about 336 rolls total.

But I guess you can take any confidence value you want; the calculation will presumably be the same.

@GreatSince78 this was before probability so Leibniz couldn't do that calculation. He thought that they were equally likely based on intuition. The question is, under the assumption that they're equally likely, how long would it take before you've proven empirically that they're not?
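
One possible reading of the stopping rule described above, sketched in Python. The post doesn't say exactly which likelihood was computed, so the exact two-sided binomial p-value used here is an assumption, as are all the function names. Only rolls that come up 11 or 12 are simulated; each is an 11 with probability 2/3.

```python
import math
import random


def two_sided_p(k, n):
    """Exact two-sided p-value for k elevens in n rolls if 11 and 12 were equally likely."""
    extreme = max(k, n - k)
    tail = sum(math.comb(n, j) for j in range(extreme, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def rolls_until_rejected(rng, alpha=0.05, max_n=5000):
    """Count 11/12-valued rolls until the 'equally likely' hypothesis drops below alpha."""
    elevens = 0
    for n in range(1, max_n + 1):
        if rng.random() < 2 / 3:  # true chance that an 11-or-12 roll is an 11
            elevens += 1
        if two_sided_p(elevens, n) < alpha:
            return n
    return max_n  # safety cap, not expected to be hit in practice


def average_stopping_point(trials=2000, seed=1):
    """Average number of 11/12-valued rolls needed across many sequences."""
    rng = random.Random(seed)
    return sum(rolls_until_rejected(rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    relevant = average_stopping_point()
    # 11 or 12 turns up in 3/36 = 1/12 of all 2d6 rolls
    print(relevant, relevant * 12)
```

Checking after every roll and stopping at the first rejection is a sequential test, so the average it gives isn't directly comparable to a fixed-sample confidence figure.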
• #355

Greenbank

So I can go from n = number of rolls to a confidence interval by way of simulation, but (to answer the original question), I've got no idea how to go from a confidence interval to a number of rolls by calculation.

Never enjoyed stats anyway.
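
For what it's worth, the probability behind the "1/18 outcome occurs more times" test can be computed exactly rather than simulated, and then searched for the smallest number of rolls that reaches a target. A rough sketch in Python, with made-up function names: it tracks the distribution of the difference (count of the 1/18 outcome minus count of the 1/36 outcome) one roll at a time.

```python
P_RARE = 1 / 36
P_COMMON = 1 / 18
P_OTHER = 1 - P_RARE - P_COMMON


def step(dist):
    """Advance the distribution of (common count - rare count) by one roll."""
    nxt = {}
    for d, p in dist.items():
        nxt[d + 1] = nxt.get(d + 1, 0.0) + p * P_COMMON
        nxt[d - 1] = nxt.get(d - 1, 0.0) + p * P_RARE
        nxt[d] = nxt.get(d, 0.0) + p * P_OTHER
    return nxt


def prob_common_ahead(n_rolls):
    """Exact probability that the 1/18 outcome occurs strictly more often after n_rolls."""
    dist = {0: 1.0}
    for _ in range(n_rolls):
        dist = step(dist)
    return sum(p for d, p in dist.items() if d > 0)


def rolls_needed(target, max_rolls=2000):
    """Smallest n at which that probability first reaches target, or None if never."""
    dist = {0: 1.0}
    for n in range(1, max_rolls + 1):
        dist = step(dist)
        if sum(p for d, p in dist.items() if d > 0) >= target:
            return n
    return None


if __name__ == "__main__":
    for target in (0.5, 0.75, 0.9, 0.95):
        print(target, rolls_needed(target))
```

This only answers the question for that particular test; a different decision rule would give different numbers.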
• #356

mashton

It'll need Bayes theorem and probably some kind of amalgamation formula.

I hated stats at school and university. Loved pure and mechanics.
• #357

frankenbike in reply to @Greenbank

Pretty reassuring that my method and yours both produce an answer of about 300 for 95% confidence!

I think for an analytical answer it's going to be some sort of Bayesian thing; a posterior probability distribution that allows you to calculate a confidence interval and a uniform prior distribution?
• #358

Drakien in reply to @GreatSince78

You are correct, but Leibniz, despite being a mathematical genius, got it wrong and assumed 6,5 was the same outcome as 5,6 and therefore equally likely to get as 6,6.

In his defence probability was a new field at the time, and we only understand it better with the benefit of centuries of history on our side.

The question asks

"how many 2d6 rolls is Leibniz expected to need before he can say they're not [equally likely]?"

But it doesn't say what degree of certainty you'd need to have in order to say that.

Consider the question

how many throws of a die would you need before you could say that it was a loaded die?

This is a similar kind of question where you compare your hypothesis with the observations.

So you'd need some way of comparing your observed results against the expected results for that many throws.

If, after n throws, our outcome is x instances of 11 and y instances of 12, does that fall within the realms of chance, or do we need to revise our model?

edit: This is a slow response to a comment far upthread...
• #359

frankenbike in reply to @Drakien

Yes, an extremely clear statement of the problem.

It occurs to me now that the question is a bit nonsensical; ironically Leibniz wouldn't have been aware of confidence intervals, certainty, etc., so wouldn't actually have been able to prove anything empirically anyway! But it's still an interesting question.
• #360

Oliver Schick in reply to @Brun

Dices.

Grammar buffs thread >>>>>>>>>>>>>>

Diocese.

Mosque thread >>>>>>>>>>>>>>
• #361

Soul

Wondering if anyone here might have some thoughts on how to look at a problem I have...

I'm trying to understand how I can deduplicate reach across channels. For example, I'm targeting an audience in the UK.

On FB, there are 12m of them and I've reached 50% (or 6m people)
On LinkedIn, there are 4m of them and I've reached 75% (or 3m people)
On Twitter, there are 6m of them and I've reached 40% (or 2.4m people)

Is there a way of calculating the probability of the maximum reach I can achieve when deduplicating the data sets, ie, not knowing the crossover in data between them, how many total people I may have reached? It won't be cumulative (6+3+2.4) and it won't be 100% duplication (6m people) but somewhere in the middle.

I think, for two channels, I can calculate as follows:

P(FB)=6/12
P(LinkedIn)=3/12 - Using largest audience size as a base...

P(FB)xP(LinkedIn) = 6/12x3/12 = 0.125

Total possible reach is therefore 6+3=9
With de-duped reach being 9/12-0.125 = 62.5% or 12x62.5% or 7.5m people

Problem is, I have no idea how to scale this to multiple channels.
@Sam_w - I guess you probably deal with this a lot...
• #362

frankenbike in reply to @Soul

Not sure you can do any better than a min/max range on this without knowing at least something about the crossover in accounts. e.g. the 12m Facebook users could be completely distinct from the 4m LinkedIn or one group could be a subset of the others.

Worst case scenario is that all your reached people are the same i.e. the 3m LI and 2.4m Twitter are just subsets of the 6m FB and all unreached users are distinct: you've reached 6/(12+1+3.6) = 36.1%. Best case scenario: the reverse, all your reached people are distinct and the overlap in accounts is 100%, i.e. (6+3+2.4)/12 = 95%.

Other than that you have to make assumptions - in your calculation it looks like you've assumed all LinkedIn users have a Facebook account and your probability of reaching them on one platform is independent of whether you reached them on the other.
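
The two extremes above can be written as a small helper for any number of channels. A sketch in Python: `reach_bounds` is a hypothetical name, and each channel is an (audience, reached) pair in the same units (millions here).

```python
def reach_bounds(channels):
    """Return (worst, best) reach fractions for a list of (audience, reached) pairs."""
    audiences = [a for a, _ in channels]
    reached = [r for _, r in channels]
    unreached = [a - r for a, r in channels]

    # Worst case: everyone reached is inside the largest reached group,
    # and the unreached people on every channel are all different people.
    worst = max(reached) / (max(reached) + sum(unreached))

    # Best case: accounts overlap completely (the population is the largest
    # audience) and the reached people on each channel are all different.
    best = min(sum(reached), max(audiences)) / max(audiences)
    return worst, best


if __name__ == "__main__":
    # FB, LinkedIn, Twitter figures from the thread, in millions
    print(reach_bounds([(12, 6), (4, 3), (6, 2.4)]))
```

The worst-case denominator is the same sum as 12 + 1 + 3.6 in the post: 6m reached plus 6m, 1m and 3.6m unreached.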
• #363

Soul in reply to @frankenbike

Thanks.

As you say, a range is the only way of providing 'certainty' but I think assumptions can be made as I have done in my previous post. It's safe to say that (most) adults in the UK have a FB account and so this total available audience can be used as a proxy for total addressable audience size. I guess this would be better if we were running TV as who doesn't have one of those but we're not...

If we do make that assumption that everyone who has a Twitter & LinkedIn account (and if we add more channels like newspapers, magazines etc etc) has a FB account then can I narrow down the range for more than two channels?
• #364

duncs in reply to @Soul
Worst case is easy. Find the largest single population % * effectiveness and assume that dominates all the others

best case is similarly easy. Assume no overlap and just keep summing the population*effectiveness until you reach the total population.

For the tough bit I made a guess, and just messing about in Excel (attached) quickly...

I've assumed that the users are randomly spread (which is unlikely).

Starting out
universe population: 30,000,000
effectiveness of no advertising: 0%

so you have 100% of the population untouched

next step
facebook: population 12m (40%)
effectiveness: 50%
so you'd expect someone at random to have a 20% chance of being hit by facebook
so given you have the entire population left, your chance of them being hit by nothing or facebook is 0% + 20%

next, Linkedin
pop: 4m (13.33%)
effectiveness: 75%
random hit by linkedin: 10%
pop hit by linkedin after being missed by the prev lot. (100-20%) * 10% = 8%
total hit population 28%

Twitter
pop: 6m (20%)
effectiveness: 40%
random hit by twitter: 8%
pop hit by twitter after being missed by the prev lot. (100-28%) * 8% = 5.76%
total hit population 33.76%
etc.

1 Attachment
- ad_reach.xlsx
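
The step-by-step calculation above, written out as a short Python sketch instead of a spreadsheet. It keeps the same assumption that users are randomly spread across platforms; the function name is made up, and the 30m universe is the figure assumed in the post.

```python
def cumulative_hit(universe, channels):
    """Running share of the universe hit after each (population, effectiveness) channel."""
    hit = 0.0
    running = []
    for population, effectiveness in channels:
        p = population / universe * effectiveness  # chance a random person is hit here
        hit += (1 - hit) * p                       # only people missed so far can be new
        running.append(hit)
    return running


if __name__ == "__main__":
    channels = [
        (12_000_000, 0.50),  # Facebook
        (4_000_000, 0.75),   # LinkedIn
        (6_000_000, 0.40),   # Twitter
    ]
    for share in cumulative_hit(30_000_000, channels):
        print(f"{share:.2%}")
```

Adding another channel is just another (population, effectiveness) pair on the end of the list, and the order of the channels doesn't change the final total.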
• #365

Soul in reply to @duncs
That's super useful - thanks.

Question then, if I know UK addressable audience for FB (ie total number of adults with accounts) and I know the number of adults in the UK, I can use that as the effectiveness?

I'll still be using the audience population and platform population for the specific audience (ie 18-50 yr old men who like football - not my actual audience but you get the point).

For example (made up data):
1. I know there are 25m adult men in the UK
2. I know that there are 20m adult men with FB accounts (80% effectiveness of FB)
3. I know there are 15m people in the UK who match m+football criteria (my universe in your sheet)
4. I know I can find 13m people on FB who match m+football criteria (my platform population)
• #366

duncs in reply to @Soul

I don't think so. But I'm also not in the industry so might be making up terms

I've assumed you have 30m adults in the UK

So FB has 12m adults
I.e. its reach is 40% of the UK population. maybe a better term is saturation?

then the 'effectiveness' % I guessed was that, given someone is on facebook, the chance they saw the ad was 50%

i.e. hit = reach * effectiveness
• #367

frankenbike in reply to @Soul

> If we do make that assumption that everyone who has a Twitter & LinkedIn account (and if we add more channels like newspapers, magazines etc etc) has a FB account then can I narrow down the range for more than two channels?

As in your post if you assume that probability of hitting a person on one platform is independent of whether you hit them on another platform, it's just like a binomial distribution with multiple p values. I'm out walking so I don't have a pen or anything but I think in your example you'd have P(X ≥ 1) = 1 - P(X = 0) = 1 - (6/12) × (9/12) × (9.6/12) = 70% where X is the number of platforms you reach someone on.
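
Written out for any number of channels, under the same independence assumption and with everything expressed against the 12m base (the symbols K and p_k are just notation introduced here, with p_k the chance of reaching a given person on channel k):

```latex
\begin{align*}
P(X \ge 1) &= 1 - \prod_{k=1}^{K} (1 - p_k) \\
           &= 1 - \left(1 - \tfrac{6}{12}\right)\left(1 - \tfrac{3}{12}\right)\left(1 - \tfrac{2.4}{12}\right) \\
           &= 1 - 0.5 \times 0.75 \times 0.8 = 0.7
\end{align*}
```

With only two channels this reduces to p_1 + p_2 - p_1 p_2 = 0.5 + 0.25 - 0.125 = 0.625, which is the 62.5% from the earlier two-channel calculation.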
• #368

stelfox

Do your planning tools (assuming big agency) not have anything about overlap of reach of your targeted audience in each platform?
• #369

Soul in reply to @stelfox

Not for the super niche audiences we use - they work with TV buying audiences etc but that's about it. When you're targeting fund managers or HR in enterprise companies, the planning tools shit the bed.

They're all designed for B2C audiences.
• #370

crow

A friend sent me this and I’m wondering if anyone here can help. I don’t think it’s possible but would appreciate confirmation!

If x = (1 - vn)/ä

where:

v = 1/(1 + i)

and

ä = ln(1 + i)

Is there a way to write a formula for i in terms of n and x?

Edit: just can’t see a way to free the i from the ln without locking the other up with an e…
• #371

skinny

I miss algebra

It's crazy how quickly it evaporated from my rotting mind.
• #372

Drakien in reply to @crow

I've been banging my head against this for a while. I don't think there is, as i, n and x are all dependent. Wolfram Alpha helped a bit:

https://www.wolframalpha.com/input?i=x+%3D+%281+-+vn%29%2Fa%2C+v%3D1%2F%281%2Bi%29%2C+a%3Dln%281%2Bi%29+&assumption=%22i%22+-%3E+%22Variable%22
• #373

crow in reply to @Drakien

Cheers - that is useful. Should have thought of wolfram alpha.

I’m so used to doing stuff that I know has a solution it felt weird not knowing for sure if I was on the right track. Thankfully, I agree with WA's real solution so I’m happy with that!
• #374

useless in reply to @crow

Is ä the second derivative, and if so, the second derivative with respect to what?
• #375

hamrack in reply to @crow

Recasting it slightly (letting 1+i = y) shows that it can be rearranged, but only if you use a numerical-only function (the ProductLog function).
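
If a number is all that's needed, i can be found by bisection. A sketch in Python that rests on a loud assumption: it reads "vn" in the question as v to the power n (the usual annuity form), which the thread never confirms, and it only looks for a positive i with n > 0. Function names are made up.

```python
import math


def annuity_value(i, n):
    """x = (1 - v**n) / ln(1 + i) with v = 1 / (1 + i); ASSUMES 'vn' means v**n."""
    delta = math.log1p(i)                     # ln(1 + i)
    return -math.expm1(-n * delta) / delta    # (1 - (1 + i)**-n) / ln(1 + i)


def solve_i(x, n, iterations=200):
    """Find i > 0 with annuity_value(i, n) == x by bisection (needs 0 < x < n)."""
    if not 0 < x < n:
        raise ValueError("a positive i needs 0 < x < n")
    lo, hi = 1e-12, 1.0
    while annuity_value(hi, n) > x:   # x falls as i rises, so push hi up until it brackets
        hi *= 2
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if annuity_value(mid, n) > x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    x = annuity_value(0.05, 10)
    print(x, solve_i(x, 10))
```

Bisection works here because, under that reading, x decreases steadily from n towards 0 as i grows, so there is exactly one positive i for each x between 0 and n.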
