You could say that my journey started when I wanted to know whether forecast predictions could be made on "random" numbers. That quest will probably stay with me throughout my life. At the moment I'm applying it to the lottery, despite everyone who has told me there's absolutely nothing to analyze once the word "random" is attached.
Seven years ago, I pulled up the PowerBall® and Fantasy 5 (now called The Pick) number histories. Both follow the same idea: random numbers are drawn from a pool and people place bets.
What I noticed almost immediately is that the Fantasy 5 counts were very evenly distributed when I summed the counts for individual ball numbers over a range, versus the mountain peaks and valleys of the PB. Of all the analysis I've done since then, the PB is still the only game with results that aren't easily explainable (and the only one I know of, or have looked at, that still uses real balls rather than a computer simulation).
I've run counts on ranges, sometimes focusing on one too many levels, and have decided to call it quits several times, until a little crack of light dawns and brings new energy. One big example is when I found the hidden part of PB's site that not only shows which ball set each draw came from, but also indicates that 6 draws occur per drawing. I spent countless hours trying to separate the ball sets for examination, to little avail.
Then I found what appeared to be some typos or discrepancies and contacted MUSL. To my surprise, I had indeed found typos, and in return I asked if there was any way to get data prior to 2005. I now have data going back 18 years, and I've still found more typos (about 55 data points out of 11,000), roughly half of which they told me are too old to get definitive answers on (these alter some of my results, but only by a hair of a margin).
The most recent findings that baffled me (besides the fact that since 1992 they've had 4 ball sets and 6 draws per drawing) are the quad/fifth duplication and the chi-test results. I'll try to summarize both concisely here.
Quad/duplication (from when I only had 4,000 rows of data versus the 11,000 I have now):
First, I wanted to know how often 4 of the 5 drawn white-ball numbers matched 4 numbers in any other draw (choosing 4 from even a 55-ball pool gives about 341,000 combinations, which suggests it shouldn't happen very often). I created a macro in Excel to flag every occurrence and found that it happened far more than I anticipated: 103%, to be exact. That is, out of 4,000 rows of draw data, 4 of the 5 numbers on one row matched another row 4,120 times in total, because some draws/rows have multiple quad duplications elsewhere.
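For anyone who'd rather reproduce the count outside Excel, here is a minimal Python sketch of the same idea. This is my guess at an equivalent of the macro's logic, not the macro itself, and the example draws are made up:

```python
from itertools import combinations

def count_quad_matches(draws):
    """Count row-level hits where a draw shares at least 4 of its 5
    white-ball numbers with another draw. Each matching pair scores
    one hit for each of its two rows, so one row can accumulate
    several hits -- which is how totals above 100% can happen."""
    sets = [set(d) for d in draws]
    hits = 0
    for a, b in combinations(range(len(sets)), 2):
        if len(sets[a] & sets[b]) >= 4:
            hits += 2  # one hit for each row of the matching pair
    return hits

# Tiny made-up example: the first two draws share 4 numbers.
example = [
    (3, 11, 24, 38, 52),
    (3, 11, 24, 38, 49),
    (1, 2, 5, 9, 10),
]
print(count_quad_matches(example))  # -> 2
```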
Second, I wanted to know how often 5 of the 5 matched. So far (again, when I was only examining 4,000 rows), only 2 such matches appear among the public draws. Interestingly enough, numbers generated from atmospheric noise by a random-number site yielded similar results, meaning that in a span of only 1,500 consecutive draws, all 5 can match identically. But back to the 5-of-5 matches: even though there are roughly 5 million combinations, they appear in only 2 of the "public" winning draws, while 31 other identical 5-number matches just happened to fall on the pre- and post-tests instead of public draws. (Or possibly they didn't want the public to see that that kind of duplication does happen. Design of the system, anyone?)
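As a sanity check on whether these duplication counts are actually surprising, the expected number of matching pairs can be worked out under a simplifying assumption: independent, uniform draws from one fixed 5-of-55 pool. (The real history doesn't satisfy this, since the white-ball pool size changed several times over the years, so treat the output as a rough baseline only.)

```python
from math import comb

def expected_quad_pairs(n_draws, pool=55, k=5):
    """Expected number of draw pairs sharing at least 4 of k numbers,
    assuming independent uniform draws from one fixed pool
    (an approximation -- the real pool size changed over the years)."""
    total = comb(pool, k)
    p_five = 1 / total                       # all 5 numbers match
    p_four = comb(k, 4) * (pool - k) / total  # exactly 4 match
    return comb(n_draws, 2) * (p_four + p_five)

print(round(expected_quad_pairs(4000)))  # -> 577
```

The same `comb(n_draws, 2) / comb(pool, k)` idea gives the baseline for full 5-of-5 matches, which is how you can judge whether 2 public matches plus 31 pre/post-test matches is odd or expected once all 6 rows per drawing are counted.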
Chi:
After probing MUSL further about the typos they couldn't pull up data for, they also shared that they break the ball sets down separately and run chi tests. I immediately hit the internet to learn what chi was and how I could use it. Nothing really helped until, after my subconscious had soaked on it for a couple of hours, I realized that each ball set is one row of counts (a running history) across columns of ball numbers, and that expected values can be calculated, all in Excel. Done manually, Excel will give you a chi-probability-test percentage based on one setting of how deep to look. For me, going from start to finish on a particular ball set, I could get the final chi result for that whole set.
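Outside Excel, that per-ball-set chi statistic can be sketched in a few lines of Python. This is a simplified version assuming one fixed pool size and a uniform expectation; Excel's CHISQ.DIST.RT (or the older CHITEST/CHISQ.TEST against the expected range) is what turns the statistic into the probability percentage I mentioned:

```python
from collections import Counter

def chi_square_stat(draws, pool_size):
    """Pearson chi-square statistic for ball-number counts versus a
    uniform expectation, mirroring the Excel layout: one running row
    of observed counts across columns of ball numbers."""
    counts = Counter()
    for draw in draws:
        counts.update(draw)
    expected = sum(counts.values()) / pool_size  # each number equally likely
    return sum((counts[b] - expected) ** 2 / expected
               for b in range(1, pool_size + 1))
```

A perfectly even history gives a statistic of 0; the further the counts drift from uniform, the larger it grows.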
Then I realized I could create a macro that would step through the history 6 draws at a time, grouped by the ball set each drawing used (keeping one actual drawing together rather than separating its rows one line at a time). What I mean by this is: take the very first ball set's first drawing (all 6 draws, not just one), calculate the chi, and record it at its interval level/depth (in this instance, #1). Then step the macro to the next drawing on that ball set and recalculate after adding those 6 new rows of 5 drawn balls. Because you have the separation by ball set AND 6 draws per drawing, it doesn't take more than 5 weeks of history to get past the first 50 results you discard (instead of one full year), as most chi tutorials say is a needed examination step. What you get is the charts I had already posted to this thread (and the comments I included explaining how oddly they behave). Compared to numbers generated from the random sample source mentioned earlier, PB seems controlled.
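The stepping macro itself can be sketched roughly like this. It's a guess at the logic, assuming the history is a list of (ball_set_id, numbers) rows in draw order with 6 rows per drawing and one ball set per drawing:

```python
from collections import Counter, defaultdict

def rolling_chi_by_ball_set(rows, pool_size):
    """Cumulative chi statistic per ball set, stepping one drawing
    (6 rows) at a time. `rows` is a list of (ball_set_id, numbers)
    tuples in draw order. A sketch of the macro described above,
    not the actual Excel code."""
    counts = defaultdict(Counter)   # ball_set_id -> running number counts
    results = defaultdict(list)     # ball_set_id -> chi at each interval
    for i in range(0, len(rows), 6):          # 6 rows per drawing
        block = rows[i:i + 6]
        set_id = block[0][0]                  # one drawing uses one ball set
        for _, numbers in block:
            counts[set_id].update(numbers)
        c = counts[set_id]
        expected = sum(c.values()) / pool_size
        results[set_id].append(
            sum((c[b] - expected) ** 2 / expected
                for b in range(1, pool_size + 1)))
    return results
```

The first results in each list would still be discarded, as the tutorials suggest, until the expected counts per ball number are large enough for the chi test to mean anything.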
So now that I've got 18 years of data, down to the ball-set numbers and all 6 draws per drawing since then (despite 4 draws from 1997 that contain errors and affect 45 later draws), my plan is to recalculate the quad/fifth duplication and chi results, because before I was only working with 36% of the history.
Of course, if anyone cares enough to want to help dissect any aspect of this, I'm willing to share the data I have. Most don't care about "every little detail," whereas I have to know why they've done what they've done in order to attempt to come up with quality bets.