Monthly Archives: March 2012

I hate to use your list against you..

Often I like the material presented at www.informationisbeautiful.net, but a recent posting (http://www.informationisbeautiful.net/2012/rhetological-fallacies/) fell short.  The posting is an ‘infographic’ (a stretch of the term) of Rhetological Fallacies.  Nice.  I have spent a bit of time looking at these types of things when doing Current Reality Trees / Future Reality Trees (part of Goldratt’s Theory of Constraints).

However, reading through the list, I started to see that this list was a bit incomplete or incorrect (at least in the explanations of the entries).  I could even use entries in the list to refute other entries.  For example:

“Gambler’s Fallacy:  Assuming the history of outcomes will affect future outcomes”

Now, I know what the author was getting at, but the way this is stated is incorrect.  The example given was:

“I’ve flipped this coin 10 times in a row and it’s been heads therefore the next coin flip is more likely to come up tails”

So in the example, the author is correct: the flips are independent events (the probability of the next flip does not depend on the outcome of previous flips, i.e. they are Bernoulli trials).  However, in the ‘definition’ of Gambler’s Fallacy, the author left out the critical word ‘independent’.  If the events are not independent (e.g. the weather conditions observed at the start of each hour), then the probability of a future outcome does depend on the outcomes observed in the past.  For example, we are more likely to observe rain at 2pm if we observed rain at 1pm, and that increase in probability is measurable.
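To make the distinction concrete, here is a minimal simulation sketch in Python.  The rain probabilities (0.7 after a rainy hour, 0.1 after a dry one) are made-up numbers chosen only for illustration, and the function names are hypothetical.  It compares a fair coin, where a streak of heads should tell us nothing, with a simple two-state weather model, where the previous hour does matter.

```python
import random

def conditional_rate(seq, k=1):
    """Fraction of True outcomes among positions preceded by k consecutive Trues."""
    hits = total = 0
    for i in range(k, len(seq)):
        if all(seq[i - k:i]):
            total += 1
            hits += seq[i]
    return hits / total

def coin_flips(n, rng):
    """Independent fair flips (True = heads)."""
    return [rng.random() < 0.5 for _ in range(n)]

def hourly_rain(n, rng, p_rain_after_rain=0.7, p_rain_after_dry=0.1):
    """Dependent events: the chance of rain depends on the previous hour.
    The two probabilities are assumed values, not measured ones."""
    raining = False
    hours = []
    for _ in range(n):
        p = p_rain_after_rain if raining else p_rain_after_dry
        raining = rng.random() < p
        hours.append(raining)
    return hours

rng = random.Random(2012)  # fixed seed so the run is repeatable
flips = coin_flips(200_000, rng)
rain = hourly_rain(200_000, rng)

coin_overall = sum(flips) / len(flips)
coin_after_streak = conditional_rate(flips, k=5)  # heads after 5 heads in a row
rain_overall = sum(rain) / len(rain)
rain_after_rain = conditional_rate(rain, k=1)     # rain after a rainy hour

print(f"P(heads)                = {coin_overall:.3f}")
print(f"P(heads | 5 heads)      = {coin_after_streak:.3f}")
print(f"P(rain)                 = {rain_overall:.3f}")
print(f"P(rain | rain last hour) = {rain_after_rain:.3f}")
```

Under these assumptions the coin's conditional rate should stay close to its overall rate, while the rain rate after a rainy hour should sit well above the overall rain rate.  History is irrelevant in the first case and informative in the second.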

Using the author’s own list, they fell prey to ‘Composition Fallacy: Assuming that a characteristic or beliefs of some or all of a group applies to the entire group’.  (That sentence needs some work).  Some events are independent, and expecting their history to matter is indeed a ‘Gambler’s Fallacy’.  But not all events are independent, so the definition takes a characteristic of some events and applies it to all of them.

There are a few others in this list which annoy me (such as ‘Appeal to Probability’), not because of the ‘idea’ behind them, but because of how vaguely they are expressed.

Filed under General, Systems Engineering

Big Data use cases

A pretty good summary of use cases for ‘big data’.  Use cases are always the first thing people ask about when they are exposed to the idea of ‘big data’: “What the heck do _we_ do which is considered Big Data?”  A lot of the time the question comes up because organizations don’t currently deal with these use cases BUT SHOULD to remain competitive.  Things are a-changing.

http://practicalanalytics.wordpress.com/2011/12/12/big-data-analytics-use-cases/

Filed under Big Data, Data, Data Management, Systems Engineering

Incomplete analysis – finding patterns in noise

Kaiser Fung, author of Numbers Rule Your World, posted a blog entry about ‘Muzzling Data’: http://junkcharts.typepad.com/numbersruleyourworld/2012/03/we-sometimes-need-a-muzzle-.html

The entry talks about some automatic data analysis done by zip code, and how it was projecting the deviation of average lifespan for individuals in a zip code, broken out by first name. He goes on to show how and why this type of analysis is incomplete. Without a complete view of the data (i.e. how much lifespan varies across the population overall), it is easy to find patterns in the noise of the data. He theorizes that this type of incomplete analysis might yield headlines such as:

“Your first name reduces your life expectancy!!”, or “Margaret, it’s time to become Elizabeth!”. And why not “James, if you want to live longer, become Elizabeth now!”

The analyst needs to ensure that they are not identifying patterns in noise that are merely an artifact of their methodology or of an incomplete analysis.
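A quick way to see how easily this happens is to simulate it.  The sketch below is not the analysis from Kaiser Fung's post; it is a hypothetical illustration in which every first name draws lifespans from exactly the same distribution.  The names, the mean of 78 years, the standard deviation of 12 years and the group size of 25 are all assumed values.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# All values below are assumptions chosen for illustration only.
POP_MEAN = 78.0    # population mean lifespan, in years
POP_SD = 12.0      # population standard deviation, in years
GROUP_SIZE = 25    # people per first name within one zip code
names = [
    "Margaret", "Elizabeth", "James", "Mary", "John", "Robert", "Linda",
    "William", "Patricia", "David", "Barbara", "Richard", "Susan", "Joseph",
    "Jessica", "Thomas", "Sarah", "Charles", "Karen", "Daniel",
]

# Every name is sampled from the SAME distribution: the name has no effect.
group_means = {}
for name in names:
    lifespans = [rng.gauss(POP_MEAN, POP_SD) for _ in range(GROUP_SIZE)]
    group_means[name] = statistics.mean(lifespans)

best = max(group_means, key=group_means.get)
worst = min(group_means, key=group_means.get)
gap = group_means[best] - group_means[worst]

# How much a group mean is expected to wobble from sampling noise alone.
standard_error = POP_SD / math.sqrt(GROUP_SIZE)

print(f"Longest-lived name:  {best} ({group_means[best]:.1f} years)")
print(f"Shortest-lived name: {worst} ({group_means[worst]:.1f} years)")
print(f"Gap between them:    {gap:.1f} years")
print(f"Standard error of one group mean: {standard_error:.1f} years")
```

Even though the name has no effect by construction, the ‘best’ and ‘worst’ names end up several years apart, which is headline material if the overall variability is ignored.  Comparing the gap to the standard error of a group mean is one simple check on whether a difference is larger than noise would produce.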

Filed under Data, Systems Engineering