Blogtrek

2004/01/18

Statistically complicated data

I have posted a new mathematics page, "Hamlet, Part II: Complicated Numbers". In it I show that there are seven levels of complexity of numbers, and that almost all the numbers we deal with in everyday life, even those discussed among professional mathematicians, fall in the simplest category, the countably complicated numbers. These numbers take a thousand symbols or fewer to describe. The next category is the statistically complicated numbers, which take from a thousand to a quadrillion symbols to describe. The best example I can think of is a string of ten thousand digits drawn at random.
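To make the two thresholds concrete, here is a minimal sketch that sorts a description length into the two categories named above. The function name and constants are my own for illustration, and the five levels beyond "statistically complicated" are lumped together because this post does not define them.

```python
# Thresholds taken from the post: up to a thousand symbols is "countably
# complicated"; from a thousand to a quadrillion is "statistically complicated".
# The names below are hypothetical, chosen only for this illustration.
COUNTABLY_MAX = 1_000
STATISTICALLY_MAX = 10**15  # one quadrillion


def complexity_level(description_length: int) -> str:
    """Classify a number by how many symbols it takes to describe it."""
    if description_length < 0:
        raise ValueError("description length cannot be negative")
    if description_length <= COUNTABLY_MAX:
        return "countably complicated"
    if description_length <= STATISTICALLY_MAX:
        return "statistically complicated"
    # The remaining levels are not described in this post.
    return "beyond statistically complicated"


# Ten thousand random digits have no shorter description than themselves,
# so their description length is about 10,000 symbols.
print(complexity_level(10_000))
```

The boundary at exactly a thousand symbols is assigned to the simpler category here, following "a thousand symbols or fewer"; that choice is an assumption.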

But numbers are not the only things that can be complicated. So can groups of numbers, and indeed whole databases. For example, the account history of all account holders at the Bank of America is statistically complicated; it probably takes billions of symbols to describe, since each person and each transaction has to be described separately. It is a marvel of our age that our computers can handle such databases. But isn't it easier to think in terms of countably complicated structures where you can? Instead of stating how much food each person in a group, such as a Girl Scout troop or a military unit, requires in a day, come up with a rule saying that each person requires, say, 3.2 kilograms a day (don't use this figure for planning purposes; I made it up). The rule is countably complicated; listing the requirement for each person separately would be statistically complicated for a group of 1,000 or more people.
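The food example can be sketched in a few lines by comparing how many characters each description takes. Everything here is hypothetical: the per-person figures are randomly generated stand-ins, the 3.2 kilogram rule is the made-up figure from above, and counting characters is only a rough proxy for counting symbols.

```python
import random


def listing_description(requirements_kg):
    """Describe the group by stating each person's requirement separately."""
    return "; ".join(
        f"person {i}: {kg} kg" for i, kg in enumerate(requirements_kg, start=1)
    )


def rule_description(kg_per_person):
    """Describe the group with a single rule that applies to everyone."""
    return f"each person requires {kg_per_person} kg per day"


# Made-up daily requirements for a group of 1,000 people (fixed seed so the
# illustration is repeatable).
rng = random.Random(0)
group = [round(rng.uniform(2.0, 4.5), 1) for _ in range(1_000)]

listing = listing_description(group)
rule = rule_description(3.2)

print(len(listing), "characters for the per-person listing")
print(len(rule), "characters for the rule")
```

The listing runs well past a thousand characters, while the rule stays far below that, which is the whole difference between the two categories. The rule gives up individual detail, of course; it describes the group only approximately.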

In many places, it may pay to look for countably complicated rules to describe things, rather than relying on statistically complicated databases.
