Patterns From The Petabytes

DQI Bureau

Internet-age apps have pushed gigabit backbones to breaking point. Project Oxygen will criss-cross the oceans with terabit fiber, a base for Internet 2. Lucent has a cable with 432 terabit fibers, several times the internet's bandwidth. But our apps will push these frontiers.



Data users now talk tera: 10^12 bytes. A thousandfold up, we get to peta, 10^15 bytes, and then exa, 10^18 bytes.




My company works with about 1 terabyte of software and data. IBM manages 176TB on its internal network, and 576TB for commercial accounts. The internet has 1,000 publicly accessible terabytes. There's lots more (about 1,000 petabytes) on other networks, and 20 times that (20 exabytes) offline. IBM estimates another 200-odd EB in analog form on our planet.
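
The unit arithmetic behind these figures can be checked with a short sketch. It assumes decimal prefixes (powers of ten, as the column defines them); the variable names are illustrative only.

```python
# Decimal storage prefixes as defined in the column (powers of ten).
TERA = 10**12
PETA = 10**15
EXA = 10**18

# Figures quoted in the text, expressed in bytes.
public_internet = 1_000 * TERA   # "1,000 publicly accessible terabytes"
other_networks = 1_000 * PETA    # "about 1,000 petabytes"
offline = 20 * other_networks    # "20 times that"


def in_units(n_bytes, unit):
    """Express a byte count in the given unit."""
    return n_bytes / unit


print(in_units(public_internet, PETA))  # 1.0 (one petabyte)
print(in_units(other_networks, EXA))    # 1.0 (one exabyte)
print(in_units(offline, EXA))           # 20.0 (matches "20 exabytes")
```

So the public internet comes to one petabyte, the other networks to one exabyte, and the offline figure to the 20 exabytes quoted.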



It's a challenge to store and carry this data, pushing areal density and fiber to limits. A bigger challenge is to make sense of this data.



Amex gets 10,000 responses from a million mailers. That's 990,000 junk mailers. Now, if only they knew who would respond... The ten million rupees saved would have made the offer competitive. Right now, customers pay more for a service because of the majority who don't respond.
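
A quick sketch of the mailer arithmetic. The cost per mailer is not stated in the column; the figure below is only what the quoted saving would imply, assuming the saving equals the cost of the unanswered mailers.

```python
mailers_sent = 1_000_000
responses = 10_000
savings_rupees = 10_000_000  # "ten million rupees saved", as quoted

unanswered = mailers_sent - responses
response_rate = responses / mailers_sent

# Assumption: the quoted saving is the cost of the unanswered mailers.
implied_cost_per_mailer = savings_rupees / unanswered

print(unanswered)                         # 990000
print(f"{response_rate:.1%}")             # 1.0%
print(round(implied_cost_per_mailer, 2))  # 10.1
```

Under that assumption, each mailer costs roughly ten rupees, and 99 of every 100 go unanswered.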



Can Amex predict respondents? That's the problem a new breed grapples with: the data miners. They're looking for patterns in petabytes of data. Perhaps those who own Marutis, travel six times a year, and own three appliances are very likely to buy the Amex card. Perhaps two-wheeler owners with one TV can be excluded...



Data mining at Wal-Mart found a link between cosmetics and greeting cards. Applying this, they pushed up sales in both categories by 30%.



Another store almost dropped Feta, a low-margin cheese for a small niche. Until it found that its buyers also bought high-margin Swiss chocolates. Enhancing the Feta choice helped increase chocolate sales, and profits.
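
Links like these are commonly surfaced by counting how often items share a basket. The sketch below is a minimal illustration of that idea using the standard support, confidence, and lift measures. The baskets are invented toy data, not figures from any store, and the column does not say which method the stores actually used.

```python
# Hypothetical toy baskets, invented purely for illustration.
baskets = [
    {"feta", "swiss chocolate", "bread"},
    {"feta", "swiss chocolate"},
    {"bread", "milk"},
    {"feta", "olives", "swiss chocolate"},
    {"milk", "swiss chocolate"},
    {"bread", "milk", "olives"},
]


def support(items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent):
    """Of the baskets with `antecedent`, the share that also has `consequent`."""
    return support({antecedent, consequent}) / support({antecedent})


def lift(antecedent, consequent):
    """Confidence relative to the consequent's baseline popularity."""
    return confidence(antecedent, consequent) / support({consequent})


print(support({"feta"}))                            # 0.5
print(confidence("feta", "swiss chocolate"))        # 1.0
print(round(lift("feta", "swiss chocolate"), 2))    # 1.5
```

A lift above 1 means the two items appear together more often than chance alone would suggest, which is the kind of signal that would argue for keeping the Feta on the shelf.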



Data mining, which means digging up patterns in terabytes of data and predicting future behavior, will radically change our understanding of consumer behavior beyond Y2K, and of so many areas that generate data.



Consider our Election Commission's 600 million records. If the right ten demographic parameters were included next time round, mining could evolve this into a staggering marketing database.



Data mining is growing up from jargon to killer app, one that will go mainstream by 2001. Making those staggering numbers make sense.



pkr@cmil.com
