DW Appliances: Kognitio to Debut in U.S., Netezza to Scale into the Stratosphere
There's plenty of action afoot in the teeming data warehouse appliance market
By Stephen Swoyer
1/9/2008
The data warehouse (DW)
appliance market is starting 2008 with two important announcements.
The latest development
in that already-teeming market segment: the reemergence of an old-new
name, the former Whitecross Systems, which plans to relaunch next
month -- at the TDWI World Conference in Las Vegas -- as Kognitio.
Strictly speaking, it isn't a relaunch:
Kognitio is already a going concern in the U.K., where it absorbed
the assets of the former Whitecross back in 2005. Last year, the
company finally shifted its attention to the North American market,
tapping industry veteran John Thompson to head up its U.S. marketing
operations. Whereas Whitecross was a somewhat familiar name in the
U.S., the "new" Kognitio will be promoting its own data-as-a-service
spin on the (increasingly ubiquitous) DW appliance.
The question, of course,
is the degree to which Kognitio's data-as-a-service mantra is different,
and what its hardware-independent underpinnings will mean. Kognitio
will be making its splash as DW appliance veterans seek to shore
up -- or in many cases expand -- their own U.S. market positions.
Consider Netezza Inc.,
the company (along with Teradata Corp.) that is widely credited
with helping legitimize and popularize the DW appliance model. This
week, Netezza announced plans to expand its Netezza
Performance Server (NPS) appliances to several times their current
capacity limit. Upcoming enhancements (including an
already-promised Compress Engine) will result in appliance
capacities of greater than 200 TB, scaling, according to Netezza
officials, all the way up to 1 PB.
A Confident Kognitio
Netezza's increased capacity
doesn't seem to faze Kognitio officials, who trumpet their own expertise
-- as well as that of predecessor Whitecross -- in the high-volume
data warehousing segment.
More to the point, argues
Thompson, Kognitio's spin on the DW appliance -- which it markets
in the form of its WX2 data warehousing software -- differs significantly
from those of its established U.S. competitors: it doesn't have
an official hardware complement. Kognitio officials claim that customers
can deploy WX2 on top of existing hardware assets.
"You take Teradata, they're
in my mind the original data warehouse appliance, and you also have
folks out there like Netezza and Dataupia and DATAllegro, which
is sort of a blended approach [of hardware and software]. Then you
have the next version appliance, which is WX2, which is software
only," Thompson says. "When we come into an environment and the
client says, 'I have five rack-mounted servers and I want to use
them for X application,' we can do that. We can run right on those
existing servers. Teradata and Netezza or any of those other [appliance]
vendors can't do that."
In this respect, Kognitio
sounds a lot like another DW vendor, ParAccel, which markets a columnar
data warehouse technology customers can deploy on top of existing
assets or order preinstalled and preconfigured on hardware assets
from Sun Microsystems Inc. and other OEMs.
The resemblance is there,
Thompson concedes, but there are key differences. For starters,
WX2 doesn't use a columnar database structure. While columnar databases
do have undeniable advantages, they can also be difficult to configure
and optimize, he argues. "We are a traditional relational database.
We don't use any indexing. We don't rely on any segmentation in
our partitioning. We allow people to bring data in, add data as
rapidly as they want, take data out. People like Vertica and ParAccel
are coming back with a columnar approach using compression, and
there are benefits to that, but -- as anyone who's ever built one
of these [columnar warehouses] knows -- when you build these massive
hypercubes, you have to do it three or four times to get it right."
Secondly, Thompson claims,
WX2 encapsulates the domain expertise that Whitecross developed
during its days as an application service provider (ASP). Consider
costing, which WX2 can compute on both a technological (i.e., how
much will a specific query cost to run in terms of system resources
or processing power) and a dollars-and-cents (i.e., how much will
a specific query actually cost the business unit to run) basis.
The takeaway, he says,
is a kind of service-enabled spin on chargeback.
"[T]he [WX2] software was written in a way so that if you write
SQL and send it into the machine, the first layer that grabs it
is what we call an optimizing compiler. This looks at the SQL and
decides whether or not to run it natively or to convert it into
machine code," he explains. "But it also does a costing allocation
-- which is based on machine resources, which can then be translated
into a dollar threshold -- and you can take that costing allocation
and convert it into dollars. So if [that allocation] comes in and
says, 'This is going to cost over X amount,' it will kick it back
to the user and say, 'Are you sure you want to run this?'"
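The costing step Thompson describes can be sketched roughly as follows. This is an illustration only, not WX2's actual interface: the conversion rate, the `CostEstimate` type, and the `submit` function are all hypothetical names invented for the example, and the sketch assumes the resource estimate has already been produced by the compiler.

```python
from dataclasses import dataclass

# Hypothetical rate: dollars charged per unit of machine resource.
DOLLARS_PER_RESOURCE_UNIT = 0.02


@dataclass
class CostEstimate:
    """Assumed output of the costing pass: estimated machine resources."""
    resource_units: float

    @property
    def dollars(self) -> float:
        # Translate the resource estimate into a dollars-and-cents figure.
        return self.resource_units * DOLLARS_PER_RESOURCE_UNIT


def needs_confirmation(estimate: CostEstimate, dollar_threshold: float) -> bool:
    """Return True when the query should be kicked back to the user."""
    return estimate.dollars > dollar_threshold


def submit(estimate: CostEstimate, dollar_threshold: float, confirmed: bool = False) -> str:
    """Run the query, or ask the user to confirm an expensive one."""
    if needs_confirmation(estimate, dollar_threshold) and not confirmed:
        return (
            f"Estimated cost ${estimate.dollars:.2f} exceeds "
            f"${dollar_threshold:.2f}. Are you sure you want to run this?"
        )
    return "running"
```

In this sketch a query over the threshold runs only when it is resubmitted with `confirmed=True`, which is one plausible way to model the "Are you sure?" prompt.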
There's more, too, according to Thompson, who cites WX2's on-the-fly
resource allocation (and deallocation) capabilities. What this means,
he says, is that customers can allocate additional resources to
meet changes in demand, deallocate resources as needed, or even
reconfigure an existing data warehouse environment to handle a completely
different workload.
"Say, during the day, you might want more of a traditional transactional
[workload], but in the evening, you want to run a regression analysis
looking at products in relation to one another," he explains. "That's
a very processor-heavy configuration, but we can automate that.
At 5:00 PM, we can reconfigure that and then set it back to the
reporting profile in the morning."
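A time-of-day switch of this kind might look something like the sketch below. The profile names and the morning switch-over time are assumptions made for illustration; Thompson gives only the 5:00 PM change and says the system reverts "in the morning."

```python
from datetime import time

EVENING_START = time(17, 0)  # 5:00 PM, per Thompson's example
MORNING_START = time(8, 0)   # assumed; the source does not give a time


def profile_for(now: time) -> str:
    """Pick the workload profile that should be active at a given time of day."""
    if MORNING_START <= now < EVENING_START:
        return "reporting"   # hypothetical name for the daytime profile
    return "analytics"       # hypothetical name for the processor-heavy profile
```

A scheduler could call `profile_for` at each boundary and apply whatever resource settings the returned profile implies.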
Kognitio plans to target a sizeable market swathe. Thompson doesn't
rule out the sub-TB segment, for one thing, and stresses that WX2
can scale to address multi-TB (and even double-digit TB) requirements,
too. Licensing, he says, is flexible: it's available on a per-user,
per-seat, per-processor, per-server, or even per-capacity (e.g.,
10 TB or less) basis. "The key is that we want to be known as the
flexible alternative to these other [appliance] vendors," he indicates.
Netezza Scales Even Larger
Kognitio's coming, but
DW appliance pioneer Netezza, for one, isn't sitting still.
Late last year, the firm
announced plans to deliver a new Compress Engine feature for its
NPS systems. That feature is slated to become available by the middle
of this year. Ditto for Netezza's upcoming NPS expansion, which
it says will result in DW appliance configurations that scale beyond
the 100 and 200 TB barriers -- scaling, in some cases, on up to 1
PB, according to officials.
Compression (or "compressibility")
is of growing importance in the high-end data warehousing segment.
For one thing, it lets customers use fewer physical disk drives
or storage arrays to support ever-larger data warehousing configurations.
Furthermore, a reduction in physical storage translates into a corresponding
reduction in power, cooling, and data center real-estate costs.
Contrary to what you
might think, however, Netezza isn't re-architecting its relational
data warehouse format -- although its Compress Engine does
translate data into a kind of hybrid columnar database structure.
(Columnar databases tend to boast extremely high compression levels.)
Mostly, officials claim, Netezza relies on the processing power
of the field programmable gate arrays (FPGAs) that sit alongside the
PowerPC processors in its Snippet Processing Units (SPUs),
as well as proprietary compression algorithms. The combination
lets it achieve extremely high on-the-fly compression rates,
says Phil Francisco, director of product marketing with Netezza.
"What the Compress Engine
is is an extension of that FPGA capability to do that [decompression]
of the data as it comes off the disk drive as fast as you can read
it from the disk. You get sort of maximum performance on the throughput
of the system. We do [perform] the compression in a columnar way,
but the system is still a [row]-based database management system.
The [compression] algorithm that we use is our own patent-pending
one," he explains.
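The general idea of compressing "in a columnar way" while keeping rows as the logical unit can be sketched as below. This is not Netezza's patent-pending algorithm, and it runs in software rather than on an FPGA: it simply pivots a block of rows into columns, compresses each column with the standard `zlib` library, and reverses the process on read. The function names are hypothetical.

```python
import json
import zlib


def compress_block(rows):
    """Compress a block of rows column by column; rows stay the logical unit."""
    columns = list(zip(*rows))  # pivot the block: one tuple per column
    return [zlib.compress(json.dumps(col).encode("utf-8")) for col in columns]


def decompress_block(compressed_columns):
    """Decompress each column and pivot back into the original rows."""
    columns = [
        json.loads(zlib.decompress(c).decode("utf-8")) for c in compressed_columns
    ]
    return [tuple(values) for values in zip(*columns)]
```

Grouping values by column before compressing tends to help because values within a column (a country code, a status flag) repeat far more often than values across a row.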
Chalk it up to the advantage
of Netezza's PowerPC-based architecture, which is cooler and more
efficient than competitive designs based on chips from Intel Corp.
and Advanced Micro Devices (AMD) Inc. "We can buy embedded versions
of PowerPC that use very little power and allow us to be very power
efficient, and that leads to a really significant power savings
for customers purchasing our solution," Francisco claims.
"Each one of those [processing
units] consumes only 30 watts of power and we can put a hundred
of those in a rack and get very high efficiency," he continues,
noting that competitive chips (such as multi-core designs from Intel
and AMD) dissipate several times as much power. "We can deliver
better data densities and not sacrifice performance in doing that
-- we'll be able to actually increase performance in doing so."
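Francisco's figures imply a simple per-rack calculation, sketched here using only the two numbers he quotes. It covers the processing units alone; disks, fans, and other rack components are not part of his claim and are left out.

```python
WATTS_PER_UNIT = 30   # per processing unit, as quoted by Francisco
UNITS_PER_RACK = 100  # "a hundred of those in a rack"


def rack_power_watts(units: int = UNITS_PER_RACK, watts_per_unit: int = WATTS_PER_UNIT) -> int:
    """Total power drawn by the processing units in one rack, in watts."""
    return units * watts_per_unit


print(rack_power_watts())  # 3000 watts, i.e. 3 kW for the processing units
```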
Netezza isn't sweating
the heightened competition in the DW appliance segment, either,
according to Francisco. "I like where we are in the market. I like
our opportunity and I like what we see in front of us."
Stephen Swoyer is a technology writer based in Athens, Ga. You can contact
Stephen via E-mail at stephen.swoyer@spinkle.net.