Thinking About the Pivotal Announcements…

Yesterday I provided a model for how a business might use open source as a path to profitability (here). This is the game Pivotal seems to be playing with their release of Hadoop, Gemfire, HAWQ, and Greenplum into open source. I do not know their real numbers… so they may need more or fewer additional customers than the mythical company to get back to break-even. But it is unlikely that any company can turn the corner from a license-based revenue stream to a recurring revenue stream in a year… so Pivotal must be looking at a loss. And when losses come it is usual to cut costs… to cut R&D.

There has already been a brain-drain out of the database ranks at Pivotal as they went “all in” on Hadoop. They likely hope an open source community will pick up the slack… but I can see no track record of success in building a community to engineer a commercial product turned open source. This is especially problematic for Gemfire, an old technology that has been in the commercial space for a very long time. HAWQ has to compete for database engineering resources with the other Hadoop RDBMS technologies… that will be difficult. Greenplum has a chance as it is based on PostgreSQL… but it has drifted a long way from the current PostgreSQL code base. There is danger here.

The bottom line… Greenplum, HAWQ, and Gemfire have become risky propositions for both the current customer base and for new customers. I’ll leave it to you to evaluate the risk as this story unfolds. Still, with the risk comes reward… the cost of acquiring Greenplum will drop dramatically, and today Greenplum is a competitive product. In addition, if Greenplum gains some traction, it will put price pressure on the other database products. Note that HAWQ was already marked down to open source price levels… and part of Pivotal’s problem was that HAWQ was eating into the Greenplum market. With these products priced at similar levels, choosing between them gets a little odd… but the advantage goes to customers looking at Greenplum.

There is one great outcome for Pivotal Hadoop customers… the fact that Hortonworks will more-or-less subsume Pivotal Hadoop leaves those folks in a better place than before.

If you consider the thought experiment, you have to ask yourself why a company that was breaking even would take this risky route. It could be that they took the route because they were not breaking even and this was a possible path back to break-even. Also consider… open sourcing code is the modern, graceful way to retire an unprofitable product line.

This is sound thinking by Pivotal… at its creation, EMC handed Pivotal several troubled, unprofitable assets, and these announcements give Pivotal a path forward. If the database products cannot carry their weight then they will go into maintenance mode and slowly fade. Too bad… as you know, I consider Greenplum a solid product whose potential was wasted. But Pivotal has a very nice product in Cloud Foundry… and they clearly see this as their route to profitability and to an IPO… a route that no longer includes a significant contribution from database products.

Open Source is Not a Market…

This post is more about the technology business than about technology… but it may be relevant as you try to sort out winners and losers… and that sort of sorting matters when you consider new companies that may, or may not, succeed in the long run.

To make my point let us do a little thought experiment. Imagine a company doing $100M in revenue with a commercial, not open source, database product. They win the $100M in revenue by competing with Oracle, IBM, Microsoft, Teradata, et cetera… and maybe competing a little here and there with some open source products.

Let’s assume that they make 50% of their revenue from services and support, and that their average sale is $2M… so they close 25 deals a year competing in this market. Finally, let’s assume that they break even each year and spend 20% of their revenues on R&D. The industry average for annual support is 20% of the license price… so with each $2M sale they add $400K in recurring revenue.

They are considering making their product open source. Let’s assume that they make the base product free… and provide some value-added offering that costs $200K for the average buyer. Further, they offer a support package for the same $400K/year that customers currently pay. How does the math work out?

Let’s baseline against the 25 deals/year…

If they make 25 sales and every buyer buys both the support package and the value-added offer, the average sale drops from $2M to $200K, sales revenue drops from $50M to $5M, and annual revenue drops from $100M to $55M… so the company loses $45M. Starting off, they need to make 225 more sales at $200K each just to break even. But now it gets complicated… every 5 extra deals add $2M in support fees the following year… so if they sell 113 extra deals in year one, then in year two the recurring support makes up the entire $45M difference and they are back to break-even going forward. If it takes them two years to win the extra deals then they lose money in year two… but are back to break-even in year three.
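
To make the arithmetic easy to check, here is a minimal sketch in Python… every figure in it is an assumption stated in the paragraphs above, not a real number from any vendor.

```python
# A minimal sketch of the thought experiment above. Every figure is an
# assumption from the post, not a real number from any vendor.
import math

BASELINE_DEALS = 25                 # new license deals per year
LICENSE_PRICE = 2_000_000           # average commercial sale
SUPPORT_RATE = 0.20                 # annual support as a fraction of license price
VALUE_ADD_PRICE = 200_000           # price of the value-added offer once the base is free
SUPPORT_FEE = int(LICENSE_PRICE * SUPPORT_RATE)   # $400K/year per customer

# Baseline: 25 deals x $2M = $50M in new licenses, plus $50M in services/support.
baseline_license_revenue = BASELINE_DEALS * LICENSE_PRICE             # $50M

# Open source scenario: the same 25 buyers take the $200K value-add instead.
open_source_sales = BASELINE_DEALS * VALUE_ADD_PRICE                  # $5M
year_one_gap = baseline_license_revenue - open_source_sales           # $45M shortfall

# Extra $200K deals needed in year one just to cover the gap...
extra_deals_break_even_now = math.ceil(year_one_gap / VALUE_ADD_PRICE)      # 225

# ...or extra deals whose $400K/year support fees close the gap the following year.
extra_deals_break_even_next_year = math.ceil(year_one_gap / SUPPORT_FEE)    # 113

print(f"Year-one shortfall: ${year_one_gap:,}")
print(f"Extra deals to break even immediately: {extra_deals_break_even_now}")
print(f"Extra deals whose support closes the gap in year two: {extra_deals_break_even_next_year}")
```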

From here it gets even more complicated. The mythical company above sells the baseline of 25 new copies a year with an enterprise sales force that is expensive. There is no way that the same sales force that services 25 sales/year could service 100+ extra deals. So either costs go up or the 100+ extra customers become unattainable. We might hope that the cost of sales will drop way off as the sales price moves to $200K. This is not unreasonable… but certainly not guaranteed. Further, if you are one of the existing sales staff, you have to sell 10X the deals just to make the same commission. Finally, these numbers assume that every customer buys the value-add and gets enterprise-level support. Reality will be something less than this.

We might ask: is it even possible to sell 100+ more copies of the same product in the same market? Let us be clear that the market the database product plays in has not changed. Open Source is not a market. All we have done is reduce the sales price of the product in the hope that price is a significant driver in that market.

This is not meant as an academic exercise. Tomorrow we will consider how this thought experiment applies to Pivotal’s announcements last week… and to the future of Pivotal’s database assets (here).

How DBMS Vendors Admit to an Architectural Limitation: Part 1 – Oracle Exadata

Database vendors don’t usually admit to shortcomings… they protest that they have no shortcomings until the market suggests otherwise… then they make some sort of change that signals an admission. This series will explore three of these admissions: Oracle and the shared-nothing architecture, DB2 on the mainframe and the shared-nothing architecture, and Teradata and in-memory processing.

For years Oracle verbally thrashed Teradata in the market… proclaiming that the shared-nothing architecture was bunk. But in the data warehousing space Teradata acquired a large chunk of the market; and more importantly, they won more business as the size of the data warehouses grew. The reason for this is twofold: the shared-nothing architecture lets you deliver more I/O bandwidth to the problem… and once the data is read off disk, it scales out to deliver more compute for complex queries.
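
To make the bandwidth point concrete, here is a rough back-of-the-envelope sketch… the 10TB table size and 500MB/sec per-node scan rate are illustrative assumptions, not benchmark figures.

```python
# Back-of-the-envelope illustration of the I/O bandwidth point: in a
# shared-nothing cluster each node scans only its slice of the fact table,
# so aggregate scan bandwidth grows with the node count. The table size and
# per-node scan rate below are illustrative assumptions, not benchmark figures.

FACT_TABLE_TB = 10             # size of the big fact table, in terabytes
SCAN_RATE_MB_PER_SEC = 500     # assumed sequential scan rate per node

def full_scan_minutes(nodes: int) -> float:
    """Time to scan the whole table when it is spread evenly across the nodes."""
    table_mb = FACT_TABLE_TB * 1_000_000
    per_node_mb = table_mb / nodes
    return per_node_mb / SCAN_RATE_MB_PER_SEC / 60

for nodes in (1, 8, 32, 128):
    print(f"{nodes:4d} node(s): full scan in ~{full_scan_minutes(nodes):.1f} minutes")
```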

Finally Oracle had enough and delivered Exadata, a storage engine attached to conventional Oracle RAC that provided shared-nothing I/O bandwidth to the biggest part of the problem… the full scan of big fact tables. This was an admission that they had been wrong all along.

Exadata was a tack-on… not a fundamental redevelopment of the Oracle database engine. They used the 80/20 rule to get something to market quickly and stem the trickle of Oracle customers who were out of gas on RAC and headed to shared-nothing products: Teradata, Netezza, and Greenplum.

This was a very smart move and it worked. Even though the 80/20 approach left a significant class of queries underserved… the complex queries that must process large working sets to execute joins… Exadata solved enough of the problem to keep devout Oracle shops in the church. Only the shops who felt that complex query performance was important enough to warrant the cost of a migration (for an existing DW that had grown up) or the lesser cost of introducing a new technology (for a new DW) would move.
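
As a toy illustration of that 80/20 split (not Oracle code… just a sketch with made-up figures): pushing filters down to the storage layer slashes the data shipped for selective scans, but a join whose working set is most of the table gets little relief.

```python
# Toy illustration (not Oracle code) of why scan offload covers the common
# case but not the complex-join case: the storage layer filters rows in
# parallel and ships only the survivors to the database nodes. A join whose
# working set is most of the table still ships, and must be processed,
# almost in full. All figures are made up for illustration.

FACT_ROWS = 1_000_000_000      # rows in the big fact table
ROW_BYTES = 200                # average row width in bytes

def gb_shipped_to_database_nodes(selectivity: float, offload: bool) -> float:
    """GB that must travel from storage to the database nodes for processing."""
    rows = FACT_ROWS * selectivity if offload else FACT_ROWS
    return rows * ROW_BYTES / 1e9

# A selective scan benefits enormously from offload...
print(gb_shipped_to_database_nodes(0.01, offload=True))    # ~2 GB instead of ~200 GB

# ...but a join that needs 90% of the table as its working set does not.
print(gb_shipped_to_database_nodes(0.90, offload=True))    # ~180 GB either way
```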

So, while Exadata was a smart move… it is a clear admission that shared-nothing is the right architecture for data warehouses and marts. It also makes clear that it is silly to build a warehouse or mart on normal Oracle or on RAC unless you consider your database an inviolable part of a technological creed.

In my opinion selecting a database is an engineering process that does not require orthodoxy… we should be strong enough engineers to pick the better technology and learn it. Being an “Oracle shop” is lazy.

Note that the in-memory technologies provided in Oracle 12c are significant… and for warehouses and marts that will fit on a single node, 12c, as it matures, will be a fine choice for the orthodox Oracle shop and for others. For bigger data applications you will require Exadata and the limitations that come with it.

This provides a nice transition to the Part 2 post on Teradata and in-memory.
