Thinking About the Pivotal Announcements…

Yesterday I provided a model for how business sees open source as a means to be profitable (here). This is the game Pivotal seems to be playing with their release of Hadoop, Gemfire, HAWQ, and Greenplum into open source. I do not know their real numbers… so they may need more or fewer additional customers than the mythical company to get back to break-even. But it is unlikely that any company can turn the corner from a license-based revenue stream to a recurring revenue stream in a year… so Pivotal must be looking at a loss. And when losses come it is usual to cut costs… to cut R&D.

There has already been a brain-drain out of the database ranks at Pivotal as they went “all in” on Hadoop. They likely hope for an open source community to pick up the slack… but I cannot point to a body of success in building a community to engineer a commercial product turned open. This is especially problematic for Gemfire, an old technology that has been in the commercial space for a very long time. HAWQ has to compete for database resources with the other Hadoop RDBMS technologies… that will be difficult. Greenplum has a chance as it is based on PostgreSQL… but it is a long way from the current PostgreSQL code base these days. There is danger here.

The bottom line… Greenplum and HAWQ and Gemfire have become risky propositions for both the current customer base and for new customers. I’ll leave it to you to evaluate the risk as this story unfolds. Still, with the risk comes reward… the cost of acquiring Greenplum will drop dramatically and today Greenplum is a competitive product. In addition, if Greenplum gains some traction, it will put price pressure on the other database products. Note that HAWQ was already marked down to open source price levels… and part of Pivotal’s problem was that HAWQ was eating at the Greenplum market. With these products priced at similar levels the choice between them becomes a little odd… but the advantage goes to customers looking at Greenplum.

There is one great outcome for Pivotal Hadoop customers… the fact that Hortonworks will more-or-less subsume Pivotal Hadoop leaves those folks in a better place than before.

If you consider the thought experiment you would have to ask yourself: why would a company that was breaking even take this risky route? It could be that they took the route because they were not breaking even and this was a possible path to get even. Also consider… open sourcing code is the modern graceful way to retire an unprofitable product line.

This is sound thinking by Pivotal… at Pivotal’s creation, EMC handed it several unprofitable, troubled assets, and these announcements give Pivotal a path forward. If the database products cannot carry their weight then they will go into maintenance mode and slowly fade. Too bad… as you know I consider Greenplum a solid product whose potential was wasted. But Pivotal has a very nice product in Cloud Foundry… and they clearly see this as their route to profitability and to an IPO… a route that no longer includes a significant contribution from database products.

How DBMS Vendors Admit to an Architectural Limitation: Part 1 – Oracle Exadata

Database vendors don’t usually admit to shortcomings… they protest that they have no shortcomings until the market suggests otherwise… then they make some sort of change that signals an admission. This series will explore three of these admissions: Oracle and the shared-nothing architecture, DB2 on the mainframe and the shared-nothing architecture, and Teradata and in-memory processing.

For years Oracle verbally thrashed Teradata in the market… proclaiming that the shared-nothing architecture was bunk. But in the data warehousing space Teradata acquired a large chunk of the market; and more importantly, they won more business as the size of the data warehouses grew. The reason for this is two-fold: the shared-nothing architecture lets you deliver more I/O bandwidth to the problem… and once you have read the disk it provides scalability to deliver more compute to process complex queries.

Finally Oracle had enough and they delivered Exadata, a storage engine attached to the conventional Oracle RAC that provided shared-nothing I/O bandwidth to the biggest part of the problem… the full file scan of big fact tables. This was an admission that they had been wrong all along.

Exadata was a tack-on… not a fundamental redevelopment of the Oracle database engine. They used the 80/20 rule to quickly get something to market and stem the trickle of Oracle customers who were out of gas on RAC and headed to shared nothing products: Teradata, Netezza, and Greenplum.

This was a very smart move and it worked. Even though the 80/20 approach left a significant number of queries behind… the complex queries that needed to process large working sets to execute joins… Exadata solved enough of the problem to keep devout Oracle shops in the church. Only the shops who felt that complex query performance was important enough to warrant the cost of a migration (for an existing DW that had grown up) or the lesser cost of introducing a new technology (for a new DW) would move.

So, while Exadata was a smart move… it is a clear admission that shared-nothing is the right architecture for data warehouses and marts. This admission makes it clear that it is silly to build a warehouse or mart on normal Oracle or on RAC unless you consider your database an inviolable part of a technological creed.

In my opinion selecting a database is an engineering process that does not require orthodoxy… we should be strong enough engineers to pick the better technology and learn it. Being an “Oracle shop” is lazy.

Note that the in-memory technologies provided in Oracle 12c are significant… and for warehouses and marts that will fit on a single node, 12c, as it matures, will be a fine choice for the orthodox Oracle shop and for others. For bigger data applications you will require Exadata… and the limitations that come with it.

This provides a nice transition to the Part 2 post on Teradata and in-memory.


Some Database Performance Concepts

I’m working on a new idea… it may or may not pan out… but here are some concepts for your consideration… with some thoughts on their performance implications.

First a reminder… a reality check. In my experience if you POC two databases at about the same price point… and one is 30% faster than the other, 1.3X, then 50% of the time the faster DBMS will win the business. If one DBMS is 2X faster… then it will win the business 90% of the time. The 10% where the faster product loses will be because of internal politics or, for an existing application, due to the migration costs. Note that IMO it is silly to do a POC if you know up front that you will not pick the winner.

Now to the concepts… Note that these are ballpark numbers to help you think about trade-offs…

The latency to start fetching data from DRAM is 100 ns… from disk it is 10M ns. If we assume that a smart RDBMS pre-fetches 80% of the data into DRAM then we can assume that an in-memory DBMS has a 200,000X performance advantage over a disk-based system.

The latency to a Flash/SSD device is 100K-200K ns. With the same 80% pre-fetch assumption an in-memory DBMS will be 20,000X faster.

Note that neither of these models include data transfer times which will favor in-memory databases even more.

If we have a hybrid system with both disk and SSD and we assume that 90% of the reads hit the SSD and that both layers in the storage hierarchy achieve 80% pre-fetch, then the in-memory system will be 38,000X faster. If fewer than 90% of the reads hit the SSD, then the latency goes up quickly.
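
If it helps to see the mechanics, here is a minimal Python sketch of the weighted-latency arithmetic behind this kind of estimate. The latencies and pre-fetch ratios are the assumptions above… and since the numbers in this post are ballpark and fold in some hand-waving about transfer times, treat the output as directional rather than expecting it to reproduce the multipliers exactly.

```python
# A back-of-the-envelope model of storage-hierarchy latency. The latencies
# and pre-fetch ratios are illustrative assumptions from the text, not
# measurements; the output is directional only.

DRAM_NS = 100           # latency to start a fetch from DRAM
SSD_NS = 150_000        # mid-point of the 100K-200K ns Flash/SSD range
DISK_NS = 10_000_000    # latency to start a fetch from disk

def effective_latency(prefetch_ratio: float, slow_tier_ns: float) -> float:
    """Average latency when prefetch_ratio of the reads are already in DRAM."""
    return prefetch_ratio * DRAM_NS + (1.0 - prefetch_ratio) * slow_tier_ns

for name, slow_ns in [("disk-based", DISK_NS), ("SSD-based", SSD_NS)]:
    effective = effective_latency(0.80, slow_ns)
    print(f"{name}: ~{effective:,.0f} ns effective latency, "
          f"~{effective / DRAM_NS:,.0f}X slower than pure in-memory")
```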

These numbers form the basis for selecting in-memory caches like Teradata’s Intelligent Memory option as well as in-memory offerings from IBM, Microsoft, Oracle and SAP.

For typical data warehouse workloads column compression will provide around a 2.5X performance boost over row compression. This has two implications: you will get 2.5X better performance using column storage and you will get 2.5X more data into the faster levels of your storage hierarchy… more in SSD and more in-memory.

If we assume that a typical query only touches 10% of the columns in the tables addressed… then column projection provides a 9X performance boost over a row store. Exadata does not support column projection in the storage layer… and other hybrid row-or-column systems provide it only for columnar tables.

If we assume that the average latency from the processor caches is 10ns (.5ns L1, 7ns L2, 15ns L3) and the latency to DRAM is 100ns then an in-memory system which pre-fetches data effectively into the processor caches will be 10X faster than one which goes to DRAM. If we assume that a standard RDBMS which processes uncompressed standard data types (no vector processing) gets a 20% cache hit ratio then the advantage to a cache-aware RDBMS which loads full cache lines is around 8X. HANA, BLU, and the Oracle in-memory products are cache-aware and get this boost with some caveats.
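
To make the cache arithmetic concrete, here is a tiny sketch using the numbers above… the 10ns blended cache latency, the 20% hit ratio for a conventional engine, and the assumption that a cache-aware engine serves nearly every touch from cache are illustrative inputs, and the ~8X falls out.

```python
# The cache-line arithmetic from the paragraph above. The 10ns blended cache
# latency, 100ns DRAM latency, and 20% hit ratio are illustrative assumptions.

CACHE_NS = 10   # blended L1/L2/L3 latency
DRAM_NS = 100   # latency to DRAM

conventional = 0.20 * CACHE_NS + 0.80 * DRAM_NS  # 20% of touches hit cache
cache_aware = CACHE_NS                           # full cache lines pre-fetched

print(f"conventional engine: {conventional:.0f} ns per data touch")
print(f"cache-aware engine : {cache_aware:.0f} ns per data touch")
print(f"advantage          : ~{conventional / cache_aware:.0f}X")
```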

BLU and the Oracle in-memory option are hybrid systems that often convert data to a row form for processing (see here for some data on Oracle). If we assume that they use the full columnar in-memory vector-based structures 50% of the time then these products will see a 4X performance boost. HANA recommends that all data be stored in a columnar form so it would often see the full 8X boost.

These vector-based processes also avoid the cost of decompression… and since they process compressed vector data they can fit more information into each cache line. There is another 20%-200% (1.2X-2X) boost here but I cannot estimate it closer than that.

Finally, the vector based processes use the high performance computing instruction sets (AVX2) offered on modern CPUs… and this provides another 10X+ boost. Again, BLU and Oracle will utilize the vector form less often than HANA so they will see a boost over products like Teradata… but not see as large a boost as HANA.

There are other features at play here… some products, like HANA, shard data in-memory to get all of the cores busy on each query. I have not been able to determine whether BLU or Oracle in-memory are there yet. Note that this powerful feature will allow a single query to run as fast as possible… but the benefit is mitigated when there is a workload of multiple concurrent queries (if I have 4 cores and 4 queries running concurrently, one query per core, then the 4 queries will take only a little more time than if I run the 4 queries serially with each query using all 4 cores).

It is important to note that the Oracle In-memory option does not run in the storage component of an Exadata cluster… only on the RAC layer. It is unclear how this works with the in-memory option.

The bottom line is that in-memory systems are not all alike. You can sort of add up the multipliers to get in the ballpark on how much faster Teradata will be with the Intelligent Memory option… how much faster than that a hybrid row and vector-column system like BLU or Oracle In-Memory… and how much faster a pure in-memory system might be. One thing is for sure… these in-memory options will always make the difference in a POC… in every case including them or not will blow away the 2X rule I started with… and in every case the performance benefit will outweigh the extra cost… the price/performance is very likely to be there.
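
Since the claim is that you can “sort of add up the multipliers”, here is one hedged way to sketch that stacking. The factors are the ballpark numbers from the paragraphs above and the engine profiles are illustrative assumptions, not measurements of any specific product… real workloads will not compose this cleanly, so treat the result as rough positioning, not a benchmark.

```python
# Stacking the ballpark multipliers from this post: 2.5X column compression,
# 9X column projection, 8X cache-aware processing, 10X SIMD vectors. The
# per-engine profiles are illustrative assumptions; a hybrid engine that uses
# its vector form ~50% of the time gets roughly half of the cache and vector
# benefit in this crude model. Real workloads will not compose this cleanly.

def stacked_boost(factors: dict[str, float]) -> float:
    """Multiply the applicable boosts; an absent feature contributes 1.0."""
    boost = 1.0
    for value in factors.values():
        boost *= value
    return boost

profiles = {
    "uncompressed in-memory row engine (baseline)": {},
    "hybrid row/vector-column engine": {
        "compression": 2.5, "projection": 9.0, "cache_aware": 4.0, "simd": 5.0},
    "pure columnar in-memory engine": {
        "compression": 2.5, "projection": 9.0, "cache_aware": 8.0, "simd": 10.0},
}

for name, factors in profiles.items():
    print(f"{name}: ~{stacked_boost(factors):,.0f}X")
```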

I know I skipped my usual referencing that would show you where I pulled these numbers from… but most of this information is buried in posts here and there on my blog… and as I stated up front… these are ballpark numbers. Hopefully you can see the sense behind them… but if you think I’m off please comment back and I’ll try to adjust…

An Architecture for the IoT – Part 1

There are so many things in the Internet of Things (IoT) that might record data into your data fabric that a new approach may be required. Let’s think about this… define some terms, see how these terms fit into current data fabric thinking, consider how they fit into a more modern logical data warehouse architecture, and think about whether the IoT might push us to a different approach.

I’m not going to go overboard on terms here… But we do need to distinguish between a sensor and a processor.

To my way of thinking a sensor is a thing. It creates, but does not necessarily process, data. A sensor has some means to communicate with a processor… but if there is no significant processing on the sensor other than communications then we will suggest that there is no “processor” in a meaningful sense. Let me give you four examples:

  • The first is courtesy of Ray Carnes, a chief architect at Boeing. Imagine a brake-pad in your car with 100,000 dust-sized RFID sensors randomly scattered as part of the pad. These sensors do nothing but signal on an interval that they are present. This allows a processor elsewhere to record the signals and determine how much of the brake pad has worn. If only 80,000 sensors report we can assume that 20% of the pad has worn away.
  • A Nest thermostat senses movement and temperature. It uses a network connection to send the results of this sensing to the Nest mother-ship and performs little-or-no processing on site.
  • Sensors in my Audi detect rotation of the wheels. There is a network that sends the results to a small embedded anti-lock braking processor that monitors all four wheels as well as the pressure on the brake-pedal and sends signals to all five components to allow the car to brake evenly.
  • There is a sensor in the screen on the ATM I used yesterday that detects that I want to request service. This user interface communicates with a powerful general processor which then communicates with the Bank mother-ship to create and process banking transactions.

This last bullet is important… any device that takes user input is a sensor with an embedded processor. It is a “thing” just like the Nest thing. Today we tend to blur the line between sensor and processor as every thing has a powerful processor onboard. The IoT will change this assumption.

A processor then, is a computer that performs some analysis on the data generated by one or more sensors. A processor may also store data… a sensor will not.

Now let’s think about how we might combine sensors and processors in an architecture. To start let’s consider the context of the data the processor can use for analysis (a small illustrative sketch follows the list):

  • If the processor has only the last data sensed we would say that the context is immediate and local to one sensor. The processor can see streamed data but can only operate on the last event. We would say that this sensor-processor configuration can provide a simple reflexive response. When you press the lock button in your car a sensor detects this event and signals to all four doors and the boot to lock it up.
  • Another configuration might allow the local processor to store more context from a single sensor over a longer period of time… so the context is historical and local. In the case of the anti-lock brakes… the processor receives signals from a group of sensors and stores a very short historical context. This grouped historical context is very powerful…
  • Another configuration might store the group context and then forward the event details to a bigger server that stores and analyzes a universal context of all things to look for patterns. Further, there could be a hierarchy of groups leading to a universal context.
  • Finally, a server with some group context could summarize the details for that group and pass on only a summary over time up to another group server or to a universal server.
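
Here is the small illustrative sketch promised above… the class names, the summary step, and the forwarding rule are hypothetical, meant only to show the context widening from local to group to universal.

```python
# An illustrative sketch of the configurations above. The class names, the
# summary step, and the forwarding rule are hypothetical; the point is only
# to show the context widening from local to group to universal.

from statistics import mean

class Sensor:
    """A thing: it emits readings but performs no meaningful processing."""
    def __init__(self, sensor_id: str):
        self.sensor_id = sensor_id

    def read(self, value: float) -> tuple[str, float]:
        return (self.sensor_id, value)

class GroupProcessor:
    """Keeps a short historical context for a group of sensors and forwards
    only a summary up the hierarchy."""
    def __init__(self, window: int = 10):
        self.window = window
        self.history: list[tuple[str, float]] = []

    def ingest(self, event: tuple[str, float]) -> None:
        self.history = (self.history + [event])[-self.window:]

    def summarize(self) -> float:
        return mean(value for _, value in self.history)

class UniversalServer:
    """Stores summaries from many groups and looks for wider patterns."""
    def __init__(self):
        self.summaries: list[float] = []

    def ingest(self, summary: float) -> None:
        self.summaries.append(summary)

# Four wheel-speed sensors feed a local group processor (historical, local
# context); only a summary moves up to the universal mother-ship.
wheels = [Sensor(f"wheel-{i}") for i in range(4)]
abs_group = GroupProcessor(window=8)
mothership = UniversalServer()

for wheel, rpm in zip(wheels, [980.0, 1010.0, 995.0, 250.0]):  # one wheel locking
    abs_group.ingest(wheel.read(rpm))

mothership.ingest(abs_group.summarize())
print(f"summary forwarded upstream: {mothership.summaries[-1]:.1f} rpm average")
```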

I suspect that you can see where I’m going. There is a trade-off in this picture between the advantages of pushing analytic processing close to the sensor and the associated requirement for more analytic processors, the advantages of intermediate analysis requiring more data movement but fewer analytic processors, and the advantage of a central analytic mother ship where all data is stored and analyzed. In the next part of this thread I’ll try to tease apart the trade-offs.

Part 5: A Review of Processing Push-down

Continuing this thread on RDBMS-Hadoop integration (Part 1, Part 2, Part 3, Part 4) I have suggested that we could evaluate integration architecture using three criteria:

  1. How parallel are the pipes to move data between the RDBMS and the parallel file system;
  2. Is there intelligence to push down predicates; and
  3. Is there more intelligence to push down joins and other relational operators?

I want to be sure that I’ve conveyed the concepts behind these criteria properly… I may have rushed it in the early parts of this series.

Let’s imagine a query that joins a 2,000,000 row table with a 1,000 row dimension table where both live in HDFS.

If all of the data has to be moved from HDFS to the RDBMS then 2,001,000 rows must be read and moved in order to apply a predicate or any other processing. For fun let’s say that the cost of moving this data is 2001K.

If there are 10 parallel pipes then the data movement is completed in one tenth the time… so the cost is 200K.

If a predicate is included that selects only 5% of the data from the big table, and the predicate is pushed down, the cost is reduced to 101K. Add in parallel pipes and the cost is 10K.

Imagine a query where there is a join between the two tables with predicates on one side and predicate push down… then you have to pay 101K to pull the projected data up and do the join in the RDBMS. If there is a join predicate that reduces the final answer set by another 95% then after the join you return 6K rows. Since everybody returns the same 6K rows as an answer we won’t add that in.

But if you can push the join down as well as the predicates then only 6K rows are moved up… so you can see how 2001K shrinks to 6K through the effective push down of processing.

Further, you can build arbitrarily complex queries and model them pretty well knowing that most of the cost is in data movement.
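
The model is simple enough to put in a few lines of code. Here is a hedged sketch where cost is just rows moved from HDFS to the RDBMS divided by the number of parallel pipes… the row counts and selectivities are the ones from the example above, and the function name is mine, not anyone’s API.

```python
# A toy cost model for RDBMS-Hadoop integration, following the example above:
# cost is simply rows moved from HDFS to the RDBMS (in thousands), divided by
# the number of parallel pipes. Row counts and selectivities are from the text.

FACT_ROWS = 2_000_000
DIM_ROWS = 1_000

def movement_cost(pipes: int = 1,
                  predicate_selectivity: float = 1.0,
                  push_join: bool = False,
                  join_selectivity: float = 1.0) -> float:
    """Approximate cost (thousands of rows moved up) after push-down."""
    fact_moved = FACT_ROWS * predicate_selectivity
    if push_join:
        rows_moved = (fact_moved + DIM_ROWS) * join_selectivity  # only the join result moves
    else:
        rows_moved = fact_moved + DIM_ROWS  # the join is done in the RDBMS
    return rows_moved / pipes / 1_000

print(movement_cost())                                        # ~2001K: no push-down, one pipe
print(movement_cost(pipes=10))                                # ~200K: parallel pipes
print(movement_cost(predicate_selectivity=0.05))              # ~101K: predicate push-down
print(movement_cost(pipes=10, predicate_selectivity=0.05))    # ~10K: both
print(movement_cost(predicate_selectivity=0.05,
                    push_join=True, join_selectivity=0.05))   # ~5-6K: join push-down too
```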

So think about how Teradata processes these two tables in Hadoop when you use the specialized SQL constructs and then again if you build the query from a BI tool. And stay tuned as I’ll show you how HANA processes the data next… and then talk about several others.

On to Part 6

Specialized Databases vs. Swiss Army Knives

“That’s not a knife… THAT’S A KNIFE” – C. Dundee (Photo credit: Wikipedia)

Michael Stonebraker has suggested several times… and again in this interview… that databases will become more specialized and that “one size will fit none”. I’m sure that his argument is more nuanced than the sound bites in the interview, but in this post I’ll suggest a line of thinking that may lead to a different conclusion.

First, let’s agree that the word “fit” means the best price for the performance required to meet your company’s service level requirements.

Then let’s agree with the basic premise behind Dr. Stonebraker’s argument… we agree that in any single-purpose application a specialized single-purpose DBMS can be developed that will outperform a generalized DBMS. This means that one-half of the fit, performance, is likely.

We would also agree that between the growth of open source databases and the general growth in the database space it is likely that someone can and will develop specialized databases and bring them to market in cases where there is enough market to make it worthwhile. But it is important to note that specialization will not become infinitely narrow… there has to be enough market for the special case to generate an attractive product.

So where is the disagreement? I do not believe that data is ever used in a single specialized business context. Not ever.

Let’s imagine that we have a business requirement for an extremely high volume OLTP application… and let us assume that the performance and/or scalability requirements are beyond what any general DBMS can provide and that the business ROI is significant… in other words, let us imagine that we are Google or Facebook. In this case we have no choice but to select or to develop an extreme, specialized, DBMS to solve the problem and extract the return.

But in this case, once the OLTP transactions are recorded… what do we do with the data? We need to use the data elsewhere in the business as the basis for deep analytics or for basic business intelligence… so we have to replicate the data to a second database. Since the second DBMS is also sort of specialized… it does not have to support OLTP… we select a second specialized, data warehousish product.

And then come new requirements for doing light queries in near real time to support operational analytics… so we build some sort of operational data store. Again we can select a product with a narrow technical sweet spot… but we have to replicate the data a third time.

In other words… given my premise… that data is never used in a single specialized context… specialized databases force replication… and replication allows for further specialization.

But what if the requirements are not so extreme? Then we might use a single conventional RDBMS for the EDW and for the ODS problems. If a more generalized DBMS product exists that could handle both the operational reporting and the analytic reporting requirements in a single image of the data then we could eliminate one replica and one DBMS. In other words, if the problem is not so extreme then a generalized solution might provide a solution in a single instance of the data avoiding replication.

Now the issue becomes: is the cost of a specialized system plus replication plus a second specialized system plus the cost of operating these systems less than the cost of a single generalized system? I believe that the answer will often be in favor of a single system even when the specialized systems are low-cost open source. Since “fit” is about cost maybe one size does fit now and again.

This suggests the strategy of the OldSQL vendors. They are offering a Swiss Army Knife product that serves multiple requirements. Their feature sets have grown over 30 years and they are pretty capable across a wide array of business problems… and with the columnar and in-memory features being added they continue to cover ever more extreme use cases… not the most extreme use cases… but they cover more ground each year.

The strategy of the NewSQL vendors is to focus tight and hard. They might develop an extreme OLTP DBMS with no ability to do a join… a product with extreme scalability and no performance… or a graph database to solve for an important, narrow, set of queries… or a columnar product that performs analytics but supports no OLTP. This trend feeds the specialize-and-replicate meme advocated by Dr. Stonebraker.

HANA is a horse of a different color… neither NewSQL nor OldSQL. It is a new code base designed to solve for a very wide set of use cases in a single instance of the data. We certainly agree in this blog with Dr. Stonebraker’s contention that the 30 year old legacy code base has to be retired. But SAP contends that you can build a new, generalized, DBMS that solves for all but the most extreme cases.

This is a great spot to end the year… having laid out the battle we will cover in this blog ongoing… with the legacy OldSQL vendors trying to tack on to their legacy code base… and doing pretty well at it… with the NewSQL vendors trying to specialize and replicate… and with HANA offering a new code base designed to solve for the whole picture. 2014 will be great fun to watch. This also sets the stage to ask next year whether “big data” applications are so extreme as to force users to specialize-replicate-specialize.

Have a great holiday season… and my best wishes. Thank you all for reading the Database Fog Blog in 2013… I hope for your continued attention in the New Year…

– Rob

Who Out-performs Who: A Story…

Performance Car Legends (Photo credit: Ginger Me)

In this blog I have stated explicitly and implied now and again that the big architectural features are what count… despite the fact that little features are often what are marketed. Here is a true story to reinforce this theme… and a reminder of the implications… a real-life battle between two vendors: we’ll call them NewVendor and LegacyVendor.

Four years ago, more or less, NewVendor sold a system to offload work from an existing LegacyVendor configuration. Winning the business was tough and the POC was a knife-fight. At that time the two vendors were architecturally similar with no major advantages on either side. In the end NewVendor won a fixed contract that provided 16 nodes and guaranteed to match the performance of LegacyVendor for a specific set of queries. The 16 node configuration was sized based on the uncompressed data in LegacyVendor’s system.

NewVendor sent in a team to migrate the data and the queries… what was expected to be a short project. But after repeated attempts and some outside effort by experts the queries were running 50% slower than the target. I was asked to have a look and could see no glaring mistakes that could account for such a large performance miss… I saw no obvious big tuning opportunities.

After a day or so of investigation I found the problem. LegacyVendor offered a nice dictionary-based compression scheme that shrunk the size of the data by… you guessed it… exactly 50%. Because the NewVendor solution had to read 50% more data with each query they were 50% slower. I recommended that NewVendor needed to supply eight more nodes to hit the performance targets.

In the course of making these recommendations I was screamed at, literally, by one technology executive and told that I was a failure by another. They refused to see the obvious, exact, connection between the performance and the compression. I was quickly replaced by another expert who spent four months on site tuning and tuning. He squeezed every last drop out of the database going so far as to reorder columns in every table to squeeze out gas where data did not align on word boundaries. His expert tuning managed to reduce the gap and in the end NewVendor purchased three fewer nodes than my recommendation. With those nodes they then hit the targets. But the cost of his time and expense (he was a contractor) exceeded the cost of the nodes he saved… and the extra four-month delay antagonized the customer such that the relationship never recovered.

In a world where the basics of query optimization and execution are known to all there are only big-ticket items that differentiate products. When all of the big-ticket, architectural, capabilities are the same the difference between any two mature RDBMS products will rarely be more than 10%-15% across a large set of queries. The big-ticket differentiators today are the application of parallelism, compression and column store, and I/O avoidance (i.e. in-memory techniques). The answer to the question who out-performs who can be found to a close approximation from looking at who is how parallel (here) and who is how columnar (here) and merging the two… with a dose of who best avoids I/O through effective use of memory. This is the first lesson of my story.

The second lesson is… throw hardware at tuning problems when there are no giant architectural mistakes. Even a fat server node costs around $15K… and you will be better off with faster hardware than with a warehouse or mart that is so finely tuned and fragile that the next change to the schema or the data volumes or the workload breaks it.

Epilogue

Soon after this episode NewVendor rolled out a Level 2 columnar feature. This provided them with a distinct advantage over LegacyVendor… an advantage almost exactly equal to their advantage in compression plus the advantage from columnar projection to reduce I/O… and for several years they did not lose a performance battle to LegacyVendor. Today LegacyVendor has a comparable capability and the knife-fight is on again… Architecture counts…

The Big Data Devil

Devil (Photo credit: Wikipedia)

I just finished a draft for next week on Big Data and thought that with this note I might form a preface…

First… Big Data is about, well…, Big Data. When Gartner devised the three V’s I suspect that they were trying to frame the new stuff that was emerging… not establish a concise definition. So let me be very clear about what I think that Big Data is and is not.

Big Data is about volume, not velocity, not variety. That is what the words “big” and “data” conjoined must mean. Velocity + Volume is Big Data. Variety + Volume is Big Data. By themselves Velocity and Variety are new, important, separate, technological trends.

Next, Big Data is a new thing. It is not a technology that was around in a meaningful way 5+ years ago. It was emerging just then so we should see evidence in the advances offered by the Web Scale companies like Google, Yahoo, and Netflix. It is not any data that was conventionally created, captured, or used before 2010 or so.

So what is new, big, and was emerging in recent history? It is the creation, capture, and use of machine-generated data: click-stream data, system log data, and sensor data. Big Data technology has to do with the creation, capture, and utilization of large volumes of machine-generated data… nothing more or less.

Rob: Big Data legitimately includes Social Data as well, as Vitaliy rightfully commented… I’ll post on this soon…

Machines generate data at a very low-level of detail. It is said that the devil is in the detail… and the subject of the next post deals with the notion that in order to make our companies more profitable we must all chase this damnable devil.

PS

I wonder if damnable devil is redundant? Probably, yes.

2nd PS (sort of like 2nd breakfast)

Big Data is not about any and every new technology introduced in the last five years…

The Fog is Getting Thicker…

San Francisco in fog (Photo credit: Wikipedia)

I renamed this so that Teradata folks would not get here so often… it’s not really about Intelligent Memory… just prompted by it. The post on Intelligent Memory is here. – Rob

Two quick comments on Teradata’s recent announcement of Intelligent Memory.

First… very very cool. More on this to come.

Next… life is going to become very hard for my readers and for bloggers in this space. The notion of an in-memory database is becoming rightfully blurred… as is the notion of column store.

Oracle blurs the concepts with words like “database in-memory” and “hybrid columnar compression”, which is neither an in-memory database nor a column store.

Teradata blurs the concept with a strong offering that uses DRAM as a block-IO device (like the old RAM-disks we used to configure on our PCs).

Teradata and Greenplum blur the idea of a column store by adding columnar tables over their row store database engines.

I’m not a fan of the double-speak… but the ability of companies to apply the 80/20 rule to stretch their architectures and glue on new advanced technologies is a good thing for consumers.

But it becomes very hard to distinguish the products now.

In future blogs I’ll try to point out differences… but we’ll have to go a little deeper into the Database Fog.

How Good Is Teradata’s Intelligent Memory?

A 30-foot chunk of the cliff below the apartment building fell into the Pacific Ocean. (Photo credit: Wikipedia)

Jason asked a great question in the comment section here… he asked… does Teradata’s Intelligent Memory erode HANA’s value proposition? Let me answer here in a more general way that is applicable to the general database space…

Every time a vendor puts more silicon between the CPU and the disk they will improve their performance (and increase their price). Does this erode HANA’s value proposition? Sure. Every advance by any vendor erodes every other vendor’s position.

To win business a new database product has to be faster than the competition. In my experience you have to be at least 30% faster to unseat the incumbent. If you are 50% faster you will win a lot of business. If you are 2x, 100%, faster you win nearly every time.

Therefore the questions are:

  • Did the Teradata announcement eliminate a set of competitors from reaching these thresholds when Teradata is the incumbent? Yup. It is very smart.
  • Does Intelligent Memory allow Teradata to reach these thresholds when they compete against another incumbent? Yup.
  • Did it eliminate HANA from reaching these thresholds when competing with Teradata? I do not think so… in fact I’m pretty sure it is not the case… HANA should still be way over the 2X threshold… but the reasons why will require a deeper dive… stay tuned.

In the picture attached a 30 foot chunk eroded… but Exadata still stands. Will it be condemned?

Note: Here is a commercial post on the SAP HANA blog site that describes at a high level why I think HANA retains a distinct architectural advantage.
