Greenplum DBMS
• Highly scalable
• Fault-tolerant
• High-performance
• Based on Postgres
• Shared-nothing architecture
• Commodity hardware
• Currently supported on Solaris, Linux
HANA should be 100X or more faster than Greenplum for a typical query, because it avoids disk I/O. Because memory is more expensive than disk, a HANA system will cost 2X-7X as much as a disk-based system. But 100X faster for 5X the price is a pretty good deal (roughly a 20X gain in performance per dollar). After all, the correct measure of value is price/performance, not just price.
| Database | Cost ($/TB) |
|---|---|
| HANA | $200,000 |
| Exadata X3 | $66,000 |
| Teradata | $66,000 |
| Greenplum | $30,000 |
Latency and Price/Performance
| Database | Total Latency (ns) | Price/Performance | Delta |
|---|---|---|---|
| HANA | 90 | 1,800 | – |
| HANA (2 nodes) | 1,190 | 23,800 | 13x |
| Exadata X3 | 2,054,523 | 13,559,854 | 7533x |
| Teradata | 4,121,190 | 27,199,854 | 15111x |
| Greenplum | 10,001,190 | 30,003,570 | 16669x |
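The Price/Performance and Delta columns appear to follow a simple rule: total latency multiplied by the cost per TB expressed in units of $10,000, with Delta being the ratio to the single-node HANA figure. That rule is inferred from the numbers and is not stated in the source, so the sketch below is an assumption, and its function and variable names are illustrative only. Under that assumption it reproduces the HANA, Teradata, and Greenplum rows exactly and the Exadata X3 row to within a few units.

```python
# Assumed rule (inferred from the tables, not stated in the source):
#   price/performance = total latency (ns) * cost ($/TB) / 10,000
#   delta             = price/performance relative to single-node HANA

COST_PER_TB = {
    "HANA": 200_000,
    "Exadata X3": 66_000,
    "Teradata": 66_000,
    "Greenplum": 30_000,
}

# (row label, database used for pricing, total latency in ns)
LATENCIES = [
    ("HANA", "HANA", 90),
    ("HANA (2 nodes)", "HANA", 1_190),
    ("Exadata X3", "Exadata X3", 2_054_523),
    ("Teradata", "Teradata", 4_121_190),
    ("Greenplum", "Greenplum", 10_001_190),
]


def price_performance(latency_ns, cost_per_tb):
    """Lower is better: latency weighted by cost in units of $10,000 per TB."""
    return latency_ns * cost_per_tb / 10_000


def build_rows():
    """Return (label, latency, price/performance, delta vs. first row) tuples."""
    baseline = price_performance(LATENCIES[0][2], COST_PER_TB[LATENCIES[0][1]])
    rows = []
    for label, db, latency in LATENCIES:
        pp = price_performance(latency, COST_PER_TB[db])
        rows.append((label, latency, pp, pp / baseline))
    return rows


if __name__ == "__main__":
    for label, latency, pp, delta in build_rows():
        print(f"{label:<15} {latency:>12,} {pp:>14,.0f} {delta:>8.0f}x")
```

Because both columns are "lower is better", a large Delta means a worse price/performance figure relative to HANA under this metric.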
Greenplum Advantages:
Extreme Performance for Analytics
• Optimized for BI and analytics
– Deep integration with statistical packages
– High performance parallel implementations
• Simple and automatic
– Just load and query like any database
– Tables are automatically distributed across nodes
• Extremely scalable
– MPP shared-nothing architecture
– All nodes can scan and process in parallel
– Linear scalability by adding nodes
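The automatic distribution and parallel scan described above can be sketched as follows. In Greenplum the distribution key is declared with the `DISTRIBUTED BY` clause of `CREATE TABLE`; the Python below is only a toy model of the idea. The hash function (CRC32), the helper names, and the in-memory "segments" are all assumptions made for illustration and do not reflect Greenplum's internals.

```python
import zlib
from collections import defaultdict


def segment_for(key, num_segments):
    """Map a distribution-key value to a segment index.

    CRC32 is a stand-in hash chosen for determinism, not Greenplum's own.
    """
    return zlib.crc32(str(key).encode("utf-8")) % num_segments


def distribute(rows, key, num_segments):
    """Place each row on the segment chosen by hashing its distribution key."""
    segments = defaultdict(list)
    for row in rows:
        segments[segment_for(row[key], num_segments)].append(row)
    return segments


def parallel_sum(segments, column):
    """Each segment aggregates only its own rows; partial sums are then combined."""
    partials = [sum(r[column] for r in rows) for rows in segments.values()]
    return sum(partials)


if __name__ == "__main__":
    sales = [{"id": i, "amount": i * 10} for i in range(1000)]
    segs = distribute(sales, "id", num_segments=4)
    for seg_id in sorted(segs):
        print(f"segment {seg_id}: {len(segs[seg_id])} rows")
    print("total amount:", parallel_sum(segs, "amount"))
```

In this model, adding segments shrinks the share of rows each one scans, which is the intuition behind the linear-scalability claim; real workloads also depend on data skew and network cost.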
Parallel Query Optimizer
• Cost-based optimization looks for the most efficient plan
• Physical plan contains scans, joins, sorts, aggregations, etc.
• Global planning avoids sub-optimal ‘SQL pushing’ to segments
• Directly inserts ‘motion’ nodes for inter-segment communication
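To make the role of motion nodes concrete, here is a simplified model of a join between two tables that are not distributed on the join key. A redistribute motion re-hashes both inputs on the join key so matching rows meet on the same segment, each segment joins locally, and a gather motion collects the results. The motion names mirror operators that appear in Greenplum query plans, but the code itself, its hash function, and its helper names are assumptions for illustration and not the planner's actual implementation.

```python
import zlib


def segment_for(key, num_segments):
    """Illustrative hash placement (CRC32 is a stand-in, not Greenplum's hash)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_segments


def redistribute_motion(segments, key, num_segments):
    """Re-hash every row on `key` and send it to the segment owning that hash."""
    out = [[] for _ in range(num_segments)]
    for seg_rows in segments:
        for row in seg_rows:
            out[segment_for(row[key], num_segments)].append(row)
    return out


def local_hash_join(left_rows, right_rows, key):
    """Ordinary hash join executed independently on one segment."""
    table = {}
    for r in right_rows:
        table.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left_rows for r in table.get(l[key], [])]


def gather_motion(per_segment_results):
    """Collect every segment's output on the coordinator."""
    return [row for seg in per_segment_results for row in seg]


def parallel_join(left_segments, right_segments, key):
    """Plan: redistribute both sides on the join key, join locally, gather.

    Assumes both inputs are spread over the same number of segments.
    """
    n = len(left_segments)
    left = redistribute_motion(left_segments, key, n)
    right = redistribute_motion(right_segments, key, n)
    return gather_motion(
        [local_hash_join(left[i], right[i], key) for i in range(n)]
    )


if __name__ == "__main__":
    # Both tables start out spread round-robin, i.e. not by the join key.
    orders = [{"order_id": i, "cust_id": i % 5} for i in range(12)]
    customers = [{"cust_id": c, "name": f"cust{c}"} for c in range(5)]
    order_segs = [orders[i::3] for i in range(3)]
    cust_segs = [customers[i::3] for i in range(3)]
    for row in parallel_join(order_segs, cust_segs, "cust_id"):
        print(row)
```

A planner that works on the global plan can skip the redistribute step for a table already distributed on the join key, which is one reason placing motions explicitly can beat simply pushing SQL fragments to each segment.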
Analytics Highlight: MADlib
• Scalable in-database analytics
• Data-parallel
– Mathematical algorithms
– Statistical algorithms
– Machine learning algorithms
– Supports structured and unstructured data
• Open-source software
– Source accessibility
– Converges business, academic, and open-source communities
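The data-parallel idea behind in-database analytics can be illustrated with a simple linear regression. Each segment reduces its own rows to a small set of sufficient statistics, the partial states are merged, and a final step computes the coefficients. This is a sketch of the general pattern only: the function names and the transition/merge/final split are illustrative assumptions, and MADlib's real algorithms and SQL interface are not shown here.

```python
from functools import reduce

# State: (n, sum_x, sum_y, sum_xx, sum_xy)
EMPTY_STATE = (0, 0.0, 0.0, 0.0, 0.0)


def transition(state, point):
    """Fold one (x, y) row into a segment-local running state."""
    n, sx, sy, sxx, sxy = state
    x, y = point
    return (n + 1, sx + x, sy + y, sxx + x * x, sxy + x * y)


def merge(a, b):
    """Combine two partial states; order does not matter."""
    return tuple(u + v for u, v in zip(a, b))


def final(state):
    """Turn the merged statistics into (slope, intercept) for y = slope*x + intercept."""
    n, sx, sy, sxx, sxy = state
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


def fit_on_segments(segments):
    """Run the transition step per segment, then merge and finalize."""
    partials = [reduce(transition, rows, EMPTY_STATE) for rows in segments]
    return final(reduce(merge, partials, EMPTY_STATE))


if __name__ == "__main__":
    # Points on the line y = 2x + 1, spread unevenly over three segments.
    segments = [[(0, 1), (1, 3)], [(2, 5)], [(3, 7), (4, 9)]]
    print(fit_on_segments(segments))  # → (2.0, 1.0)
```

Only the five-number state crosses segment boundaries, never the rows themselves, which is what lets this style of computation scale with the number of nodes.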
Easy Manageability for Big Data
• Single console for both Database and Hadoop
• Administration
– Start, Stop Database
– Recover, Rebalance Segments
• Interactive view of System Metrics
– Real-time
– Historic (Configurable by time period)
• In-depth view for System Health
– Hardware health
– Software (Database, Hadoop)
• Query Monitoring
– Search, Prioritize, Cancel Queries
– View Query’s Execution Plan
• Workload Management
– Configure Resource Queues
– Prioritize Users