Monday, 6 October 2014

How to Set Up a VoltDB Cluster on AWS Cloud (Part 1/2)

What is VoltDB:
VoltDB is focused on specific workloads. Most existing RDBMSs are designed to be general purpose, one-size-fits-all systems.
VoltDB was designed to be the most scalable transaction processing system out there, often making compromises unsuitable for other workloads. For non-OLTP workloads, VoltDB is built to work in concert with other specialized systems.
Why VoltDB:
We believe that a set of specialized data management tools can replace the functionality of one-size-fits-all systems, increasing performance, scalability and fault tolerance dramatically.
Installation Requirements:
1. CentOS 5.8 or later, or 6.3 or later; or Ubuntu 10.04 or 12.04 (64-bit)
2. Memory: 4 GB
3. Sun JDK 6 or higher
4. NTP
5. PHP 5.3
For clustering, choose a Cluster Compute instance in Amazon.
While creating the instance, create a placement group and a security group.
Note: Choose a cc2.x large or greater instance type.
(Note: Make sure to open port 3021 on all the nodes for VoltDB connections and port 123 for NTP.)
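Once the security group is in place, a quick reachability check can confirm that the VoltDB port is open between nodes. The following is a minimal sketch in Python, not part of VoltDB itself; the helper names `tcp_port_open` and `check_nodes` are hypothetical, and the host list is whatever private addresses your instances have. It only tests TCP, so it does not cover NTP on port 123, which uses UDP.

```python
import socket

def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

def check_nodes(hosts, port=3021):
    """Map each host to whether the given port (3021 by default) is reachable."""
    return {host: tcp_port_open(host, port) for host in hosts}
```

Running `check_nodes` from each node against the addresses of the other nodes should return `True` for every host once the security group rule is correct.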
Performance Benefits:
VoltDB is based on 3 big concepts. While none of these concepts are new ideas individually, VoltDB builds all three into the core of the product.
Concept 1: Exploit repeatable workloads.
VoltDB exclusively uses a stored procedure interface. It expects an application’s complete set of procedures to be known in advance. This allows it to pre-optimize execution paths for incoming transactions. Since most OLTP applications perform the same set of operations over and over, this model is a good fit for OLTP.
That’s not to say VoltDB is inflexible. When the application changes, the set of stored procedures can be amended or updated. VoltDB also supports some ad-hoc access for administrative tasks.
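To make the "procedures known in advance" idea concrete, here is a toy sketch in Python. It is not VoltDB's API (real VoltDB stored procedures are written in Java); the `ProcedureCatalog` class and its methods are hypothetical names used only for illustration. The point is that callers can only invoke procedures that were registered up front, and the catalog can be amended when the application changes.

```python
class ProcedureCatalog:
    """Toy catalog: every callable procedure must be registered in advance."""

    def __init__(self):
        self._procedures = {}

    def register(self, name, func):
        # Adding or replacing an entry models amending the procedure set.
        self._procedures[name] = func

    def call(self, name, *args):
        # Anything not in the catalog is rejected rather than planned on the fly.
        if name not in self._procedures:
            raise KeyError(f"unknown procedure: {name}")
        return self._procedures[name](*args)
```

Because the full set of entries is known before any request arrives, a real system has the chance to plan and optimize each one ahead of time.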
Concept 2: Partition data to horizontally scale.
VoltDB divides data among a set of machines (or nodes) in a cluster to achieve parallelization of work and near linear scale-out. Unlike custom sharding or partitioning solutions, this functionality is a native and fundamental feature. VoltDB manages consistency and distribution across cluster nodes and replicates data to multiple nodes to ensure smooth operation during many types of failure.
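As a rough illustration of horizontal partitioning, the sketch below assigns rows to partitions by hashing a partitioning key. This is an assumption-laden toy in Python: the function names are hypothetical, and CRC32 modulo the partition count is a stand-in, not the hashing scheme VoltDB actually uses.

```python
import zlib

def partition_for(key: str, partition_count: int) -> int:
    """Pick a partition for a key using a stable hash (CRC32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % partition_count

def distribute(keys, partition_count):
    """Group keys into per-partition lists."""
    partitions = [[] for _ in range(partition_count)]
    for key in keys:
        partitions[partition_for(key, partition_count)].append(key)
    return partitions
```

The same key always lands on the same partition, so a transaction that touches a single key can be routed to exactly one place.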
Concept 3: Build a SQL executor that’s specialized for the problem you’re trying to solve.
OLTP means high throughput of write-heavy transactions, each operating on a relatively small subset of data and returning relatively small result sets. Unlike analytical workloads, few transactions scan more than a few thousand rows, and most scan only a handful. Since our stored procedures a) never wait for disk access, b) never wait for external input in the middle of a transaction and c) never read or modify huge amounts of data, they can be executed on modern processors in microseconds. If they take microseconds, why interleave their execution with a complex system of row and table locks and thread synchronization? It’s much faster and simpler just to execute work serially.
In order to run single-threaded in a world of multi-core, VoltDB partitions data by core, not by cluster node. A three-node cluster with 8 cores per node will run about 24 lean, mean single-threaded SQL machines.
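The serial, per-core model can be sketched as follows. This is a simplified Python illustration, not VoltDB internals: `Site`, `site_count` and `add_to_balance` are hypothetical names, and the count ignores any cores reserved for other work, which is why the text says "about 24".

```python
from collections import deque

def site_count(nodes: int, cores_per_node: int) -> int:
    """Approximate number of single-threaded executors in the cluster."""
    return nodes * cores_per_node

class Site:
    """One single-threaded executor that owns its own slice of the data."""

    def __init__(self):
        self.data = {}
        self.queue = deque()

    def submit(self, procedure, *args):
        self.queue.append((procedure, args))

    def run(self):
        # Work is drained one procedure at a time, so no locks are needed.
        results = []
        while self.queue:
            procedure, args = self.queue.popleft()
            results.append(procedure(self.data, *args))
        return results

def add_to_balance(data, account, amount):
    """Example procedure: touches only a small amount of in-memory data."""
    data[account] = data.get(account, 0) + amount
    return data[account]
```

Since each site is the only thing that ever touches its data, two queued updates to the same account cannot interleave.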
Bonus: Leverage the 3 concepts together.
Because workloads are known in advance, replication can be done by performing the same procedure twice in two places. This is simpler and faster than log-shipping changed tuples.
Because data is replicated synchronously, there is no need for a write-ahead log at each SQL executor to maintain consistency on a single node.
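The replication idea above, running the same procedure in two places instead of shipping changed tuples, can be sketched like this. It is a toy Python model under the assumption that procedures are deterministic; `apply_everywhere` and `debit` are hypothetical names, not VoltDB functions.

```python
def apply_everywhere(replicas, procedure, *args):
    """Run the same deterministic procedure on every replica's state."""
    results = [procedure(replica, *args) for replica in replicas]
    # Identical inputs and identical starting state should give identical results.
    if any(result != results[0] for result in results):
        raise RuntimeError("replicas diverged")
    return results[0]

def debit(state, account, amount):
    """Example procedure: reject the debit if the balance is too low."""
    balance = state.get(account, 0)
    if balance < amount:
        return "rejected"
    state[account] = balance - amount
    return "ok"
```

Only the procedure name and its arguments need to travel to each replica, which is typically far less data than the set of rows the procedure changes.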
