
London, April 30th 2014

In the “develop” branch (1.7-SNAPSHOT) the new Distributed Architecture is available, with sharding features and a simplified configuration.

Take a look at the new default-distributed-db-config.json:

{
  "autoDeploy": true,
  "hotAlignment": false,
  "offlineMsgQueueSize" : 0,
  "readQuorum": 1,
  "writeQuorum": 2,
  "failureAvailableNodesLessQuorum": false,
  "readYourWrites": true,
  "clusters": {
    "internal": {
    },
    "index": {
    },
    "*": {
      "servers" : [ "<NEW_NODE>" ]
    }
  }
}

We removed some flags (like replication:boolean, which is now deduced from the presence of the “servers” field), and the settings are now global (autoDeploy, hotAlignment, offlineMsgQueueSize, readQuorum, writeQuorum, failureAvailableNodesLessQuorum, readYourWrites), but you can override them per cluster.
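For example, a single cluster could relax the global write quorum by shadowing the key inside its own entry. This is a minimal sketch; the assumption that a per-cluster key simply overrides the global key of the same name follows from the description above:

{
  "autoDeploy": true,
  "writeQuorum": 2,
  "clusters": {
    "client_0": {
      "servers" : [ "europe", "usa" ],
      "writeQuorum": 1
    }
  }
}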

Furthermore, sharding is not declared per class, but per cluster. Let me explain better. Many NoSQL products shard against a hashed key (see MongoDB, or any DHT system like DynamoDB). This wouldn’t work well in a Graph Database, where traversing relationships means jumping from one node to another with very high probability.

So the sharding strategy is up to the developer/DBA. How? By defining multiple clusters per class. Example of splitting the class “Client” into 3 clusters:


Class Client -> Clusters [ client_0, client_1, client_2 ]

This means that OrientDB will consider any record/document/graph element stored in any of these clusters as a “Client” (the Client class relies on these clusters).
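As a minimal sketch, such a mapping can be created from the OrientDB SQL console roughly like this (the cluster names are taken from the example above; note that creating the class also creates a default cluster of its own):

CREATE CLASS Client
CREATE CLUSTER client_0
CREATE CLUSTER client_1
CREATE CLUSTER client_2
ALTER CLASS Client ADDCLUSTER client_0
ALTER CLASS Client ADDCLUSTER client_1
ALTER CLASS Client ADDCLUSTER client_2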

Now you can assign each cluster to one or more servers. If multiple servers are listed, the records will be copied to all of them. Example:

{
  "autoDeploy": true,
  "hotAlignment": false,
  "readQuorum": 1,
  "writeQuorum": 2,
  "failureAvailableNodesLessQuorum": false,
  "readYourWrites": true,
  "clusters": {
    "internal": {
    },
    "index": {
    },
    "client_0": {
      "servers" : [ "europe", "usa" ]
    },
    "client_1": {
      "servers" : [ "usa" ]
    },
    "client_2": {
      "servers" : [ "asia", "europe", "usa" ]
    },
    "*": {
      "servers" : [ "<NEW_NODE>" ]
    }
  }
}

In this case I’ve split the Client class into the clusters client_0, client_1 and client_2, each one with a different configuration:

    • client_0 will be written to the “europe” and “usa” nodes
    • client_1, only to the “usa” node
    • client_2, to all the nodes (it would be equivalent to writing “<NEW_NODE>”, see cluster ‘*’, the default one)

In the application, when I want to write a new Client to the first cluster, I can use this syntax (Graph API):


graph.addVertex("class:Client,cluster:client_0");

Or with the Document API:


ODocument doc = new ODocument("Client");
// FILL THE RECORD
doc.save( "client_0" );


OrientDB will send the record to both the “europe” and “usa” nodes.

All queries work by aggregating the result sets of all the involved nodes. Example:


select from cluster:client_0

This will be executed against the “europe” or the “usa” node. This query, instead:


select from Client


This will be executed against all 3 clusters that make up the Client class, and therefore against all the nodes.

Cool!

Projections also work in a Map/Reduce fashion. Example:


select max(amount), count(*), sum(amount) from Client

In this case the query is executed across all the involved nodes, and the partial results are then merged and filtered on the starting node. Right now not 100% of the cases are covered; see below.
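Conceptually, the merge step on the starting node combines the partial aggregates computed by each server: the global max is the max of the per-node maxes, while count and sum are just added up. Here is a minimal sketch of that reduce step in plain Java; the PartialResult type and the merge method are hypothetical, invented only to illustrate the idea, not OrientDB API:

import java.util.List;

// Hypothetical partial aggregate computed by a single node (not OrientDB API)
class PartialResult {
    final double max;   // max(amount) on that node
    final long count;   // count(*) on that node
    final double sum;   // sum(amount) on that node
    PartialResult(double max, long count, double sum) {
        this.max = max; this.count = count; this.sum = sum;
    }
}

class MergeSketch {
    // Reduce step executed on the node that started the query
    static PartialResult merge(List<PartialResult> partials) {
        double max = Double.NEGATIVE_INFINITY;
        long count = 0;
        double sum = 0;
        for (PartialResult p : partials) {
            max = Math.max(max, p.max); // max of maxes
            count += p.count;           // sum of counts
            sum += p.sum;               // sum of sums
        }
        return new PartialResult(max, count, sum);
    }
}

This also hints at why some functions are trickier to merge than others; see point (4) below.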

What’s still missing?

(1) Class/Cluster selection strategy

Until now each class has had a “defaultClusterId”, which by default is the first cluster. We need a pluggable strategy to select the cluster on record creation. We could provide at least 3 implementations in the bundle:

    • fixed, like now: takes the defaultClusterId
    • round-robin, clear by the name
    • balanced: checks the number of records in all the clusters and tries to write to the smallest one. This is super cool when you add new clusters because you’ve bought new servers: the new, empty servers will be filled before the others, because the clusters they host are empty at the beginning, so after a while all the clusters will be balanced

Why not have just balanced? Because it could be expensive to know the cluster sizes at run-time. We could cache the sizes and update them every XX seconds. We’ll fix this in 1.7 in a few days.
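As an illustration, here is a minimal sketch of what such a pluggable strategy could look like in plain Java. The interface and class names are hypothetical, invented for this example; they are not the actual OrientDB API:

import java.util.concurrent.atomic.AtomicLong;

// Hypothetical plug-in point for choosing the target cluster of a new record
interface ClusterSelectionStrategy {
    // Returns the cluster id to use, given the cluster ids of the class
    int selectCluster(int[] clusterIds);
}

// “fixed”: always the first (default) cluster, like today
class FixedClusterSelection implements ClusterSelectionStrategy {
    public int selectCluster(int[] clusterIds) {
        return clusterIds[0];
    }
}

// “round-robin”: rotate across all the clusters of the class
class RoundRobinClusterSelection implements ClusterSelectionStrategy {
    private final AtomicLong counter = new AtomicLong();
    public int selectCluster(int[] clusterIds) {
        return clusterIds[(int) (counter.getAndIncrement() % clusterIds.length)];
    }
}

A “balanced” implementation would follow the same interface, picking the cluster with the smallest cached record count.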

This point is not strictly related to the distributed configuration: it will also be applied to the default (non-distributed) configuration.

(2) Hot change of distributed configuration

This will be introduced after 1.7 via the command line, and in a visual way in the Workbench of the Enterprise Edition (commercially licensed).

(3) More Test Cases

I’ve created the TestSharding test case, but we need more complicated configurations and hours to try them. In fact, it’s pretty time- and CPU-consuming to start up 2-3 servers in a cluster every time!

(4) Merging results for all the functions

Some functions, like AVG(), don’t work this way. I haven’t written 100% test coverage on all the functions yet. We’ll fix this in the next release.
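The reason AVG() is special: it cannot be merged by averaging the per-node averages, because nodes holding different numbers of records would be weighted equally. Each node must return its partial SUM() and COUNT() instead. A minimal sketch in plain Java, with hypothetical inputs:

// Hypothetical merge of AVG(): each node ships (sum, count), never its local average
class AvgMergeSketch {
    static double mergeAvg(double[] sums, long[] counts) {
        double totalSum = 0;
        long totalCount = 0;
        for (int i = 0; i < sums.length; i++) {
            totalSum += sums[i];
            totalCount += counts[i];
        }
        return totalSum / totalCount;
    }
}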

(5) Update the Documentation

We’ll start updating the documentation with the new stuff and with more examples about the configuration.

(6) Sharding + Replication

A mixed configuration with sharding + replication allows splitting the database into partitions, with each partition replicated across more than one node. This already works. Unfortunately, a distributed query with aggregation in the projections (sum, max, avg, etc.) could return wrong results, because the same record could be visited on different nodes: for example, summing over cluster client_2 above, which lives on three servers, could count each record up to three times. We’ll fix this in the next release.
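One natural fix, sketched here in plain Java, is to de-duplicate by record id (RID) on the starting node before aggregating; every OrientDB record has a unique RID, but the Record type below is hypothetical, just for illustration:

import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical record carrying its unique record id (RID) and a numeric field
class Record {
    final String rid;    // e.g. "#13:5", unique across the database
    final double amount;
    Record(String rid, double amount) { this.rid = rid; this.amount = amount; }
}

class DedupSketch {
    // Sums the "amount" field, counting each RID only once, even when the
    // same record was returned by several replicas
    static double dedupSum(List<Record> fromAllNodes) {
        Set<String> seen = new HashSet<>();
        double sum = 0;
        for (Record r : fromAllNodes)
            if (seen.add(r.rid)) // first time we see this RID
                sum += r.amount;
        return sum;
    }
}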


Luca Garulli
CEO at Orient Technologies LTD
the Company behind OrientDB

http://about.me/luca.garulli
