I am trying to configure MongoDB in a replica configuration to see how it scales / performs / handles.
I used Morphia (a POJO mapping layer on top of the MongoDB Java driver) to save 10,000 simple random documents into one collection. I annotated my POJO (MyData in the snippet below) with @Entity(concern = "REPLICAS_SAFE"), in the hope that every document sent to the database would be safely saved.
My POJO consisted of an ObjectId field (the Mongo primary key), a String of random characters with random length (at most 20 characters), and a long generated using Random.nextLong().
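For concreteness, the random payload described above can be generated roughly like this (a self-contained sketch; the class name, alphabet, and seed are my own choices, not from the original POJO):

```java
import java.util.Random;

public class RandomPayload {
    static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

    // Random string of random length, at most 20 characters.
    static String randomString(Random rnd) {
        int len = rnd.nextInt(21); // 0..20
        StringBuilder sb = new StringBuilder(len);
        for (int i = 0; i < len; i++) {
            sb.append(ALPHABET.charAt(rnd.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Random rnd = new Random(42); // fixed seed so the sketch is reproducible
        String s = randomString(rnd);
        long l = rnd.nextLong();
        System.out.println(s.length() <= 20);
        System.out.println("string=" + s + " long=" + l);
    }
}
```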
My code is as follows:
```java
for (int i = 0; i < 10000; i++) {
    final MyData data = new MyData();
    boolean written = false;
    do {
        try {
            ds.save(data); // ds is of type Datastore
            written = true;
        } catch (Exception e) {
            continue;
        }
    } while (!written);
}
```
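As an aside, a loop like the one above swallows every exception and retries forever, which spins hot while the replica set has no primary. A bounded retry with backoff is gentler; here is a minimal self-contained sketch (the `Datastore` interface here is a stand-in I defined for the example, not Morphia's):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RetrySave {
    // Stand-in for a save operation that throws while the cluster is unavailable.
    interface Datastore { void save(Object entity) throws Exception; }

    static boolean saveWithRetry(Datastore ds, Object entity, int maxAttempts, long backoffMs)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                ds.save(entity);
                return true; // write acknowledged
            } catch (Exception e) {
                if (attempt == maxAttempts) return false; // give up after the last attempt
                Thread.sleep(backoffMs * attempt); // linear backoff between attempts
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated datastore that fails twice ("no primary"), then succeeds.
        AtomicInteger calls = new AtomicInteger();
        Datastore flaky = entity -> {
            if (calls.incrementAndGet() <= 2) throw new Exception("no primary");
        };
        boolean ok = saveWithRetry(flaky, new Object(), 5, 10);
        System.out.println("saved=" + ok + " attempts=" + calls.get());
    }
}
```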
I set up a replica set with four nodes, ran the above program, and then started metaphorically pulling cables to see what would happen.
The desired result was for the program to run to completion, with all the documents ending up in the database.
The actual result, after a few minutes, was one of:
- Java reported that it had written 10,000 records, but the database contained fewer than 10,000
- Java reported that it had written fewer than 10,000, and the database reported either the same number or even fewer
- Everything worked fine
In one case, the nodes that came back up could not actually catch up with the PRIMARY node and had to be rebuilt from scratch with a wiped data directory. This was despite increasing the oplog size to 2 gigabytes, which I would have thought was plenty to replay 10,000 rows of very simple data.
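For reference, the oplog size is fixed when mongod first starts (the flag takes megabytes), and the replication window can be inspected from the mongo shell afterwards; this is a sketch using 1.6-era options, with the replica-set name and dbpath as my own placeholders:

```
# on each node, before first start (size in MB, so 2048 = 2 GB)
mongod --replSet testset --oplogSize 2048 --dbpath /data/db

# then, in the mongo shell on the primary
> db.printReplicationInfo()
```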
Other things you should know:
- All of this runs on the same hardware (a 2 GB Pentium D!), with the cluster running on two 32-bit Ubuntu Server VirtualBox instances with 128 MB of memory each, and the Java client running on the Windows XP host. Two mongod processes run on each virtual machine, plus an arbiter on one of them.
- The clocks on the two virtualized machines were off by a few seconds (I need to install the VirtualBox guest additions to fix this), but not by much; 10gen says clock skew should not be a problem for clustering, but I thought I would mention it.
I am aware of the 2-gigabyte limit with Mongo on 32-bit machines, and that other people have lost data, and I know the machine I am running these tests on is hardly Top 500 material (which is why the documents I chose to save were small), but when my tests worked, they worked very well.
Am I on my way to proving that Mongo is not ready for prime time yet, or am I doing something inherently wrong?
I am using 1.6.5.
Any ideas, tips, tricks, pointers, explanations or criticism are greatly appreciated!
ps: I'm not trolling - I really like the NoSQL idea for the data it suits, so I really want this to work, but so far I haven't had much luck!