What could lead to this bad neo4j performance?

Question

In our stack, we use neo4j and are facing classic performance issues: the application becomes slow as soon as it needs data from neo4j.

Listening only to my courage (pun intended), I fired up JVisualVM and profiled the application.

This application is hosted on a JavaEE server (Glassfish) and uses a quasi-semantic stack made of Empire-RDF, Blueprints and neo4j. Access to neo4j is provided by the JCA neo4j-connector.

As shown in the screenshot, there is strong evidence that the bottleneck is in neo4j lookups.

[Screenshot: interesting fragments of a profiling session]

My question is twofold but simple:

Is this level of performance normal? (I think not.)
What can I do to improve it?

EDIT: here is some test information that should enlighten both of you.

My graph structure is unknown to me: since I use Empire-RDF on top of Blueprints / Sesame / neo4j, I only know the Java objects I manipulate, which are ten interrelated classes. Unfortunately, they sit at the very heart of our business, so I don't think I can open them up.

Consider, for this example, that they form a tree of visual elements associated with objects representing URIs.

I have a Maven test that performs a mix of read/write operations (say, 20 to 50 JPA operations). This Maven test passes after 300 seconds.
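With 20 to 50 operations in 300 seconds, that is roughly 6 to 15 seconds per operation on average. One way to check whether the cost is spread evenly or concentrated in a few calls is to time each operation separately. What follows is only a minimal sketch, assuming each JPA operation can be wrapped in a Runnable; the `OperationTimer` class and the labels are hypothetical and not part of the actual test.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Empire, Blueprints or neo4j.
class OperationTimer {
    // Accumulated wall-clock time per label, in nanoseconds, in first-seen order.
    private final Map<String, Long> elapsedNanos = new LinkedHashMap<String, Long>();

    // Runs the operation and adds its duration to the total for the given label,
    // even if the operation throws.
    void time(String label, Runnable operation) {
        long start = System.nanoTime();
        try {
            operation.run();
        } finally {
            long spent = System.nanoTime() - start;
            Long previous = elapsedNanos.get(label);
            elapsedNanos.put(label, previous == null ? spent : previous + spent);
        }
    }

    Map<String, Long> totals() {
        return elapsedNanos;
    }

    public static void main(String[] args) {
        OperationTimer timer = new OperationTimer();
        // Placeholder bodies: in the real test these would be the JPA calls.
        timer.time("read", new Runnable() { public void run() { /* e.g. a find */ } });
        timer.time("write", new Runnable() { public void run() { /* e.g. a persist */ } });
        for (Map.Entry<String, Long> entry : timer.totals().entrySet()) {
            System.out.println(entry.getKey() + ": " + (entry.getValue() / 1000000L) + " ms");
        }
    }
}
```

Per-label totals like these complement a sampler: the sampler shows which internal methods are hot, while the timer shows which of the test's own operations trigger them.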

At a lower level:

The application runs on Windows 7 and Mac OS X 10.6, with various minor versions of Java 1.6.
The application is hosted on Glassfish 3.1.1.
The neo4j DB is version 1.5, accessed through the neo4j-connector for JCA (default settings, nothing changed).
Sesame is version 2.6.0.
Blueprints is version 1.1.
Empire-RDF is version 0.7.

As a last word, digging into the jVisualVM sampler shows that most of the application's time is spent in calls to NodeManager#getNodeForProxy.

java profiling neo4j rdf

Riduidel Feb 01 '12 at 13:52

2 answers

The last time I used the neo4j Sail, I was very disappointed with its performance. Insertions, even bulk insertions, were unacceptably slow, and everything but the simplest queries was too slow for any user interface.

Of course, this was about two years ago, so the performance is most likely different (maybe even better) than when I last looked at it, but at the time it was so far behind all the dedicated RDF databases that I can't imagine it has caught up.

neo4j is good if you use it as a graph store, but I don't think it works well for RDF. You will be much better off using a real RDF database. Since you are using Empire, it should be easy to drop in most other RDF databases and see how that affects performance, assuming you are not relying on anything specific to neo4j / Blueprints. If you are, Stardog includes Blueprints bindings that are worth a look.

Michael Feb 01 '12 at 18:28

Riduidel · Accepted Answer · 2012-02-03T13:34:49+0000

OK, time to put an end to this joke, with thanks to Mike, who helped me.

The performance issue was not a bug in neo4j 1.5, nor in Empire or Blueprints, but rather my poor understanding of my own persistence stack.

Do you remember that the neo4j instance in use was obtained from the JCA connector?

Well, I was using version 0.2 of this connector, which worked with neo4j 1.4... Yes, 1.4!

Fortunately, I had already prepared an update for this connector, allowing me to send parameters directly to neo4j (for example, setting the cache_type parameter). So I finished this update, bundled it, deployed it to my local repository, integrated it into my domain, tested it and ... it worked! A 20x performance improvement!
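For readers wondering what "sending parameters directly to neo4j" can look like, here is a minimal sketch. It is not the connector's actual code. It assumes the settings end up in a plain string map of the kind embedded neo4j 1.x accepts alongside the store directory. The `Neo4jParamsSketch` class and `buildParams` method are hypothetical, and the list of cache_type values is recalled from the 1.x documentation, so it should be checked against the version actually deployed.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of neo4j or the JCA connector.
class Neo4jParamsSketch {
    // Assumption: the cache types recognised by neo4j 1.x. Verify for your version.
    static final List<String> KNOWN_CACHE_TYPES =
            Arrays.asList("none", "soft", "weak", "strong");

    // Builds the configuration map, rejecting values outside the assumed list
    // so that a typo fails early instead of being silently ignored.
    static Map<String, String> buildParams(String cacheType) {
        if (!KNOWN_CACHE_TYPES.contains(cacheType)) {
            throw new IllegalArgumentException("Unknown cache_type: " + cacheType);
        }
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("cache_type", cacheType);
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> params = buildParams("weak");
        // With neo4j 1.x on the classpath, the map would be handed to the embedded
        // database together with the store directory (assumed constructor shape):
        //   new EmbeddedGraphDatabase("path/to/store", params);
        System.out.println(params);
    }
}
```

The point of the sketch is only that the connector needs some way to forward such a map; which cache type suits a given workload has to be measured.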
