Neo4j performance characteristics are a complex topic.
Measuring performance
First of all, the results depend heavily on how the server is configured. Measuring anything on a laptop is the wrong way to do it.
Before measuring performance, check the following:
- You have appropriate server hardware (see the requirements).
- The client and server are on the same local network.
- Neo4j is configured correctly (memory mapping, web server thread pool, Java heap size, etc.).
- The server is configured correctly (Linux TCP stack, maximum number of open files, etc.).
- The server is warmed up. Neo4j is written in Java, so you should do an appropriate warmup before measuring numbers (i.e. run some load for ~15 minutes).
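The warmup step can be sketched as a small timing harness that runs the workload for a fixed period, throws those timings away, and only then records latencies. This is a minimal sketch, not a full benchmark tool; `measure_after_warmup`, `operation` and the `clock` parameter are hypothetical names introduced here for illustration.

```python
import time


def measure_after_warmup(operation, warmup_seconds, measured_runs,
                         clock=time.perf_counter):
    """Call `operation` until `warmup_seconds` have elapsed (timings discarded),
    then time `measured_runs` further calls.

    Returns (number of warmup calls, list of measured latencies in clock units).
    """
    warmup_deadline = clock() + warmup_seconds
    warmup_calls = 0
    while clock() < warmup_deadline:
        operation()  # the JVM gets to JIT-compile hot paths and caches fill up
        warmup_calls += 1

    latencies = []
    for _ in range(measured_runs):
        started = clock()
        operation()
        latencies.append(clock() - started)
    return warmup_calls, latencies
```

For a real run, `operation` would issue one request against the server, and `warmup_seconds=15 * 60` would match the ~15 minutes suggested above.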
And the last point is the Enterprise Edition. The Enterprise version of Neo4j has some additional features that can significantly improve performance (e.g. the HPC cache).
Neo4j internally
Internally, Neo4j consists of:
- Storage
- Core API
- Traversal API
- Cypher API
Everything is done without any additional network requests. The Neo4j server is built on top of this solid foundation.
So, when you make a request to the Neo4j server, you measure:
- Delay between client and server
- JSON serialization costs
- Web Server (Jetty)
- Additional modules for managing locks, transactions, etc.
- And Neo4j itself
So, the bottom line is that Neo4j is pretty fast on its own when it is used in embedded mode. Working through the Neo4j server, however, comes with additional costs.
The numbers
We ran internal tests of Neo4j and measured several cases.
Create Nodes
Here we use the vanilla Transactional Cypher REST API.
Threads: 2
Nodes per transaction: 1000
Execution time: 1635 s
Total nodes created: 7000000
Nodes per second: 4281

Threads: 5
Nodes per transaction: 750
Execution time: 852 s
Total nodes created: 7000000
Nodes per second: 8215
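To give an idea of what batching nodes per transaction looks like against the transactional Cypher endpoint, here is a minimal sketch. It is not the code used in the test above: the empty `CREATE (n)` statements, the helper names and the `localhost:7474` address are assumptions for illustration. The `/db/data/transaction/commit` path is the Neo4j 2.x endpoint that opens and commits a transaction in a single request.

```python
import json
import urllib.request

# Assumed address of a local Neo4j 2.x server; adjust for a real setup.
COMMIT_URL = "http://localhost:7474/db/data/transaction/commit"


def batch_sizes(total_nodes, nodes_per_tx):
    """Split total_nodes into per-transaction batch sizes; the last may be smaller."""
    full, rest = divmod(total_nodes, nodes_per_tx)
    return [nodes_per_tx] * full + ([rest] if rest else [])


def build_payload(node_count):
    """Request body that creates node_count empty nodes in one transaction."""
    return {"statements": [{"statement": "CREATE (n)"} for _ in range(node_count)]}


def commit_batch(node_count, url=COMMIT_URL):
    """POST one batch; the server begins and commits the transaction in one round trip."""
    body = json.dumps(build_payload(node_count)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Each worker would call `commit_batch(size)` for its share of `batch_sizes(7000000, 1000)`; running several workers concurrently is left out of the sketch.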
Huge database synchronization
This uses a specially designed unmanaged extension, with a binary protocol between the server and the client and some concurrency.
But this is still a Neo4j server (actually a Neo4j cluster).
Node count: 80.32M (80 320 000)
Relationship count: 80.30M (80 300 000)
Property count: 257.78M (257 780 000)
Consumed time: 2142 seconds
Per second:
Nodes - 37497
Relationships - 37488
Properties - 120345
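The per-second figures follow directly from the counts and the consumed time. As a quick check, using only the numbers reported above (the counts are rounded in the source, and the rates appear to be rounded down):

```python
CONSUMED_SECONDS = 2142

counts = {
    "nodes": 80_320_000,
    "relationships": 80_300_000,
    "properties": 257_780_000,
}

# Floor division reproduces the reported whole-number rates.
rates = {name: count // CONSUMED_SECONDS for name, count in counts.items()}

for name, rate in rates.items():
    print(f"{name}: {rate} per second")
```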
These numbers show the true power of Neo4j.
My numbers
I also ran a quick measurement just now.
Fresh and unconfigured database (2.2.5), Ubuntu 14.04 (VM).
Results:
$ ab -p post_loc.txt -T application/json -c 1 -n 10000 http://localhost:7474/db/data/node
This is ApacheBench, Version 2.3 <$Revision: 1604373 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests

Server Software:        Jetty(9.2.4.v20141103)
Server Hostname:        localhost
Server Port:            7474

Document Path:          /db/data/node
Document Length:        1245 bytes

Concurrency Level:      1
Time taken for tests:   14.082 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      14910000 bytes
Total body sent:        1460000
HTML transferred:       12450000 bytes
Requests per second:    710.13 [
This creates 10,000 nodes with no properties through the REST API, using 1 thread.
As you can see, even on my laptop, in a Linux VM with default settings, Neo4j can create a node in 4 ms or less (99% of requests).
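For reference, this is how the headline numbers relate to each other. The sketch below derives throughput and mean latency from the `ab` summary above and shows one common way (nearest rank) to compute a percentile such as the 99% figure from raw latencies. The `percentile` helper is hypothetical and is not necessarily how `ab` itself computes its percentile table.

```python
def percentile(samples, percent):
    """Nearest-rank percentile; `percent` is an integer from 1 to 100."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Integer form of ceil(percent * n / 100), avoiding float rounding issues.
    rank = (percent * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Values taken from the ab output above (concurrency level 1).
requests = 10000
total_seconds = 14.082

requests_per_second = requests / total_seconds
mean_latency_ms = total_seconds / requests * 1000

print(f"{requests_per_second:.2f} requests/s, {mean_latency_ms:.3f} ms mean")
```

With a single client thread, the mean latency is simply the total time divided by the number of requests, so roughly 1.4 ms per node on average, with the slowest 1% of requests above the 99th percentile.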
Note: I warmed up the database first (created and deleted 100K nodes).
Bolt
If you are looking for the best Neo4j performance, you should follow the development of Bolt. This is the new binary protocol for the Neo4j server.
Additional information: here, here and here.