Customization
I am implementing a recommendation system on an Ubuntu 12.04 server using Titan with Rexster (titan-server-0.4.4.zip), Cassandra for storage and Elasticsearch as the index backend. To connect to the Rexster server, I use the Bulbflow library for Python.
The beta seemed to work fine for 3 weeks, but after a small increase in load (only about 10 users), the Rexster server stopped responding. I do not know whether my Rexster configuration is wrong or whether I am using the Bulbflow library incorrectly.
Rexster / Titan Configuration
Here is my rexster-cassandra-es.xml:
<?xml version="1.0" encoding="UTF-8"?>
<rexster>
  <http>
    <server-port>8182</server-port>
    <server-host>0.0.0.0</server-host>
    <base-uri>http://MY_IP</base-uri>
    <web-root>public</web-root>
    <character-set>UTF-8</character-set>
    <enable-jmx>false</enable-jmx>
    <enable-doghouse>true</enable-doghouse>
    <max-post-size>2097152</max-post-size>
    <max-header-size>8192</max-header-size>
    <upload-timeout-millis>30000</upload-timeout-millis>
    <thread-pool>
      <worker>
        <core-size>20</core-size>
        <max-size>40</max-size>
      </worker>
      <kernal>
        <core-size>10</core-size>
        <max-size>20</max-size>
      </kernal>
    </thread-pool>
    <io-strategy>leader-follower</io-strategy>
  </http>
  <rexpro>
    <server-port>8184</server-port>
    <server-host>0.0.0.0</server-host>
    <session-max-idle>1790000</session-max-idle>
    <session-check-interval>3000000</session-check-interval>
    <connection-max-idle>180000</connection-max-idle>
    <connection-check-interval>3000000</connection-check-interval>
    <enable-jmx>false</enable-jmx>
    <thread-pool>
      <worker>
        <core-size>8</core-size>
        <max-size>8</max-size>
      </worker>
      <kernal>
        <core-size>4</core-size>
        <max-size>4</max-size>
      </kernal>
    </thread-pool>
    <io-strategy>leader-follower</io-strategy>
  </rexpro>
  <shutdown-port>8183</shutdown-port>
  <shutdown-host>127.0.0.1</shutdown-host>
  <script-engines>
    <script-engine>
      <name>gremlin-groovy</name>
      <reset-threshold>-1</reset-threshold>
<imports>com.tinkerpop.gremlin.*,com.tinkerpop.gremlin.java.*,com.tinkerpop.gremlin.pipes.filter.*,com.tinkerpop.gremlin.pipes.sideeffect.*,com.tinkerpop.gremlin.pipes.transform.*,com.tinkerpop.blueprints.*,com.tinkerpop.blueprints.impls.*,com.tinkerpop.blueprints.impls.tg.*,com.tinkerpop.blueprints.impls.neo4j.*,com.tinkerpop.blueprints.impls.neo4j.batch.*,com.tinkerpop.blueprints.impls.orient.*,com.tinkerpop.blueprints.impls.orient.batch.*,com.tinkerpop.blueprints.impls.dex.*,com.tinkerpop.blueprints.impls.rexster.*,com.tinkerpop.blueprints.impls.sail.*,com.tinkerpop.blueprints.impls.sail.impls.*,com.tinkerpop.blueprints.util.*,com.tinkerpop.blueprints.util.io.*,com.tinkerpop.blueprints.util.io.gml.*,com.tinkerpop.blueprints.util.io.graphml.*,com.tinkerpop.blueprints.util.io.graphson.*,com.tinkerpop.blueprints.util.wrappers.*,com.tinkerpop.blueprints.util.wrappers.batch.*,com.tinkerpop.blueprints.util.wrappers.batch.cache.*,com.tinkerpop.blueprints.util.wrappers.event.*,com.tinkerpop.blueprints.util.wrappers.event.listener.*,com.tinkerpop.blueprints.util.wrappers.id.*,com.tinkerpop.blueprints.util.wrappers.partition.*,com.tinkerpop.blueprints.util.wrappers.readonly.*,com.tinkerpop.blueprints.oupls.sail.*,com.tinkerpop.blueprints.oupls.sail.pg.*,com.tinkerpop.blueprints.oupls.jung.*,com.tinkerpop.pipes.*,com.tinkerpop.pipes.branch.*,com.tinkerpop.pipes.filter.*,com.tinkerpop.pipes.sideeffect.*,com.tinkerpop.pipes.transform.*,com.tinkerpop.pipes.util.*,com.tinkerpop.pipes.util.iterators.*,com.tinkerpop.pipes.util.structures.*,org.apache.commons.configuration.*,com.thinkaurelius.titan.core.*,com.thinkaurelius.titan.core.attribute.*,com.thinkaurelius.titan.core.util.*,com.thinkaurelius.titan.example.*,org.apache.commons.configuration.*,com.tinkerpop.gremlin.Tokens.T,com.tinkerpop.gremlin.groovy.*</imports> 
      <static-imports>com.tinkerpop.blueprints.Direction.*,com.tinkerpop.blueprints.TransactionalGraph$Conclusion.*,com.tinkerpop.blueprints.Compare.*,com.thinkaurelius.titan.core.attribute.Geo.*,com.thinkaurelius.titan.core.attribute.Text.*,com.thinkaurelius.titan.core.TypeMaker$UniquenessConsistency.*,com.tinkerpop.blueprints.Query$Compare.*</static-imports>
    </script-engine>
  </script-engines>
  <security>
    <authentication>
      <type>none</type>
      <configuration>
        <users>
          <user>
            <username>rexster</username>
            <password>rexster</password>
          </user>
        </users>
      </configuration>
    </authentication>
  </security>
  <metrics>
    <reporter>
      <type>jmx</type>
    </reporter>
    <reporter>
      <type>http</type>
    </reporter>
    <reporter>
      <type>console</type>
      <properties>
        <rates-time-unit>SECONDS</rates-time-unit>
        <duration-time-unit>SECONDS</duration-time-unit>
        <report-period>10</report-period>
        <report-time-unit>MINUTES</report-time-unit>
        <includes>http.rest.*</includes>
        <excludes>http.rest.*.delete</excludes>
      </properties>
    </reporter>
  </metrics>
  <graphs>
    <graph>
      <graph-name>newspaper</graph-name>
      <graph-type>com.thinkaurelius.titan.tinkerpop.rexster.TitanGraphConfiguration</graph-type>
      <graph-read-only>false</graph-read-only>
      <properties>
        <storage.backend>cassandra</storage.backend>
        <storage.index.search.backend>elasticsearch</storage.index.search.backend>
        <storage.index.search.hostname>localhost</storage.index.search.hostname>
        <storage.index.search.client-only>true</storage.index.search.client-only>
        <storage.index.search.local-mode>false</storage.index.search.local-mode>
      </properties>
      <extensions>
        <allows>
          <allow>tp:gremlin</allow>
        </allows>
      </extensions>
    </graph>
  </graphs>
</rexster>
I increased the core-size and max-size of the thread pools for both the worker and the kernal. Without this change, the Rexster server freezes / stops responding even sooner.
What are appropriate values for core-size and max-size?
Using Bulbflow
To use Bulbflow, I create a new Graph object every time I need to execute a query. There are many requests, so these objects are created very often.
Should I create a new Graph object for each new query?
Is it possible to create only one Graph object and reuse it whenever a new request is sent to the graph database, or will I run into session problems?
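To make the second option concrete, here is a minimal sketch of what I mean by a single shared object. `GraphHolder` and the factory argument are hypothetical names used only for illustration, not part of Bulbflow. The commented-out factory assumes Rexster is reachable on localhost at the port and graph name from the configuration above. Whether sharing one Bulbflow Graph across requests is actually safe is exactly what I am unsure about.

```python
import threading


class GraphHolder(object):
    """Lazily builds one graph client and returns the same instance to every caller."""

    def __init__(self, factory):
        self._factory = factory        # hypothetical: callable that builds the client
        self._graph = None
        self._lock = threading.Lock()

    def get(self):
        # Double-checked locking so concurrent requests build the client only once.
        if self._graph is None:
            with self._lock:
                if self._graph is None:
                    self._graph = self._factory()
        return self._graph


# In the real application the factory would be something like (assumed URI):
#   from bulbs.titan import Graph, Config
#   holder = GraphHolder(lambda: Graph(Config("http://localhost:8182/graphs/newspaper")))
# A stand-in object is used here so the sketch runs without a Rexster server.
holder = GraphHolder(lambda: object())
assert holder.get() is holder.get()
```

Each request handler would then call `holder.get()` instead of constructing a new Graph.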
Error message
When everything gets stuck and I force the program to terminate (Ctrl-C), I get the following stack trace:
Exception happened during processing of request from ('my_ip', 57489)
Traceback (most recent call last):
  File "/usr/lib/python2.7/SocketServer.py", line 284, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/lib/python2.7/SocketServer.py", line 310, in process_request
    self.finish_request(request, client_address)
  File "/usr/lib/python2.7/SocketServer.py", line 323, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib/python2.7/SocketServer.py", line 638, in __init__
    self.handle()
  File "/home/user/dir/env/venv_python/local/lib/python2.7/site-packages/werkzeug/serving.py", line 200, in handle
    rv = BaseHTTPRequestHandler.handle(self)
  File "/usr/lib/python2.7/BaseHTTPServer.py", line 340, in handle
    self.handle_one_request()
  File "/home/user/dir/env/venv_python/local/lib/python2.7/site-packages/werkzeug/serving.py", line 235, in handle_one_request
    return self.run_wsgi()
  File "/home/user/dir/env/venv_python/local/lib/python2.7/site-packages/werkzeug/serving.py", line 177, in run_wsgi
    execute(self.server.app)
  File "/home/user/dir/env/venv_python/local/lib/python2.7/site-packages/werkzeug/serving.py", line 165, in execute
    application_iter = app(environ, start_response)
  File "/home/user/dir/env/venv_python/local/lib/python2.7/site-packages/flask/app.py", line 1836, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/user/dir/env/venv_python/local/lib/python2.7/site-packages/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/user/dir/env/venv_python/local/lib/python2.7/site-packages/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/user/dir/env/venv_python/local/lib/python2.7/site-packages/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/user/dir/recommender/project/api/start.py", line 65, in put_user
    graphdb.insert_user(user_id)
  File "project/api/graphdb.py", line 14, in insert_user
    user_with_id = g.users.index.lookup(user_sqlid=user_id)
  File "/home/user/dir/env/venv_python/local/lib/python2.7/site-packages/bulbs/titan/index.py", line 270, in lookup
    resp = self.client.lookup_vertex(self.index_name,key,value)
  File "/home/user/dir/env/venv_python/local/lib/python2.7/site-packages/bulbs/titan/client.py", line 348, in lookup_vertex
    return self.request.get(path,params)
  File "/home/user/dir/env/venv_python/local/lib/python2.7/site-packages/bulbs/rest.py", line 101, in get
    return self.request(GET, path, params)
  File "/home/user/dir/env/venv_python/local/lib/python2.7/site-packages/bulbs/rest.py", line 184, in request
    http_resp = self.http.request(uri, method, body, headers)
  File "/home/user/dir/env/venv_python/local/lib/python2.7/site-packages/httplib2/__init__.py", line 1593, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "/home/user/dir/env/venv_python/local/lib/python2.7/site-packages/httplib2/__init__.py", line 1335, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "/home/user/dir/env/venv_python/local/lib/python2.7/site-packages/httplib2/__init__.py", line 1291, in _conn_request
    response = conn.getresponse()
  File "/usr/lib/python2.7/httplib.py", line 1030, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 407, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 365, in _read_status
    line = self.fp.readline()
  File "/usr/lib/python2.7/socket.py", line 430, in readline
    data = recv(1)
Recovery
To recover, I have to shut down Rexster / Titan and restart it. Whenever I stop the Rexster server (./bin/titan -c cassandra-es stop), I get the following output:
Killing Titan + Rexster (pid 26779)...
Rexster shutdown timeout exceeded (60 seconds)
Killing Cassandra (pid 26201)...
Rexster is completely stuck.
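The hang is visible from the client side as well: the stack trace above ends in a socket read with no timeout, so each web request waits for as long as Rexster stays silent. The following standalone sketch reproduces that situation with the standard library only (it does not use Bulbflow or httplib2) and shows how a client-side timeout turns the hang into an error. The helper name `read_with_timeout` and the 0.5 second value are assumptions made for illustration.

```python
import socket

# A listening socket that never answers, standing in for an unresponsive server.
silent_server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
silent_server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
silent_server.listen(1)
port = silent_server.getsockname()[1]


def read_with_timeout(port, seconds):
    """Send a request and wait at most `seconds` for the first byte of the reply."""
    client = socket.create_connection(("127.0.0.1", port), timeout=seconds)
    try:
        client.sendall(b"GET / HTTP/1.0\r\n\r\n")
        return client.recv(1)          # without a timeout this call blocks forever
    except socket.timeout:
        return None                    # the server stayed silent
    finally:
        client.close()


result = read_with_timeout(port, 0.5)
silent_server.close()
```

With a timeout in place the caller gets `None` back after half a second instead of blocking until the process is killed.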
Any guidance would be greatly appreciated.