Zookeeper not working?

I want to run SolrCloud with Solr 4.3.0.

(I am using aws ubuntu-12.04-lts micro instances)

So, I followed this tutorial:

which basically says run zookeeper and connect solr instances to it.

This is how I start ZooKeeper:

  • First I copied the configuration as described in the tutorial

    sudo cp zookeeper-3.4.5/conf/zoo_sample.cfg zookeeper-3.4.5/conf/zoo.cfg 
  • Then I launched zookeeper

     ubuntu@ip-10-48-159-36:/opt$ sudo zookeeper-3.4.5/bin/zkServer.sh start
     JMX enabled by default
     Using config: /opt/zookeeper-3.4.5/bin/../conf/zoo.cfg
     Starting zookeeper ... STARTED

    It looks good so far.

  • I checked the status:

     ubuntu@ip-10-48-159-36:/opt$ sudo zookeeper-3.4.5/bin/zkServer.sh status
     JMX enabled by default
     Using config: /opt/zookeeper-3.4.5/bin/../conf/zoo.cfg
     Error contacting service. It is probably not running.

    Which seems a little strange already.

  • If I try to connect with a client (remotely as well as locally), it seems to work:

     ubuntu@ip-10-234-223-69:/opt$ zookeeper-3.4.5/bin/zkCli.sh -server ec2-54-247-144-120.eu-west-1.compute.amazonaws.com:2181
     Connecting to ec2-54-247-144-120.eu-west-1.compute.amazonaws.com:2181
     2013-06-07 11:07:01,996 [myid:] - INFO [main:Environment@100] - Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
     2013-06-07 11:07:02,000 [myid:] - INFO [main:Environment@100] - Client environment:host.name=ip-10-234-223-69.eu-west-1.compute.internal
     2013-06-07 11:07:02,000 [myid:] - INFO [main:Environment@100] - Client environment:java.version=1.6.0_27
     2013-06-07 11:07:02,002 [myid:] - INFO [main:Environment@100] - Client environment:java.vendor=Sun Microsystems Inc.
     2013-06-07 11:07:02,003 [myid:] - INFO [main:Environment@100] - Client environment:java.home=/usr/lib/jvm/java-6-openjdk-amd64/jre
     2013-06-07 11:07:02,003 [myid:] - INFO [main:Environment@100] - Client environment:java.class.path=/opt/zookeeper-3.4.5/bin/../build/classes:/opt/zookeeper-3.4.5/bin/../build/lib/*.jar:/opt/zookeeper-3.4.5/bin/../lib/slf4j-log4j12-1.6.1.jar:/opt/zookeeper-3.4.5/bin/../lib/slf4j-api-1.6.1.jar:/opt/zookeeper-3.4.5/bin/../lib/netty-3.2.2.Final.jar:/opt/zookeeper-3.4.5/bin/../lib/log4j-1.2.15.jar:/opt/zookeeper-3.4.5/bin/../lib/jline-0.9.94.jar:/opt/zookeeper-3.4.5/bin/../zookeeper-3.4.5.jar:/opt/zookeeper-3.4.5/bin/../src/java/lib/*.jar:/opt/zookeeper-3.4.5/bin/../conf:
     2013-06-07 11:07:02,004 [myid:] - INFO [main:Environment@100] - Client environment:java.library.path=/usr/lib/jvm/java-6-openjdk-amd64/jre/lib/amd64/server:/usr/lib/jvm/java-6-openjdk-amd64/jre/lib/amd64:/usr/lib/jvm/java-6-openjdk-amd64/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
     2013-06-07 11:07:02,008 [myid:] - INFO [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
     2013-06-07 11:07:02,009 [myid:] - INFO [main:Environment@100] - Client environment:java.compiler=<NA>
     2013-06-07 11:07:02,018 [myid:] - INFO [main:Environment@100] - Client environment:os.name=Linux
     2013-06-07 11:07:02,019 [myid:] - INFO [main:Environment@100] - Client environment:os.arch=amd64
     2013-06-07 11:07:02,019 [myid:] - INFO [main:Environment@100] - Client environment:os.version=3.2.0-40-virtual
     2013-06-07 11:07:02,020 [myid:] - INFO [main:Environment@100] - Client environment:user.name=ubuntu
     2013-06-07 11:07:02,020 [myid:] - INFO [main:Environment@100] - Client environment:user.home=/home/ubuntu
     2013-06-07 11:07:02,021 [myid:] - INFO [main:Environment@100] - Client environment:user.dir=/opt
     2013-06-07 11:07:02,029 [myid:] - INFO [main:ZooKeeper@438] - Initiating client connection, connectString=ec2-54-247-144-120.eu-west-1.compute.amazonaws.com:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@182d9c06
     Welcome to ZooKeeper!
     2013-06-07 11:07:02,074 [myid:] - INFO [main-SendThread(ip-10-48-159-36.eu-west-1.compute.internal:2181):ClientCnxn$SendThread@966] - Opening socket connection to server ip-10-48-159-36.eu-west-1.compute.internal/10.48.159.36:2181. Will not attempt to authenticate using SASL (unknown error)
     JLine support is enabled
     [zk: ec2-54-247-144-120.eu-west-1.compute.amazonaws.com:2181(CONNECTING) 0] 2013-06-07 11:07:32,100 [myid:] - INFO [main-SendThread(ip-10-48-159-36.eu-west-1.compute.internal:2181):ClientCnxn$SendThread@1083] - Client session timed out, have not heard from server in 30038ms for sessionid 0x0, closing socket connection and attempting reconnect
     2013-06-07 11:07:33,204 [myid:] - INFO [main-SendThread(ip-10-48-159-36.eu-west-1.compute.internal:2181):ClientCnxn$SendThread@966] - Opening socket connection to server ip-10-48-159-36.eu-west-1.compute.internal/10.48.159.36:2181. Will not attempt to authenticate using SASL (unknown error)
  • Next I tried to connect a Solr instance to it. The Tomcat 7 web interface only tells me “503 - the server is shutting down”, so I checked the Solr logs:

     2013-06-07 11:16:36,065 [pool-2-thread-1] INFO org.apache.solr.servlet.SolrDispatchFilter . SolrDispatchFilter.init()
     2013-06-07 11:16:36,100 [pool-2-thread-1] INFO org.apache.solr.core.SolrResourceLoader . Using JNDI solr.home: /opt/solr-4.3.0/example/solr
     2013-06-07 11:16:36,132 [pool-2-thread-1] INFO org.apache.solr.core.CoreContainer . looking for solr config file: /opt/solr-4.3.0/example/solr/solr.xml
     2013-06-07 11:16:36,138 [pool-2-thread-1] INFO org.apache.solr.core.CoreContainer . New CoreContainer 1285984216
     2013-06-07 11:16:36,146 [pool-2-thread-1] INFO org.apache.solr.core.CoreContainer . Loading CoreContainer using Solr Home: '/opt/solr-4.3.0/example/solr/'
     2013-06-07 11:16:36,152 [pool-2-thread-1] INFO org.apache.solr.core.SolrResourceLoader . new SolrResourceLoader for directory: '/opt/solr-4.3.0/example/solr/'
     2013-06-07 11:16:36,567 [pool-2-thread-1] INFO org.apache.solr.handler.component.HttpShardHandlerFactory . Setting socketTimeout to: 0
     2013-06-07 11:16:36,568 [pool-2-thread-1] INFO org.apache.solr.handler.component.HttpShardHandlerFactory . Setting urlScheme to: http://
     2013-06-07 11:16:36,568 [pool-2-thread-1] INFO org.apache.solr.handler.component.HttpShardHandlerFactory . Setting connTimeout to: 0
     2013-06-07 11:16:36,568 [pool-2-thread-1] INFO org.apache.solr.handler.component.HttpShardHandlerFactory . Setting maxConnectionsPerHost to: 20
     2013-06-07 11:16:36,568 [pool-2-thread-1] INFO org.apache.solr.handler.component.HttpShardHandlerFactory . Setting corePoolSize to: 0
     2013-06-07 11:16:36,568 [pool-2-thread-1] INFO org.apache.solr.handler.component.HttpShardHandlerFactory . Setting maximumPoolSize to: 2147483647
     2013-06-07 11:16:36,568 [pool-2-thread-1] INFO org.apache.solr.handler.component.HttpShardHandlerFactory . Setting maxThreadIdleTime to: 5
     2013-06-07 11:16:36,569 [pool-2-thread-1] INFO org.apache.solr.handler.component.HttpShardHandlerFactory . Setting sizeOfQueue to: -1
     2013-06-07 11:16:36,569 [pool-2-thread-1] INFO org.apache.solr.handler.component.HttpShardHandlerFactory . Setting fairnessPolicy to: false
     2013-06-07 11:16:36,578 [pool-2-thread-1] INFO org.apache.solr.client.solrj.impl.HttpClientUtil . Creating new http client, config:maxConnectionsPerHost=20&maxConnections=10000&socketTimeout=0&connTimeout=0&retry=false
     2013-06-07 11:16:36,879 [pool-2-thread-1] INFO org.apache.solr.core.CoreContainer . Registering Log Listener
     2013-06-07 11:16:36,881 [pool-2-thread-1] INFO org.apache.solr.core.CoreContainer . Zookeeper client=ec2-54-247-144-120.eu-west-1.compute.amazonaws.com:2181
     2013-06-07 11:16:36,888 [pool-2-thread-1] INFO org.apache.solr.client.solrj.impl.HttpClientUtil . Creating new http client, config:maxConnections=500&maxConnectionsPerHost=16&socketTimeout=0&connTimeout=0
     2013-06-07 11:16:37,040 [pool-2-thread-1] INFO org.apache.solr.common.cloud.ConnectionManager . Waiting for client to connect to ZooKeeper
     2013-06-07 11:16:52,046 [pool-2-thread-1] ERROR org.apache.solr.servlet.SolrDispatchFilter . Could not start Solr. Check solr/home property and the logs
     2013-06-07 11:16:52,103 [pool-2-thread-1] ERROR org.apache.solr.core.SolrCore . null:java.lang.RuntimeException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper ec2-54-247-144-120.eu-west-1.compute.amazonaws.com:2181 within 15000 ms
         at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:130)
         at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:88)
         at org.apache.solr.cloud.ZkController.<init>(ZkController.java:170)
         at org.apache.solr.core.CoreContainer.initZooKeeper(CoreContainer.java:242)
         at org.apache.solr.core.CoreContainer.load(CoreContainer.java:495)
         at org.apache.solr.core.CoreContainer.load(CoreContainer.java:358)
         at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:326)
         at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:124)
         at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:277)
         at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:258)
         at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:382)
         at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:103)
         at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4638)
         at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5294)
         at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
         at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:895)
         at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:871)
         at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:615)
         at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:649)
         at org.apache.catalina.startup.HostConfig$DeployDescriptor.run(HostConfig.java:1581)
         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
         at java.lang.Thread.run(Thread.java:679)
     Caused by: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper ec2-54-247-144-120.eu-west-1.compute.amazonaws.com:2181 within 15000 ms
         at org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:173)
         at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:127)
         ... 25 more
     2013-06-07 11:16:52,104 [pool-2-thread-1] INFO org.apache.solr.servlet.SolrDispatchFilter . SolrDispatchFilter.init() done

What does this tell me? In the same instance, I just connected to the client successfully ... :(

So where is the problem?

[Edit:] Instead of the public Amazon ec2-*.amazonaws.com address, I used the internal 10.XXX network address to tell Solr where ZooKeeper is. This seems to work.
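
Since the fix came down to which address is used, it may help to check what a hostname actually resolves to from the machine that is trying to connect. Below is a minimal Python sketch; the helper name resolve_ipv4 is made up for illustration and is not part of ZooKeeper or Solr.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses a name resolves to.

    Raises socket.gaierror if the name cannot be resolved at all.
    """
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (address, port) pair.
    return sorted({info[4][0] for info in infos})
```

Running it on both the Solr box and the ZooKeeper box with the same hostname shows whether the two machines see the same address for it.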

+4
4 answers

You already have the answer: your ZooKeeper is unreachable! Check your firewall configuration.

You can also check it with

 zkCli.sh -server localhost:2181 
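
A lighter-weight check than starting the full client is ZooKeeper's "ruok" four-letter command, which a healthy server answers with "imok". The Python sketch below assumes the command is enabled on the server (newer ZooKeeper releases may require it to be whitelisted); the function name zk_ruok is hypothetical.

```python
import socket


def zk_ruok(host, port=2181, timeout=5.0):
    """Send 'ruok' to a ZooKeeper client port; return True only on 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = b""
            # The expected reply is exactly four bytes; stop early if the
            # server closes the connection first.
            while len(reply) < 4:
                chunk = sock.recv(4 - len(reply))
                if not chunk:
                    break
                reply += chunk
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
    return reply == b"imok"


# Example call (not executed here): zk_ruok("localhost", 2181)
```

A False result from the ZooKeeper host itself points at the server; a False result only from other hosts points at the network path in between.
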
+1

There must have been some kind of connectivity problem. I see you have resolved it now. The next time you run into a situation like this, log in to the box that has trouble connecting and use telnet to find out whether you can connect at all.

For example, from your Solr box:

 telnet ec2-54-247-144-120.eu-west-1.compute.amazonaws.com 2181 

and then try the same from the ZooKeeper box. That should begin to narrow down where your problem is.

This takes any application-level problems out of the picture and tells you fairly reliably whether or not you can connect. If you cannot connect, there is almost always some kind of security problem: either a firewall is running somewhere (try: $ service iptables stop), or the security group on Amazon is misconfigured.
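
If telnet is not installed on the box, the same TCP-level test can be sketched in a few lines of Python. The function name can_connect and the wording of the hints are illustrative assumptions; the hints are only the usual suspects, not a diagnosis.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return (True, "") if a TCP connection opens, else (False, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except socket.timeout:
        # Packets silently dropped: often a firewall or security group rule.
        return False, "timed out"
    except ConnectionRefusedError:
        # Host answered but nothing is listening on that port.
        return False, "connection refused"
    except OSError as exc:
        # Anything else, e.g. name resolution failure or no route to host.
        return False, str(exc)


# Example call (not executed here):
# can_connect("ec2-54-247-144-120.eu-west-1.compute.amazonaws.com", 2181)
```

A timeout and a refused connection usually mean different things, which is why the sketch reports them separately.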

The last potential problem is network availability. Despite what people think, the network is NOT reliable and should never be considered such. Anyone who works in SOA / distributed systems knows this well :) http://aphyr.com/posts/288-the-network-is-reliable

“A team from the University of Toronto and Microsoft Research studied the behavior of network failures in several Microsoft data centers. They found an average failure rate of 5.2 devices per day and 40.8 links per day with an average recovery time of approximately five minutes (and up to one week).”

+1

When setting up SolrCloud and ZooKeeper, I also ran into the "Error contacting service. It is probably not running." problem. The cause was a typo in the name of a file that ZooKeeper requires. The correct file name is "myid"; I had written "myip" by mistake. After renaming the file and restarting ZooKeeper (./zkServer.sh restart), my problem was resolved.
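
A typo like this is easy to catch with a small script. The Python sketch below assumes the usual layout, where zoo.cfg has a dataDir entry and the myid file sits directly inside that directory and contains only the server number; the function names are hypothetical. Note that myid is normally only needed when ZooKeeper runs as a multi-server ensemble.

```python
import os


def read_data_dir(cfg_path):
    """Return the dataDir value from a zoo.cfg-style file, or None."""
    with open(cfg_path) as cfg:
        for line in cfg:
            line = line.strip()
            # Skip comments and anything that is not a key=value pair.
            if line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            if key.strip() == "dataDir":
                return value.strip()
    return None


def check_myid(cfg_path):
    """Return (ok, message) describing the myid file under dataDir."""
    data_dir = read_data_dir(cfg_path)
    if data_dir is None:
        return False, "no dataDir entry in " + cfg_path
    myid_path = os.path.join(data_dir, "myid")
    if not os.path.isfile(myid_path):
        return False, "missing file: " + myid_path
    with open(myid_path) as f:
        content = f.read().strip()
    if not content.isdigit():
        return False, "myid should hold only the server number, got: " + repr(content)
    return True, "myid = " + content


# Example call (not executed here):
# check_myid("/opt/zookeeper-3.4.5/conf/zoo.cfg")
```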

0

Try shutting down your instance with solr.shutdown() so that you can create a new instance of CloudSolrServer for each thread.

0

Source: https://habr.com/ru/post/1485022/

