scala - Elasticsearch-Hadoop library cannot connect to Docker container


I have a Spark job that reads from Cassandra, processes/transforms/filters the data, and writes the results to Elasticsearch. I use Docker for integration tests, and I am running into trouble writing from Spark to Elasticsearch.
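At a high level the job looks like the following (a rough sketch only; the keyspace, table, and field names are placeholders, not my real schema):

import com.datastax.spark.connector._      // adds sc.cassandraTable(...)
import org.elasticsearch.spark.rdd.EsSpark

// Read from Cassandra ("my_keyspace"/"my_table" are placeholder names),
// shape each row into a document, then bulk-index into Elasticsearch.
val rows = sc.cassandraTable("my_keyspace", "my_table")
val docs = rows.map(row => Map(
  "id"        -> row.getString("id"),
  "parent_id" -> row.getString("parent_id")
))
EsSpark.saveToEs(docs, "hot/mytype",
  Map("es.mapping.id" -> "id", "es.mapping.parent" -> "parent_id"))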

Dependencies:

"joda-time"              % "joda-time"          % "2.9.4", "javax.servlet"          %  "javax.servlet-api" % "3.1.0", "org.elasticsearch"      %  "elasticsearch"     % "2.3.2", "org.scalatest"          %% "scalatest"         % "2.2.1", "com.github.nscala-time" %% "nscala-time"       % "2.10.0", "cascading"              %   "cascading-hadoop" % "2.6.3", "cascading"              %   "cascading-local"  % "2.6.3", "com.datastax.spark"     %% "spark-cassandra-connector" % "1.4.2", "com.datastax.cassandra" % "cassandra-driver-core" % "2.1.5", "org.elasticsearch"      %  "elasticsearch-hadoop"      % "2.3.2" excludeall(exclusionrule("org.apache.storm")), "org.apache.spark"       %% "spark-catalyst"            % "1.4.0" % "provided" 

In my unit tests I can connect to Elasticsearch using a TransportClient to set up the template and index.

That is, this works:

val conf = new SparkConf().setAppName("test_reindex").setMaster("local")
  .set("spark.cassandra.input.split.size_in_mb", "67108864")
  .set("spark.cassandra.connection.host", cassandraHostString)
  .set("es.nodes", elasticsearchHostString)
  .set("es.port", "9200")
  .set("http.publish_host", "")
sc = new SparkContext(conf)
esClient = TransportClient.builder().build()
esClient.addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName(elasticsearchHostString), 9300))
esClient.admin().indices().preparePutTemplate(testTemplate).setSource(Source.fromInputStream(getClass.getResourceAsStream("/mytemplate.json")).mkString).execute().actionGet()
esClient.admin().indices().prepareCreate(esTestIndex).execute().actionGet()
esClient.admin().indices().prepareAliases().addAlias(esTestIndex, "hot").execute().actionGet()

However, when I try to run

EsSpark.saveToEs(
  myRdd,
  "hot/mytype",
  Map("es.mapping.id" -> "id", "es.mapping.parent" -> "parent_id")
)

I receive this stack trace:

org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[172.17.0.2:9200]]
    at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:142)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:434)
    at org.elasticsearch.hadoop.rest.RestClient.executeNotFoundAllowed(RestClient.java:442)
    at org.elasticsearch.hadoop.rest.RestClient.exists(RestClient.java:518)
    at org.elasticsearch.hadoop.rest.RestClient.touch(RestClient.java:524)
    at org.elasticsearch.hadoop.rest.RestRepository.touch(RestRepository.java:491)
    at org.elasticsearch.hadoop.rest.RestService.initSingleIndex(RestService.java:412)
    at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:400)
    at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:40)
    at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
    at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/08/08 12:30:46 WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 2, localhost): org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[172.17.0.2:9200]]
    ... (same stack trace as above)

I can verify using 'docker network inspect bridge' that it is trying to connect to the correct IP address.

docker network inspect bridge
[
    {
        "Name": "bridge",
        "Id": "ef184e3be3637be28f854c3278f1c8647be822a9413120a8957de6d2d5355de1",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.17.0.0/16",
                    "Gateway": "172.17.0.1"
                }
            ]
        },
        "Internal": false,
        "Containers": {
            "0c79680de8ef815bbe4bdd297a6f845cce97ef18bb2f2c12da7fe364906c3676": {
                "Name": "analytics_rabbitmq_1",
                "EndpointID": "3f03fdabd015fa1e2af802558aa59523f4a3c8c72f1231d07c47a6c8e60ae0d4",
                "MacAddress": "02:42:ac:11:00:04",
                "IPv4Address": "172.17.0.4/16",
                "IPv6Address": ""
            },
            "9b1f37c8df344c50e042c4b3c75fcb2774888f93fd7a77719fb286bb13f76f38": {
                "Name": "analytics_elasticsearch_1",
                "EndpointID": "fb083d27aaf8c0db1aac90c2a1ea2f752c46d8ac045e365f4b9b7d1651038a56",
                "MacAddress": "02:42:ac:11:00:02",
                "IPv4Address": "172.17.0.2/16",
                "IPv6Address": ""
            },
            "ed0cfad868dbac29bda66de6bee93e7c8caf04d623d9442737a00de0d43c372a": {
                "Name": "analytics_cassandra_1",
                "EndpointID": "2efa95980d681b3627a7c5e952e2f01980cf5ffd0fe4ba6185b2cab735784df6",
                "MacAddress": "02:42:ac:11:00:03",
                "IPv4Address": "172.17.0.3/16",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.bridge.default_bridge": "true",
            "com.docker.network.bridge.enable_icc": "true",
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
            "com.docker.network.bridge.name": "docker0",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {}
    }
]

I am running everything locally on a MacBook/OSX. I am at a loss as to why I can connect to the Docker container using the TransportClient and through my browser, yet the call to EsSpark.saveToEs(...) fails.
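For reference, here is a sketch of the one connector setting I have found that looks relevant, assuming the connector's node discovery is what hands back the container-internal IP (I have not confirmed that this fixes my setup):

import org.apache.spark.SparkConf

// Sketch (unverified for my setup): "es.nodes.wan.only" makes
// elasticsearch-hadoop (2.2+) talk only to the addresses listed in
// "es.nodes" and skip node discovery, which otherwise returns the
// container-internal bridge IP (172.17.0.2) that is unreachable
// from the OSX host.
val conf = new SparkConf()
  .setAppName("test_reindex")
  .setMaster("local")
  .set("es.nodes", elasticsearchHostString) // host-mapped address, not the container IP
  .set("es.port", "9200")
  .set("es.nodes.wan.only", "true")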

