Using HTTP POST, the following request inserts a new field createtime if the document does not exist yet, or updates only lastupdatetime if it does:
curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
  "doc": {
    "lastupdatetime": "2015-09-16T18:00:00"
  },
  "upsert": {
    "createtime": "2015-09-16T18:00:00",
    "lastupdatetime": "2015-09-16T18:00"
  }
}'
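For reference, the same request body can be built in Python, which avoids hand-written JSON mistakes such as missing or trailing commas. This is only a sketch: the helper name build_update_body is made up for illustration, and it assumes both timestamps are the same value at insert time.

```python
import json


def build_update_body(timestamp):
    """Build an _update request body (hypothetical helper).

    `doc` is merged into the document when it already exists;
    `upsert` is indexed as-is when it does not exist yet.
    """
    return {
        "doc": {"lastupdatetime": timestamp},
        "upsert": {"createtime": timestamp, "lastupdatetime": timestamp},
    }


body = build_update_body("2015-09-16T18:00:00")
payload = json.dumps(body, indent=2)  # valid JSON, usable as the -d argument
print(payload)
```

Note that createtime appears only under upsert, so an existing document never has it overwritten.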
But in the Spark script, after setting "es.write.operation": "upsert", I do not know how to insert createtime at all. The documentation only mentions es.update.script.* ... So, can anyone give me an example?
UPDATE: In my case, I want to save information about Android devices from the logs into one Elasticsearch type and set the time of first appearance as createtime. If the device appears again, I update only lastupdatetime and leave createtime as it is.
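To make the intended behavior concrete, here is a plain-Python sketch that models it with an in-memory dict instead of Elasticsearch. The function name record_device_seen and the device id "android-123" are made up for illustration; nothing here talks to a cluster.

```python
def record_device_seen(store, device_id, seen_at):
    """Model the desired upsert: set createtime once, always refresh lastupdatetime."""
    existing = store.get(device_id)
    if existing is None:
        # First appearance: insert both fields.
        store[device_id] = {"createtime": seen_at, "lastupdatetime": seen_at}
    else:
        # Device already known: touch only lastupdatetime.
        existing["lastupdatetime"] = seen_at
    return store[device_id]


devices = {}
record_device_seen(devices, "android-123", "2015-09-16T18:00:00")
record_device_seen(devices, "android-123", "2015-09-17T09:30:00")
print(devices["android-123"])
# {'createtime': '2015-09-16T18:00:00', 'lastupdatetime': '2015-09-17T09:30:00'}
```

This is the result I would like the Spark job to produce for each device id.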
So the document id is the Android ID: if it exists, update lastupdatetime; otherwise insert createtime and lastupdatetime. Here is the configuration (in Python):
conf = {
    "es.resource.write": "stats-device/activation",
    "es.nodes": "NODE1:9200",
    "es.write.operation": "upsert",
    "es.mapping.id": "id"
}
rdd.saveAsNewAPIHadoopFile(
    path='-',
    outputFormatClass="org.elasticsearch.hadoop.mr.EsOutputFormat",
    keyClass="org.apache.hadoop.io.NullWritable",
    valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
    conf=conf
)
I just don't know how to insert a new field only when the id doesn't exist yet.