The esm GitHub repository is linked below; it contains a detailed introduction.
https://github.com/medcl/esm
Part of the migration had been done earlier, but the new cluster only went into official use a week later, so the data from that intervening week was never migrated. By that point, data was already being written into the new ES cluster.
There are two options: 1. Run a full migration again. Nothing needs to be deleted; documents with the same ID are simply overwritten. 2. If you remember roughly when the first migration was done, use the -q option to migrate only the data from that time range, as in the example below.
esm -s http://xxx.xxx.xxx.xxx:9200 -q "time_field:[1624932000000 TO 1625464800000]" -d http://xxx.xxx.xxx.xxx:9200 -n username:password -x index_name -w=5 -b=10 -c 10000
If you have forgotten the time, use the first option; if you remember the approximate time, the second one is enough.
In my case I only remembered that the initial migration was done last Tuesday, not the exact time, so I re-migrated everything from 00:00 that Tuesday up to now; duplicate documents are overwritten automatically.
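The range boundaries in the -q filter above are epoch timestamps in milliseconds. A minimal sketch for producing them, assuming the time field stores epoch milliseconds and GNU date is available (time_field and the date are placeholders, not values from the original migration):

    # start of the range: last Tuesday at 00:00 local time, as epoch milliseconds
    start=$(date -d "2021-06-29 00:00:00" +%s%3N)
    # end of the range: right now, as epoch milliseconds
    end=$(date +%s%3N)
    # query string to pass to esm via -q
    echo "time_field:[${start} TO ${end}]"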
Options: these are listed in the official documentation; reproduced here for reference.
-s, --source= source elasticsearch instance, ie: http://localhost:9200
-q, --query= query against source elasticsearch instance, filter data before migrate, ie: name:medcl
-d, --dest= destination elasticsearch instance, ie: http://localhost:9201
-m, --source_auth= basic auth of source elasticsearch instance, ie: user:pass
-n, --dest_auth= basic auth of target elasticsearch instance, ie: user:pass
-c, --count= number of documents at a time: ie "size" in the scroll request (10000)
--buffer_count= number of buffered documents in memory (100000)
-w, --workers= concurrency number for bulk workers (1)
-b, --bulk_size= bulk size in MB (5)
-t, --time= scroll time (1m)
--sliced_scroll_size= size of sliced scroll, to make it work, the size should be > 1 (1)
-f, --force delete destination index before copying
-a, --all copy indexes starting with . and _
--copy_settings copy index settings from source
--copy_mappings copy index mappings from source
--shards= set a number of shards on newly created indexes
-x, --src_indexes= indexes name to copy,support regex and comma separated list (_all)
-y, --dest_index= indexes name to save, allow only one indexname, original indexname will be used if not specified
-u, --type_override= override type name
--green wait for both hosts cluster status to be green before dump. otherwise yellow is okay
-v, --log= setting log level,options:trace,debug,info,warn,error (INFO)
-o, --output_file= output documents of source index into local file
-i, --input_file= indexing from local dump file
--input_file_type= the data type of input file, options: dump, json_line, json_array, log_line (dump)
--source_proxy= set proxy to source http connections, ie: http://127.0.0.1:8080
--dest_proxy= set proxy to target http connections, ie: http://127.0.0.1:8080
--refresh refresh after migration finished
--fields= filter source fields, comma separated, ie: col1,col2,col3,...
--rename= rename source fields, comma separated, ie: _type:type, name:myname
-l, --logstash_endpoint= target logstash tcp endpoint, ie: 127.0.0.1:5055
--secured_logstash_endpoint target logstash tcp endpoint was secured by TLS
--repeat_times= repeat the data from source N times to dest output, use align with parameter regenerate_id to amplify the data size
-r, --regenerate_id regenerate id for documents, this will override the exist document id in data source
--compress use gzip to compress traffic
-p, --sleep= sleep N seconds after finished a bulk request (-1)
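For option 1 (a full re-migration), a command along the following lines should work. This is a minimal sketch reusing the same placeholder hosts, credentials, and index name as the filtered example above, with only flags from the option list:

    # full re-migration: no -q filter, copy mappings, refresh the destination index when done
    esm -s http://xxx.xxx.xxx.xxx:9200 -d http://xxx.xxx.xxx.xxx:9200 -n username:password -x index_name --copy_mappings --refresh -w=5 -b=10 -c 10000

Since document IDs are preserved, re-running this over data that already exists in the destination just overwrites the same documents.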