2.3. Data Rebalancing

2.3.1. Automatic Data Rebalancing

Automatic rebalancing is the default mode. The rebalancing process starts automatically after nodes are added (by default, unless the --no-rebalance option is specified) or before a node is removed. Rebalancing can also be started manually. The goal of rebalancing is to distribute the partitions of each sharded table evenly across the replication groups.
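
For example, to add nodes while postponing rebalancing, pass --no-rebalance when adding them. The sketch below is illustrative only: the node names shrn4 and shrn5 are hypothetical, and the exact nodes add syntax may differ in your shardmanctl version:

 $ shardmanctl nodes add -n shrn4,shrn5 --no-rebalance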

For each sharded table, the rebalancing process iteratively finds the replication groups with the maximum and minimum number of partitions and creates a task to move one partition to the replication group with the minimum number of partitions. This repeats as long as the condition max - min > 1 holds. Partitions are moved using logical replication. Partitions of colocated tables are moved together with the partitions of the sharded tables they reference.
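
The following Go sketch only illustrates this planning loop; it is not the actual Shardman implementation, and all names (planMoves, moveTask, the example counts) are made up for illustration:

 package main

 import "fmt"

 // moveTask describes one planned move of a single partition between
 // replication groups.
 type moveTask struct {
     srcRG, dstRG int
 }

 // planMoves levels partition counts per replication group until the
 // difference between the most and least loaded groups is at most one.
 func planMoves(partsPerRG map[int]int) []moveTask {
     var tasks []moveTask
     for {
         maxRG, minRG := -1, -1
         for rg, n := range partsPerRG {
             if maxRG == -1 || n > partsPerRG[maxRG] {
                 maxRG = rg
             }
             if minRG == -1 || n < partsPerRG[minRG] {
                 minRG = rg
             }
         }
         // Stop when max - min <= 1: the table is considered balanced.
         if partsPerRG[maxRG]-partsPerRG[minRG] <= 1 {
             return tasks
         }
         // Plan moving one partition from the most loaded group to the
         // least loaded one and repeat.
         partsPerRG[maxRG]--
         partsPerRG[minRG]++
         tasks = append(tasks, moveTask{srcRG: maxRG, dstRG: minRG})
     }
 }

 func main() {
     // 24 partitions spread over two replication groups plus a newly
     // added empty one: the planner schedules 8 moves into group 3.
     fmt.Println(planMoves(map[int]int{1: 12, 2: 12, 3: 0}))
 }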

Keep in mind that max_logical_replication_workers must be set high enough, because the rebalancing process uses up to max(max_replication_slots, max_logical_replication_workers, max_worker_processes, max_wal_senders)/3 parallel threads. In practice, you can use max_logical_replication_workers = Repfactor + 3 * task_num, where task_num is the number of parallel rebalance tasks.
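
For example, with a replication factor of 2 and three parallel rebalance tasks (task_num = 3), this rule of thumb gives max_logical_replication_workers = 2 + 3 * 3 = 11.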

To manually rebalance the sharded tables in the cluster0 cluster, run the following command (where etcd1, etcd2, and etcd3 are nodes of the etcd cluster):

 $ shardmanctl --store-endpoints http://etcd1:2379,http://etcd2:2379,http://etcd3:2379 rebalance 

If the process fails, call the shardmanctl cleanup command with the --after-rebalance option.
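
For example, using the same etcd endpoints as above:

 $ shardmanctl --store-endpoints http://etcd1:2379,http://etcd2:2379,http://etcd3:2379 cleanup --after-rebalance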

2.3.2. Manual Data Rebalancing

In some cases, you may need to place the partitions of sharded tables on specific cluster nodes. For this task, Shardman provides a manual data rebalancing mode.

How it works:

  1. Get the list of sharded tables with the shardmanctl tables sharded list command. The output will look like this:

     $ shardmanctl tables sharded list

     Sharded tables:
        public.doc
        public.resolution
        public.users
  2. Request information about the selected sharded tables. Example:

     $ shardmanctl tables sharded info -t public.users

     Table public.users

     Partitions:

     Partition  RgID  Shard           Master
     0          1     clover-1-shrn1  shrn1:5432
     1          2     clover-2-shrn2  shrn2:5432
     2          3     clover-3-shrn3  shrn3:5432
     3          1     clover-1-shrn1  shrn1:5432
     4          2     clover-2-shrn2  shrn2:5432
     5          3     clover-3-shrn3  shrn3:5432
     6          1     clover-1-shrn1  shrn1:5432
     7          2     clover-2-shrn2  shrn2:5432
     8          3     clover-3-shrn3  shrn3:5432
     9          1     clover-1-shrn1  shrn1:5432
     10         2     clover-2-shrn2  shrn2:5432
     11         3     clover-3-shrn3  shrn3:5432
     12         1     clover-1-shrn1  shrn1:5432
     13         2     clover-2-shrn2  shrn2:5432
     14         3     clover-3-shrn3  shrn3:5432
     15         1     clover-1-shrn1  shrn1:5432
     16         2     clover-2-shrn2  shrn2:5432
     17         3     clover-3-shrn3  shrn3:5432
     18         1     clover-1-shrn1  shrn1:5432
     19         2     clover-2-shrn2  shrn2:5432
     20         3     clover-3-shrn3  shrn3:5432
     21         1     clover-1-shrn1  shrn1:5432
     22         2     clover-2-shrn2  shrn2:5432
     23         3     clover-3-shrn3  shrn3:5432
  3. Move the partition to a new shard, as shown below:

     $ shardmanctl --log-level debug tables sharded partmove -t public.users --partnum 1 --shard clover-1-shrn1

     2023-07-26T06:00:36.900Z DEBUG cmd/common.go:105 Waiting for metadata lock...
     2023-07-26T06:00:36.936Z DEBUG rebalance/service.go:256 take extension lock
     2023-07-26T06:00:36.938Z DEBUG broadcaster/worker.go:33 start broadcaster worker for repgroup id=3
     2023-07-26T06:00:36.938Z DEBUG broadcaster/worker.go:33 start broadcaster worker for repgroup id=2
     2023-07-26T06:00:36.938Z DEBUG broadcaster/worker.go:33 start broadcaster worker for repgroup id=1
     2023-07-26T06:00:36.951Z DEBUG broadcaster/worker.go:51 repgroup 3 connect established
     2023-07-26T06:00:36.951Z DEBUG broadcaster/worker.go:51 repgroup 2 connect established
     2023-07-26T06:00:36.952Z DEBUG broadcaster/worker.go:51 repgroup 1 connect established
     2023-07-26T06:00:36.952Z DEBUG extension/lock.go:35 Waiting for extension lock...
     2023-07-26T06:00:36.976Z INFO rebalance/service.go:276 Performing move partition...
     2023-07-26T06:00:36.977Z DEBUG broadcaster/worker.go:33 start broadcaster worker for repgroup id=3
     2023-07-26T06:00:36.978Z DEBUG broadcaster/worker.go:33 start broadcaster worker for repgroup id=2
     2023-07-26T06:00:36.978Z DEBUG broadcaster/worker.go:33 start broadcaster worker for repgroup id=1
     2023-07-26T06:00:36.987Z DEBUG broadcaster/worker.go:51 repgroup 1 connect established
     2023-07-26T06:00:36.989Z DEBUG broadcaster/worker.go:51 repgroup 2 connect established
     2023-07-26T06:00:36.992Z DEBUG broadcaster/worker.go:51 repgroup 3 connect established
     2023-07-26T06:00:36.992Z DEBUG rebalance/service.go:71 Performing cleanup after possible rebalance operation failure
     2023-07-26T06:00:37.077Z DEBUG broadcaster/worker.go:75 finish broadcaster worker for repgroup id=3
     2023-07-26T06:00:37.077Z DEBUG broadcaster/worker.go:75 finish broadcaster worker for repgroup id=1
     2023-07-26T06:00:37.077Z DEBUG broadcaster/worker.go:75 finish broadcaster worker for repgroup id=2
     2023-07-26T06:00:37.082Z DEBUG rebalance/service.go:422 Rebalance will run 1 tasks
     2023-07-26T06:00:37.095Z DEBUG rebalance/service.go:452 Guessing that rebalance() can use 3 workers
     2023-07-26T06:00:37.096Z DEBUG rebalance/job.go:352 state: Idle {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
     2023-07-26T06:00:37.111Z DEBUG rebalance/job.go:352 state: ConnsEstablished {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
     2023-07-26T06:00:37.171Z DEBUG rebalance/job.go:352 state: WaitInitCopy {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
     2023-07-26T06:00:38.073Z DEBUG rebalance/job.go:347 current state {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move", "state": "WaitInitialCatchup"}
     2023-07-26T06:00:38.073Z DEBUG rebalance/job.go:352 state: WaitInitialCatchup {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
     2023-07-26T06:00:38.084Z DEBUG rebalance/job.go:347 current state {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move", "state": "WaitFullSync"}
     2023-07-26T06:00:38.084Z DEBUG rebalance/job.go:352 state: WaitFullSync {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
     2023-07-26T06:00:38.108Z DEBUG rebalance/job.go:347 current state {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move", "state": "Committing"}
     2023-07-26T06:00:38.108Z DEBUG rebalance/job.go:352 state: Committing {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
     2023-07-26T06:00:38.254Z DEBUG rebalance/job.go:352 state: Complete {"worker_id": 1, "table": "users", "partition num": 1, "source rgid": 2, "dest rgid": 1, "kind": "move"}
     2023-07-26T06:00:38.258Z DEBUG rebalance/service.go:583 Produce and process tasks on destination replication groups...
     2023-07-26T06:00:38.258Z DEBUG rebalance/service.go:594 Produce and process tasks on source replication groups...
     2023-07-26T06:00:38.258Z DEBUG rebalance/service.go:606 wait all tasks finish
     2023-07-26T06:00:38.258Z DEBUG rebalance/service.go:531 Analyzing table public.users in rg 1 {"table": "public.users", "rgid": 1, "action": "analyze"}
     2023-07-26T06:00:38.573Z DEBUG rebalance/service.go:531 Analyzing table public.users in rg 2 {"table": "public.users", "rgid": 2, "action": "analyze"}
     2023-07-26T06:00:38.833Z DEBUG broadcaster/worker.go:75 finish broadcaster worker for repgroup id=1
     2023-07-26T06:00:38.833Z DEBUG broadcaster/worker.go:75 finish broadcaster worker for repgroup id=2
     2023-07-26T06:00:38.833Z DEBUG broadcaster/worker.go:75 finish broadcaster worker for repgroup id=3

    In this example, partition number 1 of the public.users table is moved to the clover-1-shrn1 shard.

    After a partition of a sharded table is moved manually, automatic data rebalancing is disabled for that table and for all tables colocated with it.

To get the list of tables with automatic rebalancing disabled, run the shardmanctl tables sharded norebalance command. Example:

 $ shardmanctl tables sharded norebalance

 public.users

To enable automatic data rebalancing for a selected sharded table, run the shardmanctl tables sharded rebalance command, as shown in the example below:

 $ shardmanctl tables sharded rebalance -t public.users

 2023-07-26T07:07:00.657Z DEBUG cmd/common.go:105 Waiting for metadata lock...
 2023-07-26T07:07:00.687Z DEBUG broadcaster/worker.go:33 start broadcaster worker for repgroup id=1
 2023-07-26T07:07:00.687Z DEBUG broadcaster/worker.go:33 start broadcaster worker for repgroup id=2
 2023-07-26T07:07:00.687Z DEBUG broadcaster/worker.go:33 start broadcaster worker for repgroup id=3
 2023-07-26T07:07:00.697Z DEBUG broadcaster/worker.go:51 repgroup 1 connect established
 2023-07-26T07:07:00.698Z DEBUG broadcaster/worker.go:51 repgroup 2 connect established
 2023-07-26T07:07:00.698Z DEBUG broadcaster/worker.go:51 repgroup 3 connect established
 2023-07-26T07:07:00.698Z DEBUG extension/lock.go:35 Waiting for extension lock...
 2023-07-26T07:07:00.719Z DEBUG rebalance/service.go:381 Planned moving pnum 21 for table users from rg 1 to rg 2
 2023-07-26T07:07:00.719Z INFO rebalance/service.go:244 Performing rebalance...
 2023-07-26T07:07:00.720Z DEBUG broadcaster/worker.go:33 start broadcaster worker for repgroup id=1
 2023-07-26T07:07:00.720Z DEBUG broadcaster/worker.go:33 start broadcaster worker for repgroup id=2
 2023-07-26T07:07:00.720Z DEBUG broadcaster/worker.go:33 start broadcaster worker for repgroup id=3
 2023-07-26T07:07:00.732Z DEBUG broadcaster/worker.go:51 repgroup 3 connect established
 2023-07-26T07:07:00.732Z DEBUG broadcaster/worker.go:51 repgroup 1 connect established
 2023-07-26T07:07:00.734Z DEBUG broadcaster/worker.go:51 repgroup 2 connect established
 2023-07-26T07:07:00.734Z DEBUG rebalance/service.go:71 Performing cleanup after possible rebalance operation failure
 2023-07-26T07:07:00.791Z DEBUG broadcaster/worker.go:75 finish broadcaster worker for repgroup id=1
 2023-07-26T07:07:00.791Z DEBUG broadcaster/worker.go:75 finish broadcaster worker for repgroup id=2
 2023-07-26T07:07:00.791Z DEBUG broadcaster/worker.go:75 finish broadcaster worker for repgroup id=3
 2023-07-26T07:07:00.795Z DEBUG rebalance/service.go:422 Rebalance will run 1 tasks
 2023-07-26T07:07:00.809Z DEBUG rebalance/service.go:452 Guessing that rebalance() can use 3 workers
 2023-07-26T07:07:00.809Z DEBUG rebalance/job.go:352 state: Idle {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
 2023-07-26T07:07:00.823Z DEBUG rebalance/job.go:352 state: ConnsEstablished {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
 2023-07-26T07:07:00.880Z DEBUG rebalance/job.go:352 state: WaitInitCopy {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
 2023-07-26T07:07:01.886Z DEBUG rebalance/job.go:347 current state {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move", "state": "WaitInitialCatchup"}
 2023-07-26T07:07:01.886Z DEBUG rebalance/job.go:352 state: WaitInitialCatchup {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
 2023-07-26T07:07:01.904Z DEBUG rebalance/job.go:347 current state {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move", "state": "WaitFullSync"}
 2023-07-26T07:07:01.905Z DEBUG rebalance/job.go:352 state: WaitFullSync {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
 2023-07-26T07:07:01.932Z DEBUG rebalance/job.go:347 current state {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move", "state": "Committing"}
 2023-07-26T07:07:01.932Z DEBUG rebalance/job.go:352 state: Committing {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
 2023-07-26T07:07:02.057Z DEBUG rebalance/job.go:352 state: Complete {"worker_id": 1, "table": "users", "partition num": 21, "source rgid": 1, "dest rgid": 2, "kind": "move"}
 2023-07-26T07:07:02.060Z DEBUG rebalance/service.go:583 Produce and process tasks on destination replication groups...
 2023-07-26T07:07:02.060Z DEBUG rebalance/service.go:594 Produce and process tasks on source replication groups...
 2023-07-26T07:07:02.060Z DEBUG rebalance/service.go:531 Analyzing table public.users in rg 2 {"table": "public.users", "rgid": 2, "action": "analyze"}
 2023-07-26T07:07:02.060Z DEBUG rebalance/service.go:606 wait all tasks finish
 2023-07-26T07:07:02.321Z DEBUG rebalance/service.go:531 Analyzing table public.users in rg 1 {"table": "public.users", "rgid": 1, "action": "analyze"}
 2023-07-26T07:07:02.587Z DEBUG broadcaster/worker.go:75 finish broadcaster worker for repgroup id=3
 2023-07-26T07:07:02.587Z DEBUG broadcaster/worker.go:75 finish broadcaster worker for repgroup id=2
 2023-07-26T07:07:02.587Z DEBUG broadcaster/worker.go:75 finish broadcaster worker for repgroup id=1

To enable automatic data rebalancing for all sharded tables, run the shardmanctl rebalance command with the --force option:

 $ shardmanctl rebalance --force