tests: fix flaky tests in test_running_cluster.py
The problem occurs when the cluster is stopped (with either 'tt stop' or
'tt restart') before bootstrap has completed.
In this scenario, when a replica instance is stopped with either of
these commands, the corresponding master instance exits immediately
with a critical failure such as:

F> can't initialize storage: Can't check who replica <...> at <...> chose its bootstrap leader

As a result, 'tt stop/restart' for the master instance doesn't print the
termination message:

The Instance small_cluster_app:storage-master (PID = <...>) has been terminated

because this message is printed only for an instance that is still running.

This commit makes the tests wait for cluster bootstrap to complete
after the cluster is started or restarted.

Closes #1000
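
The wait described above amounts to polling a predicate until it holds or a timeout expires. The block below is a minimal sketch of that pattern, not the project's code: `poll_until` is a hypothetical stand-in for the `wait_event` helper (whose implementation is not shown in this commit, only its call shape `wait_event(5, callable)`), and `all_box_running` is an illustrative pure-function version of the status check that takes an already parsed status dictionary.

```python
import time


def poll_until(timeout, predicate, interval=0.1):
    """Call predicate() until it returns a truthy value or `timeout` seconds pass.

    Hypothetical stand-in for the wait_event helper used by the test;
    the real helper's signature and behaviour may differ.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def all_box_running(status_info, app_name, instances):
    """Return True only if every instance reports BOX == "running".

    `status_info` is assumed to map "<app>:<instance>" to a dict of fields,
    as in the test's extract_status() output. An instance missing from the
    mapping counts as not running instead of raising KeyError.
    """
    return all(
        status_info.get(f"{app_name}:{inst}", {}).get("BOX") == "running"
        for inst in instances
    )
```

Using a monotonic clock keeps the timeout immune to wall-clock adjustments, and checking the predicate before the deadline guarantees at least one evaluation even with a very short timeout.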
elhimov authored and dmyger committed Nov 13, 2024
1 parent 9a56eaa commit 5347a00
Showing 1 changed file with 13 additions and 1 deletion.
14 changes: 13 additions & 1 deletion test/integration/running/test_running_cluster.py
@@ -8,7 +8,7 @@
 
 from utils import (control_socket, extract_status, get_tarantool_version,
                    lib_path, log_path, run_command_and_get_output, run_path,
-                   wait_files, wait_pid_disappear)
+                   wait_event, wait_files, wait_pid_disappear)
 
 tarantool_major_version, tarantool_minor_version = get_tarantool_version()
 
@@ -45,6 +45,18 @@ def wait_cluster_started(tt_cmd, workdir, app_name, instances, inst_conf):
         files.append(conf['console_socket'])
     assert wait_files(5, files)
 
+    def are_all_box_statuses_running():
+        status_cmd = [tt_cmd, "status", app_name]
+        status_rc, status_out = run_command_and_get_output(status_cmd, cwd=workdir)
+        assert status_rc == 0
+        status_info = extract_status(status_out)
+        for inst in instances:
+            inst_id = app_name + ":" + inst
+            if status_info[inst_id].get("BOX") != "running":
+                return False
+        return True
+    assert wait_event(5, are_all_box_statuses_running)
+
 
 def default_inst_conf(workdir, app_name, inst):
     app_path = os.path.join(workdir, app_name)