pd client panics with "tsoStream.recvLoop internal panic" #9051

Closed
overvenus opened this issue Feb 10, 2025 · 1 comment · Fixed by #9056
Labels
affects-6.5 This bug affects the 6.5.x(LTS) versions.
affects-7.1 This bug affects the 7.1.x(LTS) versions.
affects-7.5 This bug affects the 7.5.x(LTS) versions.
affects-8.1 This bug affects the 8.1.x(LTS) versions.
affects-8.5 This bug affects the 8.5.x(LTS) versions.
severity/major
type/bug The issue is confirmed as a bug.

Comments

@overvenus
Member

Bug Report

/opt/tidb/db.log:[2025/02/09 12:14:02.117 +00:00] [FATAL] [stream.go:330] ["tsoStream.recvLoop internal panic"] [stacktrace="github.com/tikv/pd/client/clients/tso.(*tsoStream).recvLoop.func1\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20250107032658-5c4ab57d68de/clients/tso/stream.go:330\nruntime.gopanic\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.6.linux-amd64/src/runtime/panic.go:785\ngo.uber.org/zap/zapcore.CheckWriteAction.OnWrite\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/go.uber.org/zap@v1.27.0/zapcore/entry.go:196\ngo.uber.org/zap/zapcore.(*CheckedEntry).Write\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/go.uber.org/zap@v1.27.0/zapcore/entry.go:262\ngo.uber.org/zap.(*Logger).Panic\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/go.uber.org/zap@v1.27.0/logger.go:285\ngithub.com/pingcap/log.Panic\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/github.com/pingcap/log@v1.1.1-0.20241212030209-7e3ff8601a2a/global.go:54\ngithub.com/tikv/pd/client/clients/tso.(*tsoDispatcher).checkMonotonicity\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20250107032658-5c4ab57d68de/clients/tso/dispatcher.go:528\ngithub.com/tikv/pd/client/clients/tso.(*tsoDispatcher).processRequests.func2\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20250107032658-5c4ab57d68de/clients/tso/dispatcher.go:468\ngithub.com/tikv/pd/client/clients/tso.(*tsoStream).recvLoop\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20250107032658-5c4ab57d68de/clients/tso/stream.go:444"] [panicMessage="[tso] timestamp fallback"] 
[stack="github.com/tikv/pd/client/clients/tso.(*tsoStream).recvLoop.func1\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20250107032658-5c4ab57d68de/clients/tso/stream.go:330\nruntime.gopanic\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.6.linux-amd64/src/runtime/panic.go:785\ngo.uber.org/zap/zapcore.CheckWriteAction.OnWrite\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/go.uber.org/zap@v1.27.0/zapcore/entry.go:196\ngo.uber.org/zap/zapcore.(*CheckedEntry).Write\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/go.uber.org/zap@v1.27.0/zapcore/entry.go:262\ngo.uber.org/zap.(*Logger).Panic\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/go.uber.org/zap@v1.27.0/logger.go:285\ngithub.com/pingcap/log.Panic\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/github.com/pingcap/log@v1.1.1-0.20241212030209-7e3ff8601a2a/global.go:54\ngithub.com/tikv/pd/client/clients/tso.(*tsoDispatcher).checkMonotonicity\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20250107032658-5c4ab57d68de/clients/tso/dispatcher.go:528\ngithub.com/tikv/pd/client/clients/tso.(*tsoDispatcher).processRequests.func2\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20250107032658-5c4ab57d68de/clients/tso/dispatcher.go:468\ngithub.com/tikv/pd/client/clients/tso.(*tsoStream).recvLoop\n\t/home/jenkins/agent/workspace/build_tidb_multi_branch_master/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20250107032658-5c4ab57d68de/clients/tso/stream.go:444"]
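For readers unfamiliar with the panic message: the stack trace shows the panic is raised from `checkMonotonicity` in the client's TSO dispatcher with the message "[tso] timestamp fallback". A minimal sketch of what such a check does follows. The type `monotonicGuard` and the helper `composeTS` are hypothetical names used only for illustration, not the PD client's actual code, and the sketch assumes timestamps must strictly increase and that the low 18 bits hold the logical counter.

```go
package main

import "fmt"

// logicalBits is the number of low bits assumed to hold the logical counter.
const logicalBits = 18

// composeTS packs a physical time (in ms) and a logical counter into one value.
// Hypothetical helper for this sketch.
func composeTS(physical, logical int64) uint64 {
	return uint64(physical<<logicalBits + logical)
}

// monotonicGuard remembers the largest timestamp observed so far.
// Hypothetical type; it is not part of the PD client.
type monotonicGuard struct {
	last uint64
}

// observe reports an error when ts is not strictly greater than the
// previously observed timestamp, i.e. when the timestamp "falls back".
func (g *monotonicGuard) observe(ts uint64) error {
	if ts <= g.last {
		return fmt.Errorf("timestamp fallback: last=%d current=%d", g.last, ts)
	}
	g.last = ts
	return nil
}

func main() {
	g := &monotonicGuard{}
	// A timestamp from the current allocator is accepted.
	fmt.Println(g.observe(composeTS(1000, 5)))
	// A smaller timestamp, e.g. from a second node that also believes it may
	// serve TSO, is rejected.
	fmt.Println(g.observe(composeTS(999, 7)))
}
```

In the real client the failed check is escalated to a panic rather than returned as an error, which is what the FATAL log line above records.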

Test: https://tcms.pingcap.net/dashboard/executions/case/13077637

What did you do?

Run jepsen tests.

What did you expect to see?

No panic.

What version of PD are you using (pd-server -V)?

TiDB: c292ec642b3fd69904c4d5223415ba6a2a040a51
PD: debceaf

@overvenus overvenus added the type/bug The issue is confirmed as a bug. label Feb 10, 2025
@rleungx rleungx added affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.5 This bug affects the 8.5.x(LTS) versions. labels Feb 11, 2025
@ti-chi-bot ti-chi-bot bot closed this as completed in aa7a4c6 Feb 11, 2025
ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this issue Feb 11, 2025
close tikv#9051

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
ti-chi-bot bot pushed a commit that referenced this issue Feb 12, 2025
…9068)

close #9051

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
Signed-off-by: Ryan Leung <rleungx@gmail.com>

Co-authored-by: Ryan Leung <rleungx@gmail.com>
@rleungx
Member

rleungx commented Feb 13, 2025

This happens because a PD node can provide TSO even when it is not the PD leader.

From the log, we can see that pd 5 wrote the leader key successfully but then failed to rebase the ID. After that, it might still provide TSO even though it is not the PD leader.

if !gta.member.GetLeadership().Check() {

Here we use only the lease to determine whether the node can provide TSO, which is not accurate.
