However, I get this error related to the /health endpoint:
$ kubectl describe pod tgi-tpu
...
Warning Unhealthy 109s (x13 over 41m) kubelet Liveness probe failed: Get "http://10.60.7.24:80/health": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
I also tested whether I could reach the endpoint using curl. When I send a /generate request first, the /health request returns successfully:
$ curl 127.0.0.1:8080/generate -X POST -d '{"inputs":"What is Love?","parameters":{"max_new_tokens":40}}' -H 'Content-Type: application/json'
{"generated_text":"\n\nLove is a feeling of affection for someone or something.\n\nLove is a feeling of affection for someone or something.\n\nLove is a feeling of affection for someone or"}
$ curl 127.0.0.1:8080/health -X GET
$
However, if I don't do a /generate request beforehand, the /health request never returns.
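To tell a hanging /health apart from one that is unreachable, it can help to probe it with an explicit client-side timeout, much like the kubelet does. Below is a minimal sketch, assuming the server listens on 127.0.0.1:8080 as in the curl commands above. `check_health` is a hypothetical helper written for illustration and is not part of TGI or Optimum TPU.

```python
import socket
import urllib.error
import urllib.request


def check_health(base_url: str, timeout: float = 5.0) -> str:
    """Probe GET {base_url}/health and classify the outcome.

    Returns one of: "healthy", "unhealthy", "timeout", "unreachable".
    Hypothetical helper for debugging; the names are illustrative.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return "healthy" if resp.status == 200 else "unhealthy"
    except urllib.error.HTTPError:
        # The server answered, but with a 4xx/5xx status.
        return "unhealthy"
    except (socket.timeout, TimeoutError):
        # No response headers within `timeout`: the hang described above.
        return "timeout"
    except urllib.error.URLError as exc:
        # Connection-level failure; a connect timeout is wrapped in URLError.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "timeout"
        return "unreachable"
    except OSError:
        # For example, the connection was closed before any response arrived.
        return "unreachable"


# Example (assumed address, taken from the curl commands above):
# print(check_health("http://127.0.0.1:8080", timeout=5.0))
```

If this reports "timeout" before any /generate request and "healthy" afterwards, that matches the behavior described here and the kubelet's "context deadline exceeded" message.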
should work if you rebuild the containers, thanks for flagging! Also see the original issue where this was listed at #65 (comment), and kudos to @tengomucho for solving those 👏🏻
Hi @Edwinhr716, @alvarobartt is right: we have put some effort into improving TGI robustness with optimum-tpu. The latest release should be the most solid one; let us know if you still see the issue.
I'm planning on using the /health endpoint for liveness and readiness probes for my Kubernetes deployments, but I've been running into issues. This is the deployment that I'm testing.
Looking at the router code, it looks like this path is not working properly on Optimum TPU: https://github.com/huggingface/text-generation-inference/blob/main/router/src/infer/health.rs#L27