Describe the bug
When a configuration snapshot is triggered and tedge does not have permission to write to or create /var/tedge/file-transfer/<device_id>/, the file-transfer service returns a 500 Internal Server Error. The uploader code appears to treat this as a transient error, so it retries the upload even though it is essentially guaranteed to fail without human intervention.
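To Reproduce
1. Make /var/tedge/file-transfer/<device_id>/ unwritable by tedge, for example:
   sudo chown root /var/tedge/file-transfer/ && sudo chmod 755 /var/tedge/file-transfer
2. Trigger a configuration-snapshot operation in Cumulocity
The operation gets stuck in the executing state. You can confirm there is a permissions error by checking the tedge-agent logs: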
$ systemctl status tedge-agent
Jan 31 17:45:21 james-Latitude-5511 tedge-agent[687003]: Permission denied (os error 13)
Jan 31 17:45:21 james-Latitude-5511 tedge-agent[687003]: 2025-01-31T17:45:21.24358042Z WARN upload::upload: Temporary failure: HTTP status server error (500 Internal Server Error) for url (http://127.0.0.1:8000/tedge/file-transfer/jrh-test.5/config_snapshot/tedge-configuration-plugin-c8y-mapper-141376). Retrying in 22s
Jan 31 17:45:43 james-Latitude-5511 tedge-agent[687003]: 2025-01-31T17:45:43.741483458Z ERROR tedge_agent::http_server::error: Request to upload to "jrh-test.5/config_snapshot/tedge-configuration-plugin-c8y-mapper-141376" failed: Directory Error. Check permissions for /var/tedge/file-transfer/jrh-test.5/config_snapshot.
Jan 31 17:45:43 james-Latitude-5511 tedge-agent[687003]: Caused by:
Jan 31 17:45:43 james-Latitude-5511 tedge-agent[687003]: Permission denied (os error 13)
Jan 31 17:45:43 james-Latitude-5511 tedge-agent[687003]: 2025-01-31T17:45:43.741968038Z WARN upload::upload: Temporary failure: HTTP status server error (500 Internal Server Error) for url (http://127.0.0.1:8000/tedge/file-transfer/jrh-test.5/config_snapshot/tedge-configuration-plugin-c8y-mapper-141376). Retrying in 36s
Jan 31 17:46:20 james-Latitude-5511 tedge-agent[687003]: 2025-01-31T17:46:20.488489158Z ERROR tedge_agent::http_server::error: Request to upload to "jrh-test.5/config_snapshot/tedge-configuration-plugin-c8y-mapper-141376" failed: Directory Error. Check permissions for /var/tedge/file-transfer/jrh-test.5/config_snapshot.
Jan 31 17:46:20 james-Latitude-5511 tedge-agent[687003]: Caused by:
Jan 31 17:46:20 james-Latitude-5511 tedge-agent[687003]: Permission denied (os error 13)
Jan 31 17:46:20 james-Latitude-5511 tedge-agent[687003]: 2025-01-31T17:46:20.489027966Z WARN upload::upload: Temporary failure: HTTP status server error (500 Internal Server Error) for url (http://127.0.0.1:8000/tedge/file-transfer/jrh-test.5/config_snapshot/tedge-configuration-plugin-c8y-mapper-141376). Retrying in 45s
Expected behavior
If the uploader cannot upload to the file-transfer service because of a misconfiguration such as this, it should abort the request rather than retry it. I suspect we want different transient-error detection behaviour depending on whether we are connecting to an internal API (where it is probably reasonable to assume a 500 is a permanent error) or an external one (including the Cumulocity proxy, as that will echo the status code from Cumulocity).
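A minimal sketch of that split, to make the suggestion concrete. The names `Target`, `Action` and `classify` are hypothetical and are not taken from the existing uploader code. Retrying on 408 and 429 is also an assumption, not something observed in the logs above.

```rust
/// Where an upload request is being sent (hypothetical classification).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Target {
    /// A local thin-edge service, such as the file-transfer service.
    Internal,
    /// A remote service, including anything reached via the Cumulocity proxy.
    External,
}

/// What the uploader should do after receiving an error status.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    Retry,
    Abort,
}

/// Decide whether an HTTP error status is worth retrying.
fn classify(status: u16, target: Target) -> Action {
    match (target, status) {
        // Internal API: assume a 500 reflects a local misconfiguration
        // (e.g. directory permissions) that a retry will not fix.
        (Target::Internal, 500) => Action::Abort,
        // Any other server error may be transient, so keep retrying.
        (_, 500..=599) => Action::Retry,
        // Assumption: request timeout and rate limiting are retryable.
        (_, 408 | 429) => Action::Retry,
        // Everything else (e.g. 403, 404) is treated as permanent.
        _ => Action::Abort,
    }
}
```

With a rule like this, the 500 returned by http://127.0.0.1:8000/tedge/file-transfer/... in the logs above would map to `Action::Abort`, while a 500 echoed from Cumulocity through the proxy would still be retried.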