Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wakepy freezes ubuntu when using it repeatedly (10k times) in a loop on Ubuntu (Gnome SessionManager) #277

Open
fohrloop opened this issue May 9, 2024 · 4 comments
Assignees
Labels
Type: Bug Something isn't working

Comments

@fohrloop
Copy link
Owner

fohrloop commented May 9, 2024

Found this one accidentally when debugging #276. This will freeze Ubuntu 22.04, GNOME 42.9, development version of 0.8.0 wakepy (for example c574e52):

from wakepy import keep
      
for i in range(10_000):
    with keep.running() as m:
         print(i)

About the behavior:

  • It's not specific to wakepy. The same behavior can be reproduced by using just jeepney or dbus.
  • If it does not freeze the system, it will make mouse cursor laggy. This lag is removed after logout -> login.
  • Adding delay between function calls fixes the problem. For example, using 50ms delay and 80k iterations will not reproduce the issue. See this comment.
  • Same behavior on Ubuntu with org.freedesktop.ScreenSaver (also on wakepy 0.7.x)
  • This does not occur on Fedora 40 (Gnome 46 + dbus-broker), so either updating Gnome from 42.9 to 46 or changing dbus-daemon to dbus-broker probably fixes the broblem. (Thanks @takluyver)

Task:

  • Find out what is causing the freeze and fix it
@fohrloop fohrloop added this to the wakepy 0.8.0 milestone May 9, 2024
@fohrloop fohrloop added the Type: Bug Something isn't working label May 9, 2024
@fohrloop fohrloop self-assigned this May 9, 2024
fohrloop added a commit that referenced this issue May 9, 2024
Fixes the error with Gnome SessionManager based keepawake (issue #276)

DBus Adapters
-------------
* All DBusAdapters now have to keep reference to the connection(s).
  This was required as if the connection instance was garbage collected,
  the inhibit cookie created by Gnome SessionManager was wiped. 
  
* The DBusAdapter class now has _get_connection(bus) method or getting
  a connection instance for an adapter.
  
* The DBusAdapter subclasses now must implement a
  _create_new_connection(bus) method.
  
Other
----

Temporarily ignore warnings in tests/integration/conftest.py while
running gc.collect(). This should be addressed later (likely as part of
#277)
@fohrloop
Copy link
Owner Author

fohrloop commented May 11, 2024

Debugging: bottom-up

Trying to reproduce the behavior with smallest amount of code possible.

1) It's not caused by just opening and closing connections

This causes no problems:

from jeepney.io.blocking import open_dbus_connection

for i in range(10_000):
    print(i)
    connection = open_dbus_connection(bus="SESSION")
    connection.close()

2) A simple DBus method call does not cause the issue

This causes no problems:

from jeepney import DBusAddress, new_method_call
from jeepney.io.blocking import open_dbus_connection
from jeepney.wrappers import unwrap_msg

for i in range(10_000):
    print(i)
    connection = open_dbus_connection(bus="SESSION")

    addr = DBusAddress(
        object_path="/org/freedesktop/DBus",
        bus_name="org.freedesktop.DBus",
        interface="org.freedesktop.DBus",
    )

    msg = new_method_call(
        addr,
        method="ListNames",
        signature=None,
        body=None,
    )

    reply = connection.send_and_get_reply(msg, timeout=1)
    resp = unwrap_msg(reply)
    connection.close()

3) Inhibit + Uninhibit DBus calls to SessionManager

This will freeze my Ubuntu 22.04 laptop with GNOME 42.9 in around 400-600 iterations:

from jeepney import DBusAddress, new_method_call
from jeepney.io.blocking import open_dbus_connection
from jeepney.wrappers import unwrap_msg

addr = DBusAddress(
    object_path="/org/gnome/SessionManager",
    bus_name="org.gnome.SessionManager",
    interface="org.gnome.SessionManager",
)


def call(method, signature=None, body=None):
    msg = new_method_call(
        addr,
        method=method,
        signature=signature,
        body=body,
    )

    reply = connection.send_and_get_reply(msg, timeout=1)

    return unwrap_msg(reply)


for i in range(10_000):
    connection = open_dbus_connection(bus="SESSION")

    cookie = call(
        method="Inhibit",
        signature="susu",  # "app_id", "toplevel_xid", "reason", "flags"
        body=("wakepy", 42, "wakelock active", 8),
    )[0]

    inhibitors = call(
        method="GetInhibitors",
    )[0]

    print(f"{i}, {cookie=}, inhibitors = {len(inhibitors)}")
    call(
        method="Uninhibit",
        signature="u",  # "cookie"
        body=(cookie,),
    )

    inhibitors = call(
        method="GetInhibitors",
    )[0]
    print(f"    {i}, {cookie=}, inhibitors = {len(inhibitors)}")

    connection.close()

I can see that the number of inhibitors is either 1 or 2, so these are not piling up:

0, cookie=618529695, inhibitors = 2
    0, cookie=618529695, inhibitors = 1
1, cookie=1846700224, inhibitors = 2
    1, cookie=1846700224, inhibitors = 1

Variations:

  • Add gc.collect() after connection.close() to get into about ~2000 iterations before mouse starts to feel laggy. But system will not freeze even with 10k iterations. After finishing, mouse is permanently laggy until relogin.
  • Using random.randint(1, 100_000) in place of 42 as toplevel_xid has no effect.

4) Test with dbus library instead of jeepney

This also freezes the system in 400-600 cycles:

import dbus

for i in range(10_000):
    session_bus = dbus.SessionBus()
    proxy = session_bus.get_object(
        "org.gnome.SessionManager", "/org/gnome/SessionManager"
    )

    cookie = proxy.get_dbus_method(
        "Inhibit", dbus_interface="org.gnome.SessionManager"
    )("wakepy", 42, "wakelock active", 8)

    inhibitors = proxy.get_dbus_method(
        "GetInhibitors", dbus_interface="org.gnome.SessionManager"
    )()

    print(f"{i}, {cookie=}, inhibitors = {len(inhibitors)}")

    proxy.get_dbus_method("Uninhibit", dbus_interface="org.gnome.SessionManager")(
        int(cookie)
    )

    inhibitors = proxy.get_dbus_method(
        "GetInhibitors", dbus_interface="org.gnome.SessionManager"
    )()

    print(f"    {i}, {cookie=}, inhibitors = {len(inhibitors)}")
    session_bus.close()

Summary

It looks like it's not caused by jeepney since jeepney can process other dbus calls (ListNames example above) just fine. This also is reproduced with dbus-python, so it looks like it's connected to GNOME. I created an issue in the GNOME GitLab repository: gnome-settings-daemon/#798. I'll wait comments there before proceeding with this.

@fohrloop
Copy link
Owner Author

fohrloop commented May 19, 2024

I just figured out that this does not seem to cause any slow-downs: EDIT: No. it does. It's just a bit random with all the methods that how many cycles they need until there's slow-down and freeze.

Using con.close() + gc.collect() and just jeepney

Note: removing the gc.collect() will make freeze the system around 2000 iterations.

import gc 

from jeepney import DBusAddress, new_method_call
from jeepney.io.blocking import open_dbus_connection
from jeepney.wrappers import unwrap_msg

addr = DBusAddress(
    object_path="/org/gnome/SessionManager",
    bus_name="org.gnome.SessionManager",
    interface="org.gnome.SessionManager",
)


def call(method, signature=None, body=None):
    msg = new_method_call(
        addr,
        method=method,
        signature=signature,
        body=body,
    )

    reply = connection.send_and_get_reply(msg, timeout=1)

    return unwrap_msg(reply)


for i in range(10_000):
    connection = open_dbus_connection(bus="SESSION")

    cookie = call(
        method="Inhibit",
        signature="susu",  # "app_id", "toplevel_xid", "reason", "flags"
        body=("wakepy", 42, "wakelock active", 8),
    )[0]

    inhibitors = call(
        method="GetInhibitors",
    )[0]

    print(f"{i}, {cookie=}, inhibitors = {len(inhibitors)}")
    call(
        method="Uninhibit",
        signature="u",  # "cookie"
        body=(cookie,),
    )

    inhibitors = call(
        method="GetInhibitors",
    )[0]
    print(f"    {i}, {cookie=}, inhibitors = {len(inhibitors)}")

    connection.close()
    gc.collect()

so why does:

from wakepy import keep
      
for i in range(10_000):
    with keep.running() as m:
         print(i)

still after #282 cause mouse to be laggy/choppy and require a relogin, since what they both should do is (1) close connection (2) gc.collect()? Furthermore, since the first one works without issues (after the gc.collect()), I'm starting to suspect that it is not a problem in Gnome but somewhere in the python world.

@fohrloop
Copy link
Owner Author

fohrloop commented May 25, 2024

Just tested on the same system and also the "org.freedesktop.ScreenSaver" (keep.presenting mode) behaves exactly the same. It's likely that the org.freedesktop.ScreenSaver interface is registered by some Gnome process when running Gnome, so this is somewhat expected. While this is an annoying bug, I don't know if it's wakepy, jeepney, dbus, CPython or Gnome (or something else) where the fix could and should be made. It's also been there already in wakepy 0.7.x (using org.freedesktop.ScreenSaver), so I think it's okay to proceed with releasing 0.8.0 with this. Luckily it's relatively uncommon to inhibit sleep hundreds or thousands times. I hope to gather more understanding on this and fix it or find an alternative way for inhibiting sleep on Gnome.

@fohrloop fohrloop removed this from the wakepy 0.8.0 milestone May 25, 2024
@fohrloop
Copy link
Owner Author

fohrloop commented May 26, 2024

Adding delay between calls

Adding delay between function calls seems to fix the problem. I left my computer to do this (with wakepy on) for few hours:

code with four sleeps in it
import time

from jeepney import DBusAddress, new_method_call
from jeepney.io.blocking import open_dbus_connection
from jeepney.wrappers import unwrap_msg

addr = DBusAddress(
    object_path="/org/gnome/SessionManager",
    bus_name="org.gnome.SessionManager",
    interface="org.gnome.SessionManager",
)


def call(method, signature=None, body=None):
    msg = new_method_call(
        addr,
        method=method,
        signature=signature,
        body=body,
    )

    reply = connection.send_and_get_reply(msg, timeout=1)

    return unwrap_msg(reply)


for i in range(10_000):
    connection = open_dbus_connection(bus="SESSION")

    time.sleep(0.05) #1
    cookie = call(
        method="Inhibit",
        signature="susu",  # "app_id", "toplevel_xid", "reason", "flags"
        body=("wakepy", 42, "wakelock active", 8),
    )[0]

    time.sleep(0.2) #2
    print(f"{i}, {cookie=}")
    call(
        method="Uninhibit",
        signature="u",  # "cookie"
        body=(cookie,),
    )
    time.sleep(0.5) #3

    connection.close()
    time.sleep(0.5) #4

When I came back, system was not suffering of any kind of slowness. So most likely this is about doing the function calls fast and repeatedly. The fact that calling gc.collect() did help a bit in the problem supports this, as calling it takes some time.

Narrowing down the problem

Here are some results when removing one of the sleeps:

single sleep removed

  • When sleep #⁠1 is removed, I could not see any slow-downs (29k iterations).
  • When sleep #⁠2 is removed, I could not see any slow-downs (31k iterations).
  • When sleep #⁠3 is removed, I could not see any slow-downs (20k iterations).
  • When sleep #⁠4 is removed, I could not see any slow-downs (20k iterations).

two sleeps removed

  • When sleep #⁠1 and #⁠2 are removed, I could not see any slow-downs (30k iterations).
  • When sleep #⁠1 and #⁠3 are removed, I could not see any slow-downs (30k iterations).
  • When sleep #⁠1 and #⁠4 are removed, I could not see any slow-downs (30k iterations).
  • When sleep #⁠2 and #⁠3 are removed, I could not see any slow-downs (30k iterations).
  • When sleep #⁠2 and #⁠4 are removed, I could not see any slow-downs (30k iterations).
  • When sleep #⁠3 and #⁠4 are removed, I could not see any slow-downs (30k iterations).

three sleeps removed

  • When only sleep #⁠1 is used, I could not see any slow-downs (30k iterations).
  • When only sleep #⁠2 is used, I could not see any slow-downs (30k iterations).
  • When only sleep #⁠3 is used, I could not see any slow-downs (30k iterations).
  • When only sleep #⁠4 is used, I could not see any slow-downs (30k iterations).

Summary

Adding sleep anywhere within the full cycle (open connection, inhibit sleep, uninhibit, close connection) seems to fix the problem. Shortest sleep tried was 50ms. In addition to this, I tested with wakepy with 50ms sleep and 80k iterations:

for i in range(80_000):
    with keep.running() as m:
        print(i, m.used_method)
        time.sleep(0.05)

and could not reproduce any slow-downs. This seems to be a non-issue as the sleep is typically inhibited for a long-running task, and it's typical to have only one or few locks with one task.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant