Errors when rerunning test suite #5590

Closed
drasmuss opened this issue May 9, 2019 · 8 comments
Labels
area-testing bug Issue identified by VS Code Team member as probable bug

Comments

drasmuss commented May 9, 2019

Environment data

  • VS Code version: 1.33.1 (user setup)
  • Extension version (available under the Extensions sidebar): 2019.4.12954
  • OS and version: Windows 10
  • Python version (& distribution if applicable, e.g. Anaconda): Miniconda Python 3.6.8
  • Type of virtual environment used (N/A | venv | virtualenv | conda | ...): conda
  • Relevant/affected Python packages and their versions: tensorflow-gpu==1.13.1

Expected behaviour

Running the same test suite multiple times should result in the same result each time.

Actual behaviour

After restarting VS Code, the test suite runs normally the first time. But then running it again (sometimes on the second attempt, sometimes after rerunning it several times) results in a bunch of errors related to CUDNN ("Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR" or "Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED").

Based on these issues:

tensorflow/tensorflow#6698 (comment)

tensorflow/tensorflow#24496

https://stackoverflow.com/a/53707323

I believe that that CUDNN error is actually caused by the previous test process not shutting down completely, which causes those errors when the test suite runs again.

Note that this issue does not occur when running the tests normally through a command prompt (e.g., just calling pytest from the command line). It only happens when running the tests through VS Code.

Steps to reproduce:

  1. Create a test file containing
import tensorflow as tf
import numpy as np


def test_tf():
    a = tf.placeholder(tf.float32, (2, 4, 4, 1))
    b = tf.layers.conv2d(a, 2, kernel_size=3)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(b, feed_dict={a: np.ones((2, 4, 4, 1))})
  2. Run that test (using any of the methods described here https://code.visualstudio.com/docs/python/unit-testing#_run-tests).

  3. Repeat step 2 until the error appears.
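
For comparison with the command-line behaviour described above, the rerun loop can be sketched as a small script. This is only an illustration: `pytest_command` and `rerun_until_failure` are hypothetical helper names, and `test_tf.py` stands in for the test file from step 1. It does not reproduce how VS Code launches the tests.

```python
import subprocess
import sys


def pytest_command(test_path):
    """Build a command that runs pytest on test_path with the current interpreter."""
    return [sys.executable, "-m", "pytest", test_path]


def rerun_until_failure(cmd, max_runs=10):
    """Run cmd in a fresh process up to max_runs times.

    Returns the 1-based index of the first run that exits with a
    non-zero status, or None if every run succeeds.
    """
    for attempt in range(1, max_runs + 1):
        # subprocess.run blocks until the child process has exited, so
        # each attempt starts only after the previous one is gone.
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return attempt
    return None
```

Calling `rerun_until_failure(pytest_command("test_tf.py"))` from the activated conda environment returns `None` if all runs pass, which is what the report describes for plain command-line runs.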

Logs

Output for Python in the Output panel (View → Output, then change the drop-down in the upper-right of the Output panel to Python)

___________________________________ test_tf ___________________________________

self = <tensorflow.python.client.session.Session object at 0x000001E530ED82E8>
fn = <function BaseSession._do_run.<locals>._run_fn at 0x000001E530F02268>
args = ({<tensorflow.python.pywrap_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x000001E530...p_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x000001E530F01720> >], [], None, None)
message = 'Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a ...ed above.\n\t [[node conv2d/Conv2D (defined at d:\\Documents\\nengo-repos\\nengo-dl\\nengo_dl\\tests\\test_tf.py:9) ]]'
m = <_sre.SRE_Match object; span=(158, 182), match='[[{{node conv2d/Conv2D}}'>

    def _do_call(self, fn, *args):
      try:
>       return fn(*args)

D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\client\session.py:1334: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

feed_dict = {<tensorflow.python.pywrap_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x000001E530E...   [1.],
         [1.],
         [1.]],

        [[1.],
         [1.],
         [1.],
         [1.]]]], dtype=float32)}
fetch_list = [<tensorflow.python.pywrap_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x000001E530F01720> >]
target_list = [], options = None, run_metadata = None

    def _run_fn(feed_dict, fetch_list, target_list, options, run_metadata):
      # Ensure any changes to the graph are reflected in the runtime.
      self._extend_graph()
      return self._call_tf_sessionrun(
>         options, feed_dict, fetch_list, target_list, run_metadata)

D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\client\session.py:1319: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <tensorflow.python.client.session.Session object at 0x000001E530ED82E8>
options = None
feed_dict = {<tensorflow.python.pywrap_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x000001E530E...   [1.],
         [1.],
         [1.]],

        [[1.],
         [1.],
         [1.],
         [1.]]]], dtype=float32)}
fetch_list = [<tensorflow.python.pywrap_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x000001E530F01720> >]
target_list = [], run_metadata = None

    def _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list,
                            run_metadata):
      return tf_session.TF_SessionRun_wrapper(
          self._session, options, feed_dict, fetch_list, target_list,
>         run_metadata)
E     tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
E     	 [[{{node conv2d/Conv2D}}]]

D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\client\session.py:1407: UnknownError

During handling of the above exception, another exception occurred:

    def test_tf():
        print(tf.__version__)
    
        a = tf.placeholder(tf.float32, (2, 4, 4, 1))
        b = tf.layers.conv2d(a, 2, kernel_size=3)
    
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
>           sess.run(b, feed_dict={a: np.ones((2, 4, 4, 1))})

nengo_dl\tests\test_tf.py:13: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\client\session.py:929: in run
    run_metadata_ptr)
D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\client\session.py:1152: in _run
    feed_dict_tensor, options, run_metadata)
D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\client\session.py:1328: in _do_run
    run_metadata)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <tensorflow.python.client.session.Session object at 0x000001E530ED82E8>
fn = <function BaseSession._do_run.<locals>._run_fn at 0x000001E530F02268>
args = ({<tensorflow.python.pywrap_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x000001E530...p_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x000001E530F01720> >], [], None, None)
message = 'Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a ...ed above.\n\t [[node conv2d/Conv2D (defined at d:\\Documents\\nengo-repos\\nengo-dl\\nengo_dl\\tests\\test_tf.py:9) ]]'
m = <_sre.SRE_Match object; span=(158, 182), match='[[{{node conv2d/Conv2D}}'>

    def _do_call(self, fn, *args):
      try:
        return fn(*args)
      except errors.OpError as e:
        message = compat.as_text(e.message)
        m = BaseSession._NODEDEF_NAME_RE.search(message)
        node_def = None
        op = None
        if m is not None:
          node_name = m.group(3)
          try:
            op = self._graph.get_operation_by_name(node_name)
            node_def = op.node_def
          except KeyError:
            pass
        message = error_interpolation.interpolate(message, self._graph)
>       raise type(e)(node_def, op, message)
E       tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
E       	 [[node conv2d/Conv2D (defined at d:\Documents\nengo-repos\nengo-dl\nengo_dl\tests\test_tf.py:9) ]]
E       
E       Caused by op 'conv2d/Conv2D', defined at:
E         File "D:\Miniconda3\envs\tmp\lib\runpy.py", line 193, in _run_module_as_main
E           "__main__", mod_spec)
E         File "D:\Miniconda3\envs\tmp\lib\runpy.py", line 85, in _run_code
E           exec(code, run_globals)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pytest.py", line 91, in <module>
E           raise SystemExit(pytest.main())
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\config\__init__.py", line 77, in main
E           return config.hook.pytest_cmdline_main(config=config)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\hooks.py", line 289, in __call__
E           return self._hookexec(self, self.get_hookimpls(), kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 68, in _hookexec
E           return self._inner_hookexec(hook, methods, kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 62, in <lambda>
E           firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\callers.py", line 187, in _multicall
E           res = hook_impl.function(*args)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\main.py", line 218, in pytest_cmdline_main
E           return wrap_session(config, _main)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\main.py", line 185, in wrap_session
E           session.exitstatus = doit(config, session) or 0
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\main.py", line 225, in _main
E           config.hook.pytest_runtestloop(session=session)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\hooks.py", line 289, in __call__
E           return self._hookexec(self, self.get_hookimpls(), kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 68, in _hookexec
E           return self._inner_hookexec(hook, methods, kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 62, in <lambda>
E           firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\callers.py", line 187, in _multicall
E           res = hook_impl.function(*args)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\main.py", line 246, in pytest_runtestloop
E           item.config.hook.pytest_runtest_protocol(item=item, nextitem=nextitem)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\hooks.py", line 289, in __call__
E           return self._hookexec(self, self.get_hookimpls(), kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 68, in _hookexec
E           return self._inner_hookexec(hook, methods, kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 62, in <lambda>
E           firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\callers.py", line 187, in _multicall
E           res = hook_impl.function(*args)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\runner.py", line 76, in pytest_runtest_protocol
E           runtestprotocol(item, nextitem=nextitem)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\runner.py", line 91, in runtestprotocol
E           reports.append(call_and_report(item, "call", log))
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\runner.py", line 171, in call_and_report
E           call = call_runtest_hook(item, when, **kwds)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\runner.py", line 195, in call_runtest_hook
E           treat_keyboard_interrupt_as_exception=item.config.getvalue("usepdb"),
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\runner.py", line 211, in __init__
E           self.result = func()
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\runner.py", line 193, in <lambda>
E           lambda: ihook(item=item, **kwds),
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\hooks.py", line 289, in __call__
E           return self._hookexec(self, self.get_hookimpls(), kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 68, in _hookexec
E           return self._inner_hookexec(hook, methods, kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 62, in <lambda>
E           firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\callers.py", line 187, in _multicall
E           res = hook_impl.function(*args)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\runner.py", line 121, in pytest_runtest_call
E           item.runtest()
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\python.py", line 1438, in runtest
E           self.ihook.pytest_pyfunc_call(pyfuncitem=self)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\hooks.py", line 289, in __call__
E           return self._hookexec(self, self.get_hookimpls(), kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 68, in _hookexec
E           return self._inner_hookexec(hook, methods, kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 62, in <lambda>
E           firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\callers.py", line 187, in _multicall
E           res = hook_impl.function(*args)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\python.py", line 166, in pytest_pyfunc_call
E           testfunction(**testargs)
E         File "d:\Documents\nengo-repos\nengo-dl\nengo_dl\tests\test_tf.py", line 9, in test_tf
E           b = tf.layers.conv2d(a, 2, kernel_size=3)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\util\deprecation.py", line 324, in new_func
E           return func(*args, **kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\layers\convolutional.py", line 424, in conv2d
E           return layer.apply(inputs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 1227, in apply
E           return self.__call__(inputs, *args, **kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\layers\base.py", line 530, in __call__
E           outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 554, in __call__
E           outputs = self.call(inputs, *args, **kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\keras\layers\convolutional.py", line 194, in call
E           outputs = self._convolution_op(inputs, self.kernel)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 966, in __call__
E           return self.conv_op(inp, filter)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 591, in __call__
E           return self.call(inp, filter)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 208, in __call__
E           name=self.name)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py", line 1113, in conv2d
E           data_format=data_format, dilations=dilations, name=name)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
E           op_def=op_def)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
E           return func(*args, **kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
E           op_def=op_def)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in __init__
E           self._traceback = tf_stack.extract_stack()
E       
E       UnknownError (see above for traceback): Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
E       	 [[node conv2d/Conv2D (defined at d:\Documents\nengo-repos\nengo-dl\nengo_dl\tests\test_tf.py:9) ]]

D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\client\session.py:1348: UnknownError
---------------------------- Captured stdout call -----------------------------
1.13.1
---------------------------- Captured stderr call -----------------------------
WARNING:tensorflow:From d:\Documents\nengo-repos\nengo-dl\nengo_dl\tests\test_tf.py:9: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.conv2d instead.
WARNING:tensorflow:From D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2019-05-09 16:50:28.857769: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2019-05-09 16:50:28.859645: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
------------------------------ Captured log call ------------------------------
deprecation.py             323 WARNING  From d:\Documents\nengo-repos\nengo-dl\nengo_dl\tests\test_tf.py:9: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.conv2d instead.
deprecation.py             323 WARNING  From D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
- generated xml file: C:\Users\dhras\AppData\Local\Temp\tmp-11448rObxAm5zBIM7.xml -
==================== 1 failed, 3 warnings in 2.14 seconds =====================
Error: Error: cannot open file:///d%3A/Documents/nengo-repos/nengo-dl/D. Detail: File not found (file:///d:/Documents/nengo-repos/nengo-dl/D)

Output from Console under the Developer Tools panel (toggle Developer Tools on under Help; turn on source maps to make any tracebacks be useful by running Enable source map support for extension debugging)

[Extension Host] Python Extension: Cached data exists ActivatedEnvironmentVariables, d:\Documents\nengo-repos\nengo-dl
console.ts:134 [Extension Host] Python Extension: getActivatedEnvironmentVariables, Class name = S, Arg 1: <Uri:d:\Documents\nengo-repos\nengo-dl>, Arg 2: undefined, Arg 3: undefined
console.ts:134 [Extension Host] Python Extension: Cached data exists getEnvironmentVariables, d:\Documents\nengo-repos\nengo-dl
notificationsAlerts.ts:40 There was an error in running the tests.
onDidNotificationChange @ notificationsAlerts.ts:40
_register.model.onDidNotificationChange.e @ notificationsAlerts.ts:26
fire @ event.ts:584
notify @ notifications.ts:113
notify @ notificationService.ts:53
r @ mainThreadMessageService.ts:83
_showMessage @ mainThreadMessageService.ts:44
$showMessage @ mainThreadMessageService.ts:38
_doInvokeHandler @ rpcProtocol.ts:399
_invokeHandler @ rpcProtocol.ts:384
_receiveRequest @ rpcProtocol.ts:304
_receiveOneMessage @ rpcProtocol.ts:226
_protocol.onMessage.e @ rpcProtocol.ts:101
fire @ event.ts:584
a @ ipc.net.ts:392
e @ ipc.net.ts:399
fire @ event.ts:584
_receiveMessage @ ipc.net.ts:678
_socketDisposables.push._socketReader.onMessage.e @ ipc.net.ts:549
fire @ event.ts:584
acceptChunk @ ipc.net.ts:212
_register._socket.onData.e @ ipc.net.ts:173
t @ ipc.net.ts:24
emit @ events.js:182
addChunk @ _stream_readable.js:279
readableAddChunk @ _stream_readable.js:264
Readable.push @ _stream_readable.js:219
onread @ net.js:636
console.ts:134 [Extension Host] rejected promise not handled within 1 second: Error: cannot open file:///d%3A/Documents/nengo-repos/nengo-dl/D. Detail: File not found (file:///d:/Documents/nengo-repos/nengo-dl/D)
t.log @ console.ts:134
$logExtensionHostMessage @ mainThreadConsole.ts:39
_doInvokeHandler @ rpcProtocol.ts:399
_invokeHandler @ rpcProtocol.ts:384
_receiveRequest @ rpcProtocol.ts:304
_receiveOneMessage @ rpcProtocol.ts:226
_protocol.onMessage.e @ rpcProtocol.ts:101
fire @ event.ts:584
a @ ipc.net.ts:392
e @ ipc.net.ts:399
fire @ event.ts:584
_receiveMessage @ ipc.net.ts:678
_socketDisposables.push._socketReader.onMessage.e @ ipc.net.ts:549
fire @ event.ts:584
acceptChunk @ ipc.net.ts:212
_register._socket.onData.e @ ipc.net.ts:173
t @ ipc.net.ts:24
emit @ events.js:182
addChunk @ _stream_readable.js:279
readableAddChunk @ _stream_readable.js:264
Readable.push @ _stream_readable.js:219
onread @ net.js:636
console.ts:134 [Extension Host] stack trace: Error: cannot open file:///d%3A/Documents/nengo-repos/nengo-dl/D. Detail: File not found (file:///d:/Documents/nengo-repos/nengo-dl/D)
    at n.then.e (file:///D:/Programs/Microsoft VS Code/resources/app/out/vs/workbench/workbench.main.js:4392:776)
t.log @ console.ts:134
$logExtensionHostMessage @ mainThreadConsole.ts:39
_doInvokeHandler @ rpcProtocol.ts:399
_invokeHandler @ rpcProtocol.ts:384
_receiveRequest @ rpcProtocol.ts:304
_receiveOneMessage @ rpcProtocol.ts:226
_protocol.onMessage.e @ rpcProtocol.ts:101
fire @ event.ts:584
a @ ipc.net.ts:392
e @ ipc.net.ts:399
fire @ event.ts:584
_receiveMessage @ ipc.net.ts:678
_socketDisposables.push._socketReader.onMessage.e @ ipc.net.ts:549
fire @ event.ts:584
acceptChunk @ ipc.net.ts:212
_register._socket.onData.e @ ipc.net.ts:173
t @ ipc.net.ts:24
emit @ events.js:182
addChunk @ _stream_readable.js:279
readableAddChunk @ _stream_readable.js:264
Readable.push @ _stream_readable.js:219
onread @ net.js:636
log.ts:173   ERR cannot open file:///d%3A/Documents/nengo-repos/nengo-dl/D. Detail: File not found (file:///d:/Documents/nengo-repos/nengo-dl/D): Error: cannot open file:///d%3A/Documents/nengo-repos/nengo-dl/D. Detail: File not found (file:///d:/Documents/nengo-repos/nengo-dl/D)
    at n.then.e (file:///D:/Programs/Microsoft VS Code/resources/app/out/vs/workbench/workbench.main.js:4392:776)
@ghost ghost added the triage-needed Needs assignment to the proper sub-team label May 9, 2019
@DonJayamanne

I believe that that CUDNN error is actually caused by the previous test process not shutting down completely, which causes those errors when the test suite runs again.

Agreed.
My suggestion is to wait for the tests to run to completion before starting them again.
Could you please try that and let me know how it goes?

I don't see how we can solve this, as this is purely an environment issue.
Will leave this issue open to explore other avenues.

@DonJayamanne DonJayamanne added info-needed Issue requires more information from poster triage labels May 9, 2019
@ghost ghost removed triage-needed Needs assignment to the proper sub-team labels May 9, 2019
@DonJayamanne DonJayamanne added area-testing bug Issue identified by VS Code Team member as probable bug labels May 9, 2019

drasmuss commented May 9, 2019

The above reproduction steps work even if you wait for the tests to run to completion each time. So the issue is that even though the tests have finished running, VS Code is keeping that process open in some way.


drasmuss commented May 9, 2019

Also worth noting that although there seems to be some randomness in how many times you need to run the tests to trigger the failure, once it reaches this failure state then the tests will fail every time from then on with those CUDNN errors (suggesting there's some process permanently stuck in the background holding the GPU resources).
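
One way to check the "process stuck in the background" hypothesis is to list the processes that currently hold GPU compute contexts. The sketch below assumes `nvidia-smi` is on the PATH and uses its `--query-compute-apps` option. The function names are hypothetical, and on some Windows driver configurations the memory column may not be reported.

```python
import subprocess


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' rows (csv, no header)."""
    apps = []
    for line in csv_text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        apps.append({
            "pid": int(parts[0]),
            # Re-join the middle fields in case the name contains a comma.
            "name": ",".join(parts[1:-1]),
            "used_memory": parts[-1],
        })
    return apps


def gpu_compute_apps():
    """Ask nvidia-smi which processes currently hold a compute context."""
    proc = subprocess.run(
        ["nvidia-smi",
         "--query-compute-apps=pid,process_name,used_memory",
         "--format=csv,noheader"],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # text mode; also works on Python 3.6
        check=True,
    )
    return parse_compute_apps(proc.stdout)
```

If a `python` process from an earlier test run still appears in `gpu_compute_apps()` after the tests have finished, that would support the idea that something is holding the GPU resources.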


DonJayamanne commented May 10, 2019

The process doesn't stay alive, it dies.
I'd say the code that's running in the test isn't tearing down gracefully.
Do you know what needs to shut down? If you do, then I'd suggest adding an atexit handler or similar.


drasmuss commented May 10, 2019

This problem only occurs when running through VS Code, not when running the tests in other ways (e.g. running pytest from the terminal, or running in PyCharm). So I am fairly certain that it is something specific to how VS Code is running or tearing down the tests. Do you have any suggestions on what might be different with VS Code? Or if there are some settings I can enable to change the shutdown behaviour, or force a full shutdown?

@DonJayamanne

There's nothing specific to VS Code.
Here's what we do:

  • Run the tests using the standard pytest CLI.
    That's it, nothing more than that.
    I'll add some code to log the command that's being generated and executed. FYI - The output is displayed in the Python Test Log output panel.

@DonJayamanne DonJayamanne added needs PR and removed info-needed Issue requires more information from poster labels May 15, 2019
@DonJayamanne

Solution:

  • Use Python code instead of the pytest CLI
  • Log commands executed along with the output
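
The two points above could look roughly like the following. This is a sketch under stated assumptions, not the extension's implementation: `pytest_via_python` and `run_and_log` are hypothetical helpers, and the launcher simply calls `pytest.main()` from Python code instead of invoking the `pytest` executable.

```python
import logging
import subprocess
import sys

log = logging.getLogger("test_runner")

# With "python -c", sys.argv[0] is "-c", so the pytest arguments
# are everything from sys.argv[1:] onward.
_LAUNCHER = "import sys, pytest; sys.exit(pytest.main(sys.argv[1:]))"


def pytest_via_python(pytest_args):
    """Build a command that starts pytest through pytest.main()."""
    return [sys.executable, "-c", _LAUNCHER] + list(pytest_args)


def run_and_log(cmd):
    """Run cmd, logging the exact command line and its combined output."""
    log.info("running: %s", " ".join(cmd))
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the captured output
        universal_newlines=True,
    )
    log.info("exit code %d, output:\n%s", proc.returncode, proc.stdout)
    return proc.returncode, proc.stdout
```

For example, `run_and_log(pytest_via_python(["test_tf.py"]))` would record both the generated command and everything the test process printed, which makes it easier to compare against a manual run from the terminal.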

@DonJayamanne DonJayamanne removed their assignment May 16, 2019

luabud commented Oct 24, 2019

Closing in favour of #7608.

@luabud luabud closed this as completed Oct 24, 2019
@ghost ghost removed the needs PR label Oct 24, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 31, 2019