Errors when rerunning test suite #5590

drasmuss · 2019-05-09T19:57:56Z

Environment data

VS Code version: 1.33.1 (user setup)
Extension version (available under the Extensions sidebar): 2019.4.12954
OS and version: Windows 10
Python version (& distribution if applicable, e.g. Anaconda): Miniconda Python 3.6.8
Type of virtual environment used (N/A | venv | virtualenv | conda | ...): conda
Relevant/affected Python packages and their versions: tensorflow-gpu==1.13.1

Expected behaviour

Running the same test suite multiple times should result in the same result each time.

Actual behaviour

After restarting VS Code, the test suite runs normally the first time. But then running it again (sometimes on the second attempt, sometimes after rerunning it several times) results in a bunch of errors related to CUDNN ("Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR" or "Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED").

Based on these issues:

tensorflow/tensorflow#6698 (comment)

tensorflow/tensorflow#24496

https://stackoverflow.com/a/53707323

I believe that that CUDNN error is actually caused by the previous test process not shutting down completely, which causes those errors when the test suite runs again.

Note that this issue does not occur when running the tests normally through a command prompt (e.g., just calling pytest from the command line). It only happens when running the tests through VS Code.

Steps to reproduce:

1. Create a test file containing

import tensorflow as tf
import numpy as np


def test_tf():
    a = tf.placeholder(tf.float32, (2, 4, 4, 1))
    b = tf.layers.conv2d(a, 2, kernel_size=3)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(b, feed_dict={a: np.ones((2, 4, 4, 1))})

2. Run that test (using any of the methods described here https://code.visualstudio.com/docs/python/unit-testing#_run-tests).
3. Repeat step 2 until the error appears.

Logs

Output for Python in the Output panel (View→Output, change the drop-down in the upper-right of the Output panel to Python)

___________________________________ test_tf ___________________________________

self = <tensorflow.python.client.session.Session object at 0x000001E530ED82E8>
fn = <function BaseSession._do_run.<locals>._run_fn at 0x000001E530F02268>
args = ({<tensorflow.python.pywrap_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x000001E530...p_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x000001E530F01720> >], [], None, None)
message = 'Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a ...ed above.\n\t [[node conv2d/Conv2D (defined at d:\\Documents\\nengo-repos\\nengo-dl\\nengo_dl\\tests\\test_tf.py:9) ]]'
m = <_sre.SRE_Match object; span=(158, 182), match='[[{{node conv2d/Conv2D}}'>

    def _do_call(self, fn, *args):
      try:
>       return fn(*args)

D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\client\session.py:1334: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

feed_dict = {<tensorflow.python.pywrap_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x000001E530E...   [1.],
         [1.],
         [1.]],

        [[1.],
         [1.],
         [1.],
         [1.]]]], dtype=float32)}
fetch_list = [<tensorflow.python.pywrap_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x000001E530F01720> >]
target_list = [], options = None, run_metadata = None

    def _run_fn(feed_dict, fetch_list, target_list, options, run_metadata):
      # Ensure any changes to the graph are reflected in the runtime.
      self._extend_graph()
      return self._call_tf_sessionrun(
>         options, feed_dict, fetch_list, target_list, run_metadata)

D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\client\session.py:1319: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <tensorflow.python.client.session.Session object at 0x000001E530ED82E8>
options = None
feed_dict = {<tensorflow.python.pywrap_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x000001E530E...   [1.],
         [1.],
         [1.]],

        [[1.],
         [1.],
         [1.],
         [1.]]]], dtype=float32)}
fetch_list = [<tensorflow.python.pywrap_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x000001E530F01720> >]
target_list = [], run_metadata = None

    def _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list,
                            run_metadata):
      return tf_session.TF_SessionRun_wrapper(
          self._session, options, feed_dict, fetch_list, target_list,
>         run_metadata)
E     tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
E     	 [[{{node conv2d/Conv2D}}]]

D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\client\session.py:1407: UnknownError

During handling of the above exception, another exception occurred:

    def test_tf():
        print(tf.__version__)
    
        a = tf.placeholder(tf.float32, (2, 4, 4, 1))
        b = tf.layers.conv2d(a, 2, kernel_size=3)
    
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
>           sess.run(b, feed_dict={a: np.ones((2, 4, 4, 1))})

nengo_dl\tests\test_tf.py:13: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\client\session.py:929: in run
    run_metadata_ptr)
D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\client\session.py:1152: in _run
    feed_dict_tensor, options, run_metadata)
D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\client\session.py:1328: in _do_run
    run_metadata)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <tensorflow.python.client.session.Session object at 0x000001E530ED82E8>
fn = <function BaseSession._do_run.<locals>._run_fn at 0x000001E530F02268>
args = ({<tensorflow.python.pywrap_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x000001E530...p_tensorflow_internal.TF_Output; proxy of <Swig Object of type 'TF_Output *' at 0x000001E530F01720> >], [], None, None)
message = 'Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a ...ed above.\n\t [[node conv2d/Conv2D (defined at d:\\Documents\\nengo-repos\\nengo-dl\\nengo_dl\\tests\\test_tf.py:9) ]]'
m = <_sre.SRE_Match object; span=(158, 182), match='[[{{node conv2d/Conv2D}}'>

    def _do_call(self, fn, *args):
      try:
        return fn(*args)
      except errors.OpError as e:
        message = compat.as_text(e.message)
        m = BaseSession._NODEDEF_NAME_RE.search(message)
        node_def = None
        op = None
        if m is not None:
          node_name = m.group(3)
          try:
            op = self._graph.get_operation_by_name(node_name)
            node_def = op.node_def
          except KeyError:
            pass
        message = error_interpolation.interpolate(message, self._graph)
>       raise type(e)(node_def, op, message)
E       tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
E       	 [[node conv2d/Conv2D (defined at d:\Documents\nengo-repos\nengo-dl\nengo_dl\tests\test_tf.py:9) ]]
E       
E       Caused by op 'conv2d/Conv2D', defined at:
E         File "D:\Miniconda3\envs\tmp\lib\runpy.py", line 193, in _run_module_as_main
E           "__main__", mod_spec)
E         File "D:\Miniconda3\envs\tmp\lib\runpy.py", line 85, in _run_code
E           exec(code, run_globals)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pytest.py", line 91, in <module>
E           raise SystemExit(pytest.main())
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\config\__init__.py", line 77, in main
E           return config.hook.pytest_cmdline_main(config=config)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\hooks.py", line 289, in __call__
E           return self._hookexec(self, self.get_hookimpls(), kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 68, in _hookexec
E           return self._inner_hookexec(hook, methods, kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 62, in <lambda>
E           firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\callers.py", line 187, in _multicall
E           res = hook_impl.function(*args)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\main.py", line 218, in pytest_cmdline_main
E           return wrap_session(config, _main)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\main.py", line 185, in wrap_session
E           session.exitstatus = doit(config, session) or 0
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\main.py", line 225, in _main
E           config.hook.pytest_runtestloop(session=session)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\hooks.py", line 289, in __call__
E           return self._hookexec(self, self.get_hookimpls(), kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 68, in _hookexec
E           return self._inner_hookexec(hook, methods, kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 62, in <lambda>
E           firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\callers.py", line 187, in _multicall
E           res = hook_impl.function(*args)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\main.py", line 246, in pytest_runtestloop
E           item.config.hook.pytest_runtest_protocol(item=item, nextitem=nextitem)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\hooks.py", line 289, in __call__
E           return self._hookexec(self, self.get_hookimpls(), kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 68, in _hookexec
E           return self._inner_hookexec(hook, methods, kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 62, in <lambda>
E           firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\callers.py", line 187, in _multicall
E           res = hook_impl.function(*args)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\runner.py", line 76, in pytest_runtest_protocol
E           runtestprotocol(item, nextitem=nextitem)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\runner.py", line 91, in runtestprotocol
E           reports.append(call_and_report(item, "call", log))
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\runner.py", line 171, in call_and_report
E           call = call_runtest_hook(item, when, **kwds)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\runner.py", line 195, in call_runtest_hook
E           treat_keyboard_interrupt_as_exception=item.config.getvalue("usepdb"),
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\runner.py", line 211, in __init__
E           self.result = func()
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\runner.py", line 193, in <lambda>
E           lambda: ihook(item=item, **kwds),
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\hooks.py", line 289, in __call__
E           return self._hookexec(self, self.get_hookimpls(), kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 68, in _hookexec
E           return self._inner_hookexec(hook, methods, kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 62, in <lambda>
E           firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\callers.py", line 187, in _multicall
E           res = hook_impl.function(*args)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\runner.py", line 121, in pytest_runtest_call
E           item.runtest()
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\python.py", line 1438, in runtest
E           self.ihook.pytest_pyfunc_call(pyfuncitem=self)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\hooks.py", line 289, in __call__
E           return self._hookexec(self, self.get_hookimpls(), kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 68, in _hookexec
E           return self._inner_hookexec(hook, methods, kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\manager.py", line 62, in <lambda>
E           firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\pluggy\callers.py", line 187, in _multicall
E           res = hook_impl.function(*args)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\_pytest\python.py", line 166, in pytest_pyfunc_call
E           testfunction(**testargs)
E         File "d:\Documents\nengo-repos\nengo-dl\nengo_dl\tests\test_tf.py", line 9, in test_tf
E           b = tf.layers.conv2d(a, 2, kernel_size=3)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\util\deprecation.py", line 324, in new_func
E           return func(*args, **kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\layers\convolutional.py", line 424, in conv2d
E           return layer.apply(inputs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 1227, in apply
E           return self.__call__(inputs, *args, **kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\layers\base.py", line 530, in __call__
E           outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 554, in __call__
E           outputs = self.call(inputs, *args, **kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\keras\layers\convolutional.py", line 194, in call
E           outputs = self._convolution_op(inputs, self.kernel)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 966, in __call__
E           return self.conv_op(inp, filter)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 591, in __call__
E           return self.call(inp, filter)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 208, in __call__
E           name=self.name)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py", line 1113, in conv2d
E           data_format=data_format, dilations=dilations, name=name)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
E           op_def=op_def)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
E           return func(*args, **kwargs)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
E           op_def=op_def)
E         File "D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in __init__
E           self._traceback = tf_stack.extract_stack()
E       
E       UnknownError (see above for traceback): Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
E       	 [[node conv2d/Conv2D (defined at d:\Documents\nengo-repos\nengo-dl\nengo_dl\tests\test_tf.py:9) ]]

D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\client\session.py:1348: UnknownError
---------------------------- Captured stdout call -----------------------------
1.13.1
---------------------------- Captured stderr call -----------------------------
WARNING:tensorflow:From d:\Documents\nengo-repos\nengo-dl\nengo_dl\tests\test_tf.py:9: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.conv2d instead.
WARNING:tensorflow:From D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2019-05-09 16:50:28.857769: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2019-05-09 16:50:28.859645: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
------------------------------ Captured log call ------------------------------
deprecation.py             323 WARNING  From d:\Documents\nengo-repos\nengo-dl\nengo_dl\tests\test_tf.py:9: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.conv2d instead.
deprecation.py             323 WARNING  From D:\Miniconda3\envs\tmp\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
- generated xml file: C:\Users\dhras\AppData\Local\Temp\tmp-11448rObxAm5zBIM7.xml -
==================== 1 failed, 3 warnings in 2.14 seconds =====================
Error: Error: cannot open file:///d%3A/Documents/nengo-repos/nengo-dl/D. Detail: File not found (file:///d:/Documents/nengo-repos/nengo-dl/D)

Output from Console under the Developer Tools panel (toggle Developer Tools on under Help; turn on source maps to make any tracebacks be useful by running Enable source map support for extension debugging)

[Extension Host] Python Extension: Cached data exists ActivatedEnvironmentVariables, d:\Documents\nengo-repos\nengo-dl
console.ts:134 [Extension Host] Python Extension: getActivatedEnvironmentVariables, Class name = S, Arg 1: <Uri:d:\Documents\nengo-repos\nengo-dl>, Arg 2: undefined, Arg 3: undefined
console.ts:134 [Extension Host] Python Extension: Cached data exists getEnvironmentVariables, d:\Documents\nengo-repos\nengo-dl
notificationsAlerts.ts:40 There was an error in running the tests.
onDidNotificationChange @ notificationsAlerts.ts:40
_register.model.onDidNotificationChange.e @ notificationsAlerts.ts:26
fire @ event.ts:584
notify @ notifications.ts:113
notify @ notificationService.ts:53
r @ mainThreadMessageService.ts:83
_showMessage @ mainThreadMessageService.ts:44
$showMessage @ mainThreadMessageService.ts:38
_doInvokeHandler @ rpcProtocol.ts:399
_invokeHandler @ rpcProtocol.ts:384
_receiveRequest @ rpcProtocol.ts:304
_receiveOneMessage @ rpcProtocol.ts:226
_protocol.onMessage.e @ rpcProtocol.ts:101
fire @ event.ts:584
a @ ipc.net.ts:392
e @ ipc.net.ts:399
fire @ event.ts:584
_receiveMessage @ ipc.net.ts:678
_socketDisposables.push._socketReader.onMessage.e @ ipc.net.ts:549
fire @ event.ts:584
acceptChunk @ ipc.net.ts:212
_register._socket.onData.e @ ipc.net.ts:173
t @ ipc.net.ts:24
emit @ events.js:182
addChunk @ _stream_readable.js:279
readableAddChunk @ _stream_readable.js:264
Readable.push @ _stream_readable.js:219
onread @ net.js:636
console.ts:134 [Extension Host] rejected promise not handled within 1 second: Error: cannot open file:///d%3A/Documents/nengo-repos/nengo-dl/D. Detail: File not found (file:///d:/Documents/nengo-repos/nengo-dl/D)
t.log @ console.ts:134
$logExtensionHostMessage @ mainThreadConsole.ts:39
_doInvokeHandler @ rpcProtocol.ts:399
_invokeHandler @ rpcProtocol.ts:384
_receiveRequest @ rpcProtocol.ts:304
_receiveOneMessage @ rpcProtocol.ts:226
_protocol.onMessage.e @ rpcProtocol.ts:101
fire @ event.ts:584
a @ ipc.net.ts:392
e @ ipc.net.ts:399
fire @ event.ts:584
_receiveMessage @ ipc.net.ts:678
_socketDisposables.push._socketReader.onMessage.e @ ipc.net.ts:549
fire @ event.ts:584
acceptChunk @ ipc.net.ts:212
_register._socket.onData.e @ ipc.net.ts:173
t @ ipc.net.ts:24
emit @ events.js:182
addChunk @ _stream_readable.js:279
readableAddChunk @ _stream_readable.js:264
Readable.push @ _stream_readable.js:219
onread @ net.js:636
console.ts:134 [Extension Host] stack trace: Error: cannot open file:///d%3A/Documents/nengo-repos/nengo-dl/D. Detail: File not found (file:///d:/Documents/nengo-repos/nengo-dl/D)
    at n.then.e (file:///D:/Programs/Microsoft VS Code/resources/app/out/vs/workbench/workbench.main.js:4392:776)
t.log @ console.ts:134
$logExtensionHostMessage @ mainThreadConsole.ts:39
_doInvokeHandler @ rpcProtocol.ts:399
_invokeHandler @ rpcProtocol.ts:384
_receiveRequest @ rpcProtocol.ts:304
_receiveOneMessage @ rpcProtocol.ts:226
_protocol.onMessage.e @ rpcProtocol.ts:101
fire @ event.ts:584
a @ ipc.net.ts:392
e @ ipc.net.ts:399
fire @ event.ts:584
_receiveMessage @ ipc.net.ts:678
_socketDisposables.push._socketReader.onMessage.e @ ipc.net.ts:549
fire @ event.ts:584
acceptChunk @ ipc.net.ts:212
_register._socket.onData.e @ ipc.net.ts:173
t @ ipc.net.ts:24
emit @ events.js:182
addChunk @ _stream_readable.js:279
readableAddChunk @ _stream_readable.js:264
Readable.push @ _stream_readable.js:219
onread @ net.js:636
log.ts:173   ERR cannot open file:///d%3A/Documents/nengo-repos/nengo-dl/D. Detail: File not found (file:///d:/Documents/nengo-repos/nengo-dl/D): Error: cannot open file:///d%3A/Documents/nengo-repos/nengo-dl/D. Detail: File not found (file:///d:/Documents/nengo-repos/nengo-dl/D)
    at n.then.e (file:///D:/Programs/Microsoft VS Code/resources/app/out/vs/workbench/workbench.main.js:4392:776)


DonJayamanne · 2019-05-09T20:52:48Z

> I believe that that CUDNN error is actually caused by the previous test process not shutting down completely, which causes those errors when the test suite runs again.

Agreed.
My suggestion is to wait for the tests to run to completion before starting them again.
Could you please try that and let me know how it goes?

I don't see how we can solve this, as this is purely an environment issue.
Will leave this issue open to explore other avenues.

drasmuss · 2019-05-09T21:11:48Z

The above reproduction steps work even if you wait for the tests to run to completion each time. So the issue is that even though the tests have finished running, VS Code is keeping that process open in some way.

drasmuss · 2019-05-09T21:13:59Z

Also worth noting that although there seems to be some randomness in how many times you need to run the tests to trigger the failure, once it reaches this failure state then the tests will fail every time from then on with those CUDNN errors (suggesting there's some process permanently stuck in the background holding the GPU resources).
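One way to check the "stuck process" hypothesis above is to list the processes that currently hold a compute context on the GPU once a test run has finished. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH; the helper names `parse_compute_apps` and `list_gpu_processes` are hypothetical and not part of the original report. On Windows the memory column may be reported as unavailable, so it is kept as a plain string.

```python
import csv
import io
import subprocess

# nvidia-smi query for processes holding a compute context on the GPU.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader",
]


def parse_compute_apps(text):
    """Parse nvidia-smi CSV rows into (pid, process_name, used_memory) tuples."""
    rows = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        pid, name, mem = (field.strip() for field in row)
        rows.append((int(pid), name, mem))
    return rows


def list_gpu_processes():
    """Run nvidia-smi and return the processes currently using the GPU."""
    result = subprocess.run(
        QUERY,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return parse_compute_apps(result.stdout)
```

Calling `list_gpu_processes()` right after a test run completes, and again once the failure state is reached, would show whether a leftover Python process is still listed.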

DonJayamanne · 2019-05-10T06:12:56Z

The process doesn't stay alive; it dies.
I'd say the code that's running in the test isn't tearing down gracefully.
Do you know what needs to be shut down? If you do, then I'd suggest adding an atexit handler or similar.
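As an illustration of the atexit suggestion, here is a minimal sketch of registering a teardown function that runs when the interpreter exits. Which resource actually needs to be shut down is not established in this thread; calling `tf.keras.backend.clear_session()` is only an assumption for illustration, and the function name `clear_tf_state` is hypothetical. It is not known whether this releases the cuDNN handles involved here.

```python
import atexit


def clear_tf_state():
    """Best-effort teardown; returns True if the TensorFlow cleanup ran."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed, so there is nothing to clean up.
        return False
    # Assumption: resetting the Keras/TensorFlow global state is the
    # teardown step that matters; the thread does not confirm this.
    tf.keras.backend.clear_session()
    return True


# Run the teardown when the test process exits normally.
atexit.register(clear_tf_state)
```

Placed in a module that is imported by the test run (for example a `conftest.py`), the handler would fire at normal interpreter shutdown, but not if the process is killed.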

drasmuss · 2019-05-10T13:02:20Z

This problem only occurs when running through VS Code, not when running the tests in other ways (e.g. running pytest from the terminal, or running in PyCharm). So I am fairly certain that it is something specific to how VS Code is running or tearing down the tests. Do you have any suggestions on what might be different with VS Code? Or if there are some settings I can enable to change the shutdown behaviour, or force a full shutdown?

DonJayamanne · 2019-05-15T16:37:51Z

There's nothing specific to VS Code.
Here's what we do:

Run the tests using the standard pytest CLI
That's it, nothing more than that.
I'll add some code to log the command that's being generated and executed. FYI - The output is displayed in the Python Test Log output panel.

DonJayamanne · 2019-05-15T16:38:33Z

Solution:

Use Python code instead of the pytest CLI
Log commands executed along with the output
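The second item can be sketched as follows. This is not the extension's implementation, only a minimal Python illustration of logging the exact command line together with its output; the helper name `run_and_log` and the logger name are hypothetical.

```python
import logging
import shlex
import subprocess
import sys

logger = logging.getLogger("test_runner")


def run_and_log(cmd, cwd=None):
    """Run cmd, logging the exact command line and its combined output."""
    printable = " ".join(shlex.quote(part) for part in cmd)
    logger.info("Running: %s (cwd=%s)", printable, cwd)
    proc = subprocess.run(
        cmd,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the captured output
        universal_newlines=True,
    )
    logger.info("Exit code %d, output:\n%s", proc.returncode, proc.stdout)
    return proc
```

For example, `run_and_log([sys.executable, "-m", "pytest", "nengo_dl/tests/test_tf.py"])` would record the same invocation that can then be replayed from a terminal to compare behaviour.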

luabud · 2019-10-24T18:38:37Z

Closing in favour of #7608.

ghost added the triage-needed label May 9, 2019

DonJayamanne added the info-needed and triage labels May 9, 2019

ghost removed the triage-needed label May 9, 2019

DonJayamanne added the area-testing and bug labels May 9, 2019

karrtikr assigned DonJayamanne May 15, 2019

DonJayamanne added the needs PR label and removed the info-needed label May 15, 2019

DonJayamanne removed their assignment May 16, 2019

DonJayamanne added the reason-preexisting label and removed the triage label May 16, 2019

luabud closed this as completed Oct 24, 2019

ghost removed the needs PR label Oct 24, 2019

lock bot locked as resolved and limited conversation to collaborators Oct 31, 2019
