Hi @lzt5269, thanks for reporting the error! This should be resolved by #223: when the CUDA compilation fails, xgrammar falls back to the Triton kernel. The compilation most likely fails because the V100 does not support bf16.
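For context, the fallback described above can be sketched roughly as below. This is an illustrative sketch, not xgrammar's actual code: the function names and the compute-capability check are assumptions. The underlying fact is that bf16 device intrinsics such as `__ushort_as_bfloat16` require compute capability >= 8.0 (Ampere), while the V100 is sm_70.

```python
# Hedged sketch of a "compile CUDA, fall back to Triton" pattern.
# build_cuda_kernel / get_apply_token_bitmask_kernel are hypothetical names.

def build_cuda_kernel(compute_capability: int) -> str:
    # Stand-in for the JIT build: bf16 intrinsics like
    # __ushort_as_bfloat16 need compute capability >= 8.0 (sm_80+).
    # On a V100 (sm_70) the nvcc build fails, as in the log below.
    if compute_capability < 80:
        raise RuntimeError(
            "identifier \"__ushort_as_bfloat16\" is undefined"
        )
    return "cuda_kernel"


def get_apply_token_bitmask_kernel(compute_capability: int) -> str:
    try:
        return build_cuda_kernel(compute_capability)
    except RuntimeError:
        # CUDA build failed: fall back to the Triton kernel.
        return "triton_kernel"
```

So on a V100 (capability 70) the selector would return the Triton kernel, while on an A100 (capability 80) it would keep the CUDA one.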
Hi,
I'm running model.generate but got errors. I'm using Tesla V100. Is this issue raised because of it?
```
RuntimeError: Error building extension 'xgrammar': [1/2] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output cuda.cuda.o.d -ccbin /home/qau3575/gcc-9.5.0/bin/gcc -DTORCH_EXTENSION_NAME=xgrammar -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /home/qau3575/.conda/envs/vllm/lib/python3.12/site-packages/torch/include -isystem /home/qau3575/.conda/envs/vllm/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -isystem /home/qau3575/.conda/envs/vllm/lib/python3.12/site-packages/torch/include/TH -isystem /home/qau3575/.conda/envs/vllm/lib/python3.12/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /home/qau3575/.conda/envs/vllm/include/python3.12 -D_GLIBCXX_USE_CXX11_ABI=0 --expt-relaxed-constexpr -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -O3 -std=c++17 --threads 4 -use_fast_math -c /home/qau3575/.cache/torch_extensions/py312_cu118/xgrammar/cuda.cu -o cuda.cuda.o
FAILED: cuda.cuda.o
/usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output cuda.cuda.o.d -ccbin /home/qau3575/gcc-9.5.0/bin/gcc -DTORCH_EXTENSION_NAME=xgrammar -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /home/qau3575/.conda/envs/vllm/lib/python3.12/site-packages/torch/include -isystem /home/qau3575/.conda/envs/vllm/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -isystem /home/qau3575/.conda/envs/vllm/lib/python3.12/site-packages/torch/include/TH -isystem /home/qau3575/.conda/envs/vllm/lib/python3.12/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /home/qau3575/.conda/envs/vllm/include/python3.12 -D_GLIBCXX_USE_CXX11_ABI=0 --expt-relaxed-constexpr -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -O3 -std=c++17 --threads 4 -use_fast_math -c /home/qau3575/.cache/torch_extensions/py312_cu118/xgrammar/cuda.cu -o cuda.cuda.o
/home/qau3575/.cache/torch_extensions/py312_cu118/xgrammar/cuda.cu(52): error: identifier "__ushort_as_bfloat16" is undefined
1 error detected in the compilation of "/home/qau3575/.cache/torch_extensions/py312_cu118/xgrammar/cuda.cu".
ninja: build stopped: subcommand failed.
```