Hi @lzt5269, thanks for reporting the error! This should be resolved by #223: when the CUDA compilation fails, xgrammar falls back to the Triton kernel. The compilation most likely fails because the V100 does not support bf16.
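For context, the fallback described above can be sketched roughly as below. This is an illustrative sketch, not xgrammar's actual code: the function names and the compute-capability check are assumptions. The underlying fact is that bf16 device intrinsics such as `__ushort_as_bfloat16` require compute capability >= 8.0 (Ampere), while the V100 is sm_70.

```python
# Hedged sketch of a "compile CUDA, fall back to Triton" pattern.
# build_cuda_kernel / get_apply_token_bitmask_kernel are hypothetical names.

def build_cuda_kernel(compute_capability: int) -> str:
    # Stand-in for the JIT build: bf16 intrinsics like
    # __ushort_as_bfloat16 need compute capability >= 8.0 (sm_80+).
    # On a V100 (sm_70) the nvcc build fails, as in the log below.
    if compute_capability < 80:
        raise RuntimeError(
            "identifier \"__ushort_as_bfloat16\" is undefined"
        )
    return "cuda_kernel"


def get_apply_token_bitmask_kernel(compute_capability: int) -> str:
    try:
        return build_cuda_kernel(compute_capability)
    except RuntimeError:
        # CUDA build failed: fall back to the Triton kernel.
        return "triton_kernel"
```

So on a V100 (capability 70) the selector would return the Triton kernel, while on an A100 (capability 80) it would keep the CUDA one.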
Hi,
I'm running model.generate but got errors. I'm using Tesla V100. Is this issue raised because of it?
```
RuntimeError: Error building extension 'xgrammar': [1/2] /usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output cuda.cuda.o.d -ccbin /home/qau3575/gcc-9.5.0/bin/gcc -DTORCH_EXTENSION_NAME=xgrammar -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /home/qau3575/.conda/envs/vllm/lib/python3.12/site-packages/torch/include -isystem /home/qau3575/.conda/envs/vllm/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -isystem /home/qau3575/.conda/envs/vllm/lib/python3.12/site-packages/torch/include/TH -isystem /home/qau3575/.conda/envs/vllm/lib/python3.12/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /home/qau3575/.conda/envs/vllm/include/python3.12 -D_GLIBCXX_USE_CXX11_ABI=0 --expt-relaxed-constexpr -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -O3 -std=c++17 --threads 4 -use_fast_math -c /home/qau3575/.cache/torch_extensions/py312_cu118/xgrammar/cuda.cu -o cuda.cuda.o
FAILED: cuda.cuda.o
/usr/local/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output cuda.cuda.o.d -ccbin /home/qau3575/gcc-9.5.0/bin/gcc -DTORCH_EXTENSION_NAME=xgrammar -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem /home/qau3575/.conda/envs/vllm/lib/python3.12/site-packages/torch/include -isystem /home/qau3575/.conda/envs/vllm/lib/python3.12/site-packages/torch/include/torch/csrc/api/include -isystem /home/qau3575/.conda/envs/vllm/lib/python3.12/site-packages/torch/include/TH -isystem /home/qau3575/.conda/envs/vllm/lib/python3.12/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /home/qau3575/.conda/envs/vllm/include/python3.12 -D_GLIBCXX_USE_CXX11_ABI=0 --expt-relaxed-constexpr -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -O3 -std=c++17 --threads 4 -use_fast_math -c /home/qau3575/.cache/torch_extensions/py312_cu118/xgrammar/cuda.cu -o cuda.cuda.o
/home/qau3575/.cache/torch_extensions/py312_cu118/xgrammar/cuda.cu(52): error: identifier "__ushort_as_bfloat16" is undefined
1 error detected in the compilation of "/home/qau3575/.cache/torch_extensions/py312_cu118/xgrammar/cuda.cu".
ninja: build stopped: subcommand failed.
```