System Info
Running 3.2-3B-Instruct-qlora-int4-eo8 in a meta-reference-gpu instance via conda (Python 3.10 on Ubuntu 22.04)
Information
The official example scripts
My own modified scripts
🐛 Describe the bug
Error:
ValueError: Model 'meta-llama/Llama-3.2-3B-Instruct-QLORA_INT4_EO8' is not available and no llama_model was specified in metadata. Please specify a llama_model in metadata or use a supported model identifier
Cause:
providers/inline/inference/meta_reference/inference.py:80
The parameters to build_model_alias are passed in the wrong order, which causes 3.2-3B-Instruct-qlora-int4-eo8 to be matched against the Hugging Face repo for the plain 3.2-3B-Instruct. Reversing the order of these parameters fixes the issue. A sketch of the fix follows.
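A minimal sketch of the suspected call site and the fix (hypothetical: build_model_alias and model are the identifiers named in this report, and the exact surrounding code in inference.py may differ):

# Current (buggy): the two arguments are swapped, so the registry tries to
# resolve the quantized model's descriptor against the plain model's
# Hugging Face repo.
build_model_alias(
    model.descriptor(),         # wrongly passed as provider_model_id
    model.core_model_id.value,  # wrongly passed as model_descriptor
)

# Fixed: provider_model_id first, model_descriptor second.
build_model_alias(
    model.core_model_id.value,  # provider_model_id
    model.descriptor(),         # model_descriptor, keeps the quantization suffix
)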
Error logs
Traceback (most recent call last):
File "/home/aidan/miniconda3/envs/llamastack-meta-reference-gpu/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/aidan/miniconda3/envs/llamastack-meta-reference-gpu/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/aidan/miniconda3/envs/llamastack-meta-reference-gpu/lib/python3.10/site-packages/llama_stack/distribution/server/server.py", line 354, in
main()
File "/home/aidan/miniconda3/envs/llamastack-meta-reference-gpu/lib/python3.10/site-packages/llama_stack/distribution/server/server.py", line 286, in main
impls = asyncio.run(construct_stack(config))
File "/home/aidan/miniconda3/envs/llamastack-meta-reference-gpu/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/home/aidan/miniconda3/envs/llamastack-meta-reference-gpu/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/home/aidan/miniconda3/envs/llamastack-meta-reference-gpu/lib/python3.10/site-packages/llama_stack/distribution/stack.py", line 191, in construct_stack
await register_resources(run_config, impls)
File "/home/aidan/miniconda3/envs/llamastack-meta-reference-gpu/lib/python3.10/site-packages/llama_stack/distribution/stack.py", line 96, in register_resources
await method(**obj.model_dump())
File "/home/aidan/miniconda3/envs/llamastack-meta-reference-gpu/lib/python3.10/site-packages/llama_stack/providers/utils/telemetry/trace_protocol.py", line 101, in async_wrapper
result = await method(self, *args, **kwargs)
File "/home/aidan/miniconda3/envs/llamastack-meta-reference-gpu/lib/python3.10/site-packages/llama_stack/distribution/routers/routing_tables.py", line 236, in register_model
registered_model = await self.register_object(model)
File "/home/aidan/miniconda3/envs/llamastack-meta-reference-gpu/lib/python3.10/site-packages/llama_stack/distribution/routers/routing_tables.py", line 181, in register_object
registered_obj = await register_object_with_provider(obj, p)
File "/home/aidan/miniconda3/envs/llamastack-meta-reference-gpu/lib/python3.10/site-packages/llama_stack/distribution/routers/routing_tables.py", line 37, in register_object_with_provider
return await p.register_model(obj)
File "/home/aidan/miniconda3/envs/llamastack-meta-reference-gpu/lib/python3.10/site-packages/llama_stack/providers/utils/telemetry/trace_protocol.py", line 101, in async_wrapper
result = await method(self, *args, **kwargs)
File "/home/aidan/miniconda3/envs/llamastack-meta-reference-gpu/lib/python3.10/site-packages/llama_stack/providers/inline/inference/meta_reference/inference.py", line 115, in register_model
model = await self.model_registry_helper.register_model(model)
File "/home/aidan/miniconda3/envs/llamastack-meta-reference-gpu/lib/python3.10/site-packages/llama_stack/providers/utils/inference/model_registry.py", line 100, in register_model
raise ValueError(
ValueError: Model 'meta-llama/Llama-3.2-3B-Instruct-QLORA_INT4_EO8' is not available and no llama_model was specified in metadata. Please specify a llama_model in metadata or use a supported model identifier
Expected behavior
get_hugging_face_repo should receive "Llama3.2-3B-Instruct:int4-qlora-eo8" as model_descriptor, not just "Llama3.2-3B-Instruct" as is passed in the current config.
build_model_alias takes the parameters provider_model_id and model_descriptor, in that order. The call to build_model_alias in inference.py (line 79) should pass model.core_model_id.value first and model.descriptor() second, not in the opposite order as it currently does.
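For reference, a hypothetical illustration of the two values for this model, based on the strings quoted above (actual values come from the model's SKU definition):

model.core_model_id.value  # "Llama3.2-3B-Instruct" (base model only)
model.descriptor()         # "Llama3.2-3B-Instruct:int4-qlora-eo8" (includes the quantization variant)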