Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
Running this code:

```python
model = llama_cpp.Llama("mxbai-embed-xsmall-v1-q8_0.gguf", embedding=True)
embeddings = model.embed(["Hello", "World"])
```

used to work in v0.3.14.
Current Behavior
The code raises `RuntimeError: llama_decode returned -1`, and the following messages are printed to the console:

```
init: invalid seq_id[3][0] = 1 >= 1
encode: failed to initialize batch
```
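Not part of the original report, but as a possible interim workaround sketch (an untested assumption, since the failure appears to be specific to multi-sequence batches): embedding each input in its own call may avoid the batch-initialization error. The helper below is hypothetical and generic over any single-input embedding function, e.g. `lambda t: model.embed(t)` for a `llama_cpp.Llama` instance:

```python
from typing import Callable, List, Sequence


def embed_sequentially(
    embed_one: Callable[[str], List[float]],
    texts: Sequence[str],
) -> List[List[float]]:
    """Embed each text in a separate call, avoiding multi-sequence batches.

    `embed_one` maps a single string to its embedding vector; each input
    is processed independently, so no batch ever contains more than one
    sequence.
    """
    return [embed_one(text) for text in texts]
```

This trades throughput for compatibility: each call decodes one sequence, so the `seq_id` check that fails for the two-element batch is never triggered (assuming single-input `embed` still works, which this report does not verify).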
Environment and Context
llama-cpp-python was compiled with CUDA support.
Failure Information (for bugs)
Steps to Reproduce
```python
Python 3.11.2 (main, Apr 28 2025, 14:11:48) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import llama_cpp
>>> model = llama_cpp.Llama("../models/mxbai-embed-xsmall-v1-q8_0.gguf", embedding=True)
...
>>> embeddings = model.embed(["Hello", "World"])
decode: cannot decode batches with this context (calling encode() instead)
init: invalid seq_id[3][0] = 1 >= 1
encode: failed to initialize batch
llama_decode: failed to decode, ret = -1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../site-packages/llama_cpp/llama.py", line 1108, in embed
    decode_batch(s_batch)
  File ".../site-packages/llama_cpp/llama.py", line 1045, in decode_batch
    self._ctx.decode(self._batch)
  File ".../site-packages/llama_cpp/_internals.py", line 327, in decode
    raise RuntimeError(f"llama_decode returned {return_code}")
RuntimeError: llama_decode returned -1
```