I'm planning to make a chat session where you can rewind the discussion and continue from an earlier point. I was planning to use save states for this, but saving the whole state after every turn is very expensive.

I'm not entirely clear on how the state works. Would it be possible to partition the state itself so that it is rewindable, and so that I would only have to store one state per session?
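Roughly, this is the shape I'm hoping for: one live state per session plus a lightweight per-turn checkpoint, so a rewind truncates the existing state instead of restoring a full snapshot. The `EvalState` and `ChatSession` names below are hypothetical stand-ins, not a real API:

```python
class EvalState:
    """Placeholder for the model's incremental state (e.g. a KV cache);
    only the token count matters for this sketch."""
    def __init__(self):
        self.n_tokens = 0

    def truncate(self, n_tokens):
        # Drop everything evaluated after position n_tokens.
        self.n_tokens = n_tokens


class ChatSession:
    def __init__(self):
        self.state = EvalState()   # the single stored state
        self.checkpoints = []      # token position at the start of each turn

    def begin_turn(self):
        # Record where this turn starts; this is all a rewind needs.
        self.checkpoints.append(self.state.n_tokens)

    def rewind_to(self, turn):
        # Cut the state back to the start of `turn` instead of
        # reloading a per-turn snapshot.
        self.state.truncate(self.checkpoints[turn])
        del self.checkpoints[turn:]
```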
Replies: 1 comment

If I were to guess what I have to do... if I'm not correct, I would appreciate a correction.

Update: I did learn about KV caches and attention masks. I may need to store the whole state here. Logits can be stored by submitting a decoder batch that has been instructed to save them; they then end up in the state image, and I can truncate the sequence back to the previous logit boundary. This is what I'm going to try eventually.
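In pseudocode, the plan looks roughly like this; `decode` and `kv_truncate` are stand-ins for whatever the library actually exposes (a batch API with a per-token save-logits flag and a KV-cache truncation call), not real names:

```python
def decode(tokens, save_logits):
    """Stub: evaluate a batch; logits for the flagged positions are kept
    in the state image, so they survive a save/restore."""

def kv_truncate(n_past):
    """Stub: discard cached entries past position n_past."""

boundaries = []   # token positions where logits were saved
n_past = 0        # tokens currently in the cache

def feed_turn(tokens):
    global n_past
    # Flag only the last token of the turn, making its position a
    # rewindable boundary with saved logits.
    decode(tokens, save_logits=[False] * (len(tokens) - 1) + [True])
    n_past += len(tokens)
    boundaries.append(n_past)

def rewind_one_turn():
    global n_past
    # Truncate the sequence back to the previous logit boundary; the
    # logits saved there let sampling resume without re-evaluating.
    boundaries.pop()                 # assumes at least one stored turn
    n_past = boundaries[-1] if boundaries else 0
    kv_truncate(n_past)
```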