
Releases: vllm-project/vllm

vLLM v0.1.3

02 Aug 2023 · aa84c92

What's Changed

Major changes

  • More model support: LLaMA 2, Falcon, GPT-J, Baichuan, and others (see the loading sketch after this list).
  • Efficient support for multi-query attention (MQA) and grouped-query attention (GQA).
  • Scheduling change: vLLM now uses TGI-style continuous batching (illustrated after this list).
  • Many bug fixes.
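The newly supported models load through the same entry point as earlier releases. A minimal sketch, assuming the usual `LLM`/`SamplingParams` API and using a LLaMA 2 checkpoint as an illustrative example; any of the newly supported architectures should load the same way, and MQA/GQA models such as Falcon need no extra flags:

```python
from vllm import LLM, SamplingParams

# Example checkpoint only; swap in Falcon, GPT-J, Baichuan, etc.
# MQA/GQA support is handled internally, with no extra arguments.
llm = LLM(model="meta-llama/Llama-2-7b-chat-hf")

sampling_params = SamplingParams(temperature=0.8, max_tokens=128)
outputs = llm.generate(["Hello, my name is"], sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```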
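For readers unfamiliar with the term, continuous batching admits and retires sequences at every decoding iteration instead of waiting for a whole static batch to drain. The toy loop below is an illustrative sketch of the idea only, not vLLM's actual scheduler; `step` and `is_finished` are hypothetical callbacks standing in for a real decoding step and stop check:

```python
from collections import deque

def continuous_batching(requests, step, is_finished, max_batch_size=8):
    """Toy iteration-level scheduler, NOT vLLM's implementation."""
    waiting = deque(requests)
    running, finished = [], []
    while waiting or running:
        # Admit new requests as soon as slots free up, rather than
        # waiting for the whole batch to finish (static batching).
        while waiting and len(running) < max_batch_size:
            running.append(waiting.popleft())
        step(running)  # one decoding iteration over the current batch
        still_running = []
        for seq in running:
            (finished if is_finished(seq) else still_running).append(seq)
        running = still_running
    return finished
```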

Full Changelog: v0.1.2...v0.1.3

vLLM v0.1.2

05 Jul 2023 · 1c395b4

What's Changed

  • Initial support for GPTBigCode
  • Support for MPT and BLOOM
  • Custom tokenizer support (see the sketch after this list)
  • ChatCompletion endpoint in the OpenAI demo server (see the example after this list)
  • Code formatting
  • Various bug fixes and improvements
  • Documentation improvements
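A minimal sketch of the custom tokenizer option, assuming the `tokenizer` argument to `LLM` accepts a Hugging Face name or local path independently of the model weights; the checkpoint names here are illustrative placeholders:

```python
from vllm import LLM

# Assumption: `tokenizer` may point to a different name/path than the
# model weights; both are set explicitly here for illustration.
llm = LLM(
    model="bigscience/bloom-560m",
    tokenizer="bigscience/bloom-560m",
)
print(llm.generate(["The capital of France is"])[0].outputs[0].text)
```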
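And a minimal sketch of calling the new ChatCompletion endpoint, assuming the demo server was started with `python -m vllm.entrypoints.openai.api_server --model <model>` and listens on the default port 8000; the model name below is a placeholder:

```python
import requests

# POST to the OpenAI-compatible chat endpoint of a locally running
# demo server (default port assumed to be 8000).
response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "facebook/opt-125m",  # placeholder model name
        "messages": [
            {"role": "user", "content": "Say hello in one sentence."}
        ],
    },
)
print(response.json()["choices"][0]["message"]["content"])
```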

Contributors

Thanks to the following amazing people who contributed to this release:

@michaelfeil @WoosukKwon @metacryptom @merrymercy @BasicCoder @zhuohan123 @twaka @comaniac @neubig @JRC1995 @LiuXiaoxuanPKU @bm777 @Michaelvll @gesanqiu @ironpinguin @coolcloudcol @akxxsb

Full Changelog: v0.1.1...v0.1.2

vLLM v0.1.1 (Patch)

22 Jun 2023 · 83658c8

What's Changed

Full Changelog: v0.1.0...v0.1.1

vLLM v0.1.0

20 Jun 2023 · 67d96c2

The first official release of vLLM!

See our README for details.

Thanks

Thanks to @WoosukKwon, @zhuohan123, and @suquark for their contributions.