jp6/cu129/: gptqmodel versions
Because this project isn't in the mirror_whitelist, no releases from root/pypi are included.
Latest version on stage is: 4.0.0.dev0
Production-ready LLM model compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
| Index | Version | Documentation |
|---|---|---|
| jp6/cu129 | 4.0.0.dev0 | |