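Algorithm Deployment: Deploying Large Models with TensorRT-LLM, with a Detailed Optimization and Analysis Tutorial (Hands-On Large-Model Deployment Project).zip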

Uploader: 38140936 | Upload time: 2026-04-20 16:58:56 | File size: 6.36MB | File type: ZIP


Resource details

(47 files, 6.36MB) 算法部署-使用TensorRT-LLM部署大模型-附详细优化+分析流程教程-优质大模型部署项目实战.zip
  TensorRT-LLM-ChatGLM3-main/
    app.py  (5.38KB)
    vLLM/
      results.txt  (750B)
      langchang_chatglm3_vllm.py  (405B)
      chatglm3_quant_awq.py  (773B)
      offline_chatglm3.py  (564B)
      prompts.txt  (22B)
      model_repo/
        vllm_model/
          config.pbtxt  (1.70KB)
          1/
            model.json  (247B)
      client.py  (8.08KB)
    Triton大模型部署.pdf  (7.41MB)
    tensorrt_llm/
      run_hf.py  (1.94KB)
      utils.py  (3.78KB)
      __init__.py  (0B)
      quantize.py  (5.64KB)
      see_chatglm3_model.py  (406B)
      process.py  (1.39KB)
      smoothquant.py  (5.14KB)
      requirements.txt  (75B)
      run_chat_trt.py  (7.83KB)
      build.py  (28.32KB)
      weight.py  (24.33KB)
      visualize.py  (2.70KB)
    langchain_chatglm3.py  (4.21KB)
    triton_inference_server/
      model_repo/
        postprocessing/
          config.pbtxt  (2.85KB)
          1/
            model.py  (9.15KB)
            __pycache__/
              model.cpython-310.pyc  (4.79KB)
        ensemble/
          config.pbtxt  (9.47KB)
        tensorrt_llm/
          config.pbtxt  (8.03KB)
        tensorrt_llm_bls/
          config.pbtxt  (4.46KB)
          1/
            model.py  (15.16KB)
            __pycache__/
              model.cpython-310.pyc  (7.00KB)
        preprocessing/
          config.pbtxt  (3.54KB)
          1/
            model.py  (14.70KB)
            __pycache__/
              model.cpython-310.pyc  (8.53KB)
    img/
      content.jpg  (92.08KB)
      face.jpg  (43.81KB)
    service/
      knowledge_service.py  (3.09KB)
      utils.py  (3.78KB)
      __init__.py  (58B)
      chatglm_service.py  (1.51KB)
      chatglm_triton_service.py  (9.33KB)
      chatglm_trtllm_service.py  (6.90KB)
      config.py  (593B)
    langchain_chatglm3_triton.py  (4.11KB)
    end_to_end_grpc_client.py  (11.98KB)
    requirements.txt  (304B)
    README.md  (591B)


Disclaimer

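Resources on 【只为小站】 are shared by users and are provided for study and research only. You must delete them within 24 hours of downloading and must not use them for any other purpose; otherwise you bear the consequences yourself. Given the nature of the Internet, 【只为小站】 cannot substantively review the ownership, legality, compliance, authenticity, scientific soundness, completeness, or validity of the works, information, and content that users transmit. Whether or not the operators of 【只为小站】 have carried out a review, users themselves bear the legal liability for any infringement or ownership disputes that may arise or have arisen from the works, information, and content they transmit.
The resources on this site are shared by users and do not represent the views or position of this site. In accordance with Article 22 of China's Regulations on the Protection of the Right of Communication through Information Networks (《信息网络传播权保护条例》), if a resource infringes rights or raises related issues, please contact the site's customer service at zhiweidada#qq.com (replace # with @). The site will give its full support and cooperation and will respond and handle the matter promptly. For more on copyright and the disclaimer, see the Copyright and Disclaimer page.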