FastServe: Distributed Inference Serving System for LLMs
Peking University Researchers Introduce FastServe: A Distributed Inference Serving System for Large Language Models (LLMs)
Improvements in large language models (LLMs) create opportunities in various fields and inspire a new wave of interactive AI applications. The most noteworthy one is ChatGPT, …