Examples

vLLM's examples are split into three categories:

If you are using vLLM from within Python code, see Offline Inference.
If you are using vLLM from an HTTP application or client, see Online Serving.
For examples of vLLM's advanced features (e.g. LMCache or Tensorizer) that are not specific to either of the above use cases, see Others.