Endpoints are the live deployments of your models on the GPU resources you specify.

  • You can create an endpoint by specifying a name, a model, and an instance configuration, which sets your desired GPU specification.

  • On the endpoint's details page, you can check its health status and find the URL of the running endpoint instance.

  • To test the deployed model from the web, we provide the playground interface, where you can try out your queries in a ChatGPT-like chat view.
  • Simply enter your query, adjust your settings, and generate responses!

  • For general use, send queries to your model through our API at the endpoint address shown on the endpoint information tab.
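As a minimal sketch of sending a query over the API, the snippet below builds an authenticated JSON POST request with Python's standard library. The endpoint URL, API key, and the chat-style payload schema are assumptions for illustration; substitute the actual address and request format shown on your endpoint information tab.

```python
import json
import urllib.request

# Hypothetical placeholders -- replace with the values from your
# endpoint information tab.
ENDPOINT_URL = "https://example.com/v1/chat/completions"
API_KEY = "YOUR_API_KEY"

def build_request(prompt: str, max_tokens: int = 256,
                  temperature: float = 0.7) -> urllib.request.Request:
    """Build an authenticated POST request carrying the query as JSON.

    The payload shape here mirrors a common chat-completions schema;
    your endpoint's actual schema may differ.
    """
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        ENDPOINT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )

req = build_request("Hello, what can you do?")
# To actually send the query, uncomment:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

The request is constructed separately from sending it, so you can inspect the payload and headers before pointing the code at your live endpoint.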