Model endpoints
AI Inference on Dedicated GPUs
Denvr Model Endpoints provide containerized, GPU-accelerated inference services for self-hosting open source and customized AI models. Model endpoints expose OpenAI-compatible APIs with API key authentication for fast integration with agentic, chat, and LLM-based systems.
Model Catalog
The model catalog contains the most commonly used open source GenAI models for chat, reasoning, tool calling, and code generation. Each model offers different precision and quantization options to match budget and SLA requirements. Catalog models are pre-downloaded to Denvr AI Cloud to reduce startup time.
Denvr can enable tenant-specific models or different model parameterization as requested.

Creating an Endpoint
Configure your model with the following options:

Name
Unique identifier or label for the model instance, used to manage and track different models within your infrastructure.
Resource Pool
Defines how compute resources are allocated, either on-demand for dynamic allocation or reserved for dedicated single-tenant resources.
Model
Selection of a model variant by parameter count, active parameters, and model precision.
API Keys
Generate or provide secret keys used as Bearer tokens for API authorization.
Choose the GPU instance to run the model on, then launch. Models are validated to work on the listed instance types. The optimization engine uses model parallelism (MP) or data parallelism (DP) to utilize all available GPUs.
Managing Model Endpoints
The application overview displays the model overview, online status, connection info, and access to runtime logs for troubleshooting.

Endpoints can be stopped, restarted, or deleted completely.
Accessing the Endpoint
The model overview provides the private and public IPs as well as the public DNS name. The connection info shows the full HTTPS endpoint along with an example curl request.

The same URL, model name, and API key can be used in any OpenAI-compatible application or code. This includes OpenWebUI, n8n, and OpenCode.
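As a minimal sketch, a chat completion request can be built with only the Python standard library; the base URL, model name, and API key below are placeholders for the values shown in your endpoint's connection info:

```python
import json
import urllib.request

# Placeholder values -- substitute the HTTPS endpoint, model name, and
# API key from your endpoint's connection info.
BASE_URL = "https://your-endpoint.example.com/v1"
API_KEY = "your-api-key"
MODEL = "your-model-name"

def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible /chat/completions request."""
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            # The API key is sent as a Bearer token, as with OpenAI's API.
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

req = build_chat_request("Hello!")
# To send the request against a live endpoint:
# response = urllib.request.urlopen(req)
```

The same request shape works with the official OpenAI client libraries by pointing their `base_url` at the endpoint.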
Runtime Logs and Metrics
The application details screen displays the model engine's log file for troubleshooting.
Model runtime logs are currently not available via Denvr API.
Custom Models
Models not listed in our Catalog, including private models, can be run using our vLLM Server or Ollama Server applications.

The configuration for vLLM Server allows you to specify:
Launch command
Override the container entry command to include vLLM parameters.
Environment variables
View or edit parameters passed to the vLLM engine.
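For example, a launch command for a custom model might look like the following sketch; the model name, parallelism degree, and key are placeholders, and the flags shown are standard vLLM server options:

```shell
# Hypothetical launch command -- the model and values are placeholders.
# --tensor-parallel-size shards the model across GPUs,
# --dtype auto selects precision from the model config, and
# --api-key requires Bearer authentication on the endpoint.
vllm serve meta-llama/Llama-3.1-8B-Instruct \
  --tensor-parallel-size 2 \
  --dtype auto \
  --api-key "$VLLM_API_KEY"
```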