llm-api.arc.vt.edu

Description

https://llm-api.arc.vt.edu/api/v1/ provides an OpenAI-compatible API endpoint to a selection of LLMs hosted and run by ARC. It is based on the Open WebUI platform, and integrates several inference and integration capabilities, including retrieval-augmented generation (RAG), web search, vision, and image generation.

Access

  • All Virginia Tech students, faculty, and staff may access the service at https://llm-api.arc.vt.edu/api/v1/ using a personal API key. No separate ARC account is required to use the API.

  • There is no charge to individual users for accessing the hosted models via the API.

  • Users must generate an API key at https://llm.arc.vt.edu under User profile > Settings > Account > API keys. Keys are unique to each user and must be kept confidential; do not share them.
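To avoid committing a key to source files, one option is to read it from an environment variable at runtime. A minimal sketch (the variable name ARC_LLM_API_KEY is our own convention, not something ARC defines):

```python
import os
from typing import Optional


def auth_headers(api_key: Optional[str] = None) -> dict:
    """Build the Authorization header, reading the key from the
    ARC_LLM_API_KEY environment variable when none is passed."""
    key = api_key or os.environ.get("ARC_LLM_API_KEY")
    if not key:
        raise RuntimeError("Set ARC_LLM_API_KEY or pass an API key explicitly")
    return {"Authorization": f"Bearer {key}"}
```

Exporting the key once in your shell profile then keeps it out of scripts you might share.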

Restrictions

  • Data classification restriction. Researchers must not use this tool with high-risk data. The service is not approved for processing or storing sensitive or regulated data unless an explicit, documented exception and additional protective controls are authorized by the VT Privacy and Research Data Protection Program.

  • Users are subject to API and web interface limits to ensure fair usage: 60 requests per minute, 1000 requests per hour, and 2000 requests in a 3-hour sliding window.
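Clients that exceed these limits will typically receive an HTTP 429 response; a simple way to stay within them is to back off exponentially between retries. A minimal sketch (the retry policy below is illustrative, not an ARC recommendation):

```python
import time


def backoff_delays(retries: int, base: float = 1.0) -> list:
    """Delays that double after each attempt: base, 2*base, 4*base, ..."""
    return [base * (2 ** i) for i in range(retries)]


def post_with_retry(url: str, headers: dict, payload: dict, retries: int = 4):
    """POST the payload, sleeping and retrying whenever the server
    answers HTTP 429 (rate limit exceeded)."""
    import requests  # imported lazily so backoff_delays() needs no extras
    resp = None
    for delay in backoff_delays(retries):
        resp = requests.post(url, headers=headers, json=payload, timeout=60)
        if resp.status_code != 429:
            return resp
        time.sleep(delay)
    return resp  # last (still rate-limited) response after exhausting retries
```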

Models

ARC currently runs three state-of-the-art models, and will add or remove models and scale instances dynamically in response to user demand. Select your preferred model in the request body, e.g. "model": "GLM-4.5-Air".

  • Z.ai GLM-4.5-Air (see model card on Hugging Face). High-performance open-weight model.

  • QuantTrio GLM-4.5V-AWQ (see model card on Hugging Face). GLM variant with vision capabilities.

  • OpenAI gpt-oss-120b (see model card on Hugging Face). OpenAI’s flagship open-weight model.
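Because the hosted model list changes over time, a client can query the /models endpoint at runtime and fall back gracefully when its preferred model is unavailable. A sketch under that assumption (pick_model is our own helper, not part of any API):

```python
def pick_model(available: list, preferred: str) -> str:
    """Return the preferred model id if the server offers it,
    otherwise fall back to the first model in the list."""
    return preferred if preferred in available else available[0]


if __name__ == "__main__":
    # Query the live model list; requires the openai package and a valid key.
    from openai import OpenAI
    client = OpenAI(
        api_key="sk-your-API-key-here",
        base_url="https://llm-api.arc.vt.edu/api/v1",
    )
    ids = [m.id for m in client.models.list().data]
    print(pick_model(ids, "GLM-4.5-Air"))
```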

Security

This service is hosted entirely on-premises within the ARC infrastructure. No data is sent to any third party outside of the university. All user interactions are logged and preserved in compliance with VT IT Security Office Data Protection policies.

Disclaimer

ARC has implemented safeguards to mitigate the risk of generating unlawful, harmful, or otherwise inappropriate content. Despite these measures, LLMs may still produce inaccurate, misleading, biased, or harmful information. Use of this service is undertaken entirely at the user’s own discretion and risk. The service is provided “as is”, and, to the fullest extent permitted by applicable law, ARC and VT expressly disclaim all warranties, whether express or implied, as well as any liability for damages, losses, or adverse consequences that may result from the use of, or reliance upon, the outputs generated by the models. By using this service, the user acknowledges and accepts these conditions, and agrees to comply with all applicable terms and conditions governing the use of the hosted models, associated software, and underlying platforms.

Examples

Please read the OpenAI API documentation for a comprehensive guide to the different ways to interact with the LLMs. You may also consult the Open WebUI documentation for API endpoints for additional examples involving retrieval-augmented generation (RAG), knowledge collections, image generation, tool calling, web search, etc.
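The Open WebUI docs describe a simple RAG flow: upload a document to the files endpoint, then reference its id in the chat payload's "files" field. A sketch of that flow against this service (the file name notes.pdf is a placeholder, and the exact endpoint behavior on ARC's deployment should be confirmed against the Open WebUI docs):

```python
API_BASE = "https://llm-api.arc.vt.edu/api/v1"
API_KEY = "sk-your-API-key-here"


def rag_payload(model: str, question: str, file_id: str) -> dict:
    """Chat payload referencing an uploaded file, per the Open WebUI RAG docs."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "files": [{"type": "file", "id": file_id}],
    }


if __name__ == "__main__":
    import requests
    headers = {"Authorization": f"Bearer {API_KEY}"}
    # Upload a document; Open WebUI returns its file id in the response.
    with open("notes.pdf", "rb") as f:
        upload = requests.post(f"{API_BASE}/files/", headers=headers,
                               files={"file": f}).json()
    payload = rag_payload("GLM-4.5-Air", "Summarize this document.", upload["id"])
    resp = requests.post(f"{API_BASE}/chat/completions",
                         headers=headers, json=payload)
    print(resp.json()["choices"][0]["message"]["content"])
```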

Chat Completions Curl API

curl -X POST "https://llm-api.arc.vt.edu/api/v1/chat/completions" \
  -H "Authorization: Bearer sk-your-API-key-here" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "GLM-4.5-Air",
        "messages": [{"role":"user","content":"Why is the sky blue?"}]
      }'

Chat Completions Python API

from openai import OpenAI
import argparse

# Point the OpenAI client at the ARC server with your API key.
openai_api_key = "sk-your-API-key-here"
openai_api_base = "https://llm-api.arc.vt.edu/api/v1"

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is Virginia Tech known for?"},
]


def parse_args():
    parser = argparse.ArgumentParser(description="Client for API server")
    parser.add_argument(
        "--stream", action="store_true", help="Enable streaming response"
    )
    return parser.parse_args()


def main(args):
    client = OpenAI(
        api_key=openai_api_key,
        base_url=openai_api_base,
    )

    models = client.models.list()
    model = models.data[0].id

    # Chat Completions API
    chat_completion = client.chat.completions.create(
        messages=messages,
        model=model,
        stream=args.stream,
    )

    print("-" * 50)
    print("Chat completion results:")
    if args.stream:
        # Streamed responses arrive as chunks; print each text delta as it comes.
        for chunk in chat_completion:
            if chunk.choices and chunk.choices[0].delta.content:
                print(chunk.choices[0].delta.content, end="", flush=True)
        print()
    else:
        print(chat_completion.choices[0].message.content)
    print("-" * 50)


if __name__ == "__main__":
    args = parse_args()
    main(args)

Image generation Python API

import base64
import requests
from urllib.parse import urlparse
from openai import OpenAI

openai_api_key = "sk-your-API-key-here"
openai_api_base = "https://llm-api.arc.vt.edu/api/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

response = client.images.generate(
    model="gpt-oss-120b",
    prompt="A gray tabby cat hugging an otter with an orange scarf",
    size="512x512",
)

# Open WebUI returns a relative image URL; prepend the server origin and
# fetch it with the same Authorization header.
base_url = urlparse(openai_api_base)
image_url = f"{base_url.scheme}://{base_url.netloc}" + response.data[0].url
headers = {"Authorization": f"Bearer {openai_api_key}"}
img_data = requests.get(image_url, headers=headers).content

with open("output.png", "wb") as handler:
    handler.write(img_data)

Image to text Python API

import base64
import requests
from pathlib import Path
from typing import Optional, Any

openai_api_key = "sk-your-API-key-here"

def convert_image_to_base64(image_path: str) -> str:
    path = Path(image_path)
    if not path.exists():
        raise FileNotFoundError(f"Image file not found: {path}")

    return base64.b64encode(path.read_bytes()).decode("utf-8")

def chat_with_model(user_prompt: str, image_path: Optional[str] = None) -> Any:
    url = "https://llm-api.arc.vt.edu/api/v1/chat/completions"

    headers = {
        "Authorization": f"Bearer {openai_api_key}",
        "Content-Type": "application/json"
    }

    content_list = [{"type": "text", "text": user_prompt}]
    
    if image_path:
        image_b64 = convert_image_to_base64(image_path)
        content_list.append({
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}
        })

    data = {
        "model": "GLM-4.5V-AWQ",
        "messages": [{"role": "user", "content": content_list}]
    }

    try:
        response = requests.post(url, headers=headers, json=data, timeout=30)
        response.raise_for_status()
    except requests.RequestException as e:
        print(f"Request failed: {e}")
        return None

    try:
        return response.json()["choices"][0]["message"]["content"]
    except (ValueError, KeyError, IndexError):
        print(f"Unexpected response structure: {response.text}")
        return None


if __name__ == "__main__":
    result = chat_with_model("Describe the picture", "bonnie.jpg")
    if result:
        print("Model response:", result)