Frontier Models

AI companies
Hardware, chip/GPU/TPU
Cloud Providers
Aggregators
Open Source

poe.com free 3,000 credits daily, a few tries on opus, codex.
nvidia
cerebras
openrouter
groq
opencode
alibaba
AWS, Google Cloud, Azure Grok (xAI)

Claude - Sonnet 4.6 - Opus 4.6 (Anthropic)

Gemini 3.1 (Google)

Codex - ChatGPT (OpenAI)

Docs
Practice

-----------------------------------

Minimax 2.5

DeepSeek
DeepSeek API
API Docs

Kimi K2

GLM-5

Qwen 3 (Alibaba)

Gemini CLI

Antigravity IDE

Gemini Canvas

Google AI Studio

Jules - 15 tasks per day

Stitch

Notebook LM

Firebase

Gemini Code Assist (IDE extension)

Google Co-Labs

LM Studio, llama.ccp

LM Studio 0.4 adds daemon mode for server-native deployment, parallel inference requests, and a stateful REST API that supports using local MCP servers.

Introducing llmster Parallel Requests Serve multiple concurrent inference requests. Requests to the same model can now be processed in parallel, instead of queued. Try this feature by putting two chats side-by-side in the new Split View feature. ollama
LM studio
llama.ccp
vllm-mlx on Apple Silicon

Local LLMs

Mac Studio
Apple M4 Max chip
16-core CPU, a 40-core GPU, and a 16-core Neural Engine
128GB of integrated memory
1TB SSD storage

130k baht
-----------------------

Mac Studio
Apple M3 Ultra chip
28-core CPU, a 60-core GPU, and a 32-core Neural Engine
256GB of integrated memory
1TB SSD storage

200k++ baht

https://huggingface.co/TeichAI/Qwen3-14B-Claude-4.5-Opus-High-Reasoning-Distill-GGUF/tree/main

Cloud Server

To RUN, EDIT, MANAGE a large model.
Lease hosted, managed server to use or modify Large LLMs

Distilling
Fine Tuning
Inference

Hostinger
digital Ocean, affiliate program
https://www.alibabacloud.com/en

CodeSpaces
Coding UI