Enterprise AI
Unified Gateway & Agent Deployment

Unified LLM API, online AI apps, custom agents & workflows, org-level quotas & billing — WANFLOW.AI runs everything your team needs on a single, globally available platform.

Start Free (5M tokens) · Contact Enterprise

12 Global Nodes · 99.99% SLA
OpenAI SDK compatible, switch in 5 minutes
Enterprise Console · Per-employee quotas & usage

api.wanflow.ai/v1/chat/completions

# OpenAI SDK compatible: change one base_url to use any model
curl "https://api.wanflow.ai/v1/chat/completions" \
  -H "Authorization: Bearer $WANFLOW_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-8",
    "messages": [{"role":"user","content":"Analyze Q4 sales data"}],
    "fallback": ["gpt-5", "gemini-2.5-pro"]
  }'

from openai import OpenAI

client = OpenAI(
    base_url="https://api.wanflow.ai/v1",
    api_key="wf-...",
)

resp = client.chat.completions.create(
    model="claude-opus-4-8",
    messages=[{"role": "user", "content": "Analyze Q4 sales data"}],
    extra_body={"fallback": ["gpt-5", "gemini-2.5-pro"]},
)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.wanflow.ai/v1",
  apiKey: process.env.WANFLOW_KEY,
});

const resp = await client.chat.completions.create({
  model: "claude-opus-4-8",
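  messages: [{ role: "user", content: "Analyze Q4 sales data" }],
  // Non-standard gateway field: the SDK sends extra body fields as-is; in TypeScript, add `// @ts-expect-error` on the line above it
  fallback: ["gpt-5", "gemini-2.5-pro"],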
});

CONSOLE · ACME CORP

This Month Usage

¥ 18,420.42 · ↓ 32% vs direct

142 employees · 18 projects · 4 agents

claude-opus-4-8	42.1M tokens	¥ 8,420
gpt-5	28.6M tokens	¥ 5,720
gemini-2.5-pro	19.4M tokens	¥ 3,180
flux-1.1-pro	1,240 images	¥ 1,100

200+ Models Supported · Leading global LLMs

12 Global Nodes · CN / US / EU / SEA · auto nearest routing

99.99% SLA Uptime · Multi-model auto-fallback · real-time circuit breaker

1.2M+ Developers · Serving 38,000+ teams & enterprises

Claude

OpenAI

Gemini

DeepSeek

Mistral

Llama 4

Qwen 3

Kimi K2

Grok

Flux

Veo 3

Sora

USAGE · TOP MODELS

Usage Ranking · Last 30 Days

+12% MoM

Claude Sonnet 4.7	38.2%
GPT-5	24.7%
Gemini 2.5 Pro	14.1%
DeepSeek V3.2	9.8%
Llama 4 Maverick	5.4%
Qwen 3 Max	4.2%
Others	3.6%

91.4M TOKENS · 30D

WANFLOW · ROUTING

View Full Model Matrix →

◇ UNIFIED API · 01

Unified LLM API Platform
Smart Routing · Unified Billing

100% OpenAI SDK compatible — change base_url to access 200+ models
12 global nodes with nearest routing, end-to-end median latency 178ms
Multi-model auto-fallback, zero disruption for callers
Key isolation · usage attribution · enterprise-grade security

API · ROUTING · wanflow.ai/v1

REQUEST · OPENAI SDK

POST /chat/completions

Compatible mode · change base_url

ROUTER · SMART

Smart Router

→ claude · gpt-5 · gemini

PRIMARY

Claude Sonnet

HK · 178ms

FALLBACK

GPT-5

Standby · 220ms

FALLBACK

Gemini 2.5

Standby · 312ms

BILLING · UNIFIED

Unified Billing

$0.012 · 1.2K tokens · 178ms

5 NODES · 6 EDGES

LIVE · 178MS P50

Global Nearest Routing · 12 zones

Auto-connect to nearest healthy node, 5s failover on zone failure. Dual-line BGP + Anycast for domestic and international users.
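Nearest-healthy-node selection can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Node` class, the node labels, and the latency figures are hypothetical, and WANFLOW's actual routing logic is not described on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str          # hypothetical node label
    latency_ms: float  # latest measured round-trip time from the caller
    healthy: bool      # result of the most recent health check


def pick_node(nodes: list[Node]) -> Optional[Node]:
    """Return the healthy node with the lowest latency, or None if all are down."""
    candidates = [n for n in nodes if n.healthy]
    return min(candidates, key=lambda n: n.latency_ms, default=None)


nodes = [
    Node("HK", 178, True),
    Node("US", 220, True),
    Node("EU", 312, True),
]
print(pick_node(nodes).name)  # HK

nodes[0].healthy = False      # simulate a zone failure
print(pick_node(nodes).name)  # US
```

Failing over then amounts to re-running the selection after a health check marks a node as down.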

Multi-Model Fallback · fallback

Declare fallback models in your request. When the primary model times out, rejects the request, or is rate-limited, the gateway automatically retries with the next model in the list, transparently to callers.
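The gateway applies fallback on the server side, so callers do not need to write this themselves. To illustrate the behavior, here is a minimal client-side sketch of the same idea. The `ModelUnavailable` exception, the `complete_with_fallback` helper, and the stub backend are all hypothetical names introduced for the example.

```python
from typing import Callable, Optional


class ModelUnavailable(Exception):
    """Hypothetical error standing in for a timeout, rejection, or rate limit."""


def complete_with_fallback(
    call_model: Callable[[str, str], str],
    prompt: str,
    primary: str,
    fallback: list[str],
) -> tuple[str, str]:
    """Try the primary model, then each fallback in order.

    Returns (model_used, reply). Raises RuntimeError if every model fails.
    """
    last_error: Optional[Exception] = None
    for model in [primary, *fallback]:
        try:
            return model, call_model(model, prompt)
        except ModelUnavailable as err:
            last_error = err  # remember the failure and try the next model
    raise RuntimeError("all models failed") from last_error


# Stub backend: the primary is rate-limited, the first fallback answers.
def fake_backend(model: str, prompt: str) -> str:
    if model == "claude-opus-4-8":
        raise ModelUnavailable("429 rate limited")
    return f"{model} reply to: {prompt}"


used, reply = complete_with_fallback(
    fake_backend, "Analyze Q4 sales data", "claude-opus-4-8", ["gpt-5", "gemini-2.5-pro"]
)
print(used)  # gpt-5
```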

Semantic Cache · Up to 50% Cost Reduction · cache

Prompts that are semantically similar to an earlier request are served from a vector cache and consume 0 tokens. Monthly bills drop by 30–50% on average. Hit rate and recall thresholds are configurable.
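The idea behind a semantic cache can be sketched as below. This is a toy illustration, not WANFLOW's implementation: the `SemanticCache` class is hypothetical, the word-count "embedding" stands in for a real vector model, and the 0.8 threshold is an arbitrary example value.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real system would use a vector model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold  # configurable similarity cutoff
        self.entries: list[tuple[Counter, str]] = []

    def get(self, prompt: str):
        """Return the cached reply of the most similar prompt, or None on a miss."""
        query = embed(prompt)
        best = max(self.entries, key=lambda e: cosine(query, e[0]), default=None)
        if best is not None and cosine(query, best[0]) >= self.threshold:
            return best[1]
        return None

    def put(self, prompt: str, reply: str) -> None:
        self.entries.append((embed(prompt), reply))


cache = SemanticCache(threshold=0.8)
cache.put("Analyze Q4 sales data", "cached Q4 analysis")

print(cache.get("analyze the Q4 sales data"))  # cached Q4 analysis
print(cache.get("Draft a refund email"))       # None
```

Raising the threshold trades hit rate for precision: fewer prompts are served from cache, but those that are match the original more closely.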

Image gen, video gen, voiceover, document analysis — ready-to-use AI tools.

Package the most powerful generative models into tools your team will actually use. Marketing doesn't need to learn APIs, and PMs can generate images directly: image, video, voice, and document parsing, all in one workspace.

flux · seedream · midjourney

IMAGE

AI Image Generation

14 models · batch · style library · commercial license

veo 3 · sora · kling 2

VIDEO

AI Video Generation

Text/image to 8s clips · 4K · commercial license

elevenlabs · minimax

VOICE

Voice Cloning · TTS

32 languages · emotion control · real-time duplex

PDF · DOCX · Knowledge Base

DOCUMENT

Document Analysis / RAG

Enterprise knowledge base · 100+ batch docs · citation links

Browse All 24 AI Apps

Turn business processes into intelligent workflows

Visual canvas orchestration with LLM nodes, tool calls, vector retrieval, human approval, and scheduled triggers: drag, connect, and deploy with zero code. Every enterprise knowledge base mounts with one click; employees trigger workflows via IM / API / tickets, and all results are traceable.

200+ Node Building Blocks

LLM, tools, retrieval, API, database, SaaS integrations — all pre-installed

Human Approval Nodes

High-risk actions route to Lark / DingTalk / WeCom for approval before continuing

Full Observability

Every step's duration, tokens, cost, and errors are tracked — replayable & rollbackable

View Workflow Templates

WORKFLOW · Customer Intent Classification

TRIGGER

New Ticket

Lark / Email

LLM · CLAUDE

Intent Classification

→ refund · ship · other

ROUTER

Branch by Intent

TOOL · Refund

Check Order → Refund

Needs approval · Finance

TOOL · Logistics

Check Shipping Status

SF · JD · Meituan

Reply to Customer

6 nodes · 6 edges · running · 24,140 runs / 30d
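The branching in the sample workflow above can be sketched in code. This is an illustration only: on the platform the flow is built on the visual canvas, the keyword rule below stands in for the LLM classification node, and the function names and reply strings are hypothetical.

```python
def classify_intent(ticket: str) -> str:
    """Stand-in for the LLM node: a keyword rule returning refund, ship, or other."""
    text = ticket.lower()
    if "refund" in text:
        return "refund"
    if "shipping" in text or "delivery" in text:
        return "ship"
    return "other"


def handle_ticket(ticket: str, finance_approved: bool = False) -> str:
    """Route a ticket by intent; refunds pause until Finance approves."""
    intent = classify_intent(ticket)
    if intent == "refund":
        if not finance_approved:
            return "pending: waiting for Finance approval"
        return "reply: refund issued"
    if intent == "ship":
        return "reply: shipping status attached"
    return "reply: general answer"


print(handle_ticket("I want a refund for order 1024"))
print(handle_ticket("I want a refund for order 1024", finance_approved=True))
print(handle_ticket("Where is my delivery?"))
```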

Enterprise AI Usage Control

Org / department / project three-tier accounts, per-employee token quotas, auto-alert on overuse; private VPC deployment, SSO/SAML, audit logs, SOC 2 — everything enterprise IT cares about, natively built in.

QUOTA

Per-Employee Token Quotas

142 employees

Employee	This Month Usage	Quota	Status
Zhang Ming · Algorithm · eng@acme	8.4M	10M	Normal
Li Xin · Product · pm@acme	4.9M	5M	Near Limit
Wang Chen · Design · design@acme	1.2M	3M	Normal
Chen Hao · Operations · ops@acme	3.0M	3M	Exhausted

Auto-pause on overuse · email / Lark alerts

Learn about quota policies →
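The status column in the quota table can be reproduced with a simple threshold rule. The sketch below is a guess at the logic, not the documented policy: the `quota_status` helper is hypothetical, and the 90% near-limit ratio is an assumption chosen because it is consistent with the four rows shown.

```python
def quota_status(used: float, quota: float, near_limit_ratio: float = 0.9) -> str:
    """Classify usage against a quota. The 0.9 near-limit ratio is an assumption."""
    if used >= quota:
        return "Exhausted"
    if used >= near_limit_ratio * quota:
        return "Near Limit"
    return "Normal"


# Usage and quota in millions of tokens, taken from the quota table.
rows = {
    "Zhang Ming": (8.4, 10),
    "Li Xin": (4.9, 5),
    "Wang Chen": (1.2, 3),
    "Chen Hao": (3.0, 3),
}
for name, (used, quota) in rows.items():
    print(name, quota_status(used, quota))
```

An auto-pause rule would then block new requests for any employee whose status is "Exhausted" and send an alert at "Near Limit".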

SECURITY & COMPLIANCE

Compliance & Isolation

SOC 2 · ISO 27001

SSO / SAML / SCIM

Lark · DingTalk · Okta · Azure AD · Google · custom IDP, one-click integration

Private VPC / Domestic Compliance

Alibaba Cloud / Tencent Cloud / Volcengine / AWS single-tenant private deployment, keys never leave your domain

Audit Logs · Data Retention Policies

Every call is traceable · prompt/response retention configurable: 0 / 24h / 7d / 90d

Tiered Sensitive Data Masking

Names, phones, IDs, bank cards: auto-masked before the request goes upstream, restored in the response on the way back
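Mask-then-restore can be sketched as follows for one data type. This is a simplified illustration: the `mask` and `restore` helpers and the `<PHONE_n>` placeholder format are hypothetical, and the pattern assumes 11-digit mobile numbers starting with 1. A production system would cover names, IDs, and bank cards with more robust detection.

```python
import re

# Assumption: 11-digit mobile numbers starting with 1.
PHONE = re.compile(r"\b1\d{10}\b")


def mask(text: str) -> tuple[str, dict[str, str]]:
    """Replace each phone number with a placeholder and return the mapping."""
    mapping: dict[str, str] = {}

    def _sub(match: re.Match) -> str:
        token = f"<PHONE_{len(mapping) + 1}>"
        mapping[token] = match.group(0)
        return token

    return PHONE.sub(_sub, text), mapping


def restore(text: str, mapping: dict[str, str]) -> str:
    """Put the original values back into a model response."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


original = "Call Zhang at 13800138000 about the refund."
masked, mapping = mask(original)
print(masked)                    # Call Zhang at <PHONE_1> about the refund.
print(restore(masked, mapping))  # Call Zhang at 13800138000 about the refund.
```

Only the masked text is sent to the upstream model; the mapping stays inside the gateway and is applied to the response before it reaches the caller.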

Learn About Enterprise Plans

OWN STACK

Own Models, Own Compute

More than an API gateway — we build our own H100/H200 compute centers, 100% solar-powered, and train proprietary industry models. From hardware to models, every layer is in our hands.

COMPUTE

Self-Built Compute Center

NVIDIA

End-to-end self-built data center with NVIDIA H100 / H200 GPU clusters, NVLink + InfiniBand high-speed interconnect. Inference and training resources are independently pooled, no peak contention, SLA guaranteed by us.

H100 · Primary Inference

H200 · Large Model Training

NVLink · Ultra-Low Latency

SOLAR POWER

100% Solar Powered

Direct PV Supply

Compute centers sited in high-sunlight regions with rooftop + campus PV arrays. Solar powers GPU clusters by day, battery storage takes over at night. Every inference comes with a traceable Renewable Energy Certificate (REC) for true carbon neutrality.

100% · Solar Energy

REC · Traceable RECs

0 t · Scope 2 Emissions

IN-HOUSE MODELS

Proprietary Industry Models

WanFlow

WanFlow Tide series models trained on our own compute, covering code, writing, and more. Industry fine-tuning and private deployment available — data never leaves your domain, tuned to your business.

6+ · Model Series

Fine-Tuning · Available

Exclusive Deployment · Available

Global Multi-Region Deployment

Coverage across CN / US / EU / SEA / JP / KR major cities, dual-line BGP + Anycast. Single-node failure switches in 5s, zone circuit break without business interruption, end-to-end median latency 178ms.

178ms · P50 Latency

5s · Failover

99.99% · SLA

2.4B · Monthly Calls

38,000+ teams are running AI in production.

"We used to need three SDKs, three invoices, three monitoring systems for three models. After switching to WANFLOW, billing is unified and monthly costs dropped 38%."

Luo Siyuan

CTO · Qiming Cloud

"Our ops team of 30+ people publishes promotions daily with image and text generation — no code written, everything runs on the workflow canvas."

Lin Xuerui

Head of Marketing · Coastal Consumer

"The employee quota feature plugged our team's token black hole — IT can finally explain every penny to finance."

Zhu Xiaopei

IT Director · Feituo Logistics

Qiming Cloud

Coastal Consumer

Feituo Logistics

Lanbo Medical

Lanqiao Finance

Haineng Education
