August 2025 — Microsoft has announced the integration of OpenAI’s new open-weight model, gpt-oss, into its Azure AI Foundry and Windows AI Foundry platforms. This marks a major shift in AI deployment, giving developers and enterprises full control over running and customizing powerful AI models—both in the cloud and at the edge.
What is gpt-oss?
OpenAI’s gpt-oss is the company’s first open-weight model release since GPT-2, offering full transparency and control for developers. Two variants are now available:
- gpt-oss-120b: A 120-billion-parameter model designed for high-level reasoning and complex tasks like coding and advanced Q&A. Despite its size, it can run on a single enterprise-grade GPU.
- gpt-oss-20b: A lighter model optimized for tool use, coding, and workflow integration. It runs on devices with at least 16 GB of GPU VRAM and is supported on Windows today, with macOS support coming soon.
These models are designed to perform across a range of environments, from data centers to laptops, delivering low latency and high efficiency.
Full-Stack AI Development, From Cloud to Edge
Microsoft’s AI platform includes:
- Azure AI Foundry: A unified environment to build, fine-tune, and deploy AI models securely at scale.
- Windows AI Foundry: A local development stack embedded in Windows 11, supporting offline and secure AI deployments using local hardware (CPU, GPU, NPU).
With Foundry Local, developers can deploy open-source models directly on devices—making AI accessible even in bandwidth-constrained or offline settings.
Key Capabilities of the Platform
- Open-Weight Advantage: Fine-tune with custom data using methods like LoRA and QLoRA, inspect model behavior, retrain layers, or export to formats like ONNX for production use.
- Deployment Flexibility: Deploy via cloud or edge, adapt for domain-specific copilots, or scale seamlessly from prototype to production.
- Enterprise Readiness: Supports compliance, governance, and secure deployment—crucial for regulated industries and mission-critical applications.
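The LoRA technique mentioned above can be sketched in a few lines: instead of updating a full weight matrix W during fine-tuning, training learns two small low-rank factors A and B and adds their scaled product to the frozen weight. The matrix sizes, rank, and scaling factor below are illustrative placeholders, not actual gpt-oss values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes; real gpt-oss projection matrices are far larger.
d_out, d_in, r = 512, 512, 8    # r is the LoRA rank
alpha = 16                      # LoRA scaling hyperparameter

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight (not trained)
A = rng.normal(size=(r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                # trainable, zero-init so training starts at W

def lora_forward(x: np.ndarray) -> np.ndarray:
    """Forward pass: frozen weight plus the scaled low-rank update (alpha/r) * B @ A."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
y = lora_forward(x)

full_params = W.size            # what full fine-tuning would have to train
lora_params = A.size + B.size   # what LoRA actually trains
print(f"trainable params: {lora_params} (LoRA) vs {full_params} (full fine-tune)")
```

Because only A and B are trained, the trainable parameter count drops from d_out * d_in to r * (d_out + d_in), which is what makes fine-tuning large open-weight models tractable on modest hardware.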
Why It Matters
By bringing gpt-oss to Azure and Windows, Microsoft is enabling:
- Developers: To fully understand, customize, and deploy AI without relying on closed systems.
- Enterprises: To maintain data sovereignty, reduce costs, and optimize AI for specific tasks and environments.
This approach empowers organizations to build hybrid AI systems that are cloud-optional and tailored to real-world requirements.
What’s Next?
- gpt-oss-20b: Available now on Azure AI Foundry and Windows AI Foundry.
- gpt-oss-120b: Available now on Azure for data center-class deployments.
- macOS support: Coming soon via Foundry Local.
To get started:
- Use Azure CLI to deploy via the Azure AI Model Catalog.
- Try Foundry Local on your Windows machine to run gpt-oss offline.
- Check out Managed Compute pricing for detailed cost information.
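Once a model is running under Foundry Local, it is served behind an OpenAI-compatible REST endpoint, so a client can talk to it with plain HTTP. The port number and model identifier below are illustrative assumptions (the Foundry Local CLI reports the actual values for your machine); this is a minimal sketch, not an official client.

```python
import json
import urllib.request

# Assumed values for illustration; check your local Foundry Local service
# for the real endpoint port and model identifier.
BASE_URL = "http://localhost:5273/v1"
MODEL_ID = "gpt-oss-20b"

def build_chat_request(model: str, prompt: str) -> dict:
    """Assemble an OpenAI-style chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }

def ask(prompt: str) -> str:
    """POST the request to the local endpoint and return the reply text.
    Requires the Foundry Local service to be running with the model loaded."""
    body = json.dumps(build_chat_request(MODEL_ID, prompt)).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]

# A request body ready to send to the local endpoint:
print(build_chat_request(MODEL_ID, "Summarize this article in one sentence."))
```

Because the endpoint follows the OpenAI wire format, existing OpenAI client libraries can usually be pointed at the local base URL instead of writing raw HTTP like this.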
Microsoft’s move signals a broader commitment to open AI tools, developer transparency, and hybrid AI deployments—providing a foundation for the next generation of intelligent applications.