
GPT-OSS 120B

Total Parameters: 117B
Context Length: 128K
Modality: Text
Architecture: Mixture of Experts (MoE)
License: Apache 2.0
Release Date: 5 Aug 2025
Knowledge Cutoff: Jun 2024

Technical Specifications

Active Parameters: 5.1B
Number of Experts: 128
Active Experts per Token: 4
Attention Structure: Grouped-Query Attention
Hidden Dimension Size: 2880
Number of Layers: 36
Attention Heads: -
Key-Value Heads: -
Activation Function: SwiGLU
Normalization: RMSNorm
Position Embedding: Rotary Position Embedding (RoPE)

The sketches below unpack the MoE routing arithmetic and the SwiGLU block named in this list.
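As a rough cross-check of the figures above, this sketch relates top-4 routing over 128 experts to the 117B-total / 5.1B-active split. The shared-vs-expert parameter breakdown it solves for is illustrative, not a published figure.

```python
# Illustrative arithmetic only: relates the spec-card figures to each other.
total_params = 117e9     # total parameters (spec card)
active_params = 5.1e9    # parameters used per token (spec card)
num_experts = 128        # experts per MoE layer
active_experts = 4       # experts routed to per token

# Fraction of expert parameters that fire per token under top-4 routing:
expert_fraction = active_experts / num_experts  # 4/128 = 3.125%

# If shared (attention/embedding) parameters total S, then roughly
#   active_params ~ S + expert_fraction * (total_params - S),
# so the implied shared budget is:
shared = (active_params - expert_fraction * total_params) / (1 - expert_fraction)
print(f"Implied shared params: {shared / 1e9:.1f}B")                   # ~1.5B
print(f"Implied expert params: {(total_params - shared) / 1e9:.1f}B")  # ~115.5B
```

And a minimal NumPy sketch of the SwiGLU feed-forward block. The feed-forward width here is an assumption chosen for illustration, not the model's actual dimension.

```python
import numpy as np

def swiglu_ffn(x, W_gate, W_up, W_down):
    """SwiGLU: down( SiLU(x @ W_gate) * (x @ W_up) )."""
    silu = lambda z: z / (1.0 + np.exp(-z))  # SiLU, a.k.a. swish
    return (silu(x @ W_gate) * (x @ W_up)) @ W_down

d_model, d_ff = 2880, 2880  # d_model from the card; d_ff is an assumption
rng = np.random.default_rng(0)
x = rng.standard_normal((1, d_model))
W_gate = rng.standard_normal((d_model, d_ff)) * 0.02
W_up = rng.standard_normal((d_model, d_ff)) * 0.02
W_down = rng.standard_normal((d_ff, d_model)) * 0.02
print(swiglu_ffn(x, W_gate, W_up, W_down).shape)  # (1, 2880)
```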


GPT-OSS 120B

GPT-OSS 120B is a large open-weight model from OpenAI, designed to run in data centers and on high-end desktops and laptops. It targets advanced reasoning, agentic tasks, and a broad range of developer use cases, and is text-only for both input and output.
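For local experimentation, here is a minimal text-generation sketch using Hugging Face transformers. It assumes the weights are published as openai/gpt-oss-120b and that your hardware can hold the checkpoint (see GPU Requirements below).

```python
# Minimal sketch: chat-style generation with Hugging Face transformers.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="openai/gpt-oss-120b",  # assumed repo id
    torch_dtype="auto",           # keep the checkpoint's native precision
    device_map="auto",            # shard across available GPUs
)

messages = [
    {"role": "user", "content": "Explain mixture-of-experts routing in two sentences."},
]
output = generator(messages, max_new_tokens=128)
print(output[0]["generated_text"])
```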

About GPT-OSS

Open-weight language models from OpenAI.



Evaluation Benchmarks

Rankings are among local LLMs.

Overall Rank: #5

Category        | Benchmark         | Score   | Rank
—               | —                 | 0.79    | #1 🥇
—               | —                 | 0.93    | #1 🥇
—               | —                 | 0.78    | #6
Web Development | WebDev Arena      | 1081.54 | #6
Agentic Coding  | LiveBench Agentic | 0.10    | #10
—               | —                 | 0.59    | #11
—               | —                 | 0.70    | #11
—               | —                 | 0.57    | #13

Rankings

Overall Rank: #5
Coding Rank: #1 🥇

GPU Requirements

Interactive VRAM calculator: choose a quantization method for the model weights and a context size (1K to 125K tokens) to see the VRAM required and the recommended GPUs.
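As a back-of-the-envelope version of the weights-only part of that calculation, the sketch below converts parameter count and quantization width into gigabytes. The 4.25-bit figure approximates a block-scaled 4-bit format and is an assumption; KV cache, activations, and runtime overhead come on top.

```python
# Weights-only VRAM floor for GPT-OSS 120B at common quantization widths.
# Real requirements are higher: KV cache, activations, and framework
# overhead are not included here.

def weight_vram_gb(n_params: float, bits_per_weight: float) -> float:
    """GB needed just to hold the weights at a given bit width."""
    return n_params * bits_per_weight / 8 / 1e9

total_params = 117e9  # from the spec card
for label, bits in [("FP16", 16), ("INT8", 8), ("4-bit block-scaled", 4.25)]:
    print(f"{label:>18}: {weight_vram_gb(total_params, bits):6.1f} GB")
```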