Google

Gemma 2 2B

Google's compact 2.6B model. Efficient and capable for mobile use.

2.6B parameters · gemma2 · gemma · 8K context · 2.09 GB - 3.09 GB VRAM

About This Model

Gemma 2 2B is a large language model developed by Google, with 2.6 billion parameters and a design focused on efficient local deployment. It excels at text generation tasks, including creative writing, summarization, and conversational responses. With a context length of 8,192 tokens, it can handle longer sequences of text, making it suitable for applications that require understanding and generating coherent, context-rich content. The model is distributed under the Gemma license, which keeps it broadly accessible while imposing certain usage guidelines.

Compared to other models in its size class, Gemma 2 2B punches well above its weight. It balances performance against resource efficiency, making it a strong contender for anyone who needs robust text generation without high-end hardware. The available quantizations (Q4_K_M, Q8_0) further improve its efficiency, allowing it to run smoothly on systems with as little as 2.09 GB of VRAM. This makes it an ideal choice for developers, hobbyists, and small teams looking to deploy a capable yet manageable language model on mid-range GPUs, or even on CPU. Realistically, anyone with a modern laptop or desktop with at least 4 GB of RAM and a modest GPU can use Gemma 2 2B for a wide range of text-based projects.

Check Your Hardware

See which quantizations of Gemma 2 2B your hardware can run.
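The hardware check above boils down to comparing your VRAM and RAM against the per-quantization requirements in the table below. A minimal sketch of that logic, using the figures from this page (the `runnable_quants` helper is illustrative, not part of any real API):

```python
# VRAM/RAM requirements per Gemma 2 2B quantization, taken from the
# table on this page. Quality is the page's relative quality score.
QUANTS = {
    "Q4_K_M": {"vram_gb": 2.09, "ram_gb": 2.59, "quality": 0.85},
    "Q8_0":   {"vram_gb": 3.09, "ram_gb": 3.59, "quality": 0.98},
}

def runnable_quants(vram_gb: float, ram_gb: float) -> list[str]:
    """Return the quantizations whose VRAM and RAM needs fit the given budget."""
    return [
        name for name, req in QUANTS.items()
        if vram_gb >= req["vram_gb"] and ram_gb >= req["ram_gb"]
    ]

print(runnable_quants(vram_gb=4.0, ram_gb=8.0))  # both quantizations fit
print(runnable_quants(vram_gb=2.5, ram_gb=4.0))  # only Q4_K_M fits
```

A 4 GB GPU comfortably clears both quantizations, while a 2.5 GB budget leaves only Q4_K_M, which matches the recommendations in the FAQ below.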

Quantization Options

Quantization  Bits  File Size  VRAM Needed  RAM Needed  Quality
Q4_K_M        4.5   1.591 GB   2.09 GB      2.59 GB     85%
Q8_0          8     2.593 GB   3.09 GB      3.59 GB     98%
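The file sizes above follow roughly from parameter count times bits per weight. A back-of-the-envelope estimate (decimal GB; note that K-quants keep some tensors at higher precision, so the real Q4_K_M file lands somewhat above the nominal 4.5-bit figure):

```python
# Rough GGUF file-size estimate: parameters * bits-per-weight / 8 bits-per-byte.
PARAMS = 2.6e9  # Gemma 2 2B parameter count

def estimated_size_gb(bits_per_weight: float, params: float = PARAMS) -> float:
    """Nominal file size in decimal GB, ignoring mixed-precision tensors and metadata."""
    return params * bits_per_weight / 8 / 1e9

print(f"Q4_K_M (4.5 bpw): ~{estimated_size_gb(4.5):.2f} GB")  # table lists 1.591 GB
print(f"Q8_0   (8.0 bpw): ~{estimated_size_gb(8.0):.2f} GB")  # table lists 2.593 GB
```

The Q8_0 estimate (~2.60 GB) matches the table almost exactly, while Q4_K_M's listed 1.591 GB exceeds the nominal ~1.46 GB because its effective bits per weight are higher than 4.5.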

See It In Action

Real model outputs generated via RunThisModel.com — watch responses stream in real time.

Outputs generated by real AI models via RunThisModel.com. Generation speed shown is from cloud inference. Local speeds vary by hardware — check your device.

Frequently Asked Questions

How much VRAM do I need to run Gemma 2 2B?

Gemma 2 2B requires a minimum of 2.09 GB of VRAM with Q4_K_M quantization. With the near-lossless Q8_0 quantization, you need 3.09 GB of VRAM.

What is the best quantization for Gemma 2 2B?

Q4_K_M offers the best balance of quality and VRAM usage. Q8_0 is near-lossless if you have enough VRAM.