HuggingFace

SmolLM2 360M

Compact 360M model. Good for basic tasks on very constrained devices.

0.36B parameters · smollm · apache-2.0 · 8K context · 0.75GB – 0.86GB VRAM

About This Model

SmolLM2 360M is a lightweight language model developed by HuggingFace, designed to offer efficient text generation capabilities with a relatively small footprint. With just 360 million parameters, this model is particularly adept at generating coherent and contextually relevant text, making it suitable for a wide range of applications such as chatbots, content creation, and summarization tasks. The model's impressive context length of 8192 tokens allows it to maintain a broader understanding of the input, which is crucial for tasks requiring long-term coherence and context retention.

In its size class, SmolLM2 360M punches well above its weight. Despite its compact architecture, it delivers performance competitive with larger models, making it a good choice for users who need a balance between computational efficiency and output quality. Its quantization options, Q4_K_M and Q8_0, further shrink its footprint, allowing it to run smoothly on hardware with limited resources. This makes it well suited to developers and enthusiasts who want to deploy AI models on low-end or mid-range devices, such as older laptops or even some Raspberry Pi setups. With a VRAM requirement of only 0.75–0.86 GB, SmolLM2 360M is accessible to a broad audience, so more users can benefit from high-quality text generation without expensive hardware.

Check Your Hardware

See which quantizations of SmolLM2 360M your hardware can run.

Quantization Options

Quantization | Bits | File Size | VRAM Needed | RAM Needed | Quality
Q4_K_M | 4.5 | 0.252 GB | 0.75 GB | 1.25 GB | 85%
Q8_0 | 8 | 0.36 GB | 0.86 GB | 1.36 GB | 98%
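As a back-of-the-envelope check on the table above: a quantized GGUF file scales with bits per weight. The sketch below is my own approximation, not RunThisModel.com's sizing logic; real files run somewhat larger than this lower bound because the quantizer keeps some tensors (e.g. embeddings) at higher precision.

```python
def gguf_size_gb(params: float, bits_per_weight: float) -> float:
    """Lower-bound GGUF file size in GB: params * bits / 8 bytes per weight."""
    return params * bits_per_weight / 8 / 1e9

PARAMS = 0.36e9  # SmolLM2 360M

for name, bits in [("Q4_K_M", 4.5), ("Q8_0", 8.0)]:
    # Estimate is a floor; the table's 0.252 GB for Q4_K_M sits above it.
    print(f"{name}: >= {gguf_size_gb(PARAMS, bits):.2f} GB")
```

The "VRAM Needed" column exceeds the file size because, at load time, the KV cache and runtime buffers sit on top of the weights themselves.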

See It In Action

Real model outputs generated via RunThisModel.com — watch responses stream in real time.


Outputs generated by real AI models via RunThisModel.com. Generation speed shown is from cloud inference. Local speeds vary by hardware — check your device.

Frequently Asked Questions

How much VRAM do I need to run SmolLM2 360M?

SmolLM2 360M requires a minimum of 0.75GB of VRAM with Q4_K_M quantization. The near-lossless Q8_0 quantization needs 0.86GB of VRAM.

What is the best quantization for SmolLM2 360M?

Q4_K_M offers the best balance of quality and VRAM usage. Q8_0 is near-lossless if you have enough VRAM.
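The rule of thumb above can be sketched as a tiny helper that picks the highest-quality quantization fitting a given VRAM budget. This is a hypothetical illustration; the VRAM figures come from the quantization table earlier on this page.

```python
# Quantizations ordered best-quality first, with VRAM needed (GB) from the table.
QUANTS = [("Q8_0", 0.86), ("Q4_K_M", 0.75)]

def pick_quant(vram_gb: float):
    """Return the best quantization that fits, or None if nothing fits."""
    for name, needed in QUANTS:
        if vram_gb >= needed:
            return name
    return None  # won't fit in VRAM; consider CPU inference instead

print(pick_quant(1.0))   # enough headroom for Q8_0
print(pick_quant(0.8))   # falls back to Q4_K_M
```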