hero://detect · tty0 · runthismodel · RTM-CLI v0.42.1 · pid 8841
135/145 models you can run
46.7B largest · Mixtral 8x7B Inst…
600 t/s median throughput @ Q4_K_M
1.03 TB curated weights indexed
./models · compatibility matrix · 145/145 · sorted by fit desc
| model | params | vram | tok/s | weights | dl/mo | ctx | arch | grade | fit ↓ |
|---|---|---|---|---|---|---|---|---|---|
| Whisper Tiny English (Quantized) | 0.039B | 0.1GB | 10080 | 32.2 MB | 72.9k | — | whisper | ||
| all-MiniLM-L6-v2 | 0.023B | 0.1GB | 10080 | 23.0 MB | 224.5M | 256 | bert | ||
| BGE Small EN v1.5 | 0.033B | 0.1GB | 10080 | 36.8 MB | 53M | 512 | bert | ||
| Snowflake Arctic Embed S | 0.033B | 0.1GB | 10080 | 36.0 MB | 40.2k | 512 | bert | ||
| Piper TTS - Amy (English) | 0.02B | 0.1GB | 10080 | 63.1 MB | 0 | — | piper | ||
| Piper TTS - Lessac (English) | 0.02B | 0.1GB | 10080 | 63.2 MB | 0 | — | piper | ||
| Piper TTS - Spanish (MLS) | 0.02B | 0.1GB | 10080 | 63.1 MB | 0 | — | piper | ||
| Piper TTS - German (Thorsten) | 0.02B | 0.1GB | 10080 | 63.1 MB | 0 | — | piper | ||
| Piper TTS - Chinese (Huayan) | 0.02B | 0.1GB | 10080 | 63.2 MB | 0 | — | piper | ||
| Piper TTS - Japanese (Kokoro) | 0.02B | 0.1GB | 10080 | 63.0 MB | 0 | — | piper | ||
| Piper TTS - Korean | 0.02B | 0.1GB | 10080 | 63.0 MB | 0 | — | piper | ||
| Piper TTS - Russian (Irina) | 0.02B | 0.1GB | 10080 | 63.2 MB | 0 | — | piper | ||
| Piper TTS - Portuguese (Faber) | 0.02B | 0.1GB | 10080 | 63.2 MB | 0 | — | piper | ||
| Piper TTS - Arabic (Kareem) | 0.02B | 0.1GB | 10080 | 63.2 MB | 0 | — | piper | ||
| Jina Reranker Tiny EN | 0.033B | 0.1GB | 10080 | 67.5 MB | 8k | 8.2k | bert | ||
| Whisper Tiny | 0.039B | 0.2GB | 10080 | 77.7 MB | 1.5M | — | whisper | ||
| Whisper Base | 0.074B | 0.3GB | 10080 | 148.0 MB | 4.6M | — | whisper | ||
| Whisper Base English | 0.074B | 0.3GB | 10080 | 148.0 MB | 23.8k | — | whisper | ||
| Nomic Embed Text v1.5 | 0.137B | 0.3GB | 10080 | 146.1 MB | 17.9M | 8.2k | nomic-bert | ||
| Piper TTS - French (Siwis) | 0.02B | 0.5GB | 10080 | 28.1 MB | 0 | — | piper | ||
| Piper TTS - Italian (Riccardo) | 0.02B | 0.5GB | 10080 | 28.1 MB | 0 | — | piper | ||
| Piper TTS - LibriTTS-R (English) | 0.02B | 0.6GB | 10080 | 78.6 MB | 0 | — | piper | ||
| Kokoro 82M TTS | 0.082B | 0.6GB | 10080 | 86.0 MB | 517.4k | — | kokoro | ||
| SmolLM2 135M | 0.135B | 0.6GB | 10080 | 144.8 MB | 1.7M | 8.2k | smollm | ||
| SmolLM2 360M | 0.36B | 0.8GB | 5000 | 270.6 MB | 283.9k | 8.2k | smollm | ||
| MusicGen Small | 0.3B | 0.8GB | 6000 | 302.4 MB | 197.6k | — | musicgen | ||
| Danube 3 500M | 0.5B | 0.8GB | 3600 | 317.9 MB | 31.1k | 8.2k | danube | ||
| BGE Large EN v1.5 | 0.335B | 0.8GB | 5373 | 358.2 MB | 13.8M | 512 | bert | ||
| Whisper Small | 0.24B | 0.9GB | 7500 | 487.6 MB | 2.4M | — | whisper | ||
| Qwen 2.5 0.5B | 0.5B | 1.0GB | 3600 | 491.4 MB | 4.2M | 32.8k | qwen2 | ||
| TinyLlama 1.1B | 1.1B | 1.1GB | 1636 | 668.8 MB | 2M | 2k | llama | ||
| Qwen 2.5 Coder 0.5B | 0.5B | 1.1GB | 3600 | 675.7 MB | 99.1k | 32.8k | qwen2 | ||
| Llama 3.2 1B Instruct | 1.24B | 1.3GB | 1452 | 807.7 MB | 7.4M | 131.1k | llama | ||
| Gemma 3 1B | 1B | 1.3GB | 1800 | 806.1 MB | 1.8M | 32.8k | gemma3 | ||
| Granite 3.0 1B-A400M | 1.3B | 1.3GB | 4500 | 821.8 MB | 878 | 4.1k | granitemoe | ||
| DeepSeek Coder 1.3B | 1.3B | 1.3GB | 1385 | 873.6 MB | 43.3k | 16.4k | llama | ||
| Yi Coder 1.5B | 1.5B | 1.4GB | 1200 | 963.7 MB | 5.1k | 4.1k | yi | ||
| Qwen2-VL 2B | 2.2B | 1.4GB | 818 | 986.0 MB | 3.7M | 32.8k | qwen2-vl | ||
| SmolLM2 1.7B | 1.7B | 1.5GB | 1059 | 1.06 GB | 163.4k | 8.2k | smollm | ||
| Falcon 3 1B | 1B | 1.5GB | 1800 | 1.06 GB | 9.9k | 8.2k | falcon | ||
| Moondream 2 | 1.8B | 1.5GB | 1000 | 1.00 GB | 1.9M | 2k | moondream | ||
| Qwen 2.5 1.5B | 1.5B | 1.5GB | 1200 | 1.12 GB | 10.7M | 32.8k | qwen2 | ||
| DeepSeek R1 Distill 1.5B | 1.5B | 1.5GB | 1200 | 1.12 GB | 681.8k | 131.1k | qwen2 | ||
| Qwen 2.5 Coder 1.5B | 1.5B | 1.5GB | 1200 | 1.12 GB | 748.8k | 32.8k | qwen2 | ||
| Stable Diffusion 2.1 Base (CoreML) | 0.86B | 1.6GB | 2093 | 1.14 GB | 40 | — | unet-diffusion | ||
| BGE Reranker v2 M3 | 0.568B | 1.6GB | 3169 | 1.16 GB | 14.1M | 8.2k | xlm-roberta | ||
| Distil-Whisper Large v3 | 0.76B | 1.9GB | 2368 | 1.52 GB | 869.8k | — | whisper | ||
| Whisper Medium | 0.77B | 1.9GB | 2338 | 1.53 GB | 475.7k | — | whisper | ||
| Granite 3.3 2B | 2B | 1.9GB | 900 | 1.55 GB | 21.9k | 8.2k | granite | ||
| Whisper Large v3 Turbo | 0.81B | 2.0GB | 2222 | 1.62 GB | 7.7M | — | whisper | ||
| CodeGemma 2B | 2B | 2.0GB | 900 | 1.63 GB | 31k | 8.2k | gemma | ||
| EXAONE 3.5 2.4B | 2.4B | 2.0GB | 750 | 1.64 GB | 63.8k | 32.8k | exaone | ||
| Gemma 2 2B | 2.6B | 2.1GB | 692 | 1.71 GB | 315.4k | 8.2k | gemma2 | ||
| StableLM Zephyr 3B | 3B | 2.1GB | 600 | 1.71 GB | 28.4k | 4.1k | stablelm | ||
| Rocket 3B | 3B | 2.1GB | 600 | 1.71 GB | 420 | 4.1k | stablelm | ||
| Stable Code 3B | 3B | 2.1GB | 600 | 1.71 GB | 2.2k | 16.4k | stablelm | ||
| MiniCPM-V 2.6 | 2B | 2.1GB | 900 | 1.60 GB | 151.6k | 2k | minicpm-v | ||
| Stable Diffusion 1.5 (GGUF) | 0.86B | 2.1GB | 2093 | 1.75 GB | 1.3k | — | unet-diffusion | ||
| StarCoder2 3B | 3B | 2.3GB | 600 | 1.89 GB | 123k | 16.4k | starcoder | ||
| Falcon 3 3B | 3B | 2.4GB | 600 | 2.01 GB | 6.2k | 8.2k | falcon | ||
| Llama 3.2 3B Instruct | 3.2B | 2.4GB | 562 | 2.02 GB | 1.4M | 131.1k | llama | ||
| Granite 3.0 3B-A800M | 3.4B | 2.4GB | 2250 | 2.06 GB | 3.4k | 4.1k | granitemoe | ||
| Qwen 2.5 3B | 3B | 2.5GB | 600 | 2.10 GB | 12.7M | 32.8k | qwen2 | ||
| Qwen 2.5 Coder 3B | 3B | 2.5GB | 600 | 2.10 GB | 229.1k | 32.8k | qwen2 | ||
| Stable Diffusion 1.5 (CoreML) | 0.86B | 2.5GB | 2093 | 1.57 GB | 1.6M | — | unet-diffusion | ||
| PaliGemma 3B | 3B | 2.5GB | 600 | 2.00 GB | 198.8k | 256 | paligemma | ||
| Stable Diffusion 2.1 (GGUF) | 0.86B | 2.7GB | 2093 | 2.32 GB | — | — | unet-diffusion | ||
| Phi-3.5 Mini 3.8B | 3.8B | 2.7GB | 474 | 2.39 GB | 901.4k | 131.1k | phi3 | ||
| Danube 3 4B | 4B | 2.7GB | 450 | 2.39 GB | 429 | 8.2k | danube | ||
| Gemma 3 4B | 4B | 2.8GB | 450 | 2.49 GB | 1.5M | 32.8k | gemma3 | ||
| Phi-4 Mini 3.8B | 3.8B | 2.8GB | 474 | 2.49 GB | 1.1M | 131.1k | phi4 | ||
| Nemotron Mini 4B | 4B | 3.0GB | 450 | 2.70 GB | 421k | 8.2k | nemotron | ||
| Phi-3.5 Vision | 4.2B | 3.2GB | 429 | 2.50 GB | 2M | 131.1k | phi3v | ||
| Stable Diffusion XL (CoreML) | 3.5B | 3.3GB | 514 | 3.05 GB | 1.4M | — | unet-diffusion | ||
| Whisper Large v3 | 1.55B | 3.4GB | 1161 | 3.10 GB | 5.1M | — | whisper | ||
| Yi 1.5 6B Chat | 6B | 3.9GB | 300 | 3.67 GB | 5.9k | 4.1k | yi | ||
| DeepSeek Coder 6.7B | 6.7B | 4.3GB | 269 | 4.08 GB | 143.7k | 16.4k | llama | ||
| Code Llama 7B | 7B | 4.3GB | 257 | 4.08 GB | 244.6k | 16.4k | llama | ||
| OLMoE 1B-7B | 6.9B | 4.4GB | 1385 | 4.21 GB | 37k | 4.1k | olmoe | ||
| Mistral 7B Instruct v0.3 | 7.3B | 4.6GB | 247 | 4.37 GB | 3.1M | 32.8k | mistral | ||
| OpenChat 3.5 7B | 7B | 4.6GB | 257 | 4.37 GB | 4.9k | 8.2k | mistral | ||
| StarCoder2 7B | 7B | 4.7GB | 257 | 4.46 GB | 12.3k | 16.4k | starcoder | ||
| OLMo 2 7B | 7B | 4.7GB | 257 | 4.47 GB | 49.4k | 4.1k | olmo | ||
| Qwen 2.5 Coder 7B | 7.6B | 4.9GB | 237 | 4.68 GB | 2.1M | 32.8k | qwen2 | ||
| InternLM 2.5 7B | 7.7B | 4.9GB | 234 | 4.71 GB | 109.7k | 32.8k | internlm2 | ||
| EXAONE 3.5 7.8B | 7.8B | 4.9GB | 231 | 4.77 GB | 139.5k | 32.8k | exaone | ||
| LLaVA 1.6 7B | 7B | 5.0GB | 257 | 4.40 GB | 705.7k | 4.1k | llava | ||
| Falcon 3 7B | 7B | 5.0GB | 257 | 4.40 GB | 9.7k | 8.2k | falcon | ||
| SDXL Turbo (GGUF) | 3.5B | 5.0GB | 514 | 3.50 GB | 783.7k | — | unet-diffusion | ||
| DeepSeek R1 Distill 8B | 8B | 5.1GB | 225 | 4.92 GB | 439k | 131.1k | llama | ||
| Llama 3.1 8B Instruct | 8B | 5.1GB | 225 | 4.92 GB | 9.9M | 131.1k | llama | ||
| Dolphin 3.0 Llama 3.1 8B | 8B | 5.1GB | 225 | 4.92 GB | 360.8k | 131.1k | llama | ||
| NeuralDaredevil 8B (abliterated) | 8B | 5.1GB | 225 | 4.92 GB | 13.5k | 8.2k | llama | ||
| Llama 3.1 8B Instruct (abliterated) | 8B | 5.1GB | 225 | 4.92 GB | 4.4k | 131.1k | llama | ||
| Stheno L3 8B v3.2 | 8B | 5.1GB | 225 | 4.92 GB | 13.4k | 8.2k | llama | ||
| Granite 3.3 8B | 8B | 5.1GB | 225 | 4.94 GB | 62.1k | 8.2k | granite | ||
| Qwen 2.5 7B Instruct | 7.6B | 5.3GB | 237 | 4.70 GB | 11.9M | 131.1k | qwen2 | ||
| Qwen3 8B Base | 8B | 5.3GB | 225 | 4.80 GB | 453.7k | 32.8k | qwen3 | ||
| CodeGemma 7B | 8.5B | 5.5GB | 212 | 5.33 GB | 2.6k | 8.2k | gemma | ||
| Yi 1.5 9B Chat | 9B | 5.5GB | 200 | 5.33 GB | 18.2k | 4.1k | yi | ||
| Yi Coder 9B | 9B | 5.5GB | 200 | 5.33 GB | 8.9k | 4.1k | yi | ||
| Gemma 2 9B Instruct | 9.2B | 5.9GB | 196 | 5.76 GB | 391k | 8.2k | gemma2 | ||
| Stable Audio Open | 1B | 6.0GB | 1800 | 2.50 GB | 43k | — | stable-audio | ||
| Falcon 3 10B | 10B | 6.4GB | 180 | 6.29 GB | 4.4k | 8.2k | falcon | ||
| Solar 10.7B | 10.7B | 6.5GB | 168 | 6.46 GB | 52.1k | 4.1k | llama | ||
| Gemma 3 MoE 9B | 9B | 7.0GB | 720 | 5.50 GB | — | 8.2k | gemma3-moe | ||
| Gemma 3 12B | 12B | 7.3GB | 150 | 7.30 GB | 2.6M | 32.8k | gemma3 | ||
| Mistral Nemo 12B | 12B | 7.5GB | 150 | 7.48 GB | 451.4k | 131.1k | mistral | ||
| Magnum v4 12B | 12B | 7.5GB | 150 | 7.48 GB | 686 | 131.1k | mistral | ||
| Rocinante 12B v1.1 | 12B | 7.5GB | 150 | 7.48 GB | 811 | 131.1k | mistral | ||
| Mistral Nemo Base 12B | 12B | 7.7GB | 150 | 7.20 GB | 29.7k | 131.1k | mistral | ||
| Code Llama 13B Instruct | 13B | 7.8GB | 138 | 7.87 GB | 2.7k | 16.4k | llama | ||
| ACE-Step 1.5XL | 1.5B | 8.0GB | 1200 | 3.00 GB | — | — | acestep | ||
| Qwen 2.5 14B | 14B | 8.9GB | 129 | 8.99 GB | 1.9M | 131.1k | qwen2 | ||
| Qwen 2.5 Coder 14B | 14B | 8.9GB | 129 | 8.99 GB | 3M | 32.8k | qwen2 | ||
| Phi-4 | 14B | 8.9GB | 129 | 9.05 GB | 814.3k | 16.4k | phi3 | ||
| Stable Diffusion 3 Medium (GGUF) | 2.5B | 9.2GB | 720 | 9.29 GB | 3.1k | — | mmdit-diffusion | ||
| Rocinante XL 16B v1 | 16B | 9.6GB | 112 | 9.75 GB | 68 | 131.1k | mistral | ||
| DeepSeek MoE 16B | 16.4B | 11.0GB | 643 | 9.50 GB | 14.3k | 4.1k | deepseek-moe | ||
| TRELLIS Image Large | 1.2B | 12.0GB | 1500 | 2.40 GB | 1.2M | — | trellis | ||
| Mistral Small 22B | 22B | 12.9GB | 82 | 13.34 GB | 127.5k | 32.8k | mistral | ||
| Codestral 22B (abliterated) | 22B | 12.9GB | 82 | 13.34 GB | 7.5k | 32.8k | mistral | ||
| Magnum v4 22B | 22B | 12.9GB | 82 | 13.34 GB | 248 | 32.8k | mistral | ||
| Dolphin 3.0 R1 Mistral 24B | 24B | 13.8GB | 75 | 14.33 GB | 686 | 131.1k | mistral | ||
| Cydonia 24B v4.3 | 24B | 13.8GB | 75 | 14.33 GB | 6k | 32.8k | mistral | ||
| FLUX.1 Schnell (GGUF) | 12B | 14.0GB | 150 | 12.00 GB | 301.4k | — | rectified-flow | ||
| FLUX.1 Dev (GGUF) | 12B | 14.0GB | 150 | 12.00 GB | 1.1M | — | rectified-flow | ||
| Dolphin Mistral 24B (Venice Edition) | 24B | 14.9GB | 75 | 14.40 GB | 7.8k | 32.8k | mistral | ||
| Gemma 3 27B | 27B | 15.9GB | 67 | 16.55 GB | 1.4M | 32.8k | gemma3 | ||
| Wan 2.2 TI2V 5B | 5B | 16.0GB | 360 | 10.00 GB | 8.3k | — | wan-dit | ||
| CogVideoX 5B | 5B | 16.0GB | 360 | 10.00 GB | 16.8k | — | cogvideox | ||
| Hunyuan3D 2 | 2.5B | 16.0GB | 720 | 5.00 GB | 76.1k | — | hunyuan3d | ||
| Skyfall 31B v4.2 | 31B | 18.2GB | 58 | 18.98 GB | 1k | 131.1k | mistral | ||
| Qwen 2.5 32B | 32B | 19.0GB | 56 | 19.85 GB | 1M | 131.1k | qwen2 | ||
| Qwen3 30B-A3B | 30.5B | 20.0GB | 545 | 18.00 GB | — | 32.8k | qwen3-moe | ||
| Phi-3.5 MoE | 41.9B | 24.1GB | 272 | 25.35 GB | 123.9k | 131.1k | phimoe | ||
| Mixtral 8x7B Instruct | 46.7B | 25.1GB | 134 | 26.44 GB | 806.7k | 32.8k | mixtral | ||
| Mochi 1 Preview | 10B | 30.0GB | offload | 20.00 GB | 3.5k | — | asymdit | ||
| Llama 3.1 70B Instruct | 70B | 40.1GB | offload | 42.52 GB | 630.4k | 131.1k | llama | ||
| Euryale L3.3 70B v2.3 | 70B | 40.1GB | offload | 42.52 GB | 1.4k | 131.1k | llama | ||
| Llama 3.1 70B (lorablated) | 70B | 40.1GB | offload | 42.52 GB | 57 | 131.1k | llama | ||
| Magnum v4 72B | 72B | 44.7GB | offload | 47.42 GB | 764 | 131.1k | qwen2 | ||
| HunyuanVideo 13B | 13B | 60.0GB | offload | 26.00 GB | 924 | — | hunyuan-dit | ||
| Qwen3 235B-A22B | 235B | 144.0GB | offload | 140.00 GB | — | 32.8k | qwen3-moe | ||
| Mixtral 8x22B Instruct | 141B | 88.0GB | offload | 85.00 GB | 32.6k | 65.5k | mixtral |
distribution · vram footprint × Q4_K_M · your gpu cap: 24 GB
0 – 2 GB: 49
2 – 6 GB: 53
6 – 12 GB: 17
12 – 24 GB: 16
24 – 48 GB: 7
48+ GB: 3
fits · tight · overflow · cutoff @ 24 GB
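
The distribution can be reproduced from the vram column of the matrix. The sketch below is illustrative only and is not RTM-CLI's own code: the names `BIN_EDGES_GB`, `vram_bin`, `histogram`, and `count_under_cap` are hypothetical, and the choice of lower-inclusive bin edges (2.0 GB lands in "2 – 6 GB") is an assumption that happens to match the displayed counts.

```python
from bisect import bisect_right
from collections import Counter

# Bin edges in GB, taken from the labels in the distribution.
# Assumption: each bin includes its lower edge and excludes its upper edge.
BIN_EDGES_GB = [2, 6, 12, 24, 48]
BIN_LABELS = ["0 – 2 GB", "2 – 6 GB", "6 – 12 GB",
              "12 – 24 GB", "24 – 48 GB", "48+ GB"]
GPU_CAP_GB = 24  # "your gpu cap" from the page


def vram_bin(vram_gb: float) -> str:
    """Return the label of the bin a VRAM footprint falls into."""
    return BIN_LABELS[bisect_right(BIN_EDGES_GB, vram_gb)]


def histogram(vram_values):
    """Count footprints per bin, keeping empty bins and label order."""
    counts = Counter(vram_bin(v) for v in vram_values)
    return {label: counts.get(label, 0) for label in BIN_LABELS}


def count_under_cap(vram_values, cap_gb: float = GPU_CAP_GB) -> int:
    """Number of footprints strictly below the GPU cap."""
    return sum(1 for v in vram_values if v < cap_gb)


# A handful of vram values copied from the matrix, in GB.
sample = [0.1, 2.0, 5.9, 6.0, 12.0, 20.0, 24.1, 88.0]
for label, n in histogram(sample).items():
    print(f"{label}: {n}")
print("under cap:", count_under_cap(sample))
```

Run over all 145 vram values, the four bins below the 24 GB cutoff sum to 49 + 53 + 17 + 16 = 135, which agrees with the "135/145 models you can run" figure at the top of the page.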
activity · tail -f /var/log/rtm.log · live
14:02:11 MODEL qwen3-30b-a3b · +12.4k DL/24h · +3s
14:01:48 BENCH RTX 5090 · llama-3.1-8b · 142 t/s · +10s
13:58:02 ADD candidates.json +3 (deepseek-r2) · +17s
13:54:30 SYNC HF metadata · 144/145 ok · +24s
13:51:09 WARN flux-dev · NC license · review · +31s
13:47:55 SCAN 10.7.91.* · M4 Pro · 18GB unified · +38s
13:45:01 BENCH M3 Max · gemma-3-12b · 28 t/s · +45s
13:43:22 MODEL whisper-v3-turbo · 1.2M DL/wk · +52s
13:39:10 FAIL gpu probe · WebGPU unavailable · +59s
13:36:44 SCAN 23.51.18.* · RTX 4090 · 24GB · +6s