Use llmfit to inspect the model:
llmfit info stelterlab/Qwen3-Coder-30B-A3B-Instruct-AWQ
=== stelterlab/Qwen3-Coder-30B-A3B-Instruct-AWQ ===
Provider: stelterlab
Parameters: 4.6B
Quantization: Q4_K_M
Best Quant: Q8_0
Context Length: 262144 tokens
Use Case: Code generation and completion
Category: Coding
Released: 2025-07-31
Runtime: llama.cpp (est. ~17.2 tok/s)
Score Breakdown:
Overall Score: 66.7 / 100
Quality: 68 Speed: 43 Fit: 61 Context: 100
Estimated Speed: 17.2 tok/s
Resource Requirements:
Min VRAM: 2.4 GB
Min RAM: 2.6 GB (CPU inference)
Recommended RAM: 4.3 GB
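
If you want to use these numbers in a script, for example to check whether a machine meets the minimum VRAM, the report is regular enough to parse as plain `Key: value` lines. The following is only a minimal sketch under that assumption: `parse_report`, `leading_number` and `fits_in_vram` are hypothetical helper names, not part of llmfit, and the sample text is a subset of the output above. The combined `Quality: ... Speed: ...` line holds several pairs on one line and would need separate handling.

```python
import re

# A subset of the report lines shown above, pasted in as sample input.
REPORT = """\
Parameters: 4.6B
Quantization: Q4_K_M
Context Length: 262144 tokens
Overall Score: 66.7 / 100
Estimated Speed: 17.2 tok/s
Min VRAM: 2.4 GB
Min RAM: 2.6 GB (CPU inference)
"""


def parse_report(text: str) -> dict[str, str]:
    """Split each 'Key: value' line at its first colon.

    Lines without a colon, and section headers with nothing after the
    colon (such as 'Score Breakdown:'), are skipped.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def leading_number(value: str) -> float:
    """Return the first number found in a value such as '2.4 GB'."""
    match = re.search(r"\d+(?:\.\d+)?", value)
    if match is None:
        raise ValueError(f"no number in {value!r}")
    return float(match.group())


def fits_in_vram(fields: dict[str, str], available_gb: float) -> bool:
    """Compare the reported minimum VRAM against what the machine has."""
    return leading_number(fields["Min VRAM"]) <= available_gb


fields = parse_report(REPORT)
print(fields["Quantization"], fits_in_vram(fields, available_gb=8.0))
```

Values stay as strings so that units and notes like `(CPU inference)` are preserved; numbers are extracted only where a comparison needs them.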

