deepseek-ai/DeepSpec
DeepSpec: a full-stack codebase for training and evaluating speculative decoding algorithms
DeepSpec
DeepSpec is a full-stack codebase for training and evaluating draft models for speculative decoding. It contains data preparation utilities, draft model implementations, training code, and evaluation scripts.
Environment
Install the Python dependencies:
python -m pip install -r requirements.txt
Data preparation additionally requires an inference engine to serve the target model when regenerating answers; see scripts/data/README.md for details.
Workflow
Run the stages in order — each stage's output feeds the next:
- Data Preparation — download prompts, regenerate target answers, and build the target cache.
- Training — train a draft model against the cached target outputs.
- Evaluation — measure speculative-decoding acceptance on benchmark tasks.
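As a rough illustration of what "acceptance" in the evaluation stage can mean, the sketch below summarizes per-step acceptance counts. It is not taken from the repository's evaluation scripts: the function name, the input layout, and the two reported metrics are assumptions chosen for illustration, and it assumes a fixed draft length with one extra target-produced token per verification step.

```python
from typing import Dict, Sequence


def acceptance_stats(accepted_per_step: Sequence[int], draft_len: int) -> Dict[str, float]:
    """Summarize draft-token acceptance over a run of verification steps.

    accepted_per_step[i] is how many of the draft_len drafted tokens the
    target model accepted at step i. Both names are hypothetical and are
    not part of DeepSpec's API.
    """
    if draft_len <= 0:
        raise ValueError("draft_len must be positive")
    if not accepted_per_step:
        raise ValueError("need at least one verification step")
    if any(a < 0 or a > draft_len for a in accepted_per_step):
        raise ValueError("accepted counts must lie in [0, draft_len]")

    steps = len(accepted_per_step)
    total_accepted = sum(accepted_per_step)
    return {
        # Fraction of all drafted tokens that the target accepted.
        "acceptance_rate": total_accepted / (steps * draft_len),
        # Tokens emitted per target forward pass: the accepted drafts plus
        # the one token the target itself contributes at every step
        # (an assumption about the decoding scheme).
        "mean_tokens_per_step": (total_accepted + steps) / steps,
    }


stats = acceptance_stats([4, 2, 0, 3], draft_len=4)
print(stats)
```

A higher mean number of tokens per step generally means fewer target forward passes for the same output length, which is where the speed-up from speculative decoding comes from.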
Data Preparation
See scripts/data/README.md for the step-by-step data pipeline:
- download and split training data,
- regenerate answers,
- prepare the target cache (storage warning: this can be very large, roughly 38 TB for the default Qwen/Qwen3-4B setting).
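Given the size of the target cache, it may be worth checking free disk space before starting that step. The following is a minimal sketch using only the Python standard library; the helper name, the 10% headroom, and the decimal-terabyte convention are assumptions, not part of the repository.

```python
import shutil

TB = 10**12  # decimal terabyte; an assumption about how the 38 TB figure is counted


def has_room_for_cache(path: str, required_tb: float = 38.0, headroom: float = 1.1) -> bool:
    """Return True if the filesystem holding `path` has enough free space.

    required_tb defaults to the rough figure quoted for the default
    Qwen/Qwen3-4B setting; headroom adds a safety margin on top of it.
    This helper is hypothetical and not shipped with DeepSpec.
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_tb * headroom * TB


if not has_room_for_cache("."):
    print("Not enough free space here for the default target cache.")
```

Point `path` at the directory where the cache would be written; the check covers the whole filesystem that contains it.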
Training
bash scripts/train/train.sh
train.sh launches train.py, which spawns one worker per visible GPU. Select the algorithm and target model by pointing at one of the configs under (e.g. ); see the script header for the full list of configs, how to override / , and how to use to override individual config fields. Checkpoints are written to .
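To make "one worker per visible GPU" concrete, here is a hedged sketch of how a launcher could derive one worker specification per visible device. It does not reproduce train.py: the function names are hypothetical, the use of `RANK` and `WORLD_SIZE` follows the common torch.distributed convention and is an assumption about this codebase, and the GPU count used when `CUDA_VISIBLE_DEVICES` is unset must be supplied by the caller.

```python
from typing import Dict, List, Mapping


def visible_gpus(env: Mapping[str, str], total_gpus: int) -> List[str]:
    """List the GPU identifiers a process started with `env` could see.

    If CUDA_VISIBLE_DEVICES is unset, every GPU on the machine is visible,
    so we fall back to indices 0..total_gpus-1. An empty value hides all GPUs.
    """
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return [str(i) for i in range(total_gpus)]
    return [tok.strip() for tok in value.split(",") if tok.strip()]


def plan_workers(env: Mapping[str, str], total_gpus: int) -> List[Dict[str, str]]:
    """Build one environment overlay per visible GPU (hypothetical helper).

    Each worker gets its own rank, the shared world size, and a
    CUDA_VISIBLE_DEVICES value that pins it to exactly one device.
    """
    gpus = visible_gpus(env, total_gpus)
    return [
        {"RANK": str(rank), "WORLD_SIZE": str(len(gpus)), "CUDA_VISIBLE_DEVICES": gpu}
        for rank, gpu in enumerate(gpus)
    ]


for spec in plan_workers({"CUDA_VISIBLE_DEVICES": "2,5"}, total_gpus=8):
    print(spec)
```

Under this scheme, restricting `CUDA_VISIBLE_DEVICES` before launching is enough to control how many workers are started and which devices they use.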
