How to use DeepSpeed on Amazon SageMaker
Instruction fine-tuning (IFT) is one of the most effective techniques for improving a language model’s performance on domain-specific tasks. Full-parameter IFT of leading open-source models like Mistral and Llama 3 requires hundreds of gigabytes of GPU VRAM. Even with top-end hardware, it’s impractical to train such a model on a single machine, simply because the model weights and training state won’t fit in VRAM. That’s where provisioning compute nodes with a cloud service like Amazon SageMaker comes into play (a minimal launch sketch follows below). When you […]
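To make that concrete, here is a minimal sketch of what launching a multi-node fine-tuning job can look like with the SageMaker Python SDK’s `HuggingFace` estimator. The `torch_distributed` launcher and the Trainer’s `--deepspeed` argument are real SDK/Trainer features, but the script name, source directory, IAM role ARN, instance choices, framework versions, and DeepSpeed config path below are illustrative placeholders you would replace with your own:

```python
# Minimal sketch: launch a multi-node fine-tuning job on SageMaker.
# Assumes src/train.py is a Hugging Face Trainer-based script and
# src/ds_config.json is a DeepSpeed config; names and versions are placeholders.
from sagemaker.huggingface import HuggingFace

estimator = HuggingFace(
    entry_point="train.py",          # placeholder training script
    source_dir="src",                # directory with the script and ds_config.json
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder IAM role
    instance_type="ml.p4d.24xlarge", # 8x A100 40GB per node
    instance_count=2,                # two nodes -> 16 GPUs total
    transformers_version="4.28",
    pytorch_version="2.0",
    py_version="py310",
    # Launch the entry script on every node via torchrun
    distribution={"torch_distributed": {"enabled": True}},
    hyperparameters={
        "model_id": "mistralai/Mistral-7B-v0.1",
        "deepspeed": "ds_config.json",  # forwarded to the script as --deepspeed
        "epochs": 1,
    },
)

# Provisions the cluster, runs training, and tears everything down afterwards
estimator.fit()
```

Hyperparameters are passed to the entry script as CLI arguments, so a Trainer-based script that parses `TrainingArguments` picks up `--deepspeed ds_config.json` and shards the model and training state across all 16 GPUs.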