OpenBA-V2: a 3B LLM (Large Language Model) with a T5 architecture, obtained by pruning OpenBA-15B and continuing its pretraining.
Updated May 10, 2024 - Python
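The OpenBA-V2 entry describes pruning a 15B model down to 3B and then continuing pretraining. Its actual recipe is not reproduced here; as an illustration of the general pattern only, the sketch below magnitude-prunes the linear layers of a stand-in T5 checkpoint (`t5-small`, an assumption, not the OpenBA weights) using PyTorch's built-in pruning utilities, after which continued pretraining would let the smaller model recover quality.

```python
# Illustrative sketch only -- NOT OpenBA-V2's actual pruning recipe.
# Shows the generic prune-then-continue-pretraining pattern on a T5-style model.
import torch.nn as nn
import torch.nn.utils.prune as prune
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("t5-small")  # stand-in checkpoint

# Magnitude-prune 30% of the weights in every linear projection.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # bake the zeroed weights in permanently

# Continued pretraining would follow here: train the pruned model on more
# pretraining data at a reduced learning rate so it recovers lost capacity.
```

Note that unstructured magnitude pruning as above only zeroes individual weights; shrinking a model to a smaller parameter count, as OpenBA-V2 does, requires structured pruning that removes whole layers, heads, or hidden dimensions.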
PyTorch implementation of several CNNs for image classification.
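As a companion to the CNN entry above, here is a minimal, self-contained PyTorch image classifier; the architecture, layer sizes, and input shape are illustrative placeholders, not taken from that repository.

```python
# Minimal CNN classifier sketch (hypothetical architecture, not from the repo).
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 32x32 -> 16x16
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(1))

logits = SmallCNN()(torch.randn(4, 3, 32, 32))  # -> shape (4, 10)
```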
Experimental GPT-2-scale (~124M parameter) LLM trained from scratch on Google Colab, using C4 and a Cosmopedia/Alpaca/Python mix. Includes the full training pipeline, a mixed-dataset loader with Colab-resilient checkpointing, log-analysis tools, and an honest write-up of what went wrong.
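The entry above mentions "Colab-resilient checkpointing." That repo's actual helpers are unknown; the sketch below shows one common way to make training survive Colab preemptions: persist model, optimizer, and step state to durable storage (e.g., a mounted Drive path) and resume from it on restart. `CKPT_PATH`, `save_checkpoint`, and `load_checkpoint` are hypothetical names.

```python
# Hedged sketch of session-resilient checkpointing (names are hypothetical).
import os
import torch

CKPT_PATH = "checkpoint.pt"  # in practice, a path under a mounted Google Drive

def save_checkpoint(model, optimizer, step):
    # Called periodically during training so a killed session loses little work.
    torch.save(
        {"model": model.state_dict(),
         "optimizer": optimizer.state_dict(),
         "step": step},
        CKPT_PATH,
    )

def load_checkpoint(model, optimizer):
    # Called once at startup; returns the step to resume from.
    if not os.path.exists(CKPT_PATH):
        return 0  # fresh run
    state = torch.load(CKPT_PATH, map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["step"]
```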