Text Generation Models
Browse open-source large language models in this category.
22 open-source models available
LLaMA 3.1 8B
8B · Compact yet capable 8B parameter model. Runs on consumer hardware while maintaining impressive performance.
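A back-of-the-envelope way to check a "runs on consumer hardware" claim is to estimate weight memory from the parameter count and precision. The helper below is a hypothetical sketch, not an official sizing tool; the 20% overhead factor for activations and KV cache is an assumption.

```python
def estimate_vram_gb(params_billion: float, bytes_per_param: float,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB: weights only, plus ~20% headroom
    for activations and KV cache (the overhead factor is an assumption)."""
    return params_billion * bytes_per_param * overhead

# An 8B model at common precisions:
fp16 = estimate_vram_gb(8, 2.0)   # ~19.2 GB: wants a 24 GB GPU
int8 = estimate_vram_gb(8, 1.0)   # ~9.6 GB
int4 = estimate_vram_gb(8, 0.5)   # ~4.8 GB: fits an 8 GB consumer card
```

This is why 4-bit quantization is the usual route for running 7B-9B models on mainstream GPUs.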
Mistral 7B v0.3
7B · Efficient 7B parameter model (the original v0.1 release introduced sliding-window attention; v0.3 uses a full 32k context window and an extended vocabulary). Excellent for local deployment.
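Sliding-window attention, which the original Mistral 7B release popularized, restricts each token to attending over only the previous W positions instead of the full causal prefix, capping per-token attention cost. A minimal NumPy sketch of the mask (illustrative only; function and variable names are mine):

```python
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    # Causal mask restricted to a sliding window: position i may
    # attend to positions j with i - window < j <= i.
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)

mask = sliding_window_mask(6, 3)
```

Each row of the mask has at most `window` True entries, so attention stays O(seq_len x window) rather than O(seq_len^2).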
LLaMA 3.1 70B
70B · Meta's 70B parameter model offering excellent performance with lower resource requirements than the 405B variant.
Qwen 2.5 7B
7B · Compact Qwen model with strong performance for its size. Great for local deployment.
Gemma 2 2B
2B · Ultra-lightweight 2B model. Runs on phones and edge devices.
Phi 3.5 Mini
3.8B · Microsoft's 3.8B parameter model that punches far above its weight class.
Gemma 2 9B
9B · Efficient 9B model from Google with strong reasoning. Fits on most modern GPUs.
LLaMA 3.2 3B
3B · Meta's lightweight 3B model for edge and mobile deployment.
Qwen 2.5 72B
72B · Alibaba's flagship 72B model with exceptional coding and mathematical reasoning.
LLaMA 3.1 405B
405B · Meta's largest open model with 405B parameters. State-of-the-art performance across reasoning, code, and multilingual tasks.
Gemma 2 27B
27B · Google's most powerful Gemma 2 model. Excellent quality for its size.
Phi 3 Medium
14B · Microsoft's 14B parameter model with excellent reasoning capabilities.
Mixtral 8x22B
141B · Sparse mixture-of-experts model with 141B total / 39B active parameters. Outstanding efficiency.
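The total/active split is what makes sparse mixture-of-experts efficient: a router sends each token to only a few experts, so compute scales with the active parameters while model capacity scales with the total. A toy top-2 router in NumPy (a sketch under assumed shapes, not Mixtral's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, top_k, d = 8, 2, 16

# One weight matrix per expert (real MoE FFNs use two projections;
# one is enough to show the routing idea).
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate_w = rng.standard_normal((d, n_experts))

def moe_layer(x: np.ndarray) -> np.ndarray:
    scores = x @ gate_w                    # router logits, one per expert
    top = np.argsort(scores)[-top_k:]      # indices of the top-k experts
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()                           # softmax over the chosen experts
    return sum(wi * (x @ experts[e]) for wi, e in zip(w, top))

y = moe_layer(rng.standard_normal(d))
total_params = n_experts * d * d           # capacity: all experts
active_params = top_k * d * d              # compute: only routed experts
```

Mixtral 8x22B follows the same pattern at scale: 141B total parameters, but each token only touches about 39B of them.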
DeepSeek V3
671B · DeepSeek's 671B MoE model with 37B active parameters. Matches GPT-4o on many benchmarks.
Mistral Large 2
123B · Mistral's flagship 123B parameter model with strong multilingual and coding capabilities.
SmolLM2 1.7B
1.7B · HuggingFace's ultra-compact 1.7B model. Best-in-class for its size.
Yi 1.5 34B
34B · 01.AI's 34B parameter model with strong bilingual (EN/CN) capabilities.
WizardMath 70B
70B · Math-specialized fine-tune of LLaMA 2 70B, trained with reinforcement learning (RLEIF) for mathematical reasoning.
Command R+
104B · Cohere's 104B parameter model optimized for RAG and tool use.
Command R
35B · Cohere's 35B model for RAG, summarization, and tool use.
Falcon 180B
180B · TII's massive 180B model trained on 3.5T tokens of RefinedWeb data.
Granite 3.1 8B
8B · IBM's enterprise-grade 8B model with strong reasoning and code capabilities.