NVIDIA: Llama 3.1 Nemotron 70B Instruct

ID: nvidia/llama-3.1-nemotron-70b-instruct

NVIDIA's Llama 3.1 Nemotron 70B is a language model tuned to generate precise, helpful responses. It builds on the [Llama 3.1 70B](/models/meta-llama/llama-3.1-70b-instruct) architecture and is trained with Reinforcement Learning from Human Feedback (RLHF), and it performs strongly on automatic alignment benchmarks. The model targets applications that need accurate, helpful answers to varied user queries across multiple domains. Usage of this model is subject to [Meta's Acceptable Use Policy](https://www.llama.com/llama3/use-policy/).

Pricing per 1M Tokens

Input (Prompt)	$1.20
Output (Completion)	$1.20
Cache Read	Free
Cache Write	Free
Image	N/A
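
As a rough illustration of how the per-token rates above translate into the cost of a single request, here is a minimal sketch. The function name `estimate_cost_usd` and the token counts in the usage note are hypothetical. The rates come from the table, and cache reads and writes are ignored because they are listed as free.

```python
from decimal import Decimal

# Rates copied from the pricing table, in USD per 1M tokens.
INPUT_RATE_PER_M = Decimal("1.20")
OUTPUT_RATE_PER_M = Decimal("1.20")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from its token counts.

    Hypothetical helper: actual billing may round or count tokens differently.
    """
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (prompt_tokens * INPUT_RATE_PER_M
             + completion_tokens * OUTPUT_RATE_PER_M)
    return total / TOKENS_PER_M
```

For example, a request with 10,000 prompt tokens and 2,000 completion tokens would come to `estimate_cost_usd(10_000, 2_000)`, which is about $0.0144 under these rates.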

Specifications

Context Length	131K
Max Output Tokens	16K
Input Modalities	Text
Output Modalities	Text
Tokenizer	Llama3
Instruct Type	llama3
Top Provider Context	131K
Top Provider Max Output	16K
Moderated	No
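
The context length and output cap above can be combined into a simple budget check before sending a request. The sketch below rests on two assumptions that the listing does not state: that "131K" and "16K" mean 131,072 and 16,384 tokens, and that prompt and completion tokens share the same context window. The helper name `max_completion_tokens` is hypothetical.

```python
# Assumed exact values for the rounded figures in the specifications table.
CONTEXT_LENGTH = 131_072      # "131K" (assumption: 128 * 1024)
MAX_OUTPUT_TOKENS = 16_384    # "16K"  (assumption: 16 * 1024)


def max_completion_tokens(prompt_tokens: int) -> int:
    """Return how many completion tokens can still be requested.

    Assumes the prompt and the completion share one context window,
    and that the output cap applies independently of the prompt size.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_LENGTH - prompt_tokens
    # Never return a negative budget, and never exceed the output cap.
    return max(0, min(MAX_OUTPUT_TOKENS, remaining))
```

Under these assumptions, a short prompt is limited by the output cap, while a prompt close to the context length is limited by whatever room remains in the window.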