
On June 14th local time, NVIDIA open-sourced the Nemotron-4 340B (340 billion parameter) model series. According to NVIDIA, developers can use these models to generate synthetic data for training large language models (LLMs) for commercial applications in industries such as healthcare, finance, manufacturing, and retail.
The Nemotron-4 340B series includes a base model, an instruct model, and a reward model, trained on 9 trillion tokens (text units). On benchmarks such as ARC-c (commonsense reasoning), MMLU, and BBH, Nemotron-4 340B-Base performs on par with Llama-3 70B, Mixtral 8x22B, and Qwen-2 72B.
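The synthetic-data workflow NVIDIA describes, generating candidate responses with the instruct model and filtering them with the reward model, can be sketched roughly as follows. The `generate` and `reward_score` callables here are placeholder stand-ins for illustration, not the real NeMo or Nemotron API:

```python
# Rough sketch of a generate-then-filter synthetic-data pipeline:
# an instruct model produces candidate responses, a reward model scores
# them, and only high-scoring (prompt, response) pairs are kept as
# training data. The model calls are placeholders, not real APIs.

from typing import Callable, List, Tuple

def build_synthetic_dataset(
    prompts: List[str],
    generate: Callable[[str], str],             # instruct model (placeholder)
    reward_score: Callable[[str, str], float],  # reward model (placeholder)
    threshold: float = 0.5,
) -> List[Tuple[str, str]]:
    """Keep only (prompt, response) pairs whose reward clears the threshold."""
    dataset = []
    for prompt in prompts:
        response = generate(prompt)
        if reward_score(prompt, response) >= threshold:
            dataset.append((prompt, response))
    return dataset

# Toy stand-ins so the sketch runs end to end.
toy_generate = lambda p: p.upper()
toy_score = lambda p, r: 1.0 if len(r) > 3 else 0.0

data = build_synthetic_dataset(["hello", "hi"], toy_generate, toy_score)
print(data)  # only "hello" clears the length-based toy reward
```

In the real pipeline, both callables would be served Nemotron-4 340B models, and the filtered pairs would feed instruction-tuning of a smaller commercial LLM.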