美股抱团部落
A discussion forum bringing together US stock investors and traders of all kinds

Forums › US Stock Investing Forum › DeepMind’s New AI Models

Tagged: 

  • DeepMind’s New AI Models

  • MelisaWood

    Member
    February 12, 2022 at 8:13 am

    Large language models (LLMs) like GPT-3 have supercharged the branch of artificial intelligence (AI) called natural language processing (NLP). Because NLP algorithms operate on text data, they have myriad applications but are complex and expensive, often limiting their utility. GPT-3, for example, contains 175 billion parameters, costs ~$12 million to train, and requires roughly 20 GPUs for inference alone.

    Recently, Google (GOOGL) subsidiary DeepMind published a novel LLM architecture called RETRO (Retrieval-Enhanced Transformer). RETRO models are unique because they look up information in a separate database at inference time instead of relying solely on knowledge stored in their model parameters. As a result, RETRO models can shrink to a fraction of the size of other LLMs: at 7.5 billion parameters, RETRO is roughly 4% the size of GPT-3, yet the two perform on par with one another.
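    The retrieval idea can be illustrated with a toy sketch. This is hypothetical code, not DeepMind's implementation: the real RETRO embeds text chunks with a frozen BERT encoder, runs approximate nearest-neighbor search over a database of roughly trillions of tokens, and cross-attends to the retrieved chunks inside the transformer. Here, a bag-of-words cosine similarity over three hand-written chunks stands in for all of that:

```python
import math
import re
from collections import Counter

# Hypothetical toy database (three hand-written chunks); RETRO's real
# database holds on the order of trillions of tokens.
DATABASE = [
    "GPT-3 contains 175 billion parameters.",
    "RETRO retrieves text chunks from an external database.",
    "Healthcare knowledge changes frequently.",
]

def embed(text):
    # Bag-of-words counts stand in for RETRO's frozen BERT chunk embeddings.
    return Counter(re.findall(r"[a-z0-9-]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=1):
    # Exhaustive search over every chunk; RETRO uses approximate
    # nearest-neighbor search to make this tractable at scale.
    q = embed(query)
    return sorted(DATABASE, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def answer(query):
    # RETRO cross-attends to retrieved chunks inside the transformer;
    # this sketch simply prepends them to the prompt.
    context = " ".join(retrieve(query))
    return f"[context: {context}] {query}"
```

    Note that updating the model's "knowledge" amounts to editing DATABASE; no retraining is involved, which is the property the next paragraph relies on.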

    As some have suggested, RETRO-style models could be especially useful in healthcare. Unlike many machine learning (ML) tasks with fixed ground-truth labels, medical and biological knowledge changes frequently, so traditional LLMs would require constant retraining to stay current. Because RETRO stores its knowledge in a separate database, that database can be updated without retraining the model, making RETRO-style systems promising for analyzing patient medical records and supporting genetic counselors.