Llama 2: the future of open-source language models

Large language models (LLMs) · Meta · Reinforcement learning · Jul 20, 2023
Llama 2: Meta's bid in the LLM race




Meta has just taken a giant leap forward in the generative AI race with the launch of Llama 2, a new family of open-source large language models (LLMs) poised to democratize access to this technology.

This new generation of models has been trained on 40% more data than its predecessor, Llama 1: around 2 trillion tokens in total. A token is the basic unit of text a model processes, typically a word, a piece of a word, or a punctuation mark.

This new version of Llama also doubles the context length of its predecessor, to 4096 tokens. To understand what this means, consider how LLMs generate their outputs. Given a prompt, the model predicts the next token of the sequence, which becomes the first token of its response. It then uses both the prompt and that first token to predict the second token of its response, and so on, one token at a time. A context of 4096 tokens means Llama 2 can take the last 4096 tokens of our conversation into account when generating each new token. The larger the context, the more of the preceding text the model can draw on, which generally improves the coherence and quality of its responses, so doubling this value implies a jump in quality.
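The token-by-token loop above can be sketched in a few lines of Python. This is an illustration only: the "model" here is a toy n-gram lookup table trained on one sentence, not a real LLM, and the context window is shrunk from 4096 to 4 tokens so its effect is visible.

```python
from collections import Counter, defaultdict

CONTEXT_SIZE = 4  # Llama 2 uses 4096; tiny here so the demo is readable


def train(tokens, max_order=CONTEXT_SIZE):
    """Count which token follows each n-gram, for n = 1..max_order."""
    follows = defaultdict(Counter)
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n):
            follows[tuple(tokens[i:i + n])][tokens[i + n]] += 1
    return follows


def generate(model, prompt, n_new):
    seq = list(prompt)
    for _ in range(n_new):
        # Only the last CONTEXT_SIZE tokens are visible to the model,
        # mirroring how an LLM's context window truncates older text.
        context = seq[-CONTEXT_SIZE:]
        # Back off from the longest visible suffix to shorter ones,
        # i.e. use as much of the context window as possible.
        for start in range(len(context)):
            key = tuple(context[start:])
            if key in model:
                seq.append(model[key].most_common(1)[0][0])
                break
        else:
            break  # no known continuation
    return seq


corpus = "the cat sat on the mat and the cat slept on the mat".split()
model = train(corpus)
print(" ".join(generate(model, ["the", "cat"], 5)))
# → "the cat sat on the mat and"
```

Each generated token is appended to the sequence and fed back in, exactly as described above; a real LLM replaces the lookup table with a neural network predicting a probability for every possible next token.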



Technical Specifications


In addition to the improvements outlined above, it should be noted that Llama 2 comes in several sizes. The most compact version has 7 billion parameters, the medium 13 billion, and the largest 70 billion. The parameter count reflects how much linguistic information and how many patterns the model can store in its internal representation, and it roughly correlates with the quality of the responses it can generate.

The smallest of these models has an approximate size of 13 GB, which means it can be run comfortably on a local machine with a mid-range GPU. As a result, unlike with much larger closed models such as ChatGPT and Bard, individual users and researchers can conduct their own experiments and modifications on this neural network, leading to a much higher rate of innovation than closed models allow.
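That 13 GB figure can be sanity-checked with back-of-the-envelope arithmetic: weight storage is roughly parameters × bytes per parameter, and at 16-bit precision each parameter takes 2 bytes.

```python
# Rough weight-storage estimate at 16-bit (2 bytes per parameter).
# Actual memory use at inference is higher: activations, the KV cache,
# and framework overhead all add to this floor.
for name, params in [("7B", 7e9), ("13B", 13e9), ("70B", 70e9)]:
    gib = params * 2 / 1024**3
    print(f"Llama 2 {name}: ~{gib:.0f} GiB of weights")
```

The 7B model lands at roughly 13 GiB, matching the size quoted above, while the 70B model at around 130 GiB is firmly in multi-GPU territory.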

In addition, each of the three model sizes is available in two versions: as a pretrained language model and as a chat model fine-tuned for dialogue. Let's look at the difference:

The pretrained version of Llama 2 has learned the structure and patterns of large volumes of text, code, and other sequential data. Given a sequence of tokens, it predicts the token that should come next; in short, it is a model trained to continue a sequence of text coherently.

In addition to these models, there are three chat models that take the pretrained ones as a starting point and undergo a process of fine-tuning to carry out natural conversations. To do this, Llama 2's authors collected a dataset of around 100,000 questions and answers and trained the neural network in a supervised fashion to produce responses similar to those observed. However, this process alone is not enough.
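Mechanically, this supervised stage minimizes the usual next-token cross-entropy, but only over the tokens of the annotated answer. A minimal sketch of that loss, with made-up probabilities standing in for the model's actual softmax outputs:

```python
import math

# Hypothetical probabilities the model assigns to the *correct* next token
# at each position of a (question, answer) pair. In real fine-tuning these
# come from the network's softmax over the whole vocabulary.
prompt_probs = [0.9, 0.8]          # tokens of the question
response_probs = [0.7, 0.6, 0.5]   # tokens of the reference answer

# The loss is averaged over response tokens only: the model is taught to
# reproduce the annotated answer, not to re-predict the question.
loss = -sum(math.log(p) for p in response_probs) / len(response_probs)
print(f"supervised fine-tuning loss: {loss:.3f}")
```

Training nudges the network's weights to raise those response-token probabilities, driving the loss toward zero over the whole dataset.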

The second part of this fine-tuning process is Reinforcement Learning from Human Feedback (RLHF). Human annotators evaluate the responses the model generates for quality and safety and assign them ratings; those ratings are used to train a reward model that learns to score responses automatically. The chat model then follows a process of trial and error: it receives a prompt, produces a response, the response is scored, and based on that score the model adjusts its behavior to achieve better and better ratings over time.
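The trial-and-error loop can be caricatured in a few lines. This is a deliberately simplified stand-in: the real setup optimizes an LLM's weights with reinforcement learning against a learned reward model, whereas here the "policy" is just a preference weight per canned response and the "reward model" is a fixed scoring table.

```python
import random

random.seed(0)  # make the toy run reproducible

responses = ["unsafe answer", "vague answer", "helpful, safe answer"]
# Stand-in for a reward model trained on human quality/safety ratings:
reward = {"unsafe answer": -1.0, "vague answer": 0.2, "helpful, safe answer": 1.0}
weights = [1.0, 1.0, 1.0]  # initial policy: every response equally likely

LR = 0.5
for _ in range(200):
    # Trial: sample a response in proportion to the current weights...
    i = random.choices(range(len(responses)), weights=weights)[0]
    # ...evaluation and update: reinforce it by its rating.
    weights[i] = max(1e-3, weights[i] + LR * reward[responses[i]])

best = responses[max(range(len(responses)), key=lambda i: weights[i])]
print(best)  # the policy drifts toward the highest-rated response
```

Highly rated responses become more likely to be sampled, which earns them further reinforcement; poorly rated ones fade away, which is the essence of what RLHF does to the chat model's behavior.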

Remember that at Escape Velocity Labs we teach the fundamentals and advanced algorithms of Reinforcement Learning in our course series on this branch of AI.




Benchmarks and Conclusion


Llama 2 has been compared with other open-source LLMs on a variety of benchmarks, obtaining superior results on all of them. The following chart shows the results obtained on each of these benchmarks:



In short, Llama 2 represents a significant leap in the development of open-source AI, and its compact size will allow thousands of developers to extend, improve, and advance language models at an ever-increasing pace.


