Last April, Meta announced that it was working on a rather exciting project: a large language model that is entirely “open source”, yet capable of competing with the leaders in the sector. The fruit of this work, Llama 3.1, has just been officially presented in a company press release.
This nomenclature is a bit misleading, because it suggests that this is only a minor evolution of the Llama 3 released a few months earlier. But it is quite the opposite: the new model is significantly larger and more complex. In particular, the number of parameters (the set of weights and biases, the numerical values that define the strength of the links between the different virtual neurons of the network) has exploded: it goes from 70 billion for the largest version of Llama 3 to 405 billion for version 3.1.
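To make the notion of “parameters” concrete, here is a toy sketch that counts the weights and biases of a small fully connected network. This is a generic illustration, not Meta’s architecture: transformer LLMs count parameters the same way, just across attention and feed-forward blocks instead of simple dense layers.

```python
# Toy illustration: the parameters of a dense network are its weight
# matrices (in_features * out_features values each) plus its bias
# vectors (out_features values each). A 405-billion-parameter model
# is simply this count taken to an enormous scale.

def count_parameters(layer_sizes):
    """Total weight + bias count for a dense network whose layer
    widths are given in order, e.g. [784, 256, 10]."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # weight matrix of this layer
        total += n_out         # bias vector of this layer
    return total

print(count_parameters([784, 256, 10]))  # 784*256 + 256 + 256*10 + 10 = 203530
```

Scaling the widths (and, in a real transformer, the number of layers and attention heads) is what drives the jump from tens of billions to hundreds of billions of parameters.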
A real rival for GPT-4
Granted, that’s still a long way from GPT-4, which reportedly has a total of 1,760 billion parameters (or, more precisely, 8 x 220 billion, since this model is said to be built around a Mixture-of-Experts architecture, in which several semi-independent expert subnetworks each handle part of the work). But it is common knowledge that size is not everything, and Meta maintains that its product has nothing to envy OpenAI’s. Apparently, Llama 3.1 even outperformed GPT-4o on some benchmarks.
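The Mixture-of-Experts idea can be sketched in a few lines: a small gating network scores the experts for each input, and only the top-scoring experts actually run, so most parameters sit idle on any given token. This is a generic, deliberately simplified illustration (plain linear maps, raw scores instead of a softmax), not GPT-4’s actual, unpublished implementation.

```python
import random

random.seed(0)

NUM_EXPERTS = 8   # matches the rumored GPT-4 figure; purely illustrative
DIM = 4           # tiny embedding size for the sketch

# Each "expert" is a small linear map; the gate is another linear map
# that produces one relevance score per expert for a given input.
experts = [[[random.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]
           for _ in range(NUM_EXPERTS)]
gate = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def moe_forward(x, top_k=2):
    """Route x to the top_k highest-scoring experts and average their
    outputs. The other experts are never evaluated, which is the whole
    point: compute cost scales with top_k, not with NUM_EXPERTS."""
    scores = matvec(gate, x)                        # one score per expert
    ranked = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)
    chosen = ranked[:top_k]                         # only these experts run
    outputs = [matvec(experts[i], x) for i in chosen]
    return [sum(col) / top_k for col in zip(*outputs)]

y = moe_forward([1.0, 0.5, -0.3, 2.0])
print(len(y))  # output keeps the input's dimensionality
```

This is why an MoE model’s headline parameter count overstates its per-token compute: with 8 experts and top-2 routing, only about a quarter of the expert parameters are active for any given input.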
On IFEval, a benchmark that measures how well a model follows instructions, it scores 88.6 points, 3 more than GPT-4o. It also does very slightly better on GSM8K (mathematics) and ARC Challenge (reasoning), and seems significantly more comfortable across languages (+5 points on Multilingual MGSM). The exact numbers should be taken with a grain of salt, but overall these results suggest that Llama 3.1 plays in more or less the same league as the industry leaders.
It also stands out thanks to its extensive compatibility. To facilitate its integration into the digital ecosystem, Meta has signed partnerships with big names such as Amazon Web Services, Nvidia, Microsoft, IBM and Google Cloud.
An “open source” model, really?
The company, however, remains very discreet about the data used to train this model; it simply specifies that it relied on synthetic data, that is, data generated or labeled by other language models (in this case Llama 2). As always, it is practically impossible to know whether Meta drew on copyrighted content to fuel its creation. This is an important point, because this way of proceeding goes completely against one of Meta’s main claims.
Indeed, Meta regularly repeats that Llama is a family of “open source” models. It is true that the company provides access to a significant portion of the code and parameters of its models. In this respect, Llama 3.1 is indeed more open than the models of OpenAI, which jealously guards the parameters of its GPT line. Mark Zuckerberg himself has even published a blog post on “open source AI”. As you will have understood, Meta is banking a lot on this label… but that does not mean the label is justified in this context.
Being more open than a fully closed model is not enough to meet the open source criteria, and it is clear that Llama does not really fit into this category. In addition to denying transparency about its training data, Meta also imposes significant restrictions on the use of its product. The license explicitly states that the company can at any time cut off access to the model and prevent users from exploiting its outputs. Talking about open source in this context is therefore a gross abuse of language.
Open source as a moral guarantee
And this is far from being an innocent mistake. Meta is obviously aware that its product does not meet the traditional criteria of open source. We can therefore interpret this recurring use of the term as a desire to bend reality for communication purposes, using this vocabulary as a kind of moral guarantee. In essence, Meta is doing everything to mark its difference from the closed models of OpenAI and others in order to give an almost philanthropic dimension to its AI work, even if it means muddying the waters and usurping a title whose positive image it appropriates without respecting the standards that go with it.
Inevitably, this approach considerably irritates the supporters of true open source. Since the earliest days of the web, entire communities of dedicated enthusiasts have worked to create remarkable tools that anyone can use and modify as they wish, regardless of context, with the sole motivation of contributing to the digital ecosystem. Meta’s approach can therefore be read as a betrayal of the ideals embodied by these initiatives, and it could lead to a regrettable devaluation of the label.
That said, those with the necessary technical skills can now experiment with Llama 3.1 by downloading the different versions of the model from the Meta site or via Hugging Face. Just be sure to read the license and the usage policy first to avoid any unpleasant surprises.
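For those going the Hugging Face route, loading a checkpoint with the `transformers` library typically looks like the sketch below. To be clear about the assumptions: the repository id is illustrative, the weights are gated behind Meta’s license (you must request access on huggingface.co first), and the 405B variant needs far more memory than a consumer machine. Nothing is downloaded until `load_llama()` is actually called.

```python
# Sketch of loading a Llama 3.1 checkpoint with Hugging Face transformers.
# Assumptions: the repo id below is illustrative; access requires
# accepting Meta's license on huggingface.co; the larger variants
# require multiple high-memory GPUs.

def load_llama(repo_id="meta-llama/Llama-3.1-8B-Instruct"):
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        device_map="auto",   # spread layers across available GPUs
        torch_dtype="auto",  # keep the checkpoint's native precision
    )
    return tokenizer, model

def generate(prompt, tokenizer, model, max_new_tokens=64):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(out[0], skip_special_tokens=True)
```

Running it is then a matter of `tokenizer, model = load_llama()` followed by `generate("Hello", tokenizer, model)`; the usage restrictions discussed above apply to anything the model produces.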
Source: www.journaldugeek.com