LLaMA: Meta’s Open-Source Rival to Google and OpenAI
Meta’s LLaMA is open-source, challenging Google and OpenAI. Learn more about LLaMA, how it fares against competitors, and the future of LLM development.
Key Takeaways
LLaMA (Large Language Model Meta AI) is a collection of four foundational large language models ranging from 7B to 65B parameters, each requiring different computational power. It was first announced in a research paper and a blog post by Meta AI.
LLaMA is an autoregressive transformer model similar to other popular LLMs. It can also be fine-tuned to carry out diverse tasks. Because of its high-quality training data and robust architecture, LLaMA delivers state-of-the-art performance on benchmarks across multiple domains.
LLaMA either outperforms or competes with LLMs like Chinchilla and PaLM in common sense reasoning, question answering, reading comprehension, and code generation.
What makes Meta different from competitors like Google and OpenAI is that it has shared the weights of LLaMA under a non-commercial license. By releasing the weights, Meta has allowed developers to study the model’s inner workings, replicate its performance, and further customize it for specific applications. These weights were also leaked in the form of a downloadable torrent.
LLaMA’s supposed leak and the reactions of Meta’s competitors re-emphasize the importance of the debate between open-source and proprietary development. Despite the huge push for open-source and the remarkable developments made there, the danger of these technologies cannot be ignored.
In the artificial intelligence industry, most of the power lies, and will continue to lie, with major corporations like Google, Microsoft, Meta, and Amazon, because they command the vast resources such systems require. Their approach to AI development should be conscious of ethics and safety.
This post is sponsored by Multimodal, a NYC-based development shop that focuses on building custom natural language processing solutions for product teams using large language models (LLMs).
With Multimodal, you will reduce your time-to-market for introducing generative AI in your product. Projects take as little as 3 months from start to finish and cost less than half of what a newly formed NLP team would, without any of the hassle. Contact them to learn more.
OpenAI and Google have been racing to build the best large language model for a while now. They have made great progress with multiple GPT models and Bard. But Meta is now giving them a run for their money with LLaMA, its latest open-source large language model.
In many ways, LLaMA is similar to other LLMs out there. In this article, we will explore Meta’s LLaMA, its key features and comparative performance, and why its open-source nature sparks debate about the future of artificial intelligence research.
What Is Meta’s LLaMA?
LLaMA (Large Language Model Meta AI) is a collection of four foundational large language models ranging from 7B to 65B parameters, each requiring different computational power. It was first announced in a research paper and a blog post by Meta AI. The four LLaMA models are available on Hugging Face and Meta’s official GitHub repository.
LLaMA was trained exclusively on open, publicly accessible data from sources like CommonCrawl, C4, GitHub, and Wikipedia. LLaMA’s datasets include data in multiple languages, and their quality is comparable to that of other major LLMs like PaLM and Chinchilla.
LLaMA is an autoregressive transformer model similar to other popular LLMs: it generates text one token at a time, with each new token conditioned on everything generated so far. It can also be fine-tuned to carry out diverse tasks. Because of its high-quality training data and robust architecture, LLaMA delivers state-of-the-art performance on benchmarks across multiple domains.
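To make “autoregressive” concrete, here is a minimal sketch of the generation loop such models use: predict the next token, append it to the context, and repeat. The bigram “model” below is a hypothetical stand-in for a real LLM’s next-token predictor and has nothing to do with LLaMA’s actual weights or code.

```python
# Toy next-token scores keyed on the most recent token only.
# A real LLM conditions on the full context with a transformer.
BIGRAMS = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.9, "</s>": 0.1},
    "dog": {"sat": 1.0},
    "sat": {"</s>": 1.0},
}

def next_token(tokens):
    """Greedy step: pick the highest-scoring next token."""
    scores = BIGRAMS.get(tokens[-1], {"</s>": 1.0})
    return max(scores, key=scores.get)

def generate(prompt, max_tokens=10):
    """Autoregression: each output token is fed back in as input."""
    tokens = list(prompt)
    for _ in range(max_tokens):
        tok = next_token(tokens)
        if tok == "</s>":  # end-of-sequence token stops generation
            break
        tokens.append(tok)
    return tokens

print(generate(["<s>"]))  # ['<s>', 'the', 'cat', 'sat']
```

Real models replace the lookup table with a neural network and usually sample from the score distribution rather than always taking the greedy maximum, but the outer loop is the same.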
One of the most notable aspects of LLaMA is that it can run on computers with significantly less processing power than other LLMs require, including OpenAI’s latest GPT versions. For instance, LLaMA 13B can run on a single Nvidia Tesla V100 GPU.
Before we explore its comparative functionality with other LLMs, let’s learn a bit more about its key features and latest developments.
LLaMA: Release and Reception
When Meta first released LLaMA in February 2023, it intended to keep the model open-source so that the AI community, including researchers and educators, could use LLMs easily and with significantly less expensive hardware than usually required for such models. According to Meta:
“Even with all the recent advancements in large language models, full research access to them remains limited because of the resources that are required to train and run such large models. This restricted access has limited researchers’ ability to understand how and why these large language models work, hindering progress on efforts to improve their robustness and mitigate known issues.” (Source)
As a result, permission to use LLaMA was granted on a case-by-case basis through the official Meta website under a noncommercial license. Developers can still apply for official access.
LLaMA is not a public chatbot like ChatGPT. So if you’re an average internet user trying to use apps built on LLMs, LLaMA is not for you. It is a raw foundation model whose components require significant technical expertise — the kind independent AI researchers, businesses, or educators would have.
Is Meta the only company that has made its LLM publicly accessible? Certainly not. But what makes Meta different is that it has shared the weights of LLaMA under a non-commercial license.
Weights are the parameters used and adjusted during the model training phase. They are the specific mathematical values that the system learns as it analyzes data. In a nutshell, weights represent the learned knowledge and patterns from the training data.
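A toy example makes this concrete. The sketch below fits a single weight w so that y = w·x matches the data; an LLM performs the same kind of adjustment across billions of weights. This is illustrative only and unrelated to LLaMA’s actual training code.

```python
# Training data following the pattern y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w = 0.0    # the weight starts arbitrary...
lr = 0.05  # learning rate
for _ in range(200):
    # Gradient of mean squared error with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad  # nudge w to reduce the error

# ...and ends up encoding the pattern learned from the data: w ~ 2.0
print(round(w, 3))
```

Releasing weights means releasing exactly these learned values (billions of them, for LLaMA), which is why others can then run, study, and fine-tune the model.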
By releasing them, Meta has allowed developers to study the model’s inner workings, replicate its performance, and further customize it for specific applications.
In the case of LLMs like GPT and its subsequent versions, developers had access to the API through OpenAI’s official website. But even with APIs, doing something custom with the LLM on consumer-grade, affordable hardware wasn’t possible. LLaMA has made it possible, which makes it appealing to the AI community.
Currently, LLaMA’s 13B version can run on an A100 GPU, which is an enterprise-grade graphics card. Most AI professionals, especially those actively working in the research and development of AI systems, have access to this kind of hardware. Those who do not can rent it over the cloud affordably. This way, LLaMA is not just more adaptable but also affordable to run and customize.
For instance, a software developer recently made a version of LLaMA that could run on a Mac. People have also successfully, albeit slowly, run it on a Pixel 6 and a Raspberry Pi.
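A rough back-of-envelope calculation shows why model size dictates hardware requirements: at inference time, the weights alone occupy roughly parameter count × bytes per parameter of memory. The figures below ignore activations, caches, and framework overhead, so treat them as lower bounds, not exact numbers.

```python
# Estimate the memory footprint of a model's weights alone.
def weight_memory_gb(n_params, bytes_per_param=2):
    """2 bytes per parameter corresponds to fp16 precision."""
    return n_params * bytes_per_param / 1024**3

for billions in (7, 13, 65):
    gb = weight_memory_gb(billions * 10**9)
    print(f"LLaMA {billions}B: ~{gb:.0f} GB of weights in fp16")
```

By this estimate, LLaMA 13B’s weights (~24 GB in fp16) fit in a single 32–40 GB data-center GPU, while 65B (~121 GB) needs multiple GPUs — and the community ports to Macs and phones rely on aggressive quantization (fewer bytes per parameter) to shrink these footprints further.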
LLaMA has the same shortcomings that most major large language models share, including bias, toxicity, and misrepresentation. The model also generates false or inaccurate responses.
The Great “Leak”
Just a week after Meta announced LLaMA, its weights were leaked online on 4chan. A downloadable torrent containing the model was released to the public, sparking curiosity and debate.
Since the leak, there has been much speculation about its source, with some experts even accusing Meta of orchestrating the leak and the subsequent cover-up. Meta did issue a vague statement regarding the leak:
“While the [LLaMA] model is not accessible to all … some have tried to circumvent the approval process.” (Source)
Moreover, the company also filed takedown requests against online copies of the model, claiming copyright infringement.
Soon after the LLaMA leak, users confirmed that the leaked version was authentic, matching the original distributed by Meta. Like every other open-source release in the AI industry, LLaMA’s release came with fears of misuse. It also prompted tit-for-tat responses from competitors like Google.
Just weeks after the LLaMA weight leak, Google released the API to its large language model PaLM. PaLM is Google’s most powerful language model to date. Along with the API, Google also released MakerSuite to make using the API easier for developers. According to the press release by Google:
“With MakerSuite, you’ll be able to iterate on prompts, augment your dataset with synthetic data, and easily tune custom models” (Source)
PaLM is very similar to the GPT family of LLMs and arguably more powerful than many GPT versions. LLaMA 65B is competitive with PaLM 540B, and LLaMA also competes well with DeepMind’s Chinchilla 70B.
How Does LLaMA Compare to Other Large Language Models?
The fact that LLaMA is more adaptable and can run on less powerful hardware does raise doubts about how capable it really is. But benchmarks run by its development team show strong performance.
The researchers compared the model’s performance to major existing LLMs on different types of tasks. Here are the key results:
Common Sense Reasoning: LLaMA performed exceptionally well on common sense reasoning tasks. It outperformed SOTA model architectures in PIQA, SIQA, and OpenBookQA reasoning benchmarks. As the research paper outlines: “LLaMA 65B outperforms Chinchilla-70B on all reported benchmarks but BoolQ. Similarly, this model surpasses PaLM 540B everywhere but on BoolQ and WinoGrande. LLaMA 13B model also outperforms GPT-3 on most benchmarks despite being 10 times smaller.”
Closed Book Question Answering and Trivia: When tested on its ability to answer questions without access to reference documents, LLaMA outperformed GPT-3, Chinchilla, PaLM, and Gopher on almost all benchmarks.
Reading Comprehension: LLaMA has a similar performance to PaLM 540B and outperforms GPT-3 in reading comprehension.
Code Generation: LLaMA outperforms Google’s LaMDA and PaLM in code generation on standard benchmarks.
Massive Multitask Language Understanding: These tasks involve answering multiple-choice questions, mainly from humanities and STEM subjects. LLaMA 65B performed worse than Chinchilla 70B and PaLM 540B on standard MMLU benchmarks.
Open-Source or Proprietary — What Does the Future of LLMs Look Like?
There’s always been a battle between open-source and proprietary technology, and it’s likely that the trend will continue.
Decades ago, Tim Berners-Lee refused to patent the World Wide Web, a decision that enabled a massive revolution. At the same time, there have been instances of things going horribly wrong, with people misusing AI to create deepfakes, phishing messages, and more.
LLaMA’s supposed leak and the reactions of Meta’s competitors re-emphasize the importance of this debate.
On one hand, releasing such powerful technologies can have an excellent impact on research, development, and accessibility. Smaller businesses and researchers can both use these models and improve their functionality and impact. Many AI researchers firmly believe that the future of this industry should be open; otherwise, they argue, the industry’s own development will be hindered.
Companies like Stability AI, the creator of Stable Diffusion, also recently open-sourced their AI-powered design studio. Other companies that have released free AI software include Feather AI.
The open-source community also argues, as did Meta officials, that powerful technologies such as LLMs and other generative AI tools should not be monopolized by just a few big players like OpenAI and Google.
Obviously, monopolies are never good. More importantly, they also stifle the broader development and improvement of AI.
After OpenAI released ChatGPT last year, the sudden surge in the number of AI-related startups showed how valuable this kind of technology is to multiple industries. Yet control over it remains with large corporations whose loyalties lie with investors. For example, Microsoft’s entire AI ethics team was laid off recently, after its huge investment in OpenAI, raising doubts about the company’s commitment to ethical AI development.
Despite the huge push for open-source and the remarkable developments made there, the danger of these technologies cannot be ignored. For example, researchers from Stanford were able to release Alpaca, a fine-tuned version of LLaMA. This version was quickly used to generate problematic and violent text. As the New York Times points out:
“In one instance, the system provided instructions for disposing of a dead body without being caught. It also generated racist material, including comments that supported the views of Adolf Hitler.” (Source)
While Stanford took Alpaca down quickly, most malicious actors won’t be as responsible as Stanford. That doesn’t mean these technologies shouldn’t be widely available, but their distribution should be more carefully regulated.
Truly Open Or Another Play of Power?
Meta has repeatedly argued that releasing LLaMA’s weights helps democratize AI research. But it’s also important to remember that Meta, too, is a multi-billion-dollar corporation, just like Google, Microsoft, and Amazon.
These corporations have enormous resources to develop such powerful systems, so it's safe to say that they’ll continue to have the most “AI power” in the near future.
The industry would be truly democratized if smaller firms could compete effectively with these giants. But given the sheer scale of AI models and the complexity of their research and development, that is unlikely. The responsibility, therefore, lies largely with these corporations to develop and distribute AI systems ethically — whether paid or open-source.