What is open source AI? New definition shows Meta’s version isn’t what it claims to be

The Open Source Initiative has just set a new international definition for open source AI that could throw a spanner in the works for tech companies.

Meta and some other technology firms rolling out so-called open source generative artificial intelligence (AI) models are “depriving the public from having innovation cycles” and making a profit from it, according to the group that has pioneered the open source term in software for the past 25 years.

Open source is yet another buzzword in AI circles, embraced by Big Tech companies such as Meta and by the makers of Elon Musk’s Grok AI model. Open source is “good for the world,” according to Facebook founder Mark Zuckerberg.

But no one can agree on what open source AI means.

That could change as the Open Source Initiative (OSI), the organisation that is the self-appointed steward of the term, sets a final definition for open source AI on Monday, and it is not the same as Meta’s version of the term.

“They fail, especially Meta, because their terms of use and terms of distribution are incompatible with the open source definition and the open source principles,” Stefano Maffulli, who heads the OSI, told Euronews Next.

“They’re basically the Microsoft, the Oracle, the Adobe of this space where they say ‘build on top of my platform, don’t worry about it, and I’ll keep on getting grants from you using our platforms.’ But they also say ‘it’s open, so everyone can use it,’” he added.

What is the OSI’s AI definition?

The OSI definition took a couple of years to cook up, and the organisation consulted a 70-person group of researchers, lawyers, policymakers and activists as well as representatives of big tech companies such as Microsoft, Meta, and Google.

It states that an open source AI system can be used for any purpose without getting permission from the company behind it, and that researchers should be able to freely inspect how the system works.

It also says that the AI system can be modified for any purpose, including to change its output, and shared for others to use, with or without modifications, for any purpose.

Meta’s Llama 3.1 model is partially open source, according to the OSI definition, in that developers and researchers can download it for free and customise it.

But Meta does not specify where it got the data to train Llama 3.1, which can be problematic for users, as it could raise copyright issues or mean the model was trained on biased data.

Maffulli said that when tech companies do say where the data comes from, they are often vague and simply say “the Internet”. But he said that the “real innovation”, and the reason some AI models perform better, lies in how the datasets are passed through the training machinery.

“If you talk to companies, they don’t want to release that code,” Maffulli said, adding that “that’s where the innovation happens”.

What are the consequences?

By blurring which AI models are truly open source, Meta and other firms may hamper the long-term development of AI models that are controlled by the user rather than by a few tech companies, Maffulli said.

“I fear that society as a whole would be in a worse place if we let a handful of companies just go on and be the only ones who have the edge and the access to innovation this way,” he added.

Euronews Next contacted Meta for a reaction but did not receive a reply at the time of publication.

However, Zuckerberg said in a blog post “we’re taking the next steps towards open source AI becoming the industry standard” and said that Llama “has been a bedrock of AI innovation globally”.

Maffulli said that other companies such as Microsoft and Google had stopped using the open source term for models that were not fully open under the definition. But he said that talks with Meta did not produce any result.

What is in it for Meta?

The open source label can have positive connotations for a tech company’s image, as it suggests the technology is free to use.

But confusion around the term can lead to ‘openwashing’, experts have previously told Euronews Next. This means companies promote their models as open without contributing to the commons, which can affect innovation and the public’s understanding of AI.

Using the open source term can also benefit a company’s bottom line: when other companies build on the open source technology, it becomes easier for the company behind it to integrate new innovations into its own products.

In a February earnings call, Zuckerberg said: “Open source software often becomes an industry standard, and when companies standardise on building with our stack, that then becomes easier to integrate new innovations into our products”.

The open source future

Unlike in the 2000s, when social media and the Big Tech companies took off and were largely unregulated, Maffulli believes it will be a different story with AI as now “regulators are watching and are already regulating”.

While the OSI is the steward of the open source AI definition, it does not have any strong power to enforce it. However, judges and courts around the world are starting to recognise that the open source definition is important, especially when it comes to mergers but also regulation.

“We do expect the definition to have an impact on regulators,” Maffulli said.

“They’re watching us. We have become credible interlocutors”.