Stanford AI Team Engulfed in Plagiarism Scandal: Llama 3-V Accused of Copying Tsinghua’s Model
On May 29, a team from Stanford University announced Llama 3-V, a multimodal AI model they claimed rivaled leading models such as GPT-4V, Gemini Ultra, and Claude Opus while being significantly smaller and cheaper to train. The excitement was short-lived, however, as accusations of plagiarism soon emerged, alleging that Llama 3-V borrowed heavily from MiniCPM-Llama3-V 2.5, a model developed by Mianbi Intelligence, an AI company affiliated with Tsinghua University. The controversy escalated as evidence surfaced that Llama 3-V might have copied substantial parts of the MiniCPM model, igniting a heated debate in the AI community.
Key Takeaways
- Model Announcement: Stanford's team claimed Llama 3-V rivaled leading AI models at a fraction of their size and training cost.
- Plagiarism Allegations: The model was accused of copying from Tsinghua's MiniCPM-Llama3-V 2.5, including its structure and code.
- Evidence of Plagiarism: Detailed comparisons revealed striking similarities between the models, including shared configurations and codebases.
- Stanford's Defense: The Stanford team denied the accusations, stating they used only the tokenizer from MiniCPM.
- Deletion of Evidence: Following the controversy, the Stanford team deleted related posts and repositories, further fueling suspicions.
Analysis
The scandal began when the Stanford team published an article on Medium touting Llama 3-V's capabilities. The model was promoted as a state-of-the-art multimodal AI that was significantly smaller and cheaper to train than its competitors. However, AI enthusiasts and experts soon noticed that Llama 3-V bore an uncanny resemblance to Tsinghua's MiniCPM-Llama3-V 2.5.
Several pieces of evidence were presented to support these allegations:
- Model Structure and Code: Comparisons showed that Llama 3-V and MiniCPM-Llama3-V 2.5 shared nearly identical structures and configurations, differing only in variable names (a minimal sketch of this kind of configuration comparison follows this list).
- Tokenization Process: The Stanford team claimed to have used only the tokenizer from MiniCPM. However, it was pointed out that the specific tokenizer used in MiniCPM-Llama3-V 2.5 was not publicly available before Llama 3-V's development, raising questions about how Stanford accessed it.
- Behavioral Similarities: Tests revealed that Llama 3-V's performance and errors closely mirrored those of MiniCPM-Llama3-V 2.5, suggesting more than coincidental similarity.
- Deleted Repositories: The Stanford team's abrupt deletion of the project's GitHub and Hugging Face repositories further intensified the controversy, suggesting an attempt to cover up the issue.
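To make the first point concrete, here is a minimal sketch of the kind of configuration comparison observers performed. It assumes local copies of each project's Hugging Face-style config.json; the file paths and the diff helper are hypothetical illustrations, not the actual analyses that circulated in the community.

```python
import json

def load_config(path):
    # Load a Hugging Face-style config.json into a plain dict.
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def diff_configs(a, b):
    # Print every key whose value differs between the two configs;
    # an empty report means the configurations are effectively identical.
    for key in sorted(set(a) | set(b)):
        if a.get(key) != b.get(key):
            print(f"{key}: {a.get(key)!r} vs {b.get(key)!r}")

# Hypothetical local paths to the two repositories' config files.
diff_configs(
    load_config("llama3-v/config.json"),
    load_config("minicpm-llama3-v-2_5/config.json"),
)
```

A near-empty diff across dozens of architectural fields (hidden sizes, layer counts, vision-encoder settings) is exactly the kind of signal that prompted the allegations.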
In response, the Stanford team provided a defense that was met with skepticism. They claimed their work predated the release of MiniCPM-Llama3-V 2.5 and that their model used publicly available configurations. However, inconsistencies in their explanations and the striking similarities between the models led to widespread disbelief.
The controversy reached a peak when Mianbi Intelligence's team provided additional proof, including specific capabilities such as the recognition of ancient Chinese characters from the Tsinghua (Qinghua) Bamboo Slips, which at the time were exclusive to MiniCPM-Llama3-V 2.5. This level of detail, they argued, could not have been replicated without access to their proprietary data.
Did You Know?
- Multimodal AI Models: These models, like Llama 3-V and MiniCPM-Llama3-V 2.5, are designed to process and interpret multiple types of data inputs (e.g., text, images) simultaneously, significantly enhancing their versatility and application range.
- Tokenizer: This is a crucial component of AI language models that breaks text down into manageable pieces (tokens), making it easier for the model to process and understand. The specificity and customization of tokenizers are essential to the accuracy and efficiency of AI models (a minimal usage sketch follows this list).
- Tsinghua Bamboo Slips: Also romanized as Qinghua, these ancient Chinese texts date back to the Warring States period (475–221 BC) and are considered extremely rare and valuable for historical research. The ability of an AI model to recognize and interpret these texts indicates a high level of sophistication and specialized training.
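As a minimal illustration of what a tokenizer does, the sketch below uses the Hugging Face transformers library with the public gpt2 checkpoint. The checkpoint is chosen arbitrarily for demonstration and has no connection to the models in this story.

```python
from transformers import AutoTokenizer

# Any public checkpoint works for illustration; "gpt2" is used here
# purely as an example and is unrelated to the models in this article.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "Tokenizers split text into subword units."
tokens = tokenizer.tokenize(text)  # human-readable subword pieces
ids = tokenizer.encode(text)       # integer IDs the model actually consumes

print(tokens)  # e.g. ['Token', 'izers', 'Ġsplit', ...]
print(ids)     # the matching vocabulary IDs
```

Customized tokenizers often add model-specific special tokens, so an exact match between two supposedly independent models is a strong signal, which is why the tokenizer figured so prominently in the allegations.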
The Llama 3-V plagiarism scandal has sparked intense debate in the AI community, highlighting the ethical challenges and competitive pressures in the field of artificial intelligence research. The outcome of this controversy could have significant implications for academic integrity and intellectual property in AI development.