Stability AI swiftly releases FreeWilly, a Llama 2 fine-tune that rivals ChatGPT! Netizens exclaim that the rules of the game have changed
Source: Xinzhiyuan
As soon as Meta's Llama 2 was released, it set the entire open-source community abuzz.
As OpenAI scientist Andrej Karpathy put it, this was an extremely important day for the field of large language models: among all models with openly available weights, Llama 2 is the most powerful.
From here on, the gap between open-source and closed-source large models will narrow further, and the opportunity to build on top of large models is open to every developer.
Just now, Stability AI and CarperAI Labs jointly released FreeWilly2, a fine-tuned model based on Llama 2 70B.
They also released FreeWilly1, fine-tuned from the original LLaMA 65B model.
Across various benchmarks, FreeWilly2 demonstrates excellent reasoning ability and even surpasses GPT-3.5 on some tasks.
Both models are research experiments and are released under a non-commercial license.
Data generation and collection
Stability AI said that the training of the FreeWilly model was directly inspired by the Microsoft paper "Orca: Progressive Learning from Complex Explanation Traces of GPT-4".
However, while the data generation process is similar, the sources are different.
FreeWilly's dataset variant contains 600,000 data points (roughly 10% of the dataset size used in the original Orca paper), and the models were bootstrapped from high-quality instruction datasets created by Enrico Shippole (see the loading sketch after this list):
COT Submix Original
NIV2 Submix Original
FLAN 2021 Submix Original
T0 Submix Original
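For concreteness, here is a minimal sketch of pulling these subsets from the Hugging Face Hub with the `datasets` library. The repo ids are an assumption (Enrico Shippole publishes under `conceptofmind`), not something stated in the article; adjust them if the mirrors live elsewhere.

```python
from datasets import load_dataset

# Assumed Hub repo ids for the four instruction subsets listed above.
SUBSETS = [
    "conceptofmind/cot_submix_original",
    "conceptofmind/niv2_submix_original",
    "conceptofmind/flan2021_submix_original",
    "conceptofmind/t0_submix_original",
]

for repo_id in SUBSETS:
    ds = load_dataset(repo_id, split="train")
    print(repo_id, len(ds))
```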
Using this approach, Stability AI generated 500,000 examples with a simpler LLM and a further 100,000 examples with a more capable LLM.
Although the training set is only one-tenth the size of the one in the original Orca paper, the resulting FreeWilly models not only perform well across benchmarks but also validate the feasibility of synthetically generated datasets.
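The two-tier generation step might look like the following sketch. This is illustrative only, not Stability AI's actual pipeline: the system prompts and the `stub_teacher` callable are hypothetical stand-ins for real teacher-model API calls.

```python
import json

# Illustrative system prompts: the simpler teacher answers directly, while the
# stronger teacher is prompted for step-by-step explanation traces, as in Orca.
SIMPLE_SYSTEM = "You are a helpful assistant. Answer concisely."
COMPLEX_SYSTEM = (
    "You are a helpful assistant. Think step by step and explain your "
    "reasoning before giving the final answer."
)

def generate_examples(instructions, teacher_complete, system_prompt):
    """Have a teacher model answer each instruction, keeping the full trace."""
    examples = []
    for instruction in instructions:
        response = teacher_complete(system=system_prompt, prompt=instruction)
        examples.append(
            {"system": system_prompt, "instruction": instruction, "response": response}
        )
    return examples

def stub_teacher(system, prompt):
    # Placeholder for a real LLM API call (hypothetical).
    return f"[teacher response to: {prompt}]"

# Mirroring the split described above: most examples would come from a simpler
# teacher, with a smaller tranche from a more capable one.
data = generate_examples(["Why is the sky blue?"], stub_teacher, COMPLEX_SYSTEM)
print(json.dumps(data, indent=2))
```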
Evaluation of model performance
For performance evaluation, Stability AI's researchers used EleutherAI's lm-eval-harness, extended with the AGIEval benchmark.
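As a rough illustration of that kind of evaluation, here is a minimal sketch using the lm-evaluation-harness Python API (package `lm_eval`, v0.4-style). The task list is an assumption modeled on the Open LLM Leaderboard, not the exact configuration the researchers used.

```python
from lm_eval import evaluator

# Evaluate FreeWilly2 on a few standard tasks. Task names and settings are
# assumptions (Open LLM Leaderboard-style), not Stability AI's exact setup.
results = evaluator.simple_evaluate(
    model="hf",  # Hugging Face causal-LM backend
    model_args="pretrained=stabilityai/FreeWilly2,dtype=float16",
    tasks=["arc_challenge", "hellaswag", "truthfulqa_mc2"],
    batch_size=8,
)

for task, metrics in results["results"].items():
    print(task, metrics)
```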
Judging from the results, FreeWilly excels in many areas, including complex reasoning, grasping linguistic subtleties, and answering difficult questions in specialized domains such as law and mathematics.
Overall, FreeWilly2 reaches a level comparable to ChatGPT, and even surpasses it on some evaluations.
On the Open LLM Leaderboard, FreeWilly2 ranks first by a clear margin, with an average score about 4 percentage points higher than the original Llama 2.
For an open future
It is fair to say that FreeWilly1 and FreeWilly2 set a new standard for open-source large language models.
The two models not only advance research in the field and strengthen natural-language understanding, but also support the completion of complex tasks.
Stability AI says the team is very excited about the infinite possibilities these models bring to the AI community, and looks forward to the new applications they will inspire.
The company also extends heartfelt thanks to the passionate team of researchers, engineers, and partners whose extraordinary efforts and dedication enabled it to reach this important milestone.
Exciting times
As soon as the model was released, netizen "Phil Howes" got FreeWilly2 running in under a minute using Tuhin Srivastava's Llama v2 implementation.
After loading 275 GB of weights, the model ran at 23 tokens/s out of the box.
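For readers who want to try the same thing, a minimal sketch of loading FreeWilly2 with Hugging Face `transformers` follows. The prompt template mirrors the Orca-style format described on the model card, and the memory comment is a back-of-the-envelope assumption.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# 275 GB corresponds to the full-precision checkpoint; loading in float16
# roughly halves the footprint (still a multi-GPU job for 70B weights).
tokenizer = AutoTokenizer.from_pretrained("stabilityai/FreeWilly2")
model = AutoModelForCausalLM.from_pretrained(
    "stabilityai/FreeWilly2",
    torch_dtype=torch.float16,
    device_map="auto",  # shard layers across available GPUs
)

# Orca-style prompt template.
prompt = (
    "### System:\nYou are a helpful assistant.\n\n"
    "### User:\nExplain why the sky is blue in one sentence.\n\n"
    "### Assistant:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```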