
AI’s Next Wave: Beyond Scaling

For years, increasing compute and data has been a straightforward path to improving AI. Larger models trained on broader datasets perform better across a wide range of tasks, from language understanding to image generation. At TEDAI San Francisco in October 2024, Noam Brown, a research scientist at OpenAI, said, “The incredible progress in AI over the past five years can be summarized in one word: scale.”

However, recent trends suggest that scale alone may no longer be enough to deliver major improvements. Despite continued investment in ever-larger models, some companies are achieving competitive performance through alternative approaches. For example, the Chinese AI startup DeepSeek showed that state-of-the-art performance can be achieved at significantly lower cost, highlighting a growing shift away from the assumption that bigger is always better.

Is AI hitting a wall?

Initially, increasing model size and training data produced dramatic gains in performance. But as models grew, the benefits became smaller.

In May 2024, OpenAI CEO Sam Altman told staff that he expected the company’s next generative AI model, Orion, to be significantly better than the last flagship model released a year earlier. While Orion’s performance did end up exceeding that of previous models, the quality improvement was far smaller than the jump between GPT-3 and GPT-4. Even the names of OpenAI’s models seem to point to slowing progress: GPT-4, GPT-4o, o1, o3. And, in fact, Sam Altman recently announced on X that Orion will ship as GPT-4.5 rather than the expected GPT-5.

One challenge for AI: high-quality training data is limited. Models have already consumed most of the high-quality text available online, and what remains has diminishing impact. Teams are experimenting with AI-generated (synthetic) training data, but with limited success. Orion was trained on synthetic data produced by other OpenAI models. However, some fear this creates a new problem, where new models end up resembling the old ones in certain respects.

And while we can technically throw more compute (such as scaling to larger GPU clusters) at an AI problem, the cost of this strategy can be prohibitive. Brown said at TEDAI, “After all, are we really going to train models that cost hundreds of billions of dollars, or trillions of dollars? At some point, the scaling paradigm breaks down.”

What’s ahead?

With the payoff from brute-force scaling appearing to flatten, AI innovators are looking for new approaches. Instead of relying on ever-larger models and massive budgets, the spotlight is shifting toward more efficient, cost-effective strategies.

Small, task-specific models

Small language models are efficient, cost-effective, and can be fine-tuned for specific applications such as fraud detection or personalized recommendations. A data expert noted in an AlphaSense transcript a trend toward smaller, more purpose-built models, driven by cost considerations relative to model size. Another expert predicted the emergence of new platforms that will let companies build and customize their own small language models.
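To make the idea concrete, here is a minimal, hypothetical sketch of fine-tuning a small open model for one narrow task (flagging suspicious transaction descriptions). The model name, labels, and toy data are illustrative placeholders, not a recommendation of any particular stack.

```python
# Hedged sketch: fine-tune a small base model for a narrow classification task.
# The model name, labels, and two-example dataset are placeholders.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"  # any small base model could stand in here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

examples = Dataset.from_dict({
    "text": ["wire transfer to new overseas account", "monthly utility payment"],
    "label": [1, 0],  # 1 = suspicious, 0 = normal
})
examples = examples.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length", max_length=64),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="fraud-clf", num_train_epochs=1, per_device_train_batch_size=2),
    train_dataset=examples,
)
trainer.train()  # a real project would use thousands of labeled examples
```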

Illustrating this trend is AI startup Writer, which claims its latest language model matches the performance of top-tier models on many key benchmarks despite having, in some cases, only about a twentieth as many parameters.

These smaller architectures also pair well with hybrid strategies. Organizations can rely on small models for the most targeted use cases and tap larger models when they need broad domain knowledge or more complex reasoning.
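A rough sketch of what such a hybrid setup might look like in practice: the routing rule below is deliberately simple, and `call_small_model` / `call_large_model` are hypothetical placeholders for whatever local model and hosted API an organization actually uses.

```python
# Minimal sketch of a hybrid routing strategy: send routine, well-scoped
# queries to a cheap small model and escalate only the hard ones.
# `call_small_model` and `call_large_model` are hypothetical placeholders.

ROUTINE_TASKS = {"fraud_check", "product_recommendation", "tag_ticket"}

def call_small_model(prompt: str) -> str:
    # e.g., a fine-tuned small language model served locally
    return f"[small-model answer to: {prompt!r}]"

def call_large_model(prompt: str) -> str:
    # e.g., a large general-purpose model behind a paid API
    return f"[large-model answer to: {prompt!r}]"

def route(task_type: str, prompt: str) -> str:
    """Pick the cheapest model that is likely to handle the request well."""
    needs_broad_knowledge = task_type not in ROUTINE_TASKS
    looks_complex = len(prompt.split()) > 200  # crude proxy for difficulty
    if needs_broad_knowledge or looks_complex:
        return call_large_model(prompt)
    return call_small_model(prompt)

if __name__ == "__main__":
    print(route("fraud_check", "Flag suspicious activity on account 1042."))
    print(route("open_ended_research", "Summarize the tradeoffs of MoE models."))
```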

Advances at inference time

Training ever-larger models is not the only way to boost performance. Increasingly, AI teams are exploring improvements at the inference phase, when a trained model is actually put to use. One such technique, chain-of-thought reasoning, lets a model break a task into smaller steps, which improves accuracy.
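The idea can be sketched in a few lines of code. The example below is illustrative only: `generate` stands in for whatever text-generation call a team actually uses, and combining a "think step by step" prompt with a majority vote over several sampled answers (often called self-consistency) is one common way to trade extra inference-time compute for accuracy.

```python
# Illustrative sketch of chain-of-thought prompting with a majority vote
# over several sampled reasoning chains ("self-consistency").
# `generate` is a hypothetical placeholder for a real model call.
from collections import Counter

def generate(prompt: str, temperature: float = 0.7) -> str:
    """Placeholder: plug in an actual language model or API client here."""
    raise NotImplementedError

def solve_with_reasoning(question: str, num_samples: int = 5) -> str:
    prompt = (
        f"{question}\n"
        "Let's think step by step, then give the final answer on the last "
        "line in the form: ANSWER: <answer>"
    )
    answers = []
    for _ in range(num_samples):  # more samples = more compute, usually better accuracy
        completion = generate(prompt)
        for line in reversed(completion.splitlines()):
            if line.strip().upper().startswith("ANSWER:"):
                answers.append(line.split(":", 1)[1].strip())
                break
    # Return the answer the model converged on most often.
    return Counter(answers).most_common(1)[0][0] if answers else ""
```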

OpenAI’s reasoning model o1 improves its performance by using more computing resources, and taking more time, as it answers users’ questions. Still, at $15 per million input tokens, o1 is twelve times more expensive than GPT-4o, leading many to question whether the cost is justified for most use cases.

Meanwhile, DeepSeek, founded in 2023, reached the number-one spot on Apple’s App Store a week after the release of its R1 model, which performs along the same lines as OpenAI’s o1. Presented with a complex challenge, DeepSeek takes time to consider alternative approaches before choosing the best solution, laying out its chain of reasoning for users.

Architectural innovations and cost efficiency

DeepSeek’s R1 model, mentioned above, doesn’t just match the quality of OpenAI’s o1; it is also dramatically cheaper. At roughly $5.6 million in training costs, versus the hundreds of millions of dollars spent at OpenAI, DeepSeek showed the world that a high-quality reasoning model does not have to be exorbitantly expensive. Part of that edge comes from multi-token prediction and advances in low-precision floating-point arithmetic.

In addition, DeepSeek’s V3 model, introduced a month before R1, uses a mixture-of-experts (MoE) architecture, in which the model is divided into separate sub-networks, or “experts.” Each expert specializes in a subset of the input data, allowing the model to process information more efficiently.
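As a rough illustration (a toy sketch, not DeepSeek’s actual implementation), a mixture-of-experts layer can be written as a small router that scores the experts, runs only the top-k of them for each input, and combines their outputs:

```python
# Toy mixture-of-experts layer: a router scores the experts and only the
# top-k are run for each input, so most expert weights stay inactive.
# Simplified sketch for illustration, not a production architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        self.router = nn.Linear(dim, num_experts)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, dim)
        scores = self.router(x)                           # (batch, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)              # mix only the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for expert_id in range(len(self.experts)):
                mask = indices[:, slot] == expert_id      # rows routed to this expert
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[expert_id](x[mask])
        return out

layer = ToyMoELayer(dim=16)
print(layer(torch.randn(4, 16)).shape)  # torch.Size([4, 16])
```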

Apple researchers recently offered more insight into DeepSeek’s “secret sauce,” pointing out that part of DeepSeek’s efficiency comes from sparsity. In a sparse model, many weights are inactive during any given computation, meaning only a small subset of experts is activated, which dramatically reduces memory and compute overhead without sacrificing performance.

Another key cost-cutting technique is distillation, in which a small model learns from a large one by asking it hundreds of thousands of questions and analyzing the answers. This lets companies reproduce much of a state-of-the-art model’s performance without the same massive training overhead. OpenAI has suggested that DeepSeek did exactly this.
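In practice, distillation often means training the small “student” model on the large “teacher” model’s answers, or on its output probabilities. The sketch below is a generic, simplified version of the latter, matching the student to the teacher’s softened distribution with a KL-divergence loss; the toy models and hyperparameters are placeholders, and this is not a description of how any particular company did it.

```python
# Simplified knowledge-distillation step: the student is trained to match the
# teacher's softened output distribution (KL divergence) rather than hard labels.
# Generic sketch with toy models, not any lab's actual recipe.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, dim, T = 1000, 64, 2.0                      # T = softmax temperature
teacher = nn.Sequential(nn.Embedding(vocab, dim), nn.Flatten(1), nn.Linear(dim * 8, vocab))
student = nn.Sequential(nn.Embedding(vocab, dim // 4), nn.Flatten(1), nn.Linear(dim // 4 * 8, vocab))
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)

def distill_step(batch_tokens: torch.Tensor) -> float:
    with torch.no_grad():                          # the teacher only provides targets
        teacher_probs = F.softmax(teacher(batch_tokens) / T, dim=-1)
    student_log_probs = F.log_softmax(student(batch_tokens) / T, dim=-1)
    loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * T * T
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

tokens = torch.randint(0, vocab, (32, 8))          # a toy batch of 8-token prompts
print(distill_step(tokens))
```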

Recently, NovaSky, a research lab at UC Berkeley, released a reasoning version of an open-source model from Alibaba, claiming performance on par with a recent OpenAI reasoning model for only about $450 in training costs. Soon after, researchers at Stanford and the University of Washington trained and open-sourced a model, s1, distilled from one of Google’s reasoning models, for less than $50.

Some are still scaling

Not everyone has backed away from brute-force scaling. As an AI expert noted in an AlphaSense transcript, the bet is more likely to pay off over the long term, as the cost structure can improve over time.

In January, OpenAI announced the Stargate Project, which intends to invest $500 billion over the next four years building new AI infrastructure for OpenAI in the United States. The same month, Meta revealed plans to spend up to $65 billion this year to expand its own AI infrastructure. Whether these massive investments will break through a potential plateau, or simply run into the diminishing returns of brute-force scaling, remains to be seen.

AI’s future

As the returns from brute-force scaling begin to fade, AI is poised to advance in fresh, meaningful ways. By focusing on architectural innovations, inference-phase improvements, and domain-specific models, organizations can chart new paths to high performance without astronomical costs. In many ways, this shift signals a new era in which smart engineering and optimization matter as much as raw computational horsepower, empowering smaller players to achieve breakthroughs once reserved for tech giants.

 
