How do AI large models and Web3 coexist? | All Creation Camp AI+Crypto Studio

Intermediate · Jan 31, 2024
This article explains how blockchain technology can address the current bottlenecks in training large AI models, including the balance between data volume and privacy, cost and computing power, and explores how AI and Web3 can interact.

As the fastest-growing technology in human history, large models have captured everyone’s attention, while Web3, yesterday’s hot topic, faces mounting legal challenges. Yet as fundamentally different technologies, neither substitutes for the other. Mr. Tian Hongfei, who leads the “AI+Crypto Studio” on the Island of All Things, discusses the problems encountered in the development of large models and how companies in the Web3 field are working to solve them.

Large Model Industry Problems and How to Incorporate Web3 to Solve Them

As we all know, the Internet industry entered an oligopoly stage after 2015, and countries around the world have conducted antitrust reviews of platform companies. The emergence of large models further strengthens the monopoly position of these oligopolies. Large models rest on three elements: algorithms, computing power, and data:

  • In the field of algorithms, although there is some degree of monopoly, algorithms can remain largely open thanks to the open-source movement, competition among research universities, and people’s distrust of oligopolies;
  • In terms of computing power, the extremely high cost of training large models means that only large enterprises can afford it, so the production of large models is effectively controlled by large enterprises;
  • In terms of data, the training of large models currently relies on public data, but as model parameters keep growing, public data will soon be depleted, and continued progress will depend on private data. Although the absolute amount of data held by many small businesses is huge, it is difficult to use in isolation, so large businesses still hold a de facto monopoly on data.

As a result, the era of large models is more centrally controlled than ever before, and the world of the future may well be run by a handful of computers, or even a single one. (Even in the decentralized Web3 world, the Endgame Vitalik proposed for Ethereum would be run by a giant block-producing machine.)

In addition, OpenAI, the company that developed ChatGPT, has only twenty-odd core staff. For various reasons, the ChatGPT algorithm has not been open-sourced to date, and the company’s original non-profit structure has been changed to a capped-profit one. As applications built on ChatGPT reshape human life, even small modifications to the ChatGPT model will have a large impact on people. Compared with Google’s “don’t be evil” principle, ChatGPT touches people’s lives even more deeply.

Therefore, the credibility of the model’s computation will become an important issue. Even though OpenAI can be operated as a non-profit, concentrating power in the hands of a few people can still have many adverse consequences. (By contrast, although block production in Vitalik’s Ethereum Endgame is handled by a machine, transparency is maintained because the public can easily verify the results.)

At the same time, the large model industry still faces problems: a shortage of computing power, the imminent exhaustion of available training data, and the lack of model sharing. Statistically, before 2021 the bottleneck in the artificial intelligence industry was a lack of data, and every deep learning company was hunting for data in vertical industries; since the arrival of large models, the shortage of computing power has become the obstacle.

Large model development can be divided into several stages: data collection, data preprocessing, model training, model fine-tuning, and deployment for query-time inference. Using these stages, let’s briefly describe what blockchain can contribute to large models and how it can counter the harm of their excessive concentration.

  • In terms of data, since public data is expected to be exhausted around 2030, more valuable and far larger private datasets will need to be utilized, with blockchain technology protecting privacy;
  • In terms of data annotation, tokens can incentivize larger-scale annotation and verification of data;
  • In the model training stage, computing power can be shared through model sharing and collaborative training;
  • During the model fine-tuning phase, community participation can be incentivized with tokens;
  • In the user query and inference phase, blockchain can protect the privacy of user data.

In particular:

1) Scarce computing power

Computing power is a necessary production factor for large models, and today it is the most expensive one, so much so that startups that have just raised funds must hand 80% of the money to NVIDIA to buy GPUs. Companies that build their own large models spend at least $50 million on their own data centers, while small startups have to purchase expensive cloud computing services.

However, the sudden popularity of large models and their enormous consumption of computing resources have far exceeded NVIDIA’s supply capacity. Statistically, the computing power demanded by large models doubles every few months: between 2012 and 2018 the demand for computing power grew 300,000-fold, and the cost of large model computation has grown roughly 31-fold per year.

Chinese Internet companies additionally face the US embargo on high-end GPUs. It is fair to say that the enormous training cost is the core reason large model technology is controlled by a few.

So how can blockchain solve the computing power problem of large models?

The production of large models breaks down into training, fine-tuning, and user-query inference. Although large models are notoriously expensive to train, a given version of a large model only needs to be trained once; most of the time, serving users requires only inference. AWS statistics confirm this: roughly 80% of computing power is actually consumed by inference.

Training large models requires high-speed communication between GPUs and cannot be done over a distributed network (unless you are willing to trade much longer training time for lower cost). Inference, however, can be done on a single GPU. Fine-tuning starts from an already trained large model and feeds it domain-specific data, so it requires far fewer computing resources than training from scratch.
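To make the contrast concrete, here is a minimal sketch of query inference on a single consumer GPU (or even a CPU), assuming the Hugging Face transformers library and the small open gpt2 checkpoint as a stand-in for a large model. It is only meant to show why inference is far easier to decentralize than training.

```python
# Minimal inference sketch: a single consumer GPU (or CPU) can serve queries,
# whereas training the same family of models needs clusters of tightly coupled GPUs.
# Assumes: pip install torch transformers, with the open "gpt2" checkpoint as a stand-in.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)

prompt = "Decentralized compute networks can serve model inference because"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# A single forward/generate pass is all a query needs: no gradient sync and no
# inter-GPU communication, which is what makes it feasible on idle consumer hardware.
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=40)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```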

When it comes to graphics rendering, consumer GPUs clearly outperform enterprise GPUs, and they sit idle most of the time. Ever since the University of California, Berkeley launched SETI@home in 1999 to search for extraterrestrial signals, and grid computing became popular around 2000, technical architectures have existed that pool idle computing resources to complete enormous computing tasks. Before blockchain, these collaborations usually focused on scientific tasks and relied on participants’ enthusiasm and sense of public service, which limited their reach. With blockchain technology, such collaboration can now be incentivized with tokens and applied far more widely.
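As a sketch of what token-incentivized sharing of idle GPUs could look like, the loop below claims jobs from a coordinator, runs them locally, and submits results for token rewards. The coordinator endpoint, task schema, and wallet address are purely hypothetical and do not describe the API of any real network.

```python
# Sketch of a token-incentivized worker loop for idle consumer GPUs.
# The coordinator URL, task schema, and reward flow are hypothetical illustrations,
# not the API of Akash, Gensyn, or any other real network.
import time
import requests

COORDINATOR = "https://coordinator.example/api"   # hypothetical endpoint
WORKER_ADDRESS = "0xYourWalletAddress"            # where token rewards would be paid

def run_task(task: dict) -> dict:
    """Placeholder for the actual GPU work (e.g. an inference or render job)."""
    return {"task_id": task["id"], "result": f"processed:{task['payload']}"}

while True:
    # 1. Claim a pending job advertised by the coordinator.
    task = requests.get(f"{COORDINATOR}/tasks/next", params={"worker": WORKER_ADDRESS}).json()
    if not task:
        time.sleep(30)          # nothing to do; the GPU stays idle until new work appears
        continue

    # 2. Do the work locally on the idle GPU.
    result = run_task(task)

    # 3. Submit the result; the coordinator verifies it and credits tokens on-chain.
    requests.post(f"{COORDINATOR}/tasks/submit", json={"worker": WORKER_ADDRESS, **result})
```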

The decentralized cloud computing project Akash, for example, has established a general-purpose computing network on which users can deploy machine learning models for inference and image rendering. Blockchain-based AI projects such as Bittensor, Modulus Lab, Giza, and ChainML likewise target query-time inference.

The blockchain AI computing protocol Gensyn and the open-source generative AI platform Together are determined to build decentralized computing networks that serve large model training.

Challenge: For decentralized computing networks, the difficulty lies not only in slow and unreliable communication links, the inability to synchronize computational state, and heterogeneous GPU environments, but also in economic incentive design, participant cheating, proof of work done, security, privacy protection, and resistance to spam attacks.
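One common idea for handling cheating is redundant execution: give the same job to several workers and accept the majority result. The toy sketch below illustrates that idea only; it is not how Gensyn or any specific protocol implements its verification.

```python
# Toy sketch of one anti-cheating approach for decentralized compute:
# dispatch the same job to several workers and accept the majority result.
# Illustration of the general idea only, not any specific protocol's design.
import hashlib
from collections import Counter

def result_digest(result: bytes) -> str:
    """Commit to a result by hashing it, so results can be compared cheaply."""
    return hashlib.sha256(result).hexdigest()

def settle(job_results: dict, redundancy: int = 3):
    """job_results maps worker address -> raw result bytes for the same job."""
    assert len(job_results) >= redundancy, "not enough redundant executions"

    digests = {worker: result_digest(res) for worker, res in job_results.items()}
    majority_digest, votes = Counter(digests.values()).most_common(1)[0]

    honest = [w for w, d in digests.items() if d == majority_digest]
    cheaters = [w for w, d in digests.items() if d != majority_digest]
    # Honest workers would be paid in tokens; disagreeing workers could be slashed.
    # (Real systems must also handle nondeterministic floating-point outputs;
    #  exact-hash matching is a simplification.)
    return {"accepted": majority_digest, "reward": honest, "slash": cheaters, "votes": votes}

# Example: three workers ran the same inference job; one returned a bogus answer.
print(settle({
    "0xWorkerA": b"output-tensor-bytes",
    "0xWorkerB": b"output-tensor-bytes",
    "0xWorkerC": b"garbage",
}))
```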

2) Scarce data and data correction

Reinforcement Learning from Human Feedback (RLHF), a core technique behind large models, requires human participation in fine-tuning to correct errors and eliminate bias and harmful information. OpenAI used RLHF to fine-tune GPT-3 into ChatGPT; in the process it recruited experts through Facebook groups and paid Kenyan workers $2 per hour. Optimization training often needs human experts for data from specialized fields, and its implementation pairs naturally with token incentives for community participation.
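At the heart of RLHF is a reward model trained on human preference rankings. The sketch below shows the standard pairwise preference loss in simplified form, with random tensors standing in for real response embeddings; it illustrates the technique, not OpenAI’s implementation.

```python
# Minimal sketch of the reward-model step in RLHF: human labelers rank pairs of
# answers, and a reward model is trained so the preferred answer scores higher.
# Simplified illustration of the standard pairwise loss, not production code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyRewardModel(nn.Module):
    """Stand-in scorer: maps a pooled response embedding to a scalar reward."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, response_embedding: torch.Tensor) -> torch.Tensor:
        return self.score(response_embedding).squeeze(-1)

reward_model = TinyRewardModel()
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-4)

# Fake batch of human-labeled pairs: embeddings of the chosen vs. rejected answers.
chosen = torch.randn(8, 128)
rejected = torch.randn(8, 128)

# Pairwise preference loss: push r(chosen) above r(rejected).
loss = -F.logsigmoid(reward_model(chosen) - reward_model(rejected)).mean()
loss.backward()
optimizer.step()
# The trained reward model then guides policy optimization (e.g. PPO) so the
# language model's outputs align with human feedback.
```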

The Decentralized Physical Infrastructure Networks (DePIN) sector uses tokens to encourage people to share real, real-time sensor data from the physical world for model training: React collects energy usage data, DIMO collects vehicle driving data, WeatherXM collects weather data, and Hivemapper collects map data, using token incentives to get people to label traffic signs and help its RLHF machine learning algorithm improve accuracy.

At the same time, as the parameters of large models keep increasing, existing public data will be exhausted by 2030, and further progress will have to rely on private data. The volume of private data is roughly ten times that of public data, but it is scattered among enterprises and individuals and is private and confidential by nature, making it hard to exploit. A double dilemma arises: large models need the data, while data owners need the large models but are unwilling to hand their data over. Blockchain technology can also help resolve this double bind.

For open-source inference models, which require relatively little computing power, the model can be downloaded to the data side and run there; for closed or very large models, the data must be desensitized before being uploaded to the model side. Desensitization methods include synthetic data and zero-knowledge proofs.
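As a toy illustration of desensitization on the data owner’s side, the sketch below pseudonymizes identifiers and coarsens sensitive fields before anything is uploaded. Real pipelines would go further, for example with synthetic data generation or zero-knowledge proofs; the salt and record fields here are invented for the example.

```python
# Toy sketch of desensitizing a record before it is uploaded to a model provider.
# Real pipelines would go further (synthetic data, zero-knowledge proofs); this only
# shows the basic idea of keeping raw identifiers on the data owner's side.
import hashlib

SALT = b"data-owner-local-secret"   # never leaves the data owner's machine

def pseudonymize(value: str) -> str:
    """Replace an identifier with a salted hash: records stay linkable but not identifying."""
    return hashlib.sha256(SALT + value.encode()).hexdigest()[:16]

def desensitize(record: dict) -> dict:
    return {
        "user": pseudonymize(record["email"]),          # direct identifier removed
        "age_band": f"{(record['age'] // 10) * 10}s",   # coarsened quasi-identifier
        "text": record["text"],                         # payload the model actually needs
    }

raw = {"email": "alice@example.com", "age": 34, "text": "symptoms: mild fever, cough"}
print(desensitize(raw))   # the raw email address never appears in the uploaded record
```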

Whether the model is downloaded to the data side or the data is uploaded to the model side, the question of authorization must be solved to prevent cheating by either the model or the data provider.

Challenge: Although Web3 token incentives can help here, the problem of cheating still has to be solved.

3) Model collaboration

Civitai, the world’s largest community for sharing AI image-generation models, lets people share models and easily copy and modify an existing model to produce one that meets their own requirements.

Bittensor, an open-source AI newcomer built as a dual-consensus blockchain project, has designed a token-incentivized network of decentralized models. Based on a mixture-of-experts collaboration mechanism, it jointly produces problem-solving models and supports knowledge distillation, allowing models to share information and accelerate training, which gives numerous startups the opportunity to participate in large models.
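Knowledge distillation itself is a standard technique: a student model learns from a teacher model’s softened output distribution rather than only from hard labels. The generic sketch below shows the usual distillation loss with random logits as stand-ins; it illustrates the idea rather than Bittensor’s actual mechanism.

```python
# Generic sketch of knowledge distillation: a small "student" model learns from a
# larger "teacher" model's soft output distribution instead of only hard labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    """KL divergence between softened teacher and student distributions."""
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 to keep gradient magnitudes comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * temperature**2

# Fake logits for a batch of 4 examples over a 10-class vocabulary.
teacher_logits = torch.randn(4, 10)
student_logits = torch.randn(4, 10, requires_grad=True)

loss = distillation_loss(student_logits, teacher_logits)
loss.backward()   # gradients flow into the student only; the teacher stays frozen
print(float(loss))
```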

As a unified network for off-chain services such as automation, oracles, and shared AI, Autonolas has designed a collaboration framework in which agents reach consensus through Tendermint.

Challenge: Training many models still requires a great deal of communication, and the reliability and time efficiency of distributed training remain huge obstacles.

Large models and innovation in Web3

The discussion above covered how Web3 can be used to solve some of the problems in the large model industry. The combination of these two important forces will also give rise to some innovative applications.

1) Use ChatGPT to write smart contracts

Recently, an NFT artist with no programming knowledge used ChatGPT prompts to release his own smart contract and issue the token Turboner. The artist documented his week-long creation process on YouTube, inspiring others to use ChatGPT to take part in smart contract creation.

2) Crypto payments empower intelligent agents

The development of large models has greatly improved the intelligence of smart assistants, and combined with crypto payments, assistants will be able to coordinate more resources and collaborate on more tasks in an assistant marketplace. AutoGPT has shown that, relying on a user-supplied credit card, it can automate the purchase of cloud computing resources and the booking of flights, but its capabilities are severely limited by automated login and other security authentication. Multi-Agent System (MAS) designs such as the Contract Net Protocol envision multiple intelligent assistants collaborating in an open marketplace; with token support, such collaboration can move beyond limited trust-based cooperation to larger-scale, market-based cooperation, just as human society moved from a primitive economy to a monetary one.
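The Contract Net Protocol boils down to announce, bid, award, and settle. The toy simulation below walks through those steps with invented agents and token prices to show how an open marketplace of assistants could allocate tasks; it is not the code of AutoGPT or any existing MAS framework.

```python
# Toy simulation of Contract Net Protocol-style collaboration: a manager announces
# a task, worker agents bid, the cheapest capable bidder wins, and settlement would
# happen in tokens. Agent names and prices are purely illustrative.
from dataclasses import dataclass
from typing import List, Optional, Set

@dataclass
class Bid:
    agent: str
    price_tokens: float

@dataclass
class WorkerAgent:
    name: str
    skills: Set[str]
    rate: float  # tokens this agent charges per task

    def bid(self, task: str) -> Optional[Bid]:
        # Agents only bid on tasks they can actually perform.
        return Bid(self.name, self.rate) if task in self.skills else None

def contract_net(task: str, workers: List[WorkerAgent]) -> Optional[Bid]:
    """Announce the task, collect bids, and award it to the cheapest bidder."""
    bids = [b for w in workers if (b := w.bid(task)) is not None]
    if not bids:
        return None
    winner = min(bids, key=lambda b: b.price_tokens)
    print(f"Task '{task}' awarded to {winner.agent} for {winner.price_tokens} tokens")
    return winner  # an on-chain token payment to the winner would follow here

workers = [
    WorkerAgent("flight-booker", {"book_flight"}, rate=3.0),
    WorkerAgent("cloud-buyer", {"buy_compute"}, rate=5.0),
    WorkerAgent("generalist", {"book_flight", "buy_compute"}, rate=4.0),
]
contract_net("book_flight", workers)  # awarded to flight-booker for 3.0 tokens
```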

3) zkML (Zero-Knowledge Machine Learning)

Applications of zkp (zero-knowledge proof) technology in blockchain fall into two categories: one improves blockchain performance by moving computation off-chain and posting a zkp attestation on-chain; the other protects transaction privacy. Applied to large models, zkp enables trustworthy model computation (proving the consistency and authenticity of a model’s calculations) and privacy-preserving computation on training data. In a decentralized environment, the model provider needs to prove to customers that the model being served is the one that was promised, with no corners cut; data partners need to participate in training or use the model while keeping their own data private. Although zkp offers some possibilities, many challenges remain, and alternatives such as homomorphic encryption and federated privacy computing are still immature.
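To see what “proving the served model is the promised model” means, the conceptual sketch below uses a plain hash commitment: the provider publishes a digest of the weights, and the customer checks the served weights against it. A real zkML system would instead prove the inference itself in zero knowledge, without revealing the weights; this example only illustrates the goal.

```python
# Toy illustration of the "did you serve the promised model?" problem that zkML targets.
# Here the provider commits to the model weights with a hash; a real zkML system would
# produce a zero-knowledge proof that the inference was run by the committed model,
# without revealing the weights at all. Purely conceptual, not a zk library demo.
import hashlib
import json

def commit_to_model(weights: dict) -> str:
    """Publish this digest (e.g. on-chain) when the model is sold to customers."""
    canonical = json.dumps(weights, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

def verify_served_model(served_weights: dict, published_commitment: str) -> bool:
    """Customer-side check that the served model matches the advertised commitment."""
    return commit_to_model(served_weights) == published_commitment

promised = {"layer1": [0.12, -0.5], "layer2": [0.03]}
commitment = commit_to_model(promised)            # published at sale time

cheaper_substitute = {"layer1": [0.0, 0.0], "layer2": [0.0]}
print(verify_served_model(promised, commitment))            # True  -> model as promised
print(verify_served_model(cheaper_substitute, commitment))  # False -> corners were cut
```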

A solution based on the BEC (Blockchain Edge Client) architecture

In addition to the approaches above, there is another school of thought that has not received widespread attention, because it offers no token incentives and uses blockchain only minimally.

The BEC-based architecture has much in common with the Web5 concept proposed by Jack Dorsey and the Solid project of Tim Berners-Lee.

All of them hold that:

  • Each person has a corresponding edge node under their own control;
  • Computing and storage in most application scenarios should be handled at edge nodes;
  • The collaboration between individual nodes is completed through the blockchain;
  • Communication between nodes is completed through P2P;
  • Individuals can fully control their own nodes, or entrust node management to trusted parties (called relay servers in some scenarios);
  • The greatest possible degree of decentralization is achieved.

When this node, which corresponds to and is controlled by an individual, stores that person’s data and loads a large model, a completely personalized, fully privacy-protected personal intelligent agent can be trained. Dr. Gong Ting, SIG’s Chinese founding partner, romantically compared the future personal node to the personal cloud in “Frozen” that hovers above Olaf’s head and follows him everywhere.
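A rough sketch of such a personal node might look like the following: personal data stays in local storage, a locally loaded model (faked here by simple retrieval) answers on the owner’s behalf, and only messages travel peer to peer. The class and method names are hypothetical, not part of any existing BEC framework.

```python
# Conceptual sketch of a BEC-style personal edge node: personal data never leaves
# the node, a locally loaded model answers on the owner's behalf, and only messages
# travel peer-to-peer. Hypothetical names, not a real framework.
from dataclasses import dataclass, field

@dataclass
class PersonalEdgeNode:
    owner: str
    private_store: list = field(default_factory=list)   # emails, notes, chats stay local

    def ingest(self, document: str) -> None:
        """Add personal data to local storage only; nothing is uploaded anywhere."""
        self.private_store.append(document)

    def answer(self, question: str) -> str:
        # A locally loaded (small or fine-tuned) model would run here; we fake it by
        # retrieving the most relevant local document as "context".
        words = question.lower().split()
        context = next(
            (d for d in self.private_store if any(w.strip("?.,") in d.lower() for w in words)),
            "no local context",
        )
        return f"[{self.owner}'s agent] based on local data: {context}"

    def handle_peer_message(self, sender: str, message: str) -> str:
        """P2P entry point: other nodes send questions, and only the answer leaves the node."""
        return self.answer(message)

node = PersonalEdgeNode(owner="alice")
node.ingest("Flight to Lisbon on March 3, seat 14A")
print(node.handle_peer_message("bob-node", "When is your flight?"))
```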

In this way, your avatar in the metaverse will no longer be an image controlled by a keyboard but an agent with a soul. It can read online news, process emails, and even automatically reply to your social chat messages on your behalf, 24 hours a day. (Nagging girlfriends beware: in the future you may need a way to detect whether your boyfriend is using an agent to deal with you.) When your agent needs new skills, you simply install a new app on your node, just as you would on a mobile phone.

Summary

Historically, as the Internet has become ever more platformized, the time it takes for unicorn companies to emerge has grown shorter and shorter, yet the environment has become increasingly unfavorable to startups.

Riding on the efficient content distribution platforms of Google and Facebook, YouTube, founded in 2005, was acquired by Google for about US$1.65 billion just one year later.

Riding on the efficient application distribution platform of Apple’s App Store, Instagram, founded in 2010 by a team of just over ten people, was acquired by Facebook for US$1 billion in 2012.

In the era of large models such as ChatGPT, Midjourney, with only 11 people, earns US$100 million a year, and OpenAI, with no more than 100 people, is valued at more than US$20 billion.

Internet platform companies are becoming more and more powerful, and the emergence of large models has not changed the pattern of the Internet being monopolized by large enterprises. The three elements of large models (algorithms, data, and computing power) are still monopolized by large enterprises. Start-ups have neither the capability to innovate on large models nor the financial strength to train them; they can only focus on applying large models in vertical fields. Although large models appear to democratize knowledge, the real power rests with the fewer than one hundred people in the world who are able to produce these models.

If large models one day permeate every aspect of people’s lives, and you ask ChatGPT about your daily diet, your health, your work emails, and your lawyer’s letters, then in theory those who control the models need only quietly change a few parameters to profoundly affect the lives of countless people. The unemployment that large models cause might be addressed through UBI or Worldcoin, but the consequences of a few people controlling models capable of doing harm are far more serious. This was OpenAI’s founding concern: a non-profit structure addresses the profit motive, but how does it address the power motive? Clearly, large models are trained quickly on knowledge that humanity has accumulated over decades and shared freely on the Internet, yet the resulting models are controlled by a very small number of people.

Therefore, there is a deep conflict of values between large models and blockchain. Blockchain practitioners need to take part in large model entrepreneurship and use blockchain technology to solve the problems of large models. If the huge amount of data freely available on the Internet is the common knowledge of mankind, then the large models generated from that data should belong to all of mankind. Just as OpenAI has recently begun paying for literature databases, it should also pay for the personal blogs that you and I contribute.

Disclaimer:

  1. This article is reprinted from [ThreeDAO, Island of All Things]. All copyrights belong to the original author [36C]. If there are objections to this reprint, please contact the Gate Learn team, and they will handle it promptly.
  2. Liability Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
  3. Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.