Reshaping the Boundaries of Computing: The Current Situation and Prospects of Decentralized Computing Power

Intermediate · Jan 04, 2024
As AI and adjacent fields develop, the underlying logic of many industries will change dramatically, computing power will rise to a more important position, and everything related to it will prompt broad exploration across the industry. Decentralized computing power networks have advantages of their own: they can reduce the risk of centralization and can also serve as a complement to centralized computing power.

Computing power in demand

When “Avatar” was released in 2009, it opened the first battle of 3D cinema with unparalleled, lifelike images. Weta Digital, the major contributor behind it, handled the visual effects rendering for the entire film. In its 10,000-square-foot server farm in New Zealand, its computer cluster processed up to 1.4 million tasks per day and 8 GB of data per second; even so, it ran for more than a month before all rendering work was completed.

With large-scale machine deployment and heavy cost investment, “Avatar” earned a remarkable place in film history.

On January 3 of the same year, Satoshi Nakamoto mined Bitcoin’s genesis block on a small server in Helsinki, Finland, receiving a block reward of 50 BTC. From the first day of cryptocurrency, computing power has played a critical role in the industry.

The longest chain not only serves as proof of the sequence of events witnessed, but proof that it came from the largest pool of CPU power.

—— Bitcoin Whitepaper

Under the PoW consensus mechanism, contributed computing power guarantees the security of the chain. A continuously rising hashrate also reflects miners’ ongoing investment in computing power and their positive income expectations. The industry’s real demand for computing power has in turn driven the development of chip manufacturers: mining chips have evolved through CPU, GPU, FPGA, and ASIC stages. Today, Bitcoin mining machines are usually built on ASIC (Application-Specific Integrated Circuit) chips that efficiently execute a specific algorithm, such as SHA-256. The enormous economic returns of Bitcoin have driven up demand for mining computing power, but overly specialized equipment and cluster effects have produced a siphon effect among participants: both miners and mining-machine manufacturers now show a trend of capital-intensive, concentrated development.
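
As a concrete illustration of the kind of task an SHA-256 ASIC executes, here is a minimal proof-of-work sketch in Python. It is a simplification for clarity, not real mining code: actual Bitcoin mining double-hashes an 80-byte block header against a network-set target, and the difficulty and header below are invented.

```python
import hashlib

def mine(header: bytes, difficulty_bits: int, max_nonce: int = 10_000_000):
    """Search for a nonce whose double-SHA-256 digest falls below the target."""
    target = 2 ** (256 - difficulty_bits)  # smaller target = harder puzzle
    for nonce in range(max_nonce):
        payload = header + nonce.to_bytes(8, "little")
        digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    return None, None  # no valid nonce found within the search budget

nonce, digest = mine(b"example block header", difficulty_bits=20)
print(nonce, digest)
```

Each extra difficulty bit doubles the expected number of hashes, which is why purpose-built ASICs displaced CPUs and GPUs for this one workload.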

With the advent of Ethereum’s smart contracts, programmability, composability, and other features enabled a wide range of applications, especially in DeFi, which drove the price of ETH steadily upward. While Ethereum was still under PoW consensus, its mining difficulty kept rising, and miners’ computing power requirements for Ethereum mining machines grew day by day. Unlike Bitcoin with its ASIC chips, however, Ethereum mining required graphics processing units (GPUs), such as Nvidia’s RTX series, making it better suited to participation with general-purpose computing hardware. This even triggered market competition for GPUs that left high-end graphics cards out of stock.

Then, on November 30, 2022, ChatGPT, developed by OpenAI, demonstrated epoch-making significance in the AI field. Users marveled at the new experience: based on context, ChatGPT could complete all kinds of user-proposed tasks just like a real person. In the new version launched in September 2023, generative AI gained multi-modal features such as voice and images, taking the user experience to a new stage.

Correspondingly, GPT-4 involves more than one trillion parameters across model pre-training and subsequent fine-tuning, the two parts of the AI field with the greatest demand for computing power. In the pre-training phase, the model studies a large amount of text to master language patterns, grammar, and context, enabling it to generate coherent, contextually appropriate text from input. After pre-training, GPT-4 is fine-tuned to better adapt to specific types of content or styles, improving performance and specialization for particular demand scenarios.

Because the Transformer architecture adopted by GPT introduces the self-attention mechanism, which lets the model attend to relationships between different parts of an input sequence simultaneously, the demand for computing power increases sharply. Processing long sequences in particular requires massive parallel computation and the storage of a large number of attention scores, which in turn demands large memory and high-speed data transmission. Mainstream LLMs sharing this architecture have a huge demand for high-performance GPUs, which shows how enormous the investment cost in large AI models is. According to SemiAnalysis estimates, training a GPT-4 model costs as much as $63 million. To deliver a good interactive experience, GPT-4 must also invest large amounts of computing power in its daily operation.
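
To see why attention scales so steeply, here is a minimal scaled dot-product self-attention sketch in Python with NumPy. It is a single head with no learned projections, a simplification of the mechanism described above rather than any production model's implementation:

```python
import numpy as np

def self_attention(x: np.ndarray) -> np.ndarray:
    """Scaled dot-product self-attention over a (seq_len, d_model) input."""
    d_model = x.shape[-1]
    q, k, v = x, x, x  # real models apply learned Q/K/V projections here
    scores = q @ k.T / np.sqrt(d_model)  # (seq_len, seq_len) score matrix:
                                         # memory and compute grow
                                         # quadratically with sequence length
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v

out = self_attention(np.random.randn(128, 64))
print(out.shape)  # (128, 64)
```

The (seq_len, seq_len) score matrix is the source of the memory pressure the paragraph describes: doubling the sequence length quadruples the attention scores to compute and store.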

Computing hardware classification

Here we need to understand the main types of computing hardware available today, and which computing demand scenarios the CPU, GPU, FPGA, and ASIC each handle.

• Comparing the architectural diagrams of the CPU and GPU, the GPU contains far more cores, which lets it process many computing tasks at once; its stronger parallel processing capability suits it to large volumes of computing tasks, so it has been widely used in machine learning and deep learning. The CPU has fewer cores and is better at intensively handling a single complex computation or a sequential task, but it is less efficient than the GPU for parallel workloads. Rendering tasks and neural network computation both involve large amounts of repeated, parallel calculation, so for these the GPU is more efficient and more suitable than the CPU.

• An FPGA (Field Programmable Gate Array) is a semi-custom circuit in the application-specific integrated circuit (ASIC) domain. An array composed of a large number of small processing units, an FPGA can be understood as a programmable integrated chip of digital logic circuits. Current applications focus mainly on hardware acceleration, with other tasks still completed on the CPU, so the FPGA and CPU work together.

• An ASIC (Application-Specific Integrated Circuit) is an integrated circuit designed to meet specific user requirements and the needs of a specific electronic system. Compared with general-purpose integrated circuits, ASICs offer smaller size, lower power consumption, improved reliability and performance, enhanced confidentiality, and reduced cost in mass production. So for the fixed scenario of Bitcoin mining, which only needs to perform one specific computing task, the ASIC is the best fit. Google has also launched the TPU (Tensor Processing Unit), a type of ASIC designed specifically for machine learning, though at present it mainly provides computing power rental through Google Cloud.

• Comparing ASICs with FPGAs: an ASIC is an application-specific integrated circuit whose design is fixed once completed, while an FPGA integrates large numbers of basic digital circuit gates and memory into an array that developers can define by programming the configuration, and this programming is repeatable. Given the current pace of change in the AI field, however, custom or semi-custom chips cannot be adjusted and reconfigured in time to perform different tasks or adapt to new algorithms. The general adaptability and flexibility of the GPU therefore make it shine in AI. Major GPU manufacturers have also optimized for AI adoption: Nvidia, for example, has launched the Tesla series and Ampere-architecture GPUs designed specifically for deep learning. This hardware contains units optimized for machine learning and deep learning computation (Tensor Cores), which let the GPU perform the forward and backward propagation of neural networks more efficiently and at lower energy cost. In addition, a wide range of tools and libraries support AI development, such as CUDA (Compute Unified Device Architecture), which helps developers use GPUs for general-purpose parallel computing.
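
As a small illustration of how developers tap Tensor Cores in practice, here is a hedged PyTorch sketch using automatic mixed precision. The model and shapes are placeholders; on supported Nvidia GPUs, `torch.autocast` dispatches eligible matrix multiplies to Tensor Cores as fp16 operations:

```python
import torch

model = torch.nn.Linear(1024, 1024).cuda()   # placeholder model, requires a CUDA GPU
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()          # rescales gradients for fp16 safety

x = torch.randn(64, 1024, device="cuda")
target = torch.randn(64, 1024, device="cuda")

with torch.autocast(device_type="cuda", dtype=torch.float16):
    # matmuls inside this context run in fp16 and can use Tensor Cores
    loss = torch.nn.functional.mse_loss(model(x), target)

scaler.scale(loss).backward()  # backward pass with loss scaling
scaler.step(optimizer)
scaler.update()
```

The design choice here, mixing fp16 compute with fp32 master weights, is what lets Tensor Cores deliver the higher throughput and lower energy consumption the paragraph mentions without destabilizing training.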

Decentralized computing power

Decentralized computing power refers to providing processing power through distributed computing resources. This decentralized approach is usually combined with blockchain technology or similar distributed ledger technology to pool idle computing resources and distribute them to users in need, achieving resource sharing, trading, and management.

Background

• Strong demand for computing hardware. The prosperity of the creator economy has brought digital media processing into an era of universal creation. Demand for visual-effects rendering has surged, and specialized rendering outsourcing studios, cloud rendering platforms, and other forms have emerged. However, these approaches still require heavy upfront spending on computing hardware.

• Computing hardware comes from a single source. The development of AI has intensified demand for computing hardware. The world’s leading GPU makers, led by Nvidia, have profited enormously from this AI computing power race; their supply capacity has even become a key factor constraining the development of entire industries. Nvidia’s market value exceeded one trillion US dollars for the first time in 2023.

• Computing power provision still relies mainly on centralized cloud platforms. The real beneficiaries of the surge in demand for high-performance computing are the centralized cloud vendors represented by AWS, which have launched GPU cloud computing services. Taking the current AWS p4d.24xlarge as an example, renting one such HPC server specialized for ML, containing eight Nvidia A100 40GB GPUs, costs US$32.8 per hour, with an estimated gross margin of 61%. This has prompted other cloud giants to rush in and hoard hardware to gain as much advantage as possible in the industry’s early stage.

• Political and human intervention lead to uneven development of the industry. It is not hard to see that GPU ownership and concentration tilt toward organizations and countries with abundant capital and technology that depend on high-performance computing clusters. This has led the chip and semiconductor manufacturing powers, represented by the United States, to impose stricter export restrictions on AI chips in order to weaken other countries’ research capabilities in general artificial intelligence.

• The allocation of computing resources is too concentrated. The initiative in AI development rests with a few giant companies. Currently the giants represented by OpenAI enjoy Microsoft’s backing, with the rich computing resources of Microsoft Azure behind them. Each new OpenAI product release reshapes and consolidates the current AI industry, making it hard for other teams to keep up in the field of large models.

So, facing high hardware costs, geographical restrictions, and uneven industrial development, are there other solutions?

Decentralized computing power platforms have emerged as the times demand. Their purpose is to create an open, transparent, self-regulating market that uses global computing resources more effectively.

Adaptability analysis

  1. Decentralized computing power supply side

The current high hardware prices and artificial controls on the supply side have provided fertile soil for building decentralized computing power networks.

• From the perspective of composition, computing power providers range from personal PCs and small IoT devices up to data centers and IDCs. The accumulated computing power can deliver more flexible and scalable computing solutions, helping more AI developers and organizations make better use of limited resources. Decentralized computing power sharing can be achieved through the idle computing power of individuals or organizations, though the availability and stability of that power is limited by the owners’ own usage or by sharing caps.

• A potential source of high-quality computing power is mining farms converted after Ethereum’s transition to PoS. Take Coreweave, the leading integrated GPU computing power provider in the United States: formerly the largest Ethereum mining farm in North America, it builds on complete infrastructure it had already established. Retired Ethereum mining machines also contain large numbers of idle GPUs; reportedly, about 27 million GPUs were working online at the peak of the Ethereum mining era. Revitalizing these GPUs could make them an important source of computing power for decentralized networks.

  2. Decentralized computing power demand side

• From a technical implementation standpoint, decentralized computing resources are used for graphics rendering and video transcoding: computationally heavy but relatively low-complexity tasks. Combining blockchain technology with the web3 economic system can bring tangible income incentives to network participants and accumulate effective business models and customer bases while ensuring the secure transmission of information and data. The AI field, by contrast, involves massive parallel computing plus communication and synchronization between nodes, with very demanding requirements on the network environment, so current applications focus on fine-tuning, inference, AIGC, and other more application-layer work.

• From a business logic perspective, a market that simply buys and sells computing power lacks imagination: the industry can only compete on supply chains and pricing strategies, which are precisely the strengths of centralized cloud services. The market ceiling is therefore low, with little room for imagination, which is why networks originally doing simple graphics rendering are seeking AI transformation. For example, in 2023 Q1 Render Network launched a natively integrated Stability AI toolset with which users can introduce Stable Diffusion operations, expanding the business beyond rendering into the AI field.

• From the perspective of main customer groups, large B-side customers clearly prefer centralized, integrated cloud services. They usually have ample budgets, typically develop large underlying models, and need more efficient forms of computing power aggregation. Decentralized computing power therefore serves more small and medium development teams or individuals, mostly engaged in model fine-tuning or application-layer development, who have no strict requirements on the form of computing power provided and are more price-sensitive. Decentralized computing power can fundamentally reduce the initial cost investment, so overall usage costs are lower too. Based on the cost Gensyn calculated earlier, converting computing power into the equivalent value provided by a V100, Gensyn’s price is only US$0.40 per hour, 80% lower than the US$2 per hour for equivalent computing power on AWS (see the sketch after this list). Although this line of business does not account for the majority of spending in the current industry, as AI application scenarios keep expanding, the future market size cannot be underestimated.

• From the perspective of the services provided, current projects resemble decentralized cloud platforms, offering a complete set of management from development and deployment through launch, distribution, and transactions. The advantage is attracting developers, who can use the relevant tooling to simplify development and deployment and improve efficiency, while also attracting users to the complete application products on the platform, forming an ecological moat built on the project’s own computing power network. But this also places higher demands on project operations: how to attract excellent developers and users, and retain them, becomes especially important.
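
A quick back-of-the-envelope check of the price gap cited above. The hourly rates are the figures quoted in this article, not live prices, and the 30-day job is an invented example:

```python
# Rates quoted above: Gensyn ~$0.40/h vs. AWS ~$2.00/h for V100-equivalent compute
gensyn_hourly, aws_hourly = 0.40, 2.00

savings = (aws_hourly - gensyn_hourly) / aws_hourly
print(f"{savings:.0%}")  # 80% -- matches the figure in the text

# Hypothetical 30-day fine-tuning job on one V100-equivalent:
hours = 30 * 24
print(aws_hourly * hours - gensyn_hourly * hours)  # 1152.0 dollars saved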

Applications in different fields

1. Digital Media Processing

Render Network is a blockchain-based global rendering platform whose goal is to help creators with digital creativity. It lets creators extend GPU rendering work on demand to GPU nodes around the world, providing faster and cheaper rendering capability. After the creator confirms the rendering results, the blockchain network sends token rewards to the nodes. Compared with traditional visual-effects approaches, which require high upfront investment to establish local rendering infrastructure or add corresponding GPU spending to purchased cloud services, this lowers the barrier considerably.

Since its founding in 2017, Render Network users have rendered more than 16 million frames and nearly 500,000 scenes on the network. Data released for Render Network’s 2023 Q2 also shows that both the number of rendering-frame jobs and the number of active nodes are growing. In addition, in 2023 Q1 Render Network launched a natively integrated Stability AI toolset with which users can introduce Stable Diffusion operations, expanding the business beyond rendering into the AI field.

Livepeer provides real-time video transcoding services to creators, with network participants contributing their own GPU computing power and bandwidth. Broadcasters send videos to Livepeer to complete transcoding into various formats and distribute them to end-side users, realizing the dissemination of video content. They can conveniently pay in fiat currency for services such as video transcoding, transmission, and storage.

In the Livepeer network, anyone may contribute personal computer resources (CPU, GPU, and bandwidth) to transcode and distribute video and earn fees. The native token (LPT) represents participants’ stake in the network: the amount of staked tokens determines a node’s weight, which affects its chances of winning transcoding tasks. LPT also helps guide nodes to complete assigned tasks safely, reliably, and quickly.
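
A minimal sketch of stake-weighted task assignment of the kind described above. Node names and stake amounts are made up, and Livepeer's actual orchestrator selection also weighs factors such as price and performance:

```python
import random

# Hypothetical nodes with staked LPT; higher stake => higher selection weight
stakes = {"node_a": 5_000, "node_b": 1_500, "node_c": 500}

def pick_transcoder(stakes: dict) -> str:
    """Select a node with probability proportional to its stake."""
    nodes, weights = zip(*stakes.items())
    return random.choices(nodes, weights=weights, k=1)[0]

# node_a should win roughly 5000/7000 ~ 71% of assignments
tally = {n: 0 for n in stakes}
for _ in range(10_000):
    tally[pick_transcoder(stakes)] += 1
print(tally)
```

Tying selection probability to stake is what gives the token its coordinating role: a node that misbehaves risks the stake that earns it work.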

2. AI field expansion

In the current ecosystem of the AI field, the main players can be roughly divided into several categories by role.

Starting from the demand side, demands for computing power differ markedly at different stages of the industry. Take underlying model development: the pre-training process has very high requirements for parallel computing, storage, communication, and so on to guarantee effective training results, which calls for a large computing power cluster. At present, the main supply of computing power relies on self-built machine rooms and centralized cloud service platforms. In the subsequent stages of model fine-tuning, real-time inference, and application development, the requirements for parallel computing and inter-node communication are not as high, and this is exactly where decentralized computing power can show its strengths.

Looking at projects that previously gained considerable traction, Akash Network has made some attempts in the direction of decentralized computing power:

Akash Network combines different technology components so that users can efficiently and flexibly deploy and manage applications in a decentralized cloud environment. Users can package applications with Docker container technology, then deploy and scale them via Kubernetes, through CloudMOS, on the cloud resources Akash provides. Akash uses a “reverse auction” model, which pushes prices below those of traditional cloud services.
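
A minimal sketch of how a reverse auction drives prices down. Bidder names and prices are invented, and Akash's actual marketplace logic is richer, factoring in provider attributes beyond price:

```python
from dataclasses import dataclass

@dataclass
class Bid:
    provider: str
    usd_per_hour: float

def reverse_auction(bids: list) -> Bid:
    """Tenant posts a deployment; providers bid to supply it, lowest price wins."""
    return min(bids, key=lambda b: b.usd_per_hour)

bids = [Bid("provider_a", 1.20), Bid("provider_b", 0.85), Bid("provider_c", 0.95)]
winner = reverse_auction(bids)
print(f"{winner.provider} wins the lease at ${winner.usd_per_hour}/h")
```

Because providers compete to win each deployment rather than posting fixed list prices, idle capacity tends to get priced close to its marginal cost.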

In August 2023, Akash Network also announced the sixth upgrade of its mainnet, adding GPU support to its cloud services so it can provide computing power to more AI teams in the future.

Gensyn.ai, a project that attracted much industry attention in 2023, completed a $43 million Series A financing led by a16z. Judging from the documents released so far, the project is an L1 PoS-protocol mainnet based on the Polkadot network, focused on deep learning, aiming to push the boundaries of machine learning by creating a global supercomputing cluster network. This network connects devices ranging from data centers with surplus computing power down to PCs that could potentially contribute personal GPUs, custom ASICs, and SoCs.

To solve some of the problems currently facing decentralized computing power, Gensyn draws on new theoretical research from academia:

  1. Probabilistic proof-of-learning: metadata from the gradient-based optimization process is used to construct proofs that the relevant task was executed, speeding up the verification process;

  2. Graph-based pinpoint protocol: GPP serves as a bridge connecting the offline execution of DNNs (Deep Neural Networks) with the smart contract framework on the blockchain, resolving the inconsistencies that easily arise across hardware devices and ensuring the consistency of verification;

  3. A Truebit-style incentive method: through a combination of staking and slashing, it establishes an incentive system that lets economically rational participants honestly perform assigned tasks. The mechanism uses cryptographic and game-theoretic methods, and this verification system is essential for maintaining the integrity and reliability of large-model training computations. A sketch of this staking pattern follows below.
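
A minimal sketch of the staking-plus-slashing incentive pattern named in point 3. The amounts, roles, and dispute flags are invented for illustration; Gensyn's actual mechanism is specified in its litepaper:

```python
from dataclasses import dataclass

@dataclass
class Worker:
    address: str
    stake: float  # tokens locked as collateral before accepting a task

def settle(worker: Worker, reward: float, proof_valid: bool,
           challenged: bool, fraud_proven: bool) -> float:
    """Pay honest work; slash the stake when a challenge proves fraud."""
    if challenged and fraud_proven:
        slashed, worker.stake = worker.stake, 0.0  # collateral forfeited
        return -slashed
    if proof_valid:
        return reward       # honest completion earns the task reward
    return 0.0              # invalid proof: no pay, stake stays locked

w = Worker("0xabc", stake=100.0)
print(settle(w, reward=5.0, proof_valid=True, challenged=False, fraud_proven=False))
```

The game-theoretic point is that cheating risks losing the whole stake for a reward much smaller than it, so a rational participant computes honestly.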

It is worth noting, however, that the material above mostly addresses the task-completion verification layer rather than the decentralized model-training capability presented as the main highlight in the project documents, especially the optimization of parallel computing, communication, and synchronization across distributed hardware. Currently, affected by network latency and bandwidth, frequent communication between nodes increases iteration time and communication costs; far from bringing real optimization, this can reduce training efficiency. Gensyn’s approach to node communication and parallel computation in model training may involve complex coordination protocols to manage the distributed nature of the computation, but without more detailed technical information or a deeper understanding of its specific methods, the exact mechanism by which Gensyn would achieve large-scale model training through its network will not be truly revealed until the project goes live.
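
A back-of-the-envelope model of why latency and bandwidth dominate distributed training over the open internet. All numbers here are illustrative assumptions, not measurements of any specific network:

```python
# Per-iteration time ~ compute + gradient transfer + synchronization latency
params = 1e9            # assumed 1B-parameter model
bytes_per_param = 2     # fp16 gradients
grad_bytes = params * bytes_per_param

def iter_time(compute_s, bandwidth_gbps, latency_s, comm_rounds=2):
    transfer = grad_bytes / (bandwidth_gbps * 1e9 / 8)  # seconds to move gradients
    return compute_s + transfer + latency_s * comm_rounds

# Datacenter interconnect vs. consumer broadband between volunteer nodes
print(f"datacenter: {iter_time(1.0, 100, 0.0005):.2f} s/iter")   # ~1.16 s
print(f"internet:   {iter_time(1.0, 0.1, 0.05):.2f} s/iter")     # ~161 s
```

Under these assumptions the same one second of compute takes over a hundred times longer per iteration across consumer links, which is exactly why current decentralized projects gravitate toward fine-tuning and inference rather than pre-training.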

We have also paid attention to the Edge Matrix Computing (EMC) protocol, which uses blockchain technology to apply computing power to AI, rendering, scientific research, AI e-commerce access, and other scenario types. Tasks are distributed to different computing power nodes through elastic computing. This approach not only improves the efficiency of computing power use but also secures data transmission. At the same time, it provides a computing power marketplace where users can access and exchange computing resources, making it easier for developers to deploy and reach users faster. Combined with the economic forms of Web3, computing power providers can earn real returns and protocol subsidies based on users’ actual usage, and AI developers gain lower inference and rendering costs.

GPU-based RWA products are also expected to launch. The key is to revitalize hardware that was previously fixed in machine rooms, dividing and circulating it in the form of RWA to obtain additional liquidity. High-quality GPUs can serve as the underlying asset of an RWA because computing power is regarded as hard currency in the AI field: there is an obvious contradiction between supply and demand that cannot be resolved in the short term, so GPU prices are relatively stable.

In addition, deploying IDC machine rooms to build computing power clusters is a key part of the EMC protocol. This not only lets GPUs run in a unified environment but also handles large computing-power-intensive tasks, such as model pre-training, more efficiently, meeting the needs of professional users. An IDC machine room can also centrally host and run large numbers of GPUs, guaranteeing the technical specifications of identical high-quality hardware, making it easier to package them into the market as RWA products and opening up new ideas for DeFi.

In recent years, academia has also developed new theories and application practices in edge computing. As a supplement and optimization of cloud computing, edge computing moves part of artificial intelligence from the cloud to the edge, into ever smaller IoT devices. Since these devices are often tiny, lightweight machine learning is favored to meet constraints such as power consumption, latency, and accuracy.

Network3 builds a dedicated AI Layer2 that provides AI developers worldwide with services including AI model algorithm optimization and compression, federated learning, edge computing, and privacy computing, helping them train or verify models quickly, conveniently, and efficiently. By utilizing large numbers of smart IoT hardware devices, it can focus on small models and provide the corresponding computing power, and by building a TEE (Trusted Execution Environment) it lets users complete training by uploading only model gradients, ensuring the privacy and security of user data.
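
A minimal federated-learning sketch (gradient averaging across devices) of the pattern described above, where devices share only gradients rather than raw data. The toy linear model and data are invented; real deployments add secure aggregation and, per Network3's description, TEE protection for the uploaded gradients:

```python
import numpy as np

def local_gradient(w: np.ndarray, X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """One local gradient computation; only the gradient leaves the device."""
    return 2 * X.T @ (X @ w - y) / len(y)  # raw data X, y never leaves the device

rng = np.random.default_rng(0)
w_global = np.zeros(3)
# Five hypothetical IoT devices, each holding its own private dataset
devices = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(5)]

for _ in range(50):
    grads = [local_gradient(w_global, X, y) for X, y in devices]
    w_global -= 0.1 * np.mean(grads, axis=0)  # server averages gradients only

print(w_global)
```

The privacy property rests on the fact that the server only ever sees aggregated gradients, never the device-local datasets themselves.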

To sum up

• As AI and other fields develop, many industries will see their underlying logic change dramatically, computing power will rise to a more important position, and everything related to it will prompt broad exploration across the industry. Decentralized computing power networks have advantages of their own: they can reduce the risk of centralization and can also serve as a complement to centralized computing power.

• Teams in the AI field also stand at a fork in the road: whether to use already-trained large models to build their own products, or to participate in training large models in their own domains, remains largely an open question. Decentralized computing power can therefore meet different business needs. This development trend is welcome, and as technology updates and algorithms iterate, breakthroughs in key areas are inevitable.

• Don’t be afraid, just figure it out slowly.

References

https://www.semianalysis.com/p/gpt-4-architecture-infrastructure

https://medium.com/render-token/render-network-q2-highlights-part-2-network-statistics-ac5aa6bfa4e5

https://know.rendernetwork.com/

https://medium.com/livepeer-blog/an-overview-of-the-livepeer-network-and-lpt-44985f9321ff

https://www.youtube.com/watch?v=FDA9wqZmsY8

https://mirror.xyz/1kx.eth/q0s9RCH43JCDq8Z2w2Zo6S5SYcFt9ZQaRITzR4G7a_k

https://mirror.xyz/gensyn.eth/_K2v2uuFZdNnsHxVL3Bjrs4GORu3COCMJZJi7_MxByo

https://docs.gensyn.ai/litepaper/#solution

https://a16zcrypto.com/posts/announcement/investing-in-gensyn/

https://www.pinecone.io/learn/chunking-strategies/

https://akash.network/blog/the-fast-evolving-ai-landscape/

https://aws.amazon.com/cn/blogs/compute/amazon-ec2-p4d-instances-deep-dive/

https://manual.edgematrix.pro/emc-network/what-is-emc-and-poc

https://arstechnica.com/gaming/2022/09/the-end-of-ethereum-mining-could-be-a-bonanza-for-gpu-shoppers/

Disclaimer:

  1. This article is reprinted from [PANews]. All copyrights belong to the original author [Future3 Campus]. If there are objections to this reprint, please contact the Gate Learn team, and they will handle it promptly.
  2. Liability Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
  3. Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.