Deep Dive into the Operational Network of Large Language Models, BasedAI

Beginner | Apr 01, 2024
BasedAI, a collaboration between Based Labs and the founding team of Pepecoin, seeks to tackle the privacy concerns inherent in contemporary AI practices. It bundles the resources needed for AI-related computation into a scarce permission, leveraging that scarcity to establish a robust token economy. This article explores how the project combines large language models, Zero-Knowledge Proofs (ZK), homomorphic encryption, and Meme coins.

Reposted Article Title: A Deep Dive into BasedAI: A Large Language Model Operational Network with a Focus on Privacy and Efficiency, The Next Bittensor in the AI Space?

The AI space continues to be red-hot. Many projects are attempting to “AI-fy” themselves, adopting the new proposition of “helping AI do better” in hopes of soaring higher on the winds of AI. However, most of the older projects have already realized their value in previous cycles, and new projects like Bittensor are no longer “new.” We still need to look for projects that have yet to realize their value and have the potential for compelling narratives. Improving privacy has always been an attractive direction in the crypto projects aimed at “helping AI do better”:

First, because protecting privacy resonates inherently with the egalitarian ideals of decentralization; second, because protecting privacy inevitably involves technologies such as ZK and Homomorphic Encryption. A sound narrative combined with advanced technology suggests that such an AI project will not lack room to develop. And wouldn’t it be more interesting if such a serious project could also incorporate the gameplay of Meme coins?

At the beginning of March, a project named BasedAI quietly registered an account on Twitter, officially posting only two tweets beyond retweets, while its website appears to be extremely basic—featuring nothing but a sophisticated academic-style whitepaper. Some international influencers have already taken the lead in analyzing the project, suggesting that it might be the next Bittensor.

Simultaneously, its namesake token, $basedAI, has seen a meteoric rise since the end of February, surging more than 40 times in value.

After thoroughly examining the project’s whitepaper, we discovered that BasedAI is an AI project that integrates large language models, Zero-Knowledge Proofs (ZK), homomorphic encryption, and Meme coins. While recognizing its narrative direction, we are also impressed by its ingenious economic design, which naturally links the scheduling of computing resources with the use of other Meme coins. Considering that the project is still in its very early stages, in this issue, we will interpret it to see if it has the potential to become the next Bittensor.

When serious science meets Meme, what exactly is BasedAI doing?

Before answering this question, let’s first look at who is behind BasedAI.

Public information reveals that BasedAI was jointly developed by an organization named Based Labs and the founding team of Pepecoin, aiming to address privacy issues when using large language models in the current AI field. The public information on Based Labs is sparse, with its website being quite mysterious, featuring a string of technical keywords in a Matrix-style presentation. One of the researchers in the organization, Sean Wellington, is the author of the publicly available whitepaper for BasedAI.

Moreover, Google Scholar indicates that Sean graduated from UC Berkeley and has published multiple papers on settlement systems and distributed data since 2006. He specializes in AI and distributed network research, making him a significant figure in the tech field.

On the other side, Pepecoin is not the currently popular PEPE coin but an original Meme coin launched in 2016, which ran on its own L1 mainnet at the time and has since migrated to Ethereum.

You could say this is an OG Meme that also understands L1 development.

But how do a serious AI scientific research heavyweight and a Meme team, seemingly unrelated in their fields, spark innovation within BasedAI?

ZK and FHE: Balancing AI Computational Efficiency and Privacy

If we set aside the meme element, the Twitter introduction of BasedAI succinctly highlights the project’s narrative value:

“Your prompts are your prompts.” This essentially emphasizes the importance of privacy and data sovereignty: when you use large language models like GPT, any prompt or information you input is received by the server on the other end, which exposes your data to OpenAI or other model providers.

While this may seem harmless, it inevitably raises privacy concerns, and you have no choice but to unconditionally trust that the AI model provider will not misuse your conversation records.

Stripping away the obscure mathematical formulas and technical designs in the BasedAI whitepaper, you can simply understand the project’s aim as:

Encrypting any content of your dialogue with large language models, allowing the model to perform computations without exposing plaintext, and ultimately returning results that only you can decrypt.

You might anticipate that achieving this would involve ZK (Zero-Knowledge Proof) and FHE (Fully Homomorphic Encryption), two privacy technologies.

ZK allows you to prove that a statement is true without revealing the underlying data;

FHE enables you to compute on encrypted data.

Combining the two, you can submit your prompts to an AI model in encrypted form, and the model returns an answer to you, but no parties involved know what your question was or what the answer is.
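To make the zero-knowledge half of this concrete, here is a minimal sketch of a classic Schnorr-style proof, in which a prover shows that they know a secret number without revealing it. This is a textbook construction with toy-sized parameters chosen for readability; the article does not say which ZK scheme BasedAI uses, so the group parameters and the function names `prove` and `verify` are assumptions for illustration only.

```python
import hashlib
import secrets

# Toy group (far too small for real use): P = 2*Q + 1 with P and Q prime,
# and G generating the subgroup of order Q.
P, Q, G = 2039, 1019, 4


def _challenge(y: int, t: int) -> int:
    """Derive the challenge by hashing the public values (Fiat-Shamir)."""
    data = f"{G}|{y}|{t}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(x: int):
    """Prove knowledge of x such that y = G^x mod P, without revealing x."""
    y = pow(G, x, P)            # public value
    r = secrets.randbelow(Q)    # one-time random nonce
    t = pow(G, r, P)            # commitment
    c = _challenge(y, t)
    s = (r + c * x) % Q         # response; r masks x
    return y, (t, s)


def verify(y: int, proof) -> bool:
    """Check G^s == t * y^c (mod P) using only public values."""
    t, s = proof
    c = _challenge(y, t)
    return pow(G, s, P) == (t * pow(y, c, P)) % P


secret = 123
public, proof = prove(secret)
ok = verify(public, proof)      # the verifier never sees `secret`
```

The verifier checks the proof against `public` alone, which is the property the article relies on: correctness can be confirmed without the plaintext ever being disclosed.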

This sounds promising, but there’s a critical issue — FHE is computationally intensive and slow, posing a conflict between computational efficiency and privacy protection for user-facing LLMs like GPT, which require fast result display.

BasedAI, in its paper, emphasized the “Cerberus Squeezing” technology and supported it with complex mathematical formulas.

While we can’t professionally assess the mathematical implementation of this technology, its purpose can be simply understood as:

Optimizing the efficiency of processing encrypted data in FHE, selectively concentrating computational resources where they have the most impact to quickly complete computations and display results.

The paper also demonstrated with data how this optimization significantly improves efficiency:

Using Cerberus Squeezing, the computational steps required for fully homomorphic encryption could be nearly halved.

Thus, we can quickly simulate a user’s entire process using BasedAI:

  • The user inputs prompts, asking to analyze the emotions displayed in someone’s conversation records while wishing to protect the privacy of those records.
  • Through the BasedAI platform, these data are submitted in encrypted form, specifying the AI model to use (e.g., an emotion analysis model).
  • Miners in the BasedAI network receive this task and use their computational resources to execute the specified AI model on the encrypted data.
  • The network nodes complete the computation task without decrypting the data and return the encrypted processing result to the user.
  • The user receives the encrypted result, decrypts it with their key, and obtains the data analysis result they need.
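The flow above can be sketched end to end with a toy homomorphic scheme. The example below uses Paillier encryption, which is only additively homomorphic and stands in here for the fully homomorphic encryption the paper describes; the tiny primes, the word-category counts, the weights, and the function names (`keygen`, `miner_score`, and so on) are all assumptions made for illustration, not BasedAI's actual design.

```python
import math
import secrets


def keygen(p: int = 1009, q: int = 1013):
    """Return (public n, private (lam, mu)). Tiny primes: illustration only."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid because the generator is n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt a non-negative integer m < n under public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, priv, c: int) -> int:
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def miner_score(n: int, enc_counts, weights) -> int:
    """Miner side: compute Enc(sum(w_i * x_i)) from ciphertexts only.

    Multiplying ciphertexts adds the plaintexts; raising a ciphertext to a
    non-negative integer w multiplies its plaintext by w.
    """
    n2 = n * n
    acc = encrypt(n, 0)
    for c, w in zip(enc_counts, weights):
        acc = (acc * pow(c, w, n2)) % n2
    return acc


# User: encrypt private features (hypothetical counts of three word categories).
n, priv = keygen()
counts = [3, 1, 4]
enc_counts = [encrypt(n, x) for x in counts]

# Miner: runs a (very simple) linear "model" on the encrypted inputs.
enc_result = miner_score(n, enc_counts, weights=[2, 1, 0])

# User: decrypts the result with the private key.
score = decrypt(n, priv, enc_result)  # 3*2 + 1*1 + 4*0 = 7
```

The miner only ever handles `enc_counts` and `enc_result`; the private key stays with the user. A real LLM needs multiplications and non-linear operations on ciphertexts as well, which is where full FHE and its performance cost come in.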

The “Brain,” Miners, and Validators

Beyond the technology, what specific roles exist within the BasedAI network to execute the technology and meet user needs? First and foremost, it’s important to introduce the self-created concept of the “Brain.”

A “Brain” from Based Labs

For most AI crypto projects, a few inevitable elements are:

  • Miners: Responsible for performing computational tasks, consuming computational resources.
  • Validators: Validate the correctness of the work completed by miners, and ensure the validity of transactions and computational tasks within the network.
  • Blockchain: Records the results of computational and validation tasks on a ledger, incentivizing the behavior of different roles through the network’s native tokens.

BasedAI adds another layer on top of these three elements with the concept of the “Brain”:

“You must have a ‘Brain’ to incorporate miners and validators’ computational resources, allowing these resources to compute for various AI models and complete tasks.”

Put simply, these “Brains” act as distributed containers for specific computational tasks, running modified large language models (LLMs). Each “Brain” can choose the miners and validators it wishes to associate with.

If you find this explanation abstract, think of having a “Brain” as having a “license to offer cloud services”:

To recruit a group of miners and validators for encrypted computing of large language models, you must have an operating license that specifies:

  • Your business location (ID)
  • Your scope of business (using AI for sentiment analysis, generating images, medical assistance, etc.)
  • How much computational resource you have, and its capacity
  • Specifically, who you have brought in
  • How much reward you can earn from this activity

According to BasedAI’s paper, each “Brain” can accommodate up to 256 validators and 1792 miners, and only 1024 “Brains” will ever exist in the system, which further increases their scarcity.
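Read as a data structure, the “license” analogy and these limits might look like the sketch below. Only the three numeric caps come from the article; the class names, fields, and methods are hypothetical and say nothing about how the BasedAI network actually stores this state.

```python
from dataclasses import dataclass, field

# Limits quoted from the paper via the article.
MAX_BRAINS = 1024
MAX_VALIDATORS_PER_BRAIN = 256
MAX_MINERS_PER_BRAIN = 1792


@dataclass
class Brain:
    """Hypothetical record for one Brain 'license'."""
    brain_id: int                  # the "business location" (ID)
    scope: str                     # the "scope of business"
    miners: set = field(default_factory=set)
    validators: set = field(default_factory=set)

    def add_miner(self, address: str) -> None:
        if address not in self.miners and len(self.miners) >= MAX_MINERS_PER_BRAIN:
            raise ValueError("Brain already has the maximum number of miners")
        self.miners.add(address)

    def add_validator(self, address: str) -> None:
        if (address not in self.validators
                and len(self.validators) >= MAX_VALIDATORS_PER_BRAIN):
            raise ValueError("Brain already has the maximum number of validators")
        self.validators.add(address)


class BrainRegistry:
    """Hypothetical registry enforcing the global cap on Brains."""

    def __init__(self) -> None:
        self.brains: dict = {}

    def create_brain(self, scope: str) -> Brain:
        if len(self.brains) >= MAX_BRAINS:
            raise RuntimeError("all 1024 Brains have been issued")
        brain = Brain(brain_id=len(self.brains), scope=scope)
        self.brains[brain.brain_id] = brain
        return brain


registry = BrainRegistry()
medical = registry.create_brain("medical assistance")
medical.add_miner("miner-0")
medical.add_validator("validator-0")
```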

Miners and validators must do the following to join a “Brain”:

  • Miners: Connect to the platform, decide how much GPU resource to allocate (better suited to computation), optionally deposit $BASED tokens, and start computing work.
  • Validators: Connect to the platform, decide how much CPU resource to allocate (better suited to validation), optionally deposit $BASED tokens, and start validation work.

The more $BASED tokens deposited, the higher the efficiency of miners and validators running on the “Brain,” and the more $BASED rewards they receive.

Clearly, a “Brain” represents a certain power and organizational relationship, also opening up space for token and incentive design (more on this later).

However, doesn’t the design of this “Brain” seem familiar?

Different “Brains” somewhat resemble the different subnets in Bittensor, each performing specific tasks using different AI models;

They also resemble the “slots” in Polkadot, popular in the previous cycle, on which various parachains run to perform different tasks.

BasedAI also provided an example of a “medical Brain” performing a task:

  • Patient medical records are encrypted and submitted to the medical Brain, generating a prompt that asks for appropriate diagnostic advice;
  • With the help of ZK and FHE, a suitable large language model within the BasedAI network generates answers without decrypting sensitive patient data; this step calls upon the computational resources of miners and validators;
  • Healthcare providers receive encrypted output from the BasedAI network. Only the submitting user can decrypt the results to obtain treatment suggestions, and the data is never exposed or leaked during this process.

The “Brain” Privilege Sale Plays Out, Benefiting Pepecoin

So, how does one acquire a “Brain” to obtain the “work permit” privilege for encrypted AI model computation? BasedAI, in collaboration with Pepecoin, has gamified the sale of these privileges, giving Pepecoin, a Meme coin, utility value.

With only 1024 “Brains” available, the project naturally leverages NFT minting: each “Brain” sold generates a corresponding ERC-721 token, which can be seen as a license. To mint this “Brain” NFT, one of two Pepecoin-related actions is required: burning or staking Pepecoin.

In terms of burning:

  • The minting of the first “Brain” requires spending 1000 Pepecoin.
  • Each subsequent “Brain” mint increases the cost by 200 Pepecoin.
  • Brains generated in this way are transferable.
  • If all Brains were obtained through burning, 107,563,530 Pepecoin would be permanently destroyed. (According to CMC data, the current circulating supply is 133M, meaning roughly 80% of the circulating supply would be removed if this burn were fully realized.)
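As a rough check on the burn total, the schedule as described (1000 Pepecoin for the first Brain, 200 more for each subsequent one) can be summed as an arithmetic series. This assumes a strictly linear price across all 1024 Brains, which may be a simplification of whatever formula the paper actually uses:

```latex
\[
\sum_{k=0}^{1023} (1000 + 200k)
  = 1024 \cdot 1000 + 200 \cdot \frac{1023 \cdot 1024}{2}
  = 1{,}024{,}000 + 104{,}755{,}200
  = 105{,}779{,}200
\]
```

This lands close to, but not exactly on, the 107,563,530 figure quoted from the paper, so the paper's exact pricing presumably differs slightly from this linear reading. Either way, the total is about 80% of a 133M circulating supply.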

Regarding staking:

  • Users are required to stake 100,000 Pepecoin for 90 days.
  • The Brain’s ERC-721 NFT is issued immediately after staking.
  • Brains generated this way are non-transferable but will gradually earn $BASED, the project’s native token.
  • The stake can be withdrawn after 90 days.

Regardless of the method used, as more Brains are created, a corresponding amount of Pepecoin will either be burned or locked, depending on the participation ratio of the two methods. It’s clear that this distribution is more about the allocation of crypto assets than AI resources. Given the scarcity of Brains and the token rewards for their operation, the demand for Pepecoin will significantly increase during Brain creation; both staking and burning reduce the circulating supply of Pepecoin, theoretically benefiting the token’s secondary market price.
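How much Pepecoin leaves circulation depends on the split between the two methods. The sketch below works through that arithmetic under two simplifying assumptions that the article does not confirm: the burn price rises linearly (1000 for the first burn-minted Brain, 200 more for each later one), and stake-minted Brains do not advance the burn price. The constant and function names are hypothetical.

```python
BURN_BASE = 1_000        # Pepecoin burned for the first burn-minted Brain
BURN_STEP = 200          # increase per subsequent burn-minted Brain
STAKE_AMOUNT = 100_000   # Pepecoin locked for 90 days per stake-minted Brain
MAX_BRAINS = 1_024


def burn_cost(k: int) -> int:
    """Cost of the k-th burn-minted Brain, with k = 0 for the first one."""
    return BURN_BASE + BURN_STEP * k


def pepecoin_removed(burn_mints: int, stake_mints: int):
    """Return (permanently burned, temporarily locked) Pepecoin totals."""
    if burn_mints < 0 or stake_mints < 0:
        raise ValueError("mint counts must be non-negative")
    if burn_mints + stake_mints > MAX_BRAINS:
        raise ValueError("only 1024 Brains can exist")
    burned = sum(burn_cost(k) for k in range(burn_mints))
    locked = STAKE_AMOUNT * stake_mints
    return burned, locked


half_and_half = pepecoin_removed(512, 512)
all_staked = pepecoin_removed(0, 1024)
```

Under these assumptions, an even split burns 26,675,200 Pepecoin and locks 51,200,000, while an all-staking outcome locks 102,400,000 for the 90-day period without destroying any.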

As long as fewer than 1024 Brains are issued and active within the ERC-721 contract, the BasedAI Portal will continue to issue Brains. Once all 1024 Brains are allocated, the Portal will no longer allow new Brains to be created. An Ethereum address can hold multiple Brain NFTs, and the Portal will let users manage rewards from all Brains held by the connected ETH wallet. According to figures in the official paper, active Brain owners can expect to earn between $30,000 and $80,000 per Brain per year.

With these economic incentives and the narratives around AI and privacy, Brains can be expected to be popular once they officially launch.

Summary

In crypto projects, technology itself is not the goal; its role is to guide attention, thereby directing asset distribution and flow. The design of BasedAI’s Brain clearly demonstrates an understanding of “how to promote asset distribution”: under the correct narrative of data privacy, integrating the resources needed for AI-related computations into a privilege, creating scarcity for this privilege, thereby guiding assets into the privilege, and increasing the consumption of another MEME token.

Computational resources are correctly allocated and incentivized, the project’s “Brain” assets gain scarcity and popularity, and the Meme token’s circulating supply decreases… From the asset creation perspective, BasedAI’s design is sophisticated and clever.

However, if we feign ignorance and ask the questions that usually go unspoken:

How many people will use this privacy-protecting large language model? How many AI giants are willing to cooperate with such privacy-protecting technology that may not benefit them?

The answer remains less than optimistic. Yet narratives thrive on momentum, and speculation is a matter of timing. Sometimes, what’s needed isn’t to question the viability of a path but to go with the flow.

Source Material:

X: https://twitter.com/getbasedai

Official Website: https://www.getbased.ai/

Pepecoin: https://twitter.com/pepecoins

BasedAI Whitepaper

Disclaimer:

  1. This article is reproduced from techflow, originally titled “A Deep Dive into BasedAI: A Large Language Model Network Prioritizing Privacy and Efficiency, The Next Big Thing in the AI Race” by [TechFlow]. The copyright belongs to the original author, [TechFlow]. For any issues regarding this reproduction, please contact the Gate Learn team. The team will process it according to the relevant procedures as soon as possible.

  2. Disclaimer: The views and opinions expressed in this article are those of the author alone and do not constitute any investment advice.

  3. Translations of the article into other languages were done by the Gate Learn team. Reproduction, dissemination, or plagiarism of the translated articles is not allowed without mention of Gate.io.
