From FIL to Walrus: The Evolution and Challenges of Decentralized Storage

Decentralized Storage: The Difficult Journey from Concept to Implementation

Storage was once one of the hottest tracks in the blockchain industry. Filecoin, a flagship project of the last bull market, at one point exceeded a ten-billion-dollar market value. Arweave, which promoted permanent storage as its selling point, peaked at a market value of 3.5 billion dollars. However, as the usefulness of cold-data storage came into question, so did the necessity of permanent storage, casting a shadow over the prospects of decentralized storage.

The emergence of Walrus has brought new attention to the long-quiet storage track. More recently, the Shelby project, launched by Aptos in collaboration with Jump Crypto, aims to take decentralized storage for hot data to new heights. So, can decentralized storage make a comeback and serve broader scenarios, or is it merely another round of conceptual speculation? This article analyzes the evolution of decentralized storage through the development of Filecoin, Arweave, Walrus, and Shelby, and explores its future prospects.

From Filecoin and Arweave to Walrus and Shelby: How Far Is Decentralized Storage from Mainstream Adoption?

Filecoin: The Name of Storage, The Reality of Mining

Filecoin is one of the earlier cryptocurrency projects, and its development naturally revolves around decentralization - a common trait of early crypto projects, which sought to find meaning for decentralization in various traditional fields. Filecoin combined storage with decentralization and proposed a remedy for the drawbacks of centralized storage. However, certain trade-offs made to achieve decentralization became the very pain points that later projects such as Arweave and Walrus attempt to address. To understand why Filecoin is essentially just a mining-coin project, one must recognize the objective limitations of its underlying technology, IPFS, in handling hot data.

IPFS: The Transmission Bottleneck of a Decentralized Architecture

IPFS (InterPlanetary File System) was launched around 2015, aiming to disrupt the traditional HTTP protocol through content addressing. Its biggest drawback is extremely slow retrieval: in an era when traditional data service providers achieve millisecond-level responses, retrieving a file from IPFS can still take over ten seconds. This makes it difficult to promote in practical applications and explains why, outside of a handful of blockchain projects, it is rarely adopted by traditional industries.

The underlying P2P protocol of IPFS is mainly suitable for "cold data," which refers to static content that does not change frequently, such as videos, images, and documents. However, when dealing with hot data, such as dynamic web pages, online games, or artificial intelligence applications, the P2P protocol does not have a significant advantage over traditional CDNs.

Although IPFS itself is not a blockchain, its design concept of Directed Acyclic Graph (DAG) is highly compatible with many public chains and Web3 protocols, making it inherently suitable as the underlying construction framework for blockchains. Therefore, even if it has shortcomings in terms of practicality, it is already sufficient as a foundational framework for carrying blockchain narratives. Early crypto projects only needed a functional framework to initiate grand visions, but as Filecoin developed to a certain stage, the inherent problems brought by IPFS began to hinder its further development.

The Logic of a Mining Coin Under the Cloak of Storage

The original intention of IPFS was to allow users to become part of the storage network while storing data. However, in the absence of economic incentives, it is difficult for users to voluntarily use this system, let alone become active storage nodes. This means that most users will only store files on IPFS but will not contribute their own storage space or store files for others. It is against this backdrop that Filecoin was born.

In Filecoin's token economic model there are three main roles: users pay fees to store data; storage miners earn token rewards for storing user data; and retrieval miners serve data when users request it and earn rewards in turn.

This model leaves room for abuse. Storage miners can fill their committed space with garbage data to collect rewards. Since this garbage data will never be retrieved, losing it does not trigger the penalty mechanism, which lets miners delete the garbage data and repeat the process. Filecoin's proof-of-replication consensus can only ensure that user data has not been privately deleted; it cannot prevent miners from filling space with garbage data.

The operation of Filecoin largely relies on the continuous investment of miners in the token economy, rather than on the real demand for distributed storage from end users. Although the project is still iterating, at this stage, the ecological construction of Filecoin aligns more with the "mining coin logic" rather than the "application-driven" positioning of storage projects.

Arweave: The Gains and Losses of Long-Termism

If Filecoin's design goal is to build an incentivized, verifiable, decentralized "data cloud", then Arweave takes storage in the opposite extreme direction: providing permanent storage for data. Arweave does not attempt to build a distributed computing platform; its entire system revolves around one core assumption - important data should be stored once and remain on the network forever. This extreme long-termism makes Arweave fundamentally different from Filecoin in mechanisms, incentive models, hardware requirements, and narrative.

Arweave takes Bitcoin as its model, attempting to continuously optimize its permanent storage network over a horizon measured in years. It does not focus on marketing, nor does it care about competitors or market trends; it simply keeps iterating on its network architecture, indifferent to whether anyone is watching, because that is the essence of its development team: long-termism. Thanks to long-termism, Arweave was highly sought after in the last bull market; and because of long-termism, it may well survive several more rounds of bull and bear markets even after hitting rock bottom. The only question is whether decentralized storage's future holds a place for Arweave - the value of permanent storage can only be proven over time.

The Arweave mainnet has evolved from version 1.5 to the recent 2.9. Although it has lost market attention, it has remained committed to letting a wider range of miners join the network at minimal cost and to incentivizing miners to maximize data storage, continuously strengthening the network's robustness. Well aware that it does not align with market preferences, Arweave has taken a conservative path: it has not chased ecosystem growth - indeed, its application ecosystem has largely stagnated - and instead upgrades the mainnet at minimal cost while continuously lowering hardware thresholds without compromising network security.

Upgrade Review: Versions 1.5 to 2.9

Arweave version 1.5 exposed a vulnerability that let miners improve their block-production odds through GPU stacking rather than real storage. To curb this trend, version 1.7 introduced the RandomX algorithm, which limits specialized hardware and requires general-purpose CPUs to participate in mining, thereby weakening the centralization of computing power.

Version 2.0 adopted SPoA (Succinct Proofs of Access), compressing data proofs into a concise Merkle-tree path, and introduced format 2 transactions to reduce synchronization burdens. This architecture eased network bandwidth pressure and significantly enhanced node collaboration. However, some miners could still evade the responsibility of holding real data by relying on centralized high-speed storage pools.

Version 2.4 introduced the SPoRA (Succinct Proofs of Random Access) mechanism, adding a global index and slow-hash random access so that miners must genuinely hold data blocks to produce valid blocks, weakening the payoff of raw hash-power stacking. Miners consequently shifted their focus to storage access speed, driving adoption of SSDs and other high-speed read-write devices. Version 2.6 introduced a hash chain to pace block production, balancing the marginal benefit of high-performance hardware and giving small and medium-sized miners fair room to participate.

Subsequent versions further enhanced network collaboration and storage diversity: 2.7 added coordinated mining and pool mechanisms to improve the competitiveness of small miners; 2.8 introduced a composite packing mechanism that lets large-capacity, low-speed devices participate flexibly; and 2.9 introduced a new packing process in the replica_2_9 format, significantly improving efficiency, reducing computational dependency, and completing the closed loop of a data-oriented mining model.

Overall, Arweave's upgrade path clearly presents its storage-oriented long-term strategy: while continuously resisting the trend of computational power centralization, it lowers the participation threshold and ensures the possibility of the protocol's long-term operation.

Walrus: Embracing the Innovation and Limitations of Hot Data

Walrus's design approach is completely different from Filecoin's and Arweave's. Filecoin set out to create a decentralized, verifiable storage system, at the cost of being suited only to cold data; Arweave set out to build an on-chain Library of Alexandria that stores data permanently, at the cost of limited application scenarios; Walrus sets out to optimize the storage overhead of a hot-data storage protocol.

RedStuff: Cost Innovation or Old Wine in a New Bottle?

In terms of storage cost, Walrus argues that the expenses of Filecoin and Arweave are unreasonable. Both adopt a fully replicated architecture, whose main advantage is that each node holds a complete copy, giving strong fault tolerance and independence among nodes: even if some nodes go offline, the network retains data availability. But this also means the system needs many redundant copies to stay robust, which drives up storage costs. In Arweave's design especially, the consensus mechanism itself encourages redundant storage across nodes to enhance data security. Filecoin is more flexible on cost control, but at the price of potentially higher data-loss risk for some low-cost storage options. Walrus tries to strike a balance between the two: its mechanism controls replication costs while enhancing availability through structured redundancy, establishing a new compromise between data availability and cost efficiency.

RedStuff, the key technology Walrus created to reduce node redundancy, derives from Reed-Solomon (RS) coding. RS coding is a classical error-correction algorithm used to reconstruct original data, and it is ubiquitous in daily life, from CD-ROMs to satellite communications to QR codes.

Erasure codes allow a data block (say, 1 MB) to be expanded to twice its size (2 MB), where the additional 1 MB is special erasure-code data. If any byte in the block is lost, it can easily be recovered from these codes; even with up to 1 MB of loss, the entire block can still be reconstructed. The same technique is what lets computers read all the data from a damaged CD-ROM.

The most commonly used erasure code is RS coding. It works by treating k information blocks as defining a polynomial, then evaluating that polynomial at additional x-coordinates to obtain encoded blocks; any k of the resulting blocks suffice to reconstruct the original data. With RS erasure codes, the probability that random data loss destroys a large chunk of data is therefore very small.
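As a toy illustration of the evaluation-and-interpolation idea above (not Walrus's actual implementation - real RS codecs work over GF(2^8) with optimized arithmetic), the following sketch encodes k data symbols into n shares over a small prime field; any k surviving shares recover the data. All names and parameters are illustrative assumptions.

```python
# Toy Reed-Solomon-style erasure code over the prime field GF(257).
P = 257  # prime modulus, large enough to hold byte-valued symbols

def lagrange_eval(points, x):
    """Evaluate the unique degree < len(points) polynomial passing
    through `points` at position x, modulo P."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P  # Fermat inverse
    return total

def encode(data, n):
    """Interpret the k data symbols as polynomial values at x = 0..k-1
    and emit n shares: evaluations of that polynomial at x = 0..n-1."""
    base = list(enumerate(data))
    return [(x, lagrange_eval(base, x)) for x in range(n)]

def decode(shares, k):
    """Reconstruct the original symbols from any k surviving shares."""
    return [lagrange_eval(shares[:k], x) for x in range(k)]

data = [42, 7, 99]                             # k = 3 symbols
shares = encode(data, 6)                       # n = 6 shares; any 3 suffice
survivors = [shares[1], shares[4], shares[5]]  # three shares were lost
assert decode(survivors, 3) == data
```

Because any k of the n shares pin down the degree < k polynomial, losing n - k shares at random is tolerable - which is the intuition behind the claim that random loss rarely destroys data.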

The biggest feature of RedStuff is that, through improvements to the erasure-coding algorithm, Walrus can quickly and robustly encode unstructured data blobs into smaller shards distributed across the storage node network. Even if up to two-thirds of the shards are lost, the original blob can be quickly reconstructed from the remaining shards - all while maintaining a replication factor of only 4 to 5x.

Therefore, it is reasonable to describe Walrus as a lightweight redundancy-and-recovery protocol redesigned around the decentralized scenario. Compared with traditional erasure codes (like Reed-Solomon), RedStuff no longer pursues strict mathematical consistency, but makes pragmatic trade-offs around data distribution, storage verification, and computational cost. It abandons the immediate decoding that centralized scheduling requires and instead verifies through on-chain proofs whether nodes hold specific data replicas, adapting to a more dynamic, edge-oriented network structure.

The core design of RedStuff is to split data into two categories: primary slices and secondary slices. Primary slices are used to recover the original data; their generation and distribution are strictly constrained, with a recovery threshold of f+1 and 2f+1 signatures required as an availability endorsement. Secondary slices are generated through simple operations such as XOR combinations, providing elastic fault tolerance and enhancing the overall robustness of the system. This structure essentially relaxes the requirement for data consistency - allowing different nodes to temporarily store different versions of data and emphasizing a practical path of "eventual consistency". Like the relaxed requirements for recalling historical blocks in systems such as Arweave, this does reduce network burden, but it also weakens the guarantees of immediate data availability and integrity.
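To make the two-tier idea concrete, here is a deliberately simplified sketch of XOR-based secondary redundancy. This is not RedStuff's actual algorithm - its slice construction is considerably more involved - and the single parity slice, function names, and parameters below are all illustrative assumptions based only on the text's mention of XOR combinations.

```python
# Toy two-tier slicing: k primary slices plus one XOR parity slice as a
# stand-in for "secondary slices". Illustration only, NOT real RedStuff.

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def split_with_parity(block: bytes, k: int):
    """Split `block` into k equal, zero-padded primary slices and
    compute one parity slice as the XOR of all of them."""
    size = -(-len(block) // k)  # ceiling division
    primary = [block[i * size:(i + 1) * size].ljust(size, b"\0")
               for i in range(k)]
    parity = primary[0]
    for s in primary[1:]:
        parity = xor_bytes(parity, s)
    return primary, parity

def rebuild(survivors: dict, parity: bytes) -> bytes:
    """Recover a single lost primary slice: XOR the parity with every
    surviving primary slice; what remains is the missing slice."""
    out = parity
    for s in survivors.values():
        out = xor_bytes(out, s)
    return out

primary, parity = split_with_parity(b"walrus hot data", 4)
survivors = {i: s for i, s in enumerate(primary) if i != 2}  # slice 2 lost
assert rebuild(survivors, parity) == primary[2]
```

A single parity slice tolerates only one loss; the appeal of XOR-style secondary slices is that such recovery costs only cheap bitwise operations rather than full polynomial decoding, which matches the elastic, lower-guarantee role the paragraph above describes.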
