Encoding for Robust Immutable Storage (ERIS)

ERIS allows you to make content available robustly over computer networks by using identifiers that are location independent and network-optimized. It can be used as a building block for decentralized and fault-tolerant applications, some examples being:

file-sharing
de-duplication of content
reproducible and deterministic identifiers of content

ERIS is a network-optimized form of content-addressing. It defines an encoding from content to a set of uniformly sized blocks and a short identifier, such that the content can be securely decoded from the blocks and the identifier. Blocks can be transferred redundantly over different network protocols or sneakernets. The uniform and small block size prevents information leakage and allows quick recovery from transport errors.

At its core, ERIS defines the encoding from content to blocks and a short identifier, which we call a read capability. The encoding is formally described in the specification.

To practically use ERIS for file-sharing or other applications you will need a way of encoding and decoding content to and from blocks as well as a way of transporting blocks over space and/or time. There are libraries that you can use to encode and decode content. The easiest way to transfer blocks is to encode them in a CBOR file and send the file using e-mail or on a USB stick. You could also use CoAP, HTTP or many other suitable protocols and networks.

We are developing tools that allow practical and usable file-sharing with block transfer and storage built-in (see for example eris-go). This is still very much work in progress. If you are interested in learning about and using ERIS, the best way to get started is to get in touch.

The ideas behind ERIS are not new. What makes ERIS different is that the encoding and format of identifiers (read capabilities) are defined independently of application or transport used. The vision is that content becomes decentralized, robustly available and interoperable among applications and protocols.

FAQ

What is content-addressing?

Content on computer networks is usually identified by the physical location where it is stored. For example, http://example.com/foo.txt identifies some text by the network address where it is hosted. Availability of the content is dependent on the reachability of the host, and caching is complicated because the content receives a new location when cached.

An alternative to identifying content by its location is to identify it by the content itself. This is called content-addressing. The hash of some content is computed and used as a unique identifier for the content.

Caching content-addressed content and making it available redundantly is much easier as the content is completely decoupled from any physical location. Integrity of content is automatically ensured with content-addressing (when using a cryptographic hash) as the identifier of the content can be recomputed to check that the content matches the requested identifier.
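
As a minimal sketch of the idea (not ERIS itself), the following Python snippet derives an identifier from content and re-checks it on retrieval. The choice of SHA-256 and the function names content_id and verify are assumptions made for illustration only.

```python
import hashlib


def content_id(content: bytes) -> str:
    """Derive an identifier from the content alone (SHA-256 is an assumed choice)."""
    return hashlib.sha256(content).hexdigest()


def verify(content: bytes, identifier: str) -> bool:
    """Recompute the identifier and compare it with the requested one."""
    return content_id(content) == identifier


identifier = content_id(b"some text")
# Any cache or mirror can serve the bytes; the receiver checks them itself.
intact = verify(b"some text", identifier)
tampered = verify(b"some t3xt", identifier)
```

Because the check depends only on the bytes received, it does not matter which host or cache delivered them.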

How does ERIS do content-addressing?

Content-addressing by using the cryptographic hash of the content has certain drawbacks:

Large content is stored as a large chunk of data. In order to optimize storage and network operations it is better to split up content into smaller uniformly sized blocks and reassemble blocks when needed.
Unencrypted: Content is readable by all peers involved in transporting, caching and storing content.

ERIS addresses these issues by splitting content into small uniformly sized and encrypted blocks that are recursively content-addressed to form a tree. See the specification for an exact description of the process.
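
The tree structure can be illustrated with a deliberately simplified sketch. It splits content into uniformly sized blocks and recursively content-addresses them, but it omits encryption entirely and is not the ERIS encoding. The hash function (BLAKE2b with a 32-byte digest), the padding scheme and all names below are assumptions for illustration; the specification is authoritative.

```python
import hashlib

BLOCK_SIZE = 1024  # the smaller of the two ERIS block sizes
REF_SIZE = 32      # assumed size of one block reference in this sketch


def pad(data: bytes, block_size: int) -> bytes:
    """Pad to a multiple of block_size (assumed scheme: 0x80, then zeros)."""
    padded = data + b"\x80"
    return padded + b"\x00" * (-len(padded) % block_size)


def split(data: bytes, block_size: int) -> list[bytes]:
    """Cut data whose length is a multiple of block_size into blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def build_tree(content: bytes, block_size: int = BLOCK_SIZE):
    """Return (root_reference, store), where store maps reference -> block."""
    store = {}
    level = split(pad(content, block_size), block_size)
    while True:
        refs = []
        for block in level:
            ref = hashlib.blake2b(block, digest_size=REF_SIZE).digest()
            store[ref] = block
            refs.append(ref)
        if len(refs) == 1:
            return refs[0], store
        # Pack the references into uniformly sized blocks for the next level.
        joined = b"".join(refs)
        joined += b"\x00" * (-len(joined) % block_size)
        level = split(joined, block_size)


content = b"".join(i.to_bytes(4, "big") for i in range(1500))  # 6000 bytes
root, store = build_tree(content)
```

Every block in the store has the same size, so an observer cannot tell data blocks from tree blocks by length alone. In ERIS the blocks are additionally encrypted, which this sketch leaves out.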

How is ERIS different from BitTorrent, IPFS, Gnutella, Freenet, Tahoe-LAFS, GNUnet, libchop and other similar protocols?

Encodings similar to ERIS are already widely-used in many applications and protocols. However, they all use slightly different encodings that are tied to the respective protocols and applications. ERIS is an attempt to define an encoding independent of any specific protocol or application and decouple content from transport or storage layers.

A major improvement compared to BitTorrent, IPFS and Gnutella is that content is always encrypted with ERIS. Content cannot be directly inferred from the blocks, and in certain circumstances intermediary peers that store and transport blocks can claim that decrypting encoded content is infeasible for them - a security property we call Intermediary Peer Deniability.

Why is the identifier of content called a read capability?

We borrow the term from capability-based security because these URNs can (and should) be treated as permanent and unforgeable data access tokens.

How do ERIS read capabilities look?

ERIS read capabilities are encoded as 66 bytes of binary content. Alternatively they can be encoded as URNs. This allows ERIS read capabilities to be used and embedded in many applications and formats.

An ERIS URN looks like this:

urn:eris:BIAD77QDJMFAKZYH2DXBUZYAP3MXZ3DJZVFYQ5DFWC6T65WSFCU5S2IT4YZGJ7AC4SYQMP2DM2ANS2ZTCP3DJJIRV733CRAAHOSWIYZM3M
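
The relationship between the URN and the 66-byte binary form can be checked with a short Python sketch. It assumes that the part after urn:eris: is unpadded Base32 (RFC 4648), which is consistent with the 106 characters of the example above. The field layout in the last lines (one byte for the block size, one for the tree level, then a 32-byte reference and a 32-byte key) is likewise an assumption not stated on this page; consult the specification for the authoritative layout.

```python
import base64

URN = ("urn:eris:BIAD77QDJMFAKZYH2DXBUZYAP3MXZ3DJZVFYQ5DFWC6T65WSFCU5S2IT4YZGJ7AC4SYQMP"
       "2DM2ANS2ZTCP3DJJIRV733CRAAHOSWIYZM3M")


def decode_read_capability(urn: str) -> bytes:
    """Decode an ERIS URN into its binary form (assumes unpadded Base32)."""
    prefix = "urn:eris:"
    if not urn.startswith(prefix):
        raise ValueError("not an ERIS URN")
    text = urn[len(prefix):]
    # Python's decoder requires '=' padding up to a multiple of 8 characters.
    text += "=" * (-len(text) % 8)
    raw = base64.b32decode(text)
    if len(raw) != 66:
        raise ValueError(f"expected 66 bytes, got {len(raw)}")
    return raw


raw = decode_read_capability(URN)

# Assumed field layout (hypothetical names, see the lead-in above).
block_size_code, level = raw[0], raw[1]
reference, key = raw[2:34], raw[34:66]
```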

Why are there two block sizes (32KiB and 1KiB)?

The 32KiB block size is well-suited to, and should be used for, encoding large content (from 16KiB to many GiBs). For small content (< 16KiB) a block size of 32KiB adds considerable overhead. This is especially costly when many small pieces of content need to be encoded and identified independently. For this case ERIS offers a block size of 1KiB.
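
The overhead can be estimated with a little arithmetic. The sketch below assumes that content is always extended by at least one padding byte and then rounded up to a whole number of blocks; the function name padded_size and this padding rule are illustrative assumptions, and tree blocks are ignored.

```python
KIB = 1024


def padded_size(content_len: int, block_size: int) -> int:
    """Bytes occupied by data blocks under the assumed padding rule."""
    return (content_len // block_size + 1) * block_size


# A 500-byte object, roughly the size of a small ActivityStreams object.
small_with_1k = padded_size(500, 1 * KIB)
small_with_32k = padded_size(500, 32 * KIB)
ratio = small_with_32k // small_with_1k
```

Under these assumptions the 500-byte object occupies one 1KiB block or one 32KiB block, so the larger block size stores 32 times as many bytes for the same content.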

The small block size was initially motivated by an anecdote that most ActivityStreams objects are smaller than 1KiB.

See the section on block size in the specification as well as some projects using ERIS for encoding small pieces of content (e.g. openEngiadina and Ctrl + All.).

How stable is ERIS? Will there be any breaking changes?

The ERIS specification has been stabilized. The encoding and the identifier format will not be changed. You may use and embed ERIS identifiers in your data and applications without fearing breaking changes.

Backwards incompatible changes will only be made when absolutely necessary (for security reasons) and with an updated identifier format that will keep existing identifiers working (see the section on versioning in the specification).

How can blocks be transported and stored?

Blocks of ERIS-encoded content are uniformly sized (either 1KiB or 32KiB) and can be transferred over protocols such as HTTP, FTP, IPFS, CoAP, GNUnet, BitTorrent and Named Data Networking, or via sneakernets.

What's the overhead of ERIS?

ERIS capabilities are larger than hash digests: a typical hash digest is 32 bytes whereas an ERIS capability is 66 bytes. Systems that use these capabilities internally therefore generate larger internal metadata. We believe this cost is offset by two benefits: the identifiers enable partial verification using hash trees, and a standardized URN representation enables data portability between applications.

Storing ERIS encoded data comes with the cost of storing a tree of metadata blocks describing data blocks as well as some data padding. The amount of overhead can vary, but the worst case scenario for storing a 1GiB file is 2MiB of tree blocks.
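
The 2MiB figure can be reproduced with a rough calculation. The sketch below assumes 32KiB blocks, that each entry in a tree block takes 64 bytes (a 32-byte reference plus a 32-byte key, which gives 512 entries per block) and that padding always adds at least one byte. These parameters and the function name tree_blocks are assumptions for illustration, not values taken from this page.

```python
KIB = 1024
GIB = 1024 ** 3


def tree_blocks(content_len: int, block_size: int, entry_size: int = 64) -> int:
    """Count the metadata (tree) blocks needed above the data blocks."""
    arity = block_size // entry_size         # entries per tree block
    blocks = content_len // block_size + 1   # data blocks including padding
    total = 0
    while blocks > 1:
        blocks = -(-blocks // arity)         # ceiling division
        total += blocks
    return total


count = tree_blocks(1 * GIB, 32 * KIB)
overhead_bytes = count * 32 * KIB
overhead_mib = overhead_bytes / (1024 * 1024)
```

Under these assumptions a 1GiB file needs 66 tree blocks, or about 2.06MiB, which agrees with the figure quoted above.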

All data that is encoded for convergent capabilities has some implicit deduplication during storage, regardless of the storage medium.
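
A minimal sketch of why this deduplication is implicit: if blocks are stored under a reference derived from their own bytes, then storing an identical block a second time cannot create a second copy. The BlockStore class and the use of BLAKE2b are illustrative assumptions, and the sketch presumes that identical input encodes to identical blocks.

```python
import hashlib


class BlockStore:
    """Hypothetical in-memory store that keys blocks by their hash."""

    def __init__(self) -> None:
        self._blocks: dict[bytes, bytes] = {}

    def put(self, block: bytes) -> bytes:
        ref = hashlib.blake2b(block, digest_size=32).digest()
        # An identical block maps to the same key, so nothing new is stored.
        self._blocks.setdefault(ref, block)
        return ref

    def get(self, ref: bytes) -> bytes:
        return self._blocks[ref]

    def __len__(self) -> int:
        return len(self._blocks)


store = BlockStore()
first = store.put(b"\x00" * 1024)
second = store.put(b"\x00" * 1024)   # same block, stored again
third = store.put(b"\x01" * 1024)    # a different block
```

The same reasoning applies to any storage medium that uses the block reference as its key, whether that is a directory of files or a database table.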

Who are you?

The core group behind ERIS is known as the ERIS maintainer collective. The collective maintains the ERIS specification, develops software and ideas around it, helps with onboarding and encourages experimentation and usage.

See the contact section for how to get in touch.

Who is Endo Renberg?

Endo Renberg is a collective pseudonym used by the ERIS maintainer collective.

Is this a religious thing?

No. We share the name with a goddess, but that's about it.

Software

Tools

A list of known tools that use ERIS and can be used for file-sharing or other applications:

Tool	Documentation	Description	Status
eris-go	man 1 eris-go	Block store and encoding utility. Supports HTTP, CoAP, FUSE, 9P, CBOR among other things.	usable, in active development
kapla		Block store and encoding utility with support for bound storage and pluggable transports.	in conception
eristekt		Browser extension that can transparently resolve ERIS URNs, encode content, connect to CoAP endpoints over WebSockets and store blocks in a browser native storage.	currently not usable, update planned
oebstly		Block store and encoding utility. Implements sub-stores with quotas and garbage-collection.	no longer actively developed. Usage not recommended.

Libraries

A list of known implementations that satisfy the standard test vectors and can be used as libraries from various programming languages:

Language	License	Code	Documentation
Common Lisp	LGPL-3.0-or-later	eris-cl
Go	BSD-3-Clause	codeberg.org/eris/eris-go	godocs.io/codeberg.org/eris/eris-go
Guile	GPL-3.0-or-later	guile-eris	eris.codeberg.page/guile-eris
Nim	Unlicense	nim-eris	eris.codeberg.page/nim-eris
OCaml	AGPL-3.0-or-later	ocaml-eris	eris.codeberg.page/ocaml-eris/eris/Eris
Python	AGPL-3.0-or-later	python-eris	eris.codeberg.page/python-eris
Rust	AGPL-3.0	eris-rs
Smalltalk	ISC	ERIS

Additional implementations are being developed (Wisp).

Integrations

A list of known ERIS components that offer interfaces for high-level systems:

System	Implementation	Interface	Documentation
Linux (Unix)	codeberg.org/eris/eris-go	FUSE	man 1 eris-go
9FRONT (Plan 9)	codeberg.org/eris/eris-go	9P2000	man 1 eris-go

Contact

We are inclusive and happy to help and hear about your thoughts and ideas.

For comments or questions please use the mailing list.

Ephemeral discussions take place in the #eris channel on the Libera IRC network.

Urgent and sensitive security issues may be addressed directly to the ERIS maintainers. See security.txt.