author pukkamustard <pukkamustard@posteo.net> 2020-10-29 12:35:05 +0100
committer pukkamustard <pukkamustard@posteo.net> 2020-10-29 12:35:05 +0100
commit b6db6ca40efe5bb1899f5dd2301b44d9f47e07d9
tree 767987adf16b0c9b8afb2a7a07989586f59b2507 /doc
parent 2d4148f78204a418a431f1c96fd5c1e4629e0576
eris.adoc: use binary prefixes for units
Diffstat (limited to 'doc')
-rw-r--r-- doc/eris.adoc 34
1 files changed, 18 insertions, 16 deletions
diff --git a/doc/eris.adoc b/doc/eris.adoc
index 0947955..9c8bd18 100644
--- a/doc/eris.adoc
+++ b/doc/eris.adoc
@@ -35,7 +35,7 @@ The objectives of ERIS are:
Availability :: Content encoded with ERIS can be easily replicated and cached.
Authenticity :: Authenticity of content can be verified efficiently.
URN reference :: ERIS encoded content can be referenced with a single URN.
-Storage efficiency :: ERIS can be used to encode small content (< 1Kb) as well as large content (> many Gb) with reasonable storage overhead.
+Storage efficiency :: ERIS can be used to encode small content (< 1 kibibyte) as well as large content (> many gibibyte) with reasonable storage overhead.
Simplicity :: The encoding should be as simple as possible in order to allow correct implementation on various platforms and in various languages.
@@ -56,7 +56,7 @@ ERIS is inspired and based on the encoding used in the file-sharing application
ERIS differs from ECRS in following points:
Cryptographic primitives :: ECRS itself does not specify any cryptographic primitives but the GNUNet implementation uses the SHA-512 hash and AES cipher. ERIS uses the Blake2b-256 cryptographic hash <<RFC7693>> and the ChaCha20 stream cipher <<RFC8439>>. This improves performance, storage efficiency (as hash references are smaller) and allows a convergence secret to be used (via Blake2b keyed hashing; see <<_convergence_secret>>).
-Block size :: ECRS uses a fixed block size of 32 Kb. This is inefficient when encoding small content. ERIS allows a block size of 1 Kb or 32 Kb, allowing efficient encoding of small and large content (see <<_block_size>>).
+Block size :: ECRS uses a fixed block size of 32 KiB. This is inefficient when encoding small content. ERIS allows a block size of 1 KiB or 32 KiB, allowing efficient encoding of small and large content (see <<_block_size>>).
URN :: ECRS does not specify an URN for referring to encoded content (this is specified as part of the GNUNet file-sharing application). ERIS specifies an URN for encoded content regardless of encoding application or storage and transport layer.
Namespaces :: ECRS defines two mechanisms for grouping and discovering encoded content (SBlock and KBlock). ERIS does not specify any such mechanisms (see <<_namespaces>>).
@@ -66,6 +66,8 @@ Other related projects include Tahoe-LAFS and Freenet. The reader is referred to
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 <<RFC2119>>.
+We use binary prefixes for multiples of bytes, i.e., 1024 bytes is 1 kibibyte (KiB), 1024 kibibytes is 1 mebibyte (MiB) and 1024 mebibytes is 1 gibibyte (GiB).
+
TODO a glossary of terms used.
== Specification of ERIS
@@ -98,19 +100,19 @@ This is the padding algorithm implemented in https://libsodium.gitbook.io/doc/pa
=== Block Size
-ERIS uses two block sizes: 1Kb and 32Kb. The block size must be specified when encoding content.
+ERIS uses two block sizes: 1KiB (1024 bytes) and 32KiB (32768 bytes). The block size must be specified when encoding content.
-Both block sizes can be used to encode content of arbitrary size. The block size of 1Kb is an optimization towards smaller content.
+Both block sizes can be used to encode content of arbitrary size. The block size of 1KiB is an optimization towards smaller content.
-Content smaller than TODO SHOULD be encoded with block size 1Kb, content larger than TODO SHOULD be encoded with block size 32Kb.
+Content smaller than TODO SHOULD be encoded with block size 1KiB, content larger than TODO SHOULD be encoded with block size 32KiB.
The block size is encoded in the read capability and the decoding process is capable of handling both cases.
[NOTE]
====
-When using block size 32Kb to encode content smaller than 1Kb, the content will be encoded in a 32Kb block. This is a storage overhead of over 3100%. When encoding very many pieces of small content (e.g. short messages or cartographic nodes) this overhead is not acceptable.
+When using block size 32KiB to encode content smaller than 1KiB, the content will be encoded in a 32KiB block. This is a storage overhead of over 3100%. When encoding very many pieces of small content (e.g. short messages or cartographic nodes) this overhead is not acceptable.
-On the other hand, using small block sizes increases the number of internal nodes that must be used to encode the content (see <<_collect_reference_key_pairs_in_nodes>>). When encoding larger content it is more efficient to use a block size of 32Kb.
+On the other hand, using small block sizes increases the number of internal nodes that must be used to encode the content (see <<_collect_reference_key_pairs_in_nodes>>). When encoding larger content it is more efficient to use a block size of 32KiB.
====
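
The overhead figure in the note above can be checked with a little arithmetic. The following is only an illustrative sketch: the function name is hypothetical, and it assumes the content fits in a single block and counts only that block (not the read capability).

```python
def single_block_overhead_percent(content_len: int, block_size: int) -> float:
    """Storage overhead, in percent, of storing `content_len` bytes in one block.

    Assumption: the content is strictly smaller than the block, so it is
    padded up to exactly one block of `block_size` bytes.
    """
    if not 0 < content_len < block_size:
        raise ValueError("content must be non-empty and smaller than one block")
    wasted = block_size - content_len
    return wasted / content_len * 100


# Content just under 1 KiB stored in a 32 KiB block versus a 1 KiB block.
print(single_block_overhead_percent(1023, 32768))
print(single_block_overhead_percent(1023, 1024))
```

For any content below 1024 bytes the first figure exceeds 3100 %, while the 1 KiB block keeps the overhead for the same content below 1 %.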
=== Convergence Secret
@@ -129,7 +131,7 @@ Inputs to the encoding process are:
`CONTENT` :: An arbitrary length byte sequence of content to be encoded.
`CONVERGENCE-SECRET` :: A 256 bit (32 byte) byte sequence (see <<_convergence_secret>>).
-`BLOCK-SIZE` :: The block size used for encoding in bytes can be either 1024 (1Kb) or 32768 (32Kb) (see <<_block_size>>).
+`BLOCK-SIZE` :: The block size used for encoding in bytes can be either 1024 (1KiB) or 32768 (32KiB) (see <<_block_size>>).
Content is encoded by first splitting into uniformly sized blocks, encrypting the blocks and computing references to the blocks. If there are multiple references to blocks they are collected in nodes that have the same size as content blocks. The nodes are encrypted and references to the nodes are computed. This process is repeated until there is a single root reference.
@@ -137,7 +139,7 @@ References to nodes and blocks of content consist of a reference to an encrypted
The encoding process constructs a tree of reference-key pairs that reference nodes that hold references to nodes of a lower level or to content.
-The number of reference-key pairs collected into a node is called the _arity_ of the tree and depends on the block size. For block size 1Kb the arity of the tree is 16, for block size 32Kb the arity is 512.
+The number of reference-key pairs collected into a node is called the _arity_ of the tree and depends on the block size. For block size 1KiB the arity of the tree is 16, for block size 32KiB the arity is 512.
An encoding of a content that is split into eight blocks is depicted in <<figure_merkle_tree>>. For illustration purposes the tree is of arity 2 (instead of 16 or 512).
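
The arity values follow from how many reference-key pairs fit into one block. The sketch below is a hedged illustration, not the specified algorithm: it assumes each pair occupies 64 bytes (a 32-byte reference plus a 32-byte key, as in the read-capability layout) and that padding always adds at least one byte, so content whose length is an exact multiple of the block size gains one extra block. The function names are hypothetical.

```python
REFERENCE_KEY_PAIR_SIZE = 64  # assumed: 32-byte reference + 32-byte key


def arity(block_size: int) -> int:
    """Number of reference-key pairs that fit into one node."""
    return block_size // REFERENCE_KEY_PAIR_SIZE


def root_level(content_len: int, block_size: int) -> int:
    """Estimate the level of the root reference-key pair.

    Assumption: padded content always occupies content_len // block_size + 1
    blocks (padding adds at least one byte).
    """
    nodes = content_len // block_size + 1
    level = 0
    while nodes > 1:
        # Ceiling division: group the nodes of this level into parent nodes.
        nodes = -(-nodes // arity(block_size))
        level += 1
    return level


print(arity(1024), arity(32768))
print(root_level(10, 1024))
```

Under these assumptions the sketch gives levels 5, 2 and 3 for 100 MiB at 1 KiB blocks, 1 GiB at 32 KiB blocks and 256 GiB at 32 KiB blocks, which agrees with the root levels listed for the large-content test vectors.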
@@ -326,7 +328,7 @@ We specify an binary encoding of the read-capability 66 bytes:
|===
|Byte offset | Content | Length (in bytes)
-| 0 | block size (`0x00` for block size 1Kb and `0x01` for block size 32Kb)| 1
+| 0 | block size (`0x00` for block size 1KiB and `0x01` for block size 32KiB)| 1
| 1 | level of root reference-key pair as unsigned integer | 1
| 2 | root reference | 32
| 34 | root key | 32
@@ -340,7 +342,7 @@ TODO using 1 byte to encode level limits size of content that can be encoded. Ad
A read-capability can be encoded as an URN: `urn:eris:BASE32-READ-CAPABILITY`, where `BASE32-READ-CAPABILITY` is the unpadded Base32 <<RFC4648>> encoding of the read capability.
-For example the ERIS URN of the UTF-8 encoded string "Hail ERIS!" (with block size 1Kb and null convergence secret):
+For example the ERIS URN of the UTF-8 encoded string "Hail ERIS!" (with block size 1KiB and null convergence secret):
`urn:erisx2:AAAAV4OIFHWY67XFEHAOQVXUOWTYDVG5TEY6S6IW4PJ4SQLVJJF4MIKNDLKUDPPHDCKLBUIAJQ3U2IEARRPFHEHWFW5NJY7BJUGFESPGDQ`
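
As a rough illustration of the binary layout and URN encoding described above, the following sketch decodes a URN back into its fields. It is not a reference implementation: the function name and the returned dictionary are hypothetical, and it takes everything after the last colon as the Base32 payload so that either URN prefix shown in the text is accepted.

```python
import base64

READ_CAPABILITY_LENGTH = 66
BLOCK_SIZES = {0x00: 1024, 0x01: 32768}  # byte 0 of the read capability


def decode_read_capability(urn: str) -> dict:
    """Split an ERIS URN into block size, level, root reference and root key."""
    encoded = urn.rsplit(":", 1)[1]
    # The URN uses unpadded Base32; Python's decoder needs the "=" padding back.
    padded = encoded + "=" * (-len(encoded) % 8)
    raw = base64.b32decode(padded)
    if len(raw) != READ_CAPABILITY_LENGTH:
        raise ValueError("read capability must be 66 bytes")
    return {
        "block_size": BLOCK_SIZES[raw[0]],
        "level": raw[1],
        "reference": raw[2:34],
        "key": raw[34:66],
    }


cap = decode_read_capability(
    "urn:erisx2:AAAAV4OIFHWY67XFEHAOQVXUOWTYDVG5TEY6S6IW4PJ4SQLVJJF4MIKN"
    "DLKUDPPHDCKLBUIAJQ3U2IEARRPFHEHWFW5NJY7BJUGFESPGDQ"
)
print(cap["block_size"], cap["level"])
```

For the "Hail ERIS!" URN this reports a block size of 1024 bytes and a root level of 0, since the content fits in a single block.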
@@ -369,8 +371,8 @@ The test vectors are given as machine-readable JSON files. For example the test
----
{
"id": 0,
- "name": "short string (block size 1Kb)",
- "description": "Encode the UTF-8 encoding of the string \"Hail ERIS!\" with block-size 1Kb and null convergence-secret.",
+ "name": "short string (block size 1KiB)",
+ "description": "Encode the UTF-8 encoding of the string \"Hail ERIS!\" with block-size 1KiB and null convergence-secret.",
"content": "JBQWS3BAIVJESUZB",
"convergence-secret": "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
"block-size": 1024,
@@ -408,9 +410,9 @@ In order to verify implementations that encode content by streaming (see <<_stre
|===
|Test name | Content size | Block size | URN | Level of root reference
-| 100Mb (block size 1Kb) | 100Mb| 1Kb | `urn:erisx2:AAC5BQH6DQVWAHSRUCORMNEWDGLSONZC3NDAP4U6W5WMHFK4MI5U2GMXS5A5L3GZGNBN3Z7MER4SJOJE5USLQZNXMSV3QN3IKFP36SX6YU` | 5
-| 1Gb (block size 32Kb) | 1Gb | 32Kb | `urn:erisx2:AEBN2DAP2ZYSW4EI5SOBABN2DDNVCREGYHYRVVLKDC4Z4E4FOANV7ZJCT2VA4OMKRJGFGFZWMNFZEEN2PW27V527DXYWEKEQ7KXWM4D4WU` | 2
-| 256Gb (block size 32Kb) | 256Gb | 32Kb | `urn:erisx2:AEB6PWLNQGCT2OXYQV4YWZISEMUEROYNRM4BHMMYLLWPFVIVPT7KJETE2SLO7ALMT5GDSGJZOP6YFLRI7NAIKSEUI6TLDFPOBSZPIXKJE4` | 3
+| 100Mb (block size 1Kb) | 100Mb| 1Kb | `urn:erisx2:AACXPZNDNXFLO4IOMF6VIV2ZETGUJEUU7GN4AHPWNKEN6KJMCNP6YNUMVW2SCGZUJ4L3FHIXVECRZQ3QSBOTYPGXHN2WRBMB27NXDTAP24` | 5
+| 1Gb (block size 32Kb) | 1Gb | 32Kb | `urn:erisx2:AEBFG37LU5BM5N3LXNPNMGAOQPZ5QTJAV22XEMX3EMSAMTP7EWOSD2I7AGEEQCTEKDQX7WCKGM6KQ5ALY5XJC4LMOYQPB2ZAFTBNDB6FAA` | 2
+| 256Gb (block size 32Kb) | 256Gb | 32Kb | `urn:erisx2:AEBZHI55XJYINGLXWKJKZHBIXN6RSNDU233CY3ELFSTQNSVITBSVXGVGBKBCS4P4M5VSAUOZSMVAEC2VDFQTI5SEYVX4DN53FTJENWX4KU` | 3
|===
Content is the ChaCha20 stream using a null nonce and the key which is the Blake2b hash of the UTF-8 encoded test name (e.g. `KEY = Blake2b("100Mb (block size 1Kb)")`).