2 years ago · 58d1f685c1
--- a/doc/Book/Book.tex
+++ b/doc/Book/Book.tex
@@ -267,19 +267,20 @@ fields. The empty sequence of components is called the root path.
 
				 The root path is a directory, and its entries are the blocks of the global blocktree and links
			
 
				 to every private blocktree.
			
 
				 A path to a private blocktree has only
			
 
				-one component, consisting of the Hash of the private blocktree's root credentials. Any path with
			
 
				-only one components which consists of a Standard Hash
			
 
				-of the blocktree's root credentials is a valid path to the blocktree. These paths are
			
 
				+one component, consisting of the Hash of the private blocktree's root public signing key. Any path
			
 
				+with only one component, which consists of a fingerprint
			
 
				+of the blocktree's root public key is a valid path to the blocktree. These paths are
			
 
				 called the root paths of the private blocktree, and any path that begins with one of them
			
 
				-is simply called a private path. Conversely, paths that do not begin with them are called public.
			
 
				+is simply called a private path. Conversely, paths that do not begin with them are called public
			
 
				+paths.
			
 
				 
			
 
				-If one path is a prefix of a second, then we say the first path contains the second. Thus every path
			
 
				-is contained in the root path, and every private path is contained in the root path of a private
			
 
				-blocktree.
			
 
				+If one path is a prefix of a second, then we say the first path contains the second and that the
			
 
				+second is nested in the first.
			
 
				+Thus every path is contained in the root path, and every private path is contained in the root path
			
 
				+of a private blocktree.
			
 
				+Note that if path one is equal to path two, then path one also contains path two, and vice-versa.
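The containment relation described here is a prefix test on sequences of components. The following is a minimal sketch, assuming a path is modelled as a tuple of strings; the function names and the example fingerprint string are hypothetical and do not come from the Blocktree code base.

```python
from typing import Sequence


def contains(first: Sequence[str], second: Sequence[str]) -> bool:
    """Return True if `first` is a prefix of `second`, i.e. `first` contains `second`."""
    return len(first) <= len(second) and tuple(second[:len(first)]) == tuple(first)


def is_nested_in(second: Sequence[str], first: Sequence[str]) -> bool:
    """Return True if `second` is nested in `first` (the converse of `contains`)."""
    return contains(first, second)


# The root path is the empty sequence of components.
ROOT_PATH = ()
# Hypothetical fingerprint of a root public signing key (illustrative only).
EXAMPLE_FINGERPRINT = "0!exampleFingerprint"
PRIVATE_ROOT = (EXAMPLE_FINGERPRINT,)
EXAMPLE_BLOCK = (EXAMPLE_FINGERPRINT, "docs", "notes")
```

Under this model the root path contains every path, a private root path contains every private path that begins with it, and a path contains itself.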
			
 
				 
			
 
				-In addition to
			
 
				-identifying blocks, paths are used to scope capabilities and to address
			
 
				-messages to nodes and to the processes they're running.
			
 
				+In addition to identifying blocks, paths are used to scope capabilities and to address messages.
			
 
				 
			
 
				 \section{Principals}
			
 
				 A principal is any entity which can be authenticated.
			
@@ -293,29 +294,31 @@ for the following hash algorithms is required:
 
				 These are referred to as the Standard Hash Algorithms and a digest computed using one of them is
			
 
				 referred to as a Standard Hash.
			
 
				 
			
 
				-When a principal is identified in a textural representation, such as the textural representation of
			
 
				-a path, then the following syntax is used:
			
 
				+When a principal is identified in a textual representation, such as in a path,
			
 
				+the following syntax is used:
			
 
				 \begin{verbatim}
			
 
				     PrincipTxt ::= <hash algo index> '!' <Base64Url(hash)>
			
 
				 \end{verbatim}
			
 
				 where ``hash algo index'' is the base 10 string representation of the index of the hash algorithm in
			
 
				 the  above list, and ``Base64Url(hash)'' is the Base64Url encoding of the hash data identifying the
			
 
				-principal.
			
 
				+principal. Such a textual representation is referred to as a fingerprint of the public key from
			
 
				+which the hash was computed. In a slight abuse of language, we sometimes refer to the fingerprint or
			
 
				+hash data as a principal, even though these data merely identify a principal.
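As an illustration of the PrincipTxt syntax, a fingerprint could be produced and parsed roughly as follows. This is a sketch under several assumptions that the text does not confirm: that SHA-256 is one of the Standard Hash Algorithms, that its index in the list is 0, that the key is hashed as a raw byte string, and that Base64Url padding characters are kept. The function names are hypothetical.

```python
import base64
import hashlib

# Assumption: the index of SHA-256 in the list of Standard Hash Algorithms.
SHA256_INDEX = 0


def fingerprint(public_key_bytes: bytes, algo_index: int = SHA256_INDEX) -> str:
    """Build '<hash algo index>!<Base64Url(hash)>' from the bytes of a public key."""
    digest = hashlib.sha256(public_key_bytes).digest()
    # urlsafe_b64encode keeps '=' padding; whether the format keeps it is an assumption.
    encoded = base64.urlsafe_b64encode(digest).decode("ascii")
    return f"{algo_index}!{encoded}"


def parse_fingerprint(text: str) -> tuple[int, bytes]:
    """Split a fingerprint into its algorithm index and the raw hash bytes."""
    index_str, encoded = text.split("!", 1)
    return int(index_str), base64.urlsafe_b64decode(encoded)
```

Because the Base64Url alphabet does not include the character '!', splitting on the first '!' recovers the two fields unambiguously.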
			
 
				 
			
 
				 Principals can be issued capabilities which allow them to read and write to blocks.
			
 
				 These capabilities are scoped by paths, which limit the set of blocks the capability can be used
			
 
				-for. A capability can only be used on a block whose path is contained in the path of the
			
 
				-capability. This access
			
 
				-control mechanism is enforced cryptographically, as described below.
			
 
				+on. A capability can only be used on a block whose path is contained in the path of the
			
 
				+capability. This access control mechanism is enforced cryptographically.
			
 
				 
			
 
				 A principal can grant a capability to another principal so long as the granted capability has a
			
 
				 path which is contained in the capability possessed by the granting node. The specific mechanisms
			
 
				-for how this is done differs depending on whether the capability is reading or writing to a block.
			
 
				+for how this is done differ depending on whether the capability is for reading or writing to a
			
 
				+block.
			
 
				 
			
 
				 Every private blocktree is associated with a root principal. The path containing only one
			
 
				-component which consists of a hash of the root principal's public key is a root path to the private 
			
 
				-block tree. The root principal has read and write capabilities for the private blocktree's root,
			
 
				-and so can grant subordinate capabilities scoped to any block in the private blocktree.
			
 
				+component which consists of the fingerprint of the root principal's public key is a root path to the
			
 
				+private blocktree. The root principal has read and write capabilities for the private blocktree's
			
 
				+root, and so can grant subordinate capabilities scoped to any block in the private blocktree.
			
 
				 
			
 
				 \section{Readcaps}
			
 
				 In order to protect the confidentiality of the data stored in a block, a symmetric cipher is used.
			
@@ -323,16 +326,16 @@ The key for this cipher is called block key, and by controlling access to it we
 
				 principals can read the block.
			
 
				 
			
 
				 The metadata of every block contains a dictionary of zero or more entries called read capabilities,
			
 
				-or readcaps for short. A key in this dictionary is a hash of the principal for which the
			
 
				-readcap was issued and the value is the encryption of the block key using the principal's public
			
 
				-encryption key.
			
 
				+or readcaps for short. A key in this dictionary is a hash of the public signing key of the principal
			
 
				+for which the readcap was issued and the value is the encryption of the block key using the
			
 
				+principal's public encryption key.
			
 
				 
			
 
				 The block key is also encrypted using the block key of the parent block, and the resulting cipher
			
 
				-text is stored in the block's metadata. This is referred to as the inherited readcap.
			
 
				+text is also stored in the block's metadata. This is referred to as the inherited readcap.
			
 
				 Hence, a principal which has a read capability for a
			
 
				-given path can read all paths contained in it as well. Further, a principal can use it's readcap
			
 
				-to decrypt the block key, and then re-encrypt it using the public encryption key of another
			
 
				-principal, thus allowing the principal to grant a subordinate readcap.
			
 
				+given path can read all paths contained in it as well. Further, a principal can use its
			
 
				+private encryption key to decrypt the block key, and then re-encrypt it using the public encryption
			
 
				+key of another principal, thus allowing the principal to grant a subordinate readcap.
			
 
				 
			
 
				 A block that does not require confidentiality protection need not be encrypted. In this case, the
			
 
				 table of readcaps is empty and the inherited readcap is set to a flag value to indicate the block
			
@@ -343,11 +346,11 @@ A write capability, or writecap for short, is a certificate chain which extends
 
				 credentials back to the root signing key of a private blocktree.
			
 
				 Each certificate in this chain contains
			
 
				 the public key of the principal who issued it and the path that it grants write capabilities to.
			
 
				-The path of a certificate must be contained in the path of the next certificate in the chain, and
			
 
				-the chain must terminate in a certificate which is self-signed by the root signing credentials.
			
 
				+The path of a certificate must be contained in the path of the previous certificate in the chain,
			
 
				+and the chain must terminate in a certificate which is self-signed by the root signing credentials.
			
 
				 Each certificate also contains an expiration time, represented by an Epoch value, and a writecap
			
 
				-shall be considered invalid by a node if any certificate has an expiration time which is judged by
			
 
				-the node to be in the past. 
			
 
				+shall be considered invalid by a node if any certificate has an expiration time which it judges 
			
 
				+to be in the past. 
			
 
				 
			
 
				 A block's metadata contains a writecap, which allows it to be verified by checking the digital
			
 
				 signature on the metadata using the first certificate in the writecap, and then checking that the
			
@@ -361,7 +364,7 @@ A block's data is logically broken up into fixed sized sectors, and these are th
 
				 confidentiality protection, integrity protection and consensus. When a process performs writes on
			
 
				 an open block, those writes are buffered until a sector is filled (or until the process calls
			
 
				 flush). Once this happens, the contents of the buffer are encrypted using the block key and then
			
 
				-hashed. This hash, along with the offset at which the write occurred is then used to update the
			
 
				+hashed. This hash, along with the offset at which the write occurred, is then used to update the
			
 
				 Merkle Tree for the block. Once the new root of the Merkle Tree is computed, it is copied into the
			
 
				 block's metadata. After ensuring its writecap is copied into the metadata, the node then signs
			
 
				 the metadata using its private signing key.
			
@@ -371,24 +374,25 @@ section. Then the value of the root Merkle Tree node in the metadata is compared
 
				 Merkle Tree for the block. If it matches, then reads to any offset into the block can be verified
			
 
				 using the Merkle Tree. Otherwise, the block is rejected as invalid.
			
 
				 
			
 
				-The sector size can be configured for each block at creation time. The sector size cannot be changed
			
 
				-after creation, but the same effect can be achieved by creating a new block with the desired sector
			
 
				-size, copying the data from the old block into, deleting the old block and renaming the new one.
			
 
				+The sector size can be configured for each block individually at creation time,
			
 
				+but cannot be changed afterwards.
			
 
				+However, the same effect can be achieved by creating a new block with the desired sector
			
 
				+size, copying the data from the old block into it, deleting the old block and renaming the new one.
			
 
				 The choice of sector size for a given block represents a tradeoff between the amount of space the
			
 
				-Merkle Tree over the blocks contents will occupy and the latency experienced by the initial read to
			
 
				+Merkle Tree occupies and the latency experienced by the initial read to
			
 
				 a sector. With a larger sector size, the Merkle Tree size is reduced, but more data has to be
			
 
				 hashed and decrypted when each sector is read. Depending on the size of the block and its intended
			
 
				 usage, the optimal choice will vary.
			
 
				 
			
 
				 It's important to note that the entire Merkle Tree for a block is held in memory for as long as the
			
 
				-block is open. So, assuming the 32 byte sha2 256 hash and the default 4 KiB sector size are used,
			
 
				+block is open. So, assuming the 32-byte SHA-256 hash and the default 4 KiB sector size are used,
			
 
				 if a 3 GiB file is opened then its Merkle Tree will occupy approximately 64 MiB of memory, but will
			
 
				 enable fast random access. Conversely, if 64 KiB sectors are used the Merkle Tree for the same
			
 
				-3 GiB block will occupy approximately 4 MiB, but it's random access performance will suffer from
			
 
				+3 GiB block will occupy approximately 4 MiB, but its random access performance may suffer from
			
 
				 increased latency.
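The memory figures quoted above can be reproduced with a short calculation. The sketch below assumes a complete binary Merkle Tree whose leaf count is rounded up to the next power of two, with every node storing one 32-byte hash. That layout is an assumption chosen because it matches the quoted numbers, not a statement about the actual implementation, and the function name is hypothetical.

```python
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3


def merkle_tree_bytes(block_size: int, sector_size: int, hash_size: int = 32) -> int:
    """Estimate the in-memory size of a Merkle Tree over a block's sectors.

    Assumes a complete binary tree padded to a power-of-two number of leaves.
    """
    sectors = -(-block_size // sector_size)  # ceiling division
    leaves = 1
    while leaves < sectors:
        leaves *= 2
    nodes = 2 * leaves - 1  # leaves plus all interior nodes
    return nodes * hash_size


small_sectors = merkle_tree_bytes(3 * GIB, 4 * KIB)
large_sectors = merkle_tree_bytes(3 * GIB, 64 * KIB)
print(f"4 KiB sectors:  {small_sectors / MIB:.2f} MiB")
print(f"64 KiB sectors: {large_sectors / MIB:.2f} MiB")
```

With 4 KiB sectors a 3 GiB block has 786,432 sectors, which pads to 1,048,576 leaves and just under 64 MiB of hashes. With 64 KiB sectors there are 49,152 sectors, which pads to 65,536 leaves and just under 4 MiB.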
			
 
				 
			
 
				 \section{Nodes}
			
 
				-The independent computing systems participating in the blocktree system are called nodes. A node
			
 
				+The independent computing systems participating in the Blocktree system are called nodes. A node
			
 
				 need not run on hardware distinct from other nodes, but it is logically distinct from all other
			
 
				 nodes. In all contemporary operating systems a node can be implemented as a process.
			
 
				 A node possesses its own unique credentials, which consist of two public and private key pairs,
			
@@ -396,9 +400,13 @@ one used in an encryption scheme and another in a signing scheme. A node is a Pr
 
				 and a hash of its public signing key is used to identify it. In a slight abuse of language, this
			
 
				 hash is referred to as the node's principal.
			
 
				 
			
 
				-Nodes are identified by paths. We say that a node is attached to the blocktree at the directory
			
 
				-containing it. A node is responsible for storing the blocks contained in the directory where it is
			
 
				-attached. This allows data storage to scale as more nodes are added to a blocktree.
			
 
				+Nodes are also identified by paths. 
			
 
				+A node is responsible for storing the blocks contained in the directory where it is
			
 
				+attached, unless there is a second node whose parent directory is contained in the parent directory
			
 
				+of the first and which contains the block's path.
			
 
				+In other words, the node whose parent directory is closest to the block is responsible for storing
			
 
				+the block.
			
 
				+This allows data storage to scale as more nodes are added to a blocktree.
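The rule that the node whose parent directory is closest to a block stores it amounts to a longest-prefix match over the directories where nodes are attached. The following is a minimal sketch, assuming paths are tuples of components; the function name and the example directory names are hypothetical.

```python
from typing import Iterable, Optional, Sequence, Tuple

Path = Tuple[str, ...]


def responsible_directory(block_path: Sequence[str],
                          node_dirs: Iterable[Path]) -> Optional[Path]:
    """Return the deepest node directory that contains `block_path`, or None.

    A directory contains the block if the directory's path is a prefix of the
    block's path. Among all such directories the longest one is the closest.
    """
    candidates = [d for d in node_dirs
                  if tuple(block_path[:len(d)]) == tuple(d)]
    return max(candidates, key=len, default=None)


# Hypothetical directories with at least one node attached.
NODE_DIRS = [("root",), ("root", "apps"), ("root", "docs")]
```

In this example a block at `("root", "apps", "mail", "inbox")` falls to the nodes attached at `("root", "apps")`, while a block at `("root", "photos")` falls back to the nodes attached at `("root",)`. When several nodes share the selected directory they form a cluster, so the sketch returns the directory rather than a single node.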
			
 
				 
			
 
				 When a directory contains more than one node a cluster is formed.
			
 
				 The nodes in the cluster run the Raft consensus protocol in order to agree on the sequence
			
@@ -408,18 +416,20 @@ of a block can be read from multiple nodes in the cluster concurrently.
 
				 
			
 
				 \section{Processes}
			
 
				 Nodes run code and that running code is called a process. Processes are spawned by the node on which
			
 
				-their running based on configuration stored in the node's directory. The code which they're spawned
			
 
				+they're running based on configuration stored in the node's parent directory.
			
 
				+The code which they're spawned
			
 
				 from is also stored in a block, though that block need not be contained in the directory of the
			
 
				-node running it. So, for example, a software developer can publish their code in their
			
 
				-blocktree, and a user can run it by configuring a node with the path it was published to. Code
			
 
				-which is stored in it's own directory with a manifest describing it is called an app.
			
 
				+node running it. So, for example, a software developer can publish code in their
			
 
				+blocktree, and a user can run it by updating the configuration of a node with the path it was
			
 
				+published to.
			
 
				+Code which is stored in its own directory with a manifest describing it is called an app.
			
 
				 
			
 
				 There are two types of apps, portable and native. A portable app is a collection of WebAssembly
			
 
				 (Wasm) modules which are executed by the node in a special Wasm runtime which exposes a
			
 
				-messaging API which can be used to perform block IO and communicate with other processes and nodes.
			
 
				+messaging API which can be used to perform block IO and communicate with other processes.
			
 
				 Native apps on the other hand are container images containing binaries compiled to a specific
			
 
				-target architecture. These containers can access blocktree messaging services via native library
			
 
				-using the C ABI, and they have access to blocks in their node's directory via a POSIX-compatible
			
 
				+target architecture. These containers can access blocktree messaging services via a native library
			
 
				+using the C ABI, and they have access to blocks via a POSIX-compatible
			
 
				 filesystem API. The software which runs the node itself is distributed using this mechanism. 
			
 
				 
			
 
				 Regardless of their type, all apps require permissions to communicate with the outside world, with
			
--- a/doc/Paper/Paper.tex
+++ b/doc/Paper/Paper.tex
@@ -46,7 +46,7 @@ then it is called a directory, and the data it contains is managed by the system
 
				 This information includes the list of blocks which are children of the directory as well as the
			
 
				 list of nodes which are attached to the block tree at this directory. In addition
			
 
				 to its payload of data,
			
 
				-each block has a header containing cryptographic access control mechanisms. These mechanisms ensure
			
 
				+each block has metadata containing cryptographic access control mechanisms. These mechanisms ensure
			
 
				 that only authorized users can read and optionally write to the block.
			
 
				 Users and nodes in the blocktree system are identified by hashes of their public keys. These hashes
			
 
				 are referred to as principals, and they are used for setting access control policy.
			
@@ -69,11 +69,11 @@ The atomic unit of data storage, confidentiality and authenticity is the block.
 
				 block contains a payload of data. Confidentiality of this data is achieved by encrypting it using 
			
 
				 a symmetric cipher using a random key. This random key is known as the block key.
			
 
				 The block key can be encapsulated using the public key of a principal who is to be given access.
			
 
				-The resulting ciphertext is stored in the header of the block. Thus
			
 
				+The resulting ciphertext is stored in the metadata of the block. Thus
			
 
				 the person possessing the corresponding private key will be able to access the contents of
			
 
				 the block. Blocks are arranged into trees, and the parent block also has a block key.
			
 
				 The child's block key is always encapsulated using the parent's key and stored in the block
			
 
				-header. This ensures that when a principal is given read access to a block, it automatically has
			
 
				+metadata. This ensures that when a principal is given read access to a block, it automatically has
			
 
				 access to every child of that block. The encapsulated block key is known as a read capability,
			
 
				 or readcap, as it grants the holder the ability to read the block. In the case of the root block
			
 
				 there is no parent block key to issue a readcap to. Instead, the root block always contains a
			
@@ -427,7 +427,7 @@ to find them. But, in order for a node to read these messages it requires a its
 
				 Only the root
			
 
				 nodes can issue this readcap as only they have access to the root key. Once permission has been
			
 
				 granted to a node, a root node can use the root key to decrypt the readcap issued to it, and then
			
 
				-encrypt it using the public key of the node. The resulting readcap is then stored in the header
			
 
				+encrypt it using the public key of the node. The resulting readcap is then stored in the metadata
			
 
				 of the inbox.
			
 
				 
			
 
				 In addition to being able to check the inbox for mail, a blocktree message is sent to the receiving