2 years ago · cea99ea617
--- a/doc/Book/Book.tex
+++ b/doc/Book/Book.tex
@@ -240,24 +240,190 @@ path and the downloaded blocks can be cryptographically verified to be trusted b
 
				 key. Authors wishing to distribute their programs in this manner will of course need to make the
			
 
				 blocks containing them public (unencrypted), or else provide some mechanism for selective access.
			
 
				 
			
 
				-\chapter{Data Structures}
			
 
				+\chapter{Concepts}
			
 
				 
			
 
				 \section{Blocks}
			
 
				-The fundamental cryptographic operations that can be performed on blocks are:
			
 
				-\begin{itemize}
			
 
				-\item Encrypt body.
			
 
				-\item Decrypt body.
			
 
				-\item Add a writecap.
			
 
				-\item Sign
			
 
				-\item Verify
			
 
				-\end{itemize}
			
 
				+A block is a sequence of bytes and a sequence of events. The sequence of events defines how the
			
 
				+sequence of bytes came to be in its current state. At any time the sequence of bytes can
			
 
				+be recreated by replaying the events starting with an empty sequence of bytes. Thus the sequence of
			
 
				+events is considered the canonical form of the block, with the sequence of bytes simply enabling
			
 
				+efficient reads.
			
 
				+
			
 
				+Blocks are hierarchical, with every block having at most one parent and zero or more children.
			
 
				+If a block has children then it is called a directory, and its children are called its directory
			
 
				+entries. This hierarchical structure allows us to identify blocks using their position in the
			
 
				+hierarchy.
			
 
				+
			
 
				+\section{Paths}
			
 
				+A path is a globally unique identifier assigned to a block. The path of the block defines its
			
 
				+position in the hierarchy of blocks. The syntax of a path is as follows:
			
 
				+\begin{verbatim}
			
 
				+    COMP ::= '[\w-_\.]'+
			
 
				+    RelPath ::= COMP ('/' COMP)* 
			
 
				+    AbsPath ::= '/' RelPath?
			
 
				+\end{verbatim}
			
 
				+In other words, a path is a sequence of components, represented textually as `/'-separated
			
 
				+fields. The empty sequence of components is called the root path.
			
 
				+The root path is a directory, and its entries are the blocks of the global blocktree and links
			
 
				+to every private blocktree.
			
 
				+A path to a private blocktree has only
			
 
				+one component, consisting of the Hash of the private blocktree's root credentials. Any path with
			
 
				+only one component, which consists of a Standard Hash
			
 
				+of the blocktree's root credentials is a valid path to the blocktree. These paths are
			
 
				+called the root paths of the private blocktree, and any path that begins with one of them
			
 
				+is simply called a private path. Conversely, paths that do not begin with them are called public.
			
 
				+
			
 
				+If one path is a prefix of a second, then we say the first path contains the second. Thus every path
			
 
				+is contained in the root path, and every private path is contained in the root path of a private
			
 
				+blocktree.
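				+
				+As an illustration of containment, consider the following paths. The component names are
				+hypothetical and chosen only for this example:
				+\begin{verbatim}
				+    /              contains every path
				+    /apps          contains /apps/editor and /apps/editor/manifest
				+    /apps/editor   does not contain /apps/viewer
				+\end{verbatim}
				+Since a path is a sequence of components, the prefix test is presumably applied component by
				+component rather than character by character. Under that reading \texttt{/app} does not contain
				+\texttt{/apps}, even though the first string is a prefix of the second.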
			
 
				+
			
 
				+In addition to
			
 
				+identifying blocks, paths are used to scope capabilities and to address
			
 
				+messages to nodes and to the processes they're running.
			
 
				+
			
 
				+\section{Principals}
			
 
				+A principal is any entity which can be authenticated.
			
 
				+All authentication in the blocktree system is performed using digital signatures and as such
			
 
				+principals are identified by a cryptographic hash of their public signing key. Support
			
 
				+for the following hash algorithms is required:
			
 
				+\begin{enumerate}
			
 
				+    \item SHA2 256
			
 
				+    \item SHA2 512
			
 
				+\end{enumerate}
			
 
				+These are referred to as the Standard Hash Algorithms and a digest computed using one of them is
			
 
				+referred to as a Standard Hash.
			
 
				+
			
 
				+When a principal is identified in a textual representation, such as the textual representation of
			
 
				+a path, then the following syntax is used:
			
 
				+\begin{verbatim}
			
 
				+    PrincipTxt ::= <hash algo index> '!' <Base64Url(hash)>
			
 
				+\end{verbatim}
			
 
				+where ``hash algo index'' is the base 10 string representation of the index of the hash algorithm in
			
 
				+the above list, and ``Base64Url(hash)'' is the Base64Url encoding of the hash data identifying the
			
 
				+principal.
			
 
				+
			
 
				+Principals can be issued capabilities which allow them to read and write to blocks.
			
 
				+These capabilities are scoped by paths, which limit the set of blocks the capability can be used
			
 
				+for. A capability can only be used on a block whose path is contained in the path of the
			
 
				+capability. This access
			
 
				+control mechanism is enforced cryptographically, as described below.
			
 
				+
			
 
				+A principal can grant a capability to another principal so long as the granted capability has a
			
 
				+path which is contained in the path of the capability possessed by the granting principal. The specific mechanisms
			
 
				+for how this is done differ depending on whether the capability is for reading or writing a block.
			
 
				+
			
 
				+Every private blocktree is associated with a root principal. The path containing only one
			
 
				+component which consists of a hash of the root principal's public key is a root path to the private 
			
 
				+blocktree. The root principal has read and write capabilities for the private blocktree's root,
			
 
				+and so can grant subordinate capabilities scoped to any block in the private blocktree.
			
 
				+
			
 
				+\section{Readcaps}
			
 
				+In order to protect the confidentiality of the data stored in a block, a symmetric cipher is used.
			
 
				+The key for this cipher is called the block key, and by controlling access to it we can control which
			
 
				+principals can read the block.
			
 
				+
			
 
				+The metadata of every block contains a dictionary of zero or more entries called read capabilities,
			
 
				+or readcaps for short. A key in this dictionary is a hash of the principal for which the
			
 
				+readcap was issued and the value is the encryption of the block key using the principal's public
			
 
				+encryption key.
			
 
				+
			
 
				+The block key is also encrypted using the block key of the parent block, and the resulting cipher
			
 
				+text is stored in the block's metadata. This is referred to as the inherited readcap.
			
 
				+Hence, a principal which has a read capability for a
			
 
				+given path can read all paths contained in it as well. Further, a principal can use its readcap
			
 
				+to decrypt the block key, and then re-encrypt it using the public encryption key of another
			
 
				+principal, thus allowing the principal to grant a subordinate readcap.
			
 
				+
			
 
				+A block that does not require confidentiality protection need not be encrypted. In this case, the
			
 
				+table of readcaps is empty and the inherited readcap is set to a flag value to indicate the block
			
 
				+is stored as plaintext. Blocks in the global blocktree are never encrypted. 
			
 
				 
			
 
				 \section{Writecaps}
			
 
				-The cryptographic operations that can be performed on a writecap are:
			
 
				-\begin{itemize}
			
 
				-\item Sign
			
 
				-\item Verify
			
 
				-\end{itemize}
			
 
				+A write capability, or writecap for short, is a certificate chain which extends from a node's
			
 
				+credentials back to the root signing key of a private blocktree.
			
 
				+Each certificate in this chain contains
			
 
				+the public key of the principal who issued it and the path that it grants write capabilities to.
			
 
				+The path of a certificate must be contained in the path of the next certificate in the chain, and
			
 
				+the chain must terminate in a certificate which is self-signed by the root signing credentials.
			
 
				+Each certificate also contains an expiration time, represented by an Epoch value, and a writecap
			
 
				+shall be considered invalid by a node if any certificate has an expiration time which is judged by
			
 
				+the node to be in the past. 
			
 
				+
			
 
				+A block's metadata contains a writecap, which allows it to be verified by checking the digital
			
 
				+signature on the metadata using the first certificate in the writecap, and then checking that the
			
 
				+writecap is valid. This ensures that the chain of trust from the root signing credentials extends
			
 
				+to the block's metadata. To extend this chain to the block's data, a Merkle Tree is used, as
			
 
				+described below. A separate mechanism is used to ensure the integrity of blocks in the global
			
 
				+blocktree.
			
 
				+
			
 
				+\section{Sectors}
			
 
				+A block's data is logically broken up into fixed-size sectors, and these are the units of
			
 
				+confidentiality protection, integrity protection and consensus. When a process performs writes on
			
 
				+an open block, those writes are buffered until a sector is filled (or until the process calls
			
 
				+flush). Once this happens, the contents of the buffer are encrypted using the block key and then
			
 
				+hashed. This hash, along with the offset at which the write occurred, is then used to update the
			
 
				+Merkle Tree for the block. Once the new root of the Merkle Tree is computed, it is copied into the
			
 
				+block's metadata. After ensuring its writecap is copied into the metadata, the node then signs
			
 
				+the metadata using its private signing key.
			
 
				+
			
 
				+When the block is later opened its metadata is verified using the process described in the previous
			
 
				+section. Then the value of the root Merkle Tree node in the metadata is compared to the value in the
			
 
				+Merkle Tree for the block. If it matches, then reads to any offset into the block can be verified
			
 
				+using the Merkle Tree. Otherwise, the block is rejected as invalid.
			
 
				+
			
 
				+The sector size can be configured for each block at creation time. The sector size cannot be changed
			
 
				+after creation, but the same effect can be achieved by creating a new block with the desired sector
			
 
				+size, copying the data from the old block into it, deleting the old block and renaming the new one.
			
 
				+The choice of sector size for a given block represents a tradeoff between the amount of space the
			
 
				+Merkle Tree over the block's contents will occupy and the latency experienced by the initial read to
			
 
				+a sector. With a larger sector size, the Merkle Tree size is reduced, but more data has to be
			
 
				+hashed and decrypted when each sector is read. Depending on the size of the block and its intended
			
 
				+usage, the optimal choice will vary.
			
 
				+
			
 
				+It's important to note that the entire Merkle Tree for a block is held in memory for as long as the
			
 
				+block is open. So, assuming the 32 byte SHA2 256 hash and the default 4 KiB sector size are used,
			
 
				+if a 3 GiB block is opened then its Merkle Tree will occupy approximately 64 MiB of memory, but will
			
 
				+enable fast random access. Conversely, if 64 KiB sectors are used the Merkle Tree for the same
			
 
				+3 GiB block will occupy approximately 4 MiB, but its random access performance will suffer from
			
 
				+increased latency.
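				+
				+As a rough check of these figures, assume (the layout is not specified above) that the Merkle
				+Tree is a complete binary tree whose leaf count is rounded up to the next power of two. For a
				+block of size $B$, a sector size $s$ and a hash size $h$, the number of sectors $n$, the padded
				+leaf count $L$ and the memory $M$ occupied by the tree are then
				+\[
				+    n = \left\lceil B / s \right\rceil, \qquad
				+    L = 2^{\lceil \log_2 n \rceil}, \qquad
				+    M = (2L - 1)\,h .
				+\]
				+With $B = 3$~GiB, $h = 32$~B and $s = 4$~KiB this gives $n = 3 \cdot 2^{18}$, $L = 2^{20}$ and
				+$M \approx 2^{26}$~B, or 64~MiB. With $s = 64$~KiB it gives $n = 3 \cdot 2^{14}$, $L = 2^{16}$
				+and $M \approx 2^{22}$~B, or 4~MiB. If the leaves were not padded, the tree would need only about
				+$(2n - 1)\,h$ bytes, roughly 48~MiB and 3~MiB respectively.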
			
 
				+
			
 
				+\section{Nodes}
			
 
				+The independent computing systems participating in the blocktree system are called nodes. A node
			
 
				+need not run on hardware distinct from other nodes, but it is logically distinct from all other
			
 
				+nodes. In all contemporary operating systems a node can be implemented as a process.
			
 
				+A node possesses its own unique credentials, which consist of two public and private key pairs,
			
 
				+one used in an encryption scheme and another in a signing scheme. A node is a Principal
			
 
				+and a hash of its public signing key is used to identify it. In a slight abuse of language, this
			
 
				+hash is referred to as the node's principal.
			
 
				+
			
 
				+Nodes are identified by paths. We say that a node is attached to the blocktree at the directory
			
 
				+containing it. A node is responsible for storing the blocks contained in the directory where it is
			
 
				+attached. This allows data storage to scale as more nodes are added to a blocktree.
			
 
				+
			
 
				+When a directory contains more than one node a cluster is formed.
			
 
				+The nodes in the cluster run the Raft consensus protocol in order to agree on the sequence
			
 
				+of events which constitutes each of the blocks contained in the directory where they're attached. 
			
 
				+This allows for redundancy, load balancing, and increased performance, as different sectors
			
 
				+of a block can be read from multiple nodes in the cluster concurrently.
			
 
				+
			
 
				+\section{Processes}
			
 
				+Nodes run code and that running code is called a process. Processes are spawned by the node on which
			
 
				+they're running based on configuration stored in the node's directory. The code which they're spawned
			
 
				+from is also stored in a block, though that block need not be contained in the directory of the
			
 
				+node running it. So, for example, a software developer can publish their code in their
			
 
				+blocktree, and a user can run it by configuring a node with the path it was published to. Code
			
 
				+which is stored in its own directory with a manifest describing it is called an app.
			
 
				+
			
 
				+There are two types of apps, portable and native. A portable app is a collection of WebAssembly
			
 
				+(Wasm) modules which are executed by the node in a special Wasm runtime which exposes a
			
 
				+messaging API which can be used to perform block IO and communicate with other processes and nodes.
			
 
				+Native apps on the other hand are container images containing binaries compiled to a specific
			
 
				+target architecture. These containers can access blocktree messaging services via a native library
			
 
				+using the C ABI, and they have access to blocks in their node's directory via a POSIX-compatible
			
 
				+filesystem API. The software which runs the node itself is distributed using this mechanism. 
			
 
				+
			
 
				+Regardless of their type, all apps require permissions to communicate with the outside world, with
			
 
				+the default only allowing them to communicate with their parent.
			
 
				 
			
 
				 \chapter{Nodes}
			
 
				 
			
--- a/tools/sector_size.ods
+++ b/tools/sector_size.ods