
Added a "Concepts" section to the book,
mainly to organize my thoughts.

Matthew Carr 2 years ago
commit cea99ea617
2 changed files with 180 additions and 14 deletions
  1. 180 14
      doc/Book/Book.tex
  2. BIN
      tools/sector_size.ods

+ 180 - 14
doc/Book/Book.tex

@@ -240,24 +240,190 @@ path and the downloaded blocks can be cryptographically verified to be trusted b
 key. Authors wishing to distribute their programs in this manner will of course need to make the
 blocks containing them public (unencrypted), or else provide some mechanism for selective access.
 
-\chapter{Data Structures}
+\chapter{Concepts}
 
 \section{Blocks}
-The fundamental cryptographic operations that can be performed on blocks are:
-\begin{itemize}
-\item Encrypt body.
-\item Decrypt body.
-\item Add a writecap.
-\item Sign
-\item Verify
-\end{itemize}
+A block is a sequence of bytes and a sequence of events. The sequence of events defines how the
+sequence of bytes came to be in its current state. At any time the sequence of bytes can
+be recreated by replaying the events starting with an empty sequence of bytes. Thus the sequence of
+events is considered the canonical form of the block, with the sequence of bytes simply enabling
+efficient reads.
+
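The replay idea can be illustrated with a minimal sketch. The event type below (`WriteEvent`, an overwrite at a byte offset) is hypothetical; the text does not define the actual event format, so treat this only as one way the bytes could be derived from the events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteEvent:
    """Hypothetical event: overwrite bytes starting at `offset`."""
    offset: int
    data: bytes


def replay(events):
    """Rebuild a block's bytes by applying events to an empty buffer."""
    buf = bytearray()
    for ev in events:
        end = ev.offset + len(ev.data)
        if end > len(buf):
            # Assumption: gaps created by writes past the end are zero-filled.
            buf.extend(b"\x00" * (end - len(buf)))
        buf[ev.offset:end] = ev.data
    return bytes(buf)


events = [WriteEvent(0, b"hello world"), WriteEvent(6, b"block")]
print(replay(events))
```

Because the byte sequence is a pure function of the event sequence, it can be discarded and rebuilt at any time, which is why the events are treated as canonical.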
+Blocks are hierarchical, with every block having at most one parent and zero or more children.
+If a block has children then it is called a directory, and its children are called its directory
+entries. This hierarchical structure allows us to identify blocks using their position in the
+hierarchy.
+
+\section{Paths}
+A path is a globally unique identifier assigned to a block. The path of the block defines its
+position in the hierarchy of blocks. The syntax of a path is as follows:
+\begin{verbatim}
+    COMP ::= [\w\-_\.]+
+    RelPath ::= COMP ('/' COMP)*
+    AbsPath ::= '/' RelPath?
+\end{verbatim}
+In other words, a path is a sequence of components, represented textually as `/'-separated
+fields. The empty sequence of components is called the root path.
+The root path is a directory, and its entries are the blocks of the global blocktree and links
+to every private blocktree.
+A path to a private blocktree has only
+one component, consisting of a hash of the private blocktree's root credentials. Any path with
+only one component which consists of a Standard Hash
+of the blocktree's root credentials is a valid path to the blocktree. These paths are
+called the root paths of the private blocktree, and any path that begins with one of them
+is simply called a private path. Conversely, paths that do not begin with them are called public.
+
+If one path is a prefix of a second, then we say the first path contains the second. Thus every path
+is contained in the root path, and every private path is contained in the root path of a private
+blocktree.
+
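A small sketch of the containment rule, assuming a path is modelled as a tuple of components. The helper names `parse_path` and `contains` are illustrative only, and the component character set is deliberately not validated here.

```python
def parse_path(text):
    """Split an absolute path into a tuple of components; '/' is the root path."""
    if not text.startswith("/"):
        raise ValueError("absolute paths start with '/'")
    if text == "/":
        return ()
    components = text[1:].split("/")
    if any(c == "" for c in components):
        raise ValueError("empty path component")
    return tuple(components)


def contains(outer, inner):
    """True when `outer` is a prefix of `inner`, i.e. `outer` contains `inner`."""
    return inner[:len(outer)] == outer


root = parse_path("/")
docs = parse_path("/apps/docs")
print(contains(root, docs), contains(parse_path("/apps"), docs))
```

Comparing whole components rather than raw strings avoids treating `/ab` as contained in `/a`.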
+In addition to
+identifying blocks, paths are used to scope capabilities and to address
+messages to nodes and to the processes they're running.
+
+\section{Principals}
+A principal is any entity which can be authenticated.
+All authentication in the blocktree system is performed using digital signatures and as such
+principals are identified by a cryptographic hash of their public signing key. Support
+for the following hash algorithms is required:
+\begin{enumerate}
+    \item SHA2 256
+    \item SHA2 512
+\end{enumerate}
+These are referred to as the Standard Hash Algorithms and a digest computed using one of them is
+referred to as a Standard Hash.
+
+When a principal is identified in a textual representation, such as the textual representation of
+a path, then the following syntax is used:
+\begin{verbatim}
+    PrincipTxt ::= <hash algo index> '!' <Base64Url(hash)>
+\end{verbatim}
+where ``hash algo index'' is the base 10 string representation of the index of the hash algorithm in
+the above list, and ``Base64Url(hash)'' is the Base64Url encoding of the hash data identifying the
+principal.
+
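The textual form can be sketched as follows. Two details are assumptions, since the text does not settle them: that the index of the first list entry is 1, and that Base64Url padding characters are retained. The function name `principal_text` is hypothetical.

```python
import base64
import hashlib

# Assumption: indices follow the enumerated list and start at 1.
STANDARD_HASHES = {1: hashlib.sha256, 2: hashlib.sha512}


def principal_text(public_key: bytes, algo_index: int = 1) -> str:
    """Format a principal as '<hash algo index>!<Base64Url(hash)>'."""
    digest = STANDARD_HASHES[algo_index](public_key).digest()
    # Assumption: '=' padding is kept; the text does not say either way.
    encoded = base64.urlsafe_b64encode(digest).decode("ascii")
    return f"{algo_index}!{encoded}"


print(principal_text(b"example public signing key"))
```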
+Principals can be issued capabilities which allow them to read and write to blocks.
+These capabilities are scoped by paths, which limit the set of blocks the capability can be used
+for. A capability can only be used on a block whose path is contained in the path of the
+capability. This access
+control mechanism is enforced cryptographically, as described below.
+
+A principal can grant a capability to another principal so long as the granted capability has a
+path which is contained in the path of a capability possessed by the granting principal. The
+specific mechanism for doing this differs depending on whether the capability is for reading or
+writing to a block.
+
+Every private blocktree is associated with a root principal. The path containing only one
+component which consists of a hash of the root principal's public key is a root path to the private
+blocktree. The root principal has read and write capabilities for the private blocktree's root,
+and so can grant subordinate capabilities scoped to any block in the private blocktree.
+
+\section{Readcaps}
+In order to protect the confidentiality of the data stored in a block, a symmetric cipher is used.
+The key for this cipher is called the block key, and by controlling access to it we can control
+which principals can read the block.
+
+The metadata of every block contains a dictionary of zero or more entries called read capabilities,
+or readcaps for short. A key in this dictionary is a hash of the principal for which the
+readcap was issued and the value is the encryption of the block key using the principal's public
+encryption key.
+
+The block key is also encrypted using the block key of the parent block, and the resulting
+ciphertext is stored in the block's metadata. This is referred to as the inherited readcap.
+Hence, a principal which has a read capability for a
+given path can read all paths contained in it as well. Further, a principal can use its readcap
+to decrypt the block key, and then re-encrypt it using the public encryption key of another
+principal, thus allowing the principal to grant a subordinate readcap.
+
+A block that does not require confidentiality protection need not be encrypted. In this case, the
+dictionary of readcaps is empty and the inherited readcap is set to a flag value to indicate the
+block is stored as plaintext. Blocks in the global blocktree are never encrypted.
 
 \section{Writecaps}
-The cryptographic operations that can be performed on a writecap are:
-\begin{itemize}
-\item Sign
-\item Verify
-\end{itemize}
+A write capability, or writecap for short, is a certificate chain which extends from a node's
+credentials back to the root signing key of a private blocktree.
+Each certificate in this chain contains
+the public key of the principal who issued it and the path that it grants write capabilities to.
+The path of a certificate must be contained in the path of the next certificate in the chain, and
+the chain must terminate in a certificate which is self-signed by the root signing credentials.
+Each certificate also contains an expiration time, represented by an Epoch value, and a writecap
+shall be considered invalid by a node if any certificate has an expiration time which is judged by
+the node to be in the past. 
+
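The structural checks on a writecap can be sketched as below. This is not a full validator: signature verification is omitted entirely, the `subject` field is a hypothetical addition used to link certificates together, and paths are modelled as tuples of components. Only the path containment, expiration and self-signed root rules come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    issuer: str    # principal that issued (signed) this certificate
    subject: str   # hypothetical: principal the certificate was issued to
    path: tuple    # path components the certificate grants write access to
    expires: int   # expiration time as an epoch value


def contains(outer, inner):
    """True when path `outer` is a prefix of path `inner`."""
    return inner[:len(outer)] == outer


def writecap_valid(chain, root_principal, now):
    """Check expiry, path containment and the self-signed root of a chain.

    `chain[0]` is the node's certificate and `chain[-1]` the root's.
    Signature checks are deliberately left out of this sketch.
    """
    if not chain:
        return False
    if any(cert.expires < now for cert in chain):
        return False
    for cert, nxt in zip(chain, chain[1:]):
        if not contains(nxt.path, cert.path) or cert.issuer != nxt.subject:
            return False
    last = chain[-1]
    return last.issuer == root_principal and last.subject == root_principal


root = Cert("R", "R", ("R",), 100)
node = Cert("R", "N", ("R", "docs"), 100)
print(writecap_valid([node, root], "R", now=50))
```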
+A block's metadata contains a writecap, which allows it to be verified by checking the digital
+signature on the metadata using the first certificate in the writecap, and then checking that the
+writecap is valid. This ensures that the chain of trust from the root signing credentials extends
+to the block's metadata. To extend this chain to the block's data, a Merkle Tree is used, as
+described below. A separate mechanism is used to ensure the integrity of blocks in the global
+blocktree.
+
+\section{Sectors}
+A block's data is logically broken up into fixed-size sectors, and these are the units of
+confidentiality protection, integrity protection and consensus. When a process performs writes on
+an open block, those writes are buffered until a sector is filled (or until the process calls
+flush). Once this happens, the contents of the buffer are encrypted using the block key and then
+hashed. This hash, along with the offset at which the write occurred, is then used to update the
+Merkle Tree for the block. Once the new root of the Merkle Tree is computed, it is copied into the
+block's metadata. After ensuring its writecap is copied into the metadata, the node then signs
+the metadata using its private signing key.
+
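A minimal sketch of deriving a Merkle root from a block's sectors. Several choices are assumptions not stated in the text: a binary tree, SHA-256 throughout, padding the leaves to a power of two with all-zero digests, and the value returned for an empty block. Encryption of each sector before hashing is skipped here.

```python
import hashlib

SECTOR_SIZE = 4096  # the default sector size named in the text


def split_sectors(data, sector_size=SECTOR_SIZE):
    """Break a block's data into fixed-size sectors (the last may be short)."""
    return [data[i:i + sector_size] for i in range(0, len(data), sector_size)]


def merkle_root(sectors):
    """Hash each sector, then hash pairs upward until one root remains."""
    level = [hashlib.sha256(s).digest() for s in sectors]
    if not level:
        return hashlib.sha256(b"").digest()  # assumption for an empty block
    size = 1
    while size < len(level):
        size *= 2
    # Assumption: unused leaves are filled with all-zero digests.
    level += [bytes(32)] * (size - len(level))
    while len(level) > 1:
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


data = bytes(range(256)) * 40  # 10240 bytes, so three sectors
print(merkle_root(split_sectors(data)).hex())
```

Changing any byte of any sector changes that sector's leaf hash and therefore the root, which is what lets a signature over the metadata extend to the block's data.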
+When the block is later opened its metadata is verified using the process described in the previous
+section. Then the value of the root Merkle Tree node in the metadata is compared to the value in the
+Merkle Tree for the block. If it matches, then reads to any offset into the block can be verified
+using the Merkle Tree. Otherwise, the block is rejected as invalid.
+
+The sector size can be configured for each block at creation time. The sector size cannot be changed
+after creation, but the same effect can be achieved by creating a new block with the desired sector
+size, copying the data from the old block into it, deleting the old block and renaming the new one.
+The choice of sector size for a given block represents a tradeoff between the amount of space the
+Merkle Tree over the block's contents will occupy and the latency experienced by the initial read to
+a sector. With a larger sector size, the Merkle Tree size is reduced, but more data has to be
+hashed and decrypted when each sector is read. Depending on the size of the block and its intended
+usage, the optimal choice will vary.
+
+It's important to note that the entire Merkle Tree for a block is held in memory for as long as the
+block is open. So, assuming the 32 byte SHA2 256 hash and the default 4 KiB sector size are used,
+if a 3 GiB block is opened then its Merkle Tree will occupy approximately 64 MiB of memory, but will
+enable fast random access. Conversely, if 64 KiB sectors are used the Merkle Tree for the same
+3 GiB block will occupy approximately 4 MiB, but its random access performance will suffer from
+increased latency.
+
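The memory figures quoted above can be reproduced with a short calculation. The layout is an assumption: the numbers are consistent with a perfect binary tree whose leaf count is rounded up to the next power of two, but the text does not state how the tree is actually stored.

```python
KIB, MIB, GIB = 1 << 10, 1 << 20, 1 << 30


def merkle_tree_bytes(block_size, sector_size, hash_size=32):
    """Estimate the memory held by a Merkle Tree over a block's sectors."""
    leaves = -(-block_size // sector_size)  # ceiling division
    padded = 1
    while padded < leaves:
        padded *= 2  # assumption: leaves are padded to a power of two
    nodes = 2 * padded - 1  # node count of a perfect binary tree
    return nodes * hash_size


for sector in (4 * KIB, 64 * KIB):
    size = merkle_tree_bytes(3 * GIB, sector)
    print(f"{sector // KIB} KiB sectors: {size / MIB:.2f} MiB")
```

A 3 GiB block has 786432 sectors of 4 KiB, which pads to 2^20 leaves and roughly 2^21 nodes of 32 bytes each, hence about 64 MiB; with 64 KiB sectors the same reasoning gives about 4 MiB.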
+\section{Nodes}
+The independent computing systems participating in the blocktree system are called nodes. A node
+need not run on hardware distinct from other nodes, but it is logically distinct from all other
+nodes. In all contemporary operating systems a node can be implemented as a process.
+A node possesses its own unique credentials, which consist of two public and private key pairs,
+one used in an encryption scheme and another in a signing scheme. A node is a Principal
+and a hash of its public signing key is used to identify it. In a slight abuse of language, this
+hash is referred to as the node's principal.
+
+Nodes are identified by paths. We say that a node is attached to the blocktree at the directory
+containing it. A node is responsible for storing the blocks contained in the directory where it is
+attached. This allows data storage to scale as more nodes are added to a blocktree.
+
+When a directory contains more than one node a cluster is formed.
+The nodes in the cluster run the Raft consensus protocol in order to agree on the sequence
+of events which constitutes each of the blocks contained in the directory where they're attached. 
+This allows for redundancy, load balancing, and increased performance, as different sectors
+of a block can be read from multiple nodes in the cluster concurrently.
+
+\section{Processes}
+Nodes run code and that running code is called a process. Processes are spawned by the node on which
+they're running based on configuration stored in the node's directory. The code which they're spawned
+from is also stored in a block, though that block need not be contained in the directory of the
+node running it. So, for example, a software developer can publish their code in their
+blocktree, and a user can run it by configuring a node with the path it was published to. Code
+which is stored in its own directory with a manifest describing it is called an app.
+
+There are two types of apps, portable and native. A portable app is a collection of WebAssembly
+(Wasm) modules which are executed by the node in a special Wasm runtime. This runtime exposes a
+messaging API which can be used to perform block IO and communicate with other processes and nodes.
+Native apps on the other hand are container images containing binaries compiled to a specific
+target architecture. These containers can access blocktree messaging services via a native library
+using the C ABI, and they have access to blocks in their node's directory via a POSIX-compatible
+filesystem API. The software which runs the node itself is distributed using this mechanism.
+
+Regardless of their type, all apps require permissions to communicate with the outside world, with
+the default only allowing them to communicate with their parent.
 
 \chapter{Nodes}
 

BIN
tools/sector_size.ods