
Added a "Concepts" section to the book,
mainly to organize my thoughts.

Matthew Carr 3 years ago
commit cea99ea617
2 changed files with 180 additions and 14 deletions
  1. doc/Book/Book.tex (+180, -14)
  2. tools/sector_size.ods (binary)

+ 180 - 14
doc/Book/Book.tex

@@ -240,24 +240,190 @@ path and the downloaded blocks can be cryptographically verified to be trusted b
 key. Authors wishing to distribute their programs in this manner will of course need to make the
 blocks containing them public (unencrypted), or else provide some mechanism for selective access.
 
 
-\chapter{Data Structures}
+\chapter{Concepts}
 
 
 \section{Blocks}
-The fundamental cryptographic operations that can be performed on blocks are:
-\begin{itemize}
-\item Encrypt body.
-\item Decrypt body.
-\item Add a writecap.
-\item Sign
-\item Verify
-\end{itemize}
+A block is a sequence of bytes and a sequence of events. The sequence of events defines how the
+sequence of bytes came to be in its current state. At any time the sequence of bytes can
+be recreated by replaying the events starting with an empty sequence of bytes. Thus the sequence of
+events is considered the canonical form of the block, with the sequence of bytes simply enabling
+efficient reads.
+
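The replay idea can be sketched in a few lines of Python. The event type here is hypothetical: the text does not say what an event contains, so this sketch assumes the simplest possible case, a write of some bytes at an offset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteEvent:
    """Hypothetical event: `data` was written at byte `offset`."""
    offset: int
    data: bytes


def replay(events):
    """Rebuild a block's bytes by applying events to an empty sequence."""
    buf = bytearray()
    for ev in events:
        end = ev.offset + len(ev.data)
        if end > len(buf):
            # Assumption: a write past the end zero-fills the gap.
            buf.extend(bytes(end - len(buf)))
        buf[ev.offset:end] = ev.data
    return bytes(buf)


events = [WriteEvent(0, b"hello world"), WriteEvent(6, b"block")]
print(replay(events))
```

Because the byte sequence is a pure function of the event list, any two nodes that agree on the events will arrive at the same bytes.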
+Blocks are hierarchical, with every block having at most one parent and zero or more children.
+If a block has children then it is called a directory, and its children are called its directory
+entries. This hierarchical structure allows us to identify blocks using their position in the
+hierarchy.
+
+\section{Paths}
+A path is a globally unique identifier assigned to a block. The path of the block defines its
+position in the hierarchy of blocks. The syntax of a path is as follows:
+\begin{verbatim}
+    COMP ::= [\w.-]+
+    RelPath ::= COMP ('/' COMP)*
+    AbsPath ::= '/' RelPath?
+\end{verbatim}
+In other words, a path is a sequence of components, represented textually as `/'-separated
+fields. The empty sequence of components is called the root path.
+The root path is a directory, and its entries are the blocks of the global blocktree and links
+to every private blocktree.
+A path to a private blocktree has only
+one component, consisting of a hash of the private blocktree's root credentials. Any path with
+only one component which consists of a Standard Hash
+of the blocktree's root credentials is a valid path to the blocktree. These paths are
+called the root paths of the private blocktree, and any path that begins with one of them
+is simply called a private path. Conversely, paths that do not begin with one of them are called
+public.
+
+If one path is a prefix of a second, then we say the first path contains the second. Thus every path
+is contained in the root path, and every private path is contained in the root path of a private
+blocktree.
+
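As a rough illustration of containment, the following Python sketch parses an absolute path into components and tests the prefix relation. The function names are made up for this example, and the component pattern is an assumption (word characters, `.` and `-`).

```python
import re

# Assumption: a component is one or more word characters, '.' or '-'.
COMP = re.compile(r"[\w.-]+")


def parse_abs_path(text):
    """Split an absolute path into a tuple of components; '/' is the root path."""
    if not text.startswith("/"):
        raise ValueError("absolute paths start with '/'")
    if text == "/":
        return ()
    comps = tuple(text[1:].split("/"))
    for comp in comps:
        if not COMP.fullmatch(comp):
            raise ValueError(f"invalid component: {comp!r}")
    return comps


def contains(first, second):
    """True if `first` is a prefix of `second`, component by component."""
    return second[:len(first)] == first
```

Comparing whole components rather than raw strings matters: `/ab` is not contained in `/a`, even though the string `/a` is a prefix of the string `/ab`.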
+In addition to
+identifying blocks, paths are used to scope capabilities and to address
+messages to nodes and to the processes they're running.
+
+\section{Principals}
+A principal is any entity which can be authenticated.
+All authentication in the blocktree system is performed using digital signatures and as such
+principals are identified by a cryptographic hash of their public signing key. Support
+for the following hash algorithms is required:
+\begin{enumerate}
+    \item SHA2 256
+    \item SHA2 512
+\end{enumerate}
+These are referred to as the Standard Hash Algorithms and a digest computed using one of them is
+referred to as a Standard Hash.
+
+When a principal is identified in a textual representation, such as the textual representation of
+a path, then the following syntax is used:
+\begin{verbatim}
+    PrincipTxt ::= <hash algo index> '!' <Base64Url(hash)>
+\end{verbatim}
+where ``hash algo index'' is the base 10 string representation of the index of the hash algorithm in
+the above list, and ``Base64Url(hash)'' is the Base64Url encoding of the hash data identifying the
+principal.
+
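A minimal Python sketch of this encoding follows. Several details are assumptions rather than anything the text specifies: that the index counts from 1 in list order, that Base64 padding is kept, and that the public key is available as some serialized byte string.

```python
import base64
import hashlib

# Assumption: indices follow the order of the list above, counting from 1.
STANDARD_HASHES = {1: hashlib.sha256, 2: hashlib.sha512}


def principal_text(public_key_bytes, algo_index=1):
    """Format a principal as '<hash algo index>!<Base64Url(hash)>'."""
    digest = STANDARD_HASHES[algo_index](public_key_bytes).digest()
    # Assumption: '=' padding is kept; the text does not say either way.
    encoded = base64.urlsafe_b64encode(digest).decode("ascii")
    return f"{algo_index}!{encoded}"


# `b"example public key"` stands in for a real serialized signing key.
print(principal_text(b"example public key"))
```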
+Principals can be issued capabilities which allow them to read and write to blocks.
+These capabilities are scoped by paths, which limit the set of blocks the capability can be used
+for. A capability can only be used on a block whose path is contained in the path of the
+capability. This access
+control mechanism is enforced cryptographically, as described below.
+
+A principal can grant a capability to another principal so long as the granted capability has a
+path which is contained in the path of the capability possessed by the granting principal. The
+specific mechanisms for how this is done differ depending on whether the capability is for reading
+or writing a block.
+
+Every private blocktree is associated with a root principal. The path containing only one
+component which consists of a hash of the root principal's public key is a root path to the private
+blocktree. The root principal has read and write capabilities for the private blocktree's root,
+and so can grant subordinate capabilities scoped to any block in the private blocktree.
+
+\section{Readcaps}
+In order to protect the confidentiality of the data stored in a block, a symmetric cipher is used.
+The key for this cipher is called the block key, and by controlling access to it we can control
+which principals can read the block.
+
+The metadata of every block contains a dictionary of zero or more entries called read capabilities,
+or readcaps for short. A key in this dictionary is a hash of the principal for which the
+readcap was issued and the value is the encryption of the block key using the principal's public
+encryption key.
+
+The block key is also encrypted using the block key of the parent block, and the resulting
+ciphertext is stored in the block's metadata. This is referred to as the inherited readcap.
+Hence, a principal which has a read capability for a
+given path can read all paths contained in it as well. Further, a principal can use its readcap
+to decrypt the block key, and then re-encrypt it using the public encryption key of another
+principal, thus allowing the principal to grant a subordinate readcap.
+
+A block that does not require confidentiality protection need not be encrypted. In this case, the
+table of readcaps is empty and the inherited readcap is set to a flag value to indicate the block
+is stored as plaintext. Blocks in the global blocktree are never encrypted. 
 
 
 \section{Writecaps}
-The cryptographic operations that can be performed on a writecap are:
-\begin{itemize}
-\item Sign
-\item Verify
-\end{itemize}
+A write capability, or writecap for short, is a certificate chain which extends from a node's
+credentials back to the root signing key of a private blocktree.
+Each certificate in this chain contains
+the public key of the principal who issued it and the path that it grants write capabilities to.
+The path of a certificate must be contained in the path of the next certificate in the chain, and
+the chain must terminate in a certificate which is self-signed by the root signing credentials.
+Each certificate also contains an expiration time, represented by an Epoch value, and a writecap
+shall be considered invalid by a node if any certificate has an expiration time which is judged by
+the node to be in the past. 
+
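The structural rules for a writecap (path containment along the chain, a self-signed root at the end, and no expired certificates) can be sketched as follows. This is only an outline under stated assumptions: signature verification is omitted entirely, and the `Cert` fields, including the `subject` field used to link one certificate to the next, are hypothetical.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    """Hypothetical certificate fields; signatures are omitted in this sketch."""
    issuer: str     # principal that issued the certificate
    subject: str    # principal the certificate was issued to (assumed field)
    path: tuple     # path components the certificate grants write access to
    expires: float  # expiration time, in seconds since the Unix epoch


def path_contains(outer, inner):
    """True if `outer` is a component-wise prefix of `inner`."""
    return inner[:len(outer)] == outer


def writecap_is_valid(chain, root, now=None):
    """chain[0] belongs to the node; chain[-1] must be self-signed by `root`."""
    now = time.time() if now is None else now
    if not chain:
        return False
    # Any certificate whose expiration time is in the past invalidates the chain.
    if any(cert.expires < now for cert in chain):
        return False
    for cert, parent in zip(chain, chain[1:]):
        # Assumption: each certificate is issued by the subject of the next one.
        if cert.issuer != parent.subject:
            return False
        if not path_contains(parent.path, cert.path):
            return False
    last = chain[-1]
    return last.issuer == root and last.subject == root
```

A real check would also verify each certificate's signature with the issuer's public key before trusting any of its fields.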
+A block's metadata contains a writecap, which allows it to be verified by checking the digital
+signature on the metadata using the first certificate in the writecap, and then checking that the
+writecap is valid. This ensures that the chain of trust from the root signing credentials extends
+to the block's metadata. To extend this chain to the block's data, a Merkle Tree is used, as
+described below. A separate mechanism is used to ensure the integrity of blocks in the global
+blocktree.
+
+\section{Sectors}
+A block's data is logically broken up into fixed-size sectors, and these are the units of
+confidentiality protection, integrity protection and consensus. When a process performs writes on
+an open block, those writes are buffered until a sector is filled (or until the process calls
+flush). Once this happens, the contents of the buffer are encrypted using the block key and then
+hashed. This hash, along with the offset at which the write occurred, is then used to update the
+Merkle Tree for the block. Once the new root of the Merkle Tree is computed, it is copied into the
+block's metadata. After ensuring its writecap is copied into the metadata, the node then signs
+the metadata using its private signing key.
+
+When the block is later opened its metadata is verified using the process described in the previous
+section. Then the value of the root Merkle Tree node in the metadata is compared to the value in the
+Merkle Tree for the block. If it matches, then reads to any offset into the block can be verified
+using the Merkle Tree. Otherwise, the block is rejected as invalid.
+
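To make the role of the Merkle Tree root concrete, here is a small Python sketch that hashes fixed-size sectors and folds the hashes into a single root. It is a generic construction, not necessarily the layout the system uses: encryption is skipped, and duplicating the last node on odd-sized levels is an assumption.

```python
import hashlib


def sector_hashes(data, sector_size):
    """Hash each fixed-size sector (the final sector may be short)."""
    return [hashlib.sha256(data[i:i + sector_size]).digest()
            for i in range(0, len(data), sector_size)]


def merkle_root(leaves):
    """Fold leaf hashes pairwise up to a single root hash."""
    if not leaves:
        return hashlib.sha256(b"").digest()
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumption: duplicate the last node
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


data = bytes(range(256)) * 64  # 16 KiB of sample data, i.e. four 4 KiB sectors
root = merkle_root(sector_hashes(data, 4096))
```

Changing a single byte in any sector changes that sector's hash and therefore the root, which is why comparing the root against the signed metadata is enough to detect tampering.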
+The sector size can be configured for each block at creation time. The sector size cannot be changed
+after creation, but the same effect can be achieved by creating a new block with the desired sector
+size, copying the data from the old block into it, deleting the old block and renaming the new one.
+The choice of sector size for a given block represents a tradeoff between the amount of space the
+Merkle Tree over the block's contents will occupy and the latency experienced by the initial read to
+a sector. With a larger sector size, the Merkle Tree size is reduced, but more data has to be
+hashed and decrypted when each sector is read. Depending on the size of the block and its intended
+usage, the optimal choice will vary.
+
+It's important to note that the entire Merkle Tree for a block is held in memory for as long as the
+block is open. So, assuming the 32 byte SHA2 256 hash and the default 4 KiB sector size are used,
+if a 3 GiB file is opened then its Merkle Tree will occupy approximately 64 MiB of memory, but will
+enable fast random access. Conversely, if 64 KiB sectors are used the Merkle Tree for the same
+3 GiB block will occupy approximately 4 MiB, but its random access performance will suffer from
+increased latency.
+
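The memory figures above can be reproduced with a short calculation. The sketch assumes the tree is stored as a perfect binary tree whose leaf count is rounded up to the next power of two. That layout is an assumption, chosen because it matches the quoted 64 MiB and 4 MiB figures; a tree without padding would be smaller.

```python
KiB = 1024
MiB = 1024 * KiB
GiB = 1024 * MiB


def merkle_tree_bytes(block_size, sector_size, hash_size=32):
    """Estimate the in-memory size of a padded, perfect binary Merkle Tree."""
    leaves = -(-block_size // sector_size)  # ceiling division
    padded = 1
    while padded < leaves:
        padded *= 2
    # A perfect binary tree with `padded` leaves has 2 * padded - 1 nodes.
    return (2 * padded - 1) * hash_size


for sector_size in (4 * KiB, 64 * KiB):
    size = merkle_tree_bytes(3 * GiB, sector_size)
    print(f"{sector_size // KiB} KiB sectors: {size / MiB:.2f} MiB")
```

With 4 KiB sectors a 3 GiB block has 786,432 sectors, which pads to 1,048,576 leaves and roughly two million 32 byte nodes. With 64 KiB sectors it has 49,152 sectors, which pads to 65,536 leaves.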
+\section{Nodes}
+The independent computing systems participating in the blocktree system are called nodes. A node
+need not run on hardware distinct from other nodes, but it is logically distinct from all other
+nodes. In all contemporary operating systems a node can be implemented as a process.
+A node possesses its own unique credentials, which consist of two public and private key pairs,
+one used in an encryption scheme and another in a signing scheme. A node is a Principal
+and a hash of its public signing key is used to identify it. In a slight abuse of language, this
+hash is referred to as the node's principal.
+
+Nodes are identified by paths. We say that a node is attached to the blocktree at the directory
+containing it. A node is responsible for storing the blocks contained in the directory where it is
+attached. This allows data storage to scale as more nodes are added to a blocktree.
+
+When a directory contains more than one node a cluster is formed.
+The nodes in the cluster run the Raft consensus protocol in order to agree on the sequence
+of events which constitutes each of the blocks contained in the directory where they're attached. 
+This allows for redundancy, load balancing, and increased performance, as different sectors
+of a block can be read from multiple nodes in the cluster concurrently.
+
+\section{Processes}
+Nodes run code and that running code is called a process. Processes are spawned by the node on which
+they're running based on configuration stored in the node's directory. The code which they're spawned
+from is also stored in a block, though that block need not be contained in the directory of the
+node running it. So, for example, a software developer can publish their code in their
+blocktree, and a user can run it by configuring a node with the path it was published to. Code
+which is stored in its own directory with a manifest describing it is called an app.
+
+There are two types of apps, portable and native. A portable app is a collection of WebAssembly
+(Wasm) modules which are executed by the node in a special Wasm runtime. This runtime exposes a
+messaging API which can be used to perform block IO and communicate with other processes and nodes.
+Native apps, on the other hand, are container images containing binaries compiled to a specific
+target architecture. These containers can access blocktree messaging services via a native library
+using the C ABI, and they have access to blocks in their node's directory via a POSIX-compatible
+filesystem API. The software which runs the node itself is distributed using this mechanism.
+
+Regardless of their type, all apps require permissions to communicate with the outside world, with
+the default only allowing them to communicate with their parent.
 
 
 \chapter{Nodes}
 
 

BIN
tools/sector_size.ods