More editing of the paper. Added a paragraph describing
blocktree messages.

Matthew Carr 2 years ago
commit 0ecdcfe902
2 changed files with 65 additions and 19 deletions
  1. 20 0
      doc/Paper/.vscode/settings.json
  2. 45 19
      doc/Paper/Paper.tex

+ 20 - 0
doc/Paper/.vscode/settings.json

@@ -0,0 +1,20 @@
+{
+    "cSpell.words": [
+        "blockcoin",
+        "Blockstack",
+        "Blocktree",
+        "blocktrees",
+        "cryptosystem",
+        "Filecoin",
+        "filesystems",
+        "hashtable",
+        "incentivizing",
+        "Merkle",
+        "nodetree",
+        "readcap",
+        "readcaps",
+        "WASI",
+        "writecap",
+        "writecaps"
+    ]
+}

+ 45 - 19
doc/Paper/Paper.tex

@@ -23,7 +23,7 @@ the details of backing up user data and implementing access controls to facilita
 safe sharing. However, because these are closed systems, users are forced to trust that
 the operators are benevolent, and they lack any real way of ensuring that the access
 controls they prescribe will actually be enforced. There have been several systems proposed
-as an alternative to the conventional model, but these systems suffer from several shortcomings.
+as alternatives to the conventional model, but these systems suffer from several shortcomings.
 They either assume the need for cloud storage providers (Blockstack) or implement all operations
 using a global blockchain, limiting performance (Filecoin). Blocktree takes a different approach.
 
@@ -40,15 +40,18 @@ in a blocktree form a tree. Each block has a path corresponding to its location
 component of a fully qualified blocktree path is the fingerprint of the root public key of the
 blocktree. Thus a blocktree path can globally specify a block. If a block is not a leaf,
 then it is called a directory, and the data it contains is managed by the system.
-This information includes the list of blocks which are children of the directory. In addition
+This information includes the list of blocks which are children of the directory as well as the
+list of nodes which are attached to the blocktree at this directory. In addition
 to its payload of data,
 each block has a header containing cryptographic access control mechanisms. These mechanisms ensure
 that only authorized users can read and optionally write to the block.
-
 Users and nodes in the blocktree system are identified by hashes of their public keys. These hashes
 are referred to as principals, and they are used for setting access control policy.
 
-This remainder of this paper is organized as follows:
+This paper is intended to be a short introduction to the ideas of blocktree. A book is planned which
+will specify the system in greater detail. In keeping with the agile software methodology, this
+book is being written concurrently with an open source implementation of the system.
+The remainder of this paper is organized as follows:
 \begin{itemize}
 \item A description of the operations of a single blocktree.
 \item The definition of a blockchain which provides global state and links individual blocktrees
@@ -59,17 +62,24 @@ together.
 \end{itemize}
 
 \section{An Individual Blocktree}
-The atomic unit of data storage, confidentiality and authenticity is called a block. A
+The atomic unit of data storage, confidentiality and authenticity is the block. A
 block contains a payload of data. Confidentiality of this data is achieved by encrypting it with
 a symmetric cipher under a random key. This random key is known as the block key.
 The block key can be encapsulated using the public key of a principal who is to be given access.
 The resulting ciphertext is stored in the header of the block. Thus
 the person possessing the corresponding private key will be able to access the contents of
-the block. Blocks are arranged into trees, and the parent of the block also has a block key.
+the block. Blocks are arranged into trees, and the parent block also has a block key.
 The child's block key is always encapsulated using the parent's key and stored in the block
 header. This ensures that when a principal is given read access to a block, it automatically has
 access to every child of that block. The encapsulated block key is known as a read capability,
-or readcap, as it grants the holder the ability to read the block.
+or readcap, as it grants the holder the ability to read the block. In the case of the root block
+there is no parent block key to issue a readcap to. Instead, the root block always contains a
+readcap for the root key.
+
+A note on visualizing a blocktree. In this paper I will describe the root block as being at the
+bottom of the tree, and describe the descendants of a block as being above it. The reader may
+recognize this as the ``trees grow up'' model of formal logic, and it was chosen because such trees
+appear visually similar to real world trees when rendered in a virtual environment.
 
 Authenticity guarantees are provided using a digital signature scheme. In order to change the
 contents of a block a data structure called a write capability, or writecap, is needed. A
@@ -83,12 +93,12 @@ writecap is approximately an x509 certificate chain. A writecap contains the fol
 \item Optionally, the next writecap in the chain.
 \end{itemize}
 The last item is only excluded in the case of a self-signed writecap, i.e. one that was signed by
-the same principal it was issued to. A writecap is considered valid for use on a block if all
+the same principal it was issued to. A writecap is considered valid for use in a block if all
 of the following conditions are met:
 \begin{itemize}
 \item The signature on every writecap in the chain is valid.
 \item The signing principal matches the principal the next writecap was issued to for every
-write cap in the chain.
+writecap in the chain.
 \item The path of the block is contained in the path of every writecap in the chain.
 \item The current timestamp is strictly less than the expiration of all the writecaps in the
 chain.
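
The validity conditions visible in the hunk above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation: the `Writecap` field names, the `signed_bytes` encoding, and the `verify` callback are all hypothetical, and the paper may list further conditions beyond the four shown in this hunk. The check that the chain ends in a self-signed writecap is inferred from the sentence about the optional last item.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical signature check: (signer principal, signed bytes, signature) -> bool.
Verifier = Callable[[str, bytes, bytes], bool]


@dataclass(frozen=True)
class Writecap:
    issued_to: str      # principal the writecap was issued to
    path: tuple         # blocktree path the writecap covers, as components
    expires: float      # expiration timestamp
    signed_by: str      # principal that signed this writecap
    signature: bytes
    next: Optional["Writecap"] = None  # None only for a self-signed writecap

    def signed_bytes(self) -> bytes:
        # Illustrative encoding only; the real format is not given in the paper.
        return repr((self.issued_to, self.path, self.expires, self.signed_by)).encode()


def path_contains(prefix, path) -> bool:
    """True when `path` lies at or beneath `prefix`."""
    return tuple(path[: len(prefix)]) == tuple(prefix)


def writecap_valid(cap: Writecap, block_path, now: float, verify: Verifier) -> bool:
    link = cap
    while link is not None:
        # The signature on every writecap in the chain is valid.
        if not verify(link.signed_by, link.signed_bytes(), link.signature):
            return False
        # The path of the block is contained in the path of every writecap.
        if not path_contains(link.path, block_path):
            return False
        # The current timestamp is strictly less than every expiration.
        if not now < link.expires:
            return False
        if link.next is not None:
            # The signer must be the principal the next writecap was issued to.
            if link.signed_by != link.next.issued_to:
                return False
        elif link.signed_by != link.issued_to:
            # Assumption: the chain must terminate in a self-signed writecap.
            return False
        link = link.next
    return True
```

A caller would supply a real signature verifier for `verify`; walking the chain from the presented writecap towards the self-signed one mirrors how an X.509 certificate chain is checked.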
@@ -107,14 +117,14 @@ participating in a blocktree is referred to as a node. Multiple nodes may be run
 a single computer. Every node is attached to the blocktree at a specific path. This information
 is recorded in the directory where the node is attached. A node is responsible for the storage of
 the directory where it is attached and all of the blocks that are recursively contained within it.
-Of course if there is a child node attached to a subdirectory contained in the directory the node is
+If there is a child node attached to a subdirectory contained in the directory the node is
 responsible for, then the child node is responsible for the subdirectory.
 In this way data storage can be delegated, allowing the system to scale. When more than one
 node is attached to the same directory they form a cluster.
 Each node in the cluster contains a copy of the data that the cluster is responsible for. They
 maintain consistency of this data by running the Raft consensus protocol.
 
-When a new blocktree is created a node generates a key pair to serve use as the root keys.
+When a new blocktree is created a node generates a key pair to serve as the root keys.
 It is imperative for the security of the system that the root private key is protected, and it
 is highly recommended that it be stored in a Trusted Platform Module (TPM) and that the TPM
 be configured to disallow unauthenticated use of this key. The node then generates its own key pair
@@ -134,39 +144,55 @@ In order for the new node to be added to the user's blocktree, it needs to be is
 and a readcap must be added to the directory where it will be attached.
 This could be accomplished
 by providing a user interface on the node which received the public key from the new node.
-This interface would show the user the requests that have been received from new nodes attempting
-to join their blocktree. The user can then choose to approve or deny the request, and can specify
+This interface would show the requests that have been received from nodes attempting
+to join the blocktree. The user can then choose to approve or deny the request, and can specify
 the path where the new node will attach. If the user chooses to approve the request, then the writecap
 is signed using the node's key and transmitted to the new node.
 
 The ability to cope with key compromise is an important design consideration in any real-world
 cryptosystem. In blocktree the compromise of a node key is handled by re-keying every block under
-the block where the node was attached. Specifically, this means that a new block key is generated for
+the directory where the node was attached. Specifically, this means that a new block key is generated for
 each block, and the readcap for the compromised node is removed. This ensures that new writes to
 these blocks will not be visible to the holder of the compromised key. To ensure that writes
 will not be allowed, the root directory contains a revocation list containing the public keys
 which have been revoked by the tree. These only need to be maintained until their writecaps
 expire, after which time the list can be cleaned. Note that if the root private key is compromised or lost,
 then the blocktree must be abandoned; there is no recovery. This is real security: the artifact
-which grants control over the blocktree is the root private key. That is why storing the root
+which grants control over the blocktree is the root private key, which is why storing the root
 private key in multiple secure cryptographic co-processors is so important.
 
+Nodes in a blocktree communicate with one another by sending blocktree messages. These messages
+are used for implementing consensus and distributed locking,
+as well as notifying parent nodes when a
+child has written new data. The latter mechanism allows the parent to replicate the data stored in
+a child for redundancy. User code is also able to initiate the sending of messages. Messages are
+addressed using blocktree paths. When a node receives a message that is not addressed to it,
+but is addressed to its blocktree, it forwards it to the closest node to the recipient that it
+is connected to. In order to enable efficient low-latency message transfers, nodes maintain open
+connections to the other nodes in their cluster, and the cluster leader maintains a connection to
+its parent. Diffie-Hellman key exchange is used to exchange a key for use in an AEAD cipher, and
+once this cipher context is established, the two nodes mutually authenticate each other using their
+respective key pairs. When a node comes online, it uses the global blocktree (described later)
+to find the other nodes in its cluster. If it is not part of a cluster, or this information is
+not stored in the global blocktree, then it instead looks up the
+IP address of a root node and connects to it. The root node may then direct the node to connect to
+one of the root's children, and this process repeats until the new node is connected to its parent.
+
 A concept that has proven to be very useful in the world of filesystems is the symbolic link.
 This is a short file that contains the path to another file, and is interpreted by most programs
 as being a ``link'' to that file. Blocktree supports a similar system, where a block can be
-marked as a symbolic link and its body contains a blocktree path. This also provides us with
+marked as a symbolic link and a blocktree path placed in its body. This also provides us with
 a convenient way of storing readcaps for data that a node would otherwise not have access to.
 For instance a symbolic link could be created which points to a block in another user's blocktree.
 The other user only knows the public key of the owner of our blocktree, so they issue
-a readcap to it. But the root nodes, when given the user's password, can open this readcap and extract
+a readcap to it. But the root nodes can open this readcap and extract
 the block key. This key can then be encapsulated using the public key of the node which
 requires access, and placed in the symbolic link. When the node needs to read the data
 in the block, it opens the readcap in the symbolic link, follows the link to the block (how
 that actually happens will be discussed below) and decrypts its contents.
 
 While the consistency of individual blocks can be maintained using Raft, a distributed locking
-mechanism is employed to enable transactions which span multiple blocks.
-This is
+mechanism is employed to enable transactions which span multiple blocks. This is
 accomplished by exploiting the hierarchical arrangement of nodes in the tree. In order to describe
 this, it is first helpful to define a new term. The \emph{nodetree} of a blocktree is the tree obtained
 from the blocktree by collapsing all the blocks that a node (or cluster of nodes) is responsible