Browse source code

Made revisions to the paper and created an image of an example blocktree.

Matthew Carr 2 years ago
parent
commit
8665339a0f
3 changed files with 387 additions and 213 deletions
  1. 21 0
      doc/Book/example_bt.gv
  2. 124 0
      doc/Book/example_bt.svg
  3. 242 213
      doc/Paper/Paper.tex

+ 21 - 0
doc/Book/example_bt.gv

@@ -0,0 +1,21 @@
+digraph {
+    none [shape = none, label = ""]
+    root [shape = folder, label = ""];
+    phone [shape = invtriangle, label = ""];
+    editor [shape = circle, label = ""];
+    laptop [shape = invtriangle, label = ""];
+    viewer [shape = circle, label = ""];
+    docs [shape = folder, label = ""];
+    resignation [shape = box, label = ""];
+    contract [shape = box, label = ""];
+    avatar [shape = box, label = ""];
+    none -> root [label = "<root>"]
+    root -> phone [label = "phone"];
+    phone -> viewer [label = "<editor>"];
+    root -> laptop [label = "laptop"];
+    laptop -> editor [label = "<editor>"];
+    root -> docs [label = "docs"];
+    docs -> resignation [label = "resignation.docx"];
+    docs -> contract [label = "contract.docx"];
+    root -> avatar [label = "avatar.usd"];
+}

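A rough sketch of how the edge labels in `example_bt.gv` compose into block paths, given that the paper states each block has a path corresponding to its location in the tree. The edge list is copied from the graph above; the `/` separator, the use of `<root>` as a stand-in for the first path component, and the names `EDGES`, `block_paths` and `PATHS` are assumptions made for illustration, not part of the commit.

```python
# Edges copied from example_bt.gv as (parent node, child node, edge label).
EDGES = [
    ("root", "phone", "phone"),
    ("phone", "viewer", "<editor>"),
    ("root", "laptop", "laptop"),
    ("laptop", "editor", "<editor>"),
    ("root", "docs", "docs"),
    ("docs", "resignation", "resignation.docx"),
    ("docs", "contract", "contract.docx"),
    ("root", "avatar", "avatar.usd"),
]


def block_paths(edges, root="root", root_label="<root>"):
    """Map each node to the path formed by the edge labels leading to it.

    Assumption: path components are joined with "/" and the root label
    stands in for the first component of a fully qualified path.
    """
    # Group the outgoing edges of every node.
    children = {}
    for parent, child, label in edges:
        children.setdefault(parent, []).append((child, label))

    # Walk the tree from the root, extending the path one label at a time.
    paths = {}
    stack = [(root, "/" + root_label)]
    while stack:
        node, path = stack.pop()
        paths[node] = path
        for child, label in children.get(node, []):
            stack.append((child, path + "/" + label))
    return paths


PATHS = block_paths(EDGES)
```

Under these assumptions, `PATHS["contract"]` is the path of the `contract.docx` file block and `PATHS["viewer"]` is the path of the process running under the `phone` server.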
+ 124 - 0
doc/Book/example_bt.svg

@@ -0,0 +1,124 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
+ "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<!-- Generated by graphviz version 5.0.1 (0)
+ -->
+<!-- Pages: 1 -->
+<svg width="355pt" height="305pt"
+ viewBox="0.00 0.00 355.00 305.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
+<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 301)">
+<polygon fill="white" stroke="none" points="-4,4 -4,-301 351,-301 351,4 -4,4"/>
+<!-- none -->
+<g id="node1" class="node">
+<title>none</title>
+</g>
+<!-- root -->
+<g id="node2" class="node">
+<title>root</title>
+<polygon fill="none" stroke="black" points="162,-210 159,-214 138,-214 135,-210 108,-210 108,-174 162,-174 162,-210"/>
+</g>
+<!-- none&#45;&gt;root -->
+<g id="edge1" class="edge">
+<title>none&#45;&gt;root</title>
+<path fill="none" stroke="black" d="M135,-260.8C135,-249.16 135,-233.55 135,-220.24"/>
+<polygon fill="black" stroke="black" points="138.5,-220.18 135,-210.18 131.5,-220.18 138.5,-220.18"/>
+<text text-anchor="middle" x="154.5" y="-231.8" font-family="Times,serif" font-size="14.00">&lt;root&gt;</text>
+</g>
+<!-- phone -->
+<g id="node3" class="node">
+<title>phone</title>
+<polygon fill="none" stroke="black" points="27,-87 54,-114 0,-114 27,-87"/>
+</g>
+<!-- root&#45;&gt;phone -->
+<g id="edge2" class="edge">
+<title>root&#45;&gt;phone</title>
+<path fill="none" stroke="black" d="M107.93,-180.25C94.78,-174.27 79.23,-166 67,-156 55.42,-146.53 44.96,-133.2 37.65,-122.68"/>
+<polygon fill="black" stroke="black" points="40.54,-120.71 32.07,-114.33 34.71,-124.59 40.54,-120.71"/>
+<text text-anchor="middle" x="84" y="-144.8" font-family="Times,serif" font-size="14.00">phone</text>
+</g>
+<!-- laptop -->
+<g id="node5" class="node">
+<title>laptop</title>
+<polygon fill="none" stroke="black" points="99,-87 126,-114 72,-114 99,-87"/>
+</g>
+<!-- root&#45;&gt;laptop -->
+<g id="edge4" class="edge">
+<title>root&#45;&gt;laptop</title>
+<path fill="none" stroke="black" d="M121.46,-173.94C117.63,-168.46 113.75,-162.19 111,-156 106.52,-145.92 103.57,-133.91 101.71,-124.19"/>
+<polygon fill="black" stroke="black" points="105.12,-123.37 100.01,-114.09 98.22,-124.53 105.12,-123.37"/>
+<text text-anchor="middle" x="128" y="-144.8" font-family="Times,serif" font-size="14.00">laptop</text>
+</g>
+<!-- docs -->
+<g id="node7" class="node">
+<title>docs</title>
+<polygon fill="none" stroke="black" points="198,-123 195,-127 174,-127 171,-123 144,-123 144,-87 198,-87 198,-123"/>
+</g>
+<!-- root&#45;&gt;docs -->
+<g id="edge6" class="edge">
+<title>root&#45;&gt;docs</title>
+<path fill="none" stroke="black" d="M142.29,-173.8C147.31,-161.93 154.09,-145.93 159.8,-132.45"/>
+<polygon fill="black" stroke="black" points="163.05,-133.75 163.73,-123.18 156.6,-131.02 163.05,-133.75"/>
+<text text-anchor="middle" x="167.5" y="-144.8" font-family="Times,serif" font-size="14.00">docs</text>
+</g>
+<!-- avatar -->
+<g id="node10" class="node">
+<title>avatar</title>
+<polygon fill="none" stroke="black" points="270,-123 216,-123 216,-87 270,-87 270,-123"/>
+</g>
+<!-- root&#45;&gt;avatar -->
+<g id="edge9" class="edge">
+<title>root&#45;&gt;avatar</title>
+<path fill="none" stroke="black" d="M159.72,-173.94C167.58,-168.36 176.25,-162.04 184,-156 194.7,-147.67 206.14,-138.13 216.07,-129.63"/>
+<polygon fill="black" stroke="black" points="218.38,-132.26 223.67,-123.08 213.81,-126.96 218.38,-132.26"/>
+<text text-anchor="middle" x="229.5" y="-144.8" font-family="Times,serif" font-size="14.00">avatar.usd</text>
+</g>
+<!-- viewer -->
+<g id="node6" class="node">
+<title>viewer</title>
+<ellipse fill="none" stroke="black" cx="27" cy="-18" rx="18" ry="18"/>
+</g>
+<!-- phone&#45;&gt;viewer -->
+<g id="edge3" class="edge">
+<title>phone&#45;&gt;viewer</title>
+<path fill="none" stroke="black" d="M27,-86.8C27,-75.16 27,-59.55 27,-46.24"/>
+<polygon fill="black" stroke="black" points="30.5,-46.18 27,-36.18 23.5,-46.18 30.5,-46.18"/>
+<text text-anchor="middle" x="51.5" y="-57.8" font-family="Times,serif" font-size="14.00">&lt;editor&gt;</text>
+</g>
+<!-- editor -->
+<g id="node4" class="node">
+<title>editor</title>
+<ellipse fill="none" stroke="black" cx="99" cy="-18" rx="18" ry="18"/>
+</g>
+<!-- laptop&#45;&gt;editor -->
+<g id="edge5" class="edge">
+<title>laptop&#45;&gt;editor</title>
+<path fill="none" stroke="black" d="M99,-86.8C99,-75.16 99,-59.55 99,-46.24"/>
+<polygon fill="black" stroke="black" points="102.5,-46.18 99,-36.18 95.5,-46.18 102.5,-46.18"/>
+<text text-anchor="middle" x="123.5" y="-57.8" font-family="Times,serif" font-size="14.00">&lt;editor&gt;</text>
+</g>
+<!-- resignation -->
+<g id="node8" class="node">
+<title>resignation</title>
+<polygon fill="none" stroke="black" points="198,-36 144,-36 144,0 198,0 198,-36"/>
+</g>
+<!-- docs&#45;&gt;resignation -->
+<g id="edge7" class="edge">
+<title>docs&#45;&gt;resignation</title>
+<path fill="none" stroke="black" d="M171,-86.8C171,-75.16 171,-59.55 171,-46.24"/>
+<polygon fill="black" stroke="black" points="174.5,-46.18 171,-36.18 167.5,-46.18 174.5,-46.18"/>
+<text text-anchor="middle" x="216.5" y="-57.8" font-family="Times,serif" font-size="14.00">resignation.docx</text>
+</g>
+<!-- contract -->
+<g id="node9" class="node">
+<title>contract</title>
+<polygon fill="none" stroke="black" points="312,-36 258,-36 258,0 312,0 312,-36"/>
+</g>
+<!-- docs&#45;&gt;contract -->
+<g id="edge8" class="edge">
+<title>docs&#45;&gt;contract</title>
+<path fill="none" stroke="black" d="M198.36,-90.63C201.25,-89.35 204.17,-88.12 207,-87 231.32,-77.4 243.58,-86.5 263,-69 269.78,-62.88 274.58,-54.29 277.92,-45.95"/>
+<polygon fill="black" stroke="black" points="281.25,-47.04 281.2,-36.45 274.63,-44.75 281.25,-47.04"/>
+<text text-anchor="middle" x="310" y="-57.8" font-family="Times,serif" font-size="14.00">contract.docx</text>
+</g>
+</g>
+</svg>

+ 242 - 213
doc/Paper/Paper.tex

@@ -8,7 +8,7 @@
 \title{Blocktree \\
 \large A platform for distributed computing.}
 \author{Matthew Carr}
-\date{May 28, 2022}
+%\date{May 28, 2022}
 \begin{document}
 \maketitle
 \begin{abstract}
@@ -25,91 +25,116 @@ the details of backing up user data and implementing access controls to facilita
 safe sharing. However, because these are closed systems, users are forced to trust that
 the operators are benevolent, and they lack any real way of ensuring that the access
 controls they prescribe will actually be enforced. There have been several systems proposed
-as alternatives to the conventional model, but these systems suffer from their own shortcomings.
-They either assume the need for cloud storage providers (Blockstack \cite{blockstack}) or implement
-all operations using a global blockchain, limiting performance (Filecoin \cite{filecoin}).
-Blocktree takes a different approach.
-
-The idea behind blocktree is to organize a user's computers into a cooperative unit, called a
-blocktree. The user is said to own the blocktree, and they wield sovereign authority over it.
-The artifact granting them this authority is the root private key of the blocktree. Measures for protecting
-this key and delegating its authority are important design considerations of the system.
+as alternatives to the conventional model \cite{blockstack, filecoin}, but these systems
+suffer from their own shortcomings, which Blocktree aims to address.
+
+The idea behind blocktree is to organize a user's computing devices and data into a single
+tree, called a blocktree. The user is said to own the blocktree, and they wield sovereign authority
+over it.
+The artifacts granting them this authority are their private keys, one for use in an encryption
+scheme and the other in a signing scheme.
+Measures for managing these keys and delegating their authority are important design considerations
+of the system.
 The owners of blocktrees are encouraged to collaborate with each other to replicate data by
-means of a cryptocurrency known as blockcoin. The blockchain implementing this cryptocurrency
-is the source of global state for the system, and allows for the creation of global paths.
-
-All data stored in blocktree is contained in units called blocks. As the name would suggest, the blocks
-in a blocktree form a tree. Each block has a path corresponding to its location in the tree. The first
-component of a fully qualified blocktree path is the fingerprint of the root public key of the
-blocktree. Thus a blocktree path can globally specify a block. If a block is not a leaf,
-then it is called a directory, and the data it contains is managed by the system.
-This information includes the list of blocks which are children of the directory as well as the
-list of nodes which are attached to the block tree at this directory. In addition
-to its payload of data,
-each block has metadata containing cryptographic access control mechanisms. These mechanisms ensure
+means of a cryptocurrency known as blockcoin.
+The blockchain implementing this cryptocurrency is the source of global state for the system,
+and allows for the creation of a name resolution mechanism.
+
+A blocktree consists of four different types of blocks: files, directories, servers and processes.
+Each block has a path corresponding to its location in the tree.
+A server is responsible for storing the files and directories that are children of the directory
+it's contained in.
+When multiple servers are contained in the same directory, they form a cluster, and run
+the Raft consensus protocol \cite{raft} to ensure consistency of the data they store.
+A process is either the child of a server, or the child of the process which started it.
+Processes operate under the actor model and exchange messages that are addressed using paths.
+System calls are implemented using this same messaging facility.
+
+The first component of a fully qualified path is the fingerprint of the blocktree owner's public
+signing key, allowing a block to be globally specified.
+The contents of directories are managed by the system,
+and they may contain links to files, other directories and servers.
+In addition to its payload of data, each file and directory has system-managed metadata.
+These metadata are used to implement cryptographic access control mechanisms which ensure
 that only authorized users can read and optionally write to the block.
-Users and nodes in the blocktree system are identified by hashes of their public keys. These hashes
-are referred to as principals, and they are used for setting access control policy.
+Access control for a given security principal is set using a hash of the principal's public signing
+key.
 
 This paper is intended to be a short introduction to the ideas of blocktree. A book is planned which
 will specify the system in greater detail. In keeping with the agile software methodology, this
 book is being written concurrently with an open source implementation of the system.
 The remainder of this paper is organized as follows:
-\begin{itemize}
+\begin{enumerate}
 \item A description of the operations of a single blocktree.
 \item The definition of a blockchain which provides global state and links individual blocktrees
 together.
 \item The programming interface for interacting with blocktrees and sending messages.
 \item An exploration of applications that could be written using this platform.
 \item Conclusion.
-\end{itemize}
+\end{enumerate}
 
 \section{An Individual Blocktree}
-The atomic unit of data storage, confidentiality and authenticity is the block. A
-block contains a payload of data. Confidentiality of this data is achieved by encrypting it using 
-a symmetric cipher using a random key. This random key is known as the block key.
-The block key can be encapsulated using the public key of a principal whose is to be given access.
-The resulting ciphertext is stored in the metadata of the block. Thus
-the person possessing the corresponding private key will be able to access the contents of
-the block. Blocks are arranged into trees, and the parent block also has a block key.
-The child's block key is always encapsulated using the parent's key and stored in the block
-metadata. This ensures that when a principal is given read access to a block, it automatically has
-access to every child of that block. The encapsulated block key is known as a read capability,
-or readcap, as it grants the holder the ability to read the block. In the case of the root block
-there is no parent block key to issue a readcap to. Instead, the root block always contains a
-readcap for the the root key.
-
-A note on visualizing a blocktree. In this paper I will describe the root block as being at the
-bottom of the tree, and describe the descendants of a block as being above it. The reader may
-recognize this as the ``trees grow up" model of formal logic, and it was chosen because such trees
-appear visually similar to real world trees when rendered in a virtual environment.
-
-Authenticity guarantees are provided using a digital signature scheme. In order to change the
+Files and directories are collectively known as data blocks. Data blocks have three components:
+\begin{enumerate}
+  \item Their body, which is a sequence of bytes. This is managed by user code in the case of files
+    and by the system in the case of directories.
+  \item Their metadata, which is managed by the system.
+  \item The log of events which have been committed, or are in the process of being committed.
+    This is managed by the system.
+\end{enumerate}
+
+The body of a data block and its metadata are both covered by integrity protection.
+The log is not directly covered, but the events in it are.
+The body of the block may be optionally encrypted to provide it with confidentiality protection.
+
+Confidentiality of the body is achieved by encrypting it with a symmetric cipher using a random
+key.
+This random key is known as the block key.
+When a principal is to be given access to a data block, its public encryption key is used to
+encrypt the block key.
+The resulting ciphertext is referred to as a read capability, or readcap for short.
+It is stored in the metadata of the block in a dictionary under the key which is computed by hashing
+the principal's public signing key.
+We say that the principal has been \emph{issued} a readcap for the block.
+When the principal that was issued the readcap wishes to read the block, it looks for a hash of its public
+signing key in the block's metadata, and if it finds a value, uses its private encryption key
+to decrypt the block key.
+
+If a block is not the root, then its parent's block key is used to encrypt its block key and the
+result is stored in the block's metadata.
+This enables anyone with a readcap for a block to read all blocks which are descended from it.
+This also enables the contents of any block to be decrypted using only a single private key
+decryption operation.
+The root always contains a readcap for the owner, ensuring they can grant a readcap to any block
+in their blocktree.
+
+Integrity protection is provided using a digital signature scheme. In order to change the
 contents of a block a data structure called a write capability, or writecap
 \footnote{The names readcap and writecap were taken from the Tahoe Least-Authority Filesystem
 \cite{tahoe}. The access control mechanism described in the Tahoe system heavily influenced the
 design of Blocktree.},
-is needed. A
-writecap is approximately an x509 certificate chain. A writecap contains the following data:
+is used.
+A writecap is a certificate chain and it contains the following data:
 \begin{itemize}
 \item The path the writecap can be used under.
 \item The principal that the writecap was issued to.
 \item The timestamp when the writecap expires.
 \item The public key of the principal who issued the writecap.
-\item A digital signature produced by the private key corresponding to the above public key.
-\item Optionally, the next writecap in the chain.
+\item A digital signature produced by the private key corresponding to the public key above.
+\item Optionally, the next certificate in the chain.
 \end{itemize}
 The last item is only excluded in the case of a self-signed writecap, i.e. one that was signed by
 the same principal it was issued to. A writecap is considered valid for use in a block if all
 of the following conditions are met:
 \begin{itemize}
-\item The signature on every writecap in the chain is valid.
-\item The signing principal matches the principal the next writecap was issued to for every
-writecap in the chain.
-\item The path of the block is contained in the path of every writecap in the chain.
-\item The current timestamp is strictly less than the expiration of all the writecaps in the
-chain.
-\item The principal corresponding to the public key used to sign the last writecap in the chain,
+\item The signature on every certificate in the chain is valid.
+\item The signing principal matches the principal the next certificate was issued to for every
+certificate in the chain.
+\item The path in every certificate is contained in the path of the next certificate in the
+chain.
+\item The path of the block is contained in the path of every certificate in the chain.
+\item The current timestamp is strictly less than the expiration of all the certificates.
+\item The principal corresponding to the public key used to sign the last certificate
 is the owner of the blocktree.
 \end{itemize}
 The intuition behind these rules is that a writecap is only valid if there is a chain of trust
@@ -117,55 +142,55 @@ that leads back to the owner of the blocktree. The owner may delegate their trus
 of intermediaries by issuing them writecaps. These writecaps are scoped based on the path
 specified when they are issued. These intermediaries can then delegate this trust as well.
 A block is considered valid if it contains a valid writecap, it was signed using the key
-corresponding to the first writecap's public key, and this signature is valid. Note that because
+corresponding to the first certificate's public key, and this signature is valid. Note that because
 the first component of the path is the fingerprint
-\footnote{By \emph{fingerprint} I mean the base64url encoding of the root principal, which is
-itself a hash of the root public key.}
-of the root public key, and the path is contained
-in the block, a block can be verified using only information contained within it.
+\footnote{By \emph{fingerprint} I mean the base64url encoding of a hash of the owner's public
+signing key.}
+of the owner's public signing key, anyone who receives a block and knows the block's path can verify
+that the block has not been altered.
 
 Blocks are used for more than just organizing data; they also organize computation. A program
-participating in a blocktree is referred to as a node. Multiple nodes may be run on
-a single computer. Every node is attached to the blocktree at a specific path. This information
-is recorded in the directory where the node is attached. A node is responsible for the storage of
-the directory where it is attached and all of the blocks that are recursively contained with in it.
-If there is a child node attached to a subdirectory contained in the directory the node is
-responsible for, then the child node is responsible for the subdirectory.
+participating in a blocktree is referred to as a server. Multiple servers may be run on
+a single computer. Every server is contained in a directory in the blocktree.
+A server is responsible for the storage of
+the directory where it is attached and all of the blocks that are recursively contained within it.
+If there is a child server attached to a subdirectory contained in the directory the server is
+responsible for, then the child server is responsible for the subdirectory.
 In this way data storage can be delegated, allowing the system to scale. When more than one
-node is attached to the same directory they form a cluster.
-Each node in the cluster contains a copy of the data that the cluster is responsible for. They
+server is attached to the same directory they form a cluster.
+Each server in the cluster contains a copy of the data that the cluster is responsible for. They
 maintain consistency of this data by running the Raft \cite{raft} consensus protocol.
 
-When a new blocktree is created a node generates a key pair to serve as the root keys.
+When a new blocktree is created a server generates a key pair to serve as the root keys.
 It is imperative for the security of the system that the root private key is protected, and it
 is highly recommended that it be stored in a Trusted Platform Module (TPM) \cite{tpm} and that
 the TPM
-be configured to disallow unauthenticated use of this key. The node then generates its own key pair
+be configured to disallow unauthenticated use of this key. The server then generates its own key pair
 and uses the root private key to issue itself a writecap for the root of the tree. Once it has
 this writecap, it creates the root block and generates a block key for it. A readcap is added to
-this block for the root public key and the node's public key. Additional
-cryptographic operations are performed using the node's key pair, and only when a new writecap
-needs to be created for an addition root node is the root private key used.
+this block for the root public key and the server's public key. Additional
+cryptographic operations are performed using the server's key pair, and only when a new writecap
+needs to be created for an additional root server is the root private key used.
 
-When a new node comes online and wishes to join the blocktree, it generates its own key pair.
+When a new server comes online and wishes to join the blocktree, it generates its own key pair.
 The public key of this
-node then needs to be transmitted to another node that's already part of the user's blocktree. The
-mechanism used will depend on the nature of the device on which the new node is running.
+server then needs to be transmitted to another server that's already part of the user's blocktree. The
+mechanism used will depend on the nature of the device on which the new server is running.
 For example, a phone could scan a QR code which contains
-the IP address of the user's root node, and then transmit its public key to that internet host.
-In order for the new node to be added to the user's blocktree, it needs to be issued a writecap
+the IP address of the user's root server, and then transmit its public key to that internet host.
+In order for the new server to be added to the user's blocktree, it needs to be issued a writecap
 and a readcap must be added to the directory where it will be attached.
 This could be accomplished
-by providing a user interface on the node which received the public key from the new node.
-This interface would show the requests that have been received from nodes attempting
+by providing a user interface on the server which received the public key from the new server.
+This interface would show the requests that have been received from servers attempting
 to join the blocktree. The user can then choose to approve or deny the request, and can specify
-the path where the new node will attach. If the user chooses to approve the request, then the writecap
-is signed using the node's key and transmitted to then new node.
+the path where the new server will attach. If the user chooses to approve the request, then the writecap
+is signed using the server's key and transmitted to the new server.
 
 The ability to cope with key compromise is an important design consideration in any real-world
-cryptosystem. In blocktree the compromise of a node key is handled by re-keying every block under
-the directory where the node was attached. Specifically, this means that a new block key is generated for
-each block, and the readcap for the compromised node is removed. This ensures that new writes to
+cryptosystem. In blocktree the compromise of a server key is handled by re-keying every block under
+the directory where the server was attached. Specifically, this means that a new block key is generated for
+each block, and the readcap for the compromised server is removed. This ensures that new writes to
 these blocks will not be visible to the holder of the compromised key. To ensure that writes
 will not be allowed, the root directory contains a revocation list containing the public keys
 which have been revoked by the tree. These only need to be maintained until their writecaps
@@ -174,72 +199,72 @@ then the blocktree must be abandoned, there is no recovery. This is real securit
 which grants control over the blocktree is the root private key which is why storing the root
 private key in multiple secure cryptographic co-processors is so important.
 
-Nodes in a blocktree communicate with one another by sending blocktree messages. These messages
+Servers in a blocktree communicate with one another by sending blocktree messages. These messages
 are used for implementing consensus and distributed locking,
-as well as notifying parent nodes when a 
+as well as notifying parent servers when a
 child has written new data. The latter mechanism allows the parent to replicate the data stored in
 a child for redundancy. User code is also able to initiate the sending of messages. Messages are
-addressed using blocktree paths. When a node receives a message that is not addressed to it,
-but is addressed to its blocktree, it forwards it to the closest node to the recipient that it
-is connected to. In order to enable efficient low-latency message transfers, nodes maintain open
-TCP connections to the other nodes in their cluster, and the cluster leader maintains a connection to
+addressed using blocktree paths. When a server receives a message that is not addressed to it,
+but is addressed to its blocktree, it forwards it to the closest server to the recipient that it
+is connected to. In order to enable efficient low-latency message transfers, servers maintain open
+TCP connections to the other servers in their cluster, and the cluster leader maintains a connection to
 its parent. Diffie-Hellman key exchange is used to exchange a key for use in an AEAD cipher, and
-once this cipher context is established, the two nodes mutually authenticate each other using their
-respective key pairs. When a node comes online, it uses the global blocktree (described later)
-to find the other nodes in its cluster. If it is not part of a cluster, or this information is
+once this cipher context is established, the two servers mutually authenticate each other using their
+respective key pairs. When a server comes online, it uses the global blocktree (described later)
+to find the other servers in its cluster. If it is not part of a cluster, or this information is
 not stored in the global blocktree, then it instead looks up the
-IP address of a root node and connects to it. The root node may direct the node to connect to
-one of root's children, and this process repeats until the new node is connected to its parent.
+IP address of a root server and connects to it. The root server may direct the server to connect to
+one of root's children, and this process repeats until the new server is connected to its parent.
 
 A concept that has proven to be very useful in the world of filesystems is the symbolic link.
 This is a short file that contains the path to another file, and is interpreted by most programs
 as being a ``link'' to that file. Blocktree supports a similar system, where a block can be
 marked as a symbolic link and a blocktree path placed in its body. This also provides us with
-a convenient way of storing readcaps for data that a node would otherwise not have access to.
+a convenient way of storing readcaps for data that a server would otherwise not have access to.
 For instance a symbolic link could be created which points to a block in another user's blocktree.
 The other user only knows the public key of the owner of our blocktree, so they issue
-a readcap to it. But the root nodes can open this readcap and extract
-the block key. This key can then be encapsulated using the public key of the node which
-requires access, and placed in the symbolic link. When the node needs to read the data
+a readcap to it. But the root servers can open this readcap and extract
+the block key. This key can then be encapsulated using the public key of the server which
+requires access, and placed in the symbolic link. When the server needs to read the data
 in the block, it opens the readcap in the symbolic link, follows the link to the block (how
 that actually happens will be discussed below) and decrypts its contents.
 
 While the consistency of individual blocks can be maintained using Raft, a distributed locking
 mechanism is employed to enable transactions which span multiple blocks. This is
-accomplished by exploiting the hierarchical arrangement of nodes in the tree. In order to describe
-this, its first helpful to define a new term. The \emph{nodetree} of a blocktree is tree obtained
-from the blocktree by collapsing all the blocks that a node (or cluster of nodes) is responsible
-for into a single logical block representing the node itself. Thus we can talk about a node having
-a parent, and by this we mean its parent in the nodetree. In terms of the blocktree, the parent
-of a node is the first node encountered when the path back to the root is traversed.
+accomplished by exploiting the hierarchical arrangement of servers in the tree. In order to describe
+this, it's first helpful to define a new term. The \emph{servertree} of a blocktree is the tree obtained
+from the blocktree by collapsing all the blocks that a server (or cluster of servers) is responsible
+for into a single logical block representing the server itself. Thus we can talk about a server having
+a parent, and by this we mean its parent in the servertree. In terms of the blocktree, the parent
+of a server is the first server encountered when the path back to the root is traversed.
 Now, distributed locking works as follows:
 \begin{itemize}
-\item A node sends a request to lock a block to the current consensus leader in its cluster.
-  If the node is not part of a cluster, then it is the leader. This request contains a timestamp
+\item A server sends a request to lock a block to the current consensus leader in its cluster.
+  If the server is not part of a cluster, then it is the leader. This request contains a timestamp
   for when the lock expires.
 \item If the leader is responsible for the block then it moves on to the next step. Otherwise
   it contacts its parent and forwards the lock request and this step is repeated for the parent.
-\item The responsible node checks to see if there is already a lock for this block. If there is
+\item The responsible server checks to see if there is already a lock for this block. If there is
   then the request fails. Otherwise the request succeeds and a lock is placed on the block. A
-  message indicating the result is then passed back up the tree ending at the original node. This
-  message includes the principal of the node enforcing the lock.
-\item Once the locking node is done making its updates it sends a message directly to the node
+  message indicating the result is then passed back up the tree ending at the original server. This
+  message includes the principal of the server enforcing the lock.
+\item Once the locking server is done making its updates it sends a message directly to the server
   enforcing the lock, causing it to be removed.
 \end{itemize}
 Locking a block locks the subtree rooted at that block. Thus no writes to any path contained in
-the path of the locked block will be allowed, unless they come from locking node. If the locking
-node does not send the message
+the path of the locked block will be allowed, unless they come from the locking server. If the locking
+server does not send the message
 unlocking the block before the lock expires, then the modifications which had been performed by
-it are dropped and the block reverts to its prior state. Since the locking node is the leader
+it are dropped and the block reverts to its prior state. Since the locking server is the leader
 of the consensus cluster that is responsible for the block, this guarantees that
-writes from other nodes will not be accepted.
+writes from other servers will not be accepted.
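
The locking steps above can be sketched as follows. This is only an illustrative sketch: the `Server` class, its field names, and the choice to treat an expired lock as absent are assumptions, not part of the paper's design. Consensus within a cluster, reverting writes on expiry, and the fact that a lock also covers the subtree beneath the block are all omitted.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Server:
    """Hypothetical server in the servertree; all names are illustrative."""
    principal: str
    owned: Set[str]                      # block paths this server is responsible for
    parent: Optional["Server"] = None
    locks: Dict[str, tuple] = field(default_factory=dict)  # path -> (holder, expires)

    def request_lock(self, path: str, holder: str, expires: float,
                     now: Optional[float] = None) -> Optional[str]:
        """Return the principal enforcing the lock, or None if the request fails."""
        now = time.time() if now is None else now
        if path not in self.owned:
            # Not responsible for the block: forward the request to the parent.
            if self.parent is None:
                return None
            return self.parent.request_lock(path, holder, expires, now)
        current = self.locks.get(path)
        if current is not None and current[1] > now:
            return None                  # an unexpired lock already exists
        self.locks[path] = (holder, expires)
        return self.principal

    def unlock(self, path: str, holder: str) -> bool:
        """Remove the lock, but only if it is held by `holder`."""
        current = self.locks.get(path)
        if current is not None and current[0] == holder:
            del self.locks[path]
            return True
        return False
```

In this sketch the result travels back to the requester through the ordinary return path of the recursive call, while the unlock is sent directly to the enforcing server, mirroring the last step of the list.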
 
 \section{Connecting Blocktrees}
-In order to allow nodes to access blocks in other blocktrees, a global ledger of events is used.
+In order to allow servers to access blocks in other blocktrees, a global ledger of events is used.
 This ledger is implemented using a proof of work (PoW) blockchain and a corresponding 
-cryptocurrency known as blockcoin. Nodes mine chain blocks (not to be confused with the tree 
+cryptocurrency known as blockcoin. Servers mine chain blocks (not to be confused with the tree 
 blocks we've been discussing up till now) in the same way they do in other PoW blockchain
-systems such as Bitcoin \cite{bitcoin}. The node which manages to mine the next chain block receives
+systems such as Bitcoin \cite{bitcoin}. The server which manages to mine the next chain block receives
 a reward,
 which is the sum of the fees for each event in the chain and a variable amount of newly minted
 blockcoin. The amount of new blockcoin created by a chain block is directly proportional to the
@@ -247,7 +272,7 @@ amount of data storage events contained in the chain block. Thus the total amoun
 in circulation has a direct relationship to the amount of data stored in the system, reflecting
 the fact that blockcoin exists to provide an accounting mechanism for data.
 
-When a node writes data to a tree block, and it wishes this block to be globally accessible or
+When a server writes data to a tree block, and it wishes this block to be globally accessible or
 replicated for redundancy,
 it produces what are called fragments. Fragments are the output symbols of the RaptorQ code
 \cite{raptorq}. This algorithm is an example of a fountain code, which is a class of erasure codes
@@ -256,41 +281,41 @@ the original data. Such a code ensures that even if some of the fragments are lo
 remain, the original data can be recovered.
 
 Once these fragments have been computed an event is created for each one and published to the
-blockchain. This event indicates to other nodes that this node wishes to store a fragment and
+blockchain. This event indicates to other servers that this server wishes to store a fragment and
 states the amount of blockcoin it will pay for each of the fragment's maintenance payments. When
-another nodes wishes to accept the offer, it directly contacts the first node, which then sends 
+another server wishes to accept the offer, it directly contacts the first server, which then sends
 it the fragment and publishes an event stating that the fragment is stored with the second 
-node. This event includes the path of the block the fragment was computed from, the fragment's 
-ID (the sequence number from the erasure code), and the principal of the node which stored it.
-Thus any other node in the network can use the information contained in this event to
-determine the set of nodes which contain the fragments of any given path.
-
-In order for the node which stored a fragment to receive its next payment, it has to pass
-a time-bound challenge-response protocol initiated by the node that owns the fragment.
-The owning node select a leaf in the Merkle tree of the fragment and sends the index of
-this leaf to the storing node. The storing node then walks the path from this leaf back to
+server. This event includes the path of the block the fragment was computed from, the fragment's 
+ID (the sequence number from the erasure code), and the principal of the server which stored it.
+Thus any other server in the network can use the information contained in this event to
+determine the set of servers which contain the fragments of any given path.
+
+In order for the server which stored a fragment to receive its next payment, it has to pass
+a time-bound challenge-response protocol initiated by the server that owns the fragment.
+The owning server selects a leaf in the Merkle tree of the fragment and sends the index of
+this leaf to the storing server. The storing server then walks the path from this leaf back to
 the root of the Merkle tree, and updates a hash value using the data in each node it traverses.
-It sends this result back to the owning node which verifies that this value matches its
-own computation. If it does then the owning node signs a message indicating that the challenge
-passed and that the storing node should be paid. The storing node receives this message and uses
+It sends this result back to the owning server which verifies that this value matches its
+own computation. If it does, then the owning server signs a message indicating that the challenge
+passed and that the storing server should be paid. The storing server receives this message and uses
 it to construct an event, which it signs and publishes to the blockchain. This event causes
-the blockcoin amount specified to be transferred from the owning node's to the storing
-node's blocktree.
+the blockcoin amount specified to be transferred from the owning server's to the storing
+server's account.
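
The challenge-response exchange can be illustrated with a small sketch. It assumes SHA-256, a binary Merkle tree that duplicates the last node of an odd level, and a response formed by hashing the node digests along the path; none of these details are specified in the paper, and the function names are hypothetical. Timing, signatures, and payment events are left out.

```python
import hashlib
from typing import List


def build_merkle(chunks: List[bytes]) -> List[List[bytes]]:
    """Return the tree as a list of levels, leaves first and root last."""
    level = [hashlib.sha256(c).digest() for c in chunks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            # Assumption: pad an odd level by repeating its last node.
            level = level + [level[-1]]
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def respond(levels: List[List[bytes]], leaf_index: int) -> bytes:
    """Fold every node on the path from the chosen leaf to the root into one hash."""
    acc = hashlib.sha256()
    index = leaf_index
    for level in levels:
        acc.update(level[index])
        index //= 2                      # move to the parent in the next level
    return acc.digest()
```

The owning server would run `respond` on its own copy of the tree and compare the result with the value returned by the storing server; the challenge passes only if the two digests match.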
 
-The fact that payments occur over time provides a simple incentive for nodes to be honest and
-store the data they agree to. In banking terms, the storing node views the fragment as an
+The fact that payments occur over time provides a simple incentive for servers to be honest and
+store the data they agree to. In banking terms, the storing server views the fragment as an
 asset: it is a loan of its disk space which provides a series of payments over time.
-On the other hand the owning node views the fragment as a liability, it requires payments to
+On the other hand, the owning server views the fragment as a liability: it requires payments to
 be made over time. In order for a blocktree owner to remain solvent, it must balance its
 liabilities with its assets, incentivizing it to store data for others so that its own data
 will be stored.
 
-In order for nodes to be able to contact other nodes, a mechanism is required for associating
-an internet protocol (IP) address with a principal. This is done by having nodes publish events
+In order for servers to be able to contact other servers, a mechanism is required for associating
+an internet protocol (IP) address with a principal. This is done by having servers publish events
 to the blockchain when their IP address changes. This event includes their new IP address,
-their public key, and a digital signature computed using their private key. Other nodes can
+their public key, and a digital signature computed using their private key. Other servers can
 then verify this signature to ensure that an attacker cannot bind the wrong
-IP address to a principal in order to receive messages not meant for it.
+IP address to a principal in order to receive messages meant for it.
 
 While this event ledger is useful for appending new 
 events, and ensuring previous events
@@ -298,78 +323,82 @@ cannot be changed, another data structure is required to enable efficient querie
 In particular, it's important to be able to quickly perform the
 following queries:
 \begin{itemize}
-\item Find the set of nodes storing the fragments for a given path.
-\item Find the IP address of a node or owner given a principal.
+\item Find the set of servers storing the fragments for a given path.
+\item Find the IP address of a server or owner given a principal.
 \item Find the public key associated with a principal.
 \end{itemize}
-To enable these queries a special blocktree is maintained by each node in the network: the global
+To enable these queries a special blocktree is maintained by each server in the network: the global
 blocktree. This tree does not support the usual writing and locking semantics of local blocktrees.
 In functional programming terms, it can be thought of as a left fold over all of the events in the
-blockchain.
+blockchain starting from the empty state.
 The above queries are facilitated by the following blocks:
 \begin{itemize}
 \item \emph{/global/fragments}: this block contains a hashtable where the key is a path and the
-value is the list of nodes storing the fragments for the block at that path.
+value is the list of servers storing the fragments for the block at that path.
 \item \emph{/global/principals}: contains a hashtable where the key is a principal and the value
 is the tuple containing the public key of that principal, its current IP address, and its current
 blockcoin balance.
 \end{itemize}
-To compute the entries in these tree blocks, the nodes in the network iterate over all the chain blocks, updating
-their local copy of each tree block appropriately. The experienced reader will recognize that this is an event 
+To compute the entries in these tree blocks, the servers in the network iterate over all the chain blocks, updating
+their local copy of each tree block appropriately. The reader may recognize this as an event-sourced
 architecture. Currently only these two tree blocks are known to be needed, but if new events are
 added to the system it can be easily extended to enable queries that have yet to be envisioned.
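
The left fold that produces the global blocktree can be sketched as below. The event kinds (`fragment_stored`, `address_changed`, `payment`) and their field names are hypothetical stand-ins for whatever events the blockchain actually carries; only the two block paths come from the text.

```python
from functools import reduce
from typing import Any, Dict, Iterable

State = Dict[str, Dict[str, Any]]


def empty_state() -> State:
    """The empty state the fold starts from."""
    return {"/global/fragments": {}, "/global/principals": {}}


def _principal(state: State, name: str) -> Dict[str, Any]:
    # Create the (public key, IP address, balance) record on first use.
    return state["/global/principals"].setdefault(
        name, {"public_key": None, "ip": None, "balance": 0})


def apply_event(state: State, event: Dict[str, Any]) -> State:
    """Apply one chain event to the state (event shapes are assumptions)."""
    kind = event["kind"]
    if kind == "fragment_stored":
        servers = state["/global/fragments"].setdefault(event["path"], [])
        servers.append(event["server"])
    elif kind == "address_changed":
        entry = _principal(state, event["principal"])
        entry["public_key"] = event["public_key"]
        entry["ip"] = event["ip"]
    elif kind == "payment":
        _principal(state, event["from"])["balance"] -= event["amount"]
        _principal(state, event["to"])["balance"] += event["amount"]
    return state


def replay(events: Iterable[Dict[str, Any]]) -> State:
    """Left fold over all events, starting from the empty state."""
    return reduce(apply_event, events, empty_state())
```

Supporting a new query would then amount to adding another block to the state and another branch to `apply_event`.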
 
 \section{Programming Environment}
 Enabling an excellent developer experience is one of the primary goals of this system (the others being security
-and scalability). Nodes execute user code that has been compiled into WebAssembly modules \cite{wasm}. Such code
-running on a blocktree node is referred to as an "app". An app
-executes in a sandbox that isolates it from other code, as well as the security critical operations of the node
-itself. The sandbox provides the code with an extension of the WebAssembly System Interface (WASI), with extra 
-system calls to interact with the particulars of the blocktree system.
-The extra system calls fall into these categories:
+and performance). Servers execute user code that has been compiled into WebAssembly modules \cite{wasm}. Such code
+running on a blocktree server is referred to as a process. A process 
+executes in a sandbox that isolates it from other processes, as well as the security critical operations of the server
+itself. The sandbox provides the code with an extension of the WebAssembly System Interface (WASI),
+with extra functions to send and receive blocktree messages.
+The standard POSIX filesystem APIs are used to interact with the contents of blocktrees.
+For instance, a file descriptor for a block can be obtained by calling path\_open.
+Additional non-POSIX functionality is implemented by adding
+messages which are handled by the server. This functionality includes the following:
 \begin{itemize}
 \item Distributed Locking
 \item Messaging
 \item Supervision Trees
 \item Protocol Contracts (Session Types)
 \end{itemize}
-The standard WASI filesystem APIs are used
-to interact with the contents of blocktrees. For instance a file descriptor for a block can be obtained
-by calling path\_open. Writes and reads of blocks are performed using the privileges of the node on which
-the app is running, but the node may limit the app's access depending on its permissions.
-
-When an app is installed it is given a directory under which it can store data that is shared between all nodes
-in the blocktree. The path of this block is formed by prefixing the path the app was published at
-with the string ``/apps". When an app is runs on a node, it is confined to a block contained
-in the node's directory. It is only allowed read and write blocks in this block, but to allow it to
-access shared data, a symbolic link is created to the app's shared directory in ``/apps". 
-
-% App Publishing
-Apps are published by writing them into a blocktree. The path of the directory used to publish an app is used to
-identify it. Only one app per directory is allowed. This directory contains the WebAssembly module itself as
-well as a JSON manifest. This manifest defines the app's user-friendly name as well as the list
-of permissions it requires. This list of permissions is used to determine which APIs the app has access to.
+This additional functionality is described later in this section.
+
+% Package Publishing
+These modules are distributed in packages stored in a blocktree. A package contains one or more
+Wasm modules and a TOML manifest file which describes the package.
+This manifest defines the package's user-friendly name as well as the list
+of permissions it requires.
+This list of permissions is used to determine which APIs the package has access to.
+These artifacts are then
+placed in a zip file and stored in a blocktree file. This ensures the integrity of the code in the
+package, as all blocks in a blocktree are integrity protected. 
+
+When a package is installed it's given a directory under which it can store data that is shared between all servers
+in a blocktree. The path of this block is formed by prefixing the path the package was published at
+with the string ``/apps''. When a package runs on a server, it is confined to a block contained
+in the server's directory. It is only allowed to read and write blocks in this block, but to allow it to
+access shared data, a symbolic link is created to the package's shared directory in ``/apps''.
 
 % Privacy Safe vs Unsafe
-Apps are broken into two categories: those that are privacy safe and those that are not.
-An app is privacy unsafe if it requests any permissions which allow it to send data
+Packages are broken into two categories: those that are privacy safe and those that are not.
+A package is privacy unsafe if it requests any permissions which allow it to send data
 outside of the blocktree it's part of. Thus requesting the ability to
-open a TCP socket would cause an app to be privacy unsafe. Similarly, the creation of a protocol
-handler for HTTP would also be privacy unsafe. Privacy unsafe apps can limit the scope of their
+open a TCP socket would cause a package to be privacy unsafe. Similarly, the creation of a protocol
+handler for HTTP would also be privacy unsafe. Privacy unsafe packages can limit the scope of their
 unsafety by
-imposing limits on the unsafe APIs that they request. For instance an app which needs to send
+imposing limits on the unsafe APIs that they request. For instance a package which needs to send
 blocktree messages back to the blocktree it was published in can request the messaging permission
-for a path in this tree. Similarly, an app which only wants to open a TCP socket listening on the
+for a path in that tree. Similarly, a package which only wants to open a TCP socket listening on the
 local network can limit the scope of its requested permission.
 
 % Protocol Contracts
-In order to make it easy to write apps which interface with existing systems, most notably
-those using HTTP, apps are able to define protocols using a contract language and then register
-call backs for these protocols. This works by providing a system call which the app can
-supply a protocol contract to. This contract is then compiled into a state machine by the node
-and a handle is returned to the app. This handle is then used to register callbacks for different
-parts of the protocol. This ensures that the protocol is handled by the node itself and that
-protocols can be shared between many different apps as a library. For instance, the HTTP protocol
+In order to make it easy to write packages which interface with existing systems, most notably
+those using HTTP, packages are able to define protocols using a contract language and then register
+callbacks for these protocols. This works by providing a system call which the package can
+supply a protocol contract to. This contract is then compiled into a state machine by the server
+and a handle is returned to the package. This handle is then used to register callbacks for different
+parts of the protocol. This ensures that the protocol is handled by the server itself and that
+protocols can be shared between many different packages as a library. For instance, the HTTP protocol
 would be compiled to a particularly simple state machine, with only one state: the listening state.
 This state would expose a hook where a callback can be registered to handle a request.
 This state also defines the record format used to pass the request information to the callback
@@ -377,30 +406,30 @@ and the record format of the return value that is expected in order to produce t
 More complicated (stateful) protocols would have more states, each defining their own request and
 response records, as well as hooks. One nice thing about this setup is that it will enable
 optimizations where the state machine and the user callbacks can be compiled into a program
-which can be safely run in the node itself, or even in a SmartNIC. This would require that
+which can be safely run in the server itself, or even in a SmartNIC. This would require that
 the callbacks only use an approved set of APIs, but could enable much higher performance.
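
As a rough illustration of the handle-and-callback arrangement, the sketch below models a compiled contract as a table of transitions. The `ProtocolHandle` class, its methods, and the dictionary-based records are assumptions made for the example; the paper does not define the contract language or the real interface.

```python
from typing import Callable, Dict, Tuple

Key = Tuple[str, str]                    # (state, hook)


class ProtocolHandle:
    """Hypothetical handle returned once a protocol contract has been compiled."""

    def __init__(self, transitions: Dict[Key, str], start: str):
        self.transitions = transitions   # (state, hook) -> next state
        self.state = start
        self.callbacks: Dict[Key, Callable[[dict], dict]] = {}

    def register(self, state: str, hook: str,
                 callback: Callable[[dict], dict]) -> None:
        """Attach a callback to a hook that the contract actually defines."""
        if (state, hook) not in self.transitions:
            raise KeyError(f"no hook {hook!r} in state {state!r}")
        self.callbacks[(state, hook)] = callback

    def dispatch(self, hook: str, request: dict) -> dict:
        """Run the callback for the current state, then follow the transition."""
        key = (self.state, hook)
        response = self.callbacks[key](request)
        self.state = self.transitions[key]
        return response


# An HTTP-like contract has a single state that loops back to itself.
http = ProtocolHandle({("listening", "request"): "listening"}, start="listening")
http.register("listening", "request",
              lambda req: {"status": 200, "body": "hello " + req["path"]})
```

A stateful protocol would simply list more states and hooks in the transition table, each with its own request and response records.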
 
 % Supervision Trees
-Apps can arrange themselves into supervision trees, in the same way that Erlang
-processes are arranged \cite{armstrong}. In this scheme, when a child app crashes, or the node its
-running on dies (which is detected by other nodes), then the app receives a message.
+Processes can arrange themselves into supervision trees, in the same way that Erlang
+processes are arranged \cite{armstrong}. In this scheme, when a child process crashes, or the server it's
+running on dies (which is detected by other servers), then its supervising process receives a message.
 In the
-simplest case this can be used to implement a logging system, where crashes and node
+simplest case this can be used to implement a logging system, where crashes and server
 deaths are recorded. More interestingly, this can be used to integrate with a control
 plane. For instance, if a blocktree were running in AWS, when a message is received indicating
-that a node has died, a new EC2 instance could be started to replace it. The reliability of Erlang
+that a server has died, a new EC2 instance could be started to replace it. The reliability of Erlang
 and other systems employing the Actor Model have shown the robustness of this approach.
-Apps can form this relationship after they've started running, provided that both of the apps
-have permission to send messages to each other. Apps, with the appropriate permissions, can also
-spawn other apps on descendent nodes. This can be used to implement map-reduce workloads, where
-an app spawns mapping jobs on descendent nodes containing the data of interest, and it processes
+Processes can form this relationship after they've started running, provided that both of the processes 
+have permission to send messages to each other. Processes, with the appropriate permissions, can also
+spawn other processes on descendent servers. This can be used to implement map-reduce workloads, where
+a process spawns mapping jobs on descendent servers containing the data of interest, and it processes
 their messages to compute the final reduction. Due to the tiny size of most programs, this is a much
-more efficient approach than moving the data to the nodes performing the computation.
+more efficient approach than moving the data.
 
-\section{A Brave New Web}
+\section{Potential Applications}
 In order to explore how blocktree can be used, the design of several hypothetical systems
 is discussed. It's important to note that blocktree does not try to force all computation
-to be local to a user's device, it merely tries to enable this for applications where it
+to be local to a user's device, but it tries to enable this for applications where it
 is possible.
 
 \subsection{Contacts and Mail}
@@ -415,7 +444,7 @@ it would be easy to fool the user into associating an attacker's key for the per
 
 The user now has a way of associating a blocktree with the name of this person. However, the
 root public key of this blocktree is not enough to establish secure communications, because
-the root private key is not available to every node in the person's blocktree. In particular
+the root private key is not available to every server in the person's blocktree. In particular
 it would be inadvisable for the root private key to be stored on a user's mobile device. To
 address this, mailbox directories are created.
 
@@ -423,11 +452,11 @@ For each contact two directories are created: the inbox and the outbox. The user
 for another user's root key and adds it to their outbox. The inbox for the other user is a symbolic
 link to the user's outbox in the blocktree of the other user. Thus each user
 can write messages into their own blocktree at a location where the other party knows how
+to find them. But, in order for a server to read these messages it requires its own readcap.
+to find them. But, in order for a server to read these messages it requires a its own readcap.
 Only the root
-nodes can issue this readcap as only they have access to the root key. Once permission has been
-granted to a node, a root node can use the root key to decrypt the readcap issued to it, and then
-encrypt it using the public key of the node. The resulting readcap is then stored in the metadata
+servers can issue this readcap as only they have access to the root key. Once permission has been
+granted to a server, a root server can use the root key to decrypt the readcap issued to it, and then
+encrypt it using the public key of the server. The resulting readcap is then stored in the metadata
 of the inbox.
 
 In addition to being able to check the inbox for mail, a blocktree message is sent to the receiving
@@ -453,7 +482,7 @@ is a symbolic link to a single status update block. This symbolic link contains
 readcap for the status update block.
 
 Comments and likes on status updates are implemented by sending mail to the user who posted the
-update. When one of the user's nodes receives this mail, it then updates the block containing
+update. When one of the user's servers receives this mail, it then updates the block containing
 the status update with the new comment, or increments the like counter.
 It then sends mail to the people this status update was shared with, notifying them
 that the data has changed.
@@ -467,14 +496,14 @@ inventory even when their internet service goes down, and that these updates nee
 on the website once connectivity is restored.
 
 To accomplish this a designer could create a directory for each warehouse. This directory would have
-nodes attached to it that are physically located at each warehouse. The inventory of the warehouse
+servers attached to it that are physically located at each warehouse. The inventory of the warehouse
 is then maintained in the warehouse's directory, satisfying the first requirement. Now, in order to
 enable efficient queries of the overall available inventory, the data from each of the warehouses
 needs to be merged. This is accomplished by creating another directory containing the merged data.
 In event sourcing terms this is called a read-model. This directory will have another cluster of
-nodes attached which will act as web servers. These nodes will subscribe to events published by the
+servers attached which will act as web servers. These servers will subscribe to events published by the
 warehouse clusters, events indicating changing levels of inventory, and digest this information into
-a format that can be efficiently queried. These nodes will also publish events to the warehouses
+a format that can be efficiently queried. These servers will also publish events to the warehouses
 when orders are placed, cancelled or amended.
 
 When a warehouse goes offline, its previous inventory counts are
@@ -494,7 +523,7 @@ I hope this example shows that having a standard format for data and the federat
 can provide designers with much greater flexibility, even if they do not care about decentralization
 or their users' privacy.
 
-\subsection{The Open Metaverse}
+\subsection{A Metaverse}
 As a final example I'd like to consider a platform for recording spatial information. The key insight
 that enables this is very general: blocktree enables the creation of distributed tree-like data structures.
 For instance, it's straightforward to imagine creating a distributed hashtable implemented as a red-black tree.
@@ -519,8 +548,8 @@ pointing to a block in the owner's blocktree. They can then write whatever data
 the contents of their parcel. Collaboration on a single parcel is accomplished by issuing a read and writecap to
 another user.
 
-It's easy to imagine that one world would be more important than the rest and that he creation of a metaverse 
-representation of Earth will be an important undertaking. The hierarchical nature of permissions in blocktree
+It's easy to imagine that one world would be more important than the rest and that the creation of a metaverse 
+representation of the Earth will be an important undertaking. The hierarchical nature of permissions in blocktree
 makes such a shared world possible. National blocktrees could be given ownership of their virtual territory,
 this would then
 be delegated down to the state and municipal levels. Finally municipalities would delegate individual parcels
@@ -533,7 +562,7 @@ In this paper I have given the outline of a decentralized method of organizing i
 and trust in a way that I hope will be useful and easy to use. The use of cryptographic primitives for
 implementing access control was discussed, as well as methods of protecting private keys. A blockchain
 and corresponding cryptocurrency were proposed as a means of incentivizing the distribution of data.
-Erasure coding was used to ensure that distributed data is resilient to the loss of nodes and can be
+Erasure coding was used to ensure that distributed data is resilient to the loss of servers and can be
 reconstructed efficiently. A programming environment based on WASM and WASI was proposed as a way
 of providing an interface to this data. APIs for defining protocol contracts and efficient web servers
 were indicated. APIs for constructing supervision trees were mentioned as a means for building reliable