3 years ago · 204c2e991c
--- a/doc/paper/BlockTree.tex
+++ b/doc/paper/BlockTree.tex
@@ -1,11 +1,12 @@
 
				 \documentclass{book}
			
 
				 \usepackage{amsfonts,amssymb,amsmath,amsthm}
			
 
				-\usepackage[scale=0.80]{geometry}
			
 
				+\usepackage[scale=0.75]{geometry}
			
 
				 \usepackage{hyperref}
			
 
				 
			
 
				 \begin{document}
			
 
				 \tableofcontents
			
 
				 \chapter{System Overview}
			
 
				+% I should replace "consumer" with "user".
			
 
				 The development of the internet was undoubtedly one of the greatest achievements of the 20th 
			
 
				 century, and the internet's killer app, the web, has reshaped our lives and the way we do
			
 
				 business. But, for all the benefits we have received from these technologies there have
			
@@ -35,8 +36,8 @@ confidentiality of consumer data, and the ability to authenticate consumers with
 
				 need for insecure techniques, such as passwords.
			
 
				 
			
 
				 This document proposes a potential solution. It describes a system for organizing information into
			
 
				-trees of blocks, distributing those blocks over a network of nodes, and a programming interface
			
 
				-to access this information in a convenient way. Because no one piece of hardware
			
 
				+trees of blocks, the distribution of those blocks over a network of nodes, and a programming interface
			
 
				+to access this information. Because no one piece of hardware
			
 
				 is infallible, the system also includes mechanisms for nodes to contract with one another to store
			
 
				 data. This allows data to be backed up and later restored in the case a node is lost. In order to
			
 
				 ensure the free exchange of data amongst nodes, a digital currency is used to account for the
			
@@ -46,126 +47,169 @@ The remainder of this chapter will give an overview of the system, with the rema
 
				 document going into specific details of each of the system's components.
			
 
				 
			
 
				 \section{Blocks}
			
 
				-The basis of all trust in the system is a public-private keypair. The secrecy of the private key
			
 
				-is the linchpin of all security in the system. If this key is compromised then all confidentiality
			
 
				-and authenticity assurances are void. Further, if this key is lost, then control of the
			
 
				-resources over which it has agency is also lost.
			
 
				-
			
 
				-% Should I remove this paragraph? Seems like this is an implementation details that should
			
 
				-% be saved for later.
			
 
				-Because of this, it's very important to protect this key by storing it in a secure location.
			
 
				-It is envisioned that a TPM will be used for this purpose. A TPM will be important for other
			
 
				-system features, as we'll see later.
			
 
				-
			
 
				-All data stored in the system is put into data structures called blocks. Each block has a path
			
 
				-which describes it's location in the tree. The root of this path is always a hash (hex encoded)
			
 
				-of the public key corresponding to the private key which is the root of trust for the tree. Each block is encrypted by a symmetric cipher using a randomly generated key, which is referred
			
 
				-to as the block's key. The block key for the root block is
			
 
				-encrypted using the root public key and the resulting cipher text is stored in the block itself,
			
 
				-along with a hash of the public key for later identification. For all non-root blocks in the tree,
			
 
				-their block key is encrypted using the block key of
			
 
				-their parent, and the resulting ciphertext is stored in the block.
			
 
				-This ensures that we can allow a party access to all
			
 
				-blocks in a subtree by simply encrypting the block key at the root of that subtree using the
			
 
				-party's public key. This mechanism is used to give node's that are controlled by the
			
 
				-root selective access to data stored in the tree, by encrypting the block key of the root of the
			
 
				-subtree using the node's public key.
			
 
				-
			
 
				-Integrity assurance of the block's content is achieved by a digital signature which covers all of
			
 
				-the block's contents (except the signature itself of course). A certificate chain
			
 
				-starting with the root key and ending with the key used by make this signature is included with 
			
 
				-the block. This ensures that any node which has been issued a certificate using the root key is
			
 
				-able to write data into the tree. In particular the path of the block is covered, and since the
			
 
				-path includes a hash of the root key, this means that anyone is able to independently verify
			
 
				-the authenticity of the block by checking that the certificate chain was indeed signed by
			
 
				-the root, that all other signatures in the chain are valid, and finally that the signature on
			
 
				-the block itself is valid.
			
 
				-
			
 
				-The size of each block has yet to be determined. It's envisioned that they will be fairly large,
			
 
				-on the order of 4 MB, so as to amortize the overhead of storing a certificate chain, and 
			
 
				-encapsulated keys as well as the cost of cryptographic operations.
			
 
				+User data stored in this system is organized into structures called \emph{block trees}. Every block tree
			
 
				+is identified with a public key. The private key that corresponds to a block tree's public key is
			
 
				+required to control that tree. Any person who has the private key for a block tree is called that
			
 
				+tree's owner.
			
 
				+
			
 
				+Computers participating in this system are called \emph{nodes}. Nodes are also identified by public keys, but
			
 
				+these keys are not directly tied to block trees. Nodes that have access to a block tree's data are said
			
 
				+to be \emph{attached} to that block tree. Nodes can be attached to multiple block trees at once, or none
			
 
				+at all.
			
 
				+
			
 
				+Block trees are of course trees of \emph{blocks}. Every block is identified by a string called a \emph{path},
			
 
				+which describes its location in the tree. The root of this path is a hash (hex encoded)
			
 
				+of the tree's public key, allowing blocks from any tree to be referred to. A block consists of three segments:
			
 
				+a header, a payload and a signature.
			
 
				+
			
 
				+The payload is encrypted by a symmetric cipher using a randomly generated key. This randomly
			
 
				+generated key is called the \emph{block's key}. To allow access to the payload, the block's key is encapsulated
			
 
				+using other keys and the resulting ciphertexts are stored in the block's header. These encapsulated keys
			
 
				+are referred to as read capabilities, or \emph{read caps} for short.
			
 
				+The root block of every block tree contains a read cap for the block tree's public key.
			
 
				+Every non-root block contains a read cap
			
 
				+for the block's parent, which is to say the block's key is encapsulated using its parent's block key.
			
 
				+So when one has a read cap for a block, they can read the data in all blocks descended from that
			
 
				+block. Because the owner of a block tree has a read cap for the root block, they can read all data
			
 
				+stored in the tree. Other people (or nodes) can be given access to a subtree by granting them a read
			
 
				+cap for the subtree's root. A block which contains public data is stored as cleartext with no read caps.
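The chain of read caps described above can be illustrated with a small sketch. This is a toy model, not the system's actual cryptography: the XOR-with-SHA-256 wrapping, the 32-byte key size, the `/`-separated paths, and the names `wrap_key`, `make_read_caps`, and `descend` are all assumptions made for illustration. In particular, `root_secret` is a symmetric stand-in for the tree's public key, which in the paper encapsulates the root block's key asymmetrically.

```python
import hashlib
import os

KEY_LEN = 32  # bytes; an assumed key size, not specified in the paper


def _pad(wrapping_key: bytes, path: str) -> bytes:
    """Derive a pad from the wrapping key and block path (toy construction only)."""
    return hashlib.sha256(wrapping_key + path.encode()).digest()


def wrap_key(wrapping_key: bytes, path: str, block_key: bytes) -> bytes:
    """Toy key encapsulation: XOR the block key with a derived pad."""
    return bytes(a ^ b for a, b in zip(block_key, _pad(wrapping_key, path)))


unwrap_key = wrap_key  # XOR is its own inverse


def make_read_caps(paths, root_secret: bytes):
    """Give every block a random key and wrap it under its parent's block key.

    `paths` must list parents before children; the first entry is the root block.
    Returns (keys, caps), where caps[path] is the read cap stored in that block's header.
    """
    keys, caps = {}, {}
    for path in paths:
        keys[path] = os.urandom(KEY_LEN)
        parent = path.rsplit("/", 1)[0] if "/" in path else None
        wrapping = keys[parent] if parent is not None else root_secret
        caps[path] = wrap_key(wrapping, path, keys[path])
    return keys, caps


def descend(start_key: bytes, start_path: str, target_path: str, caps) -> bytes:
    """Recover the block key of a descendant, given the key of one of its ancestors."""
    key, path = start_key, start_path
    suffix = target_path[len(start_path):].strip("/")
    for component in (suffix.split("/") if suffix else []):
        path = f"{path}/{component}"
        key = unwrap_key(key, path, caps[path])
    return key
```

Holding the key for `root/docs` is enough to call `descend` down to `root/docs/a`, but it gives no way to recover the key for `root/photos`, which matches the subtree-scoped access the text describes.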
			
 
				+
			
 
				+While read caps provide for confidentiality, write caps provide for integrity. A \emph{write cap}
			
 
				+for a block is a certificate chain which terminates at a certificate signed by the block tree's
			
 
				+owner. Thus a self-signed certificate made using the tree's private key is a valid write cap
			
 
				+for any block in the tree. By allowing a chain of certificates to be used, it's possible for
			
 
				+the owner to give other people or nodes the ability to write data into their tree. The scope of
			
 
				+this access is controlled by specifying the path under which writing is allowed in the certificate.
			
 
				+A write cap for a block is only valid if the path of the block is contained in the path
			
 
				+specified in every certificate in the chain.
			
 
				+
			
 
				+Both the header and the payload of a block are protected using a private key signature. The writer
			
 
				+of the block computes this signature using the private key which corresponds to the write cap 
			
 
				+for the block they're trying to write. In order to validate a block, this signature is validated, then
			
 
				+the write cap is validated, and finally the hash of the public key of
			
 
				+the last signer in the write cap chain is compared to the root of the block's path. If these match,
			
 
				+then the block is valid, as this means that an owner has given permission for the writer to write
			
 
				+into their tree at this path.
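The path-related part of this validation can be sketched as follows. The sketch deliberately omits signature verification and covers only the two path checks: that every certificate's path contains the block's path, and that the hash of the last signer's public key matches the root of the block's path. The `Cert` type, the use of SHA-256, the `/`-separated path format, and the chain ordering (writer's certificate first, owner-signed certificate last) are assumptions for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class Cert:
    signer_public_key: bytes  # public key of whoever signed this certificate
    path: str                 # path under which this certificate allows writing


def path_contains(scope: str, block_path: str) -> bool:
    """True if block_path lies at or under scope, compared component by component."""
    scope_parts = scope.strip("/").split("/")
    block_parts = block_path.strip("/").split("/")
    return block_parts[:len(scope_parts)] == scope_parts


def root_of(path: str) -> str:
    """The first path component, i.e. the hex-encoded hash of the tree's public key."""
    return path.strip("/").split("/")[0]


def write_cap_allows(chain: List[Cert], block_path: str) -> bool:
    """Check the path conditions of a write cap; signature checks are not modeled."""
    if not chain:
        return False
    # The block's path must be contained in the path of every certificate in the chain.
    if not all(path_contains(cert.path, block_path) for cert in chain):
        return False
    # The last signer must be the tree's owner: its key hash is the root of the path.
    owner_key = chain[-1].signer_public_key
    return hashlib.sha256(owner_key).hexdigest() == root_of(block_path)
```

Comparing whole path components rather than string prefixes means a certificate scoped to `docs` does not accidentally cover a sibling named `docsx`.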
			
 
				+
			
 
				+Accessing the data in a block requires several cryptographic operations, both for validation and
			
 
				+for decryption. Because of this it's important that blocks are relatively large, on the order of
			
 
				+4 MB, to amortize the cost of these operations.
			
 
				 
			
 
				 \section{Fragments}
			
 
				 By itself this block structure would be useful for building a secure filesystem, but in order to
			
 
				-be a durable storage system it needs an efficient way of backing up, or rather distributing data.
			
 
				-This is the purpose of fragments.
			
 
				-
			
 
				-Blocks are distributed amongst nodes in the network by using a fountain code. The output symbols
			
 
				-of this code are referred to as fragments. A code with a high performance implementation and good
			
 
				-coding efficiency is an important design consideration for the system. For these reasons it's
			
 
				-envisioned that the RaptorQ code will be used.
			
 
				-
			
 
				-After a block is created the creating node will need to distribute the data in the block to other
			
 
				-nodes to ensure its persistence in case the node fails. It will create fragments as needed
			
 
				-and advertise to other node's its desire to store them. Currency controlled by the root key is 
			
 
				-exchanged with these other nodes in exchange for contracts to store the fragments.
			
 
				-
			
 
				-When a node needs to rebuild data that was previously distributed in fragments, it connects to a
			
 
				-subset of nodes containing fragments and, in parallel, downloads enough fragments to reconstruct 
			
 
				-the block. This same mechanism can be used to distribute block data to unaffiliated nodes in the
			
 
				-network. It is a convenient load balancing and performance improvement, as the parallel downloads
			
 
				-spread the load over multiple nodes and are not limited by the bandwidth between any pair.
			
 
				-
			
 
				-We keep track of which nodes hold fragments of a block by storing the IDs of these nodes in the
			
 
				-block's parent. This list of node IDs can then be resolved to a list of IP addresses by looking
			
 
				-up data in a shared data structure called the Public Blocktree.
			
 
				+be a durable storage system we need an efficient way of distributing data for redundancy and
			
 
				+availability. This is the purpose of fragments.
			
 
				+
			
 
				+Blocks are distributed amongst nodes in the network using a fountain code. The output symbols
			
 
				+of this code are referred to as \emph{fragments}. A code with a high performance implementation and good
			
 
				+coding efficiency is an important design consideration for the system. For these reasons the
			
 
				+RaptorQ code was chosen.
			
 
				+
			
 
				+In order to preserve the data in a newly created block, a node will need to distribute
			
 
				+fragments to other nodes. It does this by advertising its desire to trade [currency]
			
 
				+in its block tree for the storage of these fragments. \emph{[currency]} is a fungible
			
 
				+token for the exchange of computing resources between nodes. Every block tree has
			
 
				+some non-negative value for the amount of [currency] it controls. Nodes that are attached
			
 
				+to a tree spend the tree's [currency] when paying other nodes for the storage of fragments.
			
 
				+
			
 
				+If another node is interested in making the exchange, it contacts the advertising node
			
 
				+and both sign a contract. A \emph{contract} is a data structure signed by both nodes which
			
 
				+states the hash of the fragment being stored and the amount of [currency] being exchanged
			
 
				+for its storage. The contract is then stored in the public block tree (to be discussed below),
			
 
				+so that [currency] can be transferred between nodes and to create an accountability mechanism
			
 
				+to prevent the storing node from acting in bad faith and deleting the fragment.
			
 
				+
			
 
				+When a node needs to retrieve a block that was previously distributed in fragments, it connects to a
			
 
				+subset of nodes containing the fragments and downloads enough to reconstruct 
			
 
				+the block. These downloads can be performed concurrently for greater speed. This same mechanism
			
 
				+can be used to distribute public blocks to unaffiliated nodes. This mechanism facilitates load balancing
			
 
				+and performance, as concurrent downloads
			
 
				+spread the load over multiple nodes and are not limited by the bandwidth between any pair of nodes.
			
 
				+
			
 
				+The list of nodes containing the fragments of a block is called the block's \emph{node list}.
			
 
				+A block's node list is stored in its parent. This allows for any non-root block to be retrieved.
			
 
				+To allow the root block to be retrieved, its node list is stored in the public block tree.
			
 
				 
			
 
				 \section{The Public Blocktree}
			
 
				-The Public blocktree is just another block tree, but one which is controlled by a distinguished
			
 
				-private key, whose public key is hard-coded into the other node's in the network. This blocktree
			
 
				+\emph{The Public Block Tree} is a block tree which is known to all nodes. This is accomplished by
			
 
				+providing all nodes with a hardcoded list of nodes that are attached to the public block tree.
			
 
				+This is similar to the list of root DNS servers distributed with any networked operating system.
			
 
				+Because the public block tree is only used for storing information that should be known to all
			
 
				+nodes in the network, the payload of every block in it is cleartext. The public block tree serves
			
 
				+only to facilitate the communication and exchange of data between nodes.
			
 
				+
			
 
				+One way that it does this is by containing a database of nodes and their IP addresses. A node
			
 
				+which has a write cap to this database will only store an entry for a node if that node can provide
			
 
				+a valid signed request. This signed request is stored in the database verbatim, so that other nodes can
			
 
				+independently verify its validity. Thus the nodes in the network can use this database to securely resolve
			
 
				+the IDs of other nodes to their IP addresses.
			
 
				+
			
 
				+The other function of the public block tree is to contain a list of transactions and disputes.
			
 
				+This list is referred to as the \emph{public log}.
			
 
				+When a node is created, an event is logged detailing the amount of [currency] the node is worth.
			
 
				+When a node is first attached to a block tree, this [currency] is then removed from the node
			
 
				+and added to the block tree. When a node signs a contract with another node, it is stored in the
			
 
				+log and [currency] is removed from the sending node's block tree and added to the receiving node's.
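The three kinds of log events described in this paragraph can be modeled as a simple ledger. This is a minimal sketch under stated assumptions: the `PublicLog` class, its method names, the integer amounts, and the rule that a contract is rejected when the paying tree lacks funds are illustrative choices. The paper states only that a tree's balance is non-negative.

```python
from collections import defaultdict


class PublicLog:
    """Toy model of the public log's [currency] accounting."""

    def __init__(self):
        self.node_balance = defaultdict(int)  # [currency] held by unattached nodes
        self.tree_balance = defaultdict(int)  # [currency] controlled by block trees
        self.events = []

    def node_created(self, node_id: str, worth: int) -> None:
        """Log the creation of a node and the [currency] it is worth."""
        self.events.append(("created", node_id, worth))
        self.node_balance[node_id] += worth

    def node_attached(self, node_id: str, tree_id: str) -> None:
        """Move a node's [currency] to the tree it attaches to (only the first time)."""
        amount = self.node_balance.pop(node_id, 0)
        self.events.append(("attached", node_id, tree_id, amount))
        self.tree_balance[tree_id] += amount

    def contract_signed(self, payer_tree: str, payee_tree: str,
                        fragment_hash: str, amount: int) -> None:
        """Record a storage contract and transfer [currency] between the two trees."""
        if amount < 0 or self.tree_balance[payer_tree] < amount:
            raise ValueError("payer tree cannot cover this contract")
        self.events.append(("contract", payer_tree, payee_tree, fragment_hash, amount))
        self.tree_balance[payer_tree] -= amount
        self.tree_balance[payee_tree] += amount
```

Because every balance change is also appended to `events`, any node reading the log could replay it from the start and arrive at the same balances.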
			
 
				+
			
 
				+In order to discourage nodes from receiving payment for the storage of a fragment, then deleting
			
 
				+the fragment to reclaim disk space, a reporting mechanism exists. If a node is unable to retrieve
			
 
				+a fragment that it previously stored with another node, then it sends an event to the log
			
 
				+indicating this. The other node can then respond by sending an event which contains the actual
			
 
				+fragment which was requested. This allows all the nodes in the network to view the log and
			
 
				+see if a node that they are considering signing a contract with is trustworthy. If they
			
 
				+are not the defendant in any disputes, then they should be safe. If they are in one, but responded
			
 
				+quickly with the fragment, then it could have been a transient network issue. If they never
			
 
				+responded, then they are risky and should perhaps receive a lower payment for the storage
			
 
				+of the fragment.
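The trust heuristic above could be expressed as a small function over the disputes in the public log. This is only a sketch: the `Dispute` record, the one-hour `quick_seconds` threshold, and the extra "slow responder" category for nodes that answered late are assumptions that the paper does not specify.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Dispute:
    defendant: str                       # node ID accused of losing a fragment
    opened_at: float                     # time the dispute was logged, in seconds
    answered_at: Optional[float] = None  # time the fragment was posted, if ever


def assess(node_id: str, disputes: Iterable[Dispute],
           quick_seconds: float = 3600.0) -> str:
    """Classify a prospective storage partner from its dispute history."""
    mine = [d for d in disputes if d.defendant == node_id]
    if not mine:
        return "safe"                # never a defendant
    if any(d.answered_at is None for d in mine):
        return "risky"               # at least one dispute was never answered
    if all(d.answered_at - d.opened_at <= quick_seconds for d in mine):
        return "possibly transient"  # always produced the fragment quickly
    return "slow responder"          # assumed category: answered, but late
```

A node deciding on a contract might offer a lower payment to a counterparty classified as "risky", as the text suggests.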
			
 
				+
			
 
				+Finally, the public block tree stores node lists for the root blocks of every block tree.
			
 
				+This ensures that even if every node that participates in a block tree fails, the block
			
 
				+tree can still be recovered from its fragments, provided its private key is known.
			
 
				 
			
 
				 \section{Nodes and the Network}
			
 
				-Each node in the network is identified by a public-private keypair and is issued a certificate
			
 
				-trusted by the public root key. Nodes can be claimed by issuing them a certificate and
			
 
				-then writing
			
 
				-that certificate into the public blocktree. When a new node is claimed, currency is deposited into
			
 
				-the account of the root key which claimed it. This currency is to account for the storage capacity
			
 
				-that the new node brings to the network. This mechanism is the reason why the node must have a
			
 
				-certificate trusted by the public root key, otherwise there would be no way to control the
			
 
				+Each node in the network has a public-private keypair. The string formed by hex encoding the
			
 
				+hash of a node's public key is referred to as the \emph{node ID} of the node. When nodes
			
 
				+are manufactured they are issued a certificate trusted by the
			
 
				+public block tree. New nodes are claimed by issuing them a certificate and then writing
			
 
				+that certificate into the public log. When a new node is claimed, currency is credited to
			
 
				+the block tree which claimed it. This currency is to account for the storage capacity
			
 
				+that the new node brings to that block tree. This mechanism is the reason why the node must have a
			
 
				+certificate trusted by the public block tree, otherwise there would be no way to control the
			
 
				 creation of currency.
			
 
				 
			
 
				-Nodes are identified by the hex encouding of the hash of their public key. This string is written
			
 
				-into directory blocks to keep track of which nodes contain fragments of blocks in that directory.
			
 
				-In order for this information to be useful, a mechanism is needed to resolve node IDs to IP
			
 
				-addresses. This is accomplished by writing a block into the public blocktree with the node's 
			
 
				-IP address which is signed by the node's private key. Anytime the node receives a new IP address,
			
 
				-it updates this block to inform the network of this change. Because the node's ID is derived from
			
 
				-its private key, and the block containing its IP address is signed with this key, it's possible
			
 
				-for a third party to independently verify that this IP address is authentic.
			
 
				-
			
 
				-When a node is given access to a blocktree, by issuing it a certificate, it is assigned a path
			
 
				-under which its data will be stored. This path is referred to as the node's home. This path is
			
 
				-written into the node's certificate. Thus a node's home is cryptographically verifiable and must
			
 
				-be chosen when the node joins the blocktree. The node which issues the new node it's certificate
			
 
				-grants the new node access to the block at its home path by encapsulating the block's key using
			
 
				-the node's public key. Thus the new node can recover the block key and read the contents of its
			
 
				-home block. If a node has its home at a block then its ID is written into the block.
			
 
				-
			
 
				-The data created by a node may optionally replicated to its parent node. This would be suitable
			
 
				+Nodes are identified by their node ID in the public block tree and in node lists. Nodes
			
 
				+are responsible for updating their IP address in the public block tree whenever it changes.
			
 
				+
			
 
				+When a node is attached to a block tree it is issued a certificate containing a path
			
 
				+under which its data will be stored. We say that the node is attached to the block tree at that path.
			
 
				+The node which issues the node its certificate creates a read cap for it and stores it in the
			
 
				+block where the node is attached.
			
 
				+
			
 
				+The data created by a node may optionally be replicated in its parent node. This would be suitable
			
 
				 for a lightweight or mobile device which needs to ensure its data is replicated immediately and
			
 
				-doesn't have time to negotiate contracts for the storage fragments. However, for larger 
			
 
				-blocktrees, having non-replicating nodes is essential for scalability.
			
 
				+doesn't have time to negotiate contracts for the storage of fragments. For larger 
			
 
				+block trees, having non-replicating nodes is essential for scalability.
			
 
				 
			
 
				-More than one node can be housed at the same path, such nodes are called a cluster.
			
 
				+More than one node can be attached at the same path, and when this happens a \emph{cluster} is formed.
			
 
				 Each node in the cluster stores
			
 
				 copies of the same data and they coordinate with each other to ensure the consistency of this
			
 
				-data. This is accomplished by electing a leader. All writes to blocks under the nodes home are
			
 
				+data. This is accomplished by electing a leader. All writes to blocks under the attachment point are
			
 
				 sent to the leader. The leader then serializes these writes and sends them to the rest of the
			
 
				 nodes. By default writes to blocks use optimistic concurrency, with the last write known to the
			
 
				-leader being the winner. But if a node requires exclusive access to the data in a block it can
			
 
				-request to the leader to lock it. Writes from nodes other than the locking node are rejected until
			
 
				+leader being the winner. But if a node requires exclusive access to a block it can
			
 
				+make a request to the leader to lock it. Writes from nodes other than the locking node are rejected until
			
 
				 the lock is released. The leader will release the lock on its own if no messages are received from
			
 
				 the locking node after a timeout.
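The leader's write and lock handling described above might look roughly like the following. This is a single-process sketch under assumptions: the `Leader` class, the 30-second default timeout, and the injectable `clock` are illustrative, and both leader election and the forwarding of serialized writes to the other cluster nodes are left out.

```python
import time


class Leader:
    """Toy cluster leader: last write wins unless a block is locked by another node."""

    def __init__(self, lock_timeout: float = 30.0, clock=time.monotonic):
        self.lock_timeout = lock_timeout
        self.clock = clock
        self.blocks = {}  # path -> latest data known to the leader
        self.locks = {}   # path -> (holder node ID, time the holder was last heard from)

    def _holder(self, path: str):
        """Return the current lock holder, releasing the lock if it has timed out."""
        entry = self.locks.get(path)
        if entry is None:
            return None
        holder, last_heard = entry
        if self.clock() - last_heard > self.lock_timeout:
            del self.locks[path]
            return None
        return holder

    def lock(self, node_id: str, path: str) -> bool:
        """Grant the lock unless another node currently holds it."""
        if self._holder(path) not in (None, node_id):
            return False
        self.locks[path] = (node_id, self.clock())
        return True

    def unlock(self, node_id: str, path: str) -> None:
        """Release the lock if the caller holds it."""
        if self._holder(path) == node_id:
            del self.locks[path]

    def write(self, node_id: str, path: str, data: bytes) -> bool:
        """Apply a write; reject it while a different node holds the lock."""
        holder = self._holder(path)
        if holder not in (None, node_id):
            return False
        if holder == node_id:
            # Any message from the holder counts as activity and refreshes the lock.
            self.locks[path] = (node_id, self.clock())
        self.blocks[path] = data
        return True
```

Passing the clock in as a parameter keeps the timeout behavior easy to exercise without waiting in real time.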
			
 
				 
			
 
				-If a path is configured to be replicated to its parent, then the leader at that path will maintain
			
 
				-a connection to the the leader in a parent cluster. Note that the parent cluster need not be
			
 
				+If the attachment point is configured to be replicated to its parent, then the leader will maintain
			
 
				+a connection to the leader in the parent cluster. Note that the parent cluster need not be
			
 
				 housed in the parent block, just at some ancestor block. Then, writes will be propagated through
			
 
				 this connection to the parent cluster, where this process may continue if that cluster is also
			
 
				 configured for replication. Distributed locking is similarly communicated to the parent cluster,
			
 
				 where the lock is only acquired with the parent's approval.
			
 
				 
			
 
				 \section{Programmatic Access to Data}
			
 
				-No designer can hope to envsion all the potential applications that a consumer would want to have
			
 
				+No designer can hope to envision all the potential applications that a person would want to have
			
 
				 access to their data. That's why an important component of the system is the ability to run
			
 
				 programs that can access data and provide services to other internet hosts, whether they are
			
 
				 blocktree nodes or not. This is accomplished by providing a WebAssembly based execution