
\documentclass{article}
\usepackage{amsfonts,amssymb,amsmath}
\usepackage[scale=0.75]{geometry}
\usepackage{hyperref}
\title{Blocktree \\
\large A platform for distributed computing.}
\author{Matthew Carr}
\date{May 28, 2022}
\begin{document}
\maketitle
\begin{abstract}
Blocktree is a platform which aims to make developing reliable distributed systems easy and safe.
It does this by defining a format for data stored in the system, a mechanism for replicating
that data, and a programming interface for accessing it. It aims to solve the difficult problems
of cryptographic key management, consensus, and global addressing so that user code can focus on
solving problems germane to the application.
\end{abstract}
\section{Introduction}
The online services that users currently have access to are incredible. They handle all
the details of backing up user data and implementing access controls to facilitate
safe sharing. However, because these are closed systems, users are forced to trust that
the operators are benevolent, and they lack any real way of ensuring that the access
controls they prescribe will actually be enforced. There have been several systems proposed
as alternatives to the conventional model, but these systems suffer from several shortcomings.
They either assume the need for cloud storage providers (Blockstack) or implement all operations
using a global blockchain, limiting performance (Filecoin). Blocktree takes a different approach.
The idea behind blocktree is to organize a user's computers into a cooperative unit, called a
blocktree. The user is said to own the blocktree, and they wield sovereign authority over it.
The artifact granting them this authority is the root private key for the blocktree. Measures for protecting
this key and delegating its authority are important design considerations of the system.
The owners of blocktrees are encouraged to collaborate with each other to replicate data by
means of a cryptocurrency known as blockcoin. The blockchain implementing this cryptocurrency
is the source of global state for the system, and allows for the creation of global paths.
All data stored in blocktree is contained in units called blocks. As the name would suggest, the blocks
in a blocktree form a tree. Each block has a path corresponding to its location in the tree. The first
component of a fully qualified blocktree path is the fingerprint of the root public key of the
blocktree. Thus a blocktree path can globally specify a block. If a block is not a leaf,
then it is called a directory, and the data it contains is managed by the system.
This information includes the list of blocks which are children of the directory as well as the
list of nodes which are attached to the blocktree at this directory. In addition
to its payload of data,
each block has a header containing cryptographic access control mechanisms. These mechanisms ensure
that only authorized users can read, and optionally write, the block.
Users and nodes in the blocktree system are identified by hashes of their public keys. These hashes
are referred to as principals, and they are used for setting access control policy.
This paper is intended to be a short introduction to the ideas of blocktree. A book is planned which
will specify the system in greater detail. In keeping with the agile software methodology, this
book is being written concurrently with an open source implementation of the system.
The remainder of this paper is organized as follows:
\begin{itemize}
\item A description of the operations of a single blocktree.
\item The definition of a blockchain which provides global state and links individual blocktrees
together.
\item The programming interface for interacting with blocktrees and sending messages.
\item An exploration of applications that could be written using this platform.
\item Conclusion.
\end{itemize}
\section{An Individual Blocktree}
The atomic unit of data storage, confidentiality, and authenticity is the block. A
block contains a payload of data. Confidentiality of this data is achieved by encrypting it with
a symmetric cipher using a random key. This random key is known as the block key.
The block key can be encapsulated using the public key of a principal who is to be given access.
The resulting ciphertext is stored in the header of the block. Thus
the person possessing the corresponding private key will be able to access the contents of
the block. Blocks are arranged into trees, and the parent block also has a block key.
The child's block key is always encapsulated using the parent's key and stored in the block
header. This ensures that when a principal is given read access to a block, it automatically has
access to every child of that block. The encapsulated block key is known as a read capability,
or readcap, as it grants the holder the ability to read the block. In the case of the root block
there is no parent block key to issue a readcap to. Instead, the root block always contains a
readcap for the root key.
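
To make the layering concrete, the following sketch shows one way a block key could be generated, used to encrypt a payload, and encapsulated as a readcap under a principal's public key. It uses the Python \texttt{cryptography} package, with Fernet and RSA-OAEP purely as stand-ins; the actual ciphers and encodings are not specified in this paper.
\begin{verbatim}
# Hypothetical sketch: block payload encryption and readcap encapsulation.
# Fernet and RSA-OAEP are stand-ins for whatever symmetric cipher and
# key-encapsulation scheme an implementation actually chooses.
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# The principal's key pair (in practice the private key never leaves its owner).
principal_priv = rsa.generate_private_key(public_exponent=65537, key_size=2048)
principal_pub = principal_priv.public_key()

# A fresh random block key encrypts the block's payload.
block_key = Fernet.generate_key()
payload_ciphertext = Fernet(block_key).encrypt(b"block payload bytes")

# The readcap is the block key encapsulated under the principal's public key;
# it is stored in the block header.
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
readcap = principal_pub.encrypt(block_key, oaep)

# Only the holder of the private key can open the readcap and read the block.
recovered_key = principal_priv.decrypt(readcap, oaep)
assert Fernet(recovered_key).decrypt(payload_ciphertext) == b"block payload bytes"
\end{verbatim}
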
A note on visualizing a blocktree. In this paper I will describe the root block as being at the
bottom of the tree, and describe the descendants of a block as being above it. The reader may
recognize this as the ``trees grow up'' model of formal logic, and it was chosen because such trees
appear visually similar to real world trees when rendered in a virtual environment.
Authenticity guarantees are provided using a digital signature scheme. In order to change the
contents of a block a data structure called a write capability, or writecap, is needed. A
writecap is roughly analogous to an X.509 certificate chain. A writecap contains the following data:
\begin{itemize}
\item The path the writecap can be used under.
\item The principal that the writecap was issued to.
\item The timestamp when the writecap expires.
\item The public key of the principal who issued the writecap.
\item A digital signature produced by the private key corresponding to the above public key.
\item Optionally, the next writecap in the chain.
\end{itemize}
The last item is only excluded in the case of a self-signed writecap, i.e. one that was signed by
the same principal it was issued to. A writecap is considered valid for use in a block if all
of the following conditions are met:
\begin{itemize}
\item The signature on every writecap in the chain is valid.
\item The signing principal matches the principal the next writecap was issued to, for every
writecap in the chain.
\item The path of the block is contained in the path of every writecap in the chain.
\item The current timestamp is strictly less than the expiration of all the writecaps in the
chain.
\item The principal corresponding to the public key used to sign the last writecap in the chain
is the owner of the blocktree.
\end{itemize}
The intuition behind these rules is that a writecap is only valid if there is a chain of trust
that leads back to the owner of the blocktree. The owner may delegate their trust to any number
of intermediaries by issuing them writecaps. These writecaps are scoped based on the path
specified when they are issued. These intermediaries can then delegate this trust as well.
A block is considered valid if it contains a valid writecap, it was signed using the key
corresponding to the first writecap's public key, and this signature is valid.
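
The validity rules above translate directly into code. The sketch below is a minimal illustration: the field names are assumptions, the path-containment check is simplified to a prefix test, and signature verification is a placeholder where a real implementation would verify actual digital signatures.
\begin{verbatim}
# Hypothetical sketch of the writecap validity rules. Field names and the
# signature check are illustrative only.
import hashlib
from dataclasses import dataclass
from typing import Optional

@dataclass
class Writecap:
    path: str                   # path the writecap is valid under
    issued_to: str              # principal the writecap was issued to
    expires_at: float           # expiration timestamp (seconds since epoch)
    issuer_pub: str             # public key of the issuing principal
    signature: bytes            # issuer's signature over the fields above
    next: Optional["Writecap"]  # next writecap in the chain, None if self-signed

def principal_of(pub_key: str) -> str:
    # A principal is a hash of a public key.
    return hashlib.sha256(pub_key.encode()).hexdigest()

def signature_valid(cap: Writecap) -> bool:
    # Placeholder for real signature verification against cap.issuer_pub.
    return True

def writecap_valid(cap: Writecap, block_path: str, owner: str, now: float) -> bool:
    while cap is not None:
        if not signature_valid(cap):
            return False
        if not block_path.startswith(cap.path):   # simplified path containment
            return False
        if now >= cap.expires_at:                 # strict expiry check
            return False
        if cap.next is None:                      # end of the chain:
            return principal_of(cap.issuer_pub) == owner  # signer must be the owner
        if principal_of(cap.issuer_pub) != cap.next.issued_to:
            return False                          # signer must hold the next writecap
        cap = cap.next
    return False
\end{verbatim}
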
Blocks are used for more than just organizing data; they also organize computation. A program
participating in a blocktree is referred to as a node. Multiple nodes may be run on
a single computer. Every node is attached to the blocktree at a specific path. This information
is recorded in the directory where the node is attached. A node is responsible for the storage of
the directory where it is attached and all of the blocks that are recursively contained within it.
If there is a child node attached to a subdirectory contained in the directory the node is
responsible for, then the child node is responsible for the subdirectory.
In this way data storage can be delegated, allowing the system to scale. When more than one
node is attached to the same directory they form a cluster.
Each node in the cluster contains a copy of the data that the cluster is responsible for. They
maintain consistency of this data by running the Raft consensus protocol.
When a new blocktree is created, a node generates a key pair to serve as the root keys.
It is imperative for the security of the system that the root private key is protected, and it
is highly recommended that it be stored in a Trusted Platform Module (TPM) and that the TPM
be configured to disallow unauthenticated use of this key. The node then generates its own key pair
and uses the root private key to issue itself a writecap for the root of the tree. Once it has
this writecap, it creates the root block and generates a block key for it. A readcap is added to
this block for the root public key and the node's public key. Additional
cryptographic operations are performed using the node's key pair; the root private key is only
used again when a new writecap needs to be created for an additional root node.
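
The bootstrap sequence can be summarized in a short sketch. Key generation is real (Ed25519 via the Python \texttt{cryptography} package), but \texttt{issue\_writecap} and \texttt{make\_readcap} are hypothetical placeholders standing in for the structures already described above.
\begin{verbatim}
# Hypothetical sketch of bootstrapping a new blocktree. issue_writecap and
# make_readcap are placeholders; only the ordering of steps is the point.
import os
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def issue_writecap(signing_key, issued_to_pub, path):
    # Placeholder: sign (path, issued_to_pub, expiry) with signing_key.
    return {"path": path, "issued_to": issued_to_pub}

def make_readcap(block_key, recipient_pub):
    # Placeholder: encapsulate block_key under recipient_pub.
    return {"for": recipient_pub, "key": block_key}

# 1. Generate the root key pair (ideally inside a TPM) and the node's key pair.
root_priv = Ed25519PrivateKey.generate()
node_priv = Ed25519PrivateKey.generate()
root_pub, node_pub = root_priv.public_key(), node_priv.public_key()

# 2. The root key issues the node a writecap for the root of the tree.
node_writecap = issue_writecap(root_priv, node_pub, path="/")

# 3. The node creates the root block with a fresh block key and readcaps
#    for both the root public key and its own public key.
block_key = os.urandom(32)
root_block = {
    "path": "/",
    "readcaps": [make_readcap(block_key, root_pub),
                 make_readcap(block_key, node_pub)],
    "writecap": node_writecap,
}
\end{verbatim}
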
When a new node comes online and wishes to join the blocktree, it generates its own key pair.
The public key of this
node then needs to be transmitted to another node that's already part of the user's blocktree. The
mechanism used will depend on the nature of the device on which the new node is running.
For example, a phone could scan a QR code which contains
the IP address of the user's root node, and then transmit its public key to that internet host.
In order for the new node to be added to the user's blocktree, it needs to be issued a writecap
and a readcap must be added to the directory where it will be attached.
This could be accomplished
by providing a user interface on the node which received the public key from the new node.
This interface would show the requests that have been received from nodes attempting
to join the blocktree. The user can then choose to approve or deny the request, and can specify
the path where the new node will attach. If the user chooses to approve the request, then the writecap
is signed using the node's key and transmitted to the new node.
The ability to cope with key compromise is an important design consideration in any real-world
cryptosystem. In blocktree the compromise of a node key is handled by re-keying every block under
the directory where the node was attached. Specifically, this means that a new block key is generated for
each block, and the readcap for the compromised node is removed. This ensures that new writes to
these blocks will not be visible to the holder of the compromised key. To ensure that writes
will not be allowed, the root directory contains a revocation list containing the public keys
which have been revoked by the tree. These only need to be maintained until their writecaps
expire, after which time the list can be cleaned. Note that if the root private key is compromised or lost,
then the blocktree must be abandoned; there is no recovery. This is real security: the artifact
which grants control over the blocktree is the root private key, which is why storing the root
private key in multiple secure cryptographic co-processors is so important.
Nodes in a blocktree communicate with one another by sending blocktree messages. These messages
are used for implementing consensus and distributed locking,
as well as notifying parent nodes when a
child has written new data. The latter mechanism allows the parent to replicate the data stored in
a child for redundancy. User code is also able to initiate the sending of messages. Messages are
addressed using blocktree paths. When a node receives a message that is not addressed to it,
but is addressed to its blocktree, it forwards it to the closest node to the recipient that it
is connected to. In order to enable efficient low-latency message transfers, nodes maintain open
connections to the other nodes in their cluster, and the cluster leader maintains a connection to
its parent. Diffie-Hellman key exchange is used to exchange a key for use in an AEAD cipher, and
once this cipher context is established, the two nodes mutually authenticate each other using their
respective key pairs. When a node comes online, it uses the global blocktree (described later)
to find the other nodes in its cluster. If it is not part of a cluster, or this information is
not stored in the global blocktree, then it instead looks up the
IP address of a root node and connects to it. The root node may then direct the node to connect to
one of the root's children, and this process repeats until the new node is connected to its parent.
A concept that has proven to be very useful in the world of filesystems is the symbolic link.
This is a short file that contains the path to another file, and is interpreted by most programs
as being a ``link'' to that file. Blocktree supports a similar system, where a block can be
marked as a symbolic link and a blocktree path placed in its body. This also provides us with
a convenient way of storing readcaps for data that a node would otherwise not have access to.
For instance, a symbolic link could be created which points to a block in another user's blocktree.
The other user only knows the public key of the owner of our blocktree, so they issue
a readcap to it. But the root nodes can open this readcap and extract
the block key. This key can then be encapsulated using the public key of the node which
requires access, and placed in the symbolic link. When the node needs to read the data
in the block, it opens the readcap in the symbolic link, follows the link to the block (how
that actually happens will be discussed below) and decrypts its contents.
While the consistency of individual blocks can be maintained using Raft, a distributed locking
mechanism is employed to enable transactions which span multiple blocks. This is
accomplished by exploiting the hierarchical arrangement of nodes in the tree. In order to describe
this, it's first helpful to define a new term. The \emph{nodetree} of a blocktree is the tree obtained
from the blocktree by collapsing all the blocks that a node (or cluster of nodes) is responsible
for into a single logical block representing the node itself. Thus we can talk about a node having
a parent, and by this we mean its parent in the nodetree. In terms of the blocktree, the parent
of a node is the first node encountered when the path back to the root is traversed.
Now, distributed locking works as follows:
\begin{itemize}
\item A node sends a request to lock a block to the current consensus leader in its cluster.
If the node is not part of a cluster, then it is the leader. This request contains a timestamp
for when the lock expires.
\item If the leader is responsible for the block then it moves on to the next step. Otherwise
it contacts its parent, forwards the lock request, and this step is repeated for the parent.
\item The responsible node checks to see if there is already a lock for this block. If there is,
then the request fails. Otherwise the request succeeds and a lock is placed on the block. A
message indicating the result is then passed back up the tree, ending at the original node. This
message includes the principal of the node enforcing the lock.
\item Once the locking node is done making its updates it sends a message directly to the node
enforcing the lock, causing it to be removed.
\end{itemize}
Locking a block locks the subtree rooted at that block. Thus no writes to any path contained in
the path of the locked block will be allowed, unless they come from the locking node. If the locking
node does not send the message
unlocking the block before the lock expires, then the modifications which had been performed by
it are dropped and the block reverts to its prior state. Since the locking node is the leader
of the consensus cluster that is responsible for the block, this guarantees that
writes from other nodes will not be accepted.
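
A minimal, single-process simulation of this lock escalation is sketched below. The \texttt{Node} class, its fields, and the prefix-based responsibility check are simplifying assumptions; real nodes would exchange these requests as blocktree messages rather than method calls.
\begin{verbatim}
# Hypothetical single-process sketch of lock escalation along the nodetree.
import time

class Node:
    def __init__(self, principal, responsible_path, parent=None):
        self.principal = principal
        self.responsible_path = responsible_path  # path this node (cluster) stores
        self.parent = parent
        self.locks = {}  # block path -> (holder principal, expiry timestamp)

    def request_lock(self, block_path, holder, expires_at):
        # Forward toward the root until a responsible node is reached.
        if not block_path.startswith(self.responsible_path):
            if self.parent is None:
                return None
            return self.parent.request_lock(block_path, holder, expires_at)
        # This node is responsible: grant the lock unless a live lock exists.
        existing = self.locks.get(block_path)
        if existing and existing[1] > time.time():
            return None  # request fails, block already locked
        self.locks[block_path] = (holder, expires_at)
        return self.principal  # principal of the node enforcing the lock

    def release_lock(self, block_path, holder):
        if self.locks.get(block_path, (None, 0))[0] == holder:
            del self.locks[block_path]

# Toy nodetree: a root node responsible for "/", a child responsible for "/photos".
root = Node("root-node", "/")
child = Node("photos-node", "/photos", parent=root)
enforcer = child.request_lock("/documents/taxes", "photos-node", time.time() + 30)
print(enforcer)  # "root-node": the request escalated to the responsible node
\end{verbatim}
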
\section{Connecting Blocktrees}
In order to allow nodes to access blocks in other blocktrees, a global ledger of events is used.
This ledger is implemented using a proof of work (PoW) blockchain and a corresponding
cryptocurrency known as blockcoin. Nodes mine chain blocks (not to be confused with the tree
blocks we've been discussing up till now) in the same way they do in other PoW blockchain
systems such as Bitcoin. The node which manages to mine the next chain block receives a reward,
which is the sum of the fees for each event in the chain block and a variable amount of newly minted
blockcoin. The amount of new blockcoin created by a chain block is directly proportional to the
number of data storage events contained in the chain block. Thus the total amount of blockcoin
in circulation has a direct relationship to the amount of data stored in the system, reflecting
the fact that blockcoin exists to provide an accounting mechanism for data.
When a node writes data to a tree block, and it wishes this block to be globally accessible or
replicated for redundancy,
it produces what are called fragments. Fragments are the output symbols from an erasure coding
algorithm (such as the RaptorQ code). These algorithms are a class of fountain codes which have
the property that only $m$ out of $n$ (where $m < n$) symbols are needed to reconstruct the
original data. Such a code ensures that even if some of the fragments are lost, as long as $m$
remain, the original data can be recovered.
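
The sketch below is a toy stand-in for an erasure code: $m$ data fragments plus a single XOR parity fragment, so any one lost fragment can be rebuilt from the rest. Fountain codes such as RaptorQ are far more general, but the recovery property is the same in spirit.
\begin{verbatim}
# Hypothetical toy erasure code: m data fragments plus one XOR parity fragment.
# Any single missing fragment can be reconstructed by XOR-ing the others.
def make_fragments(data: bytes, m: int):
    size = -(-len(data) // m)  # ceiling division
    parts = [data[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(m)]
    parity = bytearray(size)
    for part in parts:
        parity = bytearray(a ^ b for a, b in zip(parity, part))
    return parts + [bytes(parity)]

def recover_missing(fragments, missing_index):
    # XOR of all remaining fragments reproduces the missing one.
    size = len(next(f for f in fragments if f is not None))
    out = bytearray(size)
    for i, frag in enumerate(fragments):
        if i != missing_index:
            out = bytearray(a ^ b for a, b in zip(out, frag))
    return bytes(out)

frags = make_fragments(b"hello blocktree", m=4)
lost = frags[2]
frags[2] = None                       # lose one fragment
assert recover_missing(frags, 2) == lost
\end{verbatim}
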
Once these fragments have been computed, an event is created for each one and published to the
blockchain. This event indicates to other nodes that this node wishes to store a fragment and
states the amount of blockcoin it will pay for each of the fragment's maintenance payments. When
another node wishes to accept the offer, it directly contacts the first node, which then sends
it the fragment and publishes an event stating that the fragment is stored with the second
node. This event includes the path of the block the fragment was computed from, the fragment's
ID (the sequence number from the erasure code), and the principal of the node which stored it.
Thus any other node in the network can use the information contained in this event to
determine the set of nodes which contain the fragments of any given path.
In order for the node which stored a fragment to receive its next payment, it has to pass
a time-bound challenge-response protocol initiated by the node that owns the fragment.
The owning node selects a leaf in the Merkle tree of the fragment and sends the index of
this leaf to the storing node. The storing node then walks the path from this leaf back to
the root of the Merkle tree, updating a hash value using the data in each node it traverses.
It sends this result back to the owning node, which verifies that this value matches its
own computation. If it does, then the owning node signs a message indicating that the challenge
passed and that the storing node should be paid. The storing node receives this message and uses
it to construct an event, which it signs and publishes to the blockchain. This event causes
the specified amount of blockcoin to be transferred from the owning node's blocktree to the
storing node's blocktree.
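
The following sketch shows the flavor of such a Merkle-tree challenge using SHA-256. The chunk size, tree layout, and response format here are assumptions for illustration, not a specification of the protocol.
\begin{verbatim}
# Hypothetical sketch of the storage challenge: build a Merkle tree over the
# fragment's chunks, then recompute the root from a challenged leaf by folding
# in the sibling hash at each level.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_levels(chunks):
    level = [h(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # duplicate last node if the level is odd
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def prove(chunks, leaf_index):
    # The storing node walks from the challenged leaf back to the root.
    levels = merkle_levels(chunks)
    acc, idx = levels[0][leaf_index], leaf_index
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = level[idx ^ 1]
        acc = h(acc + sibling) if idx % 2 == 0 else h(sibling + acc)
        idx //= 2
    return acc

fragment = b"some fragment bytes held by the storing node"
chunks = [fragment[i:i + 8] for i in range(0, len(fragment), 8)]
root = merkle_levels(chunks)[-1][0]          # the owning node's own computation
assert prove(chunks, leaf_index=3) == root   # the challenge passes
\end{verbatim}
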
The fact that payments occur over time provides a simple incentive for nodes to be honest and
store the data they agree to. In banking terms, the storing node views the fragment as an
asset: it is a loan of its disk space which provides a series of payments over time.
On the other hand, the owning node views the fragment as a liability: it requires payments to
be made over time. In order for a blocktree owner to remain solvent, it must balance its
liabilities with its assets, incentivizing it to store data for others so that its own data
will be stored.
In order for nodes to be able to contact other nodes, a mechanism is required for associating
an internet protocol (IP) address with a principal. This is done by having nodes publish events
to the blockchain when their IP address changes. This event includes their new IP address,
their public key, and a digital signature computed using their private key. Other nodes can
then verify this signature to ensure that an attacker cannot bind the wrong
IP address to a principal in order to receive messages not meant for it.
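
A sketch of such an address event, using Ed25519 signatures from the Python \texttt{cryptography} package, appears below. The JSON-style encoding of the event is an assumption; only the sign-then-verify pattern is the point.
\begin{verbatim}
# Hypothetical sketch of a signed IP-address event.
import json
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

node_priv = Ed25519PrivateKey.generate()
node_pub_bytes = node_priv.public_key().public_bytes(
    encoding=serialization.Encoding.Raw, format=serialization.PublicFormat.Raw)

event_body = json.dumps({"ip": "203.0.113.7",
                         "pubkey": node_pub_bytes.hex()}).encode()
signature = node_priv.sign(event_body)

# Any other node can check that the address binding was made by the key holder;
# verify() raises InvalidSignature if the event was tampered with.
node_priv.public_key().verify(signature, event_body)
\end{verbatim}
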
While this event ledger is useful for appending new
events, and ensuring previous events
cannot be changed, another data structure is required to enable efficient queries.
In particular, it's important to be able to quickly perform the
following queries:
\begin{itemize}
\item Find the set of nodes storing the fragments for a given path.
\item Find the IP address of a node or owner given a principal.
\item Find the public key associated with a principal.
\end{itemize}
To enable these queries a special blocktree is maintained by each node in the network: the global
blocktree. This tree does not support the usual writing and locking semantics of local blocktrees.
In functional programming terms, it can be thought of as a left fold over all of the events in the
blockchain.
The above queries are facilitated by the following blocks:
\begin{itemize}
\item \emph{/global/fragments}: this block contains a hashtable where the key is a path and the
value is the list of nodes storing the fragments for the block at that path.
\item \emph{/global/principals}: contains a hashtable where the key is a principal and the value
is a tuple containing the public key of that principal, its current IP address, and its current
blockcoin balance.
\end{itemize}
To compute the entries in these tree blocks, the nodes in the network iterate over all the chain blocks, updating
their local copy of each tree block appropriately. The experienced reader will recognize that this is an event
sourced architecture. Currently only these two tree blocks are known to be needed, but if new events are
added to the system it's easy to see that this system can be used for creating other data structures enabling
queries that we have yet to envision.
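
As a small illustration of the left-fold view, the sketch below folds a list of events into the two tables described above. The event shapes are illustrative assumptions, not the actual ledger format.
\begin{verbatim}
# Hypothetical sketch of the global blocktree as a left fold over chain events.
from functools import reduce

def apply_event(state, event):
    fragments, principals = state
    if event["type"] == "fragment_stored":
        fragments.setdefault(event["path"], []).append(event["storing_node"])
    elif event["type"] == "address_changed":
        pub, _, balance = principals.get(event["principal"],
                                         (event["pubkey"], None, 0))
        principals[event["principal"]] = (pub, event["ip"], balance)
    elif event["type"] == "payment":
        for principal, delta in ((event["from"], -event["amount"]),
                                 (event["to"], event["amount"])):
            pub, ip, balance = principals.get(principal, (None, None, 0))
            principals[principal] = (pub, ip, balance + delta)
    return fragments, principals

events = [
    {"type": "address_changed", "principal": "abc1", "pubkey": "k1",
     "ip": "203.0.113.7"},
    {"type": "fragment_stored", "path": "/def2/photos/1", "storing_node": "abc1"},
    {"type": "payment", "from": "def2", "to": "abc1", "amount": 3},
]
fragments, principals = reduce(apply_event, events, ({}, {}))
print(fragments["/def2/photos/1"])   # ['abc1']
print(principals["abc1"])            # ('k1', '203.0.113.7', 3)
\end{verbatim}
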
\section{Programming Environment}
Enabling an excellent developer experience is one of the primary goals of this system (the others being security
and scalability). Nodes execute user code that has been compiled into WebAssembly modules. Such code
running on a blocktree node is referred to as an ``app''. An app
executes in a sandbox that isolates it from other code, as well as the security critical operations of the node
itself. The sandbox provides the code with an extension of the WebAssembly System Interface (WASI), with extra
system calls to interact with the particulars of the blocktree system.
The extra system calls fall into four categories:
\begin{itemize}
\item Distributed Locking
\item Messaging
\item Supervision Trees
\item Protocol Contracts (Session Types)
\end{itemize}
The standard WASI filesystem APIs are used
to interact with the contents of blocktrees. For instance, a file descriptor for a block can be obtained
by calling path\_open. Writes and reads of blocks are performed using the privileges of the node on which
the app is running.
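
From the app's point of view this means that block access looks like ordinary file I/O. The sketch below assumes a hypothetical block path and a language runtime inside the sandbox whose file API is mapped onto WASI.
\begin{verbatim}
# Hypothetical sketch: inside the sandbox, reading and writing blocks is just
# file I/O, which the WASI layer translates into path_open and fd_read/fd_write
# against the blocktree. The path is illustrative.
with open("/apps/contacts/entries/alice", "rb") as block:
    entry = block.read()                 # read the block's decrypted payload

with open("/apps/contacts/entries/alice", "wb") as block:
    block.write(entry + b"\nupdated")    # write back with the node's privileges
\end{verbatim}
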
When an app is installed it is given a directory under which it can store data that is shared between all nodes
in the blocktree. The path of this block is formed by prefixing the path the app was published at
with the string ``/apps''. When an app runs on a node, it is confined to a block contained
in the node's directory. It is only allowed to read and write blocks within this block, but to allow it to
access shared data, a symbolic link is created to the app's shared directory in ``/apps''.
% App Publishing
Apps are published by writing them into a blocktree. The path of the directory used to publish an app is used to
identify it. Only one app per directory is allowed. This directory contains the WebAssembly module itself as
well as a JSON manifest. This manifest defines the app's user-friendly name as well as the list
of permissions it requires. This list of permissions is used to determine which APIs the app has access to.
% Privacy Safe vs Unsafe
Apps are broken into two categories: those that are privacy safe and those that are not.
An app is privacy unsafe if it requests any permissions which allow it to send data
outside of the blocktree it's part of. Thus requesting the ability to
open a TCP socket would cause an app to be privacy unsafe. Similarly, the creation of a protocol
handler for HTTP would also be privacy unsafe. Privacy unsafe apps can limit the scope of their
unsafety by
imposing limits on the unsafe APIs that they request. For instance, an app which needs to send
blocktree messages back to the blocktree it was published in can request the messaging permission
for a path in this tree. Similarly, an app which only wants to open a TCP socket listening on the
local network can limit the scope of its requested permission.
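
A manifest along these lines might look as follows. Every field name and the permission vocabulary here are assumptions made for illustration; the paper does not fix the manifest schema.
\begin{verbatim}
{
  "name": "Contacts",
  "module": "contacts.wasm",
  "permissions": [
    { "type": "messaging",  "path":  "/<root-fingerprint>/apps/contacts" },
    { "type": "tcp_listen", "scope": "local_network" }
  ]
}
\end{verbatim}
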
% Protocol Contracts
In order to make it easy to write apps which interface with existing systems, most notably
those using HTTP, apps are able to define protocols using a contract language and then register
callbacks for these protocols. This works by providing a system call to which the app can
supply a protocol contract. This contract is then compiled into a state machine by the node
and a handle is returned to the app. This handle is then used to register callbacks for different
parts of the protocol. This ensures that the protocol is handled by the node itself and that
protocols can be shared between many different apps as a library. For instance, the HTTP protocol
would be compiled to a particularly simple state machine, with only one state: the listening state.
This state would expose a hook where a callback can be registered to handle a request.
This state also defines the record format used to pass the request information to the callback
and the record format of the return value that is expected in order to produce the response.
More complicated (stateful) protocols would have more states, each defining their own request and
response records, as well as hooks. One nice thing about this setup is that it will enable
optimizations where the state machine and the user callbacks can be compiled into a program
which can be safely run in the node itself, or even in a SmartNIC. This would require that
the callbacks only use an approved set of APIs, but could enable much higher performance.
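
The app-side flow might look something like the following sketch. The contract text, the hook name, and the two calls \texttt{compile\_contract} and \texttt{register} are invented stand-ins for the system calls described above; a tiny in-process stub makes the sketch runnable on its own.
\begin{verbatim}
# Hypothetical sketch of the compile-contract / register-callback flow.
class ContractHandle:
    """Stand-in for the handle a node would return for a compiled contract."""
    def __init__(self, contract_text):
        self.contract_text = contract_text
        self.hooks = {}

    def register(self, state_name, callback):
        self.hooks[state_name] = callback

    def deliver(self, state_name, request):
        # The node's state machine would call this when it reaches state_name.
        return self.hooks[state_name](request)

def compile_contract(contract_text):
    # Stand-in for the system call that compiles a contract into a state machine.
    return ContractHandle(contract_text)

HTTP_CONTRACT = "protocol http { state listening { request record; response record } }"

def handle_request(request):
    # Called with a request record; returns the response record.
    return {"status": 200, "body": b"hello from a blocktree app"}

handle = compile_contract(HTTP_CONTRACT)
handle.register("listening", handle_request)
print(handle.deliver("listening", {"method": "GET", "path": "/"}))
\end{verbatim}
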
% Supervision Trees
Apps can also arrange themselves into supervision trees, in the same way that Erlang
processes are arranged. In this scheme, when a child app crashes, or the node it's
running on dies (which is detected by other nodes), the supervising app receives a message. In the
simplest case this can be used to implement a logging system, where crashes and node
deaths are recorded. More interestingly, this can be used to integrate with a control
plane. For instance, if a blocktree were running in AWS, when a message is received indicating
that a node has died, a new EC2 instance could be started to replace it. The reliability of Erlang
and other systems employing the actor model has demonstrated the robustness of this approach.
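
A supervisor's handler might look like the sketch below. The message shape and the restart helper are invented for illustration; only the ``child died, supervisor reacts'' pattern comes from the text.
\begin{verbatim}
# Hypothetical sketch of a supervisor callback.
def on_supervision_message(message, restart_child):
    if message["kind"] == "child_crashed":
        print(f"child {message['child']} crashed: {message['reason']}")
        restart_child(message["child"])
    elif message["kind"] == "node_died":
        # A control-plane integration could start a replacement instance here.
        print(f"node {message['node']} died; requesting a replacement")

# Toy invocation with a stand-in restart function.
on_supervision_message(
    {"kind": "child_crashed", "child": "mailer", "reason": "out of memory"},
    restart_child=lambda name: print(f"restarting {name}"))
\end{verbatim}
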
\section{A Brave New Web}
In order to explore how blocktree can be used, the design of several hypothetical systems
is discussed. It's important to note that blocktree does not try to force all computation
to be local to a user's device; it merely tries to enable this for applications where it
is possible.
\subsection{Contacts and Mail}
The first application we'll consider is one which manages a user's contacts. This would expose
the usual create, read, update and delete operations, allowing a user to input the name
of a person they know and associate
that name with their public key. Once the principal of a person is known, their public
key can be looked up in the global blocktree. This principal needs to be communicated to the
user via some out-of-band method. They could receive it in an email, a text message, or embedded
in a QR code. Of course this out-of-band communication needs to be authenticated, otherwise
it would be easy to fool the user into associating an attacker's key with the person.
The user now has a way of associating a blocktree with the name of this person. However, the
root public key of this blocktree is not enough to establish secure communications, because
the root private key is not available to every node in the person's blocktree. In particular,
it would be inadvisable for the root private key to be stored on a user's mobile device. To
address this, mailbox directories are created.
For each contact two directories are created: the inbox and the outbox. The user creates a readcap
for another user's root key and adds it to their outbox. The inbox for the other user is a symbolic
link to the user's outbox in the blocktree of the other user. Thus each user
can write messages into their own blocktree at a location where the other party knows how
to find them. But in order for a node to read these messages it requires its own readcap.
Only the root
nodes can issue this readcap as only they have access to the root key. Once permission has been
granted to a node, a root node can use the root key to decrypt the readcap issued to it, and then
encrypt it using the public key of the node. The resulting readcap is then stored in the header
of the inbox.
In addition to being able to check the inbox for mail, a blocktree message is sent to the receiving
blocktree when mail is sent. This message may contain the entire contents of the mail, if the
contents are short enough. But if the mail contains a lot of data, for instance a video, then
the message just serves as a notification that new mail is available.
\subsection{Social Network}
Building a social network on top of the contacts app is fairly straightforward. Once
a contacts entry has been created for a person, most interactions between the user and that
person can be implemented by sending mail. For example, when the
user sends a direct message to a person, mail is placed in their outbox
and a blocktree message is sent to the root cluster of their blocktree. If the user wishes to
send a message to a group of people, mail is sent to each one.
This same mechanism can be used to implement status updates. When a user updates their status,
they send mail to each of their ``friends''. The social networking app running on each of the contacts'
devices will then display the latest status update from the user as their current status. This
setup allows the user to ``unfriend'' anyone by simply omitting them from the list of recipients
of these updates. To allow comments and likes on the status update to be visible to everyone
that it was shared with, the block created in the outbox of each of the friends
is a symbolic link to a single status update block. This symbolic link contains the
readcap for the status update block.
Comments and likes on status updates are implemented by sending mail to the user who posted the
update. When one of the user's nodes receives this mail, it then updates the block containing
the status update with the new comment, or increments the like counter.
It then sends mail to the people this status update was shared with, notifying them
that the data has changed.
\subsection{An E-Commerce Website}
The previous two examples show how to build decentralized versions of existing web applications,
but blocktree also excels at building traditional web applications. Imagine an e-commerce website
with multiple warehouses, all of whose inventory is to be visible to customers of the site. Part
of the design consideration for the site is that the warehouses need to be able to update their
inventory even when their internet service goes down, and that these updates need to be visible
on the website once connectivity is restored.
To accomplish this a designer could create a directory for each warehouse. This directory would have
nodes attached to it that are physically located at each warehouse. The inventory of the warehouse
is then maintained in the warehouse's directory, satisfying the first requirement. Now, in order to
enable efficient queries of the overall available inventory, the data from each of the warehouses
needs to be merged. This is accomplished by creating another directory containing the merged data.
In event sourcing terms this is called a read-model. This directory will have another cluster of
nodes attached which will act as web servers. These nodes will subscribe to events published by the
warehouse clusters, events indicating changing levels of inventory, and digest this information into
a format that can be efficiently queried. These nodes will also publish events to the warehouses
when orders are placed, cancelled or amended.
When a warehouse goes offline, its previous inventory counts are
still recorded in the read-model, so queries can still be answered using the cluster's best
knowledge of the warehouse. Once the warehouse comes back online, the events that were recorded by the
warehouse cluster while it was offline can be replayed to the web server cluster, and their read-model
can be brought up to date.
One of the advantages of this approach is that the cluster of web servers need not be physically
close to each other. In fact they could be spread over several data centers, even in different
cloud providers. This allows for greater fault tolerance and reliability. Of course, running a
consensus cluster over a larger area means more latency and thus reduced performance, but if
the workload is read-heavy, and writes can be handled mostly in the warehouse clusters, then this
tradeoff may be worthwhile.
I hope this example shows that having a standard format for data and the federation of servers
can provide designers with much greater flexibility, even if they do not care about decentralization
or their users' privacy.
\subsection{The Open Metaverse}
As a final example I'd like to consider a platform for recording spatial information. The key insight
that enables this is very general: blocktree enables the creation of distributed tree-like data structures.
For instance, it's straightforward to imagine creating a distributed hashtable implemented as a red-black tree.
This impedance match between efficient query structures and the design of blocktree is one
of the reasons why I believe it is so useful. The particular data structure germane to building the metaverse
is the Binary Space Partition (BSP) tree.
I don't see the metaverse as being one world, but rather many. Each world would be hosted by its own blocktree.
The world will have a directory in the blocktree, and that directory will be the root of a BSP tree. Read
access to the world is controlled by adding readcaps to this directory. If the world is to be
publicly accessible, then its block can be stored in plaintext.
Thinking of worlds
like ours (those that can be reasonably approximated as the surface of a sphere), we can impose the usual latitude
and longitude coordinate system. We can then define parcels in the world by specifying the polygonal boundary as
a set of these coordinates (more precisely, the parcel is the convex hull of this set of points). These parcels
are then recorded as blocks in the world's directory, whose path is determined by their virtual location using
the BSP algorithm. If a parcel
is owned by the same user who owns the world, then the data contained in the parcel is stored in
the same blocktree. However, if a parcel is to be owned by another user, then a symbolic link is created
pointing to a block in the owner's blocktree. They can then write whatever data they want into this block, defining
the contents of their parcel. Collaboration on a single parcel is accomplished by issuing a readcap and writecap to
another user.
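
To show how a location could map onto a block path, the sketch below repeatedly bisects the latitude/longitude rectangle and uses the side chosen at each level as a path component. A real BSP tree would use the parcel's polygon and arbitrary splitting planes; axis-aligned halving and the path labels are simplifying assumptions.
\begin{verbatim}
# Hypothetical sketch: deriving a block path for a parcel from its coordinates
# by alternately bisecting longitude and latitude.
def bsp_path(lat, lon, depth=8):
    lat_lo, lat_hi, lon_lo, lon_hi = -90.0, 90.0, -180.0, 180.0
    parts = []
    for level in range(depth):
        if level % 2 == 0:                       # split on longitude
            mid = (lon_lo + lon_hi) / 2
            parts.append("e" if lon >= mid else "w")
            lon_lo, lon_hi = (mid, lon_hi) if lon >= mid else (lon_lo, mid)
        else:                                    # split on latitude
            mid = (lat_lo + lat_hi) / 2
            parts.append("n" if lat >= mid else "s")
            lat_lo, lat_hi = (mid, lat_hi) if lat >= mid else (lat_lo, mid)
    return "/world/" + "/".join(parts)

# Nearby locations tend to map to nearby directories in the tree.
print(bsp_path(48.8566, 2.3522))    # e.g. Paris
print(bsp_path(52.5200, 13.4050))   # e.g. Berlin
\end{verbatim}
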
It's easy to imagine that one world would be more important than the rest, and that the creation of a metaverse
representation of Earth will be an important undertaking. The hierarchical nature of permissions in blocktree
makes such a shared world possible. National blocktrees could be given ownership of their virtual territory,
which would then
be delegated down to the state and municipal levels. Finally, municipalities would delegate individual parcels
to their actual owners. Owners could then use these as they see fit, including leasing them to third parties.
The ending date of such a lease would be enforced technically by the writecap issued to the lessee; when it
expires, so too does the lease.
\section{Conclusion}
In this paper I have given the outline of a decentralized method of organizing information, computation,
and trust in a way that I hope will be useful and easy to use. The use of cryptographic primitives for
implementing access control was discussed, as well as methods of protecting private keys. A blockchain
and corresponding cryptocurrency was proposed as a means of incentivizing the distribution of data.
Erasure coding was used to ensure that distributed data is resilient to the loss of nodes and can be
reconstructed efficiently. A programming environment based on WASM and WASI was proposed as a way
of providing an interface to this data. APIs for defining protocol contracts and efficient web servers
were indicated. APIs for constructing supervision trees were mentioned as a means for building reliable
systems.
When Sir Tim Berners-Lee invented HTTP, he could not have anticipated the applications
that his inventions would bring about. It has been said that the key to the success of his system is
that it made network programming so easy that anyone could do it. I don't know what the future will
bring, but I hope that this system, or one like it, will enable anyone to fearlessly build distributed
systems.
One thing is certain, however: it is a moral imperative that we provide users with viable alternatives
to online services which harvest their data and weaponize it against them. Only then will the web
become the place it was meant to be.
\end{document}