
Started writing a new paper which emphasizes the actor runtime
over the filesystem and its cryptographic mechanisms.

Matthew Carr 1 year ago
parent
commit
bd0bf2b43b
2 changed files with 178 additions and 0 deletions
  1. 144 0
      doc/BlocktreeCloudPaper/BlocktreeCloudPaper.tex
  2. 34 0
      doc/BlocktreeCloudPaper/notes.md

+ 144 - 0
doc/BlocktreeCloudPaper/BlocktreeCloudPaper.tex

@@ -0,0 +1,144 @@
+\documentclass{article}
+\usepackage[scale=0.8]{geometry}
+
+\title{The Blocktree Cloud Orchestration Platform}
+\author{Matthew Carr}
+
+\begin{document}
+\maketitle
+\begin{abstract}
+This document is a proposal for a novel cloud platform called Blocktree.
+The system is described in terms of the actor model,
+where tasks and services are implemented as actors.
+The platform is responsible for orchestrating these actors on a set of native operating system processes.
+A service is provided to actors that gives them access to a highly available distributed filesystem,
+which serves as the only source of persistent state for the system.
+High availability is achieved using the Raft consensus protocol to synchronize the state of files between processes.
+All data stored in the filesystem is secured with strong integrity and optional confidentiality protections.
+An interface resembling a network block device allows for fast low-level read and write access to the encrypted data,
+with full support for client-side encryption.
+Well-known cryptographic primitives and constructions are employed to provide this protection;
+the system does not attempt to innovate in terms of cryptography.
+The system's trust model allows for mutual TLS authentication between all processes in the system,
+even those which are controlled by different owners.
+By integrating these ideas into a single platform,
+the system aims to advance the status quo in the security and reliability of software systems.
+\end{abstract}
+
+\section{Introduction}
+% Describe paths, actors, and files. Emphasize the benefit of actors and files sharing the same
+% namespace.
+Blocktree is an attempt to extend the Unix philosophy that everything is a file
+to the entire distributed system that comprises modern IT infrastructure.
+The system is organized around a global distributed filesystem which defines security
+principals, resources, and their authorization attributes.
+This filesystem provides a language for access control that can be used to securely grant principals
+access to resources from different organizations, without the need to set up federation.
+The system provides an actor runtime for orchestrating tasks and services.
+Resources are represented by actors, and actors are grouped into operating system processes.
+Each process has its own credentials which authenticate it as a unique security principal,
+and which specify the filesystem path where the process is located.
+A process has authorization attributes which determine the set of processes that may communicate with it.
+Every connection between processes is established using mutual TLS authentication,
+which is accomplished without the need to trust any third-party certificate authorities.
+The cryptographic mechanisms which make this possible are described in detail in section 4.
+Messages addressed to actors in a different process are forwarded over these connections,
+while messages addressed to actors in the same process are delivered without copying.
+
+One of the major challenges in distributed systems is managing persistent state.
+Blocktree solves this issue using its distributed filesystem.
+Files are broken into segments called sectors.
+The sector size of a file can be configured when it is created,
+but cannot be changed after the fact.
+Reads and writes of individual sectors are guaranteed to be atomic.
+The sectors which comprise a file and its metadata are replicated by a set of processes running
+the sector service.
+This service is responsible for storing the sectors of the files in the directory
+that contains the process in which it is running.
+The actors providing the sector service in a given directory coordinate with one another using
+the Raft protocol to synchronize the state of the sectors they store.
+This method of partitioning the data in the filesystem based on directory
+allows the system to scale beyond the capabilities of a single consensus cluster.
+Sectors are secured with strong integrity protection,
+which allows anyone to verify that their contents were written by an authorized principal.
+Encryption can be optionally applied to sectors,
+with the system handling key management.
+The cryptographic mechanisms used to implement these protections are described in section 4.
+
+To reduce load on the sector service, and to allow the system to scale to a larger number of users,
+a peer-to-peer distribution system is implemented in the filesystem service.
+This system allows filesystem actors to download sectors from other filesystem actors
+that have the sectors in their local cache.
+The threat of malicious actors serving bad sector data is mitigated by the strong integrity
+protections applied to sectors.
+By using peer-to-peer distribution, the system can serve as a content delivery network.
+
+One of the design goals of Blocktree is to facilitate the creation of composable distributed
+systems.
+A major challenge to building such systems is the difficulty in pinning down bugs when they
+inevitably occur.
+Research into session types (also known as behavioral types) promises to bring the safety benefits
+of type checking to actor communication.
+Blocktree integrates a session typing system that allows protocol contracts to be defined that
+specify the communication patterns of a set of actors.
+This model allows the state space of the set of actors participating in a computation to be defined,
+and the state transitions which occur to be specified based on the types of received messages.
+These contracts are used to verify protocol adherence statically and dynamically.
+This system is implemented using compile-time code generation,
+making it a zero-cost abstraction.
+Freeing developers from dealing with the numerous failure modes that occur in a communication protocol
+lets them focus on the functionality of their system.
+
+% Describe the remainder of the paper.
+The remainder of this paper is structured as follows:
+\begin{itemize}
+  \item Section 2 describes the actor runtime, service and task orchestration, and service
+    discovery.
+  \item Section 3 discusses the filesystem, its concurrency semantics and implementation.
+  \item Section 4 details the cryptographic mechanisms used to secure communication between
+    actor runtimes and to protect sector data.
+  \item Section 5 is a set of examples describing ways that Blocktree can be used to build systems.
+  \item Section 6 provides some concluding remarks.
+\end{itemize}
+
+\section{Actor Runtime}
+% message passing interface
+
+% btmsg and how it functions as a secure transport.
+
+% security model based on filesystem permissions
+
+% service discovery.
+
+% protocol contracts, and runtime checking of protocol adherence. Emphasize the benefits to
+% system composability that this enables, where errors can be traced back to the actor which
+% violated the contract.
+
+\section{Filesystem}
+% Benefits of using a distributed filesystem as the sole source of persistent state for the system,
+% including secure software delivery.
+
+% Accessing data at two different levels of abstraction: sectors and files.
+
+% Concurrency semantics at the sector layer, and their implementation using Raft.
+
+\section{Cryptography}
+
+\section{Examples}
+This section contains examples of systems built using Blocktree. The hope is to illustrate how this
+platform can be used to implement existing applications more easily and to make it possible to
+implement systems which are currently out of reach.
+
+\subsection{A personal cloud for a home user.}
+% Describe my idealized home Blocktree setup.
+
+\subsection{An ecommerce website.}
+% Describe a blocktree which runs a cluster of webservers, a manufacturing process, a warehouse
+% inventory management system, and an order fulfillment system.
+
+\subsection{A realtime geospatial environment.}
+% Explain my vision of the metaverse.
+
+\section{Conclusion}
+
+\end{document}
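
Note on the protocol contracts described in the introduction above: the draft says a contract defines the state space of a set of actors and the transitions triggered by the types of received messages, and that adherence is checked both statically and dynamically. The following is a minimal Python sketch of the dynamic half only, under stated assumptions. The contract table, the ping/pong message names, and the `check_trace` helper are all hypothetical and are not taken from Blocktree; the draft describes compile-time code generation, which this sketch does not attempt to reproduce.

```python
from dataclasses import dataclass

# Hypothetical contract for a two-actor ping/pong exchange. None of these
# names come from Blocktree; they only illustrate the idea.
# Maps (current state, received message type) -> next state.
PING_CONTRACT = {
    ("Idle", "Ping"): "AwaitingPong",
    ("AwaitingPong", "Pong"): "Idle",
    ("Idle", "Close"): "Done",
}


@dataclass(frozen=True)
class Violation:
    index: int    # position of the offending message in the trace
    sender: str   # actor that sent it
    state: str    # contract state when it arrived
    message: str  # message type that was not allowed in that state


def check_trace(contract, start, trace):
    """Replay (sender, message type) pairs against the contract.

    Returns (final_state, None) if every message was allowed, or
    (state_at_failure, Violation) for the first message that was not.
    """
    state = start
    for index, (sender, message) in enumerate(trace):
        next_state = contract.get((state, message))
        if next_state is None:
            return state, Violation(index, sender, state, message)
        state = next_state
    return state, None
```

Replaying the trace `[("client", "Ping"), ("client", "Ping")]` from `"Idle"` stops at the second message and names `client` as the sender that broke the contract, which is the kind of attribution the Section 2 outline asks for when it says errors can be traced back to the violating actor.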

+ 34 - 0
doc/BlocktreeCloudPaper/notes.md

@@ -0,0 +1,34 @@
+- Actor runtime
+  * Messages securely forwarded over the network.
+  *
+
+- Distributed network storage system.
+  * Sector-level access to data.
+  * File-level access to data.
+
+
+## Process of delegating storage in a directory.
+1. A new directory is created. This directory has the generation number of the original sector
+   cluster.
+2. A process credential file is created in the directory. It is marked to indicate that the process
+   will host the sector service. This mark means that the process will be responsible (jointly,
+   along with all other such processes in the directory) for storing the sectors in the directory.
+3. The new process starts and initializes a new directory in its local filesystem to store sector
+   data. It knows to create this directory because it is configured to run the sector service,
+   which creates a new storage directory if one does not already exist. As part of the creation
+   process a new superblock is created, which is the file with inode 1 and which is not contained
+   in any directory. This new superblock contains the generation number which identifies the sector
+   service in this directory. The generation number is determined by contacting the sector service
+   in the root directory, which has knowledge and authority to assign unique numbers to every
+   sector service.
+4. The filesystem service in the directory will discover the sector service actor running inside the
+   new process. When it creates new files in the directory it will store their sectors using the
+   sector service in the process. These new files will use the generation number defined in the
+   superblock stored in the sector service in the directory, which is different from the generation
+   number of the directory itself.
+5. When new processes configured to run the sector service are added to the directory, they
+   automatically replicate sectors marked with their generation number, and use Raft to ensure the
+   consistency of sector data.
+6. Note that the sectors of the directory itself are actually stored by the parent sector service.
+   Only the files created within it after the sector service in the directory becomes active
+   are stored by the child sector service.
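
The delegation steps above can be sketched as a small in-memory model. This is only an illustration under assumptions: the class and method names are hypothetical, the root service is given generation number 0 for convenience, "the original sector cluster" in step 1 is read as the parent sector service, and replication and Raft (step 5) are left out entirely.

```python
class SectorService:
    """Toy stand-in for the sector service of one directory (hypothetical)."""

    def __init__(self, generation):
        self.generation = generation
        self.sectors = {}  # (inode, sector index) -> bytes


class RootSectorService(SectorService):
    """The root service also assigns unique generation numbers (step 3)."""

    def __init__(self):
        super().__init__(generation=0)  # 0 for the root is an assumption
        self._next_generation = 1

    def assign_generation(self):
        generation = self._next_generation
        self._next_generation += 1
        return generation


class Directory:
    def __init__(self, parent_service):
        # Steps 1 and 6: the directory's own sectors are stored by the parent
        # sector service, so it carries the parent's generation number.
        self.generation = parent_service.generation
        self.parent_service = parent_service
        self.child_service = None

    def start_sector_service(self, root):
        # Steps 2 and 3: a process marked to host the sector service starts
        # and obtains a fresh generation number from the root.
        self.child_service = SectorService(root.assign_generation())
        return self.child_service

    def service_for_new_file(self):
        # Steps 4 and 6: files created after the child service is active are
        # stored by it; before that, the parent service stores them.
        if self.child_service is not None:
            return self.child_service
        return self.parent_service
```

In this model a directory created under the root keeps generation 0 while its delegated sector service receives generation 1, which mirrors the note in step 4 that new files use a generation number different from that of the directory itself.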