
Wrote all but one of the examples for the new paper.

Matthew Carr 1 year ago
parent
commit
11e9537c6d
1 changed file with 230 additions and 12 deletions
+ 230 - 12
doc/BlocktreeCloudPaper/BlocktreeCloudPaper.tex

@@ -1031,18 +1031,236 @@ one can begin monitoring files as soon as they are created.
 
 
 
 
 \section{Examples}
-This section contains examples of systems built using Blocktree. The hope is to illustrate how this
-platform can be used to implement existing applications more easily and to make it possible to
-implement systems which are currently out of reach.
-
-\subsection{A personal cloud for a home user.}
-% Describe my idealized home Blocktree setup.
-
-\subsection{An ecommerce website.}
-% Describe a blocktree which runs a cluster of webservers, a manufacturing process, a warehouse
-% inventory management system, and an order fulfillment system.
-
-\subsection{A smart home.}
+This section contains examples of systems that could be built using Blocktree.
+The hope is to illustrate how this platform can be used to implement existing applications more
+easily and to make it possible to implement systems which are currently out of reach.
+
+\subsection{A distributed AI execution environment.}
+Neural networks are just vector-valued functions with vector inputs,
+albeit very complicated ones with potentially billions of parameters.
+But, just like any other computation,
+these functions can be conceptualized as computational graphs.
+Imagine that you have a set of computers equipped with AI accelerator hardware
+and you have a neural network that is too large to be processed by any one of them.
+By partitioning the graph into small enough subgraphs,
+we can break the network down into pieces which can be processed by each of the accelerators.
+The full network can be stitched together by passing messages between each of these pieces.
+
+Let us consider how this could be accomplished with Blocktree.
+We begin by provisioning a runtime on each of the accelerator machines,
+each of which will have a new accelerator service registered.
+Messages will be sent to the accelerator service describing the computational graph to execute,
+as well as the name of the actor to which the output is to be sent.
+When such a message is received by an accelerator service provider,
+it spawns an actor which compiles its subgraph to a kernel for its accelerator
+and remembers the name of the actor to send its output to.
+An orchestrator service will be responsible for partitioning the graph and sending these messages.
+Ownership of the actors spawned by the accelerator service is given to the orchestrator service,
+ensuring that they will all be stopped when the orchestrator returns.
+When one of the spawned actors stops,
+it unloads the kernel from the accelerator's memory and returns the accelerator to its initial state.
+Note that the orchestrator actor must have execute permissions on each of the accelerator runtimes
+in order to send messages to them.
+The orchestrator dispatches messages to the accelerator service in reverse order of the flow of data
+in the computational graph,
+so that it can tell each service provider where its output should be sent.
+The actors responsible for the last layer in the computational graph send their output to the
+orchestrator.
+To begin the computation,
+the actors which are responsible for input are given the filesystem path of the input data.
+The orchestrator learns of the completion of the computation once it receives the output from
+the final layer.
+It can then save these results to the file system and return.
+Because inference and training can both be modeled by computational graphs,
+this same procedure can be used for both.
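+
+The reverse-order dispatch described above can be sketched as follows.
+This is only an illustration for a linear pipeline of subgraphs, not the Blocktree API:
+the \texttt{RunSubgraph} message, the \texttt{dispatch\_reverse} function,
+and the \texttt{spawn} callback (which stands in for sending a message to an accelerator service
+and learning the name of the actor it spawned) are all hypothetical names.
```rust
/// Hypothetical message sent to an accelerator service provider.
#[derive(Debug, Clone, PartialEq)]
struct RunSubgraph {
    /// Identifier of the subgraph to compile (illustrative only).
    subgraph: String,
    /// Name of the actor that should receive this subgraph's output.
    output_to: String,
}

/// Dispatches a linear pipeline of subgraphs, given in data-flow order,
/// in reverse order of the flow of data.
///
/// `spawn` is an assumed stand-in for delivering the message to an
/// accelerator service; it returns the name of the actor that was spawned.
fn dispatch_reverse<F>(subgraphs: &[&str], orchestrator: &str, mut spawn: F) -> Vec<RunSubgraph>
where
    F: FnMut(&RunSubgraph) -> String,
{
    let mut sent = Vec::with_capacity(subgraphs.len());
    // The actors for the last stage send their output to the orchestrator.
    let mut downstream = orchestrator.to_string();
    for subgraph in subgraphs.iter().rev() {
        let msg = RunSubgraph {
            subgraph: subgraph.to_string(),
            output_to: downstream,
        };
        // The newly spawned actor becomes the output target of the
        // stage immediately upstream of it.
        downstream = spawn(&msg);
        sent.push(msg);
    }
    sent
}
```
+Because each message is sent only after the downstream actor has been spawned,
+every message can name a destination that already exists.
+A real computational graph would be a DAG rather than a chain,
+in which case a reverse topological order would play the same role.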
+
+\subsection{A decentralized social media network.}
+One of the original motivations for designing Blocktree was to create a platform for a social
+network that puts users fully in control of their data.
+In the opinion of the author,
+the only way to actually accomplish this is for users to host the data themselves.
+One might think it is possible to use client-side encryption to solve the privacy issue,
+but this does not solve the full problem.
+While it is true that good client-side encryption will prevent the service provider from reading
+the user's data,
+the user could still lose everything if the service provider goes out of business or simply
+decides to stop offering its service.
+Similarly, putting data in a federated system, as has been proposed by the Mastodon developers,
+also puts the user at risk of losing their data if the operator of the server they use decides to
+shut it down.
+To have real control, the user must host the data themselves.
+Then they decide how it is encrypted, how it is served, and to whom.
+
+Let us explore how Blocktree can be used to build a social media platform which provides this
+control.
+To participate in this network, each user will need to set up their own domain by generating new root
+credentials
+and provisioning at least one runtime to host the social media service.
+A technical user could do this on their own hardware by reading the Blocktree documentation,
+but a non-technical user might choose to purchase a new router with Blocktree pre-installed.
+By connecting this router directly to their WAN,
+the user ensures that the services running on it will always have direct internet access.
+The user can access the \texttt{btconsole} web GUI via the router's WiFi interface to generate their
+root credentials and provision new runtimes on their network.
+
+A basic function of any social network is keeping track of a user's contacts.
+This would be handled by maintaining the contacts as files in a well-known directory in the user's
+domain.
+Each file in the directory would be named using the user-defined nickname for the contact
+and its contents would include the root principal of the contact as well as any additional
+user-defined attributes,
+such as address or telephone number.
+The root principal would be used to discover runtimes controlled by the contact
+so that messages can be sent to the social media service running in them.
+When a user adds a new contact,
+a connection message would be sent to it,
+which the contact could choose to accept or reject.
+If accepted,
+the contact would create an entry in its contacts directory for the user.
+The contact's social media service would then accept future direct messages from the user.
+When the user sends a direct message to the contact,
+its runtime discovers runtimes controlled by the contact and delivers the message.
+Once delivered, the contact's social media service stores the message in a directory for the user's
+correspondence,
+sort of like an mbox directory but where messages are sorted into directories based on sender
+instead of receiver.
+
+Note that this procedure only works if a contact's root principal can be resolved using the
+search domain configured in the user's runtime.
+We can ensure this is the case by configuring the runtime to use a search domain that operates
+a Dynamic DNS (DDNS) service
+and by arranging with this service to create the correct records to resolve the root principal.
+The author intends to operate such a service to facilitate the use of Blocktree by home users,
+but a more long-term solution is to implement a blockchain for resolving root principals.
+Only then would the system be fully decentralized.
+
+A public post is made by creating a file containing the HTML contents of the post in a well-known
+posts directory.
+This file, the directory containing it, and all of its parent directories
+would be configured to allow others to read them and, in the case of directories, execute them.
+At least one runtime with the filesystem service registered would need to have the execute
+permission granted to others to allow anyone to access these files.
+When someone wants to view the posts of another user,
+they would use the filesystem service to read these files from the well-known posts directory.
+
+Of course, users would not be using a file manager to interact with this social network;
+they would use their browsers as they do now.
+This web interface would be served by the social media service in their domain.
+A normal user who has a Blocktree-enabled router would just type a special hostname into their
+browser to open this interface.
+Because the router provides DNS services to their network,
+it can generate the appropriate records to ensure this name resolves to the address where the social
+media service is listening.
+The social media service would be responsible for sending messages to other users' domains to
+get their posts,
+and for reading the filesystem to display the user's direct messages.
+All this file data would be used to populate the web interface.
+It is not hard to see how the same system could be used to serve any type of media: text, images,
+video, immersive 3D worlds.
+All of these can be stored in files in the filesystem,
+and so all of them are accessible to Blocktree actors.
+
+One issue that must be addressed with this design is how it will scale to a large number of users
+accessing data at once.
+In other words,
+what happens if the user goes viral?
+Currently, the way to solve this would be to add more computers to the user's network which run
+the sector and filesystem services.
+This is not ideal as it means the user would need to buy more hardware to serve their dank memes.
+A better solution would be to implement peer-to-peer distribution of sector data in the filesystem
+service.
+This would reduce the load on the user's computers and allow their followers to share the posted
+data with each other.
+This work is planned as a future enhancement.
+
+\subsection{A smart lock.}
+The access control language provided by Blocktree's filesystem can be used for more than just
+authorizing access to data.
+To illustrate this point,
+consider a smart lock installed on the front door of a company's office building.
+When the company first got the lock, they used NFC to configure the lock
+and connect it to their WiFi network.
+The lock then used link-local runtime discovery to perform automatic provisioning.
+An IT administrator accessed \texttt{btconsole} to approve the provisioning request
+and position the lock in a specific directory in the company's domain.
+Permission to actuate the lock is granted if a principal has execute permission on the lock's file.
+To verify the physical presence of an employee,
+NFC is used for the authentication handshake.
+When an employee presses their NFC device, for instance their phone, to the lock,
+the lock generates a nonce and transmits it to the device.
+The device then signs the nonce using the credentials it used during provisioning in the company's
+domain.
+It transmits this signature to the lock along with the path to the principal's file in the domain.
+The lock then reads this file to obtain the principal's authorization attributes and its public key.
+It uses the public key to validate the signature presented by the device.
+If this is successful,
+it then checks the authorization attributes of the principal against the authorization attributes on
+its own file.
+If execute permissions are granted, the lock actuates, allowing the employee access.
+The administrators of the company's domain create a group specifically for controlling physical
+access to the building.
+All employees with physical access permission are added to this group,
+and the group is granted execute permission on the lock,
+rather than individual users.
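+
+The lock's final decision can be sketched as follows.
+This sketch assumes Unix-like authorization attributes (an owner, a group, and mode bits),
+which is an assumption made for illustration and not a description of Blocktree's actual format;
+the type and function names are hypothetical.
+Validation of the signature over the nonce is not shown and is represented by a boolean.
```rust
/// Authorization attributes on a file, assumed here to be Unix-like.
#[derive(Debug, Clone)]
struct FileAttrs {
    owner: u32,
    group: u32,
    /// Permission bits in the usual rwxrwxrwx layout.
    mode: u16,
}

/// A principal's identity as read from its file in the domain (hypothetical).
#[derive(Debug, Clone)]
struct Principal {
    id: u32,
    groups: Vec<u32>,
}

const OWNER_EXEC: u16 = 0o100;
const GROUP_EXEC: u16 = 0o010;
const OTHER_EXEC: u16 = 0o001;

/// Returns true if `principal` has execute permission on a file with `attrs`.
fn can_execute(principal: &Principal, attrs: &FileAttrs) -> bool {
    if principal.id == attrs.owner {
        (attrs.mode & OWNER_EXEC) != 0
    } else if principal.groups.contains(&attrs.group) {
        (attrs.mode & GROUP_EXEC) != 0
    } else {
        (attrs.mode & OTHER_EXEC) != 0
    }
}

/// The lock actuates only if the signature over the nonce was valid
/// and the principal may execute the lock's file.
fn should_actuate(signature_valid: bool, principal: &Principal, lock: &FileAttrs) -> bool {
    signature_valid && can_execute(principal, lock)
}
```
+Under these assumptions, a lock file with mode \texttt{0o750} whose group is the physical access
+group admits any member of that group who presents a valid signature, and nobody else.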
+
+\subsection{A traditional three-tier web application.}
+While it is hoped that Blocktree will enable interesting and novel applications,
+it can also be used to build the kind of web applications that are common today.
+Suppose that we wish to build a three-tier web application.
+Let us explore how Blocktree could help.
+
+First, let us consider which database to use.
+It would be desirable to use a traditional SQL database,
+preferably one which is open source and not owned by a large corporation with dubious motivations.
+These constraints lead us to choose Postgres,
+but Postgres was not designed to run on Blocktree.
+However, Postgres does have a container image available on Docker Hub,
+so we can create a service to run this container image in our domain.
+But Postgres stores all of its data in the local filesystem of the machine it runs on.
+How can we ensure this does not become a single point of failure?
+First, we should create a directory in our domain to hold the Postgres cluster.
+Then we should procure at least three servers for our storage cluster
+and provision runtimes hosted on each of them in this directory.
+The sector service is registered on each of the runtimes,
+so all the data stored in the directory will be replicated on each of the servers.
+Now, the Postgres service should be registered in one and only one of these runtimes,
+as Postgres requires exclusive access to its database cluster.
+We now have to decide how other parts of the system are going to communicate with Postgres.
+We could have the Postgres service set up port forwarding for the container,
+so that ordinary network connections can be used to talk to it.
+But we will have to set up TLS if we want this to be secure.
+The alternative is to use Blocktree as a VPN and proxy network communications in messages.
+This is accomplished by registering a proxy service in the same runtime as the Postgres service
+and configuring it to allow traffic it receives to pass to the Postgres container on TCP port 5432.
+
+In a separate directory,
+a collection of runtimes is provisioned which will host the webapp service.
+This service will use axum to serve the static assets for our site,
+including the Wasm modules which make up our frontend,
+as well as our site's backend.
+In order to do this,
+it will need to connect to the Postgres database.
+This is accomplished by registering the proxy service in each of the runtimes hosting the
+webapp service.
+The proxy service is configured to listen on TCP 127.0.0.1:5432 and forward all traffic
+to the proxy service in the Postgres directory.
+The webapp can then use the \texttt{tokio-postgres} crate to establish a TCP connection to
+127.0.0.1:5432
+and it will end up talking to the containerized Postgres instance.
+
+Although the data in our database is stored redundantly,
+we do still have a single point of failure in our system,
+namely the Postgres container.
+To handle this we can implement a failover service.
+It will work by calling the Postgres service with heartbeat messages.
+If too many of these time out,
+we assume the service is dead and start a new instance on one of the other runtimes in the Postgres
+directory.
+This new instance will have access to all the same data as the old one,
+including its write-ahead log.
+Assuming it can complete any in-progress transactions,
+the new service will come up after a brief delay
+and the system will recover.
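+
+The decision logic of such a failover service can be sketched as follows.
+The text only says that ``too many'' timeouts trigger a failover;
+this sketch assumes that means a fixed number of consecutive missed heartbeats.
+The \texttt{FailoverMonitor} type and its methods are hypothetical,
+and sending heartbeats and starting the new Postgres instance are left out.
```rust
/// What the failover service should do after a heartbeat attempt.
#[derive(Debug, PartialEq)]
enum Action {
    /// The Postgres service is still considered alive.
    KeepWaiting,
    /// Too many heartbeats were missed; start a new instance elsewhere.
    StartReplacement,
}

/// Tracks consecutive missed heartbeats (an assumed policy).
struct FailoverMonitor {
    max_missed: u32,
    missed: u32,
}

impl FailoverMonitor {
    /// `max_missed` is the number of consecutive timeouts that is tolerated
    /// before the service is assumed to be dead.
    fn new(max_missed: u32) -> Self {
        FailoverMonitor { max_missed, missed: 0 }
    }

    /// Records the outcome of one heartbeat: `replied` is false on timeout.
    fn record(&mut self, replied: bool) -> Action {
        if replied {
            // Any reply resets the count of consecutive timeouts.
            self.missed = 0;
            return Action::KeepWaiting;
        }
        self.missed += 1;
        if self.missed >= self.max_missed {
            // Reset so the replacement instance starts with a clean count.
            self.missed = 0;
            Action::StartReplacement
        } else {
            Action::KeepWaiting
        }
    }
}
```
+Counting only consecutive timeouts keeps a single slow reply from triggering a failover,
+at the cost of a detection delay of roughly \texttt{max\_missed} heartbeat intervals.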
 
 
 \subsection{A realtime geospatial environment.}
 % Explain my vision of the metaverse.