Distributed network storage system.
Sector-level access to data.
File-level access to data.
There are four cases to consider, depending on what permissions the discovering runtime has for the file being accessed:
In the first case, the sector service needs to discover all of the other sector service providers in its directory. Once it has connected to all of them, sectors can be reconstructed and written to the cluster. It makes sense to have the filesystem service registered in such a runtime, because this would allow all filesystem operations to happen locally. More precisely, the filesystem service would access the local sector service, which may still need to communicate with its peers in the directory when data is written. In this case the runtime needs to be able to find all of the runtimes hosting the sector service in its directory.
In the second case the runtime needs to be able to discover the correct sector service provider to connect to. It seems that it needs to find a runtime hosting the sector service contained in one of its parent directories. Once such a runtime is found, messages can be delivered to it to access the sectors of the file, and their contents will be decrypted locally.
In the third case, the runtime must locate the closest runtime hosting the filesystem service which is contained in one of the runtime's parent directories. This should be the same query as in case 2, just used for the filesystem service instead of the sector service.
In case four, the process must discover a filesystem service hosting the file. This case doesn't actually seem any different from case 3; it's just performed with no authorization attributes. So in terms of FS permissions, a file can only be accessed in this way if it allows others to read it and all of its parent directories can also be read by others. This requirement that all parent directories be readable by others would be too strict for non-anonymous access. It's important to allow credentialed access to a file when a process has permission to that specific file, even if the process can't access one or more of the file's parents. This helps to keep the system flexible.
There seem to be two queries which are needed to locate the appropriate runtimes. A query is executed with respect to a scope and only considers runtimes with a given service registration.
These queries correspond to the two ways that messages can be dispatched by an actor.
There are three cases to consider when defining the security model for runtime queries:
In the first two cases the query should be allowed. In the third case it should only be allowed if every file on the path from the scope to the root permits others to read.
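The check for the third case can be sketched as follows. This is only an illustration: the `Meta` struct, the Unix-style mode bits, and the function name are assumptions, not part of the existing code.

```rust
/// Hypothetical per-file metadata. Unix-style mode bits are an assumption.
#[derive(Debug, Clone, Copy)]
pub struct Meta {
    pub mode: u32,
}

/// The "others may read" permission bit in a Unix-style mode.
const OTHERS_READ: u32 = 0o004;

/// `path_to_root` holds the metadata of every file from the query scope up to
/// the root, inclusive. The query is allowed only if each one is readable by
/// others.
pub fn anonymous_query_allowed(path_to_root: &[Meta]) -> bool {
    path_to_root.iter().all(|meta| meta.mode & OTHERS_READ != 0)
}
```

A single directory on the path with mode `0o750` is enough to reject the query, even if the scope itself is world-readable.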
When a runtime receives a query it should use the filesystem to answer it. If, as it navigates to the scope, it encounters a directory which it is not responsible for storing, it will return a redirection to the querier with the IP address of a runtime where the query should be retried. This process repeats until the query is answered, either successfully with one or more runtimes or with an error and no runtimes.
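From the querier's side, the redirect loop might look like the sketch below. The `Reply` enum and the `ask` callback (standing in for the actual network call) are hypothetical, and the `max_hops` limit is an added safeguard against redirect cycles that the notes above do not specify.

```rust
use std::net::IpAddr;

/// Hypothetical reply a runtime gives to a query.
#[derive(Debug, Clone, PartialEq)]
pub enum Reply {
    /// The query was answered with one or more runtimes.
    Found(Vec<IpAddr>),
    /// The query should be retried at this runtime.
    Redirect(IpAddr),
    /// The query failed and no runtimes are returned.
    Error(String),
}

/// Follows redirections starting at `start` until the query is answered.
/// `ask` stands in for sending the query to a runtime over the network.
pub fn resolve<F>(start: IpAddr, max_hops: usize, mut ask: F) -> Result<Vec<IpAddr>, String>
where
    F: FnMut(IpAddr) -> Reply,
{
    let mut target = start;
    for _ in 0..max_hops {
        match ask(target) {
            Reply::Found(runtimes) => return Ok(runtimes),
            Reply::Redirect(next) => target = next,
            Reply::Error(msg) => return Err(msg),
        }
    }
    Err(format!("gave up after {} hops", max_hops))
}
```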
Queries are issued automatically by processes as part of the message routing procedure. Each process maintains a trie keyed using message scope. It uses this trie to find the longest prefix match with the scope. The value contained in the trie is a hash table of service registrations. This allows a process to quickly determine if it already knows the correct runtime to deliver the message to. If the process does not know the correct recipient, it performs discovery using one of the queries above, with the query being determined by how the message was dispatched. If no other runtimes are known, the process uses DNS to find a runtime in the root directory, remembers the runtime in its trie, and issues the query to it. There will need to be a cache control mechanism for determining how long entries in the trie can be kept.
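A minimal sketch of such a trie is given below. All names are hypothetical. It assumes scopes are slash-separated paths, that each registration maps a service name to a single runtime address, and that a lookup should return the deepest cached prefix which actually has a registration for the requested service. Cache expiry is left out.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// One trie node per path component of a scope.
#[derive(Default)]
pub struct Node {
    children: HashMap<String, Node>,
    /// Service name -> runtime known to host that service at this scope.
    services: HashMap<String, IpAddr>,
}

/// Per-process cache of known runtimes, keyed by message scope.
#[derive(Default)]
pub struct RouteCache {
    root: Node,
}

/// Splits a scope such as "/home/alice" into its non-empty components.
fn components(scope: &str) -> impl Iterator<Item = &str> {
    scope.split('/').filter(|part| !part.is_empty())
}

impl RouteCache {
    /// Records that `runtime` hosts `service` at `scope`.
    pub fn remember(&mut self, scope: &str, service: &str, runtime: IpAddr) {
        let mut node = &mut self.root;
        for part in components(scope) {
            node = node.children.entry(part.to_string()).or_default();
        }
        node.services.insert(service.to_string(), runtime);
    }

    /// Longest prefix match: walks down the trie along `scope` and returns
    /// the runtime registered for `service` at the deepest node reached.
    pub fn lookup(&self, scope: &str, service: &str) -> Option<IpAddr> {
        let mut node = &self.root;
        let mut best = node.services.get(service).copied();
        for part in components(scope) {
            match node.children.get(part) {
                Some(child) => node = child,
                None => break,
            }
            if let Some(runtime) = node.services.get(service) {
                best = Some(*runtime);
            }
        }
        best
    }
}
```

When `lookup` returns `None`, the process would fall back to discovery using one of the queries above.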
Blocktree requires a mechanism which allows runtimes to connect to each other even if one or both of them is behind a firewall. I don't yet know how to do this in the case where both are behind a firewall, but in the case where only a single one is, we can handle it by having a runtime contained in a parent directory send a control plane message to the runtime which can't be reached, telling it to initiate a connection to the runtime attempting to reach it. If the runtime that initiated the connection has a public IP address, this will allow the two to connect, after which messages can be sent in either direction. This requires that at least one runtime in the root directory has a public IP address, and that a connection is maintained between a child runtime and one of its parents.
Because the sector clusters are fully connected, we only need to send a connection request message to one of the runtimes in a cluster, provided that runtime forwards these connection requests. Then, if at least one of the sector hosts in the root has a public IP and one runtime in each cluster is connected to one runtime in each of its child clusters, the message should eventually be delivered to the correct runtime.
This means that the sector hosts will form a single connected component of the connection graph.
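The connectivity claim can be checked on a small model of the connection graph. The sketch below is illustrative only: runtimes are represented by hypothetical integer ids, links are treated as bidirectional, and a breadth-first search verifies that every sector host is reachable from the first one.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Returns true if every runtime in `hosts` can be reached from the first
/// one by following the undirected `links`.
pub fn is_connected(hosts: &[u32], links: &[(u32, u32)]) -> bool {
    let start = match hosts.first() {
        Some(&host) => host,
        None => return true,
    };
    // Build an adjacency list, recording each link in both directions.
    let mut adjacent: HashMap<u32, Vec<u32>> = HashMap::new();
    for &(a, b) in links {
        adjacent.entry(a).or_default().push(b);
        adjacent.entry(b).or_default().push(a);
    }
    // Breadth-first search from the first host.
    let mut seen = HashSet::from([start]);
    let mut queue = VecDeque::from([start]);
    while let Some(node) = queue.pop_front() {
        for &next in adjacent.get(&node).into_iter().flatten() {
            if seen.insert(next) {
                queue.push_back(next);
            }
        }
    }
    hosts.iter().all(|host| seen.contains(host))
}
```

For example, with a root cluster `{1, 2}` and a child cluster `{3, 4}`, each fully connected internally, the single parent-child link `(2, 3)` is what makes the whole graph connected.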
My idea of using actors to own file handles has a significant drawback. If an actor which opened a file crashes, the file will remain open forever, resulting in a resource leak. An alternative would be to issue file handle structs to actors in local messages, but this will not work when the filesystem service is being accessed by a remote runtime. I could keep a table of file handles (integers) in the filesystem service, and access it similarly to how the filesystem struct is used today. This approach brings the overhead of an RwLock on the table and of searching it for a specific file on every read or write. Perhaps I could have the file actor poll its owner periodically to see if it's still alive? Then it would be able to halt if the owning actor has crashed. To get this to work I'll need to reintroduce the ability to send messages to a specific actor, and solve the issue of handling undeliverable messages. This approach has the advantage of working over the network, and it does not introduce any overhead from maintaining a table.
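For comparison, the handle table alternative could look roughly like the sketch below. The `HandleTable` type and its methods are hypothetical, and the table is generic over the open-file type `T` so as not to assume anything about the real file struct. It makes the overhead mentioned above concrete: every access takes the lock and performs a map lookup.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

struct Inner<T> {
    /// Next integer handle to hand out.
    next: u64,
    /// Open files keyed by handle.
    open: HashMap<u64, T>,
}

/// Table of open files, shared by everything using the filesystem service.
pub struct HandleTable<T> {
    inner: RwLock<Inner<T>>,
}

impl<T> HandleTable<T> {
    pub fn new() -> Self {
        HandleTable {
            inner: RwLock::new(Inner { next: 0, open: HashMap::new() }),
        }
    }

    /// Stores `file` and returns the integer handle issued for it.
    pub fn open(&self, file: T) -> u64 {
        let mut guard = self.inner.write().unwrap();
        let handle = guard.next;
        guard.next += 1;
        guard.open.insert(handle, file);
        handle
    }

    /// Runs `f` on the file behind `handle`, or returns None if it is not open.
    /// This is the lock-and-search cost paid on every read or write.
    pub fn with<R>(&self, handle: u64, f: impl FnOnce(&T) -> R) -> Option<R> {
        let guard = self.inner.read().unwrap();
        guard.open.get(&handle).map(f)
    }

    /// Removes the file from the table, returning it if it was open.
    pub fn close(&self, handle: u64) -> Option<T> {
        self.inner.write().unwrap().open.remove(&handle)
    }
}
```

Note that this table alone does not fix the leak: entries still stay open if the client never calls `close`, so some form of liveness check or timeout would be needed either way.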