ScyNet tech update #2: The Hatchery Controller

Mar 12, 2019 | Tech Update

A glimpse into the core — how the Hatchery Controller ties business together

Oh, look! It’s time for the next tech update for ScyNet. These updates are meant to describe our recent progress while also explaining the stack behind ScyNet. You can learn more about them from our previous tech update.

Our last update introduced you to some of our blockchain decisions and puzzles. This time, we’ll be looking at the core of the node software: the Hatchery Controller.

TLDR: We created a Controller which holds all business logic and manages NAS algorithms, data-mining operations, and blockchain connections. Given reliability concerns, the software stack had to be chosen in a way that would minimize crashes and unrecoverable errors.

The Hatchery Controller (also called just “Hatchery” or “Controller”) is a piece of software which manages a miner or trainer node in the ScyNet network. It has a middleman’s job, talking to different “components” and arranging communications between them.

The components that can be attached to a controller are varied, but generally fall into a few broad categories:

“Queen” components, which run NAS algorithms to create new agents. Those agents can then compete against each other in tournaments on the blockchain.

“Harvester” components, which run data mining and other manually-devised algorithms to generate more data. The data is used by the Queens to train the agents.

“Connector” components, which connect the Hatchery to others around the world. The blockchain connector is the prime example of those components.

Every component can generate agents and, at the discretion of the controller, receive input from other components. Due to this, the controller needs to be always-running, fault-tolerant, and performant; otherwise, all operations inside the node would stop.
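To make the middleman role concrete, here is a minimal Python sketch. It is an illustration only, not ScyNet's actual code: the class names (`Controller`, `Component`, `Kind`), the component names, and the idea of an explicit route table are all assumptions made for the example. It shows components of the three categories attached to one controller, with messages delivered only along routes the controller has allowed.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    QUEEN = "queen"          # runs NAS algorithms, produces agents
    HARVESTER = "harvester"  # produces data for the Queens
    CONNECTOR = "connector"  # links the Hatchery to the outside world


@dataclass
class Component:
    name: str
    kind: Kind
    inbox: list = field(default_factory=list)  # messages delivered by the controller


class Controller:
    """Hypothetical middleman: components never talk to each other directly."""

    def __init__(self):
        self.components = {}
        self.routes = set()  # (sender name, receiver name) pairs the controller allows

    def attach(self, component):
        self.components[component.name] = component

    def allow(self, sender, receiver):
        self.routes.add((sender, receiver))

    def send(self, sender, receiver, message):
        # Delivery happens only at the controller's discretion.
        if (sender, receiver) not in self.routes:
            return False
        self.components[receiver].inbox.append((sender, message))
        return True


controller = Controller()
controller.attach(Component("nas-1", Kind.QUEEN))
controller.attach(Component("scraper-1", Kind.HARVESTER))
controller.attach(Component("chain", Kind.CONNECTOR))
controller.allow("scraper-1", "nas-1")

delivered = controller.send("scraper-1", "nas-1", "new data batch")
blocked = controller.send("chain", "nas-1", "unsolicited input")
```

In this sketch the harvester's message reaches the Queen because the route was allowed, while the connector's message is refused. If the controller object goes away, nothing is delivered at all, which is why it has to stay up.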

To implement such fault tolerance, there are two main approaches. One is to write everything in a programming language designed to guard against whole classes of accidental crashes and concurrency bugs, for example, Rust or Go. The other is to use the Actor model, which allows parts of the program to crash without affecting the others.

While the first approach sounds tempting, the second one also allows the program to run on multiple machines painlessly, so in the end we decided to go with the latter. We chose Microsoft’s Orleans framework, which is an implementation of the so-called Virtual Actors in C#. Virtual actors are actors that the runtime activates and restarts automatically, unlike pure actors, which require a manually configured supervisor to monitor them.
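The virtual actor idea can be sketched in a few lines of Python. This is a toy model and not the Orleans API: `VirtualActorRuntime`, `CounterActor`, and the actor ids are hypothetical names, and a real runtime would also handle placement across machines and persisted state. The sketch only shows the two properties described above: actors are created on first use, and a crash in one actor is contained and followed by a fresh activation.

```python
class CounterActor:
    """Toy actor: counts messages and crashes on the message 'boom'."""

    def __init__(self):
        self.count = 0

    def receive(self, message):
        if message == "boom":
            raise RuntimeError("actor crashed")
        self.count += 1
        return self.count


class VirtualActorRuntime:
    """Activates actors lazily by id and drops crashed activations."""

    def __init__(self, factory):
        self.factory = factory
        self.active = {}  # actor id -> live activation

    def send(self, actor_id, message):
        # Callers address actors by id only; the runtime creates the
        # activation on first use, so there is nothing to start by hand.
        actor = self.active.get(actor_id)
        if actor is None:
            actor = self.active[actor_id] = self.factory()
        try:
            return actor.receive(message)
        except Exception:
            # The failure stays local: drop this activation so the next
            # message gets a fresh one. Other actors are untouched.
            del self.active[actor_id]
            return None


runtime = VirtualActorRuntime(CounterActor)
runtime.send("queen-a", "hello")                   # activates queen-a
runtime.send("queen-b", "hello")                   # activates queen-b independently
runtime.send("queen-a", "boom")                    # queen-a crashes and is dropped
after_restart = runtime.send("queen-a", "hello")   # handled by a fresh activation
other_actor = runtime.send("queen-b", "hello")     # queen-b kept its in-memory state
```

Note that the restarted actor loses its in-memory counter in this sketch. Any state that must survive a crash would have to be stored outside the activation.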

Since the controller can run on multiple machines, someone could now run it on all machines in a cluster, without needing each of them to participate in the blockchain actively. Success!

With fault tolerance and clustering handled, we need to find a way for harvesters to send their data to queens and for agents to send their outputs to the outside world. The obvious solution would be to send all the data through the controller and hope it manages to keep up with the load. However, that would make the controller unreasonably bloated and would require lots of work to ensure there are no bugs.

Instead, we decided to use Apache Kafka, since it is battle-tested and has an already established open-source ecosystem. We also noted that the Controller does not need to interact with the data directly, but only needs to coordinate data streams.
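The split between coordinating streams and carrying data can be illustrated with a small Python sketch. To keep it runnable without a Kafka cluster, an in-memory `InMemoryBroker` stands in for Kafka. That class, the `StreamCoordinator`, and the topic naming scheme are all assumptions for the example, not ScyNet's design or any Kafka client API. The point is that the controller side only hands out topic names, while records travel through the broker.

```python
from collections import defaultdict


class InMemoryBroker:
    """Stand-in for a Kafka cluster: topic name -> ordered list of records."""

    def __init__(self):
        self.topics = defaultdict(list)

    def publish(self, topic, record):
        self.topics[topic].append(record)

    def read(self, topic, offset=0):
        return self.topics[topic][offset:]


class StreamCoordinator:
    """Controller-side bookkeeping: who writes to and reads from which topic."""

    def __init__(self):
        self.assignments = {}  # component name -> (role, topic)

    def assign_producer(self, component, topic):
        self.assignments[component] = ("producer", topic)
        return topic

    def assign_consumer(self, component, topic):
        self.assignments[component] = ("consumer", topic)
        return topic


broker = InMemoryBroker()
coordinator = StreamCoordinator()

# The controller only hands out topic names ...
out_topic = coordinator.assign_producer("harvester-1", "harvester-1.data")
in_topic = coordinator.assign_consumer("queen-1", "harvester-1.data")

# ... while the components move the actual records through the broker.
broker.publish(out_topic, {"price": 101.5})
broker.publish(out_topic, {"price": 102.0})
received = broker.read(in_topic)
```

Because the coordinator never sees a record, its load depends on the number of streams rather than on the volume of data flowing through them.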

Finally, we need to have some logic related to publishing and subscribing to external sources. Since both publishing and subscribing are related to money, we can’t just hardcode some business logic in the blockchain component and hope it works alright. Instead, we put it inside the Controller, with multiple ways to enter one’s custom logic. That way, programmers creating components don’t have to worry about making proper business decisions, while miners can utilize what’s available and sleep soundly knowing their newly-installed NAS won’t mindlessly subscribe to external data sources.
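One way to picture "multiple ways to enter one's custom logic" is a pluggable policy function, sketched below in Python. Everything here is hypothetical: `BusinessLogic`, `SubscriptionRequest`, `default_policy`, and `budget_policy` are names invented for the illustration, and the real Controller may expose its extension points quite differently. The sketch shows a component asking for a subscription and the controller deciding, with a cautious default that a miner can replace.

```python
from dataclasses import dataclass


@dataclass
class SubscriptionRequest:
    component: str   # who wants the data
    source: str      # external data source or agent
    price: float     # cost, in whatever unit the network uses


def default_policy(request):
    """Conservative default: never spend money unless told otherwise."""
    return False


def budget_policy(max_price):
    """Builds a policy that approves anything at or below max_price."""
    def policy(request):
        return request.price <= max_price
    return policy


class BusinessLogic:
    def __init__(self, policy=default_policy):
        self.policy = policy  # miners can swap this for their own function
        self.approved = []

    def request_subscription(self, request):
        # Components only ask; the controller decides.
        if self.policy(request):
            self.approved.append(request)
            return True
        return False


request = SubscriptionRequest("nas-1", "external-feed", price=5.0)

cautious = BusinessLogic()
custom = BusinessLogic(policy=budget_policy(max_price=10.0))

cautious_result = cautious.request_subscription(request)
custom_result = custom.request_subscription(request)
too_expensive = custom.request_subscription(
    SubscriptionRequest("nas-1", "premium-feed", price=50.0)
)
```

With the default policy nothing is ever purchased, so a freshly installed component cannot spend money on its own. The budget policy approves the cheap feed and rejects the expensive one.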

Now that we have the Hatchery Controller and the Tendermint blockchain modules, what’s left is to link the two together by creating the connector component. At the same time, the controller will need more business logic for things like deciding whether to publish an agent, whether to subscribe to one, or how much to charge for its outputs. The goal remains to build a fully-functional node which interacts with the blockchain, and we are getting closer.

Want to learn more? Stay tuned for more tech updates and join us on Discord to meet the team or suggest ideas!