Open State Repository
Nodes are published on an "open state" repository. The open state repository's core architecture has been designed in line with the principles set forth by the Leiden declaration on digital infrastructure. This is the first step towards an Internet of FAIR digital objects where access is guaranteed to all humans and machines.
The open state repository irrevocably separates control over the data layer from control over the application layer. This not only guarantees access to all research outputs, but also protects against vendor lock-in.
The open state repository achieves this separation through a distributed and open peer-to-peer architecture based on content-addressed storage and ledger-based anchoring of secure data structures on an open state PID registry. For an example of a similar architecture, see Microsoft's Sidetree protocol.
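To make the idea of content-addressed storage concrete, here is a minimal sketch in Python. It is an illustration only, not the repository's implementation: the `ContentStore` class is hypothetical, and a plain SHA-256 hex digest stands in for a real content identifier.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: each blob is keyed by the hash of its bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content, not chosen by the publisher.
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Anyone can re-hash what they received and check it against the address.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
addr = store.put(b"research data")
```

Because the address depends only on the bytes, storing the same content twice yields the same address, and a reader can detect tampering without trusting whichever peer served the data.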
The open state registry PID namespace combines all three properties of Zooko's triangle: secure, decentralized, and human-meaningful. This solution is famously credited to Aaron Swartz, the open internet activist. Every digital object on the repository is secured in a tamper-proof data structure and is versionable at will by its creators across all access points (Gateways) to the open state repository. DeSci Nodes is the first access point, or Gateway, to the open state repository. It is a prototype interface for building research objects and broadcasting them on the network.
Our end goal is to usher in a future where the scientific record is FAIR and OPEN.
When a Node is published, it is broadcast from your private staging area hosted on our cloud infrastructure to the open state repository. Publishing triggers the repository’s registry to mint a single, version-invariant PID (called a dPID). This PID serves as an anchoring point for the current version of your Node, which is structured into an efficient and secure data structure known as a Merkle DAG. The individual branches of this Merkle DAG are CIDs, serialized as an IPLD-compliant JSON-LD object. The root of this Merkle object is a cryptographic hash, and this hash is anchored to the Node’s PID on the open state registry contracts.
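The following sketch shows how a root hash can summarize a whole tree of files. It is a simplified illustration under stated assumptions: the `node_hash` function and the example file names are hypothetical, and SHA-256 over sorted JSON stands in for real CIDs and IPLD encoding.

```python
import hashlib
import json


def node_hash(node) -> str:
    """Hash a file (bytes) or a directory (dict mapping names to children)."""
    if isinstance(node, bytes):
        return hashlib.sha256(node).hexdigest()
    # A directory is hashed over the hashes of its children, so any change
    # further down the tree propagates up to the root.
    links = {name: node_hash(child) for name, child in node.items()}
    encoded = json.dumps(links, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


research_object = {
    "manuscript.pdf": b"%PDF- draft one",
    "data": {"results.csv": b"x,y\n1,2\n"},
    "code": {"analysis.py": b"print('hello')\n"},
}
root_v1 = node_hash(research_object)

# Changing a single nested file yields a different root hash.
research_object["data"]["results.csv"] = b"x,y\n1,3\n"
root_v2 = node_hash(research_object)
```

Here `root_v1` and `root_v2` differ even though only one nested file changed, which is why anchoring the root alone is enough to commit to the entire contents of a version.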
Every time you publish a new version of your Node, a new root hash is anchored to your Node’s PID. This means that version history is preserved, and every event is logged with traceability: who, what, and when.
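As a rough mental model of this anchoring, the sketch below keeps an append-only list of root hashes per PID. Everything in it is an assumption made for illustration: `ToyRegistry`, the integer PIDs, the publisher strings, and the ownership check are hypothetical and do not describe the actual registry contracts.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class VersionEvent:
    root_hash: str       # what was published
    publisher: str       # who published it
    timestamp: datetime  # when it was anchored


class ToyRegistry:
    """In-memory stand-in for a PID registry (hypothetical)."""

    def __init__(self):
        self._history = {}  # pid -> list of VersionEvent, oldest first
        self._owner = {}    # pid -> publisher allowed to add versions

    def mint(self, publisher: str, root_hash: str) -> int:
        pid = len(self._history) + 1
        self._owner[pid] = publisher
        self._history[pid] = []
        self.publish(pid, publisher, root_hash)
        return pid

    def publish(self, pid: int, publisher: str, root_hash: str) -> None:
        if self._owner[pid] != publisher:
            raise PermissionError("publisher does not control this PID")
        event = VersionEvent(root_hash, publisher, datetime.now(timezone.utc))
        self._history[pid].append(event)  # earlier versions are never overwritten

    def resolve(self, pid: int, version: int = -1) -> VersionEvent:
        # Default to the latest version; an index selects an earlier one.
        return self._history[pid][version]


registry = ToyRegistry()
pid = registry.mint("alice", "root-hash-v1")
registry.publish(pid, "alice", "root-hash-v2")
```

The PID stays the same across both calls, while `resolve(pid)` returns the latest event and `resolve(pid, 0)` returns the first one, each carrying its own who, what, and when.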
Every Node has a single version-invariant PID. From this PID, it is efficient to traverse the Merkle DAG and address any component of your Node independently and uniquely. This is as simple as addressing a file path, something that everyone is familiar with. This means that every digital object contained in your Node has a provably unique path, and therefore a unique PID. Because these PIDs are based on cryptographic hashes encoding the fingerprint of their content, they are architecturally guaranteed to secure data integrity.
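Path-based addressing over a Merkle DAG can be sketched as follows. This is a toy model, not the repository's resolver: the `blocks` dictionary, the helper names, and the example paths are hypothetical, and SHA-256 hex digests again stand in for CIDs.

```python
import hashlib
import json

blocks = {}  # address -> raw block bytes


def put_block(data: bytes) -> str:
    address = hashlib.sha256(data).hexdigest()
    blocks[address] = data
    return address


def put_dir(links: dict) -> str:
    """Store a directory block that maps child names to child addresses."""
    return put_block(json.dumps(links, sort_keys=True).encode("utf-8"))


def resolve(root: str, path: str) -> str:
    """Follow a slash-separated path from the root; return the target's address."""
    address = root
    for name in filter(None, path.split("/")):
        links = json.loads(blocks[address])
        address = links[name]
    return address


csv = put_block(b"x,y\n1,2\n")
data_dir = put_dir({"results.csv": csv})
root = put_dir({"data": data_dir, "readme.txt": put_block(b"hello")})

target = resolve(root, "data/results.csv")
```

Starting from the root address, each path segment selects one link, so `data/results.csv` leads to exactly one block, and the returned address doubles as an integrity check on that block's content.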
The Node creator can version the PID. As we upgrade the registry system, we will add permission delegation and a community recovery system to expand the flexibility and fault tolerance of the permission configuration.
For the time being, DeSci Labs has admin rights over the open state repository registry contracts, which are upgradable via proxy delegate. This means that in the event of lost keys, we can re-assign your permissions to a new key. This is a necessary measure for the duration of the development process.
Our goal is to make PIDs as useful to humans and machines as possible. For convenience, they are resolvable over HTTP as short, secured and human-friendly URLs via the DNS gateway dpid.org. You can learn more about the dPID schema here.
The table below compares the properties of Node dPIDs with those of DOIs (which are based on the Handle System) and content identifiers (CIDs).
| Property | DOI | CID | dPID |
| --- | --- | --- | --- |
| Does not depend on a central authority | FALSE | TRUE | TRUE |
| Consistent resolution to their content | FALSE | TRUE | TRUE |
| Protects against content drift | FALSE | TRUE | TRUE |
| Linked Data support | FALSE | TRUE | TRUE |
| Method to compute over the data | FALSE | TRUE | TRUE |
| Native support for versioning | FALSE | FALSE | TRUE |
| Method to resolve to metadata | FALSE | FALSE | TRUE |
| Enables short URLs | TRUE | FALSE | TRUE |