Nostr NIP 111 - Version 1
Motivation
The open source movement has always been a beacon for collaboration, innovation, and freedom in software development. However, the centralized nature of many current code collaboration platforms poses a fundamental challenge to these principles. Centralization limits the scope of collaboration and compromises the digital sovereignty of developers and contributors. In a centralized system, the platform essentially owns your code, data, and interactions, which is antithetical to the core principles of open source development.
The transformative power of decentralization and digital sovereignty in open source development is revolutionary. Decentralization eliminates the single point of control, empowering individual developers and communities with full ownership and control over their code and contributions. Stopping centralized power is not only a technical requirement but also a moral imperative. Digital sovereignty ensures that developers can control their data, code, and interactions, which aligns perfectly with the open source ethos.
One of the most promising ways to realize decentralization and digital sovereignty is to extend Git to support peer-to-peer (P2P) interactions, creating a P2P-based Git that inherently supports decentralized code transmission. However, distributed code transmission alone is not enough. To fully realize the ideals of digital sovereignty and decentralized open-source collaboration, we must also implement mechanisms for distributed collaboration.
Distributed collaboration goes beyond just sharing code; it encompasses issue tracking, pull requests, code reviews, and community governance. In a decentralized environment, these aspects of collaboration should also be distributed across the network, allowing developers to engage in these activities without relying on a central authority.
Customization of the Git P2P Transfer Protocol in Mega
Mega is an engine for managing a monorepo. It works similarly to Google's Piper and helps streamline Git and trunk-based development for large codebases. Mega is also adapting a P2P transfer protocol to bring distributed data interaction to Git.
The Protocol URI of Git P2P
The original Git protocol syntax is
[<protocol>://]<username>[:<password>]@<hostname>[:<port>]/<namespace>/<repo>[.git]
For a P2P Git protocol:
- <protocol> could be a prefix like p2p:// to indicate the P2P protocol.
- <username> and <password> are unnecessary for the P2P protocol. The implementation follows the interaction commands of the Git SSH protocol and uses the peer ID for authentication, so no username or password is needed.
- <hostname> usually represents the server, but in P2P it maps to the peer ID hosting the repo. We use <peerID> here to avoid confusion.
- <port> is not relevant for P2P networking.
- Mega uses a monorepo, so there are no <namespace> or <repo> names, only a <path>. We could design a virtual path scheme that privately maps directories to publicly exposed paths.
The Git version control system uses two major transfer protocols: "dumb" and "smart." The dumb protocol is simple but inefficient, requiring a series of HTTP GET requests. It is rarely used today due to its limitations in security and efficiency. On the other hand, the smart protocol is more common and efficient, as it allows for intelligent data transfer between the client and server.
Inspired by Git's approach to having multiple transfer protocols, we add a type segment in the custom P2P Git transport protocol. This segment allows us to specify the format of the files being transferred between peers, similar to how Git's protocols specify the nature of the data transfer. Currently, the type segment supports two formats: pack and object.
- pack: Indicates that the file being transferred is in Git's Pack format, efficiently transferring multiple Git objects.
- object: Indicates that the file being transferred is in Git's Object format, suitable for transferring individual Git objects like blobs, trees, commits, or tags.
Finally, the P2P protocol URI looks like
p2p://<peerId>/<type>/<repo>
Example
p2p://12D3KooWFgpUQa9WnTztcvs5LLMJmwsMoGZcrTHdt9LKYKpM4MiK/pack/mega.git
or
p2p://12D3KooWFgpUQa9WnTztcvs5LLMJmwsMoGZcrTHdt9LKYKpM4MiK/object/be044281f9604305e1b41b0e800e844c2a417e52
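The URI layout above can be illustrated with a small parser. This is a minimal sketch, not Mega's implementation: the names `P2PGitURI`, `parse_p2p_uri`, and `transfer_type` are hypothetical, and the code assumes the URI always has exactly the three segments shown (peer ID, type, and a final repo name or object hash).

```python
from dataclasses import dataclass

# The two type segments named in the text.
VALID_TYPES = {"pack", "object"}


@dataclass(frozen=True)
class P2PGitURI:
    """Hypothetical container for the parts of a p2p:// Git URI."""
    peer_id: str
    transfer_type: str
    target: str  # repo name for "pack", object hash for "object"


def parse_p2p_uri(uri: str) -> P2PGitURI:
    """Split p2p://<peerID>/<type>/<target> into its three parts."""
    prefix = "p2p://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a p2p URI: {uri!r}")
    # Split by hand rather than with urllib.parse, whose `hostname`
    # attribute lowercases its value; peer IDs are case-sensitive.
    parts = uri[len(prefix):].split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected p2p://<peerID>/<type>/<target>: {uri!r}")
    peer_id, transfer_type, target = parts
    if transfer_type not in VALID_TYPES:
        raise ValueError(f"unsupported type segment: {transfer_type!r}")
    return P2PGitURI(peer_id, transfer_type, target)
```

Because the split stops after the second slash, a final segment that itself contains slashes (for example a virtual path) is kept intact in `target`.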
The DHT (Distributed Hash Table) Storage Specification
A DHT (Distributed Hash Table) offers a compelling way to improve the scalability and robustness of Mega. DHTs are decentralized systems that store key-value pairs across a network of nodes, eliminating the need for a central server. We therefore use a DHT to store information about the network's open source repositories on each peer node.
How it works
In a DHT-based Git network, each repository ("repo") name is a unique key. The key (the repository name) can contain only ASCII letters, digits, and the characters ., -, and _. The value is a specification:
{
  "origin": "12D3KooWFgpUQa9WnTztcvs5LLMJmwsMoGZcrTHdt9LKYKpM4MiK",
  "name": "mega",
  "latest": "1de1c6f",
  "forks": [
    {
      "peer": "456DFooWFgpUQa9WnTztcvs5LLMJmwsMoGZccdw3Ddf3DTH23",
      "latest": "1de1c6f",
      "timestamp": 1629827281
    },
    {
      "peer": "799DFoodjsfhuedDFEDSFesDFwefSDfwsefEWFweSDFWEfweS",
      "latest": "be04428",
      "timestamp": 1629827281
    }
  ],
  "timestamp": 1629827281
}
- origin: The PeerID of the original node where the repository was first made open source.
- name: The name of the repository.
- latest: The SHA-1 value of the latest commit made to the repository.
- forks: An array of objects representing the forks of the original repository. Each object has the following fields:
  - peer: The PeerID of the node that has forked the repository.
  - latest: The SHA-1 value of the latest commit made to the forked repository.
  - timestamp: A Unix timestamp indicating when this fork's latest update was made.
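As a rough illustration of the key rule and the record layout described above, the following sketch validates a repository key and adds or refreshes a fork entry. The function names `is_valid_repo_key` and `upsert_fork` are hypothetical, and leaving the record's top-level fields untouched when a fork changes is an assumption; the text does not say how they should be maintained.

```python
import re
import time

# Keys may contain only ASCII letters, digits, '.', '-' and '_'.
KEY_RE = re.compile(r"[A-Za-z0-9._-]+")


def is_valid_repo_key(key: str) -> bool:
    """Return True if `key` is a non-empty string of allowed characters."""
    return KEY_RE.fullmatch(key) is not None


def upsert_fork(record: dict, peer: str, latest: str, timestamp=None) -> dict:
    """Return a copy of `record` whose forks list has one entry for `peer`.

    An existing entry for the same peer is replaced; otherwise a new
    entry is appended. Top-level fields are left unchanged (assumption).
    """
    ts = int(time.time()) if timestamp is None else timestamp
    forks = [f for f in record.get("forks", []) if f["peer"] != peer]
    forks.append({"peer": peer, "latest": latest, "timestamp": ts})
    return {**record, "forks": forks}
```

Returning a new dictionary instead of mutating the input keeps the previously stored value available, which may help when comparing an old and a new record before publishing to the DHT.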
The Implementation of Git Peer-to-Peer Interactions
Delivering a pack file between two peers
sequenceDiagram
Client A->>+Relay: Hello Relay. I am joined!
Relay-->>+Client A: DHT Broadcast
Client A->>Client A: Put mega to DHT
Client A-->>+Relay: DHT Broadcast
Client B->>+Relay: Hello Relay. I am joined!
Relay-->>+Client B: DHT Broadcast
Client B->>+Relay: Help me connect to Client A
Relay-->>+Client B: Here is Client A's public IP address and port
Relay-->>-Client A: Here is Client B's public IP address and port
Client B->>+Client A: I need to clone: p2p://<Client A peerID>/pack/mega.git
Client A-->>+Client B: It's your pack file.
Client B->>Client B: Put mega to DHT
Client B-->>+Relay: DHT Broadcast
Relay-->>+Client A: DHT Broadcast
Collecting object files from multiple peers
sequenceDiagram
Client A->>+Relay: Hello Relay. I am joined!
Relay-->>+Client A: DHT Broadcast
Client A->>Client A: Put mega to DHT
Client A-->>+Relay: DHT Broadcast
Client B->>+Relay: Hello Relay. I am joined!
Relay-->>+Client B: DHT Broadcast
Client B->>+Client A: I need to clone: p2p://<Client A peerID>/pack/mega.git
Client A-->>+Client B: It's your pack file.
Client B->>Client B: Put mega to DHT
Client B-->>+Relay: DHT Broadcast
Relay-->>+Client A: DHT Broadcast
Client C->>+Relay: Hello Relay. I am joined!
Relay-->>+Client C: DHT Broadcast
Client C->>Client A: I need an object list
Client A-->>+Client C: It's the object list
loop [Object List Part A]
Client C->>Client A: I need an Object <SHA-1>
Client A-->>Client C: It's your object file
end
loop [Object List Part B]
Client C->>Client B: I need an Object <SHA-1>
Client B-->>Client C: It's your object file
end
Client C->>Client C: Put mega to DHT
Client C-->>+Relay: DHT Broadcast
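The second diagram has Client C fetch one part of the object list from Client A and another part from Client B. One way to picture that division is the sketch below, which splits an object list into contiguous parts, one per peer, and builds the object URI for each request. The contiguous split and the names `split_object_list` and `object_uri` are assumptions for illustration; the text does not specify how the list is divided.

```python
def split_object_list(object_ids: list, peers: list) -> dict:
    """Assign contiguous slices of `object_ids` to `peers`, as evenly as possible."""
    if not peers:
        raise ValueError("at least one peer is required")
    size, extra = divmod(len(object_ids), len(peers))
    plan = {}
    start = 0
    for index, peer in enumerate(peers):
        # The first `extra` peers take one additional object each.
        end = start + size + (1 if index < extra else 0)
        plan[peer] = object_ids[start:end]
        start = end
    return plan


def object_uri(peer_id: str, object_id: str) -> str:
    """Build the p2p URI for fetching a single object from a peer."""
    return f"p2p://{peer_id}/object/{object_id}"
```

A client could then iterate over the plan and request `object_uri(peer, oid)` for each entry, falling back to another peer if a request fails.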
Why Choose Nostr for Collaboration
Git is a foundational platform for version control, playing a pivotal role in open source collaboration. Open source collaboration can be divided into two primary parts: data transfer and information dissemination. The robust versioning capabilities of Git facilitate data transfer, while information dissemination, including updates, announcements, and collective communications, is the part that can be efficiently extended using the Nostr protocol.
Nostr uses events as the atomic unit for its protocol. These events are the only object types on the Nostr network and can be of various kinds, such as "text notes" intended for Twitter-like feeds, replies, and comments. Each event contains an id, pubkey, created_at timestamp, kind, content, tags, and a sig for signature. The kind specifies the type of event, and the content depends on what the kind means. For example, in the case of kind:1, the content is just a plaintext string meant to be read by others. While Nostr's event-based mechanism is unsuitable for transmitting code data due to its design for short, plaintext notes, it is highly effective for broadcasting other information relevant to open source collaboration.
Nostr Implementation Possibilities (NIPs) are designed to promote interoperability within the Nostr network. Aside from the first NIP, NIP-01, which outlines the basic protocol, all other NIPs are optional. NIPs aim to coordinate implementing solutions that are compatible across different applications. They are essential in a decentralized network like Nostr, where the community determines the direction of the protocol. NIPs provide a structured yet flexible framework for enhancing event-based broadcasting and subscription mechanisms in open-source collaboration.
Nostr NIP Proposal
This Nostr Implementation Possibility (NIP) aims to establish a standardized protocol designed explicitly for open-source writing collaboration. It outlines the essential structures and flows that should be universally implemented, ensuring compatibility and effective cooperation among contributors. Future NIPs may extend this foundational protocol by introducing optional or mandatory fields, messages, and features.
Basic Structures and Flows
{
  "kind": 111,
  "id": <32-bytes lowercase hex-encoded sha256 of the serialized event data>,
  "peer": <32-bytes lowercase hex-encoded public key of the event creator>,
  "timestamp": <unix timestamp in seconds>,
  "tags": [
    …
  ],
  "content": <arbitrary string>,
  "sig": <64-byte lowercase hex of the signature of the sha256 hash of the serialized event data, which is the same as the "id" field>
}
- kind: A unique identifier for the type of event, set to 111 for open source writing events.
- id: A 32-byte lowercase hex-encoded SHA-256 hash of the serialized event data, serving as a unique identifier for each event.
- peer: A 32-byte lowercase hex-encoded public key of the event creator, identifying the contributor.
- timestamp: A Unix timestamp in seconds, indicating when the event was created.
- tags: An array of tags that provide additional context or categorization for the event.
- content: An arbitrary string that contains the actual content or details of the event, such as a new chapter, edit suggestions, comments, etc.
- sig: A 64-byte lowercase hex-encoded signature of the SHA-256 hash of the serialized event data. That hash is the same value as the "id" field, which lets a receiver verify data integrity.
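To make the "id" field concrete, the sketch below hashes a serialized event. It assumes this proposal reuses the NIP-01 serialization, a compact JSON array `[0, pubkey, created_at, kind, tags, content]`, with this document's "peer" and "timestamp" fields standing in for NIP-01's pubkey and created_at; the proposal does not state this explicitly. The function name `compute_event_id` is hypothetical, and signing is left out because it needs a secp256k1 library.

```python
import hashlib
import json


def compute_event_id(peer: str, timestamp: int, kind: int,
                     tags: list, content: str) -> str:
    """Return the lowercase hex SHA-256 of the serialized event data."""
    payload = [0, peer, timestamp, kind, tags, content]
    # Compact separators and raw UTF-8 approximate the NIP-01 rules for
    # typical content; unusual control characters may be escaped
    # differently by json.dumps than NIP-01 prescribes.
    serialized = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()
```

Any change to the tags or content produces a different id, so a receiver can recompute the hash and compare it with the "id" field before checking the signature.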
Update a repo status event
This event announces either that a project has been open-sourced or that an open source project has a new latest commit. Each subscriber then decides whether to clone or pull the repository to update it.
{
  "kind": 111,
  "id": <32-bytes lowercase hex-encoded sha256 of the serialized event data>,
  "peer": <32-bytes lowercase hex-encoded public key of the event creator>,
  "timestamp": <unix timestamp in seconds>,
  "tags": [
    ["p", "12D3KooWFgpUQa9WnTztcvs5LLMJmwsMoGZcrTHdt9LKYKpM4MiK"],
    ["n", "mega"],
    ["t", "origin"],
    ["a", "update"],
    ["u", "p2p://12D3KooWFgpUQa9WnTztcvs5LLMJmwsMoGZcrTHdt9LKYKpM4MiK/pack/mega.git"],
    ["c", "1de1c6f"]
  ],
  "content": <arbitrary string>,
  "sig": <64-byte lowercase hex of the signature of the sha256 hash of the serialized event data, which is the same as the "id" field>
}
- p: The peer id of the node
- n: The name of the repo
- t: The type of repo - origin or fork
- a: The action of event - update/request/issue
- u: The p2p URL of the repo
- c: The latest commit of the repo
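The tag list above can be assembled programmatically. The following is a minimal sketch under the assumption that the "u" tag always points at the pack URI of the announcing peer, as in the example; `build_repo_tags` is a hypothetical helper, not part of any Nostr library.

```python
# Allowed values taken from the tag descriptions above.
REPO_TYPES = {"origin", "fork"}
ACTIONS = {"update", "request", "issue"}


def build_repo_tags(peer_id: str, name: str, repo_type: str,
                    action: str, commit: str) -> list:
    """Build the p/n/t/a/u/c tag list for a kind 111 event."""
    if repo_type not in REPO_TYPES:
        raise ValueError(f"unknown repo type: {repo_type!r}")
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return [
        ["p", peer_id],
        ["n", name],
        ["t", repo_type],
        ["a", action],
        # Assumption: the URL is the pack URI served by the same peer.
        ["u", f"p2p://{peer_id}/pack/{name}.git"],
        ["c", commit],
    ]
```

Validating the type and action values before publishing avoids broadcasting events that subscribers would not know how to handle.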
Create a merge request
This type of event announces the latest commit of a fork and requests that upstream be updated with that commit.
{
  "kind": 111,
  "id": <32-bytes lowercase hex-encoded sha256 of the serialized event data>,
  "peer": <32-bytes lowercase hex-encoded public key of the event creator>,
  "timestamp": <unix timestamp in seconds>,
  "tags": [
    ["p", "12D3KooWFgpUQa9WnTztcvs5LLMJmwsMoGZcrTHdt9LKYKpM4MiK"],
    ["n", "mega"],
    ["t", "fork"],
    ["a", "request"],
    ["u", "p2p://12D3KooWFgpUQa9WnTztcvs5LLMJmwsMoGZcrTHdt9LKYKpM4MiK/pack/mega.git"],
    ["c", "1de1c6f"]
  ],
  "content": <arbitrary string>,
  "sig": <64-byte lowercase hex of the signature of the sha256 hash of the serialized event data, which is the same as the "id" field>
}
Create an Issue
This type of event broadcasts an issue; it is up to the upstream or fork to determine if it should be tracked.
{
  "kind": 111,
  "id": <32-bytes lowercase hex-encoded sha256 of the serialized event data>,
  "peer": <32-bytes lowercase hex-encoded public key of the event creator>,
  "timestamp": <unix timestamp in seconds>,
  "tags": [
    ["p", "12D3KooWFgpUQa9WnTztcvs5LLMJmwsMoGZcrTHdt9LKYKpM4MiK"],
    ["n", "mega"],
    ["t", "fork"],
    ["a", "issue"],
    ["u", "p2p://12D3KooWFgpUQa9WnTztcvs5LLMJmwsMoGZcrTHdt9LKYKpM4MiK/pack/mega.git"],
    ["c", "1de1c6f"],
    ["i", "Issue Content"]
  ],
  "content": <arbitrary string>,
  "sig": <64-byte lowercase hex of the signature of the sha256 hash of the serialized event data, which is the same as the "id" field>
}
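On the receiving side, a subscriber has to tell the three event flavors apart by their "a" tag. The sketch below shows one possible way to do that; the helper names `tag_value`, `classify_event`, and `should_fetch` are hypothetical, and treating "the advertised commit differs from the local one" as the trigger for a fetch is an assumption rather than a rule from this proposal.

```python
KNOWN_ACTIONS = {"update", "request", "issue"}


def tag_value(event: dict, key: str):
    """Return the first value of the tag named `key`, or None if absent."""
    for tag in event.get("tags", []):
        if len(tag) >= 2 and tag[0] == key:
            return tag[1]
    return None


def classify_event(event: dict):
    """Return the action of a kind 111 event, or None if it is not recognized."""
    if event.get("kind") != 111:
        return None
    action = tag_value(event, "a")
    return action if action in KNOWN_ACTIONS else None


def should_fetch(event: dict, local_commit: str) -> bool:
    """True when an update event advertises a commit other than the local one."""
    return (classify_event(event) == "update"
            and tag_value(event, "c") != local_commit)
```

A client might route "request" and "issue" events to a review queue and use `should_fetch` to decide whether to clone or pull from the URL in the "u" tag.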
Why Mega Is Well-Suited for Implementing Git's Peer-To-Peer Functionality
Mega presents a compelling case for being an ideal platform to implement Peer-To-Peer (P2P) functionality for Git.
Difficulty in Modifying Git for P2P While Maintaining Upstream Compatibility
Modifying the existing Git architecture to support P2P functionality is challenging, especially if the goal is to maintain compatibility with upstream repositories. Being a standalone project, Mega offers the flexibility to implement P2P features without worrying about breaking existing Git functionalities.
Requirement for Host Service Capabilities
Distributed collaboration on a repository typically requires one or more Host Services. Mega can already act as a Host Service, making it easier to facilitate P2P interactions.
Need for Relay Nodes in Both Nostr and DHT
Whether using Nostr or DHT for networking, there's a requirement for Relay nodes to facilitate data transfer. Mega's architecture allows it to act as a Client and a Relay node, streamlining the data transfer process.
Database-Driven Storage Advantages
Mega uses a database for storage, which offers significant advantages when storing and retrieving large amounts of DHT information. This helps Mega stay efficient and scalable, especially in P2P networks that require quick data lookups.
Reusability of Existing Codebase
Mega has rewritten the underlying storage logic of Git and has implemented HTTP/SSH for data transfer and Pack file parsing. This means that a significant portion of the code can be reused when implementing P2P functionality, reducing development time and effort.
Support for LFS Protocol
Mega has also implemented the Large File Storage (LFS) protocol, which is particularly beneficial for transferring binary files. This gives Mega an edge in handling repositories that contain large binary files, making the P2P transfer process more efficient.
In summary, Mega's existing capabilities, from its flexible architecture and Host Service functionalities to its efficient storage and data transfer protocols, make it an excellent candidate for implementing Git's Peer-To-Peer functionality. Its design considerations align well with the requirements and challenges of creating a robust, efficient, and scalable P2P Git network.