
Synchronised Storage Service Architecture

Overview

The Synchronised Storage Service implements a distributed, eventually-consistent data replication system built on a decentralised architecture. The system provides conflict-free, cryptographically-verifiable synchronisation of entity storage mutations across a network of heterogeneous nodes through a combination of trusted node coordination and decentralised storage protocols.

Core Architecture

System Components

1. Entity Storage Layer

  • Purpose: Persistent data layer storing application entities
  • Interface: CRUD operations with change event emission via event bus
  • Implementation: Pluggable storage connectors (Memory, File System, Database)
  • Change Detection: Mutation observers capture CREATE, UPDATE, DELETE operations
  • Event Emission: Publishes change notifications to event bus for downstream processing
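
The shape of this layer can be illustrated with a minimal TypeScript sketch; the connector interface and event names below are illustrative assumptions rather than the actual API.

// Sketch only: a pluggable entity storage connector that emits change
// events for downstream capture. All names here are assumptions.
type ChangeOperation = "create" | "update" | "delete";

interface EntityChangeEvent<T> {
  operation: ChangeOperation;
  id: string;
  entity?: T;
}

interface IEntityStorageConnector<T> {
  set(entity: T): Promise<void>;
  get(id: string): Promise<T | undefined>;
  remove(id: string): Promise<void>;
  // Subscribers (e.g. the change capture subsystem) receive every mutation.
  onChange(listener: (event: EntityChangeEvent<T>) => void): void;
}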

2. Change Capture Subsystem

  • Event Bus Integration: Subscribes to entity storage change events via decoupled event bus architecture
  • Change Sets: Immutable collections of related entity mutations aggregated from event notifications
  • Serialisation: Compressed binary format with cryptographic signatures
  • Metadata: Timestamps, node identity, operation vectors
  • Batching: Configurable aggregation windows for performance optimisation
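
As a rough sketch of the batching described above (the aggregation window mechanics and type names are assumptions), change events received from the event bus could be aggregated like this:

// Sketch only: aggregate change events into a batch once a configurable
// window elapses; the batch is then turned into a signed change set.
interface SyncChange {
  operation: "set" | "delete";
  id: string;
  entity?: unknown;
}

class ChangeSetBatcher {
  private pending: SyncChange[] = [];
  private timer?: ReturnType<typeof setTimeout>;

  constructor(
    private readonly windowMs: number,
    private readonly flush: (changes: SyncChange[]) => void
  ) {}

  add(change: SyncChange): void {
    this.pending.push(change);
    // Start the aggregation window on the first change of a new batch.
    this.timer ??= setTimeout(() => {
      const batch = this.pending;
      this.pending = [];
      this.timer = undefined;
      this.flush(batch);
    }, this.windowMs);
  }
}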

3. Cryptographic Verification Layer

  • Digital Signatures: DID Proofs on change set payloads
  • Identity Management: DID-based node authentication
  • Proof Chains: Verifiable computation proofs for data integrity
  • Encryption: RSA-2048 encryption for sensitive payload data; SPKI keys are available to regular nodes, while trusted nodes hold both PKCS and SPKI keys
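
A minimal sketch of the key handling, using Node's built-in crypto as a stand-in for the DID proof mechanism; interpreting the SPKI/PKCS split as public/private key encodings is an assumption:

// Sketch only: RSA-2048 key generation and payload signing/verification
// using Node's crypto as a stand-in for DID proofs. Interpretation
// (assumed): regular nodes hold SPKI (public) keys, trusted nodes also
// hold PKCS#8 (private) keys.
import { generateKeyPairSync, sign, verify } from "node:crypto";

const { publicKey, privateKey } = generateKeyPairSync("rsa", {
  modulusLength: 2048,
  publicKeyEncoding: { type: "spki", format: "pem" },
  privateKeyEncoding: { type: "pkcs8", format: "pem" }
});

const payload = Buffer.from(JSON.stringify({ changes: [] }));
const signature = sign("sha256", payload, privateKey);
const isValid = verify("sha256", payload, publicKey, signature);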

4. Event Bus Architecture

  • Decoupled Communication: Asynchronous message passing between entity storage and change capture systems
  • Topic-Based Routing: Structured message topics for different operation types (create, update, delete)
  • Pluggable Connectors: Support for various event bus implementations (Local, Redis, AMQP, Kafka)
  • Message Serialisation: JSON-based message payloads with type-safe schemas
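
The topic naming below is an illustrative assumption of how operation types might be routed; the actual topic scheme and connector interface may differ:

// Sketch only: topic-based routing of entity storage mutations over a
// pluggable event bus. The topic naming convention is an assumption.
interface IEventBusConnector {
  publish(topic: string, payload: unknown): Promise<void>;
  subscribe(topic: string, handler: (payload: unknown) => Promise<void>): Promise<void>;
}

function mutationTopic(storageKey: string, operation: "create" | "update" | "delete"): string {
  return `entity-storage.${storageKey}.${operation}`;
}

// The change capture subsystem subscribes to all operation types for a key.
async function captureChanges(bus: IEventBusConnector, storageKey: string): Promise<void> {
  for (const op of ["create", "update", "delete"] as const) {
    await bus.subscribe(mutationTopic(storageKey, op), async payload => {
      // Hand the mutation to the change set batcher.
      console.log(`Captured ${op} for ${storageKey}`, payload);
    });
  }
}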

5. Network Topology

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  Regular Node   │    │  Regular Node   │    │  Regular Node   │
│                 │    │                 │    │                 │
└────────┬────────┘    └────────┬────────┘    └────────┬────────┘
         │                      │                      │
         └──────────────────────┼──────────────────────┘
                                │
              ┌─────────────────┴─────────────────┐
              │                                   │
   ┌──────────▼──────────┐             ┌──────────▼──────────┐
   │   Trusted Node A    │             │   Trusted Node B    │
   │ • Change Validation │             │ • Change Validation │
   │ • Load Distribution │             │ • Load Distribution │
   │ • State Consensus   │◄───────────►│ • State Consensus   │
   └──────────┬──────────┘             └──────────┬──────────┘
              │                                   │
              └─────────────────┬─────────────────┘
                                │
                    ┌───────────▼────────────┐
                    │  Verifiable Storage    │
                    │ • On-chain Pointers    │
                    │ • Access Control       │
                    └───────────┬────────────┘
                                │
                    ┌───────────▼────────────┐
                    │ Decentralised Storage  │
                    │ • IPFS/Content Hash    │
                    │ • Immutable Blobs      │
                    └────────────────────────┘

Node Types & Responsibilities

Regular Nodes

  • Local Change Monitoring: Capture entity storage mutations via event bus
  • Change Set Generation: Aggregate mutations into signed, compressed payloads
  • Trusted Node Communication: Submit change sets for global replication
  • Sync State Polling: Periodically query verifiable storage for remote updates
  • Conflict Resolution: Apply three-way merge algorithms for concurrent modifications
  • Resource Constraints: Minimal storage and computational requirements
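
A regular node's sync state polling might look roughly like the following sketch; the storage interfaces, helper names and interval handling are assumptions:

// Sketch only: periodically poll verifiable storage for an updated sync
// pointer and apply the remote sync state when it changes.
function pollRemoteState(
  verifiableStorage: { getSyncPointer(storageKey: string): Promise<string | undefined> },
  decentralisedStorage: { download(storageId: string): Promise<unknown> },
  applySyncState: (syncState: unknown) => Promise<void>,
  storageKey: string,
  intervalMs: number
): void {
  let lastPointer: string | undefined;
  setInterval(async () => {
    const pointer = await verifiableStorage.getSyncPointer(storageKey);
    // Only fetch and apply when the on-chain pointer has changed.
    if (pointer && pointer !== lastPointer) {
      lastPointer = pointer;
      const syncState = await decentralisedStorage.download(pointer);
      await applySyncState(syncState);
    }
  }, intervalMs);
}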

Trusted Nodes

  • Multi-Node Architecture: Multiple trusted nodes operate in parallel to distribute processing load and provide high availability
  • Change Validation: Verify cryptographic signatures and node permissions across the cluster
  • Change Set Consolidation: Periodic compaction of changes to create complete data sets
  • Decentralised Storage Interface: Upload/download operations to decentralised storage (e.g. IPFS)
  • Verifiable Storage Updates: Atomic updates to on-chain state pointers with consensus agreement
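
A trusted node's handling of an incoming change set might be sketched as follows; the verification and storage interfaces shown are assumptions:

// Sketch only: trusted node validation of an incoming change set before
// it is added to the global sync state.
interface IncomingChangeSet {
  id: string;
  storageKey: string;
  nodeIdentity: string;
  changes: unknown[];
  proof: unknown;
}

async function acceptChangeSet(
  changeSet: IncomingChangeSet,
  verifyProof: (changeSet: IncomingChangeSet) => Promise<boolean>,
  isNodePermitted: (nodeIdentity: string) => Promise<boolean>,
  uploadToDecentralisedStorage: (changeSet: IncomingChangeSet) => Promise<string>
): Promise<string> {
  if (!(await verifyProof(changeSet))) {
    throw new Error(`Invalid proof on change set ${changeSet.id}`);
  }
  if (!(await isNodePermitted(changeSet.nodeIdentity))) {
    throw new Error(`Node ${changeSet.nodeIdentity} is not permitted`);
  }
  // Store the change set and return its storage id for inclusion in a snapshot.
  return uploadToDecentralisedStorage(changeSet);
}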

Data Flow Architecture

1. Local Change Detection

EntityStorage → ChangeCapture → LocalSnapshot → EventBus

2. Change Set Propagation

RegularNode → [Sign] → ChangeSet → TrustedNode → [Verify] → GlobalState

3. Global State Persistence

TrustedNode → [Encrypt & Compress] → DecentralisedStorage → [StateRoot] → VerifiableStorage
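
A rough sketch of this persistence step, assuming a gzip-compressed payload and treating encryption and the storage interfaces as opaque, hypothetical helpers:

// Sketch only: a trusted node compresses, encrypts and uploads the sync
// state, then updates the on-chain pointer to the new blob.
import { gzipSync } from "node:zlib";

async function persistGlobalState(
  syncState: unknown,
  encryptPayload: (data: Buffer) => Promise<Buffer>,
  decentralisedStorage: { upload(data: Buffer): Promise<string> },
  verifiableStorage: { setSyncPointer(storageKey: string, storageId: string): Promise<void> },
  storageKey: string
): Promise<void> {
  const compressed = gzipSync(Buffer.from(JSON.stringify(syncState)));
  const encrypted = await encryptPayload(compressed);
  // Upload the blob, then atomically update the on-chain pointer to it.
  const storageId = await decentralisedStorage.upload(encrypted);
  await verifiableStorage.setSyncPointer(storageKey, storageId);
}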

4. Remote Synchronisation

AnyNode → [Poll] → VerifiableStorage → [Fetch] → DecentralisedStorage → [Decrypt & Decompress] → LocalState

Synchronisation Algorithms

Apply Sync State Logic

The applySyncState method in LocalSyncStateHelper implements the core synchronisation algorithm that reconciles remote sync state with local state. This algorithm handles both full and incremental synchronisation scenarios:

Algorithm Overview

Input Processing:

  • Retrieves existing local snapshots and remote sync state snapshots
  • Sorts both collections by creation date (newest to oldest) for temporal ordering
  • Extracts epoch information for gap detection analysis

Epoch Gap Detection:

If a node has not been running for a while, or its communications have failed, it may have missed data. Gap detection determines whether a full consolidation snapshot needs to be applied.

const newestExistingEpoch = existingSnapshots[0]?.epoch ?? 0;
const oldestSyncStateEpoch = syncStateSnapshots[syncStateSnapshots.length - 1]?.epoch ?? 0;
const hasEpochGap = newestExistingEpoch + 1 < oldestSyncStateEpoch;

Synchronisation Strategy Decision:

  1. Full Synchronisation Trigger:

    • No existing consolidated snapshots locally, OR
    • Detected epoch gap between local and remote state
  2. Incremental Synchronisation Trigger:

    • Existing consolidated snapshots present AND
    • No epoch gaps detected
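
Building on the epoch gap snippet above, the decision can be expressed as a single condition (a sketch; the flag names are assumptions):

// Sketch only: choose full or incremental sync based on whether a
// consolidated snapshot exists locally and whether an epoch gap was found.
const hasLocalConsolidation = existingSnapshots.some(s => s.isConsolidated);
const requiresFullSync = !hasLocalConsolidation || hasEpochGap;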

Full Synchronisation Process

When full sync is required:

  1. Consolidation Search: Locates the most recent consolidated snapshot in remote state
  2. Entity Storage Reset: Clears all remote entities from local storage to ensure clean state, but retains all local entries maintained by this node
  3. Progressive Application: Processes snapshots from the consolidation point forward to the newest, ensuring newer changes override older ones
  4. Change Set Processing: Applies all change sets from the consolidation and subsequent modifications
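
A sketch of these steps, with types and helper parameters assumed for illustration:

// Sketch only: full synchronisation from the most recent consolidated
// snapshot forward; snapshots are assumed sorted newest to oldest.
interface RemoteSnapshot {
  isConsolidated: boolean;
  changeSetStorageIds: string[];
}

async function performFullSync(
  remoteSnapshots: RemoteSnapshot[],
  clearRemoteEntities: () => Promise<void>,
  applyChangeSets: (storageIds: string[]) => Promise<void>
): Promise<void> {
  const consolidationIndex = remoteSnapshots.findIndex(s => s.isConsolidated);
  if (consolidationIndex < 0) {
    console.warn("Full sync required but no consolidated snapshot is available");
    return;
  }
  // Reset remote-owned entities; entries maintained by this node are retained.
  await clearRemoteEntities();
  // Walk from the consolidation point forward to the newest snapshot so
  // that newer changes override older ones.
  for (let i = consolidationIndex; i >= 0; i--) {
    await applyChangeSets(remoteSnapshots[i].changeSetStorageIds);
  }
}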

Incremental Synchronisation Process

When incremental sync is sufficient:

  1. Snapshot Mapping: Creates lookup map of existing local snapshots for efficient comparison

  2. Change Detection: Categorises remote snapshots into:

    • New Snapshots: Not present locally
    • Modified Snapshots: Present but with different dateModified
    • Unchanged Snapshots: Identical local and remote versions
  3. Processing Optimisation: Stops processing when encountering an unchanged snapshot (temporal consistency guarantee)

  4. Change Set Application:

    • Processes modified snapshots to apply incremental changes
    • Processes new snapshots in chronological order (oldest to newest)
    • Removes unreferenced local snapshots for storage clean-up
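
A sketch of the incremental path, with the comparison and clean-up steps above expressed against assumed types and helpers:

// Sketch only: incremental synchronisation; both snapshot lists are
// assumed sorted newest to oldest.
interface SnapshotSummary {
  id: string;
  dateModified: string;
  changeSetStorageIds: string[];
}

async function performIncrementalSync(
  existingSnapshots: SnapshotSummary[],
  remoteSnapshots: SnapshotSummary[],
  applyChangeSets: (storageIds: string[]) => Promise<void>,
  removeSnapshot: (id: string) => Promise<void>
): Promise<void> {
  const existingById = new Map(
    existingSnapshots.map((s): [string, SnapshotSummary] => [s.id, s])
  );
  const newSnapshots: SnapshotSummary[] = [];

  for (const remote of remoteSnapshots) {
    const local = existingById.get(remote.id);
    if (!local) {
      newSnapshots.push(remote); // new: not present locally
    } else if (local.dateModified !== remote.dateModified) {
      await applyChangeSets(remote.changeSetStorageIds); // modified: dateModified differs
    } else {
      break; // unchanged: older snapshots need no further processing
    }
  }

  // Apply new snapshots oldest to newest to preserve chronological order.
  for (const snapshot of newSnapshots.reverse()) {
    await applyChangeSets(snapshot.changeSetStorageIds);
  }

  // Remove local snapshots no longer referenced in the remote sync state.
  const remoteIds = new Set(remoteSnapshots.map(s => s.id));
  for (const local of existingSnapshots) {
    if (!remoteIds.has(local.id)) {
      await removeSnapshot(local.id);
    }
  }
}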

Key Design Principles

  • Temporal Consistency: Changes are applied in chronological order to maintain causality
  • Conflict Resolution: Newer changes automatically override older ones within the same entity
  • Storage Efficiency: Only processes changed or new snapshots, avoiding redundant operations
  • Data Integrity: Maintains referential integrity by cleaning up orphaned snapshot references
  • Gap Recovery: Automatically triggers full synchronisation when data continuity is compromised

Error Handling & Edge Cases

  • Missing Consolidations: Logs warning when full sync is needed but no consolidated snapshot is available
  • Empty Remote State: Handles scenarios where remote sync state contains no snapshots
  • Epoch Inconsistencies: Automatically recovers from epoch gaps through full synchronisation
  • Processing Interruption: Uses completedProcessing flag to optimise incremental sync execution

This algorithm ensures eventually consistent data replication across distributed nodes while minimising network traffic and computational overhead through intelligent synchronisation strategy selection.

Data Structures

Sync Pointer Store

Stored in verifiable storage; for each storage type it contains the storage id of the location of the corresponding sync state.

{
  "syncPointers": {
    "my-type-1": "blob:ipfs:0x11...1111",
    "my-type-2": "blob:ipfs:0x22...2222"
  }
}

Sync State

Stored in decentralised storage; contains all of the snapshots for a specific storage key.

{
  "storageKey": "my-type-1",
  "snapshots": [
    {
      "id": "0xaaa...aaa",
      ...
    },
    {
      "id": "0xbbb...bbb",
      ...
    }
  ]
}

Sync Snapshot

Contains a list of all the storage ids for the change sets which make up the snapshot.

{
  "id": "0xaaa...aaa",
  "dateCreated": "2025-05-29T01:00:00.000Z",
  "dateModified": "2025-05-29T01:00:00.000Z",
  "isConsolidated": false, // Determines if this contains a complete data set
  "epoch": 123, // A counter that is incremented for each snapshot
  "changeSetStorageIds": [
    "blob:ipfs:0x111...1111",
    "blob:ipfs:0x111...1112",
    ...
  ]
}

Sync Change Set

Contains a set of changes to be made to the entity storage. The proof is generated by the node identity, which guarantees the validity of the data from the node.

{
  "id": "0xaaa...aaa",
  "storageKey": "my-type-1",
  "dateCreated": "2025-05-29T01:00:00.000Z",
  "dateModified": "2025-05-29T01:00:00.000Z",
  "changes": [
    ...
  ],
  "nodeIdentity": "did:iota:0xccc.cccccc",
  "proof": {
    // Cryptographic proof signed by the node identity
  }
}

Sync Change

Contains a mutation for an entry in entity storage, either a set or a delete. The id and nodeIdentity are omitted from the entity and stored at a higher level in the data structures to reduce the size of the data set.

{
  "operation": "set",
  "id": "my-item-1",
  "entity": {
    "foo": true,
    "bar": 123
  }
}