Document repositories, archives, and the InterPlanetary File System … and blockchain!

September 22, 2021

Simon Ručigaj,
digitalization advisor and communications manager

 

Companies in shipping have been desperately holding on to the paper bill of lading for hundreds of years. Over time it became the holy document that nobody wanted to let go of. But now CargoX has created a solution that empowers them to forget that piece of paper and start working digitally. What a sharp turn!

To make things even more interesting, their documents will be transferred through something called blockchain, stored on something called the InterPlanetary File System (IPFS), and accessible in a digital document repository, which can also be seen as a central digital archive. All this at almost the speed of light.

One might just ask - are we sending documents to another planet?!

So, what does it all mean and what are these things called:

  • InterPlanetary File System (IPFS)

  • Digital repository

  • Digital archive

Let us start right away by clarifying first how the CargoX Platform for Blockchain Document Transfer (BDT) handles documents.

Creation and storage of time-stamped documents, electronic Bills of Lading, Letters of Credit, and other original trade documents

Once your original trade documents are created by you or your business partner, they are uploaded onto the CargoX Platform for Blockchain Document Transfer (BDT). This can be done through the simple graphic user interface, or through an API function that connects the CargoX Platform to your business software, as is the standard approach in a large corporation.

Once the original documents are created on the author's computer (as a PDF or in any other format), they are uploaded and stored, in encrypted form, in the InterPlanetary File System, which is decentralized (meaning a document is not stored in just a single location that can suddenly fail). Each document is referenced by a hash derived from the document contents through a one-way function - it is computationally infeasible to determine the contents from the hash. An original document is considered “valid” once it is stored in the decentralized storage subsystem and signed by an issuer - before that it is just considered to be a draft document, similar to drafts on your computer or in your normal email inbox.
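To make the idea of a document hash more tangible, here is a minimal sketch in Python. It assumes SHA-256 as the hash function, which is an illustrative choice (this article does not say which function the platform uses), and the function name `fingerprint` and the sample document texts are made up for the example.

```python
import hashlib


def fingerprint(document: bytes) -> str:
    """Return the SHA-256 digest of the document bytes as a hex string.

    SHA-256 is an assumption made for this illustration only.
    """
    return hashlib.sha256(document).hexdigest()


# Two hypothetical documents that differ by a single character.
original = b"Bill of Lading No. 001: 20 containers, Koper to Singapore"
tampered = b"Bill of Lading No. 001: 21 containers, Koper to Singapore"

registered = fingerprint(original)

# The same bytes always give the same fingerprint ...
print(fingerprint(original) == registered)  # True
# ... while a one-character change gives a different one.
print(fingerprint(tampered) == registered)  # False
# The fingerprint has a fixed length regardless of the document size.
print(len(registered))  # 64
```

The fingerprint reveals nothing readable about the document, yet anyone holding the file can recompute it and compare it with the registered value.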

In everyday use, all this technology ensures that all documents sent through the CargoX Platform are protected from tampering by a special cryptographic code, and saved to the InterPlanetary File System. The cryptographic code (hash) is registered with the blockchain - a publicly accessible digital ledger that is tamper-resistant and highly secure - and it always carries the correct information about the trade document originality code and the document owner.

With the cryptographic token, the document's exact contents are tied to a specific user, and the user owns the document for as long as they want. The blockchain public ledger serves as the registry of the document’s ownership and any subsequent document ownership transfers, with detailed timestamps.

Document transfer = title transfer

The user can transfer the document to another user - your business partner, for example - and the transfer is again registered with the blockchain in an instant and validated by the global public network of the blockchain nodes - computer systems that confirm transactions without knowing their exact content. 

During all this time, the document file stays saved in the InterPlanetary File System, meaning that the document itself does not move anywhere when it is transferred. Only the credentials of ownership change, and only if so desired by the owner and registered in a valid blockchain transaction. 

All this time the document is cryptographically protected from anyone other than the parties defined in the transfer workflow, and the owner can transfer ownership of the document to the next person at any time.
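The transfer model described above (the file stays put, only the ownership record changes) can be sketched with a toy in-memory registry. This is not the CargoX smart contract: the class `OwnershipRegistry`, its methods, and the party names are hypothetical, and a plain Python dictionary stands in for the blockchain ledger.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class OwnershipRegistry:
    """Toy stand-in for the on-chain registry: document hash -> current owner."""
    owners: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def register(self, doc_hash: str, issuer: str) -> None:
        if doc_hash in self.owners:
            raise ValueError("document already registered")
        self.owners[doc_hash] = issuer
        self._log(doc_hash, None, issuer)

    def transfer(self, doc_hash: str, sender: str, recipient: str) -> None:
        # Only the current owner may hand the document over.
        if self.owners.get(doc_hash) != sender:
            raise PermissionError("only the current owner can transfer")
        self.owners[doc_hash] = recipient
        self._log(doc_hash, sender, recipient)

    def _log(self, doc_hash, sender, recipient) -> None:
        # Every change of ownership is recorded with a UTC timestamp.
        stamp = datetime.now(timezone.utc)
        self.history.append((stamp, doc_hash, sender, recipient))


registry = OwnershipRegistry()
registry.register("abc123", "shipper")
registry.transfer("abc123", "shipper", "importer")

try:
    # The shipper no longer owns the document, so this is rejected.
    registry.transfer("abc123", "shipper", "bank")
except PermissionError as err:
    print(err)  # only the current owner can transfer

print(registry.owners["abc123"])  # importer
```

Note that the document bytes never appear in the registry; it only tracks who holds the title to a given hash and when that changed.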

Repository vs. archive

As we opened this article by distinguishing between terms such as data repository and archive, let us explain some of the background.

In the world of data or document management, the two terms “repository” and “archive” are often used synonymously, and sometimes library, collection, or similar terms are used as well. This is practical because, in the digital world, where operational files and data are often simply accessible to the whole company that needs them, the line between an everyday data storage system and an archive is blurred or even becomes nonexistent.

On the other hand, professionals dealing with paper documents are well aware of the limitations of a working data repository (their desk, office, storage capacities in the office), and archives, which might be remote units, where storage space is cheaper and where they can store documents that are not accessed on a regular basis, but that still nonetheless need to be retrieved from time to time.

Just to add another level of intrigue, various countries all have their own specific legal requirements for a repository to be called an “archive”. But it can be generally said that its main properties are to ensure long-term preservation, data integrity, data security and data traceability - all of which are naturally provided within the CargoX Platform for Blockchain Document Transfer (BDT) as well. 

CargoX Platform digital archive functionality

Within the CargoX Platform there is no special “archive” module, as per the definition of standard digital archives.

But the platform does offer an Archive folder for documents that have finished their process and performed their duties, once the owner chooses to archive them by clicking the Archive button in the application.

When this action is performed, the document or envelope is shown in the Archive view. This means it is no longer visible in the operational folders (Inbox, Drafts, or Sent), but it is always accessible in the Archive.

Please note that this Archive view does not enable any special archiving functionality and that all the documents on the CargoX Platform for Blockchain Document Transfer (BDT) have a life expectancy of 10 years (or as defined in any local legislation of the owner of the document), and thereafter they might be removed from the platform. 

Exporting copies of the original document

While the document is owned by only one party at any given point in time, all the parties in a document transfer process - meaning the shipper, exporter, importer, and release agent, or other involved parties in a workflow - can always export the document in a printable form.

This can be used for forwarding to any other party in need of a copy of the original trade document. It needs to be noted that all exported PDF copies of the original document are clearly marked as “COPY”.

A more technical explanation

Data-wise the CargoX ecosystem consists of two parts:

  • the on-chain data (public), and

  • the off-chain (private) part

On-chain data

In the CargoX Platform for Blockchain Document Transfer (BDT), on-chain data is used for registering an original document with the blockchain and tracking the ownership of a document. This is the part where trust is the most important, as we do not want anyone besides the current owner of the document to have the right to change it (not even CargoX).

Off-chain data

The off-chain database stores all user data required for mandatory identification and verification of customers including everything required by regulators – like KYC (Know Your Customer). This is also where the pairing of customers and their public keys is securely stored.

From the public blockchain perspective, storing original documents on a decentralized IPFS is also considered off-chain, as all confidential data are encrypted and hidden from the public. Although the principles of IPFS are similar to a blockchain in the sense that it is permanent and immutable, the fact that encrypted documents are only available to parties involved in a document transfer makes them similar to off-chain data, while still guaranteeing a high level of trust that is unobtainable with a private, centralized database.

Whenever we read/write on-chain data, the CargoX Platform smart contracts are used. When accessing off-chain data, the CargoX servers are used (also as a proxy for accessing IPFS).

The provided API functions enable access to off-chain data (serving as a proxy for accessing IPFS), and also offer simplified access to on-chain data, without the need for hosting or accessing a full Ethereum node (although this is possible as well).

Permanent encrypted decentralized data storage

Storing any large amount of data, such as years’ worth of documents, on blockchain opens up several potential security and scalability issues, and can further become prohibitively expensive. 

CargoX accordingly uses a two-tiered approach that offers much better flexibility. 

The InterPlanetary File System (IPFS) has been identified as the best decentralized storage service for permanent storage of documents. The documents are stored with a highly secure encryption technology, and all non-public metadata is encrypted too. IPFS offers permanent, reliable, and economic data storage appropriate for e-archiving.

The InterPlanetary File System (IPFS) is a content-addressable, peer-to-peer hypermedia distribution protocol, designed to create a permanent and decentralized method of storing and sharing files. Nodes in the IPFS network form a distributed file system. IPFS is an open-source project developed in 2014 by Protocol Labs with help from the open-source community.
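“Content-addressable” means that a file is looked up by what it contains, not by where it is stored. The following Python sketch shows the principle with a toy, single-machine store. It is a simplification under stated assumptions: real IPFS uses content identifiers (CIDs), splits files into blocks, and distributes them across peers, whereas here a plain SHA-256 hex digest serves as the address, and the class `ContentStore` and the sample bytes are hypothetical.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key is derived from the stored bytes."""

    def __init__(self) -> None:
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address is the hash of the content (SHA-256 is an assumption).
        address = hashlib.sha256(data).hexdigest()
        # Storing identical bytes again lands on the same key.
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # A reader can re-hash the bytes to check they match the address.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored bytes do not match their address")
        return data


store = ContentStore()
# Placeholder for an already-encrypted trade document.
encrypted_doc = b"...ciphertext bytes of a trade document..."

addr = store.put(encrypted_doc)
print(store.put(encrypted_doc) == addr)  # True
print(store.get(addr) == encrypted_doc)  # True
```

Because the address is computed from the bytes themselves, changing a stored file would change its address, which is what makes this kind of storage a natural companion to a hash registered on the blockchain.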

* * *

To read more about how the CargoX Platform for Blockchain Document Transfer (BDT) combines the concepts of digital document processing, electronic banking, and email, and what blockchain has to do with it, we suggest you read the CargoX Platform for Blockchain Document Transfer (BDT) bluepaper - you can download the PDF document here