Hyperledger Indy, Not Your Grandmother’s Blockchain

Hyperledger Indy is a blockchain-based platform for managing identity and for facilitating the management and exchange of verifiable personal information.  It enables Self-Sovereign Identity, which lets individuals and organizations manage and distribute their electronic information as they see fit.  Instead of organizations like Facebook and Google collecting and managing information, individuals and organizations will be able to self-manage. Instead of having to rely on paperwork or on issuing organizations to verify information, individuals will be able to present verifiable, cryptographically signed credentials, independent of the issuing organizations.

So what makes Indy different from a “traditional” Blockchain platform?

A Blockchain is a permanent, immutable ledger containing information shared by a group of individuals or organizations.  Bitcoin, the original blockchain platform, stores transactions on the ledger that record transfers of Bitcoin from one wallet to another. Because the Blockchain cannot be modified, users can be assured that once a transaction is written to it, the transaction cannot be changed, so they can rely on the Bitcoin record of transactions as a basis for conducting business.  The Blockchain is shared, so everyone has the same view of the information.

Bitcoin is an example of a Public Blockchain – anyone can install the required software and connect to the blockchain and participate in the update and management of the network.  The Blockchain implements “consensus” mechanisms to ensure that users follow certain rules when proposing updates to the blockchain, and the large number of participants ensures that the rules are followed.

Other Blockchains, geared towards business and enterprise users, are “Private”, and use traditional security mechanisms to ensure that only authorized participants can join the network.  Hyperledger Fabric, originally developed by IBM, is an example of such a blockchain – it includes a Certificate Authority to issue traditional digital certificates to participants, which grant them certain rights on the network.  It is being employed for business processes where many organizations need to collaborate and share information, such as supply chain management and shipping. The Blockchain facilitates this information sharing, assuring the participants that common rules are being followed, and that information – once written – cannot be altered.

Hyperledger Indy is another project within the Hyperledger Foundation.  Unlike other Blockchains, Indy does not store the information itself on the Blockchain.  Instead, it stores information that participants can use to Identify themselves, and that can be used to Define and Verify information that is published or exchanged between participants.  The information itself is held Off-chain, in the users’ wallets.

There are three main things that Hyperledger Indy stores on the Blockchain – Decentralized Identifiers, Schemas, and Credential Definitions.

A Decentralized Identifier (or DID) represents the identity of an individual or organization.  DIDs can be published on the blockchain, if the identity is to be made public, or they can be exchanged privately between participants, if the DID is to be used to represent a private connection between participants.  DIDs contain cryptographic material that allows participants to sign and encrypt data, and allows other participants to verify this data. DIDs can also contain metadata describing the participant, how to connect with their services, etc.

A Schema defines a specific set of information that will be issued or published as a Verifiable Credential.  It contains the list of attributes that each published credential will contain.

A Credential Definition links the Schema to the issuer’s DID, essentially announcing the fact that the issuer intends to publish credentials with the specific Schema referenced.
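As a rough illustration of how these ledger objects relate to one another, here is a minimal Python sketch.  The class and field names (Schema, CredentialDefinition, attr_names, matches_schema) and the sample DID are hypothetical; this is not Indy’s actual ledger format or API, it only shows the linkage described above.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Schema:
    """The list of attributes every credential of this type will carry."""
    name: str
    version: str
    attr_names: Tuple[str, ...]


@dataclass(frozen=True)
class CredentialDefinition:
    """Links a Schema to the DID of the issuer that intends to use it."""
    issuer_did: str
    schema: Schema


def matches_schema(cred_def: CredentialDefinition, claims: Dict[str, str]) -> bool:
    """Check that a credential's claims are exactly the schema's attributes."""
    return set(claims) == set(cred_def.schema.attr_names)


# Hypothetical example values, for illustration only.
degree_schema = Schema("degree", "1.0", ("name", "degree", "year"))
cred_def = CredentialDefinition("did:example:university", degree_schema)
```

A credential carrying exactly the attributes name, degree and year would pass matches_schema for this definition; one with missing or extra attributes would not.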

When a document is issued according to a specific Credential Definition (and Schema), it is referred to as a Verifiable Credential.  It is Verifiable because it is signed by the issuer, and can be verified via the Credential Definition and the linked Schema and DID.  It is verified based on information that is publicly available on the blockchain – the issuer does not need to be involved. Individual attributes within the Credential are called “Claims”.  When a credential is presented, individual attributes can be selected for presentation; the entire credential does not need to be revealed.

When information is presented in such a way it is referred to as a “Proof”, or a “Presentation”.  The Proof presents the claims and the cryptographic evidence that can be used to verify that the data was in fact issued by the identified issuer.  Proofs can reveal the claim values, or they can be “Zero Knowledge Proofs” (ZKP), which are a way of revealing characteristics of the data without revealing the value itself.
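To make selective disclosure a little more concrete, here is a toy Python sketch using salted hashes.  This is a much simpler technique than the signature scheme Indy actually uses, and it is not a Zero Knowledge Proof.  The issuer’s signature over the digest list is assumed and left out, and all function names are made up for illustration.

```python
import hashlib
import json
import secrets
from typing import Dict, List, Tuple


def _digest(name: str, value: str, salt: str) -> str:
    """Hash one claim together with its salt."""
    return hashlib.sha256(json.dumps([salt, name, value]).encode()).hexdigest()


def issue(claims: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Issuer: pick a random salt per claim and produce one digest per claim.

    In a real system the issuer would sign the digest list; that step is omitted.
    """
    salts = {name: secrets.token_hex(16) for name in claims}
    digests = sorted(_digest(n, v, salts[n]) for n, v in claims.items())
    return salts, digests


def present(claims: Dict[str, str], salts: Dict[str, str], reveal: List[str]):
    """Holder: reveal only the chosen claims, together with their salts."""
    return [(n, claims[n], salts[n]) for n in reveal]


def verify(presentation, issued_digests: List[str]) -> bool:
    """Verifier: every revealed claim must hash to one of the issued digests."""
    return all(_digest(n, v, s) in issued_digests for n, v, s in presentation)


# Hypothetical credential, for illustration only.
claims = {"name": "Alice", "birth_date": "1990-01-01", "licence_class": "G"}
salts, digests = issue(claims)
proof = present(claims, salts, ["licence_class"])
```

Here the verifier learns the licence class but sees only opaque digests for the name and birth date.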

So that’s Indy in a nutshell!  Traditional Blockchains store information on shared, immutable ledgers.  Indy uses a shared, immutable ledger to facilitate the Off-chain sharing of information, which is held in an individual or organization’s private wallet, yet can be shared in a Verifiable manner.

Threshold Cryptography and You

Threshold Cryptography refers to a system whereby multiple parties are required to engage in a cryptographic process, either to produce a digital signature (for example to sign a document) or to decrypt a file or a piece of data.  This can be accomplished by dividing a key into multiple “shares”, and devising a system that requires multiple shares in order to perform the cryptographic operations.  Threshold Cryptography systems are characterized as (n, t+1), where n refers to the number of shares and t+1 refers to the number of shares required to perform crypto operations.  Up to “t” shares can be compromised without affecting the security of the system.

For example, in a (3, 2) system, a key is divided into 3 shares and any 2 can combine to sign or decrypt files.  A single share can be compromised without losing security.

Note that in this kind of scheme the key is not simply divided up into sections.  The shares are derived using “scary math”, so if an adversary gets hold of one of the shares, it doesn’t actually reveal any information about the key.
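For the curious, here is a minimal Python sketch of one well-known form of that math, Shamir’s Secret Sharing, set up as a (3, 2) system.  It is only an illustration: this version rebuilds the key in one place, whereas a real threshold signing or decryption scheme would use the shares without ever reassembling the key.  The choice of prime and the function names are my own assumptions.

```python
import secrets
from typing import List, Tuple

# Field modulus: must be larger than any secret we share (2**127 - 1 is prime).
PRIME = 2**127 - 1

Share = Tuple[int, int]  # an (x, y) point on the polynomial


def split(secret: int, n: int, threshold: int) -> List[Share]:
    """Split `secret` into n shares; any `threshold` of them can rebuild it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def combine(shares: List[Share]) -> int:
    """Lagrange interpolation at x = 0 recovers the constant term (the secret)."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


key = secrets.randbelow(PRIME)
shares = split(key, n=3, threshold=2)
```

Any two of the three shares passed to combine give back the key.  A single share is just one point on a random line, which is consistent with every possible key.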

Threshold Cryptography has a number of use cases, including:

  • Securing private keys for applications like Bitcoin wallets. Private keys (which are used to unlock Bitcoin transactions) can be stored across multiple devices, making the keys more difficult for hackers to steal, and improving the security of your Bitcoins.
  • Securing keys for decrypting sensitive data. Multiple shares would be required to decrypt the data, making the private key more difficult for hackers and other adversaries to obtain.
  • Providing for a multi-party signature, without having to combine multiple different private keys. The parties would use the “shares” to participate in the signatory process and the final signature would represent a single private key.
  • Social password recovery – a private key’s shares could be distributed to friends and relatives (or to a lawyer or notary) to be used to recover a lost password or key. None of the “bearers” would have enough information to act on their own (or for a hacker to exploit).  However, together they could provide a failsafe recovery for a forgotten password or lost key.
  • Distribution of public and private keys. In fact, one of the early use cases for Threshold Cryptography was to support a distributed CA model for an ad-hoc mobile network, to improve the resilience and security of the network.  A similar model could be applied to a blockchain network, and could be used to either improve the security around Hyperledger Fabric’s CA process, or to support a distributed CA for a public version of a Hyperledger Fabric network.

In all the above scenarios, if any of the “shares” were lost or compromised, new shares could be generated and distributed without having to revoke and re-generate the underlying private key.
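One way to picture this “refresh” is sketched below in Python, assuming Shamir-style shares (points on a random polynomial whose value at x = 0 is the key).  Adding a second random polynomial whose value at 0 is zero changes every share but leaves the key untouched.  This is a simplified, single-dealer illustration with names of my own choosing; real proactive schemes have the parties generate the update jointly.

```python
import secrets

PRIME = 2**127 - 1  # field modulus (a Mersenne prime)


def eval_poly(coeffs, x):
    """Evaluate a polynomial (lowest-degree coefficient first) at x, mod PRIME."""
    y = 0
    for c in reversed(coeffs):
        y = (y * x + c) % PRIME
    return y


def combine(shares):
    """Lagrange interpolation at x = 0 recovers the shared secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


def refresh(shares, threshold):
    """Return new shares of the same secret.

    Adds a random polynomial whose constant term is 0, so the value at x = 0
    (the secret) is unchanged while the individual shares are re-randomized.
    """
    delta = [0] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    return [(x, (y + eval_poly(delta, x)) % PRIME) for x, y in shares]


# A (3, 2) sharing of a made-up secret, then one refresh.
secret = 123456789
coeffs = [secret, secrets.randbelow(PRIME)]
old_shares = [(x, eval_poly(coeffs, x)) for x in (1, 2, 3)]
new_shares = refresh(old_shares, threshold=2)
```

After the refresh, any two new shares still combine to the same key, while a stolen old share no longer fits together with the new ones.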

Threshold cryptography can be used in combination with tokenization to devise a system where data can be securely shared between users without revealing the data to third-party observers or adversaries, and without having to reveal or share secret keys between the end users or any intermediary systems.  Anon Solutions is currently doing research in this area, which will be discussed in a future blog post.

If you have any questions or comments, or are interested in any of the solutions discussed, please send me a note.

Tools for Securing your Data (for Developers) – Tokenization

In this and the next few blog posts I’ll talk about two useful tools that can help secure and share your data – Tokenization and Threshold Cryptography.

Tokenization refers to the process of replacing sensitive data fields with a randomly generated token value, and storing the sensitive data value in a logically separate data store.  The token value should be randomly generated, so there is no way to map back from the token to the data value without the use of the tokenization system.  (This is a different approach from encrypting the data values, where the encrypted value can potentially be reversed.)  The generated token can be of the same data type and format as the original data value, which makes it possible to integrate tokenization into existing legacy systems, or to use tokenization to sever sensitive data values from public cloud-based systems.
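Here is a minimal Python sketch of the idea, assuming a toy in-memory “vault”.  The TokenVault class and its methods are hypothetical names for illustration; a real vault would be a separate, access-controlled and encrypted data store.

```python
import secrets
import string
from typing import Dict


class TokenVault:
    """Toy in-memory vault mapping random tokens to the original values."""

    def __init__(self) -> None:
        self._token_to_value: Dict[str, str] = {}
        self._value_to_token: Dict[str, str] = {}

    @staticmethod
    def _random_like(value: str) -> str:
        """Random string of the same shape: digits stay digits, letters stay letters."""
        out = []
        for ch in value:
            if ch.isdigit():
                out.append(secrets.choice(string.digits))
            elif ch.isalpha():
                out.append(secrets.choice(string.ascii_letters))
            else:
                out.append(ch)  # keep separators such as '-' or ' '
        return "".join(out)

    def tokenize(self, value: str) -> str:
        if not any(ch.isalnum() for ch in value):
            raise ValueError("nothing to tokenize")
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = self._random_like(value)
        # Retry on the rare collision with an existing token or the value itself.
        while token in self._token_to_value or token == value:
            token = self._random_like(value)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Only the vault can map a token back to the original value."""
        return self._token_to_value[token]


vault = TokenVault()
card = "4111-1111-1111-1111"
token = vault.tokenize(card)
```

The token has the same length and layout as the card number, so it can sit in an existing database column, but it has no mathematical relationship to the original value.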

Tokenization is an alternative to encryption using strong cryptography.  The two techniques can be combined, using strong encryption to secure all data and token values that are transferred between the application data store and token “vault”, as well as encrypting the actual data values within the vault.

A lot of vendors are now talking about something called “Vault-less Tokenization”, which is something like tokenization but without having to maintain a separate token repository.  (The drawback of a token database is that if you lose it, you lose your data!)  Vault-less Tokenization is something like encryption, where the token value is derived from the original data, plus some “secret” or some values derived from a lookup table, and then rendered in the data’s original format.  It has the advantages of tokenization, without the cost of a separate data repository.  To recover the original value, you apply the reverse of the “secret” on the token value.  It’s really not that much different from strong cryptography; the main benefit is that it gives you a properly formatted token.
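To show what “derived from the original data plus a secret” can look like, here is a toy Python sketch of a format-preserving scheme for digit strings, built as a small Feistel network with HMAC-SHA256 as the round function.  This is my own illustration, not any vendor’s algorithm and not a standardized mode such as NIST FF1; the round count and the demo key are assumptions, and it should not be used to protect real data.

```python
import hashlib
import hmac

ROUNDS = 10  # must be even so the two halves end up back in their original places


def _prf(key: bytes, round_no: int, width: int, value: int) -> int:
    """Keyed pseudo-random function built from HMAC-SHA256."""
    msg = f"{round_no}:{width}:{value}".encode()
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest(), "big")


def tokenize(key: bytes, digits: str) -> str:
    """Map a digit string to another digit string of the same length."""
    if len(digits) < 2 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected at least two decimal digits")
    u = len(digits) // 2
    v = len(digits) - u
    a, b = int(digits[:u]), int(digits[u:])
    for i in range(ROUNDS):
        m = u if i % 2 == 0 else v
        # Mix one half into the other, then swap the halves.
        a, b = b, (a + _prf(key, i, m, b)) % 10**m
    return str(a).zfill(u) + str(b).zfill(v)


def detokenize(key: bytes, token: str) -> str:
    """Run the rounds backwards with the same key to recover the original."""
    u = len(token) // 2
    v = len(token) - u
    a, b = int(token[:u]), int(token[u:])
    for i in reversed(range(ROUNDS)):
        m = u if i % 2 == 0 else v
        a, b = (b - _prf(key, i, m, a)) % 10**m, a
    return str(a).zfill(u) + str(b).zfill(v)


demo_key = b"demo key - replace with a real secret"
card = "4111111111111111"
token = tokenize(demo_key, card)
```

There is no vault to lose here, but anyone holding the key can reverse every token, which is why this behaves more like encryption than like the random tokens described earlier.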

There are a number of properties a secure tokenization system should possess, and I’ll be talking about these in a future blog post.

Another great tool in your toolkit is threshold cryptography.  This one is a bit more complicated, and I’ll be talking about it (and its applications) in my next post.  And later I’ll demonstrate where tokenization and threshold cryptography can be combined to form a secure platform for social data sharing.