STORE is bringing Cloud 2.0 to the world with a zero-fee cryptocurrency and checks-and-balances governance.
STORE aims to tokenize data to make it open (easily discoverable), tradable (data can be purchased), and programmable (developers can build their applications using purchased data streams). STORE uses a ⅔ fault-tolerant trust model to secure the data created on its platform. With this model, STORE miners become trusted oracles, which enhances transparency for data creators, data buyers, and developers on the STORE platform.
The business model for open data is created around a concept called datacoins. This document explains what datacoins are, why they are needed, and how they are used to create an economy around open data. The document is organized as a series of questions and answers to help simplify understanding the business model around open data.
Today’s data lives in silos. It is usually unstructured, or it is formatted by a specific company [1] to suit its business needs. The data is not universally discoverable and hence not easily sellable to, tradable with, or usable by others. Its value is not realized because it cannot be easily accessed, combined with other data, or built upon by others to create better data with even more value.
Additionally, scrubbing and cleaning data after the fact for Artificial Intelligence/Machine Learning (AI/ML) is expensive and error-prone.
“Open” data, on the other hand, is not siloed and is well structured and labeled, so it is easily discovered. Open data allows a company’s data to be efficiently sold, traded, and built with and upon. At STORE we call this process of opening up data the “tokenization of data”. The applications or systems that tokenize their data are called “tokenized apps” or tApps.
All kinds! Any company owning or producing valuable data can tokenize its data and take advantage of STORE’s open data platform. This includes any Web 2.0 or mobile app, which can tokenize its data without expensive rewrites in STORE-specific languages. A lightweight interface allows existing applications or systems to open up their data on the STORE platform in a matter of days (depending on the complexity of the existing applications or services), instead of the months or even years it would take to re-engineer them on other dApp platforms.
An important distinction: dApp platforms have limited capacity, which makes real-world, data-rich apps extremely expensive to deploy on them. STORE, by contrast, is designed from the ground up to support data-rich apps and consume massive amounts of data in a trusted and decentralized network.
For data to be open and tradable, it should have the following properties.
When companies tokenize and open their data they benefit, but what about the data creators (end users) of these apps? There are two ways tokenizing data on STORE benefits end users.
STORE has chosen 1 MB as the unit of data. Data is priced, and access is purchased, in these units. Because each tradable and programmable 1 MB unit represents an asset (data), we call these units datacoins.
1 MB of data = 1 datacoin
Each company’s tokenized app issues its own datacoins, which are created, purchased, and used in the context of their tokenized app and are unrelated to another company’s datacoins. For every MB of data created, 1 datacoin is created by that app. Fig. 1 illustrates the process of creating datacoins based on the data created by the tokenized apps.
For example, an IoT app that monitors activity around volcanoes and produces a live data feed will issue its own datacoins (say, ‘volcanocoin’ [2]), while an AI app that predicts traffic congestion from real-time traffic data will issue its own datacoins (say, ‘trafficcoin’). On average, the volcanocoin app may produce 500 MB of data on a given day, so it creates 500 volcanocoins daily. The trafficcoin app, on the other hand, may produce 1 TB of data daily, so it creates 1M trafficcoins daily. Fig. 2 illustrates different tokenized apps issuing their own datacoins.
Datacoins are mined [3] based on the units of data created in a tokenized app, so they are mined on a continuous basis. For example, if a company creates 1 GB of data on a given day, then it will mine 1,000 datacoins on that day. This is different from Ethereum’s ERC-20 tokens, which typically have their “total supply” predefined when the tokens are first created. The number of datacoins mined may vary widely from day to day, depending on the volume of data created.
You bet! The company’s data will first need to be structured, labeled, and priced like other data created in STORE. If a company wants to migrate 1 TB of its preexisting data, then 1,000,000 datacoins will be mined on day 1. From that point forward, new data will generate new datacoins at the rate of 1 datacoin per 1 MB of data.
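The issuance arithmetic in this example can be sketched in a few lines of Python. This is an illustration only, not STORE's implementation: the function name `datacoins_for` is hypothetical, the decimal megabyte (10^6 bytes) is inferred from the text's "1 TB = 1M trafficcoins", and rounding partial megabytes down is an assumption the document does not specify.

```python
MB = 10**6  # decimal megabyte, consistent with 1 TB -> 1,000,000 datacoins in the text


def datacoins_for(bytes_created: int) -> int:
    """Whole datacoins issued for a volume of data (1 MB = 1 datacoin).

    Assumption: a partial megabyte does not yet yield a datacoin.
    """
    if bytes_created < 0:
        raise ValueError("bytes_created must be non-negative")
    return bytes_created // MB


# Daily data volumes from the example above, in bytes.
daily_output = {
    "volcanocoin": 500 * MB,  # 500 MB per day
    "trafficcoin": 10**12,    # 1 TB per day
}

# Each app issues its own, unrelated datacoins.
issued = {coin: datacoins_for(volume) for coin, volume in daily_output.items()}
```

Here `issued` maps each app's coin name to the number of datacoins it creates that day.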
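To make the contrast with a predefined total supply concrete, the sketch below tracks a datacoin supply that grows with the data created. The day-1 figure is the 1 TB migration from the text; the volumes for the later days are made up for illustration, and the helper name `mined_per_day` is hypothetical.

```python
from itertools import accumulate

MB = 10**6  # decimal megabyte, as implied by 1 TB -> 1,000,000 datacoins


def mined_per_day(daily_bytes):
    """Datacoins mined each day, at 1 datacoin per whole MB (rounding down is an assumption)."""
    return [volume // MB for volume in daily_bytes]


# Day 1: migration of 1 TB of preexisting data.
# Days 2-4: hypothetical volumes of newly created data.
daily_bytes = [10**12, 10**9, 250 * MB, 3 * 10**9]

mined = mined_per_day(daily_bytes)
# Running total: the supply is never fixed up front, it follows the data.
supply = list(accumulate(mined))
```

The daily amounts in `mined` vary with the data volume, while `supply` only ever grows.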
No. Different data have different values. Some data is more valuable and other data is less.
Of course. Companies can use different “classes” to identify groups of their data and price each group separately.
Fig. 3 illustrates data classes. Tokenized apps can create data of different classes, but datacoins are issued purely based on the total data created by the apps.
Fig. 4 gives an example of how the total number of datacoins is based on the total amount of data created by the tokenized apps.
Companies decide the pricing of their data. Different classes of data may be priced differently based on their value and demand. Fig. 5 illustrates an example of pricing different classes of data.
Data is priced in 1MB units (though pricing flexibility exists to accommodate events such as large or full buyouts and even ongoing live data streaming).
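As a rough sketch of how class-based pricing and class-independent issuance fit together, consider the Python below. The class names, prices, and volumes are invented for illustration (the document's figures are not reproduced here), the price unit is left abstract, and `purchase_cost` is a hypothetical helper rather than a STORE API.

```python
from decimal import Decimal

# Hypothetical price per 1 MB unit for each data class, set by the company.
class_prices = {
    "raw": Decimal("0.10"),
    "curated": Decimal("0.50"),
}

# Hypothetical data created by one tokenized app, in MB per class.
created_mb = {"raw": 800, "curated": 200}

# Datacoins depend only on the total data created, not on its class.
total_datacoins = sum(created_mb.values())


def purchase_cost(data_class: str, megabytes: int) -> Decimal:
    """Cost of buying access to `megabytes` 1 MB units of one data class."""
    return class_prices[data_class] * megabytes


cost = purchase_cost("curated", 40)
```

Special arrangements mentioned above, such as full buyouts or live streaming, would need their own pricing rules and are not modeled here.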
As discussed before, open data becomes tradable and programmable. App developers earn their revenue from selling data on the STORE platform. Data opened on STORE is easier to trade than data from the same apps built on closed, centralized cloud platforms, where data is hard to discover and purchase.
STORE miners provide developers with the p2p compute resources they need to run their apps. Developers pay for these resources much as they would pay AWS™, for example. The miners earn their revenue in the form of block rewards in $STORE for securing the STORE blockchain as well as by selling p2p compute resources to developers.
STORE also believes that data creators (end users) need a way to get paid when their data is sold. STORE will provide the smart contract infrastructure and tooling for developers to facilitate revenue sharing with users. The decision to share revenue will be up to the developers. The ⅔ trust model used to secure the data increases transparency: data creators can see how (and whether) their data is sold, how many times it is sold and accessed, and so on. This transparency helps create reasonable revenue-sharing contracts where developers want to incentivize users to create more valuable data. Even where users don’t get paid for their data, they still benefit from STORE’s default encryption and anonymization, which protect their privacy.
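One possible shape for such a revenue-sharing rule is sketched below in Python. Everything here is an assumption made for illustration: the document does not define a sharing formula, so the proportional split by contributed megabytes, the `user_share` parameter, and the function name `split_sale` are all hypothetical.

```python
from fractions import Fraction


def split_sale(amount, user_share, contributed_mb):
    """Split one sale between the developer and the data creators.

    amount         -- sale proceeds (abstract units)
    user_share     -- fraction of the sale the developer chooses to share
    contributed_mb -- MB of the sold data contributed by each user
    Returns (developer_revenue, payouts_per_user).
    """
    total_mb = sum(contributed_mb.values())
    if total_mb == 0:
        # Nothing attributable to users: the developer keeps the full amount.
        return amount, {}
    pool = amount * user_share
    payouts = {
        user: pool * Fraction(mb, total_mb)
        for user, mb in contributed_mb.items()
    }
    return amount - pool, payouts


# A developer shares 20% of a sale worth 100 units with two contributors.
developer, payouts = split_sale(
    Fraction(100), Fraction(1, 5), {"alice": 30, "bob": 10}
)
```

Exact fractions are used so that the developer's revenue and the payouts always add up to the sale amount.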
STORE cannot prevent data reselling because the buyer can copy the data to their local infrastructure, repackage it, and resell it. This is very similar to how products such as electronics, phones, and computers are sold in gray markets. What STORE guarantees is the integrity of data when it is accessed directly on the STORE platform. The data is guaranteed to have passed schema enforcement and other validations, and any sensitive information in it is guaranteed to be anonymized. So, if data quality and such guarantees are critical, buyers will stick to legitimate access to data on the STORE platform.
All data, including sensitive and private information, can be sold on the STORE platform. So how is user privacy protected? A company’s tokenized apps are required to annotate sensitive and private data before submitting it to the platform for persistence and datacoin mining. The annotated segment is automatically encrypted at rest, so even unauthorized access to the data doesn’t leak sensitive information. When sensitive information is annotated, additional clear-text metadata is created that describes the classification and categorization of the sensitive data. This metadata assists in data discovery and in context-sensitive anonymization when the data is eventually accessed. If sensitive information is included in a query executed by a purchase contract, it is automatically anonymized using the information in the metadata, so queries always return safe data. This allows all types of data to be traded without sacrificing users’ privacy.
If app developers fail to annotate sensitive information in their respective apps, the data created by those apps will not be attested by STORE miners. In other words, while the apps may create and sell their data on STORE, the data will not pass the validation rules for sensitive information and hence will not contain the proof of approval by STORE miners. This is a red flag for data buyers because it clearly implies that the apps haven’t fully followed protocol rules. Instead of taking punitive measures, STORE lets the markets decide if the data created by such apps has any value at all.
We believe that some data can demand premiums that will permit only a limited number of buyers to purchase access to and own that data. Developers can define a “number of seats” to limit access to such data. With the current design, such data can be protected with a “non-fungible datacoin” (NFD), where each datacoin specifically represents the data it backs. Purchasing an NFD will allow exclusive access to the data it protects. But the concept of “seats” may help developers better model their access criteria for premium data. This is a research topic for now, and we’ll publish more details when we have a deterministic model for it.
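The annotate-then-anonymize flow can be illustrated with a small Python sketch. It is not STORE's protocol: the `Annotation` type, the category names, and the masking rules are assumptions chosen for the example, and encryption at rest is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Clear-text metadata describing one sensitive field (hypothetical shape)."""
    field: str
    category: str  # e.g. "email" or "location"


def anonymize(value, category):
    """Apply a category-specific anonymization rule (illustrative rules only)."""
    if category == "email":
        _, _, domain = value.partition("@")
        return "***@" + domain
    if category == "location":
        lat, lon = value
        return (round(lat, 1), round(lon, 1))  # coarsen coordinates
    return "<redacted>"


def run_query(record, annotations, fields):
    """Return the requested fields, anonymizing every annotated one."""
    category_of = {a.field: a.category for a in annotations}
    return {
        name: anonymize(record[name], category_of[name])
        if name in category_of
        else record[name]
        for name in fields
    }


record = {
    "user_email": "ana@example.com",
    "position": (19.4212, -155.2871),
    "temp_c": 41.5,
}
annotations = [
    Annotation("user_email", "email"),
    Annotation("position", "location"),
]
safe = run_query(record, annotations, ["user_email", "position", "temp_c"])
```

The metadata drives the choice of rule, which is what "context-sensitive" means in this sketch: an email is masked differently from a location, and unannotated fields pass through unchanged.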
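Since the seat model is still a research topic, the following Python is only a thought experiment about what seat-limited access could look like. The class `SeatLimitedData` and its methods are entirely hypothetical and do not describe a published STORE design.

```python
class SeatLimitedData:
    """Premium data whose access is capped at a fixed number of seats (hypothetical)."""

    def __init__(self, seats: int):
        if seats < 1:
            raise ValueError("at least one seat is required")
        self.seats = seats
        self.holders = set()

    def purchase(self, buyer: str) -> bool:
        """Grant a seat if one is free; existing holders keep theirs."""
        if buyer in self.holders:
            return True
        if len(self.holders) >= self.seats:
            return False  # sold out
        self.holders.add(buyer)
        return True

    def can_access(self, buyer: str) -> bool:
        return buyer in self.holders


premium = SeatLimitedData(seats=2)
results = [premium.purchase(b) for b in ("fund_a", "fund_b", "fund_c")]
```

With `seats=1` this reduces to the exclusive access that the text associates with an NFD.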
[1] In this case, “company” could mean a large enterprise, a single developer, or anything in between.
[2] Don’t take the coin names too seriously! They are used here for illustration purposes only.
[3] We use the terms “mined”, “created”, and “issued” interchangeably. 1 datacoin is created, mined, or issued when 1MB of data is created by a tokenized app.