
About STORE

STORE is bringing Cloud 2.0 to the world with a zero-fee cryptocurrency and checks and balances governance.

February 20, 2020
-
5 min read

Datacoins 201

About STORE

STORE aims to tokenize data to make it open (easily discoverable), tradable (data can be purchased), and programmable (developers can build their applications using purchased data streams). STORE uses a ⅔ fault tolerant trust model to secure the data created on its platform. STORE miners become trusted oracles with this model, enhancing the transparency for data creators, data buyers, and developers on the STORE platform.  

The business model for open data is created around a concept called datacoins. This document explains what datacoins are, why they are needed, and how they are used to create an economy around open data. The document is organized as a series of questions and answers to help simplify understanding the business model around open data.

Datacoins 201

What is open and tradable data?

Today’s data lives in silos. It is usually unstructured, or it is formatted by a specific company [1] to suit its business needs. The data is not universally discoverable and hence not easily sellable to, tradable with, or usable by others. Its value goes unrealized because it cannot be easily accessed, combined with other data, or built upon by others to create better data with even greater value.

Additionally, scrubbing and cleaning data after the fact for Artificial Intelligence/Machine Learning (AI/ML) is expensive and error-prone.

“Open” data, on the other hand, is not siloed and is well structured and labeled, so it is easily discovered. Open data allows a company’s data to be efficiently sold, traded, and built upon. At STORE we call this process of opening up data the “tokenization of data”. The applications or systems that tokenize their data are called “tokenized apps” or tApps.

What about APIs and algorithms? Can they be opened and traded?

At STORE we use the term “data” to mean pre-created data as well as APIs and algorithms that produce data on-demand. These APIs and algorithms can be invoked with their supported properties to produce different sets of data. Tokenization includes APIs and algorithms as well.

What types of companies might tokenize their data?

All kinds! Any company owning or producing valuable data can tokenize its data and take advantage of STORE’s open data platform. This includes any web2.0 or mobile app, which can tokenize its data without expensive rewrites in STORE-specific languages. A lightweight interface allows existing applications or systems to open up their data on the STORE platform, so existing apps can migrate onto STORE without expensive re-engineering.

An important distinction: dApp platforms have limited capacity, which makes deploying real-world, data-rich apps on them extremely expensive. STORE, by contrast, is designed from the ground up to support data-rich apps and to consume massive amounts of data in a trusted and decentralized network.

What are data-rich and data-light apps?

Today’s web2.0 and mobile apps consume and create large amounts of data. AI/ML services take this further, consuming and creating even larger amounts of data. We call them data-rich apps. In contrast, dApps built on top of web3.0 and similar platforms can be termed data-light apps. This differentiation is important because most dApp platforms require app developers or users to pay for network resources (memory, CPU, bandwidth, and storage). Given the limited capacity of these decentralized platforms, network resources tend to be too expensive for deploying real-world, data-rich apps. STORE is designed from the ground up to support data-rich apps in a trusted and decentralized network.

What is tokenization of data? What does it mean to “open up” data?

For data to be open and tradable, it should have the following properties.

  1. Discoverability — Prospective buyers should be able to discover the data they are looking for. STORE facilitates data categorization, which enables sellers to categorize their data in many different ways using existing standards or their own. This enables prospective buyers to search for data by category names or tags.
  2. Classification — Not all data is created equal. A single company can produce data of different qualities or values. Their tApps can produce many different types of data also. Companies can classify their data with the criteria of their choice and publish data in those classes. Together with data categorization, classification allows for the precise discoverability of data.
  3. Tradability — For data to be traded, it must be priced in some “units of data”, so that buyers can pay for and access the data. The price for different classes of data can be different.
  4. Access enforcement — Access enforcement and monitoring is required to ensure that only authorized buyers can access the data that they have purchased access for. There may be additional constraints such as number of times the data can be accessed, access duration, and so on.

The process of tokenization involves securing the data created by tApps with private keys. Data can then be traded when the buyer purchases access to private keys. In order to prevent access leaks, the private keys are hashed with the buyer’s public key so only the authorized buyer can access data.

What is the “unit of data” for tokenization?

STORE has chosen 1 MB of data as the unit of data. Data is priced and access is purchased in these data units. Because these tradable and programmable 1 MB data units represent an asset (data), we call them datacoins.

1 MB of data = 1 datacoin
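
As a rough illustration of the 1 MB unit, the sketch below converts a byte count into datacoins. It assumes decimal megabytes (1 MB = 1,000,000 bytes) and assumes that only complete megabytes mint a coin; the article does not say how partial megabytes are handled. The function name is hypothetical and not part of any STORE API.

```python
BYTES_PER_MB = 1_000_000  # assumption: decimal megabytes


def datacoins_for_bytes(num_bytes: int) -> int:
    """Return the datacoins mined for num_bytes of data.

    Assumption: only complete 1 MB units mint a datacoin.
    """
    if num_bytes < 0:
        raise ValueError("num_bytes must be non-negative")
    return num_bytes // BYTES_PER_MB


# 500 MB of data yields 500 datacoins under these assumptions.
print(datacoins_for_bytes(500 * BYTES_PER_MB))
```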

What are datacoins?

Each company’s tokenized app issues its own datacoins, which are created, purchased, and used in the context of their tokenized app and are unrelated to another company’s datacoins. For every MB of data created, 1 datacoin is created by that app. Fig. 1 illustrates the process of creating datacoins based on the data created by the tokenized apps.

Fig. 1 — How datacoins are created


For example, an IoT app that monitors activity around volcanoes and produces a live data feed will issue its own datacoins (say, ‘volcanocoin’ [2]), while an AI app that predicts traffic congestion based on real-time traffic data will issue its own datacoins (say, ‘trafficcoin’). On average, the volcanocoin app may produce 500 MB of data on a given day, so it creates 500 volcanocoins daily. The trafficcoin app, on the other hand, may produce 1 TB of data daily, so it creates 1M trafficcoins daily. Fig. 2 illustrates different tokenized apps issuing their own datacoins.

Fig. 2 — Each tokenized app issues its own datacoins

Are datacoins created all at once or created on a continuous basis?

Datacoins are mined[3] based on the units of data created in a tokenized app, so they are mined on a continuous basis. For example, if a company creates 1 GB of data on a given day, then it will mine 1,000 datacoins on that day. This is different from Ethereum’s ERC20 tokens, which have their “total supply” predefined when the tokens are first created. The number of datacoins mined may vary widely on a day-to-day basis, depending on the volume of data created.

Can a company migrate their preexisting data onto STORE?

You bet! The company’s data will first need to be structured, labeled, and priced like other data created in STORE. If a company wants to migrate 1 TB of its preexisting data, then 1,000,000 datacoins will be mined on day 1. From that point forward, new data will generate new datacoins at the rate of 1 datacoin per 1 MB of data.
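
To make the "migrate once, then mine continuously" idea concrete, here is a minimal sketch of a supply ledger for a single tokenized app. The class, its fields, and the daily volumes after day 1 are hypothetical; decimal megabytes are assumed.

```python
from dataclasses import dataclass, field

MB = 1_000_000  # assumption: decimal megabytes


@dataclass
class DatacoinLedger:
    """Tracks the cumulative datacoin supply of one tokenized app (hypothetical)."""
    app_name: str
    total_supply: int = 0
    history: list = field(default_factory=list)

    def record_data(self, label: str, num_bytes: int) -> int:
        """Mine 1 datacoin per whole MB of data and return the coins mined."""
        mined = num_bytes // MB
        self.total_supply += mined
        self.history.append((label, mined))
        return mined


ledger = DatacoinLedger("example-app")
ledger.record_data("day 1 (migration of 1 TB)", 1_000_000 * MB)
ledger.record_data("day 2 (1 GB of new data)", 1_000 * MB)
ledger.record_data("day 3 (250 MB of new data)", 250 * MB)

for label, mined in ledger.history:
    print(f"{label}: {mined:,} datacoins mined")
print(f"total supply: {ledger.total_supply:,}")
```

Unlike a token with a fixed total supply, the supply here has no predefined cap; it simply follows the volume of data recorded.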

Does all data have the same value?

No. Different data have different values. Some data is more valuable and other data is less.

Can a company have different prices for their data?

Of course. Companies can use different “classes” to identify and subsequently uniquely price different groups of their data.

Fig. 3 illustrates data classes. Tokenized apps can create data of different classes, but datacoins are issued purely based on the total data created by the apps.

Fig. 3 — Datacoins are created based on the total data created by tokenized apps

Fig. 4 gives an example of how the total number of datacoins is based on the total amount of data created by the tokenized apps.

Fig. 4 — An example illustrating how datacoins are created based on the total amount of data created

How does data pricing work?

Companies decide the pricing of their data. Different classes of data may be priced differently based on their value and demand. Fig. 5 illustrates an example of pricing different classes of data.

Fig. 5 — An example illustrating data pricing of different data classes

Data is priced in 1MB units (though pricing flexibility exists to accommodate events such as large or full buyouts and even ongoing live data streaming). Fig. 6 illustrates an example of a full buyout where the buyer purchases access to all the data created by a tokenized app. In this example, the app creates 3 classes of data at different prices per class, but the buyer negotiates a common price (1.5 datacoins per 1 MB of data) to purchase all the 3 classes of data.

Fig. 6 — An example of purchasing firehose access to the data
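
The difference between per-class pricing and a negotiated buyout can be sketched as follows. The per-class prices and data sizes are made up for illustration, since the figures are not reproduced here; only the common price of 1.5 datacoins per 1 MB comes from the text above.

```python
# Hypothetical list prices, in datacoins per 1 MB, for three data classes.
CLASS_PRICES = {"class_a": 1.0, "class_b": 2.0, "class_c": 3.0}


def list_price_total(sizes_mb: dict) -> float:
    """Cost of buying each class at its own per-MB price."""
    return sum(CLASS_PRICES[name] * mb for name, mb in sizes_mb.items())


def buyout_total(sizes_mb: dict, common_price: float = 1.5) -> float:
    """Cost of buying everything at one negotiated per-MB price."""
    return common_price * sum(sizes_mb.values())


sizes = {"class_a": 100, "class_b": 100, "class_c": 100}  # MB per class
print("per-class pricing:", list_price_total(sizes))
print("full buyout:", buyout_total(sizes))
```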

How will buyers find what they want on the STORE platform?

As discussed earlier in this document, data discovery is done via data categorization. For a requested category, multiple data classes from multiple tokenized apps may match. Buyers use the search feature on STORE to discover the categories of data they want. Fig. 7 illustrates how generic search on STORE or specific APIs published by tokenized apps can be used to discover and filter data.

The search or the API result contains matching data classes, data sizes, and the price of data in respective datacoins and in $STORE. The result may also contain the preview data of matching classes to help buyers with making purchase decisions. The result has a unique result ID, which can be used later to purchase the data.

Fig. 7 — Generic search or specific APIs to discover and filter the data
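
A search result of the kind described above might be modeled as in the sketch below. All names, tags, and prices are hypothetical, and a random UUID stands in for the result ID; the real STORE search API and result format are not specified in this article.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class DataClassListing:
    """One data class published by a tokenized app (hypothetical fields)."""
    app: str
    data_class: str
    tags: frozenset
    size_mb: int
    datacoins_per_mb: float
    store_per_datacoin: float


LISTINGS = [
    DataClassListing("volcano-app", "live-feed", frozenset({"geology", "iot"}), 500, 1.0, 2.0),
    DataClassListing("traffic-app", "congestion", frozenset({"traffic", "iot"}), 1_000_000, 1.0, 0.1),
]


def search(listings, tag: str) -> dict:
    """Return matching data classes with sizes and prices, plus a result ID."""
    matches = []
    for item in listings:
        if tag in item.tags:
            price_datacoins = item.size_mb * item.datacoins_per_mb
            matches.append({
                "app": item.app,
                "data_class": item.data_class,
                "size_mb": item.size_mb,
                "price_datacoins": price_datacoins,
                "price_store": price_datacoins * item.store_per_datacoin,
            })
    return {"result_id": str(uuid.uuid4()), "matches": matches}


result = search(LISTINGS, "geology")
print(result["result_id"], result["matches"])
```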

How will buyers purchase access to the data?

Fig. 8 — Data access purchase flow

Fig. 8 illustrates how buyers purchase access to the data. As described before, buyers start with search and filtering to decide what they want to buy. The buyers then link the result ID to their STORE wallet to pay for the data access. The buyers pay in $STORE, so they should have sufficient balance in their wallet to purchase the access to the data. Once the purchase is completed, a smart contract is created with the following content.

  1. The result ID — The result ID points to a result object, which contains all the details about how to fetch the data the buyer has requested. The same search and filtering criteria are used to retrieve data and grant data access to the buyer.
  2. Datacoins purchased — The purchase involves automatic deposit of the tokenized app’s datacoins into the buyer’s wallet. Datacoins stay in the buyer’s wallet until the buyer proceeds with accessing the data.
  3. Authorization — The authorization required to access the data. This is similar to session cookies used in web browsers and contains sufficient information to ensure that the buyer has access to the data based on their request. For example, the authorization may contain access duration through which the buyer can access data, and so on.
  4. Public key — The public key of the wallet. The authorization is cryptographically tied to the public key to prevent access leaks where unauthorized people try to access data with a stolen authorization token.

The contract contains other housekeeping data also, such as the public keys of app developers. The contract can be “executed” by the buyer to access the data. This process is discussed next.
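
The four contract fields listed above can be pictured as a simple record. This is only a data-structure sketch: the class names, field names, and the `purchase` helper are hypothetical, and a real STORE smart contract would live on the network rather than in a Python object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Authorization:
    """Access restrictions; the field names are hypothetical."""
    expires_at: int    # Unix timestamp after which access is denied
    max_accesses: int  # number of times the data may be fetched


@dataclass(frozen=True)
class PurchaseContract:
    result_id: str
    datacoins_purchased: float
    authorization: Authorization
    buyer_public_key: str
    developer_public_keys: tuple = ()  # housekeeping data


def purchase(wallet_balance_store, price_store, result_id, datacoins,
             authorization, buyer_public_key):
    """Check the $STORE balance, then return the contract and the new balance."""
    if wallet_balance_store < price_store:
        raise ValueError("insufficient $STORE balance")
    contract = PurchaseContract(result_id, datacoins, authorization, buyer_public_key)
    return contract, wallet_balance_store - price_store


contract, remaining = purchase(
    wallet_balance_store=1500.0,
    price_store=1000.0,
    result_id="result-123",
    datacoins=500.0,
    authorization=Authorization(expires_at=1_700_000_000, max_accesses=10),
    buyer_public_key="buyer-public-key-hex",
)
print(contract.result_id, remaining)
```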

How will the buyer actually access the data they have purchased?

The buyer can execute the contract to get an authorization code that is tied to their identity. The authorization code works similarly to cookies used in websites and allows the STORE platform to enforce access rights. The authorization code is associated with the buyer’s identity to prevent unauthorized access to data. When executing the contract, the buyer signs the contract transaction with their private key to prove ownership of the authorization code. Fig. 9 illustrates the data access flow.

Fig. 9 — Data access purchase flow

The authorization code embeds any restrictions associated with the data access such as the expiry date, concurrent access to data, anonymization schemes if any, and so on.
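
The idea of an authorization code that is bound to one identity and carries an expiry can be sketched with Python's standard library. This is a stand-in, not STORE's scheme: the standard library has no public-key signatures, so an HMAC keyed by a hypothetical platform-side secret replaces the buyer's private-key signature, and the code format is invented for the example.

```python
import hashlib
import hmac

PLATFORM_SECRET = b"demo-secret"  # hypothetical; stands in for real key material


def issue_code(result_id: str, buyer_public_key: str, expires_at: int) -> str:
    """Issue a code whose tag covers the result ID, the buyer's key, and the expiry."""
    message = f"{result_id}|{buyer_public_key}|{expires_at}".encode()
    tag = hmac.new(PLATFORM_SECRET, message, hashlib.sha256).hexdigest()
    return f"{result_id}|{expires_at}|{tag}"


def verify_code(code: str, presenter_public_key: str, now: int) -> bool:
    """Accept the code only for the key it was issued to, and only before expiry."""
    try:
        result_id, expires_at, tag = code.split("|")
        expiry = int(expires_at)
    except ValueError:
        return False
    message = f"{result_id}|{presenter_public_key}|{expires_at}".encode()
    expected = hmac.new(PLATFORM_SECRET, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(tag, expected) and now < expiry


code = issue_code("result-123", "buyer-key", expires_at=2000)
print(verify_code(code, "buyer-key", now=1000))  # rightful buyer, before expiry
print(verify_code(code, "thief-key", now=1000))  # stolen code, different identity
```

Because the buyer's key is part of the tagged message, a stolen code fails verification when presented under any other key, which is the property the text describes.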

How will datacoins have a separate value from $STORE?

The value of $STORE, the native currency of the STORE network, is determined by the macroeconomics of the network. Datacoins, however, are app-specific: their value is based on the demand for the data of specific apps and is set by the developers of those apps. Fig. 10 summarizes these differences.

Fig. 10 — Datacoins will have a separate value than $STORE

Developers are responsible for pricing their data and creating demand for their data. So, the value of datacoins is primarily a function of the tokenized data. Since not all data is created equal, datacoins of some apps may be more valuable (in absolute terms) than others.

What is the denomination in which datacoins are valued?

$STORE is the unit of account on the STORE platform. This means all datacoins are valued and priced in $STORE. Just like any commodity, the value of data in a tokenized app may fluctuate based on demand and supply, so the value of a datacoin will fluctuate with the value of the data. For example, 1 volcanocoin may be equal to 2 $STORE and 1 trafficcoin may be equal to 0.1 $STORE.

Are datacoins fungible or non-fungible?

Fig. 11 explains the difference between fungible and non-fungible tokens.[4] Datacoins can be either fungible or non-fungible, depending on how developers want to model access to their data. In other words, this choice is the responsibility of developers. For example, if an app produces unique instances of data, such as digital art, access to each instance can be modeled as a non-fungible datacoin. The purchase flow described in Fig. 9 covers both types of coins. The smart contract generated to grant ownership of data describes the type of coin used to access the data. Since the purchase process starts with searching for specific data, access is granted for the searched data, thus hiding the differences between fungible and non-fungible datacoins. The smart contracts used on STORE to purchase access to data can be compared to the ERC998[5] standard.

Fig. 11 — Comparing fungible and non-fungible tokens


We estimate that the majority of tokenized apps will issue fungible tokens, which exhibit the following properties.

  1. 1 datacoin of a tokenized app is interchangeable with another datacoin of the same tokenized app. For example, 1 volcanocoin is interchangeable with another volcanocoin or 10 x 1/10 volcanocoin fractions.
  2. 1 datacoin of one tokenized app is generally not interchangeable with a datacoin of another tokenized app. For example, 1 volcanocoin is not interchangeable with 1 trafficcoin.
  3. While 1 MB of data creates 1 datacoin, the created datacoin doesn’t exclusively own the access to the 1 MB of data created. The datacoin can be used to purchase any 1 MB of data for the tokenized app.
  4. For tokenized apps that create multiple classes of data (we assume this is the predominant use case), 1 MB of one class of data is not interchangeable with 1 MB of another class of data, although each 1 MB of data mines 1 datacoin. This is clarified in the previous question above with volcanocoins. This means that while the datacoins of a tokenized app are fungible within the context of that app, its data are not.
  5. Since $STORE is the unit of account, datacoins from one tokenized app can be converted in value to datacoins of another tokenized app. For example, consider two tokenized apps using each other’s data. Assume 1 App1COIN = 0.5 $STORE and 1 App2COIN = 0.1 $STORE. Since the apps use each other’s data on a continuous basis, it is very inefficient to buy each other’s datacoins with $STORE before they can access each other’s data. So the cooperating apps may decide on a conversion rate between their datacoins. In this case, 1 App1COIN = 5 App2COIN, but the conversion rate can be dynamic and may change over time depending on the respective values of the datacoins. In other words, it may not be a one-time setup, so app developers use STORE smart contracts to price and buy each other’s data. Figures 12 and 13 describe how a developer-to-developer contract works.
Fig. 12 — Cooperating apps negotiate exchange rates for their datacoins


The cooperating apps negotiate an exchange rate between their datacoins. The purchase contract works exactly as described before, granting apps authorizations to each other’s data.
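
The exchange rate in the App1COIN/App2COIN example follows directly from each coin's $STORE value. The sketch below reproduces that arithmetic; the function names are hypothetical, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def cross_rate(store_per_coin_a: Fraction, store_per_coin_b: Fraction) -> Fraction:
    """How many units of coin B one unit of coin A is worth."""
    return store_per_coin_a / store_per_coin_b


def convert(amount_a: Fraction, store_per_coin_a: Fraction,
            store_per_coin_b: Fraction) -> Fraction:
    """Convert an amount of coin A into the equivalent amount of coin B."""
    return amount_a * cross_rate(store_per_coin_a, store_per_coin_b)


APP1 = Fraction("0.5")  # 1 App1COIN = 0.5 $STORE (from the example above)
APP2 = Fraction("0.1")  # 1 App2COIN = 0.1 $STORE (from the example above)

print("1 App1COIN =", cross_rate(APP1, APP2), "App2COIN")
print("10 App2COIN =", convert(Fraction(10), APP2, APP1), "App1COIN")
```

If either coin's $STORE value changes, recomputing `cross_rate` gives the new rate, which is why the text describes the rate as dynamic.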

Fig. 13 — Automatic payment for each other’s data with each other’s datacoins

When accessing each other’s data, if payment is required, it is automatically triggered before data access is allowed. Fig. 13 illustrates an example of what happens when App1 wants to use App2’s data.

How do app developers purchase resources required to run their apps?

App developers can purchase network resources such as the memory, CPU, storage, and bandwidth in one of the following two ways:

Paid p2p cloud — In this option, app developers pay STORE miners for compute resources they need for their apps. This is similar to paying AWS to rent EC2 instances, S3 storage, and other services. App developers may choose on-demand resources or enter into long-term contracts depending on their specific needs.

Zero-fee p2p cloud — In this option, app developers receive compute resources at no upfront cost from STORE miners. Not all apps may be eligible for zero-fee compute resources. STORE miners in the respective Markets evaluate apps for eligibility. In return for receiving zero-fee compute resources, app developers share datacoin revenue with STORE miners. The specifics of revenue sharing vary from one app to another, depending on the negotiation between STORE miners and app developers.

Who gets paid when buyers purchase datacoins ($STORE revenue)?

Depending on how developers purchase p2p compute resources, the revenue from purchasing datacoins may be paid out entirely to app developers or they share a percentage of the revenue with STORE miners. Figures 14 and 15 illustrate these two cases. Note that in both cases, the buyer pays in $STORE to purchase datacoins and the revenue is realized in $STORE.

Fig. 14 — In the paid model, developers earn 100% of the $STORE revenue from the datacoin purchase
Fig. 15 — In the zero-fee p2p compute model, developers share the $STORE revenue with STORE miners
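
The two payout cases can be expressed as one function with a miner share that is zero in the paid model. The 25% share used below is purely illustrative; the article says the actual split is negotiated per app.

```python
def split_revenue(revenue: float, miner_share: float = 0.0) -> dict:
    """Split revenue between the developer and STORE miners.

    miner_share is the fraction paid to miners: 0.0 in the paid model,
    and a negotiated value (hypothetical here) in the zero-fee model.
    """
    if not 0.0 <= miner_share <= 1.0:
        raise ValueError("miner_share must be between 0 and 1")
    miners = revenue * miner_share
    return {"developer": revenue - miners, "miners": miners}


print("paid model:", split_revenue(100.0))
print("zero-fee model:", split_revenue(100.0, miner_share=0.25))
```

The same split applies whether the revenue is counted in $STORE or in datacoins.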

Who gets paid when buyers use datacoins to purchase data (datacoin revenue)?

Buyers pay app developers in datacoins to request access to data. This results in datacoin revenue. Depending on how developers purchase p2p compute resources, datacoin revenue may be paid out entirely to app developers or they share a percentage of the revenue with STORE miners. Figures 16 and 17 illustrate these two cases.

In practice, buying datacoins and using that purchase to access data are executed in a single step. So, $STORE revenue and datacoin revenue are also paid out in a single step. The exception may be when the buyer wants to buy datacoins for speculation purposes and doesn’t intend to purchase data with them. In this case, the $STORE revenue and the datacoin revenue will be paid out in two separate steps based on when or if the buyer eventually decides to purchase data.


Fig. 16 — In the paid model, developers earn 100% of the datacoin revenue from the data purchase

Data creators (in many cases, end users of tokenized apps) may be incentivized by developers through revenue sharing contracts. So, while developers earn 100% of the datacoin revenue, they may choose to share it with data creators, as illustrated in figures 16 and 17. The ⅔ trust model used to secure the data increases transparency: data creators have visibility into how (and if) their data is sold, the number of times it is sold and accessed, and so on. This transparency helps in creating reasonable revenue sharing contracts where developers want to incentivize users to create more valuable data.

Fig. 17 — In the zero-fee p2p compute model, developers share the datacoin revenue with STORE miners

How do datacoins compare with ERC20s?

Datacoins have some similarities to ERC20 tokens, but they are different in many ways.

Similarities:

  1. Both are layer 2 tokens.
  2. Each tokenized app issues its own datacoins similar to each project on Ethereum issuing its own ERC20 token.

Differences:

  1. On STORE, every valuable piece of data will be secured by a private key. Data can then be traded when the buyer purchases access to the private key. Datacoins are the private keys for the data created by tokenized apps and therefore, they are backed by the data. On the other hand, ERC20 tokens are purely speculative.
  2. Datacoins are mined on a continuous basis as and when the data is created. ERC20 tokens are created with an initial supply at the launch.
  3. Datacoins can be accepted as revenue by STORE miners if they choose to provide zero-fee p2p compute resources to app developers. This gives datacoins the chance to have a monetary “belief” premium. Ethereum miners only accept ETH, its layer-1 currency.
  4. Because datacoins back data, they have intrinsic value, if the data is valuable and in demand. As already mentioned above, ERC20 tokens are purely speculative.

Fig. 18 compares datacoins with ERC20 tokens, stablecoins, and other coin types.

Fig. 18 — Datacoins compared to stablecoins and ERC20 tokens

How does STORE prevent a buyer from selling the data in the gray market?

STORE cannot prevent data reselling because the buyer can copy the data to their local infrastructure, repackage, and resell it. This is very similar to how products are sold in gray markets in other businesses such as electronic items, phones, computers, etc. What STORE guarantees is integrity of data when it is accessed directly on the STORE platform. The data is guaranteed to have passed schema enforcement and other validations and it is guaranteed to be anonymized of any sensitive information. So, if data quality and such guarantees are critical, buyers will stick to legal access to data on the STORE platform.

How is user privacy protected when potentially all types of data can be sold on STORE platform?

All data, including sensitive and private information, can be sold on the STORE platform. So how is user privacy protected? A company’s tokenized apps are required to annotate sensitive and private data before submitting it to the platform for persistence and datacoin mining. The annotated segment is encrypted automatically at rest, so even unauthorized access to the data doesn’t leak sensitive information. When sensitive information is annotated, additional metadata is created in clear text that describes the classification and categorization of the sensitive data. This metadata assists in data discovery and context-sensitive anonymization when the data is eventually accessed. If sensitive information happens to be included in the query executed by a purchase contract, it is automatically anonymized using the information provided in the metadata. So, all queries always return safe data. This allows all types of data to be traded without sacrificing users’ privacy.
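
The annotate-then-anonymize flow can be sketched as below. Everything here is an assumption made for illustration: the record layout, the field names, and the use of a truncated SHA-256 digest as the anonymized value. Encryption at rest is not shown, and an unsalted hash is only a placeholder, not a real anonymization scheme.

```python
import hashlib


def annotate(record: dict, sensitive: dict) -> dict:
    """Attach clear-text metadata naming each sensitive field and its category."""
    unknown = set(sensitive) - set(record)
    if unknown:
        raise KeyError(f"annotated fields not in record: {sorted(unknown)}")
    return {"data": dict(record), "sensitive": dict(sensitive)}


def anonymize(value: str) -> str:
    """Placeholder anonymization: replace the value with a short digest."""
    return "anon-" + hashlib.sha256(value.encode()).hexdigest()[:8]


def query(stored: dict, fields: list) -> dict:
    """Return the requested fields, anonymizing any that were annotated."""
    result = {}
    for name in fields:
        value = stored["data"][name]
        if name in stored["sensitive"]:
            value = anonymize(str(value))
        result[name] = value
    return result


stored = annotate(
    {"email": "alice@example.com", "speed_kmh": 72},
    sensitive={"email": "contact/email"},
)
print(query(stored, ["email", "speed_kmh"]))
```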

If app developers fail to annotate sensitive information in their respective apps, the data created by those apps will not be attested by STORE miners. In other words, while the apps may create and sell their data on STORE, the data will not pass the validation rules for sensitive information and hence will not contain the proof of approval by STORE miners. It is a red flag for data buyers because it clearly implies that the apps haven’t fully followed protocol rules. Instead of taking punitive measures, STORE lets the markets decide if the data created by such apps have any value at all.

What happens if a tokenized app cheats and doesn’t annotate users’ sensitive information?

Cheating can happen in any environment and STORE is no exception. However, STORE’s data discovery APIs also describe if tokenized apps annotate any sensitive information and the schema used to annotate sensitive information. This transparency is what forces app developers to be honest. If the users of a tokenized app know that sensitive and private information is collected, but don’t see that information being annotated, they know that the app is not treating data privacy as it should. The market decides what should happen to the data (and the resulting datacoins) of that tokenized app.

Use cases

Slides 17-21.

[1] In this case, “company” could mean a large enterprise or a single developer.

[2] Don’t take the coin names too seriously! They are used here for illustration purposes only.

[3] We use the terms “mined”, “created”, and “issued” interchangeably. 1 datacoin is created, mined, or issued when 1MB of data is created by a tokenized app.

[4] Source: https://medium.com/0xcert/fungible-vs-non-fungible-tokens-on-the-blockchain-ab4b12e0181a

[5] https://github.com/ethereum/EIPs/blob/master/EIPS/eip-998.md

