Why blockchain will enable widespread use of data marketplaces

Davis Marklin

April 22, 2019 · 4 minutes read

We’ll begin by taking a look at recent trends. Every day, 2.5 exabytes (2.5 billion GB) of data are produced, and as the number of connected machines rises, the rate of data production will keep accelerating.

Collectively, Facebook, Google, Amazon, and Apple know more about you than you’d probably care to imagine. Your data fuels their business models, creating stickiness that makes it very difficult for potential competitors to gain market share. Their $3.17 trillion combined market cap reflects that.

Not to mention that, in 2018, total spending on the US gig economy was $864 billion. And there are no signs of that slowing down.

So, what can we make of these trends?

Well, data are considered valuable because AI and machine learning algorithms use them to provide value that wouldn’t otherwise be possible. With so many possible use cases for AI, demand from startups and researchers for large amounts of high-quality data will only increase.

The good news is: there is a large supply of data. But the problem for many would-be entrepreneurs is that the required data are siloed off in large corporations that have no incentive to share them.

But on the flip side, data producers are providing a valuable resource and aren’t getting paid for it.

With this context in mind, it only makes sense that the next evolutionary stage of the Information Age will be the commoditization of personal data. In order for small players to compete, data need to be more widely accessible.

The problem right now is that we don’t have the proper mechanisms in place to exchange data in a way that A) is both scalable and cost-effective, B) guarantees privacy, and C) guarantees the provenance of the data.

Enter blockchain.

In a traditional marketplace, third parties maintain the infrastructure and charge market participants a fee to use it. However, developing a marketplace infrastructure from scratch can be very expensive, and the cost of maintaining and scaling it is a continuous, ever-increasing expense. This cost has to be passed on to customers in the form of high fees, which can only be reduced if the company A) raises a ton of capital or B) reaches economies of scale.

Because data purchases are high-volume microtransactions, quality, scalability, and low fees are critical to a successful marketplace. A blockchain such as Ethereum offers a globally accessible computer with 24/7 uptime. With economies of scale built in, it’s possible to spin up marketplaces with low infrastructure costs (users would pay very small network fees).* Additionally, since all the code is open source, a wide variety of templates would enable engineers to develop quickly while reducing security risks.
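
To see why low fees matter so much for microtransactions, here is a rough Python sketch of the arithmetic. The prices and fees are hypothetical numbers chosen only for illustration, and `fee_share` is a helper name made up for this example, not part of any marketplace or blockchain API.

```python
from decimal import Decimal


def fee_share(price: Decimal, fee: Decimal) -> Decimal:
    """Fraction of a sale consumed by a flat per-transaction fee."""
    if price <= 0:
        raise ValueError("price must be positive")
    return fee / price


# Hypothetical numbers, chosen only to illustrate the arithmetic.
record_price = Decimal("0.05")  # assumed price of one data record, in dollars
flat_fees = (Decimal("0.30"), Decimal("0.001"))  # assumed high fee vs. small network fee

for fee in flat_fees:
    share = fee_share(record_price, fee)
    print(f"flat fee ${fee} on a ${record_price} sale: {share:.0%} of the price")
```

When the fee is larger than the item being sold, the sale simply never happens, which is why a data marketplace only works if the per-transaction cost is a small fraction of each purchase.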

Another major issue with centralized data exchanges, especially when the information being transacted is personal data, is the question of who has access. This issue has been in the public eye quite a bit with the recent Facebook data breaches. While Facebook is not technically a data exchange (it keeps the data for itself), the episode shows that centralized entities can do whatever they want with the information they hold.

On a decentralized data exchange, smart contracts govern the transaction terms and decrypt information only for qualified buyers, creating a record of data custody that’s entirely in the owner’s control. This gives privacy-conscious individuals an additional layer of comfort that wasn’t previously possible. Fundamental to this concept, however, is the successful implementation of decentralized identities and reputation systems. Without going into too much detail, in order to weed out bad actors, a buyer’s identity would be assigned a reputation score (similar to an Uber driver rating) that tells a data producer whether or not the buyer is trustworthy.
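
As a rough illustration of the idea, the Python sketch below models the gatekeeping logic in memory. It is not a smart contract and runs on no blockchain. The `DataListing` class, the 4.5 threshold, and the plain-text key are all assumptions made up for this example; a real system would keep keys out of contract state and would pull reputation from a decentralized identity system.

```python
from dataclasses import dataclass, field


@dataclass
class DataListing:
    """Toy stand-in for the terms a smart contract might enforce."""
    owner: str
    min_reputation: float  # threshold chosen by the data producer
    decryption_key: str    # placeholder only; real keys would not be stored like this
    custody_log: list = field(default_factory=list)

    def request_access(self, buyer: str, reputation: float):
        """Release the key only to buyers who meet the owner's threshold."""
        approved = reputation >= self.min_reputation
        # Every request is recorded, approved or not, so the owner can audit custody.
        self.custody_log.append((buyer, approved))
        return self.decryption_key if approved else None


# Hypothetical participants and scores.
listing = DataListing(owner="alice", min_reputation=4.5, decryption_key="k-123")
print(listing.request_access("trusted-lab", 4.8))    # meets the threshold
print(listing.request_access("unknown-buyer", 3.1))  # falls below it
print(listing.custody_log)
```

The point of the sketch is the shape of the rule: the owner sets the terms once, and every access attempt leaves a trace the owner can inspect.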

Perhaps the most important issue for buyers is the ability to ensure that data producers are, in fact, selling the data they claim to be selling. In the current model, a buyer can’t tell whether the data being purchased are incomplete or inaccurate. But by linking blockchains directly to digital identities and IoT devices, it’s possible to guarantee the provenance of data right from the source. This makes algorithms built from purchased data more reliable.
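
One way to picture provenance "right from the source" is a device that tags each reading so a buyer can detect tampering. The Python sketch below uses an HMAC from the standard library as a simplified stand-in. That is an assumption for brevity: HMAC relies on a shared secret, whereas a real deployment would more likely use public-key signatures tied to the device’s digital identity, with hashes anchored on-chain. The function names and the sample reading are hypothetical.

```python
import hashlib
import hmac
import json


def sign_reading(device_key: bytes, reading: dict) -> str:
    """Tag a reading with an HMAC-SHA256 over its canonical JSON form."""
    payload = json.dumps(reading, sort_keys=True).encode("utf-8")
    return hmac.new(device_key, payload, hashlib.sha256).hexdigest()


def verify_reading(device_key: bytes, reading: dict, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign_reading(device_key, reading), tag)


# Hypothetical device key and sensor reading.
key = b"demo-device-key"
reading = {"device": "thermo-01", "temp_c": 21.4, "ts": 1555891200}
tag = sign_reading(key, reading)

print(verify_reading(key, reading, tag))   # untouched reading
tampered = dict(reading, temp_c=35.0)
print(verify_reading(key, tampered, tag))  # altered after signing
```

Note that a check like this shows only that the data were not altered after leaving the device. It says nothing about whether the sensor itself was accurate.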

With such a high value placed on data, blockchains offer an infrastructure that can make transactions economically feasible while guaranteeing privacy and accuracy.

*The Ethereum blockchain is not yet scalable, but upcoming upgrades are expected to bring higher transaction throughput.