
Exploring the Big Data Marketplace: Buying, Selling, and Sharing Big Data

Revelate


The data market as a whole has expanded rapidly in recent years. Organizations still purchase individual datasets from data marketplaces, but access to a continuous stream of real-time data is becoming increasingly attractive.

However, with more and more data being produced every year, it’s natural that the strain on data supply chains is also increasing. Keeping the collection, governance, and integration of data a holistic and controlled process is difficult when you’re dealing with zettabytes of data, and it remains one of the major challenges organizations face when handling internal and external data.

But as the flow of data continues to grow, so do the opportunities it presents for organizations of all sizes, industries, sectors, and regions. The use cases for data are as endless as the data itself, and being able to harness the insights hidden in data is of unquantifiable value for organizations.

Research from Accenture estimates that by 2030, over 1 million organizations worldwide will participate in data monetization, resulting in more than $3.6 trillion in value generated.

Organizations that can successfully create and manage large amounts of external data can also augment their internal data with it, creating new and improved datasets that can be sold, shared, or exchanged with other organizations, potentially opening up a new revenue stream.

In this article, we look at the growing volume of data moving through data marketplaces and offer insight into the marketplaces themselves.

Let’s get started.

Is There a Difference Between a Data Marketplace and a Big Data Marketplace?

The platforms that currently exist to purchase, sell, share, or exchange data are called data marketplaces. A data marketplace is an accessible platform that allows organizations to place their datasets in front of the right target audience, with data transfer being handled by the marketplace owner. Data marketplaces can be B2B, B2C, or both, but the ones that we’ll discuss in this article will be B2B focused.

From a high-level point of view, there is no difference between a data marketplace and a big data marketplace. If we qualify a “big data marketplace” as one that provides access to the widest variety of data products, then there are a couple of possibilities:

  • Open data marketplaces, such as those run by governments (data.gov being one example), could potentially qualify as big data marketplaces simply because of the sheer number of data products available and the concerted effort behind them to connect as many data providers and consumers as possible in the name of the public good.
  • Another potential big data marketplace example would be an IoT data marketplace. This type of marketplace provides aggregated third-party data from devices like smartphones, connected vehicles, smart home devices, and more. Typically, data consumers utilize real-time data streaming to retrieve data from IoT marketplaces.

What sets the majority of data marketplaces apart, however, is their approach to handling increasing amounts, complexity, and sizes of datasets.

The majority of current data marketplaces have the following challenges when it comes to effective data distribution:

  • Display of dataset information is limited. In the name of standardization and keeping listing size consistent, data providers are often limited to brief descriptions of their datasets. Sometimes they are able to provide query examples as well, but data consumers are often left having to download samples, reach out to the data provider for more information, and otherwise do further investigation to determine whether the dataset is worth their time.
  • The length and complexity of the sales process are increased by display limitations. For the data provider, more time and effort are required to sell a dataset if the data consumer doesn’t completely understand what they are getting and needs more information. For the data consumer, it’s frustrating trying to understand datasets based on a small description and list of non-contextual query examples.
  • The data consumer incurs data transfer costs for investigation. Data transfer costs still apply while the consumer downloads sample data for evaluation, so the consumer pays these costs while also bearing the risk that the dataset turns out to be unusable for their purpose.
  • Data formats are often proprietary. Many data marketplaces are closed ecosystems, requiring data providers to load data in a specific, proprietary format, forcing providers to replicate data for different clouds and regions for consumers. This increases compute and operational costs for the provider, especially with more and bigger datasets.

Addressing these challenges has given rise to evolved data marketplaces that are able to handle the increasing complexity of datasets and reduce the friction between data consumers and providers when it comes to displaying and purchasing relevant datasets.

Revelate, for instance, provides a centralized data marketplace platform that simplifies the cataloging, segmentation, and marketing of data products. Large datasets can be effectively consolidated, ingested, and aggregated to provide data products that consumers can easily access and, more importantly, understand. In other words, the provider is able to curate the data purchasing, sharing, or exchange experience for the data consumer, making the end-to-end experience of getting a data product easier and more efficient. This reduces friction between the data provider and consumer and streamlines the sales process.

Simplify Data Fulfillment with Revelate

Revelate provides a suite of capabilities for data sharing and data commercialization for our customers to fully realize the value of their data. Harness the power of your data today!

Get Started

How Can Organizations Handle Increasing Amounts of Data?

As organizations are handling an increasing amount of data, the issue of effective data collection, integration, and governance comes into play. Data lakes and data warehouses are still commonly used to store and organize datasets into centralized locations, with these data repositories being relied on by organizations as a “single source of truth”. However, as data grows in size and complexity (i.e. “big data”), more organizations rely on data lakes, as they are more flexible with handling different types of data and multiple use cases. You still get the centralized data repository, and data lakes are generally better at handling big data.

However, the increasing complexity of big data often exposes issues with effective data management and compliance while still trying to maintain a centralized data repository. And of course, as the complexity of the data landscape grows, ensuring that this centralized repository of data continues to meet the needs of an agile and growing organization is paramount.

To handle large amounts of data effectively, organizations need to democratize not only access to data but also its management.

This concept, sometimes referred to as data mesh, holds that data ownership and management should be handled by the appropriate departments and business units while being backed by a centralized, self-service data infrastructure.

For most organizations, this requires a variety of data governance roles and responsibilities, including:

Data Governance Role: Responsibilities
Data owner (an individual or team that makes decisions about who has the right to access certain data and how they may use it)
  • Oversees and protects a data domain (e.g., the VP of human resources would be responsible for employee data)
  • Establishes policies that define the appropriate use of the data that they oversee
  • Ensures that the appropriate security and access measures are in place to protect data from unauthorized access and modification
Data steward (an individual or team that collects and maintains data for the organization via databases, data lakes, data warehouses, etc.)
  • Controls the quality of data that an organization gathers
  • Carries out data usage and security policies as set out by the organization’s data governance initiatives
  • Acts as a liaison between IT and the business side of an organization
Data custodian (an individual or team that has technical control over certain data)
  • Maintains the technical environment and database structure needed for data storage and transfer
  • Resolves issues with data quality in partnership with data stewards
  • Ensures that the integrity of data is maintained during processing
Data user (an individual who accesses and uses data)
  • Complies with the rules and regulations imposed on the data by the data owner, the organization, and governing bodies
  • Takes reasonable steps to prevent sensitive or confidential information from being leaked (e.g., uses MFA and/or single sign-on, complies with organizational security measures, etc.)
  • Ensures they are only using the data for its intended purpose as proposed by the data owner
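
As a rough illustration of how the data owner and data user roles interact, the Python sketch below models an owner's access policy and checks whether a request matches an approved team and purpose. The names used here (`AccessPolicy`, `allowed_teams`, `allowed_purposes`, `can_access`) and the example values are hypothetical and are not drawn from any particular governance tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical policy a data owner sets for one data domain."""
    domain: str
    allowed_teams: frozenset = field(default_factory=frozenset)
    allowed_purposes: frozenset = field(default_factory=frozenset)


def can_access(policy: AccessPolicy, team: str, purpose: str) -> bool:
    """Return True only if both the team and the stated purpose are approved."""
    return team in policy.allowed_teams and purpose in policy.allowed_purposes


# Illustrative values only: the owner of employee data approves one team
# and one intended purpose.
employee_policy = AccessPolicy(
    domain="employee_data",
    allowed_teams=frozenset({"hr_analytics"}),
    allowed_purposes=frozenset({"payroll_reporting"}),
)
```

In a setup like this, a data steward or custodian would be the one applying `can_access` before granting a request, while the data owner remains responsible for the contents of the policy.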

Data-Market Strategy

An article from McKinsey about data-market strategy discusses the very real issues that organizations face with regard to extracting value from big data, including how they invest in the technology needed to support business goals. The example provided is General Electric, which invested $1 billion in 2016 to build the technological infrastructure needed to analyze the data from sensors on gas turbines, jet engines, oil pipelines, and more.

To get this right, GE needed to invest in:

  • Building an effective cloud computing solution that could combine internal data with external and send those datasets to analytics software for analysis
  • Retraining tens of thousands of salespeople and support staff
  • Hiring thousands of data scientists and engineers
  • Shifting GE’s business model from product sales with service licenses to outcomes-based subscription pricing

This is a high-level overview of a data-market strategy from a large enterprise, but the lesson holds: the infrastructure should be in place before your organization goes looking for data sources on a big data marketplace.

Once the infrastructure is in place and tested, the next step is choosing a data marketplace platform, which requires understanding its characteristics.

Characteristics of a Data Marketplace Platform


The conversation surrounding the most effective way to move data around while still maintaining organizational security and regulatory compliance usually lands back on data marketplaces, even with their challenges. Understanding the characteristics of a data marketplace will help you determine how it can help support your organization’s data-market strategy.

1. Data Discovery

To make data discovery possible, marketplaces implement data catalog tools and benchmark licensing conditions that data providers need to follow when creating their own licensing conditions for datasets. Standardized data categories organize datasets according to different parameters, such as industry, use case, data type, and more. As we mentioned before, different marketplaces display data products differently and offer different search functionality, much as Amazon’s website differs from Walmart’s. Depending on the platform, finding or displaying data products may be more or less effective for your organization.
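
To make the idea of standardized categories concrete, here is a minimal Python sketch of a catalog search that filters listings by those parameters. The `Listing` fields, the `search` helper, and the three sample entries are invented for illustration and do not reflect any specific marketplace's schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    """Hypothetical catalog entry; fields mirror the categories named above."""
    name: str
    industry: str
    use_case: str
    data_type: str


def search(catalog, **filters):
    """Return listings whose fields equal every supplied filter value."""
    return [
        item for item in catalog
        if all(getattr(item, key) == value for key, value in filters.items())
    ]


# Made-up sample listings, for illustration only.
catalog = [
    Listing("Store foot traffic", "retail", "site selection", "time series"),
    Listing("Fleet telemetry", "automotive", "predictive maintenance", "stream"),
    Listing("Basket contents", "retail", "demand forecasting", "tabular"),
]

# Filter on one category, or combine several to narrow the results.
retail = search(catalog, industry="retail")
retail_tabular = search(catalog, industry="retail", data_type="tabular")
```

Exact-match filtering is the simplest possible design; real marketplaces typically layer free-text search and richer metadata on top of category filters like these.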

2. Compare, Sample, and Review Data

Data marketplaces give consumers the ability to browse a variety of datasets, allowing them to compare the features, prices, and contents (as described in the metadata) of different products. Either through a self-service option or by contacting the data provider directly, the consumer can obtain a sample of the data to determine if it will be useful to them.

3. Purchase Data

The buyer’s journey for a dataset can vary widely between marketplaces. Depending on licensing requirements, different verification methods may be employed, from getting the data consumer to fill out a form with contact information and outlining how the data will be used to requiring the consumer to be signed up with the marketplace and verified in advance of purchase. Further, multiple back-and-forths may be required between provider and consumer to determine the efficacy of the dataset for the consumer’s use case, lengthening the sales cycle.

4. Integration

Once data is purchased, it must be integrated into the organization’s data pipeline. It is up to the data consumer to ensure that the relevant technologies and governance are in place and that the data reaches its intended destination after purchase. Costs associated with data transfer are usually the responsibility of the consumer.

What Types of Data Can You Buy on a Big Data Marketplace?

As we mentioned earlier in the article, a big data marketplace could be classified as an open data marketplace, which governments or overarching industry organizations typically run in an attempt to create an accessible data ecosystem that connects data from different organizations together, usually in the name of the public good.

The types of data you can purchase on a big data marketplace vary, but they are usually real-time data streams from IoT devices or a massive variety of datasets from various industries, organizations, countries, etc.

For instance, the CPP data marketplace, a self-proclaimed big data marketplace that is backed by funding from the European Union’s Horizon 2020 research and innovation program, provides data products from more than 200 sensor signals in IoT devices—such as vehicles and smart devices in homes and buildings.

The purpose of providing this non-brand-specific data, according to the CPP data marketplace itself, is the following:

  1. Provide a standardized, cross-industrial data model that is flexible enough to provide data from various industrial sectors while at the same time being able to evolve to meet the future needs of data providers and consumers.
  2. Provide a “one-stop-shop” for data streams from multiple mass products, including a big data analytics toolbox to help data providers deal with large datasets.
  3. Provide access to cross-industrial data streams that encourage data consumers to create new and innovative products and business ideas.

On the other hand, data.gov provides more than 335,000 datasets from all industries, business sectors, and organizations worldwide. The types of data you can access on data.gov range from nautical charts to monthly house price indexes, healthcare provider data, and manufacturing and trade inventories and sales, and the list goes on.

Run by the US government, the aim of data.gov is to support a more transparent government by providing open federal data and encouraging organizations and industries to provide open data as well. The idea is that when data is readily available and accessible, citizens feel more empowered to trust their government, but it also supports the development of innovative new products, medicines, services, and much more.

4 Types of Business Data Marketplaces

When it comes to buying, selling, exchanging, or sharing data in a B2B environment, organizations have different preferences and needs with regard to whom they want to make their data available. With some types of data, laws and regulations may also dictate availability rules, so organizations have to find marketplaces that will accommodate their needs. This has given rise to a variety of business data marketplace offerings, including public, private, hybrid, and aggregated.

 

Type of Business Data Marketplace: Description
  • Public: Accessible via a publicly available link
  • Private: Only accessible by certain parties via a private link
  • Hybrid: Some parts of the marketplace are available via a public link, while other parts are only available via a private link
  • Aggregated: This type of business data marketplace collects data from multiple sources (e.g., IoT data) and then sells the data to data consumers

 

B2B Data Marketplaces

There is a ton of opportunity for businesses to use data from other businesses to improve their operations, innovate on existing products or develop new ones, and much more. This has given rise to B2B marketplaces, which specialize in providing a data ecosystem where businesses can buy, sell, share, and exchange data with each other.

B2B Data Marketplace: Specialization
  • OpenPrise Data Marketplace: Aggregates data from leading B2B and B2C data providers like Salesforce, Marketo, Pardot, HubSpot, and more and sells that data to organizations.
  • Crunchbase Marketplace: Provides access to public third-party data from a variety of big organizations (e.g., LexisNexis, Semrush, Hoover’s) through an app-like store with a list of companies. Data that can be retrieved includes web traffic stats, app install metrics, IT purchasing data, trending product usage, and more.
  • Dataguru.in: A B2B contact database that provides prospecting data for specific audiences that organizations can use for outreach.
  • The DX Network: Utilizes the semantic web stack to allow the real-time exchange of structured data.
  • Informatica B2B Data Exchange: Provides the tools required for organizations to build automated internal and external data exchange networks with partners, suppliers, distributors, and more.

 

Conclusion


Data is such an important commodity in today’s world that organizations increasingly want access to all sorts of data. Ensuring that your organization has the correct technology and processes in place to handle vast amounts of data can be challenging, but it’s often a more than worthwhile investment.

While there isn’t technically a difference between a regular data marketplace and a big data marketplace, some types of marketplaces could be thought of as supplying big data. Open data marketplaces provide access to a wide variety of data from various industries, organizations, and even government sources, while IoT data marketplaces provide real-time data streaming from sensors located in smartphones, connected vehicles, smart home devices, and more.

But just because there are data marketplaces out there that can handle the demands of big data doesn’t mean that data marketplaces in and of themselves don’t have challenges. The challenges that we discussed in the article include:

  1. Display of dataset information
  2. The length and complexity of the sales process
  3. Data transfer costs are incurred by the data consumer during investigation
  4. Data formats are often proprietary

Revelate, as a data fulfillment platform, aims to address the challenges of data marketplaces via a fully customizable, white-label data web store. With Revelate, you can democratize data access within your organization or provide datasets for sale to your target audience.

Ready to meet your data fulfillment goals? Get Started.
