A data marketplace is a type of data exchange where data products are made available for consumption. People (and systems) can browse data marketplaces to access and consume a wide variety of datasets.
Marketplaces serve a variety of use cases. They can operate solely as internal data sharing platforms, or be used for commercialization and monetization. Some marketplaces specialize in one type of data, or data from a specific industry. Others are less specific and have a wide variety of data products and services available.
Companies can build their own first-party marketplace platforms, sell data products on existing third-party commercial marketplaces, or pay for a data fulfillment platform like Revelate to manufacture, package, fulfill, commercialize, and distribute data products. There’s no “one size fits all” method of gaining business value from data.
Marketplace Roles and Personas
There are three personas involved with a data marketplace:
- Data provider: The organization with the data to be shared or sold as a product in the marketplace.
- Marketplace provider: The organization that’s hosting the marketplace (in most cases, this is the same organization as the data provider).
- Data consumer: The person, system, or organization that consumes the data product from the marketplace.
We’ll be using this terminology throughout the guide, so be sure to understand each of these personas.
Data Marketplace Benefits and Uses
Data marketplaces provide a variety of benefits for a data-driven business, such as:
- Revenue generation: Selling commercialized data products or monetizing a new aspect of existing products and services.
- Expanding data sales: Marketplaces democratize access to consumable data, allowing more people to acquire or sell a variety of data products.
- Decreasing time to decision: Fast and easy access to internal and external data means less time spent waiting for information to drive a decision.
- Improving reporting: Marketplaces use automation to improve data quality, increase data consumability, and replace any middlemen between a data consumer and provider, making it easier for automated and manual reports to be built.
- Supporting data science: Many data analysts and engineers spend copious time and energy building and maintaining data pipelines. Marketplaces see to that, enabling more time for high-value work like analysis and reporting.
- Operational improvements with automation: Marketplaces create a need for consumable data products, which require a scalable solution for manufacturing and packaging. The result is less frequent manual intervention, fewer mistakes, fewer data request tickets, and happier consumers.
- Building ecosystems with data: A functional marketplace drives demand for consumability across all data sources and destinations, leading to more functionality and efficiency across the entire data ecosystem.
Companies benefit across the board from having a data marketplace because a marketplace creates technological and cultural expectations across people and systems. With a marketplace in place, data consumers become less tolerant of delays, bottlenecks, requests for access, and data with low consumability. As more people use the marketplace, expectations for ingestion, security, and automatability only increase. Eventually (and sometimes quickly), data-driven cultures are born.
Core Outcomes of a Data Marketplace
Data marketplaces offer businesses four core outcomes:
- Grow revenue: Companies that leverage data are faster to market, more able to innovate, and have more to sell than their peers.
- Reduce risk: As trust in and use of data grow, companies become more capable of managing data and therefore more sensitive to the risk of misuse or improper exposure.
- Reduce cost: Because marketplaces streamline business processes and require automation, costs inevitably decrease due to less manual work and fewer data silos.
- Accelerate time to value: When data is available across an organization, teams can respond quickly to market conditions, customer demands, and other factors.
Grow Revenue
Marketplaces are inherently concerned with data monetization. Whether data is used for internal R&D purposes or productized solely for public commercialization, marketplaces are intrinsically built to support data monetization, even if no money is exchanged for a data product.
Marketplaces inevitably lead to a growth in revenue. In bull markets, companies buy data and use it to improve products and services, create more value for customers, or bring innovations to the market. In bear markets, companies buy data to understand market conditions to protect themselves. Buying and selling data are relatively low-risk investments that help companies hedge in times of uncertainty or illuminate new opportunities in times of growth.
In other words: once the right systems and cultures are in place, there’s never a bad time to buy or sell data.
Reduce Risk
Gartner correctly highlights that data analytics leaders are confident in data driving growth and acceleration, but face “emotional resistance” around “fear of data misuse.” The fear of risk and mistrust can be allayed on both sides of the data marketplace. Data providers can build automation with built-in compliance checks and access control. Data consumers can access the data they want at a (usually) reasonable cost and at low risk. Marketplace providers can even expose data lineage, ensure data quality, and broker a successful transaction.
Reduce Cost
Data providers can leverage the technical expectations and requirements of the marketplace to automate and streamline data product manufacturing, packaging, and fulfillment. This removes the need for IT teams and high-value data experts to build and maintain data pipelines and fulfill low-level data requests.
Data consumers can buy or consume data products from marketplaces, obtaining the data they want from the source they want without the cost of acquiring or maintaining that data. Marketplaces can offer low-cost, low-risk data products that can help a consumer decide whether they want to take a specific path with a dataset or not.
Accelerate Time to Value
Marketplaces remove many of the roadblocks that prevent data-driven value creation. In infrastructural situations, it could be that high-value data is locked up in another team’s infrastructure or another line of business. In administrative situations, it could be that users can’t get approval to access or transfer that data. In technical situations, it could be that the data isn’t formatted in a way that’s digestible to the person—or the marketplace platform—that needs it.
Whether on the side of the data provider or the data consumer, there are dozens of ways data-driven initiatives can be stalled, stunted, or stopped.
Marketplaces can provide self-serve access to data (perhaps even with manual approvals) with IAM integrations providing data visibility via role-based access controls (RBAC). The more quickly the right data gets into the right hands, the faster a company can gain value from it.
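As a rough illustration of self-serve access gated by roles, the sketch below decides whether a request is granted, denied, or routed to a manual approval. It is a minimal sketch, not a real IAM integration: `DataProduct`, `access_decision`, the role names, and the sample products are all hypothetical, and in practice the user's roles would come from the organization's identity provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    # Hypothetical product record; field names are assumptions for illustration.
    name: str
    allowed_roles: frozenset
    requires_approval: bool = False


def access_decision(user_roles, product):
    """Return 'deny', 'pending_approval', or 'grant' for a self-serve request."""
    # RBAC check: the user needs at least one role the product allows.
    if not set(user_roles) & product.allowed_roles:
        return "deny"
    # Some products still need a human sign-off even for permitted roles.
    if product.requires_approval:
        return "pending_approval"
    return "grant"


regional_sales = DataProduct("regional_sales", frozenset({"analyst", "finance"}))
payroll_summary = DataProduct(
    "payroll_summary", frozenset({"finance"}), requires_approval=True
)

print(access_decision({"analyst"}, regional_sales))
print(access_decision({"analyst"}, payroll_summary))
print(access_decision({"finance"}, payroll_summary))
```

Keeping the decision in one small function makes it easy to log every outcome, which helps with the recordkeeping that regulated data often requires.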
Core Features of a Data Marketplace
The most important feature of a data marketplace is the ability to generate and fulfill data orders. Marketplaces host data products that need to be sold and delivered. Providers use the marketplace to package, market, price, and transact their data products. Consumers use the marketplace to discover, purchase, and access data products.
The marketplace itself facilitates these baseline functions and provides the interfaces to administer data transactions. At a high level, these features include:
- Data productization: Create and maintain data products for purchase by data consumers.
- Data discovery: All data products should exist within a searchable catalog that allows consumers to find and preview data.
- Data access: Access to data products and their datasets is secure and controlled through APIs, access control system integrations, or other mechanisms.
- Data licensing: Allow data providers to sell and license their data products to other users by defining specific language around their use and transaction types.
- Data integrations: Allow providers and consumers to easily send and receive data from other sources and platforms.
- Data distribution: Determine data product shipment options for the related target groups for the data product (e.g., direct download or API).
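To make the features above more concrete, here is a minimal sketch of a searchable catalog in which each entry carries its own license terms and delivery options. The `CatalogEntry` fields, the `search_catalog` helper, and the sample products are assumptions made for illustration. A production catalog would more likely sit on a search index than a Python list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    # Hypothetical catalog record; every field name here is an assumption.
    product_id: str
    title: str
    description: str
    tags: tuple
    license_terms: str
    delivery_options: tuple


def search_catalog(entries, query):
    """Case-insensitive keyword match over title, description, and tags."""
    needle = query.lower()
    return [
        entry
        for entry in entries
        if needle in entry.title.lower()
        or needle in entry.description.lower()
        or any(needle in tag.lower() for tag in entry.tags)
    ]


catalog = [
    CatalogEntry(
        "p1",
        "Daily Weather Observations",
        "Hourly station readings.",
        ("weather", "climate"),
        "internal-use-only",
        ("api", "direct_download"),
    ),
    CatalogEntry(
        "p2",
        "Retail Foot Traffic",
        "Weekly store visit counts.",
        ("retail",),
        "no-resale",
        ("sftp",),
    ),
]

for match in search_catalog(catalog, "WEATHER"):
    print(match.product_id, match.license_terms, match.delivery_options)
```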
Types of Data Marketplaces
There are three types of data marketplaces: internal, external, and hybrid. Internal marketplaces serve data products to the employees and trusted contacts within an organization (e.g., contractors, franchisees, advisors). External marketplaces serve data products to people outside of an organization. Hybrid (or multi-layered) marketplaces serve both.
Each type of marketplace differs in its risk profile, architecture, and technical requirements.
Internal Data Marketplace
Internal data marketplaces are typically found in data-rich organizations like enterprises. They are “internal” in that the marketplace is used by a single organization to manage and share first- and third-party data assets internally. Internal marketplaces enable streamlined, low-risk data access to employees, teams, and business units, enabling data-driven decisions and innovations.
Just because a marketplace is internal does not mean it lacks the features one might expect of an external marketplace. Internal marketplaces still require data discovery, access, and management with considerations for data governance and security. However, there are unique benefits to internal platforms.
- Teams can catalog, classify, and make available general-purpose and specialized data assets while also providing controlled access based on internal user roles and permissions.
- There may be integrations with internal analytics and visualization tools, allowing employees to easily explore and analyze before “purchasing.”
Because there is built-in trust within a single organization, there’s greater potential for deriving value from proprietary information, driving innovation and competitive advantages. An internal marketplace removes the concern of exposing proprietary information to third parties.
Consider a large organization with multiple product-driven lines of business, each with its own set of valuable and proprietary data. Cross-functional R&D teams might want to amalgamate data from these products to investigate building new services. Without an internal marketplace, teams may have to deal with ticketing systems, email requests, approvals, oversight, governance, and business justifications for data access and use. An internal data marketplace can streamline every one of these aspects, including recordkeeping for regulatory and compliance concerns.
Internal marketplaces can have built-in governance and licensing using automated configurations and integration with identity management systems. Ownership and usage can be tightly controlled and monitored without long approval chains or emails sitting in the inbox of a person who’s on a two-week vacation.
Considerations for Building and Hosting an Internal Marketplace
Building and hosting an internal data marketplace is a complicated set of short- and long-term tasks. From a technical standpoint:
- Data storage and management: Sophisticated systems need to be in place to store and manage large datasets and to support fast access and complex analysis.
- Data access and security: Protective measures must be in place to ensure only the right people have the right level of access at the right time to requested data. These measures require IAM systems and role-based access control capabilities.
- Data quality: Data needs to be cleaned, validated, and standardized to ensure that it’s accurate and useful.
- Data cataloging: A system needs to be in place to catalog and tag data so it can be searched and discovered by consumers.
- Data integration: Large organizations have data stored across dozens (if not hundreds) of self-hosted and cloud-based storage systems. Wherever possible, integrations and pipelines need to be designed and implemented so data can flow into the data storage infrastructure.
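The data quality step in the list above can be sketched as a small validation pass that runs before a dataset is published. This is only an illustrative assumption of what "cleaned, validated, and standardized" might look like. The `validate_rows` helper and the sample rows are hypothetical, and real pipelines usually rely on a dedicated validation framework.

```python
def validate_rows(rows, required_fields):
    """Split rows into (valid, rejected); rejected rows carry the missing field names."""
    valid, rejected = [], []
    for row in rows:
        # Standardize: trim stray whitespace from string values.
        cleaned = {
            key: value.strip() if isinstance(value, str) else value
            for key, value in row.items()
        }
        # Validate: every required field must be present and non-empty.
        missing = [
            name for name in required_fields if cleaned.get(name) in (None, "")
        ]
        if missing:
            rejected.append((row, missing))
        else:
            valid.append(cleaned)
    return valid, rejected


raw_rows = [
    {"store_id": " 0042 ", "region": "west", "visits": 310},
    {"store_id": "", "region": "east", "visits": 120},
    {"store_id": "0077", "visits": 95},
]

valid, rejected = validate_rows(raw_rows, required_fields=("store_id", "region"))
print(len(valid), "valid;", len(rejected), "rejected")
```

Returning the rejected rows together with the reason, rather than silently dropping them, gives data owners something actionable to fix at the source.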
Additionally, there are several non-technical considerations for hosting an internal data marketplace:
- Data governance: Clear policies and procedures need to be established to govern data usage, sharing, and access.
- Data ownership: Someone must determine who owns what data and is responsible for its accuracy, quality, and security. This information must also be easily referenceable for specific access or troubleshooting needs.
- Data monetization: Decisions will need to be made around how data will be priced and who will benefit from its use, even internally. Large enterprises often have chargeback mechanisms in place for resource sharing.
- Data culture: A culture of data-driven decision making will need to be fostered within the organization to encourage the use of the data marketplace.
- Change management: The implementation of an internal data marketplace will likely involve significant changes to processes, roles, and responsibilities, and will require strong change management to ensure successful adoption.
External Data Marketplace
Whereas an internal data marketplace serves data products to consumers within a single organization, an external data marketplace serves other organizations and individuals regardless of affiliation with the marketplace provider. External marketplaces allow data consumers to access a wide range of data sets from one or more data providers. Whereas internal marketplaces largely serve trusted data consumers, external marketplaces may have no reason to trust their data consumers.
Because of this difference, the risks inherent to data products are quite different. Data procured by an internal resource on an internal marketplace is highly unlikely to yield a major, uncorrectable problem. On an external marketplace, however, both the data provider and the marketplace provider (which may or may not be the same organization) must ensure that the data is safe to share and does not violate any regulatory or compliance requirements.
External marketplaces need to provide the expected tools and services for data discovery, access, and management, with the added complexity of licensing and security. Because any public organization or individual can be a data consumer, there needs to be greater scrutiny and transparency about data sourcing, licensing, costs, and access on both sides of the transaction. It’s more likely for data to be misused via an external marketplace since the data provider may be unable to enforce the license of its own data products.
Data consumer expectations may also be higher in external data marketplaces. Internal and homegrown system users can be more tolerant of bugs, slow data transfers, and manual transaction approvals. External marketplace users, however, may be less forgiving and expect:
- Service-level agreements for transaction times and network response
- Data product consumability options (e.g., SFTP, API, and direct download)
- Product description accuracy
Other considerations involve the licensing and proper usage of a data product (e.g., ethical and legal uses of data). A data consumer may be bound to local governance or restrictions around the purchase or usage of a data product. The external marketplace needs to offer administrative tooling to enforce those constraints. Without such controls or oversight, there’s risk that data will be used irresponsibly or unethically.
Risk aside, there are plenty of justifiable reasons to establish an external data marketplace. Data providers with significant amounts of proprietary data can build multimillion-dollar revenue streams by selling data products. These can include data from:
- Stock markets: Historical and live data, including future projections
- Fossil fuels
The need and appetite for data is only going to increase in the future. Enterprises with high data sophistication, infrastructural expertise, and product development experience can certainly benefit from an external data marketplace.
Considerations for Building and Hosting External Data Marketplaces
Selling data products to external parties requires a financial transaction. Data consumers trust sites that are secure, fast, easy to use, and provide a positive user experience.
Providing a marketplace means implementing and managing:
- Financial transaction tools
- Order management tools
- Customer support capabilities (e.g., call centers, automated support)
- The ability to accept credit cards (and potential PCI compliance)
- Legal terms and conditions, privacy policies
The differences between internal and external data marketplaces are considerable.
Architecturally, an external marketplace is exposed to the public internet. Depending on the sensitivity of data products on the platform, varying degrees of security, scalability, quality assurance, and functionality testing will be required. Technical and non-technical platform requirements may also be influenced by regulatory requirements (e.g., HIPAA, GDPR) or data provider expectations.
Technical considerations are similar to those of an internal marketplace, but differ in the following ways:
- Data integration: External data marketplaces often need to handle a greater volume and variety of data sources, and therefore may require more robust data integration capabilities.
- Data security: External data marketplaces need to ensure the security of data from a wide range of sources, which may require additional security measures such as encryption, tokenization, and anonymization.
- Data sharing: External data marketplaces may need to enable data sharing with a wide range of external partners, which may require additional data sharing and collaboration features.
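Of the security measures mentioned above, tokenization is the easiest to sketch. The example below replaces a sensitive field with a keyed hash (HMAC-SHA256 from Python's standard library), so equal values still map to equal tokens and joins keep working. It is a simplified assumption of how tokenization might be applied. The `tokenize` and `pseudonymize` helpers, the key, and the sample rows are hypothetical. Keyed hashing is pseudonymization, not full anonymization, and a real deployment would also need key management.

```python
import hashlib
import hmac


def tokenize(value, secret_key):
    """Replace a sensitive string with a deterministic keyed token."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymize(rows, sensitive_fields, secret_key):
    """Return copies of rows with the sensitive fields tokenized."""
    return [
        {
            key: tokenize(str(value), secret_key) if key in sensitive_fields else value
            for key, value in row.items()
        }
        for row in rows
    ]


# Demo key only; a real key would come from a secrets manager.
DEMO_KEY = b"demo-key-not-for-production"

customers = [
    {"email": "ana@example.com", "plan": "pro"},
    {"email": "ana@example.com", "plan": "basic"},
]

safe_rows = pseudonymize(customers, {"email"}, DEMO_KEY)
# The same input yields the same token, so records remain linkable.
print(safe_rows[0]["email"] == safe_rows[1]["email"])
```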
Non-technical considerations are also similar, but differ in these ways:
- Data governance: External data marketplaces need to ensure compliance with a wide range of data regulations, such as GDPR and CCPA.
- Data monetization: External data marketplaces typically monetize data through subscriptions, licensing, or pay-per-use models, which may require different pricing and billing systems.
- Data ownership: External data marketplaces need to clearly establish data ownership and rights, and may need to handle disputes over data ownership.
- Data culture: External data marketplaces may need to build trust with a wide range of external partners, which may require a culture of transparency and open communication.
- Legal considerations: External data marketplaces need to be aware of legal responsibilities and liabilities that arise from managing third-party data.
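The three monetization models named above (subscription, licensing, and pay-per-use) lead to different billing arithmetic. The sketch below shows one simplified way to price an order under each. The `price_order` function, the model names, and the sample prices are assumptions for illustration, and real billing systems also handle taxes, tiers, and proration.

```python
from decimal import Decimal


def price_order(model, unit_price, *, months=0, units=0):
    """Compute an order total under a simplified pricing model."""
    if model == "subscription":
        # Recurring fee charged per month of access.
        return unit_price * months
    if model == "pay_per_use":
        # Metered fee charged per unit consumed (e.g., per API call).
        return unit_price * units
    if model == "license":
        # One-time flat fee for the licensed data product.
        return unit_price
    raise ValueError(f"unknown pricing model: {model}")


print(price_order("subscription", Decimal("500"), months=12))
print(price_order("pay_per_use", Decimal("0.002"), units=150_000))
print(price_order("license", Decimal("25000")))
```

`Decimal` is used instead of `float` so that currency amounts are not affected by binary rounding.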
Overall, external data marketplaces need to be more robust, secure, and compliant than internal data marketplaces, and they may also require additional features and functionality to support data sharing and collaboration with external partners.
Hybrid and Multi-Layered Data Marketplace
A hybrid data marketplace combines the considerations and requirements of both internal and external marketplaces. It allows organizations to manage and share data assets internally while also providing controlled data product access to external parties. For example, a large data set may be available for internal usage and a subset of that data may be available for external sale with a different license.
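The internal-versus-external example can be sketched as a single dataset projected through per-audience rules. This is a minimal sketch under stated assumptions: `AUDIENCE_POLICY`, `build_view`, the column names, and the license labels are all hypothetical.

```python
# Hypothetical policy: which columns and which license each audience receives.
AUDIENCE_POLICY = {
    "internal": {
        "columns": ("store_id", "region", "visits", "revenue"),
        "license": "internal-use",
    },
    "external": {
        "columns": ("region", "visits"),
        "license": "commercial-standard",
    },
}


def build_view(rows, audience, policy=AUDIENCE_POLICY):
    """Project the dataset down to the columns permitted for one audience."""
    rule = policy[audience]  # Unknown audiences raise KeyError, so access fails closed.
    return {
        "license": rule["license"],
        "rows": [{column: row[column] for column in rule["columns"]} for row in rows],
    }


dataset = [
    {"store_id": "0042", "region": "west", "visits": 310, "revenue": 18250},
    {"store_id": "0077", "region": "east", "visits": 95, "revenue": 4100},
]

external_view = build_view(dataset, "external")
print(external_view["license"], external_view["rows"][0])
```

Listing the permitted columns explicitly, rather than listing the columns to hide, means a newly added internal column is not exposed externally by default.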
A multi-layered data marketplace provides access to multiple “levels” or “layers” of data. This includes raw, processed, or derived data that’s been aggregated, transformed, or enriched in some way so it can be offered externally. Multi-layered data products serve different types of data consumers. As such, there may be different permissions and security rules for each consumer.
Because hybrid and multi-layered data marketplaces simultaneously serve both internal and external customers, they are the most complex of all the marketplace architectures. They differ from the other types less in infrastructure than in operations. In this model, expect more role-based access considerations and even more security policy concerns.
An errant configuration could accidentally expose internal data to the external marketplace. Depending on the severity of the incident, this could bring down the entire marketplace and severely tarnish the marketplace provider’s reputation.
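One way to reduce this risk is an automated check that runs before any configuration change is published. The sketch below flags products classified as internal that are also listed for the external audience. The `find_exposed_products` helper and the `classification` and `audiences` keys are assumptions about how such a configuration might be structured.

```python
def find_exposed_products(products):
    """Return names of internal-only products that are published externally."""
    return [
        product["name"]
        for product in products
        if product["classification"] == "internal"
        and "external" in product["audiences"]
    ]


# Hypothetical marketplace configuration.
marketplace_config = [
    {"name": "regional_visits", "classification": "public", "audiences": ["internal", "external"]},
    {"name": "payroll_summary", "classification": "internal", "audiences": ["internal"]},
    {"name": "store_revenue", "classification": "internal", "audiences": ["internal", "external"]},
]

exposed = find_exposed_products(marketplace_config)
if exposed:
    # In a deployment pipeline, this is where the release would be blocked.
    print("Blocked: internal products listed externally:", exposed)
```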
Considerations for Building and Maintaining Hybrid or Multi-Layered Data Marketplaces
The product, data, and operational sophistication required to run hybrid or multi-layered marketplaces is very high. Because these marketplaces are essentially a combination of internal and external marketplaces, all of the considerations for each must be taken into account.
There are some complex use cases in which there are overlapping considerations for internal and external users. For example, there may be a data product that is configured one way for internal users and another for external users. Supporting these types of data products is difficult and particularly risky, depending on the type of data within the dataset.
Hybrid and multi-layered marketplaces are especially difficult to build. A well-built platform should be able to host every type of marketplace: internal, external, hybrid, and multi-layered.