Managing A Data Marketplace: A Quick Reference Guide

Revelate

For some companies, managing their data marketplace is a spare-time responsibility for one or two people. For others, it’s a full-time job for dozens. The level and type of effort required for managing a data marketplace depend on a few factors we’ve already covered:

  • The kind of marketplace: internal, external, hybrid, or multi-layered
  • Whether the data provider chooses to buy or build a marketplace
  • The data provider’s data monetization maturity path
  • The complexity and ability to get data in and out of their data ecosystem
  • The data productization strategy and capabilities of the provider and marketplace platform
  • Data access and the associated user experience
  • The quality of the data, metadata, and the product itself
  • The data discovery capabilities of the marketplace platform
  • The fulfillment capabilities of the provider, platform, and consumer
  • The number of data products listed in the marketplace

Other ancillary factors also come into play, like the provider’s existing relationships with the data consumers or whether there are regulatory compliance issues at play. But for data providers considering a future in which they’re selling or sharing productized data, it’s important to consider the type of marketplace that will host their data products.

Choosing the Right Marketplace

Let’s assume you’ve decided to productize and share or commercialize your data. The type of marketplace you choose depends on the data you want to sell and what you hope to get out of it. You may desire to build relationships with a certain type of customer, or you may want to provide data specifically for internal research and development purposes. Whatever the case, you need to choose the right type of marketplace with the right features to ensure your data products are successful.

Universal Expectations for Marketplace Providers

The marketplace you choose should provide the administrative tools you need to broker, troubleshoot, and manage transactions. This may include features and interfaces for order tracking, payment processing, customer support (for you and your consumers), and visibility into the underlying distribution systems that deliver your data products to your consumers.

Marketplaces should protect intellectual property and the integrity of the data products they host. Therefore, they should feature:

  • Industry-leading security practices to ensure data is safe and protected from misuse.
  • Data licensing and entitlement tracking and configuration to control who can use the data and how.
  • Pipelines and APIs that enable data providers to quickly, easily, and affordably get new data into a data product.
  • Both self-service, automated transactions and manual transaction administration and processing.
  • Marketing and discoverability features for consumers to find the data products they need from the right providers.
  • Timely access to revenue and data generated by sales of data products.
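
To make the licensing and entitlement point above more concrete, here is a minimal sketch of how an entitlement check might look. Everything in it is an assumption made for illustration: the `Entitlement` record, the `is_entitled` helper, the use names, and the sample grant are hypothetical and do not describe any particular marketplace's API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Entitlement:
    """Hypothetical license grant: who may use which product, how, and until when."""
    consumer_id: str
    product_id: str
    allowed_uses: frozenset  # illustrative use names, e.g. {"internal_analytics"}
    expires_on: date


def is_entitled(grants, consumer_id, product_id, use, today):
    """Return True if any unexpired grant covers this consumer, product, and use."""
    return any(
        g.consumer_id == consumer_id
        and g.product_id == product_id
        and use in g.allowed_uses
        and today <= g.expires_on
        for g in grants
    )


# Sample data, invented for this sketch.
grants = [
    Entitlement(
        consumer_id="acme",
        product_id="retail-foot-traffic",
        allowed_uses=frozenset({"internal_analytics"}),
        expires_on=date(2030, 12, 31),
    ),
]
```

In this sketch, a request is allowed only when the consumer, the product, the intended use, and the expiry date all match a grant, which mirrors the idea of controlling both who can use the data and how.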

Marketplace operators must provide helpful tools, the right technologies, and an operating culture that supports effective management of people, costs, and processes. With so much at stake in the sale of a data product, the marketplace shouldn’t be a faceless black box that ingests your data and opaquely presents it to data acquirers.

Data acquirers will rightly expect support as they engage with the marketplace and its products. They need products to be available, discoverable, and usable, which means you’ll also need sufficient support to understand the rules for listing, packaging, and making your products discoverable. Put another way, all of the responsibilities for discoverability and associated rules lie with the marketplace provider.

The same can be said for order processing. It’s up to the marketplace operator to lay the groundwork and provide visibility into how orders get processed within the marketplace. You’ll likely want to know whether orders are shipped to systems you own, whether an automated billing system transacts the purchase, how orders are prioritized, and whether you have any responsibility for fulfilling your products.

Though you may not find it necessary early in your productization journey, at some point you will need to understand how data acquirers are navigating the marketplace, discovering products, assessing value, and making purchasing decisions. It’s up to the marketplace operator to provide visibility into customer usage, and it’s up to you as the data provider to update your product packaging so that it better meets the needs of the existing user base.

Shared Responsibilities of the Data Marketplace

A marketplace establishes a contract between the data provider, marketplace operator, and data consumer (or data acquirer).

Providers and operators share the responsibility of product consumability. First, the operator must ensure products are delivered correctly, consumable, and accurately described. For example, if you ordered a new book from Amazon and half the pages were crumpled, you wouldn’t call the book publisher or the printing company. You’d call Amazon. The marketplace operator assumes the risk of ensuring products arrive to their consumers correctly and on time.

The second level of responsibility for product consumability is with the data provider. Continuing with the Amazon book example above, the marketplace operator must ensure every delivery is performed accurately. But if 50% of the books they ship have crumpled pages, they are likely to step back from their quality and consumability responsibilities to the consumer. If the product is consistently defective, the provider must fix that, not the operator.

The third level of responsibility for consumability rests with the product manager at the data provider. In traditional product management, products built for enterprises, consumers, and specialists differ. They can all live in the same marketplace but need appropriate packaging and metadata. It’s up to the product manager to ensure the product is described so that the consumer knows whether the product suits their level of data maturity.

Amazon and other marketplaces sell highly specialized products designed specifically for experts and professionals. These products go beyond the expectations or understanding of an uninformed, everyday consumer. There’s only so much Amazon can do to market a product to a specialized audience. The same goes for a data product in a marketplace. There are several ways to demonstrate whether a data product is appropriate for a specific type of consumer, like setting expectations on row size or the average byte size of a file. A data set with 15B rows isn’t suited to a consumer with mild SQL capabilities.
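
As a rough illustration of setting size expectations in product metadata, the sketch below publishes a row count and an average file size with a product and compares the row count against a consumer's maturity level. The `ProductProfile` fields, the maturity labels, and the row ceilings are all assumptions invented for this example, not values taken from any marketplace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductProfile:
    """Hypothetical size metadata a provider might publish with a data product."""
    name: str
    row_count: int
    avg_file_bytes: int  # published for the consumer's information; not checked below


# Assumed, illustrative row-count ceilings per consumer maturity level.
MAX_ROWS_BY_MATURITY = {
    "basic_sql": 10_000_000,
    "data_engineering_team": 100_000_000_000,
}


def suits_consumer(product, maturity):
    """Return True if the product's row count fits the assumed ceiling for this level."""
    return product.row_count <= MAX_ROWS_BY_MATURITY[maturity]


# A 15B-row product, echoing the example in the text; the name is made up.
big = ProductProfile("clickstream-full", row_count=15_000_000_000, avg_file_bytes=2_000_000_000)
```

Under these assumed ceilings, `suits_consumer(big, "basic_sql")` is false while `suits_consumer(big, "data_engineering_team")` is true. In practice a product manager would more likely state these expectations in the listing's description than enforce them in code.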

Types of Marketplaces

There are three types of data marketplaces:

  1. Self-Managed: A marketplace you build, maintain, and manage on your own.
  2. Vendor- or Partner-Managed: A marketplace built and maintained by a system integrator or a managed platform like Revelate.
  3. Aggregated: A marketplace built exclusively to sell data from various third-party providers (including you).

Self-Managed

A self-managed data marketplace is designed, built, owned, and operated entirely by a data provider. It is a huge undertaking that can cost millions of dollars in headcount, infrastructure, scalability, and support. This option is only suitable for large enterprises with high data sophistication or highly-specialized companies with experience running, maintaining, and supporting other marketplaces.

There are many advantages and disadvantages to a self-managed marketplace, many of which we covered in the section called “When to Buy and When to Build a Marketplace Platform.” In short, the upsides include:

  • Total and complete ownership
  • Flexibility and customizability in design and implementation
  • Higher margins on product sales
  • Long-term potential for cost savings

The disadvantages may include:

  • Very high costs to build, maintain, support, and scale
  • Slower speed to market or delivery
  • Need to build a customer base
  • Total responsibility for incidents, bugs, downtime, and user complaints
  • Total responsibility for compliance, regulatory, integrations, and relationships
  • If you want it, you have to build it

One particular challenge to consider is the potential uphill political battle in every aspect of the platform’s development. Depending on the externality of the marketplace and the depth of integrations to data sources throughout the ecosystem, there could be a string of red flags raised by security, IT, product management, and other stakeholders who own and maintain siloed data.

Developing a self-managed marketplace can take months to overcome the political challenges, months to design, and years to build and scale, and it requires significant capital investments in headcount, infrastructure, and processes to enable the functionality of the marketplace. However, we have seen some companies accomplish precisely what they set out to do and successfully run self-managed marketplaces. It’s possible, but certainly not cheap, fast, or easy.

Vendor- or Partner-Managed

A vendor- or partner-managed marketplace offers a high degree of flexibility, customization, and capabilities to leverage existing data ecosystems. Many companies want as many self-managed benefits as possible without the tremendous costs and overhead. A managed marketplace can be the perfect solution, offering few trade-offs for minimal costs.

Managed marketplaces are single- or multi-tenant platforms that allow data providers to brand and control much of the user experience. Managed platforms typically come with all of the productization, commercialization, sharing, and fulfillment functionality a provider needs, including transparency into product manufacturing, order fulfillment, and cost tracking. They’re designed and built with provider needs in mind.

Though managed marketplaces don’t have all the upsides of the self-managed marketplace, they do come close:

  • Total ownership of data products, not the platform
  • Flexibility and customizability in design and implementation
  • Shared margins on product sales
  • Fast time to market and delivery
  • Existing customer base
  • No unplanned costs to build, maintain, support or scale the platform
  • Low costs to build, maintain, and support data products
  • No responsibility for platform incidents, bugs, downtime, or user complaints
  • Reduced responsibility for compliance, regulatory, integrations, and relationships
  • Access to new features without any investment in their development
  • Access to the customer and the potential relationship
  • Transparency into and manual administration of many functions
  • Once pipelines are built and productization is working, it’s set-and-forget
  • Reporting and analytics features are likely built-in

The disadvantages may include:

  • Costs for a managed platform may be high
  • Revenue- and/or margin-sharing plus fees
  • Need to build a customer base
  • Non-trivial up-front effort to prepare for productization

There is plenty of variation in managed marketplace offerings. Some are fully featured and come with a premium price tag. Others are minimally featured with low fees. Most offerings are SaaS-based, with very few (if any) available for on-premise installation.

Self-managed marketplace initiatives may come with significant uphill battles, but managed marketplaces have challenges too. There may be internal objections or inquiries about how products and data are hosted (especially on a multi-tenant platform), what sort of access the platform may need to internal data or infrastructure, and whether it may still be better to build a self-managed platform.

Aggregated

An aggregated marketplace exists purely to sell third-party data products.

There’s no ownership or customization except for the packaging of the data product and its metadata. The aggregated model is the equivalent of building an app and putting it on Apple’s or Google’s app stores. The data provider has no practical control over the customer experience, may not trust the marketplace operator, may have products listed right next to a competitor’s, and can suffer from the algorithmic priorities of the platform (e.g., lowest-priced products rise to the top).

There are many upsides to using an aggregated marketplace:

  • No responsibility for the costs of the marketplace
  • No costs or investment required for the user experience
  • Simple, standardized productization model
  • Low overall costs
  • Existing user base

The downsides, however, can be numerous:

  • No customer relationship
  • No customization or flexibility over the user experience
  • The marketplace and its sorting algorithms may be opaque or could lead to a race-to-the-bottom for data product pricing
  • Lower product margins
  • Products may not be portable to other marketplaces
  • No chance to explain that one product is better or higher-value than another
  • No visibility into fulfillment processes
  • May not provide analytics or success metrics
  • No control over the product data standards

Because productization requirements for an aggregated platform are standardized, a data provider can learn and master one process, and get products online quickly and easily. Aggregated platforms are designed for the open listing of data products, regardless of their origin. Getting data products into an aggregated marketplace is a low-risk, lower-reward option for many companies. The trade-off for the reduced complexity is the lack of ownership. For many companies, that’s well worth the trade.

Notably, aggregated marketplaces are purely for commercialized data products designed for externalized consumption. There are no aggregated marketplaces for internal data sharing. These marketplaces are best for small businesses and organizations with low data maturity. The model is hands-off, low-commitment, and an excellent way to start monetizing data. It’s also a great way to learn to do productization on the infrastructure side.

Marketplace Comparison Table

| | Aggregated | Managed | Self-Managed |
|---|---|---|---|
| Price | $ | $$ | $$$$$$$$ |
| Maintenance | Packaging | UX, Pipelines, Packaging | Bugs, Incidents, Support, Features, Pipelines, Manufacturing, Packaging, Everything |
| Staff required | Product Mgr | Product Mgr, Data Engineers | Product Mgr, Data Engineers, Software Devs, QA Engineers, DevOps Engineers, Customer Support, Provider Support, Finance |
| Infrastructure | None | Pipelines into marketplace | |

Takeaways

  • The effort required for managing a marketplace can depend on several factors, some political and others technological.
  • Data providers and consumers have a universal set of expectations for marketplaces, all of which should facilitate and broker a successful transaction.
  • Providers and marketplace operators share responsibility for product consumability, packaging, discoverability, and fulfillment.
  • There are three types of marketplaces: self-managed, vendor- or partner-managed, and aggregated.