How Revelate Enhances the Talend Data Catalog

Revelate

Talend is like your class valedictorian: they seem to do everything. Their products span data integration, governance, API integrations, data lakes, and customer experience management. Among them is a functional data catalog tool with capabilities like search, automated discovery, and data connectors for popular services.

The data catalog space is highly competitive, so it takes a lot to stand out. Because Talend has so many capabilities across so many products, they can manage the entire data lifecycle. Surprisingly, they do not have a data marketplace (not as of this writing, anyway), but they do have strong data integration and management capabilities.

Let’s look at what makes the Talend Data Catalog stand out, then share how you can build a marketplace on top of it for internal and partner data sharing and external commercialization.

Talend Data Catalog baseline features

Talend is a leading provider of data integration and management solutions. Their data integration product provides ELT and ETL tools as well as CDC (Change Data Capture). They emphasize data usability and work well with data whether it’s on-prem, hosted by a third-party service, or in a public or private cloud. Their products span the entire data lifecycle, from creation to warehousing to analytics.

Talend’s products are popular because they:

  • Leverage widely-used and adopted open-source technologies
  • Prioritize ease of use 
  • Build for scalability
  • Offer comprehensive services

Talend’s Data Integration product provides a firm foundation for extracting and understanding data, no matter where it lives. Extraction and data warehousing aren’t enough for most companies; data-rich organizations also need to gain deep knowledge of their data to achieve their data-driven goals, which is why they lean on the Talend Data Catalog.

The Talend Data Catalog offers many features, so we’ll start by looking at the baseline features that most data catalogs should include:

  • Data discovery: Allows users to find and explore data across their organization
  • Data profiling: Provides insights into the quality and structure of data
  • Data governance: Ensures companies remain compliant and in control of their data
  • Data lineage: Tracks the history of data as it flows through an organization
  • Data search: Allows users to quickly find the data they need
  • Data collaboration: Enables users to share data and collaborate on projects
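
To make the lineage idea from the list above concrete, here is a minimal sketch in Python. It is not how Talend stores or exposes lineage; the dataset names and the `upstream_sources` helper are hypothetical, and the only assumption is that lineage can be modeled as a map from each dataset to the datasets it is derived from.

```python
# Hypothetical lineage map (illustrative only, not Talend's data model):
# each dataset lists the datasets it is directly derived from.
lineage = {
    "revenue_dashboard": ["monthly_revenue"],
    "monthly_revenue": ["orders_clean"],
    "orders_clean": ["orders_raw", "currency_rates"],
    "orders_raw": [],
    "currency_rates": [],
}


def upstream_sources(dataset, graph):
    """Return every dataset that feeds into `dataset`, directly or indirectly."""
    seen = set()
    stack = list(graph.get(dataset, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph.get(current, []))
    return sorted(seen)


print(upstream_sources("revenue_dashboard", lineage))
# prints ['currency_rates', 'monthly_revenue', 'orders_clean', 'orders_raw']
```

Walking the map upstream answers the question a lineage view is meant to answer: which sources would affect this dashboard if they changed?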

Let’s go deeper to see what Talend’s Data Catalog offers beyond baseline features.

Metadata management for everyone

Talend proudly proclaims that their data catalog “automatically crawls, profiles, organizes, links, and enriches all your metadata.” Their search syntax is flexible and they have a metadata query language, making data discovery a breeze, even for non-technical users.

Data subject matter experts and stakeholders can contribute to the data catalog by providing business knowledge through annotations, classifications, tags, and associations. Social curation allows users to collaborate on the data catalog by adding and editing metadata. Anyone with catalog access can contribute.

For example, a data subject matter expert might add an annotation to a table to explain the meaning of a particular column. A stakeholder might add a classification to a table to indicate the business department that owns the data. A data analyst might add tags to a table to make it easier to find. These simple examples of metadata management merely scratch the surface of how effective metadata frameworks can drive a data-driven culture and reinforce the value of a data catalog.
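
The three contributions described above can be sketched as a small data structure. This is an illustration only: `CatalogEntry`, its fields, and `search_by_tag` are hypothetical names chosen for the example and do not reflect Talend's actual metadata model or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class CatalogEntry:
    """Hypothetical catalog record for one table (not Talend's data model)."""
    name: str
    column_annotations: Dict[str, str] = field(default_factory=dict)
    classification: Optional[str] = None
    tags: Set[str] = field(default_factory=set)


def search_by_tag(entries: List[CatalogEntry], tag: str) -> List[str]:
    """Return the names of entries carrying the given tag, sorted by name."""
    return sorted(entry.name for entry in entries if tag in entry.tags)


orders = CatalogEntry(name="orders")
# A subject matter expert explains what a column means.
orders.column_annotations["net_amt"] = "Order total after discounts, before tax"
# A stakeholder records the department that owns the data.
orders.classification = "Finance"
# An analyst adds tags so the table is easier to find.
orders.tags.update({"revenue", "quarterly-reporting"})

customers = CatalogEntry(name="customers", tags={"crm"})

print(search_by_tag([orders, customers], "revenue"))  # prints ['orders']
```

Each role adds a different kind of metadata to the same record, and even a simple tag lookup becomes more useful as contributions accumulate.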

Cloud integration and deployment

The Talend Data Catalog can connect to cloud-based data sources including Amazon S3, Google Cloud Storage, and Microsoft Azure Blob Storage. Native cloud integrations such as these present a consistent and unified interface for data stored across many sources and destinations.

Talend also integrates with cloud-based data warehouses like Amazon Redshift, Google BigQuery, and Microsoft Azure Synapse Analytics. This integration allows users to create a single view of their data across different cloud-based data warehouses.

You can deploy Talend Data Catalog on a variety of cloud platforms, including Amazon Web Services, Google Cloud Platform, and Microsoft Azure. No two companies have the same deployment and infrastructure requirements, so this level of flexibility is a boon to customers.

Customization is key

Talend Data Catalog users have plenty of customization options available to them, including:

  • Dashboards with options for widgets and layouts
  • Metadata types that reflect the specific needs of an organization. Add or remove fields, change field types, and even create your own metadata types
  • Tailored presentation that includes colors, fonts, layouts, and branding
  • Customized workflows that automate tasks using pre-built or bespoke workflows
  • APIs that integrate with other applications to automate tasks or to share data 

Data catalog customization fits an organization’s specific needs and improves usability, flexibility, and branding. Flexibility caters to the ways those needs change over time. Branding creates a more consistent and appealing user experience for everyone who uses the catalog.

What makes Talend Data Catalog different

Some of the differentiating features offered by Talend Data Catalog include:

  • A unified view of all data—regardless of where it is stored—making it easy to find and understand data and its usage
  • Automatic data discovery from a variety of sources, including databases, files, and cloud storage
  • Enhanced data governance to comply with data regulations and protect data from unauthorized access
  • Increased data productivity through a centralized repository for metadata, which helps people quickly find the data they need

Talend’s catalog goes above and beyond a typical catalog offering in a variety of ways. Providing a single, centralized view of all data assets across the entire data ecosystem is atypical, especially considering that data can be spread across multiple clouds, data centers, and providers. They also have fine-grained access controls and deeper lineage tools than you’d expect from a typical catalog.

Because it is one of many Talend products, it integrates seamlessly with other products across the Talend ecosystem. For companies that want to go all in on a walled garden ecosystem approach, Talend has nearly everything they need—except a data marketplace.

Why Talend customers need a data marketplace

Talend has an impressive array of capabilities and products, but may not completely fulfill the data sharing needs of many organizations. While catalogs are excellent for locating and keeping track of your data, they can’t always get people the data they need. Obviously, Talend Data Catalog users can search for and get access to data, but there’s a very low chance it’s formatted and documented in a way that non-technical users need.

Companies need data products that solve data consumers’ pain points. The finance team may be able to search for and access the data they need with Talend Data Catalog, but it may not be packaged in a useful way. For example, if they primarily work in Excel, will it help them to have API access to data that’s stored in a SaaS application? Most likely not.

Catalogs do a lot of great work, but they don’t productize data. They don’t package it up for data consumers in a way they’ll find most useful. In most cases, even for companies with data catalogs, someone has to file a ticket requesting specific data in a specific format to make it more consumable. This problem is what productization solves.
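
As a rough illustration of what packaging data for a consumer can mean, the sketch below turns API-style records into a CSV file with documented, readable headers that a spreadsheet user could open directly. The records, column labels, and `package_as_csv` helper are all hypothetical; this is not how Revelate or Talend implements productization, only a minimal example of the idea.

```python
import csv
import io

# Hypothetical records, shaped the way an API might return them.
api_records = [
    {"acct_id": "A-100", "net_amt": 1250.5, "period": "2023-Q1"},
    {"acct_id": "A-200", "net_amt": 980.0, "period": "2023-Q1"},
]

# Human-readable column names for the packaged data product (assumed labels).
column_labels = {
    "acct_id": "Account ID",
    "net_amt": "Net Amount (USD)",
    "period": "Fiscal Period",
}


def package_as_csv(records, labels):
    """Render records as CSV text with friendly headers, in label order."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(labels.values())
    for record in records:
        writer.writerow(record[key] for key in labels)
    return buffer.getvalue()


csv_text = package_as_csv(api_records, column_labels)
print(csv_text)
```

The transformation is trivial, which is the point: the hard part of productization is knowing which format and documentation each consumer needs, then delivering it without a ticket.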

Productization is the next step after understanding what data you have and where it lives. Having an inventory of data isn’t enough; it’s like having a library full of pages scattered about, none of which are packaged into books people can borrow.

Talend + Revelate = Data sharing nirvana

Revelate is a data productization and fulfillment platform that works with Talend Data Catalog—and many other catalogs. In fact, Revelate works with a variety of Talend products, including Talend ETL and data integration products, and seamlessly integrates to create a productization pipeline. This pipeline makes it incredibly easy for any organization to build data products, manage metadata, and improve data discoverability.

When companies set out to build a data catalog, they’re usually trying to build an inventory that will serve the broader organization. Unfortunately, they don’t realize until it’s too late that a catalog can only get them so far.

Many catalogs offer data marketplaces as part of their product capabilities, which is a step in the right direction. Those catalogs, however, still miss the point on productization, consumability, and fulfillment. Marketplaces like Snowflake and Databricks are purely for external data commercialization. They don’t solve the major problems of internal data sharing and partner data sharing.

Revelate sits on top of catalogs, warehousing, and data integration platforms. We know how they work and we’ve built an abstraction layer that makes it easy for technical and non-technical users to build, market, and distribute data products.

When it comes to Talend in particular, Revelate augments its functionality with:

Talend Data Catalog and Revelate are a winning combination for organizing, sharing, and productizing data across multiple clouds, SaaS data sources, and data destinations. Looking for a way to level up your data catalog with an easy-to-use data marketplace? Learn how data marketplaces can strengthen your overall data strategy.
