Salesforce introduced Change Data Capture (CDC) a few years ago. I remember it as one of the important topics in the Salesforce Integration Architect certification guide, and most of us spent time exploring it while preparing for the certification exams. It is one of the great integration capabilities Salesforce provides, yet it is not a widely used integration design pattern in real-life projects.
In this article, I try to consolidate my experience and learning on the Change Data Capture (CDC) integration design pattern. The article is aimed at architects who are exploring CDC as an integration solution or evaluating options for data replication requirements.
What is Salesforce Change Data Capture (CDC)
Salesforce has excellent documentation on CDC that covers both the basic and the advanced details. I think of CDC as a data processing engine offered by Salesforce. It captures the changes happening at the Salesforce data layer. I say data layer rather than object layer because it does not capture changes to formula and other derived fields. Salesforce publishes these changes as an event stream that external applications integrating with Salesforce can consume.
One important thing to note is that CDC offers one-directional integration. Salesforce generates the data stream (events) that other applications can consume, but you cannot use Change Data Capture (CDC) for inbound integration to receive updates from other applications.
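To make the event stream concrete, here is a minimal sketch of how a subscriber might apply change events to a local replica. It assumes the JSON payload shape described in the Salesforce documentation, where a `ChangeEventHeader` carries `changeType` and `recordIds` and the remaining keys are field values. The `apply_change` function, the in-memory `replica` dictionary, and the sample record ID are illustrative names, not part of any Salesforce API. Treating an undelete like a create is also an assumption of this sketch.

```python
from typing import Any, Dict

HEADER_KEY = "ChangeEventHeader"


def apply_change(replica: Dict[str, Dict[str, Any]], payload: Dict[str, Any]) -> None:
    """Apply one change event payload to an in-memory replica keyed by record ID."""
    header = payload[HEADER_KEY]
    # Everything except the header is field data carried by the event.
    fields = {k: v for k, v in payload.items() if k != HEADER_KEY}
    change_type = header["changeType"]
    for record_id in header["recordIds"]:
        if change_type in ("CREATE", "UNDELETE"):
            replica[record_id] = dict(fields)
        elif change_type == "UPDATE":
            # An update carries only the changed fields, so merge instead of overwriting.
            replica.setdefault(record_id, {}).update(fields)
        elif change_type == "DELETE":
            replica.pop(record_id, None)
        else:
            # Gap and overflow events need separate handling; they are out of scope here.
            raise ValueError(f"unhandled changeType: {change_type}")


# Hypothetical events for one Account record.
replica: Dict[str, Dict[str, Any]] = {}
apply_change(replica, {
    HEADER_KEY: {"entityName": "Account", "changeType": "CREATE",
                 "recordIds": ["001xx000003DGb2AAG"]},
    "Name": "Acme", "Phone": "555-0100",
})
apply_change(replica, {
    HEADER_KEY: {"entityName": "Account", "changeType": "UPDATE",
                 "recordIds": ["001xx000003DGb2AAG"], "changedFields": ["Phone"]},
    "Phone": "555-0199",
})
print(replica)
```

The update event does not repeat `Name`, which is why the sketch merges fields into the existing record. Formula fields never appear in the payload, so a replica that needs them has to compute them on its own side.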
When to use Salesforce Change Data Capture (CDC)
This is a difficult question to answer. From a capabilities perspective, CDC is a great out-of-the-box feature offered by Salesforce, but choosing it is an architecture decision. If you are an architect, you need to understand both the capabilities and the limitations (challenges) of CDC.
- If you are designing a data replication solution that replicates data from Salesforce objects to external applications (MDM, HRMS, billing system, etc.), this could be an excellent design
- If you are working on high-volume data integration, this is again an excellent design decision
- It can be considered one of the best integration design patterns for most outbound integration requirements, as long as you understand the considerations for using CDC (listed in a separate section of this article)
- As mentioned above, CDC fits outbound integration requirements where you are trying to sync Salesforce data with external applications
Salesforce Configuration to use Change Data Capture (CDC)
This is the fun part of using Change Data Capture (CDC). You do not need any development in Salesforce. CDC is a platform feature, and you only need some simple configuration.
- Identify how many channels you need to stream your changes to external applications. You can create a separate channel for each external application and add Salesforce entities (objects) as needed. You do not need to send order details to the HRMS, and likewise your fulfillment systems may not need employee-specific information. Group the changes as needed for your use case
- Evaluate the need for an add-on license. Refer to the Salesforce documentation to understand the limits available on your Salesforce org, and work with Salesforce to get add-on licenses as needed
Well, this is all we need to use CDC. Unlike Platform Events, we do not need a flow or trigger to publish the events. You do need the Tooling API or Metadata API to create a channel and its channel members. Refer to the Salesforce documentation for sample requests that create a channel and add entities as members.
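As a rough illustration of that configuration step, the sketch below builds the two Tooling API requests as plain dictionaries without sending them. It assumes the `PlatformEventChannel` and `PlatformEventChannelMember` objects and the `channelType`, `eventChannel`, `selectedEntity`, and `enrichedFields` metadata fields as I understand them from the Salesforce documentation. The API version, the `HRMS_Events` channel name, the member naming pattern, and the `External_Id__c` field are assumptions for illustration, so verify them against the documentation for your org.

```python
from typing import Any, Dict, List, Optional

API_VERSION = "v58.0"  # assumption: use whichever version your org supports
TOOLING = f"/services/data/{API_VERSION}/tooling/sobjects"


def channel_request(name: str, label: str) -> Dict[str, Any]:
    """Describe the request that creates a custom channel for change events."""
    return {
        "method": "POST",
        "path": f"{TOOLING}/PlatformEventChannel",
        "body": {
            "FullName": f"{name}__chn",
            "Metadata": {"channelType": "data", "label": label},
        },
    }


def member_request(channel: str, change_event: str,
                   enriched_fields: Optional[List[str]] = None) -> Dict[str, Any]:
    """Describe the request that adds one change event entity to a channel."""
    metadata: Dict[str, Any] = {
        "eventChannel": f"{channel}__chn",
        "selectedEntity": change_event,
    }
    if enriched_fields:
        # Enrichment asks Salesforce to include these fields even when unchanged.
        metadata["enrichedFields"] = [{"name": f} for f in enriched_fields]
    return {
        "method": "POST",
        "path": f"{TOOLING}/PlatformEventChannelMember",
        # Naming pattern is an assumption modelled on documentation examples.
        "body": {"FullName": f"{channel}_chn_{change_event}", "Metadata": metadata},
    }


# Hypothetical channel for an HRMS consumer that needs an external ID on every event.
requests_to_send = [
    channel_request("HRMS_Events", "Changes for the HRMS"),
    member_request("HRMS_Events", "ContactChangeEvent", ["External_Id__c"]),
]
for req in requests_to_send:
    print(req["method"], req["path"], req["body"])
```

A subscriber would then listen on `/data/HRMS_Events__chn` rather than on the per-object channels. Keeping one channel per consuming application is the design choice described in the list above.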
Considerations & Challenges for using Change Data Capture (CDC)
You may be wondering: if Change Data Capture (CDC) is such a great feature for outbound integration scenarios, why is it not widely used in real-life projects? Here are some of the considerations for using CDC, based on my experience.
- Lack of a complete understanding of the Change Data Capture architecture. This is one of the most common reasons architects do not consider CDC for integration design and move to other, more traditional integration design patterns
- Complex Event Message Structure
- You will most likely need a good middleware tool and a good architect to design a solution that consumes the events coming from Change Data Capture
- Salesforce may group changes across multiple records into a single event message, so you need to design your solution to process multiple changes arriving in a single message
- You need to handle gap events and overflow events in your solution. You need to build a flow that receives the set of record IDs in a gap event and calls the Salesforce API to get the details of the changes
- You need to understand which fields Salesforce includes in event messages. You may need to ensure that some fields, such as external IDs, are included in the message even when they have not changed, because the external system needs them for processing
- You do not use a flow or Apex to publish the changes, which means you cannot do any pre-processing before the messages are published to external applications. Remember that this is a data replication solution, so we should not expect pre-processing, yet most integrations require it
- An add-on license is required if the out-of-the-box limits do not meet your requirements.
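The multi-record and gap-event points above can be sketched as a small routing function on the consumer side. It assumes the `changeType` values from the Salesforce documentation, where gap events are prefixed with `GAP_` and an overflow event uses `GAP_OVERFLOW`. The `plan_actions` function and the action names (`apply_delta`, `refetch_record`, `resync_entity`) are hypothetical. They stand in for whatever your middleware does, and the overflow handling reflects my understanding that such an event does not list the affected records.

```python
from typing import Any, Dict, List, Optional, Tuple

Action = Tuple[str, str, Optional[str]]  # (action, entity name, record ID or None)


def plan_actions(payload: Dict[str, Any]) -> List[Action]:
    """Turn one change event payload into the per-record work it implies."""
    header = payload["ChangeEventHeader"]
    change_type = header["changeType"]
    entity = header["entityName"]

    if change_type == "GAP_OVERFLOW":
        # Overflow: individual records are not identified, so resync the entity.
        return [("resync_entity", entity, None)]
    if change_type.startswith("GAP_"):
        # Gap event: no field data, only IDs, so fetch current state via the API.
        return [("refetch_record", entity, rid) for rid in header["recordIds"]]
    # Regular event: one message may cover several records with the same change.
    return [("apply_delta", entity, rid) for rid in header["recordIds"]]


# Hypothetical payloads showing the three cases.
grouped_update = {
    "ChangeEventHeader": {"entityName": "Account", "changeType": "UPDATE",
                          "recordIds": ["001A", "001B"], "changedFields": ["Industry"]},
    "Industry": "Retail",
}
gap_update = {
    "ChangeEventHeader": {"entityName": "Account", "changeType": "GAP_UPDATE",
                          "recordIds": ["001C"]},
}
overflow = {
    "ChangeEventHeader": {"entityName": "Account", "changeType": "GAP_OVERFLOW",
                          "recordIds": ["000000000000000AAA"]},
}
for event in (grouped_update, gap_update, overflow):
    print(plan_actions(event))
```

Returning a plan instead of calling Salesforce directly keeps the routing logic easy to test, and it lets the middleware decide how to batch the follow-up API calls.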
Summary
Change Data Capture is a great integration design offered by Salesforce. We have used it successfully to solve data replication with back-end systems. It should be a preferred approach for your data replication requirements.
My recommendation is to understand the message structure for the different scenarios (especially gap and overflow events), discuss it with your middleware architect, and go for it if you have the necessary licenses and expertise.