Data is becoming increasingly important for every company. As technology and communication channels evolve, data volumes are growing rapidly, and with them the need to integrate large volumes of data across systems.
Salesforce, being the #1 CRM, also needs to integrate with other systems and move large volumes of data in both directions. The Salesforce Bulk APIs are based on REST principles and are optimized for working with large data sets. They can be used to extract large volumes of data from Salesforce with a query, and to insert, update, or upsert large volumes of data asynchronously.
Salesforce recommends considering the Bulk API as the preferred option when integrating more than 2,000 records in a single execution.
Business Requirement
Recently, for one integration requirement, we were designing a solution to move client data (basic demographics only) from a custom ERP deployed on AWS into Salesforce. After evaluating the integration options, we decided to use Bulk API 2.0. The high-level approach was a Python script that runs every night, extracts the data from the ERP, and pushes it to Salesforce using Bulk API 2.0.
Solution Design
A Python developer was onboarded to write the script that pulls data from the ERP and pushes it to the CRM. We shared the Salesforce Bulk API 2.0 documentation, but he preferred a few working examples to start from over reading through that much detail.
If you are using a middleware tool such as MuleSoft or Informatica Cloud, the tool handles some of this complexity. If you plan to write a C# or Python program, you need to make multiple HTTP-based API calls in the proper sequence.
Below are the steps to push data to Salesforce using Bulk API 2.0:
Steps | Owner |
---|---|
Create or acquire a user for the integration and ensure this user has only the access needed to perform the integration | Salesforce Administrator |
Create a connected app. Since this is server-to-server communication, it can use the Client Credentials Flow or the JWT Flow | Salesforce Administrator |
Generate the Client ID and Client Secret and share them with the script developer | Salesforce Administrator |
Develop the script | C#, Java, or Python developer as needed |
Script Execution Flow
A Bulk API implementation requires submitting multiple HTTP requests to Salesforce in the proper order. The sections below list the HTTP calls needed to implement the business requirement described at the top of this page, with enough detail for each call to be reproduced using a REST client or a sample script.
Life Cycle of Script
- Submit request to Salesforce with Key & Secret to get the session token
- Submit request to Salesforce with token to create a Bulk Job
- Submit request to Salesforce with Job ID & token to create a batch (or submit multiple batches)
- Submit request to Salesforce with Job ID & token to indicate that all batches are uploaded
- Submit request to Salesforce with Job ID & token to get status of Job
1. Get Salesforce Token
We assume a Salesforce Administrator has created a connected app for the Client Credentials Flow and has shared the necessary details with the developer working on the script. The first step is to submit a POST request to Salesforce to get the session token. If you do not receive a token, something is wrong with the request or with the shared credentials, and you need to work with the Salesforce Administrator to fix the issue.
Request Header
Content-Type:application/x-www-form-urlencoded
#### Use Salesforce Domain from My Domain
POST URL: https://crmview.net.my.salesforce.com/services/oauth2/token
Request Body (Update Client ID and Client Secret as shared by Salesforce Admin)
grant_type:client_credentials
client_id:3MVG9teL3XYZP3IcVeEOyPMpUGLp8bY6CsdXemyecBkRVoq_ABCDDDDEE
client_secret:09D8115FDF9393ECF2D62DCDFFC46C993689BA0AB7F9F82F61449880C71AC28C
Expected Response (Sample only)
{
"access_token": "<Token from Salesforce>",
"signature": "4A1BcXGYCxm9nqFVQ87c98ag//2ix7Tp0tUFjKTJziFrl5M=",
"scope": "sfap_api api",
"instance_url": "https://crmview.net.my.salesforce.com",
"id": "https://login.salesforce.com/id/00DD6000000BYTsMAG/005Hs00000M9NlpICF",
"token_type": "Bearer",
"issued_at": "1728331937994",
"api_instance_url": "https://api.salesforce.com"
}
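As a starting point for the Python script, the token call above could be sketched as follows. This is a minimal sketch using only the standard library; the function names `build_token_request` and `fetch_token` are hypothetical, and the domain, Client ID, and Client Secret are placeholders to be replaced with the values shared by the Salesforce Administrator.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(domain, client_id, client_secret):
    """Build the POST request for the client credentials token call.

    `domain` is your My Domain URL, e.g. the one shown in the POST URL above.
    """
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{domain}/services/oauth2/token",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def fetch_token(domain, client_id, client_secret):
    """Send the token request and return (access_token, instance_url)."""
    request = build_token_request(domain, client_id, client_secret)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx, e.g. for bad credentials.
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return payload["access_token"], payload["instance_url"]
```

Splitting request construction from sending keeps the first function easy to inspect without network access. In a real script, the Client Secret would typically come from a secrets store or environment variable rather than from source code.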
2. Create Bulk API 2.0 Job
If the previous call was successful, you have a session token that you must use to submit the remaining requests. Next, submit a new HTTP request to Salesforce to create a Bulk API job that will receive our data.
Request Header
Content-Type:application/json
Authorization:Bearer <token from token call>
#### Use Salesforce Domain from My Domain
POST URL: https://crmview.net.my.salesforce.com/services/data/v58.0/jobs/ingest
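Request Body (use "insert" or "upsert" as the operation; externalIdFieldName is needed only for upsert; lineEnding can be "LF" or "CRLF" and must match the line endings of the CSV you upload; refer to the Salesforce documentation for other parameters that can be passed)
{
"object": "Account",
"operation": "upsert",
"externalIdFieldName": "External_ID__c",
"lineEnding": "LF"
}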
Expected Response (Sample only)
{
"id": "750D6000006nBa7IAE",
"operation": "insert",
"object": "Account",
"createdById": "005Hs00000AC1NlpIAF",
"createdDate": "2024-10-07T19:43:57.000+0000",
"systemModstamp": "2024-10-07T19:43:57.000+0000",
"state": "Open",
"concurrencyMode": "Parallel",
"contentType": "CSV",
"apiVersion": 58.0,
"contentUrl": "services/data/v58.0/jobs/ingest/750D6000006nBa7IAE/batches",
"lineEnding": "LF",
"columnDelimiter": "COMMA"
}
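The job-creation call above might look like this in Python. It is a sketch under assumptions: the helper names `build_create_job_request` and `create_job` are hypothetical, the API version is the `v58.0` used in this article, and `instance_url` and `token` are the values returned by the token call.

```python
import json
import urllib.request

API_VERSION = "v58.0"  # same version as the URLs in this article


def build_create_job_request(instance_url, token, sobject, operation,
                             external_id_field=None, line_ending="LF"):
    """Build the POST request that creates a Bulk API 2.0 ingest job."""
    job = {"object": sobject, "operation": operation, "lineEnding": line_ending}
    if operation == "upsert":
        # An upsert job needs the external ID field used to match records.
        if not external_id_field:
            raise ValueError("upsert requires an external ID field name")
        job["externalIdFieldName"] = external_id_field
    return urllib.request.Request(
        url=f"{instance_url}/services/data/{API_VERSION}/jobs/ingest",
        data=json.dumps(job).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def create_job(instance_url, token, sobject, operation, external_id_field=None):
    """Create the job and return its ID for use in the later calls."""
    request = build_create_job_request(
        instance_url, token, sobject, operation, external_id_field)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["id"]
```

The returned job ID is the value that goes into the URLs of the upload, close, and status requests.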
3. Submit Batch for Bulk API 2.0 Job
We now have a Job ID for our Bulk API 2.0 job, and we can submit one more request to upload data to this job. Refer to the Salesforce documentation for limits, formatting, and other details such as compression and chunking. The CSV should look like the sample below. Note that this is a PUT request, not a POST.
FirstName,LastName,PersonEmail,PersonGender,phone,PersonMailingCity,PersonMailingState,PersonMailingStreet
Glendon,Murby,gmurby0@nps.gov,Male,512-354-0749,Austin,Texas,318 Schiller Crossing
Wrennie,McElwee,wmcelwee1@bandcamp.com,Female,916-409-2529,Sacramento,California,02 Hayes Road
Sebastian,Neame,sneame2@globo.com,Male,915-932-8203,El Paso,Texas,1437 Commercial Pass
Sheffie,Bean,sbean3@google.co.uk,Male,716-664-8653,Buffalo,New York,7 Kings Lane
Request Header
Content-Type:text/csv
Authorization:Bearer <token from token call>
#### Use Salesforce Domain from My Domain
PUT URL: https://crmview.net.my.salesforce.com/services/data/v58.0/jobs/ingest/<Job Id>/batches
Request Body
Submit CSV
Expected Response
You will receive a 201 status for a successful upload of the file.
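The upload step could be sketched in Python as below. The helper names (`records_to_csv`, `build_upload_request`, `upload_csv`) are hypothetical, and the records are assumed to arrive as a list of dictionaries from the ERP extract. One detail worth noting: Python's `csv` module writes CRLF line endings by default, so the sketch sets the terminator explicitly to match the `lineEnding` chosen when the job was created.

```python
import csv
import io
import urllib.request

API_VERSION = "v58.0"  # same version as the URLs in this article


def records_to_csv(records, fieldnames, line_ending="LF"):
    """Serialize dict records to CSV bytes with the job's line ending."""
    terminator = "\n" if line_ending == "LF" else "\r\n"
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                            lineterminator=terminator)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue().encode("utf-8")


def build_upload_request(instance_url, token, job_id, csv_bytes):
    """Build the PUT request that uploads CSV data to the job."""
    return urllib.request.Request(
        url=(f"{instance_url}/services/data/{API_VERSION}"
             f"/jobs/ingest/{job_id}/batches"),
        data=csv_bytes,
        headers={
            "Content-Type": "text/csv",
            "Authorization": f"Bearer {token}",
        },
        method="PUT",
    )


def upload_csv(instance_url, token, job_id, csv_bytes):
    """Send the upload and report whether Salesforce answered 201."""
    request = build_upload_request(instance_url, token, job_id, csv_bytes)
    with urllib.request.urlopen(request, timeout=300) as response:
        return response.status == 201
```

For example, `records_to_csv([{"FirstName": "Glendon", "LastName": "Murby"}], ["FirstName", "LastName"])` produces a header row followed by one data row, each terminated by LF.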
4. Submit Request to update Upload Complete status for Bulk API 2.0 Job
We have now uploaded our data, and we need to tell the Salesforce Bulk API to start processing it. To do this, submit one more request to Salesforce that closes the job for further uploads and queues it for processing. This is a PATCH HTTP request.
Request Header
Content-Type:application/json
Authorization:Bearer <token from token call>
#### Use Salesforce Domain from My Domain
PATCH URL: https://crmview.net.my.salesforce.com/services/data/v58.0/jobs/ingest/<Job Id>
Request Body
{
"state":"UploadComplete"
}
Expected Response
You will receive a 200 status for a successful execution.
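A possible Python version of this close request is shown below. The function names are hypothetical; `instance_url`, `token`, and `job_id` are assumed to come from the earlier calls.

```python
import json
import urllib.request

API_VERSION = "v58.0"  # same version as the URLs in this article


def build_close_job_request(instance_url, token, job_id):
    """Build the PATCH request that marks the upload as complete."""
    return urllib.request.Request(
        url=f"{instance_url}/services/data/{API_VERSION}/jobs/ingest/{job_id}",
        data=json.dumps({"state": "UploadComplete"}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="PATCH",
    )


def close_job(instance_url, token, job_id):
    """Send the PATCH and return the job state reported by Salesforce."""
    request = build_close_job_request(instance_url, token, job_id)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["state"]
```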
Set Up a Process to Track Job Progress
A Bulk API job may take anywhere from a few seconds to several hours to complete, depending on the volume of records. It is recommended to set up a scheduled process that monitors the job status and alerts the support team when a job fails, so they can validate and address the issue. Separate REST calls are needed to retrieve the successful and failed records.
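One way to sketch the monitoring described above is a simple polling loop. The names `get_job_info` and `wait_for_job` are hypothetical, and the polling interval and attempt limit are arbitrary assumptions to tune for your data volumes. The loop stops when the job reaches one of the terminal states `JobComplete`, `Failed`, or `Aborted`.

```python
import json
import time
import urllib.request

API_VERSION = "v58.0"  # same version as the URLs in this article
TERMINAL_STATES = {"JobComplete", "Failed", "Aborted"}


def get_job_info(instance_url, token, job_id):
    """GET the current job info (state, record counts, and so on)."""
    request = urllib.request.Request(
        url=f"{instance_url}/services/data/{API_VERSION}/jobs/ingest/{job_id}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_job(fetch_info, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Call fetch_info() until the job reaches a terminal state.

    fetch_info is any zero-argument callable returning the job info dict,
    which keeps the loop independent of how the HTTP call is made.
    """
    for _ in range(max_polls):
        info = fetch_info()
        if info["state"] in TERMINAL_STATES:
            return info
        sleep(poll_seconds)
    raise TimeoutError("job did not reach a terminal state in time")
```

In the nightly script this could be called as `wait_for_job(lambda: get_job_info(instance_url, token, job_id))`, with an alert sent to the support team when the returned state is not `JobComplete` or when the loop times out.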
You can refer to these URLs when developing exception handling, a retry mechanism, and a reconciliation process.
Link | Description |
---|---|
https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/get_job_info.htm | To get the status of the job |
https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/get_job_successful_results.htm | To retrieve successfully processed records |
https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/get_job_failed_results.htm | To retrieve failed records |
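For the reconciliation side, the result endpoints in the table return CSV, which could be fetched and summarized roughly as follows. This is a sketch: the function names are hypothetical, and the `Accept` header follows the examples in the Salesforce documentation. It relies on the `sf__Error` column that Salesforce adds to failed-result rows.

```python
import csv
import io
import urllib.request

API_VERSION = "v58.0"  # same version as the URLs in this article


def build_results_request(instance_url, token, job_id, kind):
    """Build the GET request for successfulResults or failedResults."""
    if kind not in ("successfulResults", "failedResults"):
        raise ValueError(f"unsupported result kind: {kind}")
    return urllib.request.Request(
        url=(f"{instance_url}/services/data/{API_VERSION}"
             f"/jobs/ingest/{job_id}/{kind}/"),
        headers={"Authorization": f"Bearer {token}", "Accept": "text/csv"},
        method="GET",
    )


def parse_results(csv_text):
    """Turn the CSV response body into a list of dicts, one per record."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def summarize_failures(rows):
    """Count failed rows per error message for an alert or report."""
    counts = {}
    for row in rows:
        error = row.get("sf__Error", "unknown")
        counts[error] = counts.get(error, 0) + 1
    return counts
```

The response body would be read with `urllib.request.urlopen(request).read().decode("utf-8")` and passed to `parse_results`. The failed rows also carry the original field values, so they can be corrected and resubmitted in a new job.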