API integration poses a unique set of challenges, from compatibility issues to security concerns. To simplify this process, we offer a practical checklist for defining the optimal integration method between your data source and analytics tool.
In this article, we’ll walk you through the process using Pinterest API integration as an example. We’ll explore two methods: Python and ClicData’s universal connector.
By following our checklist and examples, you’ll gain insights into seamless API integration, empowering you to leverage your data effectively. Let’s dive in and unlock the full potential of your data.
API Integration: Checklist to Choose the Best Method
#1 – Identify the type of API
Is it a SOAP or REST API?
The answer affects the integration method. Here’s how:
- REST APIs typically use HTTP and simple message formats like JSON, XML, or plain text. Integration with REST APIs often involves making HTTP requests using standard methods like GET, POST, PUT, DELETE, etc. These requests are sent over the internet, making REST APIs accessible from any programming language or platform that supports HTTP communication.
- SOAP APIs, on the other hand, use the SOAP protocol, which relies heavily on XML for message formatting. Integration with SOAP APIs often involves generating and parsing XML messages according to the SOAP specifications. SOAP APIs may also require using specialized libraries or frameworks that support SOAP messaging.
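To make the difference concrete, here is a minimal sketch that builds (but never sends) one request of each kind with Python's standard library. The host api.example.com, the /users and /soap paths, and the GetUser operation are hypothetical placeholders, not part of any real API.

```python
import urllib.request

# Hypothetical host, used only for illustration.
BASE_URL = "https://api.example.com"


def build_rest_request(user_id: int) -> urllib.request.Request:
    """REST: the resource is addressed by the URL; the payload is usually JSON."""
    return urllib.request.Request(
        f"{BASE_URL}/users/{user_id}",
        headers={"Accept": "application/json"},
        method="GET",
    )


def build_soap_request(user_id: int) -> urllib.request.Request:
    """SOAP: the call is typically a POST that carries an XML envelope."""
    envelope = (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
        "<soap:Body>"
        f"<GetUser><UserId>{user_id}</UserId></GetUser>"
        "</soap:Body>"
        "</soap:Envelope>"
    )
    return urllib.request.Request(
        f"{BASE_URL}/soap",
        data=envelope.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8", "SOAPAction": "GetUser"},
        method="POST",
    )


rest = build_rest_request(42)
soap = build_soap_request(42)
print(rest.get_method(), rest.full_url)
print(soap.get_method(), soap.full_url)
```

The REST call is fully described by its method and URL, while the SOAP call needs a hand-built (or library-generated) XML body, which is why SOAP integrations tend to lean on dedicated tooling.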
#2 – Check the authentication method
There are multiple authentication methods provided by APIs, but the most common ones are:
- OAuth 2.0: OAuth 2.0 is an authorization framework that allows third-party applications to obtain limited access to a web service on behalf of a user, without the user sharing their credentials directly with the application. It typically involves obtaining an access token from an authorization server by presenting credentials (such as a client ID and client secret) and then using the access token to authenticate subsequent requests to the API. Note: OAuth 2.0 is generally the most secure of these three options and only needs to be set up once, but it is not necessarily the easiest to implement.
- API Key Authentication: API key authentication involves generating a unique key (usually a long string of characters) for each client or application that needs access to the API. You include this key in your requests to the API, typically either in the request headers or as a query parameter. The API server verifies the key against a database of valid keys to authenticate the client.
- Basic Authentication: Basic authentication involves sending a username and password with each request to the API. The credentials are encoded (not encrypted) using Base64 and included in the request headers. While simple to implement, basic authentication is not considered very secure on its own: Base64 is trivially reversible, so the credentials are effectively sent in plaintext and can be intercepted unless the connection is encrypted with HTTPS.
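In practice, all three methods come down to adding the right header to each request. The sketch below shows one plausible shape for each; the X-API-Key header name is a common convention rather than a standard, and the credentials are made up.

```python
import base64


def api_key_headers(api_key: str) -> dict:
    # "X-API-Key" is a common but not universal header name; check the API docs.
    return {"X-API-Key": api_key}


def basic_auth_headers(username: str, password: str) -> dict:
    # Base64 is an encoding, not encryption: anyone can decode it.
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def bearer_headers(access_token: str) -> dict:
    # OAuth 2.0 access tokens are usually sent as a Bearer token.
    return {"Authorization": "Bearer " + access_token}


headers = basic_auth_headers("alice", "s3cret")
encoded = headers["Authorization"].split(" ", 1)[1]
# Decoding the header gives the credentials straight back.
print(base64.b64decode(encoded).decode("utf-8"))
```

The last two lines show why Basic authentication should only ever travel over HTTPS.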
#3 – Check for pagination
If there is pagination, the API integration will be more complex because you have to loop over the pages until all the data has been retrieved. Pagination therefore requires more code and more verification steps.
There are multiple pagination methods:
- Offset-based Pagination
- Cursor-based Pagination
- Token-based Pagination
- Time-based Pagination
- Keyset Pagination
- Range-based Pagination
- Page-based Pagination
- Index-based Pagination
You can learn more about API pagination methods here. The Pinterest example below relies on a bookmark returned with each response, which works like cursor-based pagination.
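Whatever the method, the logic is a loop that keeps requesting data until the API signals there is nothing left. Here is a minimal sketch of the simplest variant, page-based pagination. The fetch_page function and the fake data are assumptions for illustration; a real version would call the API with a page number.

```python
from typing import Callable, Iterator


def fetch_all_pages(fetch_page: Callable[[int], list], first_page: int = 1) -> Iterator:
    """Yield items page by page until the API returns an empty page."""
    page = first_page
    while True:
        items = fetch_page(page)
        if not items:
            break  # an empty page means there is nothing left to read
        yield from items
        page += 1


# Fake API: three pages of data, then nothing.
FAKE_PAGES = {1: ["a", "b"], 2: ["c", "d"], 3: ["e"]}


def fake_fetch_page(page: int) -> list:
    return FAKE_PAGES.get(page, [])


all_items = list(fetch_all_pages(fake_fetch_page))
print(all_items)
```

Some APIs signal the end with a total page count or a flag rather than an empty page, so the stop condition is the part to adapt first.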
#4 – Check for limitations in tools
You might be limited in terms of API calls by the tool’s pricing model or because the API developers implemented throttling.
Here are some best practices to help you work with throttling:
- Know the rules: Read the API documentation to learn about the rate limits – how many requests you can make and in what time frame (e.g., per minute, per second, etc.)
- Go slow and steady: Before making requests, set up your code to make requests at a pace that won’t exceed the rate limits.
- Handle blocks gracefully: If the API tells you to slow down (for example, with an HTTP 429 “Too Many Requests” status code), don’t panic. Adjust your code to wait a bit before trying again.
- Give it another try: Set up your code to retry making a request if it was blocked, but be patient and increase the wait time between retries to avoid getting blocked again.
- Keep an eye on things: Watch your usage and any messages from the API to make sure you’re staying within the limits and handling any issues smoothly.
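The retry advice above can be sketched as a small helper that waits longer after each blocked attempt (exponential backoff). The send callable, the retry count, and the delays are illustrative assumptions; many APIs also return a Retry-After header that you may prefer to honor instead.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Retry a request on HTTP 429, doubling the wait after each attempt."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != 429:
            return body
        if attempt == max_retries:
            break  # out of retries, give up below
        sleep(base_delay * (2 ** attempt))
    raise RuntimeError("still rate limited after retries")


# Fake endpoint: throttled twice, then succeeds.
responses = iter([(429, ""), (429, ""), (200, "ok")])
waits = []  # record the waits instead of actually sleeping
result = call_with_backoff(lambda: next(responses), sleep=waits.append)
print(result, waits)
```

Passing sleep as a parameter keeps the helper easy to test without real delays.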
#5 – Map your use case to Scopes
Map each use case or functionality requirement to the corresponding scopes or permissions required to access the necessary endpoints or resources.
Identify the minimum set of scopes needed to accomplish each use case while adhering to the principle of least privilege.
Let’s consider four examples:
- Use Case: Read-Only Access to User Profile Information:
- Use Case Description: Your application needs to display basic user profile information (e.g., name, email) but doesn’t require the ability to modify or update the profile.
- Scope(s) Needed: profile:read
- Example: If your application is using OAuth 2.0 for authentication, you might request the profile:read scope to access the user’s profile information.
- Use Case: Read and Write Access to User Posts:
- Use Case Description: Your application allows users to create, view, update, and delete their posts.
- Scope(s) Needed: posts:read, posts:write
- Example: If your application integrates with a social media API, you might request both posts:read and posts:write scopes to allow users to read and write posts.
- Use Case: Access to Financial Transactions for Reporting:
- Use Case Description: Your application needs to fetch financial transaction data for generating reports but doesn’t require the ability to initiate transactions.
- Scope(s) Needed: transactions:read
- Example: If your application integrates with a banking API, you might request the transactions:read scope to access the user’s transaction history.
- Use Case: Limited Access to Location Data for Geocoding:
- Use Case Description: Your application requires access to location data for geocoding addresses but doesn’t need continuous access to the user’s location.
- Scope(s) Needed: location:read
- Example: If your application integrates with a mapping API, you might request the location:read scope to access location data for geocoding addresses.
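One way to keep this mapping honest is to write it down as data and derive the scope request from it. The sketch below reuses the four example use cases; the use case names are hypothetical labels.

```python
# Hypothetical mapping from each use case to the scopes it requires.
USE_CASE_SCOPES = {
    "show_profile": {"profile:read"},
    "manage_posts": {"posts:read", "posts:write"},
    "transaction_reports": {"transactions:read"},
    "geocode_addresses": {"location:read"},
}


def minimum_scopes(use_cases: list) -> list:
    """Return the smallest set of scopes that covers the selected use cases."""
    needed = set()
    for use_case in use_cases:
        needed |= USE_CASE_SCOPES[use_case]
    return sorted(needed)


scopes = minimum_scopes(["show_profile", "manage_posts"])
# The OAuth 2.0 specification defines scope as a space-delimited string;
# some providers expect another separator, so check their documentation.
scope_param = " ".join(scopes)
print(scope_param)
```

If a use case is dropped later, its scopes disappear from the request automatically, which helps you stay within the principle of least privilege.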
#6 – Choose the type of query
You can execute different queries when calling the API:
- GET: Used to get data from a server. Example: Reading a user’s profile.
- POST: Used to send data to a server to create something new. Example: Creating a new user account.
- PUT: Used to update data on a server, usually by replacing the whole resource. Example: Updating a user’s profile.
- PATCH: Used to make partial updates to data on a server. Example: Changing only a user’s email address.
- DELETE: Used to remove data from a server. Example: Deleting a user account.
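As a quick illustration, the sketch below builds one request per method without sending anything. The host, paths, and payloads are hypothetical.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.example.com"  # hypothetical host


def build_request(method: str, path: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request for the given HTTP method."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


requests_by_intent = {
    "read profile": build_request("GET", "/users/42"),
    "create account": build_request("POST", "/users", {"name": "Ada"}),
    "replace profile": build_request("PUT", "/users/42", {"name": "Ada", "email": "ada@example.com"}),
    "change email only": build_request("PATCH", "/users/42", {"email": "new@example.com"}),
    "delete account": build_request("DELETE", "/users/42"),
}
for intent, req in requests_by_intent.items():
    print(f"{intent}: {req.get_method()} {req.full_url}")
```

Note how PUT sends the complete profile while PATCH sends only the field that changes.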
#7 – Select endpoints
Using multiple endpoints to retrieve data is often necessary in API integration scenarios where the desired data is spread across different resources or requires different parameters for retrieval.
Here’s why it might be necessary, along with examples:
- Normalization of Data:
- Sometimes, data is normalized across multiple endpoints to maintain data integrity and reduce redundancy.
- Example: In an e-commerce application, product information may be stored separately from inventory information. To retrieve complete product details along with inventory status, you would need to query both the product and inventory endpoints separately.
- Granular Access Control:
- APIs may enforce granular access control, allowing access to different subsets of data based on user permissions or roles.
- Example: In a social media platform, retrieving user posts and comments may require separate endpoints. Users with different access levels (e.g., regular users vs. moderators) may have access to different subsets of comments, necessitating separate endpoints for retrieving them.
- Optimization of Performance:
- Separate endpoints may be provided to optimize performance by allowing clients to retrieve only the data they need, reducing the payload size and response time.
- Example: A weather API may offer separate endpoints for current weather conditions, hourly forecasts, and daily forecasts. Clients interested in current conditions may only need to query the current weather endpoint, avoiding unnecessary data retrieval.
- Versioning and Evolution of APIs:
- APIs evolve over time, and new features or changes may require the introduction of new endpoints while maintaining backward compatibility.
- Example: An API for a messaging service may introduce a new endpoint for retrieving message reactions or attachments in a later version. Clients using older versions of the API can continue to use existing endpoints without disruption, while newer clients can utilize the new endpoints for additional features.
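Taking the e-commerce example above, combining two endpoints usually means fetching each one and matching the records on a shared identifier. In this sketch the two fetch functions stand in for real API calls, and the field names are invented.

```python
def fetch_products() -> list:
    # Stand-in for a call to a hypothetical /products endpoint.
    return [{"id": 1, "name": "Lamp"}, {"id": 2, "name": "Desk"}]


def fetch_inventory() -> list:
    # Stand-in for a call to a hypothetical /inventory endpoint.
    return [{"product_id": 1, "in_stock": 12}, {"product_id": 2, "in_stock": 0}]


def products_with_stock() -> list:
    """Combine the two responses on the shared product identifier."""
    stock_by_id = {row["product_id"]: row["in_stock"] for row in fetch_inventory()}
    return [
        {**product, "in_stock": stock_by_id.get(product["id"])}
        for product in fetch_products()
    ]


combined = products_with_stock()
print(combined)
```

Building a lookup dictionary first avoids rescanning the inventory list for every product.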
#8 – Check output data format
Most APIs return datasets in JSON or XML, which are easy formats to work with. Sometimes, though, you get more complicated datasets with nested objects or JSON strings, which require two queries to produce a proper table.
If the response has a nested underlying schema, you can go through these steps to get a clean table:
- Parse the JSON Response: Start by parsing the JSON response returned by the API to convert it into a format that can be easily manipulated and processed in your programming language of choice.
- Identify Nested Objects: Identify the nested objects or JSON strings within the dataset that need to be separated into different tables. These nested objects typically represent related but distinct entities in your data model.
- Extract Nested Data: Extract the data from the nested objects and store them in separate data structures or tables. This may involve iterating through the JSON response and extracting specific fields or objects of interest.
- Generate Unique Identifiers: Generate unique identifiers or keys for the extracted data that can be used to establish relationships between the main dataset and the nested tables. These identifiers will help you link related data across different tables.
- Perform Join Operations: Use the unique identifiers to perform join operations between the main dataset and the nested tables. Joins allow you to combine related data from multiple tables based on common keys or identifiers.
- Create Views or Data Models: Optionally, create views or data models in your data management tool to abstract the complexity of the underlying schema. Views can provide simplified representations of the data structure, making it easier to query and analyze.
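The steps above can be sketched with nothing more than the standard library. The sample payload is invented for illustration and is not the actual shape of any Pinterest response.

```python
import json

# Invented sample response: each board contains a nested owner and a list of pins.
raw = """
[
  {"name": "Travel", "owner": {"username": "ada"},
   "pins": [{"title": "Lisbon"}, {"title": "Kyoto"}]},
  {"name": "Recipes", "owner": {"username": "bob"},
   "pins": [{"title": "Ramen"}]}
]
"""

boards_table, pins_table = [], []
# enumerate() generates the identifier that will link the two tables.
for board_id, board in enumerate(json.loads(raw), start=1):
    boards_table.append(
        {"board_id": board_id, "name": board["name"], "owner": board["owner"]["username"]}
    )
    for pin in board["pins"]:  # the nested list becomes its own table
        pins_table.append({"board_id": board_id, "title": pin["title"]})

# Join the two tables on board_id.
board_names = {row["board_id"]: row["name"] for row in boards_table}
joined = [{"board": board_names[p["board_id"]], "pin": p["title"]} for p in pins_table]
print(joined)
```

If the API already returns a stable identifier for each record, use that as the key rather than generating one.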
API Integration Example: Two Methods to Extract Data From the Pinterest API
Using Python to extract data into a data viz tool
Python is commonly used by Data Analysts to create custom data visualizations when the built-in charts are not sufficient.
Python can also be used for API integration. However, this requires some coding and data engineering skills, which are not necessarily Data Analysts’ strongest suit!
If you do want to use Python for an API integration project, we have put together a step-by-step guide that follows our checklist and is based on a public project on GitHub:
Step 1 – Authentication:
Begin the OAuth flow to request user access by directing the user’s browser to https://www.pinterest.com/oauth/ with your app’s parameters (such as the app ID, redirect URI, and requested scopes) in the query string.
Once you authorize the app, you will be redirected to the specified redirect URI. You need a method that receives the authorization code contained in that URI (referred to as auth_code here).
Exchange the authorization code for an access token by making a POST request to the access token endpoint (using the method called exchange_auth_code). This method retrieves both an access token and a refresh token.
def exchange_auth_code(self, auth_code):
    """
    Call the Pinterest API to exchange the auth_code (obtained by
    a redirect from the browser) for the access_token and (if requested)
    refresh_token.
    """
    post_data = {
        "code": auth_code,
        "redirect_uri": self.api_config.redirect_uri,
        "grant_type": "authorization_code",
    }
    if self.api_config.verbosity >= 2:
        print("POST", self.api_config.api_uri + "/v5/oauth/token")
        if self.api_config.verbosity >= 3:
            self.api_config.credentials_warning()
            print(post_data)
    response = requests.post(
        self.api_config.api_uri + "/v5/oauth/token",
        headers=self.auth_headers,
        data=post_data,
    )
    unpacked = self.unpack(response)
    print("scope: " + unpacked["scope"])
    self.access_token = unpacked["access_token"]
    self.refresh_token = unpacked["refresh_token"]
    self.scopes = unpacked["scope"]
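The auth_code passed to exchange_auth_code has to be read from the redirect URI first. Here is a minimal sketch of that step, assuming the code arrives in a code query parameter and that you sent a state value when starting the flow. The localhost URL and the values are made up.

```python
from urllib.parse import parse_qs, urlparse


def extract_auth_code(redirected_url: str, expected_state: str) -> str:
    """Pull the `code` query parameter out of the redirect URL."""
    query = parse_qs(urlparse(redirected_url).query)
    # Checking `state` protects against forged redirects (cross-site request forgery).
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch: possible forged redirect")
    if "code" not in query:
        raise ValueError("no authorization code in redirect")
    return query["code"][0]


# Hypothetical redirect URL, as the browser would request it after the user approves.
url = "http://localhost:8085/?code=abc123&state=xyz"
auth_code = extract_auth_code(url, expected_state="xyz")
print(auth_code)
```

In a real integration, a small local web server or your application's callback route would receive this URL.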
Pinterest access tokens expire after 30 days (2,592,000 seconds), and refresh tokens last for 365 days (31,536,000 seconds). A refresh method is therefore required to renew the access token when necessary. To do this, a POST request is made to the token endpoint with the refresh token included as one of the parameters.
def refresh(self, continuous=False):
    print(f"refreshing {self.name}...")
    post_data = {"grant_type": "refresh_token", "refresh_token": self.refresh_token}
    if continuous:
        post_data["refresh_on"] = True
    if self.api_config.verbosity >= 2:
        print("POST", self.api_config.api_uri + "/v5/oauth/token")
        if self.api_config.verbosity >= 3:
            self.api_config.credentials_warning()
            print(post_data)
    response = requests.post(
        self.api_config.api_uri + "/v5/oauth/token",
        headers=self.auth_headers,
        data=post_data,
    )
    unpacked = self.unpack(response)
    self.access_token = unpacked["access_token"]
    # save refresh token if it was also refreshed
    if "refresh_token" in unpacked:
        self.refresh_token = unpacked["refresh_token"]
Optional: Each time the access token is refreshed, you can store both the access and refresh tokens in a JSON-encoded file using the write method, so that they can be reused between runs.
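The write method itself is not shown here, so the following is only a sketch of what storing the tokens could look like. The function names, file name, and token values are assumptions, not the GitHub project's actual code.

```python
import json
import pathlib
import tempfile


def write_tokens(path: pathlib.Path, access_token: str, refresh_token: str) -> None:
    """Save both tokens as JSON, readable by the file owner only."""
    path.write_text(json.dumps({"access_token": access_token, "refresh_token": refresh_token}))
    path.chmod(0o600)  # tokens are credentials; this has limited effect on Windows


def read_tokens(path: pathlib.Path) -> dict:
    """Load the tokens saved by write_tokens."""
    return json.loads(path.read_text())


# A temporary directory keeps the example from leaving files behind.
with tempfile.TemporaryDirectory() as tmp:
    token_path = pathlib.Path(tmp) / "access_token.json"
    write_tokens(token_path, "access-123", "refresh-456")
    restored = read_tokens(token_path)
print(restored["refresh_token"])
```

Treat the file like a password: keep it out of version control and shared folders.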
Use the access token by including it in the request headers of every call to a data endpoint. The header is a key-value pair, built in much the same way as the Basic authorization header that the __init__ method below creates from the app ID and app secret for the token requests.
def __init__(self, api_config, name=None):
    if name:
        self.name = name
    else:
        self.name = "access_token"
    self.api_config = api_config
    self.path = pathlib.Path(api_config.oauth_token_dir) / (self.name + ".json")
    # use the recommended authorization approach
    auth = api_config.app_id + ":" + api_config.app_secret
    b64auth = base64.b64encode(auth.encode("ascii")).decode("ascii")
    self.auth_headers = {"Authorization": "Basic " + b64auth}
Step 2 – Handle Pagination
Pagination relies on two query parameters: page_size and bookmark.
To fetch the next page, include the bookmark query parameter and set it to the value returned in the previous call’s response.
def __next__(self):
    if self.index >= len(self.items):
        # need to fetch more data, if there is a bookmark
        if self.bookmark:
            # Determine whether the query needs to be added to the path or
            # if the bookmark will be an additional parameter at the end
            # of the query.
            delimiter = "&" if "?" in self.path else "?"
            path_with_bookmark = self.path + delimiter + "bookmark=" + self.bookmark
            self._get_response(path_with_bookmark)
            if not self.items:  # in case there is some sort of error
                raise StopIteration
        else:
            raise StopIteration  # no bookmark => all done
    retval = self.items[self.index]  # get the current element
    self.index += 1  # increment the index for the next time
    return retval
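Because __next__ depends on the rest of its class, here is a self-contained sketch of the same bookmark logic written as a generator. The fetch callable and the response shape (an items list plus a bookmark) are assumptions for illustration.

```python
from typing import Callable, Iterator, Optional


def iterate_with_bookmark(fetch: Callable[[Optional[str]], dict]) -> Iterator:
    """Yield every item, following the bookmark until the API stops returning one."""
    bookmark = None  # the first call is made without a bookmark
    while True:
        response = fetch(bookmark)
        yield from response.get("items", [])
        bookmark = response.get("bookmark")
        if not bookmark:
            break  # no bookmark means all pages have been read


# Fake API with two pages of results.
FAKE_RESPONSES = {
    None: {"items": ["board-1", "board-2"], "bookmark": "next-1"},
    "next-1": {"items": ["board-3"], "bookmark": None},
}

boards = list(iterate_with_bookmark(lambda bookmark: FAKE_RESPONSES[bookmark]))
print(boards)
```

The stop condition is the same as in the method above: no bookmark, no more data.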
Step 3 – Select Endpoint
Finally, you’ll need to assemble the GET query from all these methods and variables to call an endpoint such as Board.
def get_response(self, path):
    if self.api_config.verbosity >= 2:
        print(f"GET {self.api_uri + path}")
    return requests.get(
        self.api_uri + path,
        headers=self.access_token.header(),
        allow_redirects=False,
    )

def request_data(self, path):
    return self.unpack(self.get_response(path))

def get(self):
    if not self.board_id:
        raise ValueError("board_id must be set to get a board")
    return self.request_data(f"/v5/boards/{self.board_id}")
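When an endpoint takes query parameters, it is safer to let the standard library encode them than to concatenate strings, since a bookmark may contain characters that need escaping. The sketch below assumes a board listing path of /v5/boards and reuses the page_size and bookmark parameters from the pagination step. The default page size and the bookmark value are arbitrary.

```python
from typing import Optional
from urllib.parse import urlencode


def boards_path(page_size: int = 25, bookmark: Optional[str] = None) -> str:
    """Build the path and query string for a board listing request."""
    params = {"page_size": page_size}
    if bookmark:
        params["bookmark"] = bookmark
    # urlencode escapes characters such as "=" that would break the query string.
    return "/v5/boards?" + urlencode(params)


first_page = boards_path()
next_page = boards_path(page_size=50, bookmark="abc==")
print(first_page)
print(next_page)
```

The resulting path can be passed to a method like request_data.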
Using ClicData’s web service connector to feed a built-in data warehouse and dashboards
If you don’t know ClicData, we are a complete data management and analytics platform that allows you to connect, transform, visualize, automate and collaborate with data from any source. We offer native connectors to most applications, storage systems and databases and a universal connector, Web Services, which can pull data from any REST or SOAP API.
We’re going to show you how to integrate with the Pinterest API using our Web Service connector.
This integration will require an initial step of creating an app in the Pinterest developer portal.
Copy your App ID and App secret key (you will need them in step 1).
Define the callback url to ClicData: https://app.clicdata.com/oauth2/callback
Select the scopes you need for your reporting.
Step 1 – Authentication
The OAuth flow to request user access starts at https://www.pinterest.com/oauth/.
In the ‘Server’ tab, fill out the web service host name:
Protocol: HTTPS
Host: api.pinterest.com
Port: 443
In the ‘Authentication’ tab, fill out the credentials provided by Pinterest in the App developer platform.
Grant Type: Authorization Code
Authorization Url: https://www.pinterest.com/oauth/
Access Token Url: https://api.pinterest.com/v5/oauth/token
Client ID: provided by Pinterest as ‘App ID’
Client Secret: provided by Pinterest as ‘App secret key’
Send Authentication: In Header
Scope: all the scopes you need. Make sure it matches the scopes listed in your app in Pinterest. In this example: boards:get
Audience: optional
Resource: optional
Add Token: Header
Step 2 – Handle Pagination
ClicData lets you handle this pagination efficiently: use incremental pagination with a stop condition on the response, so that it stops when the bookmark is null. Learn how to handle API pagination in our detailed documentation.
Step 3 – Select Endpoint
As you can see in the screenshot below, no coding is required to select the endpoint.
Make API Integration More Efficient With ClicData
In conclusion, tackling API integration can be a challenging and time-consuming endeavor, especially without a comprehensive checklist to guide you through the process. Our checklist serves as a cheat sheet, ensuring that you cover all essential aspects of API integration efficiently.
While methods like Python offer powerful capabilities, they may require coding skills and add complexity to the integration process. However, with no-code options like ClicData’s web service connector, extracting data becomes much simpler and more accessible, making it an ideal solution for both SOAP and REST APIs.
Ready to streamline your API integrations? Learn more about our features for API integration in our documentation. Let’s simplify data connectivity together.