Strategies for Building Bulletproof Integrations

Modern web applications often rely on external APIs that can be slow, unreliable, or even disappear entirely. To build robust integrations, you need clear timeouts, conditional retries, rate limiting, decoupled code designs, and carefully orchestrated background jobs. This post provides proven techniques, using Ruby on Rails for illustration, that can be applied in almost any technology stack.

1. Configure Timeouts

Timeouts prevent your application from getting stuck when the external service fails to respond promptly.

  • Connection Timeout: How long to wait for the connection to be established (e.g., 5 seconds).
  • Read Timeout: How long to wait for data after the connection is established (e.g., 15 seconds).

Example (Net::HTTP)

require 'net/http'
require 'uri'

uri = URI.parse("https://api.unreliable.com")
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = (uri.scheme == 'https')
http.open_timeout = 5   # connection timeout
http.read_timeout = 15  # read timeout

response = http.get(uri.request_uri)
puts response.body

If either limit is exceeded, Net::HTTP raises Net::OpenTimeout or Net::ReadTimeout (both subclasses of Timeout::Error), which you can rescue and handle.
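As a minimal sketch of handling those timeouts, the helper below wraps any block that performs a request and turns a timeout into a nil result. The name with_timeout_handling is hypothetical, and returning nil is only one possible policy; Net::OpenTimeout and Net::ReadTimeout are the exception classes Net::HTTP raises.

```ruby
require 'net/http'

# Hypothetical helper (not part of Net::HTTP): runs the block and converts
# timeout errors into a nil result so the caller can fall back gracefully.
def with_timeout_handling
  yield
rescue Net::OpenTimeout => e
  warn "Connection timed out: #{e.message}"
  nil
rescue Net::ReadTimeout => e
  warn "Read timed out: #{e.message}"
  nil
end
```

You would wrap the request itself, for example with_timeout_handling { http.get(uri.request_uri) }, and treat a nil result as "service unavailable".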

2. Automatic Retries

When dealing with transient or server-side failures, you can retry requests to increase success rates. However, be selective:

  1. Retry on timeouts and server (5xx) errors.
  2. Do not retry on 4xx errors. They usually indicate issues in your request itself.
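The two rules above can be sketched as a small predicate. The method name retryable? and the convention of passing either an exception or an integer status code are assumptions made for illustration, not part of any library.

```ruby
require 'timeout'

# Hypothetical predicate: decides whether a failed attempt is worth retrying.
# `outcome` is either an exception instance or an integer HTTP status code.
def retryable?(outcome)
  case outcome
  when Timeout::Error then true                      # timeouts are transient
  when Integer then (500..599).cover?(outcome)       # server errors only
  else false                                         # 4xx and anything else
  end
end
```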

Example Using Faraday with Retry

Use faraday-retry or Retriable for robust retry behavior.

Faraday + faraday-retry

require 'faraday'
require 'faraday/retry'

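def fetch_data
  Faraday.new(url: "https://api.unreliable.com") do |conn|
    # :retry is registered before :raise_error, so it wraps it and sees
    # the exceptions that :raise_error raises for 5xx responses.
    conn.request :retry,
      max: 3,
      interval: 1,
      backoff_factor: 2,
      exceptions: [Faraday::TimeoutError, Faraday::ConnectionFailed, Faraday::ServerError]
    conn.options.timeout = 15
    conn.options.open_timeout = 5
    conn.response :raise_error
    conn.adapter Faraday.default_adapter
  end.get("/data")
rescue Faraday::ClientError => e
  # 4xx: the request itself is wrong, so it was not retried
rescue Faraday::ServerError, Faraday::ConnectionFailed => e
  # 5xx, timeout, or connection failure that persisted after all retries
end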

Retriable

require 'net/http'
require 'uri'
require 'retriable'

def reliable_request
  uri = URI.parse("https://api.unreliable.com")
  Retriable.retriable(on: [Timeout::Error], tries: 3) do
    http = Net::HTTP.new(uri.host, uri.port)
    http.use_ssl = (uri.scheme == 'https')
    http.open_timeout = 5
    http.read_timeout = 15
    response = http.get(uri.request_uri)
    # raise or handle depending on response code
    return response
  end
end
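To make the mechanics concrete, here is a rough plain-Ruby sketch of what such retry helpers do: call the block, rescue a listed exception, wait with a growing interval, and try again. The method with_retries and its parameters (tries, base_interval, on, sleeper) are hypothetical names for illustration, not the API of Faraday or Retriable.

```ruby
require 'timeout'

# Hypothetical retry loop with exponential backoff.
# `sleeper` is injectable so the waits can be observed or skipped in tests.
def with_retries(tries: 3, base_interval: 1, on: [Timeout::Error],
                 sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue *on
    raise if attempt >= tries                        # out of attempts: re-raise
    sleeper.call(base_interval * (2**(attempt - 1))) # waits 1s, 2s, 4s, ...
    retry
  end
end
```

Exceptions that are not listed in on propagate immediately, which matches the advice to avoid retrying client errors.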

3. Implement a Hard Rate Limiter

Many APIs have soft limits—once you exceed them, you might be billed for each additional request. Implement a hard limit within your app to avoid unexpected costs.

  1. Track request counts in persistent storage (e.g., database or Redis).
  2. Reject calls immediately if you’re at the limit.

Example (Rails Model)

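class ApiUsageTracker
  MAX_REQUESTS_PER_HOUR = 1000

  # Returns true if the request fits within this hour's budget.
  def self.increment_usage
    record = ApiTracker.find_or_create_by(hour: Time.current.beginning_of_hour)
    # Check and increment in a single UPDATE so that concurrent callers
    # cannot both pass the check and push the count past the limit.
    updated = ApiTracker
      .where(id: record.id)
      .where("request_count < ?", MAX_REQUESTS_PER_HOUR)
      .update_all("request_count = request_count + 1")
    updated == 1
  end
end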

def call_protected_api
  if ApiUsageTracker.increment_usage
    # Proceed with API call
  else
    # Return an error or raise an exception
    raise "API limit exceeded!"
  end
end

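In this snippet, ApiTracker is assumed to be an ActiveRecord model with hour and request_count columns. The hour column groups requests by the current hour, so the count starts fresh at the top of each hour.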
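The same check-and-increment logic can be sketched without a database, which is handy in tests or single-process scripts. The class InMemoryRateLimiter and its clock parameter are hypothetical; because the counter lives in memory, it is a stand-in for illustration and does not replace the persistent storage recommended above.

```ruby
# Hypothetical in-memory limiter: counts requests per clock hour.
class InMemoryRateLimiter
  def initialize(max_per_hour:, clock: -> { Time.now })
    @max_per_hour = max_per_hour
    @clock = clock
    @bucket = nil
    @count = 0
    @mutex = Mutex.new
  end

  # Returns true and counts the request if the current hour has budget left.
  def allow?
    @mutex.synchronize do
      bucket = @clock.call.to_i / 3600 # whole hours since the Unix epoch
      if bucket != @bucket
        @bucket = bucket
        @count = 0
      end
      return false if @count >= @max_per_hour

      @count += 1
      true
    end
  end
end
```

The mutex makes the check and the increment one atomic step, which is the property a production limiter also needs from its storage layer.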

4. Decouple with an Interface

Minimize direct dependencies on a specific API by coding against an interface. This gives you freedom to swap out underlying providers without altering business logic.

module CurrencyProvider
  def convert(amount, from_currency, to_currency)
    raise NotImplementedError
  end
end

class UnreliableApiCurrencyConverter
  include CurrencyProvider

  def convert(amount, from_currency, to_currency)
    # Actual call to https://api.unreliable.com
  end
end

class FallbackCurrencyConverter
  include CurrencyProvider

  def convert(amount, from_currency, to_currency)
    # A cached or offline calculation
  end
end
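One way to use such an interface is to inject the providers into the business logic and fall back when the primary one fails. In the sketch below, CurrencyService, FailingProvider, and FixedRateProvider are hypothetical names; the two providers are offline stand-ins for the converters above so the example runs without network access.

```ruby
# Hypothetical stand-in for a provider whose upstream API is down.
class FailingProvider
  def convert(_amount, _from_currency, _to_currency)
    raise IOError, "upstream API unavailable"
  end
end

# Hypothetical stand-in for a cached or offline conversion.
class FixedRateProvider
  def initialize(rate)
    @rate = rate
  end

  def convert(amount, _from_currency, _to_currency)
    amount * @rate
  end
end

# Business logic depends only on objects that respond to #convert.
class CurrencyService
  def initialize(primary:, fallback:)
    @primary = primary
    @fallback = fallback
  end

  def convert(amount, from_currency, to_currency)
    @primary.convert(amount, from_currency, to_currency)
  rescue StandardError
    # Any runtime failure in the primary provider falls through to the fallback.
    @fallback.convert(amount, from_currency, to_currency)
  end
end

service = CurrencyService.new(primary: FailingProvider.new,
                              fallback: FixedRateProvider.new(2))
puts service.convert(10, "USD", "EUR") # prints 20
```

Swapping providers then means changing what is passed to the constructor, not the code that calls convert.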

5. Asynchronous, Persistent Queuing

Offload external API interactions to background jobs. This approach lets your application remain responsive and resilient to transient failures.

  1. Dedicated Queue: Place API jobs in a separate queue to keep them isolated.
  2. Persist Status: Use database-backed jobs so failures are not lost.
  3. Retry or Discard: Use retry_on or discard_on to handle specific errors gracefully.

Rails ActiveJob Example

class PaymentJob < ApplicationJob
  queue_as :api_calls

  retry_on PaymentGatewayError, wait: 1.minute, attempts: 5
  # :polynomially_longer is the Rails 7.1+ name; older versions call it :exponentially_longer
  retry_on NetworkTimeout, wait: :polynomially_longer, attempts: :unlimited

  def perform(order)
    # Make request to https://api.unreliable.com
    # If PaymentGatewayError or NetworkTimeout occurs,
    # Rails retries automatically based on the settings
  end
end

retry_on handles defined exceptions by re-enqueuing the job for another attempt. You can also use discard_on for errors that should not be retried.
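To get a feel for how quickly the growing backoff option spaces out retries (it is named :exponentially_longer before Rails 7.1 and :polynomially_longer since), the sketch below computes the approximate delays. It follows the formula given in the Rails documentation, executions to the fourth power plus 2 seconds, and leaves out the random jitter term; polynomial_wait is a hypothetical helper name.

```ruby
# Approximate delay in seconds before retry number `executions`,
# based on the formula in the Rails docs with the jitter term omitted.
def polynomial_wait(executions)
  (executions**4) + 2
end

(1..5).each { |n| puts "retry #{n}: wait ~#{polynomial_wait(n)}s" }
# prints waits of 3, 18, 83, 258, and 627 seconds
```

With attempts: :unlimited the delays keep growing (the tenth retry waits roughly 2.8 hours), so it is worth deciding whether an unbounded retry policy is really what you want.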

Conclusion

Strict timeouts, targeted retries, a hard rate limit, interface-based design, and background jobs with explicit retry rules make your application more resilient when it depends on a slow or unreliable external service. Together they improve availability and fault tolerance, and they give you tighter control over what happens when the API fails or stops responding.

Author: David Paluy