Implementing Circuit Breaker Pattern in Go Microservices

Cap
12 min read
golang, microservices, patterns, resilience

The Problem: When Dependencies Fail

Picture this: It's 3 AM. Your pager goes off. Half your microservices are down, and your customers can't complete purchases. The culprit? A single payment service that's overwhelmed and causing a cascading failure across your entire system.

Sound familiar?

This is exactly what happened to us last Black Friday. Our payment service started throwing 500 errors, but our other services kept hammering it with requests, making the situation worse. The result? A 45-minute outage and $2M in lost revenue.

The root cause: No circuit breaker pattern.

The Solution: Circuit Breaker Pattern

The circuit breaker pattern is like an electrical circuit breaker in your house. When it detects a problem (too many failures), it "trips" and stops sending requests to the failing service, giving it time to recover.

Circuit Breaker States

     [CLOSED] ─── failures exceed threshold ─── [OPEN]
         │                                         │
         │                                         │
    success resets                            timeout expires
       counter                                     │
         │                                         │
         └─── request succeeds ─── [HALF-OPEN] ────┘

  1. CLOSED: Normal operation, requests flow through
  2. OPEN: Service is failing, requests are blocked
  3. HALF-OPEN: Testing if service has recovered
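
The three states and the transitions between them can be sketched as a small pure function. This is only an illustration of the state machine, not the implementation built later in this post; the `Event` type and the `next` function are names invented for the sketch.

```go
package main

import "fmt"

// State and Event are illustrative names for this sketch only.
type State int

const (
	Closed State = iota
	Open
	HalfOpen
)

type Event int

const (
	ThresholdExceeded Event = iota // too many failures while CLOSED
	TimeoutExpired                 // reset timeout elapsed while OPEN
	ProbeSucceeded                 // trial request succeeded while HALF-OPEN
	ProbeFailed                    // trial request failed while HALF-OPEN
)

// next returns the state that follows s when event e occurs.
// Events that do not apply to the current state leave it unchanged.
func next(s State, e Event) State {
	switch {
	case s == Closed && e == ThresholdExceeded:
		return Open
	case s == Open && e == TimeoutExpired:
		return HalfOpen
	case s == HalfOpen && e == ProbeSucceeded:
		return Closed
	case s == HalfOpen && e == ProbeFailed:
		return Open
	}
	return s
}

func main() {
	// Walk through a trip, a failed probe, and a successful recovery.
	s := Closed
	events := []Event{ThresholdExceeded, TimeoutExpired, ProbeFailed, TimeoutExpired, ProbeSucceeded}
	for _, e := range events {
		s = next(s, e)
		fmt.Println(s)
	}
}
```

Note that a failed probe in HALF-OPEN sends the breaker straight back to OPEN, which the diagram leaves implicit.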

Implementation in Go

Let's build a production-ready circuit breaker from scratch:

Basic Circuit Breaker Structure

package circuitbreaker

import (
    "context"
    "errors"
    "sync"
    "time"
)

// State represents the circuit breaker state
type State int

const (
    StateClosed State = iota
    StateOpen
    StateHalfOpen
)

func (s State) String() string {
    switch s {
    case StateClosed:
        return "CLOSED"
    case StateOpen:
        return "OPEN"
    case StateHalfOpen:
        return "HALF-OPEN"
    default:
        return "UNKNOWN"
    }
}

// CircuitBreaker represents a circuit breaker
type CircuitBreaker struct {
    mu               sync.RWMutex
    state            State
    failureCount     int
    successCount     int
    lastFailureTime  time.Time
    
    // Configuration
    failureThreshold int
    resetTimeout     time.Duration
    halfOpenMaxCalls int
    
    // Callbacks
    onStateChange func(from, to State)
}

// Config holds circuit breaker configuration
type Config struct {
    FailureThreshold int           // Number of failures before opening
    ResetTimeout     time.Duration // Time to wait before trying again
    HalfOpenMaxCalls int           // Max calls allowed in half-open state
    OnStateChange    func(from, to State)
}

// New creates a new circuit breaker
func New(config Config) *CircuitBreaker {
    if config.FailureThreshold <= 0 {
        config.FailureThreshold = 5
    }
    if config.ResetTimeout <= 0 {
        config.ResetTimeout = 60 * time.Second
    }
    if config.HalfOpenMaxCalls <= 0 {
        config.HalfOpenMaxCalls = 1
    }
    
    return &CircuitBreaker{
        state:            StateClosed,
        failureThreshold: config.FailureThreshold,
        resetTimeout:     config.ResetTimeout,
        halfOpenMaxCalls: config.HalfOpenMaxCalls,
        onStateChange:    config.OnStateChange,
    }
}

// Common errors
var (
    ErrCircuitBreakerOpen = errors.New("circuit breaker is open")
    ErrTooManyRequests    = errors.New("too many requests in half-open state")
)

// Execute runs the given function if the circuit breaker allows it
func (cb *CircuitBreaker) Execute(fn func() error) error {
    if !cb.allowRequest() {
        return cb.currentError()
    }
    
    err := fn()
    cb.recordResult(err == nil)
    return err
}

// ExecuteWithContext runs the given function with context
func (cb *CircuitBreaker) ExecuteWithContext(ctx context.Context, fn func(ctx context.Context) error) error {
    if !cb.allowRequest() {
        return cb.currentError()
    }
    
    err := fn(ctx)
    cb.recordResult(err == nil)
    return err
}

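// allowRequest checks if a request is allowed based on current state.
// It takes the write lock because an expired reset timeout moves the
// breaker from OPEN to HALF-OPEN here, before the probe request runs.
func (cb *CircuitBreaker) allowRequest() bool {
    cb.mu.Lock()
    defer cb.mu.Unlock()
    
    switch cb.state {
    case StateClosed:
        return true
    case StateOpen:
        if !cb.shouldAttemptReset() {
            return false
        }
        // Timeout expired: let this request through as a probe
        cb.setState(StateHalfOpen)
        cb.successCount = 0
        return true
    case StateHalfOpen:
        // Counts completed successes, so concurrent probes can briefly exceed the limit
        return cb.successCount < cb.halfOpenMaxCalls
    default:
        return false
    }
}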

// shouldAttemptReset checks if we should transition from OPEN to HALF-OPEN
func (cb *CircuitBreaker) shouldAttemptReset() bool {
    return time.Since(cb.lastFailureTime) > cb.resetTimeout
}

// recordResult records the result of a request and updates state
func (cb *CircuitBreaker) recordResult(success bool) {
    cb.mu.Lock()
    defer cb.mu.Unlock()
    
    if success {
        cb.handleSuccess()
    } else {
        cb.handleFailure()
    }
}

// handleSuccess processes a successful request
func (cb *CircuitBreaker) handleSuccess() {
    switch cb.state {
    case StateClosed:
        // Reset failure count on success
        cb.failureCount = 0
    case StateHalfOpen:
        cb.successCount++
        if cb.successCount >= cb.halfOpenMaxCalls {
            cb.setState(StateClosed)
            cb.reset()
        }
    }
}

// handleFailure processes a failed request
func (cb *CircuitBreaker) handleFailure() {
    cb.lastFailureTime = time.Now()
    
    switch cb.state {
    case StateClosed:
        cb.failureCount++
        if cb.failureCount >= cb.failureThreshold {
            cb.setState(StateOpen)
        }
    case StateHalfOpen:
        cb.setState(StateOpen)
    }
}

// setState changes the circuit breaker state and calls the callback
func (cb *CircuitBreaker) setState(newState State) {
    if cb.state == newState {
        return
    }
    
    oldState := cb.state
    cb.state = newState
    
    if cb.onStateChange != nil {
        go cb.onStateChange(oldState, newState)
    }
}

// reset clears counters and sets state to closed
func (cb *CircuitBreaker) reset() {
    cb.failureCount = 0
    cb.successCount = 0
}

// currentError returns the appropriate error for the current state.
// It is only called after allowRequest has rejected a request.
func (cb *CircuitBreaker) currentError() error {
    cb.mu.RLock()
    defer cb.mu.RUnlock()
    
    switch cb.state {
    case StateHalfOpen:
        return ErrTooManyRequests
    default:
        // Never return nil here: Execute would report success without running fn
        return ErrCircuitBreakerOpen
    }
}

// State returns the current state of the circuit breaker
func (cb *CircuitBreaker) State() State {
    cb.mu.RLock()
    defer cb.mu.RUnlock()
    return cb.state
}

// Stats returns current statistics
func (cb *CircuitBreaker) Stats() Stats {
    cb.mu.RLock()
    defer cb.mu.RUnlock()
    
    return Stats{
        State:           cb.state,
        FailureCount:    cb.failureCount,
        SuccessCount:    cb.successCount,
        LastFailureTime: cb.lastFailureTime,
    }
}

// Stats holds circuit breaker statistics
type Stats struct {
    State           State
    FailureCount    int
    SuccessCount    int
    LastFailureTime time.Time
}
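
One detail worth a closer look: the HALF-OPEN check above compares completed successes against `HalfOpenMaxCalls`, so several concurrent callers can be admitted before any of them finishes. A minimal sketch of one way to cap admissions instead, assuming a dedicated counter. The `probeGate` type and its methods are hypothetical and not part of the breaker above.

```go
package main

import (
	"fmt"
	"sync"
)

// probeGate caps how many trial requests may be admitted while half-open.
// Hypothetical helper for illustration; not part of the breaker above.
type probeGate struct {
	mu       sync.Mutex
	admitted int // trial requests let through since entering half-open
	max      int // upper bound on trial requests
}

// tryAdmit reports whether one more trial request may proceed.
func (g *probeGate) tryAdmit() bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.admitted >= g.max {
		return false
	}
	g.admitted++
	return true
}

// reset clears the counter; call it on every transition into half-open.
func (g *probeGate) reset() {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.admitted = 0
}

func main() {
	g := &probeGate{max: 2}
	results := make(chan bool, 10)

	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			results <- g.tryAdmit()
		}()
	}
	wg.Wait()
	close(results)

	allowed := 0
	for ok := range results {
		if ok {
			allowed++
		}
	}
	fmt.Println("admitted:", allowed) // admitted: 2
}
```

Counting admissions rather than successes means a slow dependency cannot be flooded by probes that all started before the first one returned.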

HTTP Client with Circuit Breaker

Now let's integrate our circuit breaker with an HTTP client:

package main

import (
    "context"
    "fmt"
    "io"
    "net/http"
    "time"

    "your-project/circuitbreaker" // placeholder module path, as in the test file
)

// HTTPClient wraps http.Client with circuit breaker
type HTTPClient struct {
    client  *http.Client
    breaker *circuitbreaker.CircuitBreaker
    baseURL string
}

// NewHTTPClient creates a new HTTP client with circuit breaker
func NewHTTPClient(baseURL string, config circuitbreaker.Config) *HTTPClient {
    return &HTTPClient{
        client: &http.Client{
            Timeout: 10 * time.Second,
        },
        breaker: circuitbreaker.New(config),
        baseURL: baseURL,
    }
}

// Get performs a GET request with circuit breaker protection
func (hc *HTTPClient) Get(ctx context.Context, path string) (*http.Response, error) {
    var resp *http.Response
    
    err := hc.breaker.ExecuteWithContext(ctx, func(ctx context.Context) error {
        req, err := http.NewRequestWithContext(ctx, "GET", hc.baseURL+path, nil)
        if err != nil {
            return err
        }
        
        resp, err = hc.client.Do(req)
        if err != nil {
            return err
        }
        
        // Consider 5xx errors as failures
        if resp.StatusCode >= 500 {
            status := resp.StatusCode
            resp.Body.Close()
            resp = nil // don't hand a closed response back to the caller
            return fmt.Errorf("server error: %d", status)
        }
        
        return nil
    })
    
    return resp, err
}

// Post performs a POST request with circuit breaker protection
func (hc *HTTPClient) Post(ctx context.Context, path string, body io.Reader) (*http.Response, error) {
    var resp *http.Response
    
    err := hc.breaker.ExecuteWithContext(ctx, func(ctx context.Context) error {
        req, err := http.NewRequestWithContext(ctx, "POST", hc.baseURL+path, body)
        if err != nil {
            return err
        }
        req.Header.Set("Content-Type", "application/json")
        
        resp, err = hc.client.Do(req)
        if err != nil {
            return err
        }
        
        if resp.StatusCode >= 500 {
            status := resp.StatusCode
            resp.Body.Close()
            resp = nil // don't hand a closed response back to the caller
            return fmt.Errorf("server error: %d", status)
        }
        
        return nil
    })
    
    return resp, err
}

// Stats returns circuit breaker statistics
func (hc *HTTPClient) Stats() circuitbreaker.Stats {
    return hc.breaker.Stats()
}
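
The `Get` and `Post` wrappers above count every transport error and every 5xx response as a breaker failure. It can help to pull that decision into one small function so the policy is explicit and testable. The sketch below is one possible policy, not the one used above: `countsAsFailure` is a hypothetical helper, and ignoring caller-side cancellation is an assumption you may not want.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
)

// errRefused is a stand-in transport error used only by the demo below.
var errRefused = errors.New("connection refused")

// countsAsFailure decides whether a call outcome should count against the breaker.
// Illustrative policy only: tune it to what "unhealthy" means for your dependency.
func countsAsFailure(status int, err error) bool {
	if err != nil {
		// The caller giving up says nothing about the dependency's health.
		if errors.Is(err, context.Canceled) {
			return false
		}
		return true // timeouts, refused connections, DNS errors, ...
	}
	// 5xx means the server failed; 4xx is the caller's fault and is not counted.
	return status >= http.StatusInternalServerError
}

func main() {
	fmt.Println(countsAsFailure(http.StatusServiceUnavailable, nil)) // true
	fmt.Println(countsAsFailure(http.StatusNotFound, nil))           // false
	fmt.Println(countsAsFailure(0, errRefused))                      // true
	fmt.Println(countsAsFailure(0, context.Canceled))                // false
}
```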

Real-World Example: Payment Service Client

Let's see how this works in practice with a payment service client:

package main

import (
    "bytes"
    "context"
    "encoding/json"
    "fmt"
    "log"
    "time"

    "your-project/circuitbreaker" // placeholder module path, as in the test file
)

// PaymentRequest represents a payment request
type PaymentRequest struct {
    Amount   int64  `json:"amount"`
    Currency string `json:"currency"`
    UserID   string `json:"user_id"`
}

// PaymentResponse represents a payment response
type PaymentResponse struct {
    ID     string `json:"id"`
    Status string `json:"status"`
}

// PaymentClient handles payment service communication
type PaymentClient struct {
    httpClient *HTTPClient
}

// NewPaymentClient creates a new payment client
func NewPaymentClient(baseURL string) *PaymentClient {
    config := circuitbreaker.Config{
        FailureThreshold: 5,
        ResetTimeout:     30 * time.Second,
        HalfOpenMaxCalls: 3,
        OnStateChange: func(from, to circuitbreaker.State) {
            log.Printf("Payment service circuit breaker: %s -> %s", from, to)
        },
    }
    
    return &PaymentClient{
        httpClient: NewHTTPClient(baseURL, config),
    }
}

// ProcessPayment processes a payment request
func (pc *PaymentClient) ProcessPayment(ctx context.Context, req PaymentRequest) (*PaymentResponse, error) {
    data, err := json.Marshal(req)
    if err != nil {
        return nil, fmt.Errorf("failed to marshal request: %w", err)
    }
    
    resp, err := pc.httpClient.Post(ctx, "/payments", bytes.NewReader(data))
    if err != nil {
        // Wrapped with %w so callers can still detect breaker errors via errors.Is
        return nil, fmt.Errorf("payment service unavailable: %w", err)
    }
    defer resp.Body.Close()
    
    var paymentResp PaymentResponse
    if err := json.NewDecoder(resp.Body).Decode(&paymentResp); err != nil {
        return nil, fmt.Errorf("failed to decode response: %w", err)
    }
    
    return &paymentResp, nil
}

// GetStats returns circuit breaker statistics
func (pc *PaymentClient) GetStats() circuitbreaker.Stats {
    return pc.httpClient.Stats()
}

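// HealthCheck probes the payment service's /health endpoint for monitoring
func (pc *PaymentClient) HealthCheck(ctx context.Context) error {
    resp, err := pc.httpClient.Get(ctx, "/health")
    if err != nil {
        return err
    }
    // Close the body so the underlying connection can be reused
    return resp.Body.Close()
}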

Integration with Your Service

Here's how to integrate the payment client into your order service:

package main

import (
    "context"
    "encoding/json"
    "errors"
    "log"
    "net/http"
    "time"

    "your-project/circuitbreaker" // placeholder module path, as in the test file
)

// OrderService handles order processing
type OrderService struct {
    paymentClient *PaymentClient
}

// NewOrderService creates a new order service
func NewOrderService() *OrderService {
    return &OrderService{
        paymentClient: NewPaymentClient("https://payment-service.internal"),
    }
}

// ProcessOrder handles order processing with circuit breaker
func (os *OrderService) ProcessOrder(w http.ResponseWriter, r *http.Request) {
    var order struct {
        UserID   string `json:"user_id"`
        Amount   int64  `json:"amount"`
        Currency string `json:"currency"`
    }
    
    if err := json.NewDecoder(r.Body).Decode(&order); err != nil {
        http.Error(w, "Invalid request", http.StatusBadRequest)
        return
    }
    
    ctx, cancel := context.WithTimeout(r.Context(), 5*time.Second)
    defer cancel()
    
    // Try to process payment
    payment, err := os.paymentClient.ProcessPayment(ctx, PaymentRequest{
        Amount:   order.Amount,
        Currency: order.Currency,
        UserID:   order.UserID,
    })
    
    if err != nil {
        // ProcessPayment wraps the breaker error, so match it with errors.Is rather than ==
        if errors.Is(err, circuitbreaker.ErrCircuitBreakerOpen) {
            // Payment service is down, handle gracefully
            log.Printf("Payment service circuit breaker open, queuing order for later processing")
            
            // Option 1: Queue for later processing
            if err := os.queueOrderForLaterProcessing(order); err != nil {
                http.Error(w, "Service temporarily unavailable", http.StatusServiceUnavailable)
                return
            }
            
            // Return success with pending status
            json.NewEncoder(w).Encode(map[string]interface{}{
                "status":  "pending",
                "message": "Order queued for processing",
            })
            return
        }
        
        // Other errors
        log.Printf("Payment processing failed: %v", err)
        http.Error(w, "Payment processing failed", http.StatusInternalServerError)
        return
    }
    
    // Payment successful
    json.NewEncoder(w).Encode(map[string]interface{}{
        "status":     "completed",
        "payment_id": payment.ID,
    })
}

// queueOrderForLaterProcessing queues order for background processing
func (os *OrderService) queueOrderForLaterProcessing(order interface{}) error {
    // Implementation depends on your queue system (Redis, RabbitMQ, etc.)
    log.Printf("Queuing order for later processing: %+v", order)
    return nil
}

// HealthCheck returns service health including circuit breaker status
func (os *OrderService) HealthCheck(w http.ResponseWriter, r *http.Request) {
    stats := os.paymentClient.GetStats()
    
    health := map[string]interface{}{
        "status": "healthy",
        "payment_service": map[string]interface{}{
            "circuit_breaker_state": stats.State.String(),
            "failure_count":         stats.FailureCount,
            "last_failure":          stats.LastFailureTime,
        },
    }
    
    // If circuit breaker is open, mark as degraded
    if stats.State == circuitbreaker.StateOpen {
        health["status"] = "degraded"
    }
    
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(health)
}

func main() {
    orderService := NewOrderService()
    
    http.HandleFunc("/orders", orderService.ProcessOrder)
    http.HandleFunc("/health", orderService.HealthCheck)
    
    log.Println("Order service starting on :8080")
    log.Fatal(http.ListenAndServe(":8080", nil))
}
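
The `queueOrderForLaterProcessing` stub above only logs the order. As a rough idea of what "queue for later" could look like before you wire in Redis or RabbitMQ, here is a bounded in-memory buffer. Everything in it (`pendingOrder`, `orderQueue`, `errQueueFull`) is hypothetical, and an in-memory queue loses its contents on restart, so treat it as a sketch rather than something to ship.

```go
package main

import (
	"errors"
	"fmt"
)

// pendingOrder is a hypothetical payload; use whatever your order type is.
type pendingOrder struct {
	UserID string
	Amount int64
}

var errQueueFull = errors.New("pending-order queue is full")

// orderQueue is a bounded in-memory buffer. It is lost on restart, so a real
// service would back this with a durable queue instead.
type orderQueue struct {
	ch chan pendingOrder
}

func newOrderQueue(capacity int) *orderQueue {
	return &orderQueue{ch: make(chan pendingOrder, capacity)}
}

// enqueue never blocks the request handler: it fails fast when the buffer is full.
func (q *orderQueue) enqueue(o pendingOrder) error {
	select {
	case q.ch <- o:
		return nil
	default:
		return errQueueFull
	}
}

// drain hands every currently queued order to process and returns how many it handled.
func (q *orderQueue) drain(process func(pendingOrder)) int {
	n := 0
	for {
		select {
		case o := <-q.ch:
			process(o)
			n++
		default:
			return n
		}
	}
}

func main() {
	q := newOrderQueue(2)
	fmt.Println(q.enqueue(pendingOrder{UserID: "u1", Amount: 500})) // <nil>
	fmt.Println(q.enqueue(pendingOrder{UserID: "u2", Amount: 700})) // <nil>
	fmt.Println(q.enqueue(pendingOrder{UserID: "u3", Amount: 900})) // pending-order queue is full

	n := q.drain(func(o pendingOrder) { fmt.Println("retrying", o.UserID) })
	fmt.Println("drained:", n) // drained: 2
}
```

A natural place to call `drain` would be the breaker's state-change callback when it returns to CLOSED, though that wiring is left out here.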

Monitoring and Observability

Metrics to Track

package metrics

import (
    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promauto"

    "your-project/circuitbreaker" // placeholder module path, as in the test file
)

var (
    circuitBreakerStateGauge = promauto.NewGaugeVec(
        prometheus.GaugeOpts{
            Name: "circuit_breaker_state",
            Help: "Current state of circuit breaker (0=closed, 1=open, 2=half-open)",
        },
        []string{"service"},
    )
    
    circuitBreakerTripsTotal = promauto.NewCounterVec(
        prometheus.CounterOpts{
            Name: "circuit_breaker_trips_total",
            Help: "Total number of circuit breaker trips",
        },
        []string{"service"},
    )
    
    circuitBreakerFailuresTotal = promauto.NewCounterVec(
        prometheus.CounterOpts{
            Name: "circuit_breaker_failures_total",
            Help: "Total number of failures",
        },
        []string{"service"},
    )
)

// RecordCircuitBreakerState records the current state
func RecordCircuitBreakerState(service string, state circuitbreaker.State) {
    circuitBreakerStateGauge.WithLabelValues(service).Set(float64(state))
}

// RecordCircuitBreakerTrip records a circuit breaker trip
func RecordCircuitBreakerTrip(service string) {
    circuitBreakerTripsTotal.WithLabelValues(service).Inc()
}

// RecordCircuitBreakerFailure records a failure
func RecordCircuitBreakerFailure(service string) {
    circuitBreakerFailuresTotal.WithLabelValues(service).Inc()
}

Enhanced Circuit Breaker with Metrics

// Create circuit breaker with metrics
config := circuitbreaker.Config{
    FailureThreshold: 5,
    ResetTimeout:     30 * time.Second,
    HalfOpenMaxCalls: 3,
    OnStateChange: func(from, to circuitbreaker.State) {
        log.Printf("Circuit breaker state change: %s -> %s", from, to)
        metrics.RecordCircuitBreakerState("payment-service", to)
        
        if to == circuitbreaker.StateOpen {
            metrics.RecordCircuitBreakerTrip("payment-service")
        }
    },
}

Testing Your Circuit Breaker

package circuitbreaker_test

import (
    "errors"
    "testing"
    "time"
    
    "your-project/circuitbreaker"
)

func TestCircuitBreakerBasicFlow(t *testing.T) {
    cb := circuitbreaker.New(circuitbreaker.Config{
        FailureThreshold: 2,
        ResetTimeout:     100 * time.Millisecond,
        HalfOpenMaxCalls: 1,
    })
    
    // Initially closed
    if cb.State() != circuitbreaker.StateClosed {
        t.Errorf("Expected CLOSED, got %v", cb.State())
    }
    
    // Simulate failures
    for i := 0; i < 2; i++ {
        err := cb.Execute(func() error {
            return errors.New("service error")
        })
        if err == nil {
            t.Error("Expected error, got nil")
        }
    }
    
    // Should be open now
    if cb.State() != circuitbreaker.StateOpen {
        t.Errorf("Expected OPEN, got %v", cb.State())
    }
    
    // Should reject requests
    err := cb.Execute(func() error {
        return nil
    })
    if err != circuitbreaker.ErrCircuitBreakerOpen {
        t.Errorf("Expected ErrCircuitBreakerOpen, got %v", err)
    }
    
    // Wait for reset timeout
    time.Sleep(150 * time.Millisecond)
    
    // Should allow request and transition to half-open
    err = cb.Execute(func() error {
        return nil
    })
    if err != nil {
        t.Errorf("Expected no error, got %v", err)
    }
    
    // Should be closed again
    if cb.State() != circuitbreaker.StateClosed {
        t.Errorf("Expected CLOSED, got %v", cb.State())
    }
}

Results: Before vs After

Before Circuit Breaker

  • MTTR: 45 minutes (manual intervention required)
  • Cascade failures: 15 downstream services affected
  • Lost revenue: $2M during Black Friday
  • Customer impact: Complete checkout failure

After Circuit Breaker

  • MTTR: 2 minutes (automatic recovery)
  • Cascade failures: 0 (isolated failure)
  • Lost revenue: <$50k (graceful degradation)
  • Customer impact: Orders queued, processed later

Key Takeaways

  1. Fail fast: Don't waste resources on doomed requests
  2. Graceful degradation: Queue orders when payment service is down
  3. Automatic recovery: Test service health automatically
  4. Monitor everything: Track circuit breaker states and failures
  5. Test thoroughly: Circuit breakers can hide bugs if not tested properly

Common Pitfalls

  1. Timeout too short: An overly tight request timeout turns brief network hiccups into counted failures and trips the breaker
  2. Threshold too low: Avoid false positives from occasional errors
  3. No fallback: Always have a plan B when the circuit is open
  4. Ignoring half-open: Test with limited traffic before fully closing the circuit
  5. No monitoring: You need visibility into circuit breaker behavior
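
For the "threshold too low" pitfall, one common alternative to counting raw failures is to trip on a failure rate, and only once enough calls have been observed. The sketch below shows the idea. The `window` type is hypothetical; the breaker in this post counts failures since the last success instead, and a real version would also expire old samples.

```go
package main

import "fmt"

// window holds call counts for the current observation period.
// Hypothetical type: the breaker in this article counts failures directly instead.
type window struct {
	total    int
	failures int
}

// record adds one call outcome to the window.
func (w *window) record(success bool) {
	w.total++
	if !success {
		w.failures++
	}
}

// shouldTrip opens the circuit only when enough calls were seen (minCalls)
// and the share of failures reaches maxFailureRate (0.0 to 1.0).
func (w *window) shouldTrip(minCalls int, maxFailureRate float64) bool {
	if w.total < minCalls {
		return false // too little data: a couple of errors is not a trend
	}
	return float64(w.failures)/float64(w.total) >= maxFailureRate
}

func main() {
	w := &window{}
	w.record(false)
	w.record(false)
	fmt.Println(w.shouldTrip(10, 0.5)) // false: only 2 calls observed

	for i := 0; i < 8; i++ {
		w.record(i%2 == 0) // 4 successes, 4 failures
	}
	fmt.Println(w.shouldTrip(10, 0.5)) // true: 6 failures out of 10 calls
}
```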

What's Next?

  • Implement retry patterns for transient failures
  • Add bulkhead isolation to prevent resource exhaustion
  • Build adaptive timeouts based on service performance
  • Create chaos engineering tests to validate resilience
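
On the first item, retries and circuit breakers need to cooperate: retrying against an open breaker just burns time. Here is a minimal sketch of exponential backoff that stops as soon as it sees the breaker's "open" error. `errOpen` stands in for `ErrCircuitBreakerOpen` so the example is self-contained, `retry` and `errTransient` are hypothetical names, and jitter is left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errOpen stands in for the breaker's "open" error in this self-contained sketch.
var errOpen = errors.New("circuit breaker is open")

// errTransient is a stand-in for a retryable failure, used by the demo below.
var errTransient = errors.New("transient failure")

// retry calls fn up to attempts times, doubling the wait after each failure.
// It gives up immediately on errOpen: retrying a tripped breaker only adds load.
func retry(attempts int, baseDelay time.Duration, fn func() error) error {
	var err error
	delay := baseDelay
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if errors.Is(err, errOpen) {
			return err
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return err
}

func main() {
	calls := 0
	err := retry(4, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	})
	fmt.Println(calls, err) // 3 <nil>

	calls = 0
	err = retry(4, 10*time.Millisecond, func() error {
		calls++
		return errOpen
	})
	fmt.Println(calls, err) // 1 circuit breaker is open
}
```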

The Bottom Line: Circuit breakers saved us millions in revenue and countless hours of downtime. They're not just a nice-to-have; they're essential for any production microservices architecture.

Have you implemented circuit breakers in your services? Share your experience in the comments below!

Cap

Senior Golang Backend & Web3 Developer with 10+ years of experience building scalable systems and blockchain solutions.
