Introduction

Most, if not all, of the systems we develop nowadays depend on other services, which may live in the same process, on the same machine, or across a network. Each of these boundaries is a place where things may fail. Resilience is the ability of your system to react in an organized way when those failures occur.

Your particular approach to resilience depends on multiple factors. Is it possible to try that request again? Should the administrator be alerted if the detected error is fatal? Instead of predefined answers, Arrow aims to provide a set of tools that you can compose to specify your solution concisely.

Where to find it

All the resilience mechanisms described in this section are part of the arrow-resilience library. Prior to version 1.2.0, they were available as part of arrow-fx-coroutines.
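
As a minimal sketch, the dependency could be declared as follows in a Gradle build that uses the Kotlin DSL. The file name `build.gradle.kts` and the version shown are assumptions for illustration; use whichever Arrow version your project targets.

```kotlin
// build.gradle.kts (assumed file name; version chosen only for illustration)
dependencies {
    // From version 1.2.0 onward the resilience tools live in their own artifact
    implementation("io.arrow-kt:arrow-resilience:1.2.0")
}
```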

The Arrow Resilience library implements three of the most critical design patterns around resilience:

  • Retry and repeat computations using Schedule,
  • Protect other services from being overloaded using CircuitBreaker,
  • Implement transactional behavior in distributed systems in the form of Saga.
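
To give a flavor of the first item in the list above, here is a small sketch of retrying a flaky computation with `Schedule`. It assumes the `Schedule.recurs` constructor and the `retry` extension from `arrow.resilience`; the function `fetchWithRetry` and its simulated failure are hypothetical and only stand in for a real remote call.

```kotlin
import arrow.resilience.Schedule
import arrow.resilience.retry

// Hypothetical operation: fails on the first two attempts, then succeeds.
suspend fun fetchWithRetry(): String {
    var attempts = 0
    // Policy: allow up to 3 retries after the initial attempt.
    val policy = Schedule.recurs<Throwable>(3)
    return policy.retry {
        attempts++
        if (attempts < 3) throw IllegalStateException("transient failure")
        "ok"
    }
}

suspend fun main() {
    println(fetchWithRetry())
}
```

The policy is an ordinary value, so it can be defined once and reused across calls, or combined with other schedules (for example, one that adds a delay between attempts) without changing the code being retried.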

Media Resources

The following videos showcase how to introduce resilience in your applications.

Further Reading

If you want to know more about these patterns, you can check some of these guides: