Our Android testing process


When we took on the daunting task of rewriting our entire app at the end of 2019, one of our key focus areas was testability. At the time, we had less than 20% of our code covered by unit tests — no integration or end-to-end (E2E) tests — and adding any tests to the code base was a big effort. From the beginning, we agreed on the standard of at least 80% unit test coverage on all Pull Requests (PRs), E2E tests for critical flows, and an architecture that was focused on testability.

Testing Strategy

Shortly after the rewrite, we had hundreds of unit tests and around twenty E2E tests. We wanted to limit E2E tests to the use cases where the reduced risk justifies their additional time and cost.

[Figure: Test selection pyramid for cost and speed]

Unit Tests

Our unit tests are straightforward. They test a very small unit of code by relying on JUnit 5 and different layers of mocking:

  • We introduced the Java Faker library, which provides “fake” primitive values for inputs that do not need a specific value.
  • Similarly, we have our own “model Fakers” that provide domain objects. That way, when the underlying classes change, we only make modifications in the Fakers.
  • We use the AAA (Arrange, Act, Assert) pattern to keep our tests organized and consistent.
  • We also use Robolectric for the handful of unit tests that go through Android-specific code.
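As a sketch of how these pieces fit together — the class names (`User`, `UserFaker`, `GreetingFormatter`) are hypothetical illustrations, not from our actual code base — a JUnit 5 test following the AAA pattern with a model Faker might look like:

```kotlin
import org.junit.jupiter.api.Assertions.assertEquals
import org.junit.jupiter.api.Test

// Domain model under test (illustrative).
data class User(val firstName: String, val lastName: String)

// A tiny "model Faker": centralizes construction of domain objects so that
// tests only override the fields they care about. When User gains new
// fields, only this Faker needs updating.
object UserFaker {
    fun fake(firstName: String = "Ada", lastName: String = "Lovelace") =
        User(firstName = firstName, lastName = lastName)
}

class GreetingFormatter {
    fun greet(user: User) = "Hello, ${user.firstName}!"
}

class GreetingFormatterTest {
    @Test
    fun `greets user by first name`() {
        // Arrange: build the input via the Faker, overriding only what matters.
        val user = UserFaker.fake(firstName = "Sam")
        val formatter = GreetingFormatter()

        // Act
        val greeting = formatter.greet(user)

        // Assert
        assertEquals("Hello, Sam!", greeting)
    }
}
```

The point of the Faker is that `lastName` never appears in the test body: the test stays focused on the one field it actually exercises.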

E2E Tests

Our E2E tests validate the UI in our app by running instrumentation tests on physical devices in Firebase Test Lab (FTL). While this worked really well at catching UI regressions, we soon realized that the approach did not scale as the number of tests increased, largely because of flakiness.

Integration Tests

We realized this wasn’t sustainable, so we decided to introduce UI integration tests. These solved our primary issue, network flakiness, by mocking out the network completely with OkHttp’s MockWebServer and serving fake JSON responses. We started to see fast, consistent test results that could reliably gate our PRs.
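A minimal sketch of this pattern — the screen name, endpoint payload, and assertions are hypothetical, not taken from our app — looks like this with MockWebServer:

```kotlin
import okhttp3.mockwebserver.MockResponse
import okhttp3.mockwebserver.MockWebServer
import org.junit.After
import org.junit.Before
import org.junit.Test

class MeditationListIntegrationTest {

    private lateinit var server: MockWebServer

    @Before
    fun setUp() {
        server = MockWebServer()
        // Serve a canned JSON payload instead of hitting the real backend,
        // which removes network flakiness from the test entirely.
        server.enqueue(
            MockResponse()
                .setResponseCode(200)
                .setBody("""{ "meditations": [ { "id": 1, "title": "Breathe" } ] }""")
        )
        server.start()
        // The app's base URL is then pointed at the local server, e.g. by
        // injecting server.url("/") through a test dependency override.
    }

    @After
    fun tearDown() {
        server.shutdown()
    }

    @Test
    fun displaysMeditationsFromFakeResponse() {
        // Launch the screen under test and assert on the rendered UI,
        // e.g. with Espresso:
        // onView(withText("Breathe")).check(matches(isDisplayed()))
    }
}
```

Because the JSON fixture lives next to the test, a failing assertion points at either a UI regression or an intentional contract change, never at a flaky network.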

CI Pipeline and Reporting

Overall Setup

We want our tests to add confidence without slowing us down. As such, we have different rules for when tests are run.

  1. Regression — runs at 12pm every day as a separate CI job.
  2. Minimum Acceptance Tests (MAT) — runs at 4pm every day as a separate CI job.
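As an illustration only — we are not prescribing a CI provider, and the workflow name, Gradle task, and cron time below are placeholders — a scheduled suite like the regression run can be expressed in a GitHub Actions-style config:

```yaml
# Hypothetical scheduled workflow; names and times are illustrative.
# Note that cron schedules run in UTC, so adjust for your time zone.
name: regression-suite
on:
  schedule:
    - cron: "0 20 * * *"   # once daily, e.g. corresponding to a 12pm local run
jobs:
  regression:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run regression tests
        run: ./gradlew connectedRegressionAndroidTest  # placeholder task name
```

The MAT job would be a second, analogous workflow with its own schedule, keeping the two suites as separate CI jobs.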

Beyond FTL

One of the most difficult challenges we have faced when maintaining our UI tests is keeping them in a green state. To make failures easier to track and triage, our test reporting needs to:

  • Group all tests in a single list, whether they ran in parallel or not.
  • Automatically file JIRA tickets for test failures.

Looking Into The Future

We are constantly focusing on ways to improve our testing strategy.

  • We’re also (finally) finishing our migration from Dagger to Hilt. This will allow us, among many other things, to better swap dependencies in our tests.
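To illustrate what “swap dependencies in our tests” means in Hilt — the analytics module and tracker classes here are hypothetical examples, not our production code — a test module can transparently replace a production one:

```kotlin
import dagger.Module
import dagger.Provides
import dagger.hilt.InstallIn
import dagger.hilt.components.SingletonComponent
import dagger.hilt.testing.TestInstallIn
import javax.inject.Singleton

interface AnalyticsTracker { fun track(event: String) }

class RealAnalyticsTracker : AnalyticsTracker {
    override fun track(event: String) { /* send to the analytics backend */ }
}

// Production module, bound in the app's main source set.
@Module
@InstallIn(SingletonComponent::class)
object AnalyticsModule {
    @Provides @Singleton
    fun provideTracker(): AnalyticsTracker = RealAnalyticsTracker()
}

// Test double that records events locally instead of sending them anywhere.
class RecordingAnalyticsTracker : AnalyticsTracker {
    val events = mutableListOf<String>()
    override fun track(event: String) { events += event }
}

// Lives in the androidTest source set: Hilt substitutes this module for
// AnalyticsModule in every instrumented test, with no per-test wiring.
@Module
@TestInstallIn(
    components = [SingletonComponent::class],
    replaces = [AnalyticsModule::class]
)
object FakeAnalyticsModule {
    @Provides @Singleton
    fun provideTracker(): AnalyticsTracker = RecordingAnalyticsTracker()
}
```

With plain Dagger this kind of substitution typically requires hand-built test components; `@TestInstallIn` makes it a one-time declaration.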


There’s no one-size-fits-all for testing. We believe that it’s mostly about striking the right balance between having a stable app and moving with speed. However, there are ways to make your testing suite easier to maintain and more insightful. By keeping tests cheap (fast, low-maintenance, low-flakiness), we give our developers higher confidence in the code and allow them to move faster.





Headspace is meditation made simple. Learn with our app or online, when you want, wherever you are, in just 10 minutes a day.