
Software testing is critical to evaluating and improving the performance, security, and overall success of an application. It requires the dedicated support of skilled testing professionals who know what to look for and what the resulting data means. Because it’s tailored to the application itself, testing is rarely one-size-fits-all, and that is especially true of complex, microservice-based architectures.

In our Intro to Software Testing article, we broke down some common types of testing in the software testing cycle. But what types of testing should you be implementing specifically if you’re using microservices? What different types of traffic, bottlenecks, failures, and performance issues do you need to consider?

Previously, we discussed the unique challenges microservices present in terms of security. Here, we’ll take a look at some things to consider when performance testing a microservices environment.

Performance Testing for Peace of Mind

Performance testing generates data and key performance indicators (KPIs) about an app’s stability, reliability, speed, and resource usage, which can help you correct bugs, errors, and bottlenecks before the app goes into production. Like most other practices, though, performance testing needs to be adjusted when you’re working with distributed systems.

With microservices, the environment is always evolving. That’s the beauty of being able to deploy services independently of one another and to scale up where and when you need to. But it also means that different, difficult-to-predict failures among the various interconnected services are inevitable. So how can you be as confident as possible that your app won’t suffer major lags or outages once it’s in production? How can you test to account for the added choreography of microservices?

“We need testing strategies that will provide confidence that a service artifact can proceed to the production environment. Testing a large distributed system is hard, and continuous delivery makes this harder since the shared runtime environment is changing constantly. We are seeing promising results with contract testing, which is one of the various microservices testing strategies we’ve considered.”

-Stratis Karamanlakis, Chief Technology Officer, Upwork, from Upwork Modernization: An Overview.

There are a few types of performance tests that can tell you different things about the way your app will behave in production. Here are a few to know, along with the questions each one should try to answer.

1. Base testing

Does the software work? This first, most basic question can be answered with base testing. The goal here is to determine the overall functionality of the app and the “steady state” of the distributed system. Similar to the way you’d map out microservices to make it easier to identify any security aberrations, establishing this baseline behavior tells you what output indicates the system is functioning properly before you add any extra stress. (It’s also helpful for planning and finding ways the system can evolve.)

End-to-end testing ensures the system as a whole meets its business requirements, without testing at the level of individual microservices. Similarly, capacity testing will tell you whether your app can handle the baseline amount of traffic it was designed for.
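As a rough illustration of what recording a “steady state” could look like, the sketch below reduces a set of latency samples to a couple of baseline numbers and checks a later run against them. The sample values, the 20% tolerance, and the function names are all hypothetical; a real baseline would come from your own monitoring data.

```python
import statistics

def summarize(latencies_ms):
    """Reduce raw latency samples to a few baseline numbers."""
    ordered = sorted(latencies_ms)
    # Nearest-rank 95th percentile: ceil(0.95 * n), computed with integers.
    rank = max(1, -(-95 * len(ordered) // 100))
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[rank - 1],
    }

def within_baseline(baseline, current, tolerance=0.2):
    """True if the current p95 is at most `tolerance` above the baseline p95."""
    return current["p95_ms"] <= baseline["p95_ms"] * (1 + tolerance)

# Hypothetical samples (milliseconds) from a healthy run and a later run.
baseline = summarize([100, 110, 105, 120, 115, 108, 112, 109, 111, 130])
later = summarize([100, 112, 108, 125, 118, 110, 115, 111, 113, 180])

print(baseline)
print("later run within baseline:", within_baseline(baseline, later))
```

Here the later run has a slower tail than the baseline allows, which is the kind of signal a steady-state check is meant to surface.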


2. Unit testing

Does each component work well on its own? Unit testing belongs in any granular microservices testing strategy. Unit testing also has benefits in a monolithic architecture: you isolate the smallest usable components of an app and test them one by one to ensure each works on its own before integrating it with the app as a whole. But with microservices, it’s a bit easier to do because those components are already separated out.
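To make the idea concrete, here is a minimal sketch of a unit test that isolates one small piece of logic from the service it depends on. The `price_with_tax` function and the catalog client are hypothetical; the client is replaced with a mock from Python’s standard `unittest.mock` module so the test never touches the network.

```python
import unittest
from unittest.mock import Mock

def price_with_tax(item_id, catalog_client, tax_rate=0.1):
    """Look up a price through the (hypothetical) catalog client and add tax."""
    price = catalog_client.get_price(item_id)
    return round(price * (1 + tax_rate), 2)

class PriceWithTaxTest(unittest.TestCase):
    def test_adds_tax_to_catalog_price(self):
        # The mock stands in for the real catalog service.
        catalog = Mock()
        catalog.get_price.return_value = 20.0

        self.assertEqual(price_with_tax("sku-1", catalog), 22.0)
        catalog.get_price.assert_called_once_with("sku-1")

# Run the test case programmatically and keep the result for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(PriceWithTaxTest)
result = unittest.TextTestRunner().run(suite)
```

Passing the client in as an argument is one design choice among several; it simply makes the dependency easy to swap out in a test.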


3. Integration testing

Are the communication pathways between services functioning properly? With microservices, it’s important to recognize that you’re not just testing software like you would with a tiered monolith. You’re testing functionality as a whole, each microservice on its own, and the calls between microservices, or “the plumbing.” The interfaces where the services communicate with one another (typically APIs) are critical to the functionality of the app, so it’s just as important to test the communication pathways as it is to test the components themselves.

Test the interfaces between services for defects that might interrupt their interactions. Many microservices environments rely on RESTful HTTP or other messaging protocols to interface with one another. Running gateway integration tests will locate any errors at the protocol level, and integration contract testing will ensure that each component behaves the way it should when contacted. The APIs between the components should also be tested to verify they’re delivering the right information in the correct format.
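A contract check can be as simple as comparing a response body against the fields and types a consumer expects. The sketch below assumes a hypothetical user service that returns JSON; the contract, field names, and sample bodies are invented for illustration, and dedicated contract-testing tools do considerably more than this.

```python
import json

# Hypothetical contract: the fields a consumer relies on, and their types.
CONTRACT = {"id": int, "email": str, "active": bool}

def contract_violations(body, contract):
    """Return a list of ways the JSON body breaks the contract (empty if none)."""
    payload = json.loads(body)
    problems = []
    for field, expected_type in contract.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif type(payload[field]) is not expected_type:
            actual = type(payload[field]).__name__
            problems.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    return problems

# Sample response bodies a provider might return.
good_body = '{"id": 42, "email": "user@example.com", "active": true}'
bad_body = '{"id": "42", "email": "user@example.com"}'

print(contract_violations(good_body, CONTRACT))
print(contract_violations(bad_body, CONTRACT))
```

The second body passes as valid JSON but would still break a consumer, which is exactly the kind of format problem this level of testing is meant to catch.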

4. Load testing

How does the app handle day-to-day transactions? A big difference between microservices and a monolith is the network calls that occur between services, and that is where bottlenecks are likely to occur. Test how the app and network hold up under a high number of concurrent calls, and test the different “pathways” a call can take depending on what a user is requesting from the app. The end goal is to find any bottlenecks and correct them.

Be sure to mix up what you put your app through in testing with different events and combinations of events that might threaten the stability of your app. Test for a high volume of calls. Create tests to see how the app as a whole performs when one microservice needs to transfer a large amount of data to another. Test for big events, or smaller, day-to-day events. The idea here is to identify any weaknesses like aberrant behaviors, outages from too much downstream traffic, response time latencies, or the effects of too many “retries” if timeouts aren’t properly set up.
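To see why retries deserve attention, consider a chain of services in which each caller retries a failed downstream call. The arithmetic below is a worst-case sketch with hypothetical numbers: it assumes every attempt fails and that no backoff, timeout, or retry budget limits the retries.

```python
def worst_case_attempts(retries_per_caller, retrying_callers):
    """Attempts that reach the deepest dependency for one user request.

    Each caller makes 1 initial attempt plus `retries_per_caller` retries,
    and every one of those attempts triggers the same pattern one level down.
    """
    return (1 + retries_per_caller) ** retrying_callers

# One user request, three services in a row, each retrying 3 times.
print(worst_case_attempts(retries_per_caller=3, retrying_callers=3))  # 64
```

A single request turning into dozens of downstream calls is one way a struggling service can be pushed further into failure.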

The network will often be the bottleneck, and this type of testing will uncover which parts of the app aren’t scalable enough to keep it from melting down under a high volume of traffic.
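The shape of a simple load test can be sketched with nothing but the standard library. In the example below, `fake_service_call` is a hypothetical stand-in for a real network call, and the request count and concurrency are arbitrary; dedicated load-testing tools would normally replace a script like this.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_service_call(request_id):
    """Stand-in for a network call to a downstream service (hypothetical)."""
    time.sleep(0.01)  # simulate network and processing time
    return {"request_id": request_id, "status": 200}

def timed_call(request_id):
    """Make one call and report whether it succeeded and how long it took."""
    start = time.perf_counter()
    try:
        ok = fake_service_call(request_id)["status"] == 200
    except Exception:
        ok = False
    return ok, time.perf_counter() - start

def run_load(total_requests, concurrency):
    """Issue requests from a pool of worker threads and summarize the results."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))
    latencies = sorted(elapsed for _, elapsed in results)
    return {
        "requests": len(results),
        "errors": sum(1 for ok, _ in results if not ok),
        "max_latency_s": latencies[-1],
    }

report = run_load(total_requests=50, concurrency=10)
print(report)
```

Varying `concurrency` and the mix of calls is where a test like this starts to reveal which pathways slow down first.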


5. Stress testing

How does the app perform under extreme volume? Stress testing takes load testing further and pushes your app to its limits to see how it handles an extremely heavy load. This type of testing will give you a good idea of your app’s ceiling, or break point. It’s not an everyday scenario, but it should give you good indicators of what will cause your system to become overloaded. (Other types of stress testing include spike tests and soak tests, which show how the app handles sudden increases in traffic and long, sustained periods of load, respectively.)
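The search for a break point can be pictured as stepping the load upward until errors cross a limit. The sketch below uses a toy model of a service with a fixed, hypothetical capacity rather than a real system, and the spike and soak profiles are simply lists of request rates per time step.

```python
def simulated_error_rate(requests_per_second, capacity=400):
    """Toy model: requests beyond a hypothetical capacity fail."""
    if requests_per_second <= capacity:
        return 0.0
    return (requests_per_second - capacity) / requests_per_second

def find_break_point(start_rps, step_rps, max_rps, max_error_rate=0.05):
    """Raise load step by step; return the first rate whose errors exceed the limit."""
    rps = start_rps
    while rps <= max_rps:
        if simulated_error_rate(rps) > max_error_rate:
            return rps
        rps += step_rps
    return None  # no break point found within the tested range

def spike_profile(base_rps, peak_rps, steps, spike_at):
    """Steady traffic with one sudden jump at step `spike_at`."""
    return [peak_rps if i == spike_at else base_rps for i in range(steps)]

def soak_profile(rps, steps):
    """The same sustained load held for many steps."""
    return [rps] * steps

print(find_break_point(start_rps=100, step_rps=50, max_rps=1000))  # 450
print(spike_profile(100, 900, steps=5, spike_at=2))
print(soak_profile(300, steps=4))
```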

6. Resiliency testing

“Our experience with the cloud has not been without glitches, a couple of which have been major. Cloud infrastructure can be volatile and can fail in ways dissimilar to the ones we had learned to expect using a data center. The important takeaway is that failure is a given and we need to plan and design around it. Resiliency must be baked into our designs and plans.”

-Stratis Karamanlakis, Chief Technology Officer, Upwork, from Upwork Modernization: An Overview.

How does the app stand up to failures? Failures are inevitable in an environment that’s always evolving. Rigorous resiliency testing is a good way to measure how well your app can stand up to isolated failures.

A primary goal in microservice testing is to prevent cascading failures once your app is in production. Depending on your architecture, a microservices application can face failures at several levels: a failure can stay isolated to one component, ripple out to other services, or cause a complete outage.

For example, individual servers that are running part of a specific service may crash or become unavailable. Or parts of the network may fail, making it impossible for certain services to communicate with one another. Services don’t only communicate with each other, though. If an external data center goes down, that needs to be accounted for, too.
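One common way to keep an isolated failure from cascading is a circuit breaker, which stops calling a dependency after repeated failures and serves a fallback instead. The sketch below is deliberately minimal and is not taken from any particular library: the service names are hypothetical, and a production breaker would also reopen the circuit after a cool-down period.

```python
class CircuitBreaker:
    """Stop calling a failing dependency after `max_failures` consecutive errors."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0

    @property
    def is_open(self):
        return self.failures >= self.max_failures

    def call(self, func, fallback):
        if self.is_open:
            return fallback()  # skip the dependency entirely
        try:
            result = func()
        except Exception:
            self.failures += 1
            return fallback()
        self.failures = 0  # a success resets the count
        return result

calls = {"count": 0}

def flaky_recommendations():
    """Hypothetical dependency that is currently down."""
    calls["count"] += 1
    raise ConnectionError("recommendation service unavailable")

def default_recommendations():
    return ["bestsellers"]

breaker = CircuitBreaker(max_failures=3)
results = [breaker.call(flaky_recommendations, default_recommendations) for _ in range(5)]
print(results)
print("calls that reached the failing service:", calls["count"])
```

A resiliency test would then confirm both halves of this behavior: users still get a response, and the failing service stops receiving traffic.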

A New Discipline: Chaos Engineering

All of the above fall under the bigger picture of performance testing—tests designed to gauge the stability, scalability, speed, and responsiveness of a distributed system.

Related to that is chaos engineering, a way to observe the behavior of microservices in controlled environments to see how they perform when parts of the system fail. It differs slightly from typical resilience testing in that it deliberately causes failures to reveal any “systemic uncertainty in distributed systems.”

“The principles of Chaos Engineering and tools such as the Netflix Chaos Monkey emphasize the practice of proactively stressing a system in order to understand its limits and build confidence in its expected availability and resiliency. We are not there yet, but we will soon start to adopt some of these approaches.”

-Stratis Karamanlakis, Chief Technology Officer, Upwork, from Upwork Modernization: An Overview.

Why “chaos”? Well, these systems can get a little chaotic—especially when you take into account that things are always evolving, each service needs to be functioning properly on its own, external services and databases can fail, and all of the boundaries and gateways internally and externally need to be tightly controlled. If one instance goes down, what do components that rely on it fall back on?

The key is to set up the right redundancies so systems can tolerate failures without causing service interruptions. Chaos Monkey, the Netflix tool mentioned above, is available on GitHub.
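A chaos experiment in miniature might look like the following sketch, which deliberately injects failures into a dependency and checks that callers still get a usable response. It is not how Chaos Monkey works internally; the profile service, the failure rate, and the fallback shape are all hypothetical, and the random generator is seeded only to make the run repeatable.

```python
import random

def inject_failures(func, failure_rate, rng):
    """Wrap a call so that it randomly fails at roughly the given rate."""
    def wrapped(*args, **kwargs):
        if rng.random() < failure_rate:
            raise ConnectionError("injected failure")
        return func(*args, **kwargs)
    return wrapped

def get_profile(user_id):
    """Hypothetical downstream service call."""
    return {"user_id": user_id, "name": "example"}

def get_profile_with_fallback(fetch, user_id):
    """Caller under test: degrade gracefully when the dependency fails."""
    try:
        return fetch(user_id)
    except ConnectionError:
        return {"user_id": user_id, "name": None, "degraded": True}

rng = random.Random(42)  # seeded so the experiment is repeatable
chaotic_fetch = inject_failures(get_profile, failure_rate=0.3, rng=rng)

responses = [get_profile_with_fallback(chaotic_fetch, uid) for uid in range(100)]
degraded = sum(1 for r in responses if r.get("degraded"))

print("responses returned:", len(responses))
print("served from fallback:", degraded)
```

The hypothesis being tested is that every request is answered even while the dependency is failing; the number served from the fallback shows how much of the traffic the redundancy had to absorb.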