How to Choose a More Effective UI Testing Framework

Originally published: March 8, 2021 | Last updated: September 16, 2024 | W. Perry Wortman

In this article we’re going to discuss four steps to improve the quality of the user interfaces (UIs) we create. Using a modern UI testing framework will not only enhance our experience as developers; more importantly, our users will experience a more reliable interface. Let’s begin, shall we?

1. Identify pain points

By understanding the shortcomings of our current toolset, we can estimate which features will be most valuable to us and which we care less about. I will go through the most pertinent challenges we faced at Dashlane while working with a Ruby-based UI testing stack of Watir, Cucumber and Selenium.

Flaky tests

In order to talk about flaky tests, we should first talk for a second about determinism.

In the context of programming, code is deterministic when it always produces the same output for a given input. This is not only an important idea in the day-to-day code we write, but especially important to testing. When a test passes on some runs and fails on others without any change to the code under test, its output is no longer reliable, and we have what is referred to as a flaky test.
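
As a small illustration of that definition, the sketch below contrasts a function whose result depends on a hidden input (the system clock) with one that receives the time as an explicit argument. The `LoginSession` type and both function names are made up for this example and are not taken from Dashlane's codebase.

```typescript
// Hypothetical example: the type and function names are illustrative only.
interface LoginSession {
  expiresAtMs: number; // expiry as a Unix timestamp in milliseconds
}

// Non-deterministic: the result depends on when the function happens to run.
function isExpiredNow(session: LoginSession): boolean {
  return Date.now() >= session.expiresAtMs;
}

// Deterministic: the same (session, nowMs) pair always gives the same answer.
function isExpiredAt(session: LoginSession, nowMs: number): boolean {
  return nowMs >= session.expiresAtMs;
}

const demoSession: LoginSession = { expiresAtMs: 1000 };

// A test can pin the clock and get a repeatable result on every run.
const beforeExpiry = isExpiredAt(demoSession, 999); // false
const atExpiry = isExpiredAt(demoSession, 1000); // true
```

A test built on `isExpiredAt` gives the same verdict every time, while one built on `isExpiredNow` can flip depending on when the pipeline happens to run.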

Flaky tests can lead to...

missed bugs that get into production

a frustrating developer experience

false positives

low confidence in tests

slower pipelines

delayed feature release times

solutions that no longer represent the original test case

long-term test suite degradation

To illustrate some of these dangers, let’s take an example test suite with a case that depends on some kind of async behavior, and let’s say that this test fails 50% of the time in our CI/CD. Now we have to rerun our pipeline twice as often because half of the runs fail. It is also no longer immediately obvious whether a given failure is a false positive or a true positive that should be fixed. This adds cognitive load and defeats the entire purpose of our CI test verification. Not good.

As a temporary workaround, the team decides to create a script to collect and rerun any failures up to ten times. Sounds like a decent solution, right? Not so fast. The team has just accepted uncertainty, or non-determinism, into the test suite. Over time this will erode test quality and integrity.
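
A quick back-of-the-envelope calculation shows why this workaround hides problems rather than fixing them. The sketch assumes every run fails independently with the same probability, which real flaky tests only approximate; the function names are ours.

```typescript
// Probability that at least one of `maxAttempts` independent runs passes,
// given that a single run fails with probability `failureRate`.
function passProbabilityWithRetries(failureRate: number, maxAttempts: number): number {
  return 1 - Math.pow(failureRate, maxAttempts);
}

// Expected number of runs until the first pass when retries are unlimited
// (the mean of a geometric distribution).
function expectedRunsUntilPass(failureRate: number): number {
  return 1 / (1 - failureRate);
}

const singleRun = passProbabilityWithRetries(0.5, 1); // 0.5
const tenRuns = passProbabilityWithRetries(0.5, 10); // 1 - 0.5^10, about 0.999
const averageRuns = expectedRunsUntilPass(0.5); // 2
```

Under these assumptions the suite reports green about 99.9% of the time, whether the 50% failure rate comes from a timing quirk in the test or from a genuine race condition in the product.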

The problem with flaky tests is that they push us down the path of introducing uncertainty. And with uncertainty around our test results, what's the point in testing at all?

Waiting for XHR requests to come back

Calling endpoints and waiting for responses in our UI tests can be problematic with many UI testing frameworks. While many have the ability to use static waits and sleeps, these solutions are themselves a common cause of flakiness in tests: since we can't anticipate how long we will have to wait in every async scenario, a wait that is too short produces a false positive, while one that is too long slows down every run.
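
One common alternative to a fixed sleep is to poll for the condition the test actually cares about and give up only after a generous deadline. The helper below is a framework-agnostic sketch; `waitUntil`, its parameters, and the default values are assumptions for illustration, not the API of any of the tools discussed here.

```typescript
// Resolve after roughly `ms` milliseconds.
function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: poll `condition` until it returns true
// or until `timeoutMs` has elapsed, whichever comes first.
async function waitUntil(
  condition: () => boolean,
  timeoutMs: number = 5000,
  intervalMs: number = 50
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (!condition()) {
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
}

// Simulated response that arrives after 120 ms, a delay the test
// does not know in advance.
let responseArrived = false;
setTimeout(() => {
  responseArrived = true;
}, 120);

// The wait ends shortly after the response arrives instead of after a guessed delay.
const waiting: Promise<void> = waitUntil(() => responseArrived, 2000, 10);
```

The deadline still exists, but it only bounds the worst case: a fast response finishes the wait early, and a slow one is tolerated up to the limit.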

In the case of dashlane.com, tests would need to play nice with XHR requests. The team consciously decided to take a look at some of the newer frameworks like TestCafe, Puppeteer and Cypress that have much more control over handling GET/POST requests. In step three, we'll take a closer look at some of the pros and cons of those three testing frameworks.

Difficult to configure frameworks

If updating or expanding the functionality of our current UI test framework feels like setting up a Rube Goldberg machine, then it’s probably time to move on. With Dashlane’s previous implementation, we constantly had to update and configure the webdrivers for each browser we supported, and this was often the cause of CI pipeline failures.

Selenium used to be a good solution... but the project was started in 2004. Since then, many other projects have sought alternative solutions to the issue of creating reliable automation drivers, most notably Puppeteer and Cypress. These new solutions have simplified the process of setting up a new project, which sometimes means the initial setup can be as easy as writing a couple of lines of code! Some even have the ability to expand functionality through the use of plugins, so practically no coding is required. For our purposes, plugins integrating with visual regression tools like Percy and Applitools were a big plus.

Unclear and out-of-date docs

The importance of clear, navigable documentation can’t be overstated. Certainly it is possible to go and look at the source code to understand how a project works, but glancing through well-written docs is much more efficient. Think reading MDN docs vs. trying to figure out the API of this cool-but-undocumented library off npm.

Subpar documentation can be an indicator of poor quality in a project. If the developers didn't take the time to document their code, can you trust them to have taken more care while writing it? This lack of rigor can end up costing you valuable time. Imagine building on what you think is a valid feature of the tool, only to later discover in a Stack Overflow post that the feature has actually been deprecated. So frustrating!

Tests written in a different language than the product you’re testing

The team at Dashlane used to run tests written in Ruby to test the website, which was written in JS. This wasn’t such a huge problem when the team was small and able to quickly adapt and learn, but not having tests written in JS certainly wasn’t a plus. Having the UI tests live inside the codebase they are meant to test is convenient, but it adds complexity and maintenance overhead when the tests are in another language. The tests being written in Ruby was also one more barrier to new engineers contributing efficiently.

Tests take forever to run in your CI/CD pipeline

Typically we want UI tests to run at some point before or during our pipeline so we can validate that our changes don’t introduce regressions and that what was coded functions as expected. The faster the tests run, the less time developers have to wait, so we improve productivity by eliminating idle time. In the benchmark tests Dashlane ran between our top three choices, all performed significantly faster than Selenium-based solutions. Puppeteer came out the fastest with Cypress a close second:

Example Test Action (Avg. of 5 runs)	Cypress	Puppeteer	TestCafe
Navigate to homepage and check the logo	1112 ms	1020 ms	1774 ms
Navigate to features and verify sub navigations	1203 ms	710 ms	2373 ms
Navigate to paid plans and validate the free plan	396 ms	330 ms	1303 ms
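
To compare the three frameworks at a glance, the per-action averages in the table can be summed. The snippet below simply re-enters the table's numbers; the variable and function names are ours.

```typescript
// Average durations in milliseconds, copied from the table above.
// Order of entries: homepage logo, features sub navigations, paid plans.
const benchmarkMs: Record<string, number[]> = {
  Cypress: [1112, 1203, 396],
  Puppeteer: [1020, 710, 330],
  TestCafe: [1774, 2373, 1303],
};

// Sum the per-action averages for one framework.
function totalMs(durations: number[]): number {
  return durations.reduce((sum, ms) => sum + ms, 0);
}

// Framework names sorted from fastest to slowest total.
const ranking: string[] = Object.keys(benchmarkMs).sort(
  (a, b) => totalMs(benchmarkMs[a]) - totalMs(benchmarkMs[b])
);
// ranking: ["Puppeteer", "Cypress", "TestCafe"]
// totals:  Puppeteer 2060 ms, Cypress 2711 ms, TestCafe 5450 ms
```

Across these three actions Puppeteer totals about 2.1 seconds, Cypress about 2.7 seconds, and TestCafe about 5.5 seconds.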

2. Decide which features are must-haves (and which are nice-to-haves)

Once we determine which problems to avoid, we should decide which features are most important for our use cases. Many times these features will simply be solutions to the problems identified within our current tools.

3. Find a few frameworks to compare

Once it becomes clear which features we need most, the next step is to go out into the wild and hunt down a few frameworks that look like they can satisfy our criteria. We’re not looking for perfection, but rather a list of contenders to compare. If a framework seems promising enough, it’s not a bad idea to create a small proof of concept that can be shown to peers to gather feedback. For our comparison at Dashlane, we selected the following:

TestCafe (https://devexpress.github.io/testcafe)

Excellent browser support

Decent docs

Smaller community

Less familiar syntax

Puppeteer (https://pptr.dev/)

Fast test executions

Good docs

Very large community

Chrome extension support

No browser support outside of Chrome

Cypress (https://www.cypress.io/)

Tests run inside the browser using native APIs

Fast test executions

Excellent docs

Excellent debugging tools

Easily tie tests to endpoint responses

Uses familiar Mocha and Chai syntax

Larger community

Good browser support

No Chrome extension support

4. Choose the framework that aligns best with your criteria

After figuring out what issues we simply can’t live with any longer and which features we absolutely need, let’s decide which framework comes closest to satisfying those needs. For the team at Dashlane, the choice was not easy because we were very excited by more than just one framework.
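
One lightweight way to make this decision explicit is to write the criteria down as data and filter the contenders against them. In the sketch below the feature flags are transcribed from statements earlier in this article, while the choice of must-haves and nice-to-haves is a hypothetical example rather than Dashlane's actual checklist. A feature the article does not mention for a framework is left out and treated as unmet.

```typescript
type FeatureMap = Record<string, boolean>;

// Flags transcribed from this article; unmentioned features are omitted.
const frameworks: Record<string, FeatureMap> = {
  TestCafe: { crossBrowser: true, networkControl: true, familiarSyntax: false },
  Puppeteer: { crossBrowser: false, networkControl: true, chromeExtensions: true },
  Cypress: {
    crossBrowser: true,
    networkControl: true,
    familiarSyntax: true,
    chromeExtensions: false,
  },
};

// Hypothetical criteria: adjust these to your own pain points.
const mustHaves: string[] = ["crossBrowser", "networkControl"];
const niceToHaves: string[] = ["familiarSyntax", "chromeExtensions"];

// A feature counts as met only when it is explicitly marked true.
function countMet(features: FeatureMap, wanted: string[]): number {
  return wanted.filter((key) => features[key] === true).length;
}

// Keep frameworks that meet every must-have, then sort by nice-to-haves met.
function shortlist(): string[] {
  return Object.keys(frameworks)
    .filter((fw) => countMet(frameworks[fw], mustHaves) === mustHaves.length)
    .sort(
      (a, b) => countMet(frameworks[b], niceToHaves) - countMet(frameworks[a], niceToHaves)
    );
}

const candidates: string[] = shortlist();
```

With these example criteria the shortlist is Cypress followed by TestCafe, and Puppeteer drops out on cross-browser support. Different must-haves, such as Chrome extension testing, would change the outcome.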

For dashlane.com, our team decided to use Cypress as our new UI testing tool.

Cypress checks off all the must-have boxes for us. But there are also a bunch of useful features that only Cypress has, such as:

Automatic waiting

Network traffic control

Spies, Stubs, Clocks

Easy setup compared to TestCafe and Puppeteer

Super easy debugging

Time travel (not the sci-fi type but snapshots that you can see by hovering over the command in the test runner)

That’s not to say that Cypress is a perfect fit. While moving over from our hodge-podge mix of Watir, Cucumber and Selenium to Cypress has greatly increased developer happiness and overall UI reliability, we have had some small bumps along the road:

Difficulty testing iframes (there is now a plugin for this)

Limited browser support (this has greatly improved over the past year, including Firefox and Edge support)

Chrome extension testing (because of this we have to use a different tool for our product)

Inability to test more than one tab at a time

Despite its few shortcomings, our team is very excited at the possibilities that Cypress brings to our quest to improve the overall quality of our applications at Dashlane. Hopefully this blog post will assist you in evaluating new UI testing solutions as well!
