The dark side of UI test automation

In part one, I talked about handling the software development aspect of test automation and how to take UI test automation seriously. In this part, I continue with my view on how to handle problems with the environment, since it is a big part of test automation, followed by a short take on test automation tools.

Save the environment

The initial idea for this blog came to me after a very exhausting and frustrating week. At that time, I had been working on a client project for about four months. The refactoring of the existing solution had been completed (as far as refactoring can ever be called completed, that is). The requested regression tests were implemented, and I had regular requirements meetings with the responsible product owners. Tests for new features were implemented with about half a sprint offset.

The situation:

As part of my usual routine, I checked my test reports on Monday morning of that week, and all the tests failed.

There was no need to panic; I knew that every Sunday evening, the client updates and reboots the systems in batches. After a system reboot, the Selenium Grid must be restarted as well. There were scheduled tasks to take care of that, but who knew what might have gone wrong? After rebooting the nodes, I started the test automation job manually. A short breakfast break later, I realized that all the tests had failed again.

I checked the test automation framework locally on my workstation, and it worked. But then, why were the tests in my reports failing?

In the next paragraphs, I will explain in more detail the ‘but’ that, of course, must follow. For this, I will zoom into the technical details of the setup. Readers who are not very familiar with Selenium and web application testing will have to forgive me this technical excursion; they can skip ahead to the ‘Conclusion of the situation’.

It took me the entire morning to understand that Chrome had updated itself automatically on two of the three nodes. It took me half of the afternoon to find out that the ChromeDriver version in use had a bug in combination with the updated version of Chrome: it could not handle Chrome’s new security settings.
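A pre-flight check can at least make this kind of silent browser update visible before a whole test run fails. The following is a minimal sketch, assuming the browser and driver version strings are obtained elsewhere (for example, from what a grid node reports). The class name and the version numbers are hypothetical; the article does not name the versions involved, and a matching major version would not catch every incompatibility.

```java
// Pre-flight check: compare the major versions of Chrome and ChromeDriver.
// Assumption: both version strings are collected elsewhere; this class
// only parses and compares them.
final class DriverCompatibility {

    // Returns the leading numeric component, e.g. 81 for "81.0.4044.138".
    static int majorVersion(String version) {
        String trimmed = version.trim();
        int dot = trimmed.indexOf('.');
        String major = dot < 0 ? trimmed : trimmed.substring(0, dot);
        return Integer.parseInt(major);
    }

    // True when browser and driver share the same major version.
    static boolean sameMajor(String browserVersion, String driverVersion) {
        return majorVersion(browserVersion) == majorVersion(driverVersion);
    }

    public static void main(String[] args) {
        // Hypothetical versions: the browser auto-updated, the driver did not.
        String chrome = "81.0.4044.138";
        String chromeDriver = "80.0.3987.106";
        if (!sameMajor(chrome, chromeDriver)) {
            System.out.println("Version mismatch: Chrome " + chrome
                    + " vs. ChromeDriver " + chromeDriver);
        }
    }
}
```

Run before the actual suite, a check like this turns "all tests failed" into a single, readable message about the node that drifted.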

The bug fix for Selenium/ChromeDriver was released on Wednesday and should have worked. It should have, but it did not.

It took me another day to figure out that the client whitelists Chrome extensions, and that for every new Chrome instance the ChromeDriver unpacks and installs a new automation extension.

This is, of course, not an official extension from the Chrome Web Store. For security reasons, the client cannot put this extension on the whitelist. This meant the extension had to be entered manually into the Chrome whitelist in the registry.
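To make the manual registry entry concrete, here is a small sketch that generates a `.reg` fragment for Chrome's `ExtensionInstallWhitelist` policy. It assumes the policy is set machine-wide under `HKEY_LOCAL_MACHINE`; the class name is hypothetical, and the extension ID is a placeholder because the article does not state the real one.

```java
// Builds a Windows .reg fragment that adds one extension ID to Chrome's
// ExtensionInstallWhitelist policy. The extension ID passed in is a
// placeholder; the article does not state the real one.
final class ChromeWhitelistReg {

    // Assumption: the policy is applied machine-wide (HKLM), not per user.
    static final String POLICY_KEY =
            "HKEY_LOCAL_MACHINE\\SOFTWARE\\Policies\\Google\\Chrome\\ExtensionInstallWhitelist";

    // slot is the numbered value name under the policy key ("1", "2", ...).
    static String regFragment(int slot, String extensionId) {
        return "Windows Registry Editor Version 5.00\r\n\r\n"
                + "[" + POLICY_KEY + "]\r\n"
                + "\"" + slot + "\"=\"" + extensionId + "\"\r\n";
    }

    public static void main(String[] args) {
        // Placeholder ID for illustration only.
        System.out.print(regFragment(1, "<automation-extension-id>"));
    }
}
```

Generating the fragment instead of typing it by hand keeps the entry reproducible when a node is rebuilt, which matters in an environment that reboots and updates itself every week.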

The Monday itself, or even the whole week, would not have been too bad, because the test automation runs on more than one browser. In this case, the second one was Firefox.

But, as they say, a bug rarely comes alone. The Firefox tests failed on Monday as well (I noticed this only on Tuesday, as I was too busy with the Chrome bug). Firefox had an automatic update as well. This time, it was not the extensions that caused problems, but the certificate management.

Short explanation: In the client’s environment, an enterprise root certificate is needed to connect outside the internal network. The certificate is stored in the Windows certificate store. Firefox manages its own certificates and does not access the Windows certificate store. Therefore, the enterprise root certificate must be manually imported into each profile. Every instance started by the GeckoDriver creates a new Firefox profile. The GeckoDriver has a function that allows passing on the certificate, but because of a bug at that time, it did not work in a Selenium Grid environment.
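One alternative worth testing in such a setup is Firefox's `security.enterprise_roots.enabled` preference, which asks Firefox to also trust root certificates from the operating system's store. The sketch below only builds the capabilities structure with plain Java maps; with Selenium, the same preference would be set through `FirefoxOptions.addPreference`. Whether this would have worked in the client's grid is an assumption, since the article does not say it was tried, and the class name is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Builds the "moz:firefoxOptions" capability with a preference that asks
// Firefox to trust root certificates from the operating system store.
// Assumption: the grid passes these capabilities on to the GeckoDriver.
final class FirefoxEnterpriseRoots {

    static Map<String, Object> capabilities() {
        // Preferences written into the fresh profile of each session.
        Map<String, Object> prefs = new LinkedHashMap<>();
        prefs.put("security.enterprise_roots.enabled", true);

        Map<String, Object> firefoxOptions = new LinkedHashMap<>();
        firefoxOptions.put("prefs", prefs);

        Map<String, Object> caps = new LinkedHashMap<>();
        caps.put("browserName", "firefox");
        caps.put("moz:firefoxOptions", firefoxOptions);
        return caps;
    }

    public static void main(String[] args) {
        System.out.println(capabilities());
    }
}
```

Because the preference travels with every new session, it does not depend on a manually prepared profile on each node.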

Conclusion of the situation:

Given the complex environment and the multiple departments that had to be consulted to gather all the information needed to identify and solve the problems, the test automation was completely down for almost a week.

How to handle the situation:

DON’T PANIC

Complicated environments with complex roles and rights management that have grown historically, as well as carelessly cloned test environments that are usually virtualized with far too little memory, are unfortunately very common. There are always data or even version differences between test, staging, and production environments that we cannot influence.

I call this the external influence. Even if the test automation is supposedly set up correctly, some things are out of our control.

In such cases, there isn’t much you can do. Just talk to the appropriate people, point out existing problems regularly, and keep calm. Some solutions will be found, somehow.

The best way to deal with such problems is to adopt DevOps principles like highly automated infrastructure (Buzzword: IaC), and build pipelines that deploy environments the moment they are needed.
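Short of fully automated infrastructure, even a pinned version manifest that is checked against every node helps to spot drift early. This is a minimal sketch under the assumption that each node can report its component versions as simple key/value pairs; the class name, node name, and version numbers are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Compares what a node reports against a pinned manifest of versions.
final class EnvironmentDrift {

    // Returns one message per pinned component that differs or is missing.
    static List<String> findDrift(String node,
                                  Map<String, String> pinned,
                                  Map<String, String> reported) {
        List<String> drift = new ArrayList<>();
        for (Map.Entry<String, String> entry : pinned.entrySet()) {
            String actual = reported.get(entry.getKey());
            if (!entry.getValue().equals(actual)) {
                drift.add(node + ": " + entry.getKey() + " expected "
                        + entry.getValue() + " but found " + actual);
            }
        }
        return drift;
    }

    public static void main(String[] args) {
        // Hypothetical manifest and node report.
        Map<String, String> pinned = new LinkedHashMap<>();
        pinned.put("chrome", "80");
        pinned.put("firefox", "75");

        Map<String, String> reported = new LinkedHashMap<>();
        reported.put("chrome", "81");
        reported.put("firefox", "75");

        for (String message : findDrift("node-1", pinned, reported)) {
            System.out.println(message);
        }
    }
}
```

In an infrastructure-as-code setup, the same manifest would be the input for building the nodes, so the check and the environment cannot diverge.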

Tools, Tools, Tools

Everyone loves using tools and integrating them into the system environment. There are a ridiculous number of tools for test automation out there. In my opinion, there are two major problems:

We, as experts, face the paradox of choice.
The wrong people, or let’s say, people with no expertise, decide what tools to use.

The situation:

I was between projects at the time and had some downtime. A colleague asked for my help with a test automation situation. The next Monday, I met him at the client’s office, and he showed me the system under test (SUT), a heavily customized web application, along with the test automation ‘solution’.

Although I don’t mean to discredit any tools, I noticed that the client used a tool from an Austrian vendor, which is globally known and very expensive. There had been no prior analysis or acquisition phase in which they checked whether the SUT was testable with that tool.

Spoiler: It was not.

Why did the client decide to use the tool? Simple: they already had a license, which was seldom used. They had pieced together three test cases until then (no, the tool didn’t offer a way to write code or scripts). In theory, the modules were reusable, and data could be passed dynamically. So far, so good – or so I thought.

I had ten days before the start of my new project, and the goal was to implement the ten most crucial use cases in a data-driven and reusable way. Three done, seven to go - easy.

So, what do you do before you start implementing anything? You check what you are working with. I started the test run and observed what happened.

Nothing.

After some analysis, I realized that the test runner needed an extension to find Chrome and any elements within the website under test.

I installed the extension and started the test again.

And nothing.

I inspected the error message: “couldn’t find element exception.” Ok, easy to fix. I inspected the element, which was a username input field and even had an ID. I checked the element in the tool: the same ID. Weird! I checked the test case that uses the element, and the reference seemed to be correct. I wasted half an hour before I thought about rescanning the username input field of the system under test and using that rescan to create a new element. After that, it worked, at least for that element.

In the first three days, I came across so many weird problems:

Elements in the web application couldn’t be addressed from one test run to the next, for no reason.
The same test case with different data sets passed one time and failed the second time, for no reason.
I wasted one full day connecting the test runner with Chrome. It had stopped working overnight. I went through all the suggested points in their documentation, around eight of them, and it still didn’t work. On my last try to fix it before giving up, it suddenly worked.

At the end of day four, I still had nothing other than a very deep aversion to that tool.

On day five, my colleague and I spoke to the project owner. We told him that it would not work with the tool, and that if he wanted a usable result, we needed something robust that could be implemented quickly. He gave us the go-ahead, and I started to implement a code-based solution, using Java, JUnit as the test framework, and Selenium as the driver to access the SUT.

How to handle the situation:

Writing code-based test automation is no silver bullet. But you can do almost anything with a solid framework and a basic understanding of writing code. In this situation, I could implement the requirements in the remaining five days because at Nagarro, we have enough experience in QA and test automation. We have a proven approach and a well-implemented framework, which provides an engineer with everything needed to get started quickly.
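To illustrate what ‘data-driven and reusable’ can look like in code, here is a deliberately small sketch. It is not the framework mentioned above: the `Ui` interface, the element IDs other than the username field, and the in-memory `FakeUi` are hypothetical stand-ins so that the example runs without a browser. In a real project, `Ui` would be implemented on top of Selenium WebDriver and the loop would typically be a parameterized JUnit test.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal abstraction over the browser; a real implementation
// would delegate to Selenium WebDriver.
interface Ui {
    void type(String elementId, String text);
    void click(String elementId);
    String textOf(String elementId);
}

// Reusable step: the same code runs for every data row.
final class LoginStep {
    static boolean login(Ui ui, String user, String password) {
        ui.type("username", user);
        ui.type("password", password);
        ui.click("login");
        return "Welcome".equals(ui.textOf("status"));
    }
}

// In-memory stand-in for the application under test, so the sketch runs
// without a browser. It accepts exactly one hard-coded account.
final class FakeUi implements Ui {
    private final Map<String, String> fields = new HashMap<>();

    public void type(String elementId, String text) {
        fields.put(elementId, text);
    }

    public void click(String elementId) {
        boolean ok = "alice".equals(fields.get("username"))
                && "secret".equals(fields.get("password"));
        fields.put("status", ok ? "Welcome" : "Login failed");
    }

    public String textOf(String elementId) {
        return fields.getOrDefault(elementId, "");
    }
}

final class LoginDataDrivenDemo {
    public static void main(String[] args) {
        // Data rows: user name and password (made-up test data).
        String[][] rows = {
            {"alice", "secret"},
            {"alice", "wrong"},
        };
        for (String[] row : rows) {
            boolean loggedIn = LoginStep.login(new FakeUi(), row[0], row[1]);
            System.out.println(row[0] + " / " + row[1] + " -> " + loggedIn);
        }
    }
}
```

Keeping the step behind a small interface is a design choice: the test logic stays readable and versionable in Git, and the driver underneath can be replaced without touching the test cases.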

Before you consider getting the licenses of a tool, try to answer a couple of questions. Be critical and think about what your team/organization needs and how the tool will support you:

Do we need a tool with a monthly/yearly/user license fee?
Do we need a tool with a steep learning curve – which implies that you build a knowledge island?
- Is there another way to achieve what I need from the tool? Perhaps the knowledge to reach the goal already exists in my team/organization.
How robustly can the test cases be designed?
How hard is it to create reusable keywords?
How difficult is it to integrate it into build pipelines?
How tough is it to integrate it into the used ticket system?
Is versioning with standard tools like Git possible?

In my experience, no single tool covers the full breadth needed to handle the complexity of a system. It is fatal to build test automation around one tool that doesn’t support enough integration of other tools/libraries (inessential test automation tools). As you can imagine, I’m no fan of click-and-replay tools for various reasons.

But one reason stands out especially.

Clients tend to hold on to tools longer than is good for them. Sometimes, even after seeing proof that a certain tool is not suited to their major use case, they continue to use it.

Conclusion - test automation is highly specialized software development!

There is a lot to do. Everyone needs or wants (UI) test automation. However, only a few are aware of what it really means.

Today, more than ever, (UI) test automation is developing into a discipline of its own in software development. And it is exactly that: highly specialized software development. It is important to provide young beginners and ‘old pros’ with enough space, time, and knowledge to support their team and make the project a success.

Article 11 May 2020 7 min read

The dark side of UI test automation - Part 2

Thomas Goldberger
