HTML <canvas> testing with Selenium and OpenCV

🐞Maciej Kusz

Test Automation Architect 🐞 Lecturer 🐞 Blogger

Published Nov 9, 2017

Since HTML <canvas> is becoming more and more popular for creating interactive content on web pages, such as games (especially now that Adobe Flash is dying), testing it with pure Selenium has become a big problem. If you have never seen <canvas>, you may be wondering why. Mostly because <canvas> (like the old Flash element) appears in the DOM as an element without any content, even if there is a complex game inside, e.g.:

<canvas id="myCanvas" width="200" height="100"></canvas>

Using just Selenium, you will only be able to locate the <canvas> element and get its position, size and some state, like isElementVisible, etc., but you will not be able to see what's inside or test its internal behavior.

What can we do to test HTML <canvas>?

Since the <canvas> element is a container for graphics (with additional logic written in JavaScript), we can try to perform manual mouse actions using Selenium Action Chains. It offers a few useful actions, like:

move_to_element_with_offset
click

By combining only those 2 actions you will be able to click any button inside the <canvas> element. But you will face 2 big problems:

What are the (x, y) coordinates of the center of the button to be clicked?
What is the current state of the game?

Get (x, y) coordinates of the button center

We can approach this problem from 2 different directions:

Prepare static (x, y) coordinates of the button center inside the <canvas> element and use move_to_element_with_offset from Selenium Action Chains
Get button center dynamically
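For the first option, a minimal sketch of keeping static coordinates could look like the one below. The button names and bounding boxes are hypothetical values that would be measured by hand in a graphics editor; they do not come from any real game.

```python
# Hypothetical bounding boxes (left, top, width, height) in pixels,
# measured by hand relative to the top-left corner of the <canvas>.
STATIC_BUTTONS = {
    "play": (40, 20, 120, 60),
    "mute": (170, 5, 24, 24),
}


def static_center(name):
    """Return the (x, y) center of a hand-measured button."""
    left, top, width, height = STATIC_BUTTONS[name]
    return left + width // 2, top + height // 2


print(static_center("play"))  # prints: (100, 50)
```

The resulting pair could then be passed to move_to_element_with_offset, keeping in mind that the origin of that offset (top-left corner or center of the element) depends on the Selenium version in use.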

Point 1 is quite easy to prepare using any graphics editing tool, so we will not talk about it (taking the easy path is not the way we do things at XCaliber, especially when the path is short and ends with a cliff). The reason is simple: we will need to implement the dynamic method anyway if we want to know the state of the game.

So how can we obtain button center coordinates dynamically?

We "just" need to "see" what's happening inside the <canvas> element. You may think "easier said than done", but you will see that it's not that hard.

The best approach to the "just see" problem is to use computer vision. Since Python has very good bindings for the widely used OpenCV library, we can use it to solve this problem. In short, OpenCV is an image processing library that will allow us to see what's happening inside the <canvas> element.
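To make "seeing" less of a black box, here is a plain-Python sketch of what correlation-based template matching computes: a small template is slid over a larger image, and every position gets a score between -1 and 1. This is not OpenCV's implementation, only an illustration of the idea behind the TM_CCOEFF_NORMED mode used by the locator in this article; the function names and the tiny grayscale grids are made up for the example.

```python
import math


def ccoeff_normed(window, template):
    """Correlation coefficient between two equally sized pixel lists."""
    mean_w = sum(window) / len(window)
    mean_t = sum(template) / len(template)
    dw = [p - mean_w for p in window]
    dt = [p - mean_t for p in template]
    denom = math.sqrt(sum(a * a for a in dw) * sum(b * b for b in dt))
    # A flat window or template has no structure to correlate with.
    return sum(a * b for a, b in zip(dw, dt)) / denom if denom else 0.0


def match_template(image, template):
    """Slide template over image; return (best_score, (x, y))."""
    t_h, t_w = len(template), len(template[0])
    flat_t = [p for row in template for p in row]
    best = (-2.0, (0, 0))
    for y in range(len(image) - t_h + 1):
        for x in range(len(image[0]) - t_w + 1):
            window = [image[y + j][x + i]
                      for j in range(t_h) for i in range(t_w)]
            score = ccoeff_normed(window, flat_t)
            if score > best[0]:
                best = (score, (x, y))
    return best


# Tiny grayscale "screenshot" with a 2x2 pattern at x=2, y=1.
screenshot = [
    [10, 10, 10, 10, 10],
    [10, 10, 200, 50, 10],
    [10, 10, 50, 200, 10],
    [10, 10, 10, 10, 10],
]
button = [
    [200, 50],
    [50, 200],
]
score, (x, y) = match_template(screenshot, button)
print(round(score, 2), (x, y))  # prints: 1.0 (2, 1)
```

The position with the highest score is the top-left corner of the best match, which is the same information the locator reads from OpenCV's result.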

In my previous article about Page Object Pattern, I described how to prepare an object for an XPath element locator. Let's use the same approach for a graphical locator.

Graphical locator

import cv2
import numpy
from io import BytesIO
from PIL import Image


class GraphicalLocator(object):

    def __init__(self, img_path):
        self.locator = img_path
        # x, y position in pixels counting from left, top corner
        self.x = None
        self.y = None
        self.img = cv2.imread(img_path)
        self.height = self.img.shape[0]
        self.width = self.img.shape[1]
        self.threshold = None

    @property
    def center_x(self):
        # Compare with None so that a match at x == 0 still has a center
        return self.x + int(self.width / 2) \
            if self.x is not None and self.width else None

    @property
    def center_y(self):
        return self.y + int(self.height / 2) \
            if self.y is not None and self.height else None

    def find_me(self, drv):
        # Clear last found coordinates
        self.x = self.y = None
        # Get current screenshot of a web page
        scr = drv.get_screenshot_as_png()
        # Open the PNG bytes as an image, dropping any alpha channel
        scr = Image.open(BytesIO(scr)).convert('RGB')
        # Convert to format accepted by OpenCV
        scr = numpy.asarray(scr, dtype=numpy.uint8)
        # Convert image from RGB (PIL) to BGR (OpenCV) channel order
        scr = cv2.cvtColor(scr, cv2.COLOR_RGB2BGR)
        # Match on gray scale versions of both images
        img_match = cv2.minMaxLoc(
            cv2.matchTemplate(cv2.cvtColor(scr, cv2.COLOR_BGR2GRAY),
                              cv2.cvtColor(self.img, cv2.COLOR_BGR2GRAY),
                              cv2.TM_CCOEFF_NORMED))
        # Calculate position of found element
        self.x = img_match[3][0]
        self.y = img_match[3][1]
        # From full screenshot crop part that matches template image
        scr_crop = scr[self.y:(self.y + self.height),
                       self.x:(self.x + self.width)]
        # Calculate colors histogram of both template
        # and matching images and compare them
        scr_hist = cv2.calcHist([scr_crop], [0, 1, 2], None,
                                [8, 8, 8], [0, 256, 0, 256, 0, 256])
        img_hist = cv2.calcHist([self.img], [0, 1, 2], None,
                                [8, 8, 8], [0, 256, 0, 256, 0, 256])
        comp_hist = cv2.compareHist(img_hist, scr_hist, cv2.HISTCMP_CORREL)
        # Save threshold matches of: graphical image and image histogram
        self.threshold = {'shape': round(img_match[1], 2),
                          'histogram': round(comp_hist, 2)}
        # Return image with a rectangle around the match
        # ((0, 0, 255) is red in OpenCV's BGR order)
        return cv2.rectangle(scr, (self.x, self.y),
                             (self.x + self.width, self.y + self.height),
                             (0, 0, 255), 2)

The code above should be self-explanatory. The main reason for this design is to provide the same interface as the XPath element locator. With it in place, you can do something like this to test whether a given image is present on a web page:

img_check = GraphicalLocator("/path/to/image.png")
img_check.find_me(webdriver_instance)

The problem with the above code is that it can give you false positive matches.

How to defend against false positive matches?

Take a look at the GraphicalLocator object and its threshold attribute. It contains 2 values:

The threshold for the image shape match tells us how similar both images are (the one you are looking for and the one found). If the value equals 1, the image shapes are identical.
The threshold for the image color histogram match tells us how similar the colors of both images are. If the value equals 1, the color histograms of the images are identical.
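As a rough illustration of the second value, the sketch below builds an 8 x 8 x 8 color histogram in plain Python and compares two histograms with a correlation coefficient, which is the kind of measure the HISTCMP_CORREL comparison in the locator is based on. It is not OpenCV code; the helper names and the two made-up "buttons" (lists of color triples) are assumptions for the example.

```python
import math


def color_histogram(pixels, bins=8):
    """Count pixels in a bins x bins x bins color cube, flattened to a list."""
    hist = [0] * (bins ** 3)
    step = 256 // bins
    for c0, c1, c2 in pixels:
        index = (c0 // step) * bins * bins + (c1 // step) * bins + (c2 // step)
        hist[index] += 1
    return hist


def correlation(hist_a, hist_b):
    """Correlation coefficient between two histograms of equal length."""
    mean_a = sum(hist_a) / len(hist_a)
    mean_b = sum(hist_b) / len(hist_b)
    num = sum((a - mean_a) * (b - mean_b) for a, b in zip(hist_a, hist_b))
    den = math.sqrt(sum((a - mean_a) ** 2 for a in hist_a)
                    * sum((b - mean_b) ** 2 for b in hist_b))
    return num / den if den else 0.0


# Made-up buttons: 90 background pixels plus 10 white label pixels each.
green_button = [(30, 160, 40)] * 90 + [(255, 255, 255)] * 10
grey_button = [(120, 120, 120)] * 90 + [(255, 255, 255)] * 10

same = correlation(color_histogram(green_button),
                   color_histogram(green_button))
different = correlation(color_histogram(green_button),
                        color_histogram(grey_button))
print(round(same, 2), round(different, 2))
```

Identical colors give a score of 1, while the green and grey variants score close to 0 even though their layout is the same.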

Why do we need those 2 thresholds? Take a look at pictures below:

Both images present the same button, but in 2 different states (enabled and disabled). When you try to find the first image while the second one is present, the shape threshold will still be close to 1. This happens because the locator runs OpenCV image matching on grayscale versions of the images, and in grayscale both images have the same shape. Because of that, when you want to be sure that the image you are looking for is the image you can see, you need to check not only that the shape is identical but also that the colors of the images are the same. That is why a color histogram threshold is also calculated during image finding. So the code for checking whether an image is present should look like this:

img_check = GraphicalLocator("/path/to/img.png")
img_check.find_me(webdriver_instance)
is_found = img_check.threshold['shape'] >= 0.8 and \
    img_check.threshold['histogram'] >= 0.4
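Why the shape score alone cannot tell two button states apart can be seen in a few lines of plain Python. The sketch below uses the usual luma weights for RGB to grayscale conversion (0.299, 0.587, 0.114); the two colors are hypothetical and were picked so that they collapse to the same gray value.

```python
def to_gray(rgb):
    """Luma-weighted grayscale value of an (R, G, B) pixel, as an int."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


enabled_green = (60, 130, 50)     # hypothetical "enabled" button color
disabled_grey = (100, 100, 100)   # hypothetical "disabled" button color

print(to_gray(enabled_green), to_gray(disabled_grey))  # prints: 100 100
```

Once both buttons are reduced to the same gray value, a grayscale match has nothing left to distinguish them, which is exactly the gap the histogram check fills.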

The threshold values to compare against should be chosen through experimentation (these work for me).
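One possible way to run such an experiment, sketched under the assumption that you have collected scores from screenshots you checked by hand, is to take the midpoint between the weakest true match and the strongest false one. The function name and all the numbers below are made up for illustration.

```python
def pick_threshold(positive_scores, negative_scores):
    """Midpoint between the weakest true match and the strongest false one."""
    lowest_hit = min(positive_scores)
    highest_miss = max(negative_scores)
    if lowest_hit <= highest_miss:
        raise ValueError("scores overlap; this score cannot separate the cases")
    return round((lowest_hit + highest_miss) / 2, 2)


# Made-up histogram scores from screenshots that were checked by hand.
button_visible = [0.93, 0.90, 0.97]
button_disabled = [0.12, 0.20, 0.08]
histogram_threshold = pick_threshold(button_visible, button_disabled)
print(histogram_threshold)
```

If the two groups of scores overlap, no single cut-off on that score will work, and the other score (shape or histogram) has to do the separating.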

How to click?

Now the best part. Just take a look at this code snippet:

from selenium.webdriver.common.action_chains import ActionChains

img_check = GraphicalLocator("/path/to/img.png")
img_check.find_me(webdriver_instance)
is_found = img_check.threshold['shape'] >= 0.8 and \
    img_check.threshold['histogram'] >= 0.4

if is_found:
    action = ActionChains(webdriver_instance)
    # move_by_offset is relative to the current pointer position,
    # so this assumes the pointer starts at the top-left corner
    action.move_by_offset(img_check.center_x, img_check.center_y)
    action.click()
    # Move back so that the next click starts from the same place
    action.move_by_offset(-img_check.center_x, -img_check.center_y)
    action.perform()
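The snippet above takes for granted that screenshot pixels and pointer offsets use the same units. On high-density displays a screenshot can be larger than the viewport measured in CSS pixels, which may be one cause of out-of-bounds moves. A minimal sketch of the conversion follows; the helper name is hypothetical, and it assumes the ratio is read from the page, for example with webdriver_instance.execute_script("return window.devicePixelRatio").

```python
def to_css_pixels(x, y, device_pixel_ratio):
    """Convert screenshot pixel coordinates to CSS pixel coordinates."""
    return round(x / device_pixel_ratio), round(y / device_pixel_ratio)


# With a ratio of 2 the screenshot is twice as large as the viewport,
# so a match centered at (640, 360) sits at (320, 180) in CSS pixels.
print(to_css_pixels(640, 360, 2))  # prints: (320, 180)
```

With a ratio of 1 the coordinates pass through unchanged, so the conversion is harmless on standard displays.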

Conclusion

As you can see, it's not so hard to check whether an image is visible in the <canvas> element and click on it. Extending this approach will allow you to check the current state of the game, because state checking will be based on the visibility or invisibility of some elements.

PS. It also works with Flash elements ;)

23 Comments

Mirko Brankovic

Voip/WebRTC enthusiast

What did you use as webdriver_instance? The canvas found by id, or the webdriver instance? On the canvas I get an error that a WebElement can't take a screenshot, and on the webdriver I sometimes get an out-of-bounds position ;/

Mirko Brankovic

Voip/WebRTC enthusiast

have you encounterd this error, maybe a python version problem File "graphical_locator.py", line 18 @propertydef center_x(self):return self.x + int(self.width / 2) \ ^ SyntaxError: invalid syntax Pointing on first letter of center_x Thanks

Somia Sharma

Tech Lead(QA)

Hi can u plzz explain how we test canvas using java

David Sun

Software Developer at Capital One

In what way is this different from/superior than using Sikuli?

Oleksii Shcherbin

Senior Software Engineer

Hello. I have just used your code, and it's working. But! When i want, after clicking on first element, find another element and click on it, it does not do it. Code find element, find coordinates, calculate shape and histogram, click on it, but i don't see clicking process. When i do these steps only for first element, or only for second element - it works. But when i want to do several steps consistently - it does not work. Maybe you know why, please tell me.


More articles by this author


Ask Me Anything

Apr 18, 2019

Page Object Pattern?

Jul 6, 2017

Why do we need middle layer in test framework?

Apr 10, 2017

How to build test automation framework from scratch?

Feb 3, 2017
