12/25/2019 Appium Pro #101: AI for Appium--and Selenium!

Edition 101

AI for Appium--and Selenium!


Java All Platforms All Devices

Appium Pro is normally all about... well, Appium! And other mobile testing related topics.
However, in this post we're going to discuss an exciting development in AI in the world of
Selenium, Appium's web-based forebear. Read on--I think you'll get something out of this
even if you're focused purely on mobile testing.

For some time Appium has been experimenting with AI/ML approaches to augmenting mobile test
automation. In addition to its visual testing capabilities, there is also a special plugin for finding elements
using ML models (even when all you have is a screenshot). Part of what makes these features possible
with Appium is the fact that it is possible to write plugins for Appium that integrate with these various
other projects.

I've often wondered how we can do the same thing with Selenium. Unfortunately, Selenium's
architecture is not quite set up for third parties to write plugins that take advantage of behavior in the
Selenium server itself. That doesn't stop us from writing client-side "plugins" that have access to the
driver object, though!

AI-based element finding for Selenium

How do we create a client-side plugin for Selenium? Basically by putting together a library which takes
an existing Selenium session (a driver object) and uses it for its own purposes. In our case, this special
library will have access to the Test.ai classification model that already exists as part of the Test.ai +
Appium classifier plugin. This plugin was originally developed to give Appium users access to the
classification model via the -custom locator strategy. The advantage of this approach was precisely that
it was the Appium server being augmented--all the work could be done in one language (Node.js) and
made available to every client library with minimal modifications.

In the case of Selenium, the equivalent work would have needed to be done as an extension to each
client library. That was way too much work! So instead, we extended the capabilities of the existing
Appium classifier plugin, so that it could also act as a classification server. This approach (very much akin
to the client/server architecture of Selenium and Appium themselves) keeps the heavy lifting in one
place, and allows very thin clients to be written in every language. The only downside is that you have to
make sure to have the classifier server up and running.

The Classifier server

If you already have the test-ai-classifier package installed via NPM, no extra install steps are
necessary. Otherwise, run npm install -g test-ai-classifier. Then, running the server is quite simple:

test-ai-classifier

With no arguments, the server will start up on localhost, port 50051 (the port conventionally used by
gRPC-based services). Of course, you can always pass in -h and -p flags with custom host and port
information (using 0.0.0.0 for host if it's important to listen on all interfaces).
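
Since the clients depend on the server being up, it can help to check that before a test run starts. Here is a minimal sketch of such a check, assuming the default host and port mentioned above. The ClassifierServerCheck class and its isReachable method are hypothetical names made up for this illustration, not part of any classifier client. It only confirms that something accepts TCP connections on the port, not that the thing listening is actually the classifier:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper (not part of the classifier client libraries). */
final class ClassifierServerCheck {

    private ClassifierServerCheck() {}

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or the host could not be resolved
            return false;
        }
    }

    public static void main(String[] args) {
        // 127.0.0.1:50051 is the server default described above
        boolean up = isReachable("127.0.0.1", 50051, 1000);
        System.out.println(up
                ? "Something is listening on port 50051"
                : "Nothing is listening on port 50051; start test-ai-classifier first");
    }
}
```

A call like this could go at the top of a test's setup method, so that a missing server shows up as one clear message rather than a connection error deep inside a test.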

The Classifier client

Once you've got the server running, you need to decide which client to use. There are four available:

Java client

Python client

Node client

Ruby client

We'll use the Java client for our purposes. To get it included in your Gradle-based Java project, the easiest
thing to do is use Jitpack, and then to include a directive like the following, to get the client downloaded
from GitHub:

testImplementation group: 'com.github.testdotai', name: 'classifier-client-java', version: '1.0.0'

There are a few different ways to use the client, including the ability to pass image data to it directly,
outside of the context of Appium, Selenium, or anything else. Either way, the first thing we need to do is
instantiate the client:

classifier = new ClassifierClient("127.0.0.1", 50051);

The only parameters are the expected host and port values. Of most interest for us in terms of what we
can call on classifier here is the method findElementsMatchingLabel , which takes two parameters:
a driver object and a string representing the label for which we want to find matching elements. Have a
look at this example:

List<WebElement> els = classifier.findElementsMatchingLabel(driver, "twitter");

In this case, we're looking for any elements that look like a Twitter logo. Notice that the return value of
this method is exactly what you'd expect--a list of standard WebElement objects. You can click them, get
their attributes, and anything else you'd be able to do with a regular element.

How does all this magic work? Well, the Classifier client runs a special XPath query that attempts to find
any leaf node element, and then directs the browser to take a screenshot of each element, all on its own.
From these screenshots, the client has all the image data it needs to send over to the Classifier server,
which sends back information about the strength of any matches. The client can then map these results
to the elements it found via XPath, filter out any which don't match the requested label, and return the
rest to you!
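
To make the last step a bit more concrete, here is a rough sketch of what "map the results back to the elements and filter by label" could look like. This is not the client's actual code. The LabelMatchFilter and Match names, the index-based mapping, and the confidence threshold are all assumptions made for illustration, and plain strings stand in for the WebElement objects so the sketch runs without a browser:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; every name in here is hypothetical. */
final class LabelMatchFilter {

    /** One classification result: the candidate it refers to, the label, and the match strength. */
    static final class Match {
        final int elementIndex;
        final String label;
        final double confidence;

        Match(int elementIndex, String label, double confidence) {
            this.elementIndex = elementIndex;
            this.label = label;
            this.confidence = confidence;
        }
    }

    private LabelMatchFilter() {}

    /**
     * Returns the candidates whose result carries the requested label with at least
     * minConfidence. Assumes at most one result per candidate element.
     */
    static <E> List<E> elementsMatching(List<E> candidates, List<Match> results,
                                        String label, double minConfidence) {
        List<E> matching = new ArrayList<>();
        for (Match m : results) {
            boolean inRange = m.elementIndex >= 0 && m.elementIndex < candidates.size();
            if (inRange && m.label.equals(label) && m.confidence >= minConfidence) {
                matching.add(candidates.get(m.elementIndex));
            }
        }
        return matching;
    }

    public static void main(String[] args) {
        // Stand-ins for leaf elements, e.g. ones found with an XPath such as //*[not(*)]
        // (an assumption; the client's real query is not shown in this article)
        List<String> candidates = Arrays.asList("header-logo", "footer-icon-1", "footer-icon-2");
        List<Match> results = Arrays.asList(
                new Match(0, "cart", 0.10),
                new Match(1, "twitter", 0.93),
                new Match(2, "facebook", 0.88));
        System.out.println(elementsMatching(candidates, results, "twitter", 0.5));
    }
}
```

In the real client the candidates would be the WebElement objects found via XPath, which is why what you get back is an ordinary list of elements you can click.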

What this does mean is that any browser driver you use will need to support the "take element
screenshot" command. In my experimentation, only Chrome was reliable enough to not fail in weird ways
when asked to take screenshots of so many elements. This API is relatively new, so I expect we'll see
better reliability from Safari and Firefox (the only two other browsers I tried) soon enough. At any rate,
take a look at the full code sample below, which demonstrates how we can load up a webpage, find an
icon using only its semantic label, and then interact with it:

import java.net.MalformedURLException;
import java.net.URL;
import java.util.List;

import org.hamcrest.collection.IsCollectionWithSize;
import org.junit.After;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.remote.RemoteWebDriver;

import ai.test.classifier_client.ClassifierClient;

public class Edition101_AI_For_Selenium {

    private RemoteWebDriver driver;
    private ClassifierClient classifier;

    @Before
    public void setUp() throws MalformedURLException {
        driver = new RemoteWebDriver(new URL("http://localhost:4444/wd/hub"),
                new ChromeOptions());
        classifier = new ClassifierClient("127.0.0.1", 50051);
    }

    @After
    public void tearDown() throws InterruptedException {
        if (driver != null) {
            driver.quit();
        }
        if (classifier != null) {
            classifier.shutdown();
        }
    }

    @Test
    public void testClassifierClient() throws Exception {
        // navigate to a webpage
        driver.get("https://test.ai");

        // find the twitter icon
        List<WebElement> els = classifier.findElementsMatchingLabel(driver, "twitter");

        // make sure we have just one element which is a twitter icon, and click on it
        Assert.assertThat(els, IsCollectionWithSize.hasSize(1));
        els.get(0).click();

        // assert that we got to the appropriate twitter homepage
        Assert.assertEquals("https://twitter.com/testdotai", driver.getCurrentUrl());
    }
}

© 2018 - 2019 Cloud Grey, LP. All rights reserved. Appium is a registered trademark of the JS Foundation.
Appium logos, marks, and names used with permission.