Why is XCUITest so Bad

The problems XCUITest has in providing the correct order of elements, their geometry and their visibility.

Jon Gabilondo
11 min read · Mar 29, 2020
A view of XCUITest UI Tree with Organismo.

Conclusion I

Let’s start from the end, the conclusion. Just for the fun of breaking the usual order and saving you time.

XCUITest does many things well, but it does some things badly. Since it is the only way to run UI tests on iOS, any problem it has becomes a big one. There are many UI testing systems, but they all have to go through XCUITest in the end.

If for any reason you need something as simple as the visibility of a UI element, you are headed for an unsatisfying job. The reason is that the “visibility” of a UI element is a very complicated matter. So much so that Apple doesn’t want to get involved in it. You will therefore be alone in figuring out that “simple” property, condemned to write a ‘good enough’ solution that needs continuous refinement as you go along.

You will find a similar situation with the order of elements. Often, the order of elements as you traverse the UI tree with the XCUITest API does not match what is shown on the screen. This can be a serious problem if you want to find the element that corresponds to a given X, Y on the screen. Again, finding an element by its X, Y location is not something that Apple provides, so you need to use the XCUITest element traversal API to find the element.

The conclusion is: if you can, stay away from computing visibility and from finding elements by X, Y in your UI tests.

Are these complaints about XCUITest APIs that are not available? Yes and no. Read on if you are interested in more detail.

Why Get Into Trouble

I wonder if this is what Apple asked themselves when confronted, if they ever were, with the question of providing an ‘isVisible’ and an ‘elementByPoint’.

First let’s see what the XCUITest API provides for elements (XCUIElement) that could help us find out whether one is visible. We can see the documentation here:

Through the XCUIElementAttributes protocol we can get to other important values such as identifier, elementType, value, title, label, enabled, selected and frame.

The attributes exists, hittable and frame would seem enough to find out whether an element is visible; hittable could even be taken as a synonym for visible.

In any case, isVisible and elementByPoint are not available, so why look for trouble? Well, if you are doing serious testing and you did not start today, you are probably obliged to support those two functions. Take the WebDriver specification, with Selenium and Appium as implementations of it, or even older testing frameworks: those functions are there.

To understand the issue, it is very interesting to read the section of the WebDriver specification on element displayedness, written for those who have to implement the feature. We can see how a field (app testing) that is about fail-proof operations, such as finding the UI element with the label “Login”, moved into terminology like “perceptually visible to the human eye” and “based on crude approximations about an element’s nature and relationship in the tree”. To add to the mix, we find something that sounds as simple as “An element is in general to be considered visible if any part of it is drawn on the canvas within the boundaries of the viewport”. But it is not that simple; it is a trap.

Element displayedness

Although WebDriver does not define a primitive to ascertain the visibility of an element in the viewport, we acknowledge that it is an important feature for many users. Here we include a recommended approach which will give a simplified approximation of an element’s visibility, but please note that it relies only on tree-traversal, and only covers a subset of visibility checks.

The visibility of an element is guided by what is perceptually visible to the human eye. In this context, an element’s displayedness does not relate to the visibility or display style properties.

The approach recommended to implementors to ascertain an element’s visibility was originally developed by the Selenium project, and is based on crude approximations about an element’s nature and relationship in the tree. An element is in general to be considered visible if any part of it is drawn on the canvas within the boundaries of the viewport.

The element displayed algorithm is a boolean state where true signifies that the element is displayed and false signifies that the element is not displayed. To compute the state on element, invoke the Call(bot.dom.isShown, null, element). If doing so does not produce an error, return the return value from this function call. Otherwise return an error with error code unknown error.

This function is typically exposed to GET requests with a URI Template of /session/{session id}/element/{element id}/displayed.

Deeper Into the Trouble

To make things worse, we now have to discuss what visibility is and what display is.

In HTML/CSS, display and visibility are commonplace notions, similar but with different meanings. We can illustrate them with this example in codepen.io. Three simple CSS boxes show that display decides whether space is kept for the element, and visibility decides whether it is drawn.

Three simple CSS boxes, with default visibility and display.
Box b with display : none.
Box b is displayed, therefore taking the space, but with visibility : hidden

Would the WebDriver specification stick to these principles and base its displayed/visibility functions on that simple rule? Unfortunately not. Visibility is naturally and unavoidably tied to whether the human eye can see the element, and that is the result of a combination of location, size, opacity, the NOSCRIPT tag, the hidden flag, stroke, ghosts and so on.

It is interesting to point out that the old Selenium RC had a visible property, which has been deprecated. Currently the property is called displayed:

Given this background, one can understand wanting to avoid these fuzzy questions: why provide an API call that is not fail-proof? Why not just stick to finding elements in a tree and performing taps on them, leaving “perception” issues aside?

XCUITest, it seems to me, very intelligently decided not to get into visibility issues, even though visibility seems a totally legitimate property and is part of the standard in the field (WebDriver).

I have the feeling, though, that by providing the “hittable” API call XCUITest may fall through the same cracks. I would say that computing “hittable” poses similar challenges.

Let’s explore if XCUIElement’s “exists” and “hittable” would be synonyms of WebDriver’s “display” and ”hittable”.

Show Me An Example

Since I carry the burden of proof, I will show some examples. It is not difficult to find cases with a tool for exploring iOS apps such as Organismo. We are going to use Appium’s WebDriverAgent (originally from Facebook), which is an implementation of the WebDriver specification on top of XCUITest.

Example of “visibility” Inaccuracy

The Wikipedia page is a good example of how difficult it is to figure out element visibility. The language selection menu in the search field is invisible to the WebDriverAgent. In the UI tree on the right side, see the elements “(XCUIelementType)Other” and “(XCUIelementType)StaticText”: their visibility value is false.

Organismo exploration of the Wikipedia page, focusing on the language popup menu within the search field.

You may rightly think that this is a problem of Appium’s WebDriverAgent, so why blame XCUITest? Let’s move on and see whether XCUITest gets it right.

Let’s see what the plain XCUITest API tells us about that language selection UI element. I have written a simple XCUITest test to find the menu element and its child, both with the label “EN” as we saw in the Organismo tree. It logs the exists, hittable and enabled attributes of XCUIElement.

XCUITest test to get the “EN” element and get the attributes.

The test confirms my fears. The elements are found; this is a 100% success operation, with no room for surprises. The exists and enabled attributes are therefore TRUE; this is exact science, with no room for calculations of perception. The hittable attribute is FALSE. But the element is hittable: I can send a tap to it and the menu will expand. XCUITest is playing the visibility game with hittable. And it loses.

Example of Geometry Inaccuracy

Let’s take Apple’s own web site. The Apple icon at the top is represented by a top XCUIElementOther of 49x53 points. See the pink rectangle at the top of the image below. But its XCUIElementLink child is offset from it.

Appium’s Despair to Provide the Visibility Attribute

The WebDriverAgent is a smart implementation of the WebDriver specification using XCUITest. Originally an in-house Facebook project, now discontinued, it has been taken forward by Appium.

The WebDriverAgent is an extraordinary exercise in taking the XCUITest API and extending it to provide the WebDriver specification in as fast and reliable a manner as possible. It pushes XCUITest to the limit, using private API to provide what Apple doesn’t.

The WebDriverAgent code that figures out the visibility of an object is really worth reading. It exemplifies the complexity of the notion of visibility. You can find the code here, in the XCUIElement+FBVisible.m category.

- (BOOL)fb_isVisible

The function above is the extension to the XCElementSnapshot class that computes the visibility attribute. In it we can see code that checks whether the element is within the limits of the screen. This is not as straightforward as one may think either. It calls a complicated function to see whether the frame of the element is inside the bounds:

- (CGRect)fb_frameInContainer:(XCElementSnapshot *)container hierarchyIntersection:(nullable NSValue *)intersectionRectange

The process continues with even more impressive tests based on a private API “id<XCTestManager_ManagerInterface>” that gives the element at a given point calling the function “_XCT_requestElementAtPoint”.

+ (XCAccessibilityElement *)axElementWithPoint:(CGPoint)point {
  __block XCAccessibilityElement *onScreenElement = nil;
  id<XCTestManager_ManagerInterface> proxy = [FBXCTestDaemonsProxy testRunnerProxy];
  dispatch_semaphore_t sem = dispatch_semaphore_create(0);
  [proxy _XCT_requestElementAtPoint:point reply:^(XCAccessibilityElement *element, NSError *error) {
    if (nil == error) {
      onScreenElement = element;
    } else {
      [FBLogger logFmt:@"Cannot request the screen point at %@: %@", [NSValue valueWithCGPoint:point], error.description];
    }
    dispatch_semaphore_signal(sem);
  }];
  dispatch_semaphore_wait(sem, dispatch_time(DISPATCH_TIME_NOW, (int64_t)(0.3 * NSEC_PER_SEC)));
  return onScreenElement;
}

We can see even smarter things. They use YYCache to cache the visibility boolean of every element in the tree, because an element often computes as non-visible while some of its children compute as visible (!). Therefore an element is considered visible if some child is visible.
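The child rule described above can be sketched in a few lines of plain C, which also compiles unchanged inside an Objective-C file. This is not WebDriverAgent's code: the Node struct, its fields and node_is_visible are hypothetical names invented for the illustration, and a field on the node stands in for the YYCache lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a snapshot node; not a real XCUITest type. */
typedef struct Node {
    bool own_visible;        /* what the node computes for itself */
    int cached;              /* -1 = not computed yet, 0 = hidden, 1 = visible */
    struct Node **children;  /* may be NULL when child_count is 0 */
    size_t child_count;
} Node;

/* A node counts as visible if it, or any descendant, computes as visible.
   The result is stored on the node so each subtree is evaluated only once. */
bool node_is_visible(Node *node) {
    assert(node != NULL);
    if (node->cached >= 0) {
        return node->cached == 1;
    }
    bool visible = node->own_visible;
    for (size_t i = 0; !visible && i < node->child_count; i++) {
        visible = node_is_visible(node->children[i]);
    }
    node->cached = visible ? 1 : 0;
    return visible;
}
```

A cache like this has to be reset (cached = -1) whenever a new snapshot of the tree is taken, otherwise stale answers survive a screen change.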

And best of all, they use a private flag of XCUITest to get the visibility value (!). Here it is:

NSNumber *isVisible = self.additionalAttributes[FB_XCAXAIsVisibleAttribute];
if (isVisible != nil) {
  return isVisible.boolValue;
}

This is the first check in the “fb_isVisible” function, and for the language menu it is FALSE. Appium does not do any further tests and relies on that attribute.

Even with the check of the snapshot attribute commented out, Appium keeps looking for trouble by asking for the XCAccessibilityElement at the element’s mid-point using axElementWithPoint (above), which doesn’t return anything that matches the expectations. The result is that both elements related to the language menu get the visible attribute FALSE.

Does XCUITest Have a Visible Attribute?

One must admit that Appium knows how to dig into XCUITest. Officially, XCUITest doesn’t mention anything about visibility in its API, and until now I have been saying that this might be a smart thing to do.

Through Appium’s serious exploitation of XCUITest we see that Apple has something up its sleeve. Below we can see part of the private API of XCElementSnapshot. The additionalAttributes dictionary seems to be keeping secret information.

@interface XCElementSnapshot : NSObject <XCUIElementAttributes, NSSecureCoding> {
....
@property(copy) NSDictionary *additionalAttributes; // @synthesize additionalAttributes=_additionalAttributes;
....
}

// Keys of additionalAttributes:
NSNumber *FB_XCAXAIsVisibleAttribute;
NSString *FB_XCAXAIsVisibleAttributeName = @"XC_kAXXCAttributeIsVisible";

Appium is using undocumented private API to get the visibility attribute from additionalAttributes. And it seems to work, in most cases. It doesn’t for the language menu in Wikipedia.

If “XC_kAXXCAttributeIsVisible” is unreliable, it would have to be contrasted with other attributes such as coordinates; and if so, maybe it should not be used at all.

My Solution for Visibility

I would base visibility on the simple geometry test that would always work: is the element’s frame on the screen? From there, bug free, I would decide whether to go through its children and take the last child that contains my point to find the element I want. To know whether it is visible to the eye, I could check children clipping and take my chances.

visibleRect = [self fb_frameInContainer:parentWindow hierarchyIntersection:nil];

For this simple solution to work, it is fundamental that the order of the elements and their geometry information are correct, which is often not the case in XCUITest.
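To make the dependency on order and geometry concrete, here is a minimal sketch of the "last child that contains my point" traversal in plain C (it also compiles inside an Objective-C file). It is an illustration under assumptions, not the modified WebDriverAgent code: Rect stands in for CGRect, and Element, rect_contains and element_at_point are hypothetical names. It assumes children are stored in drawing order, so later siblings are on top.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for CGRect and a snapshot element. */
typedef struct { double x, y, width, height; } Rect;

typedef struct Element {
    const char *label;
    Rect frame;                 /* in screen coordinates */
    struct Element **children;  /* assumed drawing order: last child is on top */
    size_t child_count;
} Element;

static bool rect_contains(Rect r, double px, double py) {
    return px >= r.x && px < r.x + r.width &&
           py >= r.y && py < r.y + r.height;
}

/* Returns the deepest element whose frame contains the point, preferring
   later siblings, or NULL if the point is off screen or outside the root. */
const Element *element_at_point(const Element *root, Rect screen,
                                double px, double py) {
    assert(root != NULL);
    if (!rect_contains(screen, px, py) || !rect_contains(root->frame, px, py)) {
        return NULL;
    }
    for (size_t i = root->child_count; i > 0; i--) {
        const Element *hit = element_at_point(root->children[i - 1], screen, px, py);
        if (hit != NULL) {
            return hit;
        }
    }
    return root;
}
```

The sketch stops descending as soon as a parent's frame does not contain the point, so a child reported at an offset from its parent, or siblings reported in the wrong order, produce a wrong answer. That is exactly why the tree data has to be right.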

I have modified the WebDriverAgent “- (BOOL)fb_isVisible” to evaluate visibility by intersection with the screen. I get a much more satisfactory result, and faster. The language menu is visible now. See the tree in the image below.

The language menu is visible with simpler visibility attribute computation.

Update

Since we have already opened up to private API, look at this property in XCElementSnapshot:

@property(readonly) struct CGRect visibleFrame;

Visible frame, seriously? Does it work? Let’s change the visibility function:

- (BOOL)fb_isVisible {
return CGRectIsEmpty(self.visibleFrame) == NO;
}

Although it solves the language menu in Wikipedia, it does not check whether the element is within the boundaries of the screen. Elements outside the screen area have a non-zero visibleFrame. We would need to test the results of using visibleFrame and clipping it to the screen bounds.
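The clipping step could look roughly like the following plain C sketch (valid inside an Objective-C file as well). It is untested against XCUITest itself: Rect is a hypothetical stand-in for CGRect, and rect_intersection and is_visible_on_screen are illustrative names. In Objective-C the same check could be written with CGRectIntersection and CGRectIsEmpty.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for CGRect. */
typedef struct { double x, y, width, height; } Rect;

/* Intersection of two rects; all zeros when they do not overlap. */
Rect rect_intersection(Rect a, Rect b) {
    double left   = a.x > b.x ? a.x : b.x;
    double top    = a.y > b.y ? a.y : b.y;
    double right  = (a.x + a.width)  < (b.x + b.width)  ? (a.x + a.width)  : (b.x + b.width);
    double bottom = (a.y + a.height) < (b.y + b.height) ? (a.y + a.height) : (b.y + b.height);
    Rect out = { left, top, right - left, bottom - top };
    if (out.width <= 0 || out.height <= 0) {
        out = (Rect){ 0, 0, 0, 0 };
    }
    return out;
}

/* Visible only if some part of the visible frame survives clipping to the screen. */
bool is_visible_on_screen(Rect visible_frame, Rect screen) {
    assert(screen.width > 0 && screen.height > 0);
    Rect clipped = rect_intersection(visible_frame, screen);
    return clipped.width > 0 && clipped.height > 0;
}
```

With a 375x667 screen, a frame at y = 800 is rejected even though it is not empty, while a frame that only partly overlaps the bottom right corner still counts as visible.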

Conclusion II

I don’t think UI testing APIs in general should have fallen into complexities and uncertainties that turn any request into something less than 100% accurate. I think it adds noise to engineering, wasted time and frustration. Especially not in a field that is all about elements in a tree with clearly defined size, position and visualisation properties.

AI image-based testing accepts from the beginning that it is about a percentage of accuracy, but the benefits are worth it.

XCUITest’s errors in geometry and coordinates should not be acceptable; it is a matter of being more precise in the calculations.

Is there a solution to the visibility problem? It may be a semantic problem, a mix-up of the computing notion and the human notion of visible. The computing notions of display, hidden and visible are very clear, with no room for mistakes: the elements will be considered for mapping onto video memory or not, and simple clipping will do the rest. Will my eye see the element? Well, that I can’t tell you. What values did you give to opacity and stroke? What other elements are on top, and what about their opacity? Can I click on them? Same response.

The only party that could really know whether something is visible would be the GPU and the code that draws into video memory. But the display engine is not, and will not be, built to keep up a conversation with the testing engine to tell it how the painting went for every element.

Personally, I would prefer a reliable API for the element bounds, to know whether an element is within the screen. Visibility/display should be constrained to whether the element is meant to be painted on screen.

Of course what I’d really want is a better XCUITest with no mistakes in geometry and element order. And a visibility and hittable attribute that is 99.99% accurate. Apple is in a better position than us to solve this problem.

Thanks !

You can play with Organismo by downloading it from here:
