Nurun worked on a prototype that combines new technologies with insights from a recent ethnographic sprint. The objective of the overall project was to improve the shopping experience for visually impaired people with the use of new technologies. The insights that emerged from the ethnography served as foundational elements for our technological analysis and eventually led us to prototyping. We learned a few surprising things at every step of the process, from the way visually impaired people use technologies to the inner workings of computer vision.

The Process

It all started with a simple question: how do the blind go shopping? It’s the kind of question that this guy answers on his YouTube channel.

We were planning on refreshing our knowledge of computer vision technologies, but were looking for a way to constrain it with real user requirements—the visually impaired seemed like the perfect audience to have in mind.

We started with an ethnographic study to gain more insight into the challenges that the visually impaired face when they go shopping. We then used those insights as inspiration for a prototype idea enhanced by an exploration of the latest in computer vision technologies.

Step 1—The Ethnography

Designing for the visually impaired requires more empathy than designing for other target audiences. Therefore, conducting an ethnographic study was crucial to the success of this project. As we discovered, our assumptions were slightly off from the actual realities of the visually impaired. For instance, their knowledge of accessible technologies surpassed what we initially thought, so much so that they had already ruled out some technical approaches for enhancing their lives, such as guided indoor navigation. Furthermore, we discovered that the visually impaired are methodical in their approaches to adopting new tools, which leaves developers little margin for error. It was clear that good intentions were not enough; understanding their true realities was much more important.

We interviewed five people who had different levels of visual impairment and who lived in different social contexts. We wanted to understand the role that technology currently plays in their lives and how they used technology, if at all, to overcome the challenges and anxieties related to shopping with a visual impairment.

Throughout our ethnographic interviews and the analysis that followed, a few key elements emerged:

  • Efficient technology is readily explored: The visually impaired people we interviewed were particularly tech-savvy. They were open to the potential of the newest technologies, most owned smartphones, and they were very rigorous in their approaches to learning new technologies. Online shopping was often their preferred shopping method, and they were also quick to discover new products and services that became accessible through various online and offline channels.
  • Shopping assistance is almost always necessary: In contrast, our informants were clear about the impracticalities of going shopping by themselves. Store layouts are unpredictable and can be very hazardous to navigate. Guide dogs aren’t fully reliable either, since smells and other people in the store can disturb them. Instead, the blind must rely on human assistants that guide them around the store. It is easy to see then how online shopping has for many become their preferred shopping method.
  • The experience of discovery is difficult: For the blind, there is no stumbling upon something or someplace serendipitous or new; they must rely on others to discover new products. Although some store doors are equipped with buttons that announce the name of the shop and what it sells, the equipment cost remains too high for widespread adoption. Additionally, unexpected sensory inputs can create dissonance and suddenly make navigation more difficult.
  • Tools and technology cannot take the place of previously mastered skills: Canes, dogs, and even talking GPS are not replacements for the ability to walk in a straight line and sense the environment. Acquiring new tools only happens when the underlying skill has been mastered. For instance, guide dogs are generally obtained when a person has perfected the ability to use a cane. The key is that they are the ones walking the dog and not the other way around. This is the same for the adoption of new technologies as well.

Example of a tool: A contrast plate to see food (low vision).

These insights led us to a brainstorming session, guided by what we had learned.

Step 2—Ideation

We set out to develop an idea that would enhance, rather than replace, a skill mastered by the visually impaired. It was tempting to develop a navigation system that would allow the visually impaired to go around the store by themselves, but we resisted as our informants were adamant that the technology is still not as reliable as human assistants.

Instead, we came up with a mobile application that augments the autonomy of the visually impaired by allowing them to manage their own shopping list, empowering them to direct their shopping assistant.

Furthermore, we sought to develop this application in a way that would also be useful for the sighted.

  • Shopping list management: The app can automatically check the product off the list when it is scanned, which is much more convenient than going through the list with VoiceOver (an accessibility feature that reads aloud the element that is selected on the screen). The scanning occurs continuously and in real time, so the visually impaired only need to turn the product around in front of the smartphone until it is recognized, instead of having to take perfectly focused snaps.
  • Efficient store navigation: To optimize going around the store, the application groups list items based on their proximity to one another and their location in the store. For instance, vegetables are located in the produce section of most grocery stores. As the user walks around the store, the application automatically highlights the items on the list that are nearby; blind users will get a spoken list. This is useful information for instructing the store clerk on which items to pick up, and for avoiding any difficult and anxiety-provoking backtracking on the part of the visually impaired shopper.
  • Improved discovery: The previous shopping lists created by the user are used to recommend items for new shopping lists. Those recommendations can come directly from previous purchases, or from other users who have similar purchase histories. Furthermore, the store can suggest recipes that combine already purchased ingredients with new ones that have yet to appear on the visually impaired shopper’s list.
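As a rough illustration of the first two features, the sketch below models a shopping list that checks an item off when a product is scanned and lists the unchecked items for the section the shopper is standing in. It is not the prototype's code: the `Item` and `ShoppingList` names, the section labels, and the exact-name matching are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    name: str
    section: str          # hypothetical store section, e.g. "produce"
    checked: bool = False


@dataclass
class ShoppingList:
    items: List[Item] = field(default_factory=list)

    def check_off(self, scanned_name: str) -> bool:
        """Mark the first unchecked item matching a scanned product.

        Returns True if an item was checked off, False otherwise.
        """
        for item in self.items:
            if item.name == scanned_name and not item.checked:
                item.checked = True
                return True
        return False

    def nearby(self, section: str) -> List[str]:
        """Names of unchecked items located in the given section."""
        return [i.name for i in self.items
                if i.section == section and not i.checked]


shopping = ShoppingList([
    Item("carrots", "produce"),
    Item("apples", "produce"),
    Item("milk", "dairy"),
])
shopping.check_off("carrots")        # simulated scan of a recognized product
print(shopping.nearby("produce"))    # items still to pick up in this section
```

In a real application the list returned by `nearby` would be handed to the speech synthesizer for blind users and highlighted on screen for sighted ones.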

After identifying these three principles, we were ready to prototype.

Step 3—Prototyping

To achieve the features of our prototype, we looked into three technological topics:

  • Computer Vision
  • Indoor Location
  • Speech Synthesis

Computer Vision: As computers interact with the real world, they are developing the ability to understand image data. Computer Vision is a key component of many advanced technologies such as robotics, medical image analysis and surveillance. In our day-to-day life, computer vision is responsible for technologies such as OCR (character recognition), facial recognition (such as with advanced auto-focus on modern cameras) and even QR Codes. While computer vision tends to be computationally intensive, the improvements to processing capabilities make computer vision increasingly possible, especially in the mobile space.

After looking at several image recognition solutions for our prototype, such as IQ Engines, Qualcomm Vuforia, Pointcloud, Kooaba, Layar, Metaio and OpenCV, we chose Moodstocks. Moodstocks provides an impressive real-time image recognition engine that can track more than 1,500 images, 10 times more than what competing solutions can recognize.

The size of the image set in itself, however, was insufficient for our needs, so we decided to use indoor location data to switch data sets as the user walks through the store, making the number of images that can be recognized potentially unlimited.
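The data set switching idea can be sketched as follows. This is a minimal illustration, not the Moodstocks API: the `Recognizer` class, the zone names, and the image identifiers are hypothetical, and only the 1,500-image limit comes from the text above.

```python
# Hypothetical mapping from a store zone to the image set loaded for it.
IMAGE_SETS = {
    "produce": ["bagged-salad", "orange-juice"],
    "dairy": ["milk-2l", "butter"],
}


class Recognizer:
    """Stand-in for an engine that holds one bounded image set at a time."""

    def __init__(self, image_sets, max_images=1500):
        self.image_sets = image_sets
        self.max_images = max_images
        self.zone = None
        self.active = []

    def enter_zone(self, zone):
        """Swap the active image set when the user moves to a new zone.

        Returns True if a new set was loaded, False if it was already active.
        """
        if zone == self.zone:
            return False
        images = self.image_sets.get(zone, [])
        if len(images) > self.max_images:
            raise ValueError("image set exceeds the engine's capacity")
        self.zone = zone
        self.active = list(images)
        return True

    def can_recognize(self, image_id):
        """Only images in the currently loaded set can be matched."""
        return image_id in self.active


recognizer = Recognizer(IMAGE_SETS)
recognizer.enter_zone("produce")
print(recognizer.can_recognize("bagged-salad"))
print(recognizer.can_recognize("butter"))
```

Each zone stays under the per-set limit, while the total number of recognizable images grows with the number of zones.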

Moodstocks is able to recognize a product even when it is partially occluded. In addition, when an occluded product is recognized, but the visible part of the product is not enough to distinguish between similar products, Moodstocks correctly identifies this match as a partial match, which allows the application to provide better feedback and directives to the user. While partial matching is much more advanced than in other solutions, it does not work in all cases. For instance, when color is the only difference between products, Moodstocks does not recognize the products as being different.
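One way an application might turn full and partial matches into directives for the user is sketched below. The `scan_feedback` function and its `candidates` argument are hypothetical; they illustrate the feedback logic only and do not reflect the return type of any real SDK.

```python
def scan_feedback(candidates):
    """Turn recognition candidates into a prompt for the user.

    `candidates` is a hypothetical list of product names that the engine
    could not rule out: empty for no match, one name for a full match,
    several names for a partial match.
    """
    if not candidates:
        return "No product recognized yet. Keep turning the product."
    if len(candidates) == 1:
        return f"Recognized {candidates[0]}."
    options = " or ".join(candidates)
    return f"This could be {options}. Turn the product to show more of the label."


print(scan_feedback([]))
print(scan_feedback(["tomato soup"]))
print(scan_feedback(["tomato soup", "vegetable soup"]))
```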

Another key limitation of the non-specialized image recognition solutions that we analyzed is that they rely on image features (edges) being present on the objects needing recognition. Smooth, textureless images and images that include highly repetitive patterns are not very well recognized. Organic elements such as vegetables do not work with any of the solutions that we tested.

Indoor Location
Indoor location technologies allow us to track the location of a device inside a building, where GPS signals usually cannot reach. To meet our indoor location requirements, we chose to use iBeacons, a new technology introduced in iOS 7.

Beacon region monitoring is a technique that allows devices to be notified when a beacon is in close proximity or when the beacon is no longer in range. Beacon region monitoring is similar in many ways to circular region monitoring (geofencing), with the exception that it doesn’t have inherent location coordinates. Beacon monitoring is the most precise of the location technologies that exist today, which made it appropriate for giving an accurate shopping list based on the aisle in which the user was standing.
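The enter and exit notifications described above can be modeled with a small state comparison. This is a simplified simulation rather than the iOS implementation: the `RegionMonitor` class and the aisle identifiers are hypothetical, and regions are plain identifiers with no coordinates attached.

```python
class RegionMonitor:
    """Tracks which monitored beacon regions are currently in range."""

    def __init__(self, regions):
        self.regions = set(regions)   # hypothetical beacon identifiers
        self.inside = set()

    def update(self, visible):
        """Compare the beacons now in range with the previous state.

        Returns a tuple (entered, exited) of sorted identifier lists.
        Beacons that are not being monitored are ignored.
        """
        visible = set(visible) & self.regions
        entered = sorted(visible - self.inside)
        exited = sorted(self.inside - visible)
        self.inside = visible
        return entered, exited


monitor = RegionMonitor(["aisle-3", "aisle-4"])
first = monitor.update(["aisle-3"])
second = monitor.update(["aisle-4", "unknown-beacon"])
print(first)
print(second)
```

An application could react to each "entered" identifier by announcing the list items stocked in that aisle.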

Speech Synthesis
Speech synthesis is the ability to convert text into speech. For our prototype, we used the speech synthesis system that’s available in iOS’s Accessibility menu.

The resulting experience is a voice guide throughout the shopping experience. The guide accompanies users from the moment they step into the store to when they leave. With every step, users’ locations are used to efficiently guide them through their shopping list, and they are only notified about nearby elements. At each step, users maintain full control over their shopping.

The Result
The result is a functional prototype that combines ethnographic insights and the newest technologies to enhance the shopping experience of the visually impaired. The technologies that we analyzed are impressive; however, some limitations remain, especially around image recognition. Even if our idea does not remove all of the frictions that the visually impaired face while shopping, it has the potential to enhance their experience significantly.

Cécile Eymard from Nurun’s Montreal office also contributed to this article.