Our engineers were very excited about this idea and kept a constant eye on the spec and its evolution. Once WebAssembly 1.0 was shipped in all major browsers, teams around eBay were eager to try it out.
But there was a problem. Though there are many use cases and applications that would benefit from WebAssembly, the scope of the technology within ecommerce is still primitive. We were not able to find a proper use case to leverage WebAssembly. A few suggestions came up, but we were better off with JavaScript itself. At eBay, when we evaluate new technologies, the first question we ask is “What potential value does this add to our customers?” Unless there is clarity around this, we do not proceed to the next step. It is very easy to be carried away by the new shiny thing and forget that it may make no difference to our customers and only complicate the existing workflow. User experience always trumps developer experience. But WebAssembly was different. It has tremendous potential; we just did not have the right use case. Well, that changed recently.
A barcode scanner
eBay native apps, both iOS and Android, have a barcode scanner feature in the selling flow. The feature leverages the device camera to scan a product UPC barcode and automatically fill out the listing, thus removing the manual overhead. This was a native app-only feature. It requires some intense image processing on the device to detect the barcode number from the camera stream. The retrieved code is then sent to a backend service which, in turn, fills out the listing. This means that the on-device image processing logic has to be very performant. For native apps, we compiled an in-house C++ scanner library into native code for both iOS and Android. It performed extremely well in extracting the product barcode from the camera stream. We are slowly transitioning to iOS and Android native APIs, but the C++ library is still solid.
The barcode scanner is an intuitive feature for our sellers, as it makes the listing flow more seamless. Unfortunately, this feature was not enabled for our mobile web users. We already had a well-optimized selling flow for the mobile web, except that the barcode scanner was not available, and sellers had to manually enter the product UPC, thus adding more friction.
A barcode scanner for the web
We have looked into implementing a barcode scanner for the web before. We, in fact, launched a version of the barcode scanner with the open source JavaScript library BarcodeReader. This was two years ago. The issue was that it performed well only 20% of the time. The remaining 80% of the time, it was extremely slow and users assumed it was broken. It timed out in the majority of cases. This is sort of expected, as JavaScript can indeed be as fast as native code, but only when it is on a “hot path,” i.e. heavily optimized by JIT compilers. The catch is that JavaScript engines use a lot of heuristics to decide if a code path is “hot,” and it is not guaranteed on every instance. This inconsistency obviously resulted in user frustration, and we had to disable the feature. But things are different now. With the web platform evolving at a rapid pace, the question resurfaced: “Can we implement a consistently performant barcode scanner for the web?”
One option is to wait for the Shape Detection API. This proposed web API brings many native image detection features to the web, one of which is barcode detection. It’s still in very early stages and has a long way to go to achieve cross-browser compatibility. Even then, it is not guaranteed to work on every platform. So we had to think about other options.
This is where WebAssembly comes into play. If a barcode scanner is implemented in WebAssembly, we can make strong guarantees that it would be consistently performant. The strong typing and structure of WebAssembly bytecode enable the compilers to always stay on the hot path. On top of that, we had an existing C++ library that was doing the job for native apps. C++ libraries are ideal candidates to be compiled to WebAssembly. We thought we had a clear path. Well, not exactly.
Architecture
Our engineering design to implement a WebAssembly-based barcode scanner was pretty straightforward.
- Compile the C++ library using Emscripten. This will generate the JavaScript glue code and .wasm file.
- Create a Worker thread from the main thread. The worker JavaScript will import the generated JavaScript glue code, which in turn instantiates the .wasm file.
- The main thread will send a snapshot from the camera stream to the worker, and the worker will call the corresponding wasm API through the glue code. The response of the API is passed back to the main thread. The response can either be the UPC string (which is passed to the backend) or an empty string if no barcode is detected.
- For the empty scenario, the above step is repeated until a barcode is detected. This loop is timed by a configurable threshold in seconds. Once the threshold is reached, we display a warning message “This is not a valid product code. Please try another barcode or search by text.” This either means the user is not focusing on a valid barcode or the barcode scanner is not performant enough. We track these timeout instances, as it is a good indicator of how well the barcode scanner is performing.
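The loop described in the last two steps can be sketched roughly as follows. This is a minimal illustration in TypeScript, not the production code: the names `Frame`, `ScanFn`, `scanUntilTimeout`, `grabFrame` and `timeoutMs` are all hypothetical, and the worker round trip is abstracted behind a `scan` function that resolves to an empty string when no barcode is detected.

```typescript
// Hypothetical types for illustration; none of these names come from the real implementation.
type Frame = { width: number; height: number; pixels: Uint8ClampedArray };

// Stands in for a round trip to the worker: resolves to the UPC string,
// or to "" when no barcode was detected in the frame.
type ScanFn = (frame: Frame) => Promise<string>;

interface ScanLoopOptions {
  grabFrame: () => Frame; // e.g. a snapshot taken from the camera stream
  scan: ScanFn;
  timeoutMs: number;      // the configurable threshold
  now?: () => number;     // injectable clock, defaults to Date.now
}

type ScanResult = { status: "found"; code: string } | { status: "timeout" };

async function scanUntilTimeout(opts: ScanLoopOptions): Promise<ScanResult> {
  const now = opts.now ?? Date.now;
  const deadline = now() + opts.timeoutMs;
  // Keep sending fresh frames until a barcode is found or the threshold is reached.
  while (now() < deadline) {
    const code = await opts.scan(opts.grabFrame());
    if (code !== "") {
      return { status: "found", code };
    }
  }
  // The caller would show the warning message and record the timeout here.
  return { status: "timeout" };
}
```

Returning a tagged result rather than throwing on timeout keeps the timeout case easy to count, which matters because timeouts are the signal used to judge scanner quality.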
Compilation
The first step for any WebAssembly project is to have a well-defined compilation pipeline. Emscripten has become the de facto toolchain for compiling WebAssembly, but the key is to have a consistent environment that produces a deterministic output. Our frontend is based on Node.js, which means we need a solution that works with the npm workflow. Fortunately, it was around the same time that the article “Emscripten and npm” was published by Surma Das. The Docker-based approach for compiling WebAssembly makes perfect sense, as it removes a ton of overhead. As recommended in the article, we went with the Docker Emscripten image by trzeci. We had to make a couple of tweaks to the custom C++ library so that it would compile to WebAssembly. That was mostly a trial and error exercise. Ultimately we were able to compile it and also set up a neat WebAssembly workflow within our existing build pipeline.
It was fast, but…
We calculate the performance of the scanner by counting the number of frames the wasm API can process in a second. The wasm API takes in a frame (in this case, the pixel data of an image snapshot from the live camera stream), performs the calculations and returns a response. This is done on a continuous basis until a barcode is detected. We measure it in terms of the well-known Frames Per Second (FPS) metric.
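To make the metric concrete, here is a minimal TypeScript sketch of how an FPS figure could be computed for any scan function. The helper name `measureFps` and its parameters are assumptions made for illustration; the real measurement ran against the wasm API through a worker, which this synchronous version does not model.

```typescript
// Hypothetical helper: runs `scanFrame` a fixed number of times and reports
// how many frames per second were processed. The clock is injectable so the
// calculation can be checked without real timing.
function measureFps(
  scanFrame: () => void,
  frames: number,
  now: () => number = Date.now,
): number {
  const start = now();
  for (let i = 0; i < frames; i++) {
    scanFrame();
  }
  const elapsedMs = now() - start;
  // Guard against a zero interval on very fast runs or coarse clocks.
  return elapsedMs > 0 ? (frames * 1000) / elapsedMs : Infinity;
}
```

With this definition, a scanner that needs 20 ms per frame comes out at 50 FPS, and one that needs a full second per frame comes out at 1 FPS.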
In our testing, the WebAssembly implementation performed at an astonishing 50 FPS on average. However, it worked only 60% of the time with the current timeout threshold. Even with this high FPS, it was not able to quickly detect the barcode for the remaining 40% of valid scans and ended up displaying the warning message. To put this in comparison, the JavaScript implementation that we tried earlier performed only at 1 FPS for the vast majority of scans. So for sure, WebAssembly is faster (50x), but somehow it was not able to detect the barcode in the allocated time for 40% of the scans. It should also be mentioned that in certain scenarios, JavaScript performed really well and was able to detect the barcode immediately. One obvious option would be to delay showing the warning message, but that would only increase user frustration without solving the real problem. So we dropped that idea.
Initially, we were clueless about why the custom C++ library, which worked perfectly well for native apps, did not produce the same result for the web. After a lot of testing and debugging, we found that the angle at which the camera focuses on the object, along with the background shadow, determines the time for successful detection. Then how did it work in native apps? Well, in native apps we use inbuilt APIs to either autofocus or provide user tap focus to the center of the object that is being scanned. This enables native apps to send high-quality image pixel data (i.e. information only about the barcode) to the scanner library at all times. This avoids the blurry image situation, hence the consistently fast response times.
Now that we had an idea about what was going on, we thought maybe a different native library might perform better under varied focus conditions. The open source barcode reader ZBar is pretty popular and stable. More importantly, it works well with blurry and grainy images. Why not try that? Since we had a WebAssembly workflow already set up, compiling and deploying ZBar as WebAssembly was seamless. We then began evaluating the ZBar implementation. The performance was decent, around 15 FPS (not as good as our custom C++ lib). However, the success rate was close to 80% for the same timeout threshold. Definitely an improvement over our custom C++ library, but still not 100% reliable.
We were still not satisfied with the outcome, but we noticed something unexpected. In the scenarios in which ZBar timed out, the custom C++ library was able to get the job done very quickly. This was a sweet surprise. Apparently, based on the quality of the image snapshot, the two libraries performed differently. This gave us an idea.
Multithreading and racing to the rescue
You probably guessed it. Why not create two web worker threads, one for ZBar and one for the custom C++ library, and race them against each other? The winning response (i.e. the first one to send a valid barcode) is sent to the main thread, and all workers are terminated. We set this up and started internal dogfooding to simulate as many scenarios as possible. This setup yielded a 95% success rate when scanning a valid barcode. That is much better than our previous success rates, but it still falls short of 100%.
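The racing idea can be sketched in TypeScript as below. This is an illustration under stated assumptions rather than the actual worker controller: `Scanner` and `raceScanners` are hypothetical names, each scanner is modeled as a function returning a promise (in practice this would wrap a worker's message exchange), and terminating the losing workers is left out.

```typescript
// Hypothetical shape of a scanner behind a common interface.
// `scan` resolves to the barcode string, or "" when nothing was detected.
type Scanner = {
  name: string;
  scan: (pixels: Uint8ClampedArray) => Promise<string>;
};

type RaceResult = { winner: string; code: string } | null;

// Resolves with the first non-empty result. Empty results and failures are
// ignored unless every scanner comes back without a barcode, in which case
// the race resolves to null.
function raceScanners(scanners: Scanner[], pixels: Uint8ClampedArray): Promise<RaceResult> {
  return new Promise<RaceResult>((resolve) => {
    let pending = scanners.length;
    if (pending === 0) {
      resolve(null);
      return;
    }
    const settleEmpty = () => {
      pending -= 1;
      if (pending === 0) resolve(null);
    };
    for (const scanner of scanners) {
      scanner.scan(pixels).then(
        (code) => {
          // Only the first resolve() call has any effect; later ones are no-ops.
          if (code !== "") resolve({ winner: scanner.name, code });
          else settleEmpty();
        },
        () => settleEmpty(),
      );
    }
  });
}
```

`Promise.race` alone would not work here, because it settles on the first response of any kind, including an empty one; the race has to skip empty responses and wait for the first valid barcode.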
One weird suggestion was to also put the original JavaScript library into the mix. This would make it three threads. We honestly did not think that this would make a difference. But it was easy to try out, as we had standardized the worker interface. To our surprise, with three threads racing against each other, the success rate was indeed close to 100%. This again was totally unexpected. As mentioned earlier in the post, JavaScript did perform very well in certain scenarios, and this factor seemed to close the gap. So yes, “always bet on JavaScript.” Jokes aside, the following diagram provides a good overview of the final architecture we implemented.
The following illustration shows a high-level flowchart:
A note about asset loading
The assets required for the barcode scanner are prefetched after the main page is rendered. This ensures that the sell landing page loads fast and is ready for interaction. The WebAssembly assets (wasm files and associated glue code scripts) and the JavaScript scanner lib are prefetched and cached after the load event of the page using XMLHttpRequest. The point to note here is that they are not executed. This keeps the main thread free for user interaction. Execution happens only when the user taps on the barcode icon. If a user taps on the barcode icon before the assets are loaded, we load them on demand and execute them immediately. The barcode event handler and worker controller are bundled as a part of the initial page load, but they are very small in size.
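The prefetch-then-load-on-demand behavior can be sketched in TypeScript as follows. The names `Loader` and `createAssetCache` are hypothetical, and the network call is passed in as a function so the sketch stays independent of the browser; in the page it would be a small wrapper around XMLHttpRequest, called after the load event.

```typescript
// Hypothetical loader signature: fetches the bytes of an asset without executing it.
type Loader = (url: string) => Promise<ArrayBuffer>;

function createAssetCache(load: Loader) {
  const cache = new Map<string, Promise<ArrayBuffer>>();

  // Returns the cached download if one exists (finished or in flight),
  // otherwise starts it on demand.
  function get(url: string): Promise<ArrayBuffer> {
    let entry = cache.get(url);
    if (!entry) {
      entry = load(url);
      cache.set(url, entry);
      // Drop failed downloads so a later tap can retry them.
      entry.catch(() => {
        cache.delete(url);
      });
    }
    return entry;
  }

  // Kicks off downloads in the background; nothing is instantiated or executed.
  function prefetch(urls: string[]): void {
    urls.forEach((url) => {
      get(url);
    });
  }

  return { get, prefetch };
}
```

Caching the promise rather than the finished bytes means a tap that arrives while a prefetch is still in flight reuses that request instead of starting a second one.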
Results
After thorough testing and internal dogfooding, the feature was launched as an A/B test. The “Test” bucket in the experimentation showed the barcode scanner icon (screenshot below) and the “Control” did not.
The metric that was used to evaluate the success of the A/B test was something called “Draft Completion Rate.” It is the rate at which a listing goes from a draft stage to successfully completed and submitted. The Draft Completion Rate is a good indicator to support the notion that reducing friction and providing a seamless selling flow through a barcode scanner should enable more listings to be completed. We ran the tests for a couple of weeks, and when the results came back they were indeed very satisfying. They perfectly aligned with our original hypothesis. The Draft Completion Rate improved by 30% for the listing flow with a barcode scanner enabled.
We also added profiling to get the distribution of which type of scanner wins the race. The results were as expected, with ZBar accounting for 53% of successful scans, followed by the custom C++ lib with 34%, and finally the JavaScript lib with 13%.
Conclusion
The whole WebAssembly journey was a great learning experience for us. Engineers get pretty excited about new technologies and immediately want to try them out. If the same technology makes a positive impact on a customer-centric metric, it is a double delight. This goes back to an earlier point in this post. Technology evolves at a very rapid pace. Every day we hear of new things getting launched. But only a few make a difference to customers, and WebAssembly is one of them. This was our biggest learning from this exercise: saying “No” to 99 things and “Yes” to the one thing that really matters to our customers.
As next steps, we are looking into expanding the barcode scanner to the buying side of the mobile web, which would allow buyers to scan items to search for and purchase. We will also look into augmenting this feature with the Shape Detection API and other in-browser camera capabilities. Meanwhile, we are happy that we found the right use case for WebAssembly at eBay and that we are bringing the technology to ecommerce.
Special thanks to Surma Das and Lin Clark for their numerous articles on WebAssembly. They really helped us get unblocked on various occasions.