An open secret to everyone in the AI industry is the massive use of human classifiers and annotators – humorously referred to as AAI, or “Artificial AI” – in developing AI algorithms. Ever heard of Mechanical Turk? Amazon’s Mechanical Turk is the secret sauce behind much of today’s AI development: a marketplace that offers developers access to a diverse, on-demand workforce for menial micro-tasks, like describing what’s in a picture (data annotation) or clicking a button on a web page. It’s a core component of research at the greatest companies and think tanks the world over, like the Allen Institute for AI, and that’s a huge problem. Take caution when acquiring training data built on low-wage workers in developing countries for a multimillion-dollar AI algorithm. In most cases, domain expertise is needed and high attention to detail is crucial, and pennies on the dollar won’t buy either. So if you’re a startup budgeting (or scraping together) an 18-month runway and desperately trying to annotate your own dataset, don’t risk building your AI from cheap inputs, or building your own annotation tools. AI startup Clay Sciences has developed a suitable solution to help you annotate your video- and image-based datasets without breaking the bank.

Clay Sciences was founded in 2016 by Ariel Elbaz and Guy Kohen, two Israeli machine learning experts with deep experience building AI algorithms across a wide range of industries and applications. Elbaz, an alumnus of the Israeli army’s elite Unit 8200 with a PhD in computer science from Columbia University, honed his machine learning skills at Google, OkCupid Labs and Dataminr, where he worked on surfacing relevant content from Twitter, a behemoth platform home to 10,000 tweets per second. Kohen, an alumnus of an unnamed elite technology unit of the Israeli army, worked for the computer vision consultancy RSIP Vision and later at Dataminr, where he met Elbaz while working on classifying weaponry in images from Twitter. After years in the field, both Elbaz and Kohen agreed on the need to automate the data annotation behind machine learning models.
“According to Oren Etzioni – the legendary serial AI entrepreneur – 99 percent of AI is manual human labor, largely consumed by intricate, long-term data annotation tasks. We’re trying to close that gap,” explained Elbaz.

Clay Sciences accelerates the process of building machine learning models, giving data scientists a platform for obtaining training data quickly, efficiently and at scale. Their web-based annotation tools for video, images, and text can be used with crowdsourced workers, or with the client’s in-house expert annotators.

Video data annotation in particular is a core focus, given the complexity of understanding the context of a video clip rather than a single frame, as with an image. Video content recognition, and the challenge of executing it well, is in urgent demand given the quick pace of the autonomous vehicle revolution, among other verticals. In fact, just last year, one of Uber’s autonomous vehicles was involved in a fatal crash in Tempe, Arizona.
“The accident was caused by the pedestrian being wrongly classified first as an unknown object, then as a vehicle, then as a bicycle. If you consider temporal continuity, these objects can’t morph within seconds,” explained Elbaz. Training an autonomous driving algorithm on images frame by frame is inherently flawed. “Such algorithms would be better trained on video data, with its richness of data and temporal features,” added Elbaz.
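To make the temporal-continuity point concrete, here is a minimal sketch in Python of one way it can be enforced: smoothing a tracked object’s per-frame class labels with a sliding-window majority vote. This is an illustrative assumption about the general technique, not Clay Sciences’ actual code; the window size and label set are invented for the example.

```python
from collections import Counter

def smooth_track_labels(frame_labels, window=5):
    """Majority-vote smoothing of per-frame class labels for one tracked object.

    A detector that looks at each frame in isolation can flip an object's
    class from frame to frame. Enforcing temporal continuity over a sliding
    window suppresses these physically impossible "morphs".
    """
    smoothed = []
    for i in range(len(frame_labels)):
        lo = max(0, i - window // 2)
        hi = min(len(frame_labels), i + window // 2 + 1)
        votes = Counter(frame_labels[lo:hi])
        smoothed.append(votes.most_common(1)[0][0])
    return smoothed

# A per-frame classifier briefly "morphs" a pedestrian into a vehicle, then a bicycle.
raw = ["pedestrian", "pedestrian", "vehicle", "pedestrian",
       "bicycle", "pedestrian", "pedestrian", "pedestrian"]
print(smooth_track_labels(raw))  # all eight labels come back 'pedestrian'
```

Run on the toy sequence, the single-frame “vehicle” and “bicycle” flips are voted away, reflecting Elbaz’s observation that real objects can’t morph between consecutive frames.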
With the Clay Sciences platform, annotating video is 10x faster, easier and more accurate thanks to novel features like automated tracking and consistent object IDs. Because no sampling of key frames takes place, annotating on video can produce 15x to 30x more training data than annotating images. Annotators can mark objects in videos using bounding boxes, lines (e.g. for pose estimation) or polygons, or label videos by scene type.
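As a rough illustration of where that multiplier can come from, consider the sketch below: a human annotates an object only on sparse keyframes, and simple linear interpolation (a stand-in for the platform’s automated tracking, whose internals aren’t public) fills in a labeled bounding box for every frame in between.

```python
def interpolate_boxes(key_boxes):
    """Linearly interpolate bounding boxes between sparse keyframe annotations.

    key_boxes maps a frame index to an (x, y, w, h) box. A human annotates
    only the keyframes; every in-between frame gets a box for free, which
    is one way video annotation can yield far more labeled frames than
    image-by-image labeling.
    """
    frames = sorted(key_boxes)
    dense = {}
    for f0, f1 in zip(frames, frames[1:]):
        b0, b1 = key_boxes[f0], key_boxes[f1]
        for f in range(f0, f1 + 1):
            t = (f - f0) / (f1 - f0)
            dense[f] = tuple(a + t * (b - a) for a, b in zip(b0, b1))
    return dense

# Two hand-annotated keyframes expand into 31 labeled frames, a 15.5x multiplier.
keys = {0: (100, 50, 40, 80), 30: (160, 50, 40, 80)}
print(len(interpolate_boxes(keys)))  # 31
```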

Clay Sciences, after three years of operation, is still bootstrapped and generating profits from selling access to their SaaS annotation platform. Their pricing model is usage-based, tied to the actual number of hours spent annotating data. Running cutting-edge JavaScript code directly in the browser, the platform cuts annotation times (pun intended) by leveraging in-browser convolutional neural networks for object detection, in-browser computer vision algorithms, and human-in-the-loop active-learning APIs. Those APIs allow customers to connect external models to the annotation platform, speeding up the human annotators’ work while simultaneously providing the models with immediate feedback exactly where they are wrong – the most important feedback.
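The general shape of such a human-in-the-loop active-learning round is easy to sketch. The Python below is an illustrative assumption about the pattern, not Clay Sciences’ actual API: a connected model pre-labels each frame, confident predictions only need verification, and low-confidence frames are routed to a human whose corrections become targeted training data.

```python
import random

def active_learning_round(predict, frames, ask_human, threshold=0.8):
    """One human-in-the-loop round over a batch of frames.

    predict(frame) -> (label, confidence); ask_human(frame, suggestion) -> label.
    Confident predictions are accepted as pre-labels for the annotator to
    verify; uncertain frames go to a human, and the corrections double as
    new training data collected exactly where the model is weakest.
    """
    accepted, corrections = [], []
    for frame in frames:
        label, confidence = predict(frame)
        if confidence >= threshold:
            accepted.append((frame, label))        # human only spot-checks these
        else:
            corrections.append((frame, ask_human(frame, label)))
    return accepted, corrections

# Toy run: a "model" that is unsure near the day/night boundary, and a human oracle.
frames = [random.random() for _ in range(20)]      # brightness stand-ins in [0, 1]
predict = lambda f: ("day" if f > 0.5 else "night", abs(f - 0.5) * 2)
ask_human = lambda f, suggestion: "day" if f > 0.5 else "night"
accepted, corrections = active_learning_round(predict, frames, ask_human)
print(len(accepted), "auto-accepted;", len(corrections), "routed to a human")
```

The design point is the routing: human effort concentrates on exactly the frames where the model is uncertain, and those corrections are the highest-value feedback the model can get.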
Looking forward, they’re focused on building an end-to-end data platform for machine learning that will soon enable training models in mere days, starting from raw unlabelled data – 10x to 100x faster than current offerings.
Clay Sciences sees growing demand for annotating video and predicts that the largest industry incumbents, like Google, Apple, Amazon, Microsoft and Snap, will follow their more agile competitors, abandoning the legacy frame-by-frame annotation methods they’ve used for years in favor of annotating directly on video. It’s a major disruption the startup expects to shape the future of AI data.