South African startup Enlabeler is reporting serious growth over the last couple of years, operating in the valuable “data labelling” niche.
Founded in 2019, Enlabeler is a data labelling service provider that offers end-to-end solutions for the classification, cleaning and labelling of datasets.
“Our platform turns raw, unlabelled data into high-quality training data. Our team of domain experts works with different datatypes, for a diverse range of industries. With this, Enlabeler creates flexible tech jobs and fights local unemployment across Africa,” Esther Hoogstad, founder and chief executive officer (CEO) of Enlabeler, told Disrupt Africa.
The company’s services include image and video annotation for computer vision models, transcription of audio files into text, translation of video and audio content into other local languages, and text classification and entity recognition to train sentiment analysis models.
“Machine learning models and algorithms require big datasets to train the models. Often data scientists or engineers don’t have the time and capacity to spend hours and hours creating, cleaning and labelling datasets for their models. So, they ask Enlabeler to help out. Companies are looking for end-to-end solutions in the data space and need quick, reliable and accurate data for their internal artificial intelligence (AI) and machine learning (ML) models,” Hoogstad said.
“Global competitors in the data labelling and annotation space are Sama, Labelbox, Labelfuse, Scale AI, and a few others. However, none of these are based in Africa, and none share the same mission to create and build datasets in Africa for domestic and international clients. Ultimately, it’s about empowering a whole new generation of professionals in the data industry that will gain experience in the growing AI and ML space. Because of Enlabeler’s price point, customised service offering and quick turnaround times, we are able to compete with some of the more automated, large players based in the US.”
Enlabeler raised funding in the middle of last year from new VC fund Entrepreneurs for Entrepreneurs (E4E) Africa to kickstart its operations and target international markets, and has seen strong uptake since.
“We now have a growing team of nine people and a database of over 350 data labellers, annotators and language specialists who often come from marginalised communities,” said Hoogstad.
“During 2020, we created approximately 45 labelling jobs. In 2021, this has already been surpassed by more than 25 per cent. The current client mix is approximately 70 per cent South Africa-based, and 30 per cent international.”
On the business development side, the team is growing its international footprint and is currently in talks with several large partners to help Enlabeler scale.
“As Enlabeler works 100 per cent remotely and offers a fully integrated and secure data pipeline with the main cloud providers, such as Amazon Web Services (AWS), clients from anywhere in the world can be serviced by the Enlabeler team,” Hoogstad said.
“The current client portfolio consists of AI and ML companies, but also big infrastructure players based out of the Netherlands, the US and Canada. The team is actively working on onboarding new clients from mainland Europe and other regions.”
For the core labelling work, Enlabeler charges clients per dataset or per unit of labelling, with the client only paying for output that meets the agreed quality standard. It also offers additional services, such as data pipeline integration, building customised APIs, and dataset creation and cleaning, for which it quotes clients on a case-by-case basis.
“In the case of a longer-term need, where the client is looking for a continuous stream of labelled data to train and retrain their ML model, Enlabeler works on a retainer basis,” said Hoogstad.
Unlike many startups, Enlabeler was not really affected by the COVID-19 pandemic and associated lockdowns, as it has always been 100 per cent remote.
“AI developments keep accelerating, and for all of these types of ML and AI models, big, structured and clean datasets are a must – so we foresee a growing demand for our services,” Hoogstad said.