Some of the most critical work in advancing China’s technology goals takes place in a former cement factory in the middle of the country’s heartland, far from the aspiring Silicon Valleys of Beijing and Shenzhen. An idled concrete mixer still stands in the middle of the courtyard. Boxes of melamine dinnerware are stacked in a warehouse next door.
Inside, Hou Xiameng runs a company that helps artificial intelligence make sense of the world. Two dozen young people go through photos and videos, labeling just about everything they see. That is a car. That is a traffic light. That is bread, that is milk, that is chocolate. That is what it looks like when a person walks.
“I used to think the machines are geniuses,” Hou, 24, said. “Now I know we’re the reason for their genius.”
In China, long the world’s factory floor, a new generation of low-wage workers is assembling the foundations of the future. Startups in smaller, cheaper cities have sprung up to apply labels to China’s huge trove of images and surveillance footage. If China is the Saudi Arabia of data, as one expert says, these businesses are the refineries, turning raw data into the fuel that can power China’s AI ambitions.
Conventional wisdom says that China and the United States are competing for AI supremacy and that China has certain advantages. The Chinese government broadly supports AI companies, financially and politically. Chinese startups made up one-third of the global computer vision market in 2017, surpassing the United States. Chinese academic papers are cited more often in research papers. In a key policy announcement last year, China’s government said that it expected the country to become the world leader in artificial intelligence by 2030.
Most importantly, this thinking goes, the Chinese government and companies enjoy access to mountains of data, thanks to weak privacy laws and enforcement. Beyond what Facebook, Google and Amazon have amassed, Chinese internet companies can get more because people there so heavily use their mobile phones to shop, pay for meals and buy movie tickets.
Still, many of those claims are iffy. Chinese papers and patents can be suspect. Government money may go to waste. It is not clear that the AI race is a zero-sum game, in which the winner gets the spoils. Data is useless unless somebody can parse and catalog it.
But the ability to tag that data may be China’s true AI strength, the only one that the United States may not be able to match. In China, this new industry offers a glimpse of a future that the government has long promised: an economy built on technology rather than manufacturing.
“We’re the construction workers in the digital world. Our job is to lay one brick after another,” said Yi Yake, co-founder of a data labeling factory in Jiaxian, a city in central Henan province. “But we play an important role in AI. Without us, they can’t build the skyscrapers.”
While AI engines are super-fast learners and good at tackling complex calculations, they lack cognitive abilities that even the average 5-year-old possesses. Small children know that a furry brown cocker spaniel and a black Great Dane are both dogs. They can tell a Ford pickup from a Volkswagen Beetle, and yet they know both are cars.
AI has to be taught. It must digest vast amounts of tagged photos and videos before it realizes that a black cat and a white cat are both cats. This is where the data factories and their workers come in.
Taggers helped AInnovation, a Beijing-based AI company, fix its automated cashier system for a Chinese bakery chain. Users could put their pastry under a scanner and pay for it without help from a human. But nearly one-third of the time, the system had trouble telling muffins from doughnuts or pork buns thanks to store lighting and human movement, which made images more complex. Working with photos from the store’s interior, the taggers got the accuracy up to 99 percent, said Liang Rui, an AInnovation project manager.
“All the artificial intelligence is built on human labor,” Liang said.
AInnovation has fewer than 30 taggers, but a surge in labeling startups has made it easy to farm out the work. Once, Liang needed to get about 20,000 photos in a supermarket labeled in three days. Colleagues got it done with the help of data factories for only a couple thousand dollars.
“We’re the assembly lines 10 years ago,” said Yi, the co-founder of the data factory in Henan.
The data factories are popping up far from the biggest cities, often in relatively remote areas where both labor and office space are cheap. Many of the data factory workers are the kinds of people who once worked on assembly lines and construction sites in those big cities. But work is drying up, wage growth has slowed and many Chinese people prefer to live closer to home.
Yi, 36, was out of a job and trying to get other ventures going with elementary school classmates when someone mentioned AI tagging. After online searches, he decided it was not super technical but needed cheap labor, something Henan has in abundance.
In March, Yi and his friends set up Ruijin Technology, which rents offices the size of two professional basketball courts in an industrial park for $21,000 a year. It was previously the park’s Communist Party committee’s event space, so the ceiling lights are covered with red hammers and sickles.
Ruijin, which means smart gold, now employs 300 workers but plans to expand to 1,000 after the Chinese New Year holiday, when many migrant workers come home.
Unlike workers and businesses around the world, Yi is not worried that AI will take his job.
“The machines aren’t smart enough to teach themselves yet,” he said.
Hiring is a bigger worry.
Ruijin’s pay of $400 to $500 a month is higher than average in Jiaxian. Some potential job candidates worry that they do not know anything about AI. Others find the work boring.
Jin Weixiang, 19, said he would quit Ruijin after the Chinese New Year and go to sell furniture in a physical store in the southern city of Guangzhou.
“I’m a people’s person,” said Jin. “I’m doing labeling for the money.”
But for some former migrant workers, the job is better than working on assembly lines.
“It was the same work, same movement, day after day,” said Yi Zhenzhen, a 28-year-old Ruijin employee who once worked at an electronic component company. “Now I have to use my brain a little bit.”
Most of the time, customers do not tell these data factories what the task is for. Some are obvious. Labeling traffic lights, road signs and pedestrians is usually for autonomous driving. Labeling many types of camellia flowers could be for search engines.
Once Ruijin was given the task of labeling the images of millions of human mouths. Yi said he wasn’t sure what it was for. Maybe facial recognition?
Roughly 300 miles to the north, in the Hebei city of Nangongshi, Hou Xiameng runs her data factory out of her in-laws’ former cement factory. Her first job out of college was labeling faces for Megvii, the Chinese facial recognition company with a $2 billion valuation that is most famous for its technology platform called Face++. To this day, some facial recognition systems recognize her before they do her friends because, she says, “my face is in the original database.”
But life in Beijing was too tough and expensive. She and her then-fiancé, Zhao Yacheng, decided to move back to their hometown and start a data factory. Hou’s parents would pay for computers and desks. They are renovating the warehouse next door to hire 80 more people.
Like Yi, Hou does not spend time thinking about the implications of her work. Are they contributing to a surveillance state and a dystopian future in which machines will control humans?
“Cameras make me feel safe,” she said. “We’re in control of the machines for now.”