Demystifying the "AI migrant workers" behind ChatGPT: boring and repetitive, paid by the piece, hourly wages as low as $1
**Source:** Tencent Technology
Abstract: Data annotators classify and label data so that artificial intelligence can learn by finding patterns in large amounts of it; they are regarded as the "ghost labor" hidden behind the machines. Annotation work is the foundation of artificial intelligence; it has formed an entire supply chain, and this type of work will continue to exist for a long time.
Focus
Artificial intelligence learns by finding patterns in large amounts of data, but first that data must be classified and labeled by humans, which is where data annotators come in. They are regarded as the "ghost workers" hidden behind the machines.
The work of annotators is boring and tedious. They often do repetitive work and are paid by the piece. The average hourly wage is between $5 and $10 (about 36 to 72 yuan). By the beginning of this year, the hourly wages of some annotators had fallen to $1 to $3 per hour (about 7 to 22 yuan).
Annotation work is still the basis of artificial intelligence, and it has formed a complete supply chain. These kinds of jobs are here to stay for a long time to come.
Unlike smartphone or car manufacturing, annotation work is fluid and easily reshaped, and it often flows to places with lower operating costs.
Then, in 2019, an opportunity came Joe's way: he started training recruits for a new company that desperately needed annotators, earning four times as much as before. Every two weeks, 50 new recruits would file into an office building in Nairobi to begin their apprenticeships. The need for annotators seemed endless. They would be asked to categorize clothing seen in mirror selfies, determine which room a robot vacuum cleaner was in by looking through its eyes, and draw boxes around motorcycles scanned by lidar. More than half of Joe's students usually dropped out before the training was over. "Some people don't know how to stay in one place for a long time," he explained gently. Plus, he admits, "the job is boring."
But it's a good job in a place where jobs are scarce, and Joe has turned out hundreds of graduates. After training, the apprentices can go home and work alone in their bedrooms and kitchens without telling anyone what they are doing. That isn't really a problem, because they rarely understand exactly what they're doing themselves.
Labeling objects for self-driving cars is straightforward, but classifying distorted snippets of dialogue and identifying whether the speaker is a robot or a human is full of challenges. Each object being recognized is a small part of a larger project, so it's hard to tell what exactly they are training the AI to do. The names of those projects offer no clues either: Crab Generation, Whale Segment, Woodland Gyro, and Pillbox Bratwurst are job codes with no logical order.
As for the company that hired them, most people only know it as Remotasks, a website that offers job opportunities to anyone who speaks fluent English. Like most annotators, Joe didn't know that Remotasks was a contract labor company owned by Scale AI. Scale AI is a multibillion-dollar Silicon Valley data provider whose clients include artificial intelligence startup OpenAI and the U.S. military. Neither Remotasks nor Scale AI mention each other on their websites.
01 Helping machines with uniquely human abilities
Much of the public reaction to large language models like OpenAI's ChatGPT has focused on the jobs they seem poised to automate. But even the most impressive AI system needs help from humans, who train it by labeling data and step in when it gets confused. Only companies that can afford to buy this data can compete in the industry, and those that get it go to great lengths to keep it secret. The result is that, apart from a few insiders, we know very little about the information that shapes these systems' behavior, and even less about the people doing the shaping.
For Joe's students, it's a job stripped of everything that makes work feel normal: they're expected to follow a strict schedule without needing to know what they're doing or who they're working for. In fact, they rarely call it work at all, just routine "tasks." They call themselves taskers.
Anthropologist David Graeber defined so-called "bullshit jobs" as jobs without meaning or purpose, work that should be automated but, for reasons of bureaucracy, status, or inertia, is not. The work of training artificial intelligence is a strange mirror image of this: work that people want to automate, and generally assume has already been automated, yet still requires humans to do it. These tasks have a purpose; the annotators just aren't aware of what it is.
The current AI boom began with exactly this kind of tedious, repetitive labor. As early as 2007, the AI researcher Fei-Fei Li, then a professor at Princeton University, suspected that the key to improving neural networks for image recognition was training on more data: millions of labeled images rather than tens of thousands. The problem was that it would have taken her team decades and millions of dollars to label that many photos.
Fei-Fei Li found thousands of workers on Amazon's crowdsourcing platform, Mechanical Turk, where people around the world complete small tasks for cheap. The resulting labeled dataset, known as ImageNet, enabled a major breakthrough in machine learning, reinvigorating the field and ushering in the progress of the last decade.
Annotation remains an essential part of developing AI, but engineers often treat it as a fleeting, cumbersome prerequisite to the more glamorous work of modeling. You gather as much labeled data as cheaply as possible to train your model, and if it works, at least in theory, you no longer need the annotators. But annotation work is never really done. Researchers argue that machine learning systems are "brittle," prone to failing when they encounter something that isn't well represented in their training data. These failures, known as "edge cases," can have serious consequences.
In 2018, a self-driving test car from the ride-hailing company Uber killed a woman because, although it was programmed to avoid cyclists and pedestrians, it didn't know what to make of a person walking a bicycle across the street. As more AI systems are put to work giving legal advice and medical assistance, they encounter more edge cases, requiring more humans to sort them out. This has spawned a global industry of people like Joe who use their uniquely human abilities to help the machines.
Over the past six months, tech investigative reporter Josh Dzieza has spoken with more than two dozen annotators from around the world. Many of them are training cutting-edge chatbots, but many more are doing the mundane manual labor required to keep AI running. Some have catalogued the emotional content of TikTok videos, new variants of email spam, and inappropriate online advertising. Others are looking at credit card transactions and figuring out what kind of purchase they relate to, or checking e-commerce recommendations and deciding whether you're actually going to like that shirt after buying this other one.
Humans are correcting mistakes of customer service chatbots, listening to requests from Amazon’s intelligent assistant Alexa, and categorizing people’s emotions on video calls. They label food so smart refrigerators aren't confused by new packaging, check automated security cameras before sounding the alarm, and help confused autonomous tractors identify corn.
02 Annotation is big business, minting the "youngest self-made billionaire"
"It's a complete supply chain," said Sonam Jindal, program and research director at the nonprofit Partnership on AI. "The general perception in the industry is that this work is not a critical part of the technology development, it's not a critical part of the technology's development." Will prosper for a long time. All the excitement spreads around building AI, and once we build it, annotations are no longer needed, so why bother thinking about it? But data labeling is the foundation of AI, just like humans As much as intelligence is the foundation of AI, we need to see these as real jobs in the AI economy that are here to stay for a long time to come."
The data vendors behind familiar names like OpenAI, Google, and Microsoft come in different guises. There are private outsourcing companies with call-center-like offices, such as CloudFactory in Kenya and Nepal, where Joe annotated for $1.20 an hour before switching to Remotasks.
There are also "crowdworker" sites like Mechanical Turk and Clickworker, where anyone can sign up to complete tasks. In the middle are services like Scale AI. Anyone can sign up, but everyone must pass a qualifying exam, a training course, and be monitored for performance. Annotations are big business. Scale AI, founded in 2016 by then-19-year-old Alexander Wang, was valued at $7.3 billion in 2021, making him one of Forbes' youngest self-made billionaires.
There is no way to give a precise estimate of the number of people who work in annotation, but it is certainly large and growing fast. A recent Google Research paper gave an order-of-magnitude figure of "millions" of annotators, with the potential to reach "billions" in the future.
Automation often arrives in unexpected ways. Erik Duhaime, CEO of the medical data annotation company Centaur Labs, recalls that a few years ago, several prominent machine learning engineers predicted that artificial intelligence would replace radiologists. When that didn't happen, conventional wisdom shifted to radiologists using AI as a tool.
Neither of those things happened, according to Duhaime. Artificial intelligence is very good at specific tasks, which leads work to be broken down and assigned to specialized algorithmic systems and equally specialized humans. An AI system might be able to spot cancer, he said, but only in certain kinds of machines, on certain kinds of images. So you need someone to check that the AI is being fed the right type of data, and perhaps someone else to check its work before handing it off to another AI that writes a report, which finally goes to a human. "AI won't replace human jobs, but it does change the way jobs are organized," Duhaime said.
If you think of artificial intelligence as a smart, thinking machine, you may be overlooking the humans behind it. Duhaime believes the impact of artificial intelligence on modern work resembles the transition from craftsmen to industrial manufacturing: coherent processes are broken down into small tasks and arranged along an assembly line, some steps completed by machines and some by humans, but the result bears little resemblance to what came before.
Concerns about AI-driven disruption are often met with the reply that AI automates tasks, not entire jobs, and that these tasks are the tedious ones, leaving people free to pursue more fulfilling, more human work. But it is just as possible that the rise of artificial intelligence will look like past labor-saving technologies, such as the telephone or the typewriter, which eliminated the drudgery of message delivery and handwriting but generated so much new communication, commerce, and paperwork that new offices staffed with new types of workers (clerks, accountants, typists) were needed to manage it all. You may not lose your job when AI joins it, but the job may become stranger, more isolating, and more tedious.
03 Simplifying complex reality into something machine-readable
Earlier this year, journalist Dzieza signed up for a job with Remotasks. The process was simple: enter your computer specs, network speed, and basic contact information, and you're in the "Training Center." To get paid assignments, Dzieza first had to complete the relevant, but unpaid, introductory courses. The training center displayed a series of courses with inscrutable names like Glue Swimsuits and Poster Hawaii. He clicked on something called GFD Chunking, which asked him to label clothing in social media photos.
Beyond that, the instructions were to label items that are real, can be worn by humans, or are intended to be worn by real people. Confident in his ability to distinguish real clothes that real people could wear from fake clothes that real people couldn't, Dzieza set out on the test. He was immediately tripped up: the computer showed a magazine photo of a woman in a skirt. Should a photograph of clothing count as real clothing? No, Dzieza thought, because a person can't wear a photograph of clothing. Wrong! In the eyes of the artificial intelligence, a photo of real clothes counts as real clothes.
The next image was of a woman taking a selfie in a full-length mirror in a dimly lit bedroom. The shirt and shorts she was wearing were real clothes, but what about their reflection? Dzieza again answered no, but the AI system holds that reflections of real clothes are also real clothes.
Milagros Miceli, a researcher who studies data work at the Weizenbaum Institute in Germany, said this kind of confusion is widespread across the industry. In part, it is a product of the way machine learning systems learn. Where a human needs only a few examples to understand the concept of a "shirt," machine learning programs need thousands, and they need them labeled with perfect consistency yet enough variety (polo shirts, shirts worn outdoors, shirts hanging on a rack) that the system can handle the diversity of the real world. "Imagine we need to reduce complex reality to something that clumsy machines can read," Miceli said.
Simplifying reality for a machine creates enormous complexity for the humans doing the labeling. Instruction writers must come up with rules that let humans classify the world with perfect consistency, and to do that they often create categories no human would naturally use. Asked to label all the shirts in a photo, a person might skip the shirt in the mirror, knowing it is a reflection and not an actual garment. But to an AI that doesn't understand the real world, it's all just pixels, and the two are exactly the same. If some shirts in the dataset are labeled and other, reflected shirts are not, the model won't work. So the engineers go back to the vendor with an update: do label the shirts reflected in mirrors. Soon you have another 43-page guide, all in red capital letters.
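As a purely illustrative sketch (not any vendor's actual specification), a rule like "reflections of shirts count as shirts" might eventually be encoded in a labeling tool roughly like this; the field and category names are invented for the example:

```python
# Hypothetical sketch of one rule from a clothing-labeling guide.
# Field and category names are invented for illustration only.
from dataclasses import dataclass

@dataclass
class Region:
    item: str            # what the annotator sees, e.g. "shirt"
    is_reflection: bool  # seen in a mirror
    is_depiction: bool   # a printed or on-screen photo of clothing

CLOTHING = {"shirt", "shorts", "skirt", "jacket"}

def label_as_clothing(region: Region) -> bool:
    """To the model, pixels are pixels: reflections and photos of real
    clothes must be labeled exactly like the clothes themselves."""
    return region.item in CLOTHING

# Human instinct says skip both of these; the guide says label them anyway.
print(label_as_clothing(Region("shirt", is_reflection=True, is_depiction=False)))  # True
print(label_as_clothing(Region("skirt", is_reflection=False, is_depiction=True)))  # True
```

The point of such a rule is exactly that the distinctions a person would naturally draw are deliberately ignored, because the model cannot draw them.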
The annotator's job is usually to set aside human understanding and follow the instructions very, very strictly; as one annotator put it, to think like a robot. It's a strange mental space, doing your best to follow absurd but rigid rules, like taking a standardized test while on hallucinogens. Annotators constantly run into confounding questions: Is this a red shirt with white stripes or a white shirt with red stripes? If a wicker bowl is filled with apples, is it a "decorative bowl"? What color is leopard print? Every question must be answered, and one wrong guess could get you banned and thrown into a brand-new, entirely different task with its own baffling rules.
04 Paid by the piece, checking for tasks every three hours
Most jobs on Remotasks are paid by the piece, with earnings ranging from a few cents to a few dollars per task. Because a task may take seconds or hours to complete, pay is hard to predict. When Remotasks first arrived in Kenya, annotators said it paid relatively well, averaging about $5 to $10 an hour depending on the task. But over time, the pay has fallen.
The most common complaint about Remotasks work is its variability. It can be steady enough to be a full-time job for long stretches, but too unpredictable to rely on entirely. Annotators spend hours reading instructions and completing unpaid training only to do a dozen tasks before the project ends. There might be nothing new for days, and then, out of the blue, a completely different task appears, one that might last anywhere from a few hours to several weeks. Any task could be their last, and they never know when the next one will come.
Engineers and data vendors say this boom-and-bust cycle stems from the pace of AI development. Training a large model requires lots of annotations, followed by more iterative updates, and engineers want all of this to happen as quickly as possible so they can meet their target release date. They may need thousands of annotators over the course of a few months, then drop to a few hundred, and finally just a dozen or so experts of a particular type. This process is sometimes repeated in cycles. “The question is, who bears the cost of these fluctuations?” said Partnership on AI’s Jindal.
To succeed, annotators have to work together. Victor started working for Remotasks as a college student in Nairobi, and when he mentioned he was struggling with a traffic-control task, he said, everyone knew to stay away from it: too tricky, badly paid, not worth it. Like many annotators, Victor uses unofficial WhatsApp groups to spread the word when good assignments appear. When he figures out a new one, he starts an impromptu Google Meet to show others how it's done. Anyone can join and work together for a while, sharing tips. "We've developed a culture of helping each other because we know that one person can't know all the tricks," he said.
Annotators always have to stay alert, because jobs appear and disappear without warning. Victor found that projects often popped up in the middle of the night, so he got into the habit of waking every three hours or so to check the queue. When a task was there, he would stay awake as long as he could. At one point he went 36 hours without sleep, marking elbows, knees, and heads in photos of crowds, though he had no idea why. Another time he stayed up so long that his eyes were red and swollen.
Annotators usually know only that they are training AI systems for companies located vaguely elsewhere, but sometimes the veil of anonymity slips when a brand or chatbot mentioned in the instructions gives too much away. One annotator said: "I read the instructions, googled it, and found out I was working for a 25-year-old billionaire. If I'm making someone a billionaire while I earn a few dollars, then I'm literally wasting my life."
A self-described fervent believer in artificial intelligence, Victor took up annotation work because he wanted to help bring about a fully automated future. But earlier this year, someone posted in his WhatsApp group a Time magazine story about how workers at the vendor Sama AI were paid less than $2 an hour to train ChatGPT to recognize toxic content. "People were outraged that these companies are so profitable and pay so little," said Victor, who didn't know about the relationship between Remotasks and Scale AI until he was told about it. The instructions for one of the tasks he worked on were nearly identical to those used by OpenAI, which means he was probably also training ChatGPT, for about $3 an hour.
"I remember someone posted that we will be remembered in the future," he said. "And somebody else replied, 'We are being treated worse than foot soldiers. We will be remembered nowhere in the future.' I remember that very well. No one will recognize the work we did or the effort we put in."
Identifying clothing and labeling customer service conversations are just some of the annotation jobs out there. The hottest thing on the market right now is chatbot training. Because it requires domain-specific expertise or language fluency, and because pay tends to be adjusted by region, the work tends to pay better. Certain types of specialist annotation can bring in $50 or more per hour.
When a woman named Anna was looking for a job in Texas, she came across a generic online listing and applied. After passing an introductory exam, she was ushered into a Slack room of 1,500 people training a project code-named Dolphin, which she later discovered was Google DeepMind's chatbot Sparrow, one of the many chatbots competing with ChatGPT. Anna's job is to chat with Sparrow all day for about $14 an hour, plus bonuses for high productivity. "It definitely beats making $10 an hour at the local supermarket," she said.
05 Three criteria for AI responses: accuracy, helpfulness, and harmlessness
And Anna loves the job. She has discussed science fiction, mathematical paradoxes, children's riddles, and TV shows with Sparrow. Sometimes the chatbot's answers make her laugh out loud; sometimes she runs out of things to say. "Some days I really don't know what to ask, so I have a little notebook with two pages already written in it," Anna said. "I google interesting topics so I can get through the seven hours, but it doesn't always work."
Every time Anna prompts Sparrow, it returns two responses and she picks the better one, creating what is called "human feedback data." When ChatGPT debuted late last year, its impressively natural conversational style was credited to its having been trained on vast amounts of Internet data. But the language that powers ChatGPT and its competitors is also filtered through several rounds of human annotation.
A team of contractors writes examples of how the engineers want the chatbot to behave: asking questions and then giving correct answers, describing computer programs and then supplying working code, asking for tips on committing crimes and then politely declining. After the model is trained on these examples, more contractors are brought in to prompt it and rank its responses. That is what Anna is doing with Sparrow.
Exactly which criteria the raters are told to use varies: honesty, helpfulness, or just personal preference. The point is that they are creating data about human taste, and once there is enough of it, engineers can train a second model to mimic their preferences at scale, automating the ranking process and training their AI to act in ways humans approve of. The result is a remarkably human-seeming bot that mostly declines harmful requests and explains its AI nature with what appears to be self-awareness.
In other words, ChatGPT looks human because it was trained by a human-mimicking AI that is acting like a human.
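This pipeline is what researchers call reinforcement learning from human feedback (RLHF): the raters' pairwise choices are used to train a reward model, which then scores new responses so the chatbot can be tuned toward them. Below is a minimal, hypothetical sketch of that reward-model stage and its pairwise preference loss; the toy encoder, names, and data are placeholders standing in for whatever the real systems use, not any lab's actual code.

```python
# Minimal sketch of training a reward model on annotators' pairwise choices.
# Hypothetical: the toy encoder and random data are placeholders only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Maps a (prompt + response) token sequence to a single scalar reward."""
    def __init__(self, vocab_size=50_000, dim=256):
        super().__init__()
        self.encoder = nn.EmbeddingBag(vocab_size, dim)  # stand-in for a transformer
        self.score = nn.Linear(dim, 1)

    def forward(self, token_ids):
        return self.score(self.encoder(token_ids)).squeeze(-1)

def preference_loss(reward_chosen, reward_rejected):
    # Bradley-Terry objective: push the reward of the response the rater
    # preferred above the reward of the one they rejected.
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

model = RewardModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One batch of annotator judgments (random token ids as placeholders).
chosen = torch.randint(0, 50_000, (8, 128))    # responses the rater picked
rejected = torch.randint(0, 50_000, (8, 128))  # responses the rater passed over

loss = preference_loss(model(chosen), model(rejected))
loss.backward()
optimizer.step()
```

A later stage fine-tunes the chatbot itself against this reward model, which is why sloppy or inconsistent rankings from annotators feed straight into how the finished bot behaves.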
This kind of training may lead the model to extract patterns from the parts of its language map labeled as accurate and to produce text that happens to align with the truth, but it may also lead it to mimic the confident style and expert jargon of accurate text while writing things that are completely wrong. There is no guarantee that text annotators marked as accurate actually is accurate, and even when it is, there is no guarantee the model learns the right patterns from it.
This dynamic makes chatbot annotation anything but easy. It has to be rigorous and consistent, because sloppy feedback, such as marking material that merely sounds correct as accurate, risks training the model to be even more convincing rather than more correct. In an early joint project, OpenAI and DeepMind used RLHF to train a virtual robotic hand to grasp an object; it turned out they had also trained the hand to position itself between the object and its raters and wobble around so that it merely appeared to its human overseers to be grasping it.
Ranking a language model's responses is always going to be somewhat subjective, because it's language: a text of any length can contain multiple elements that may be true, false, or misleading. OpenAI researchers ran into this hurdle in another early RLHF paper. Trying to get their model to summarize text, the researchers found that only 60 percent of the model's summaries were judged good. "Unlike many tasks in machine learning, our queries do not have unambiguous ground truth," they lamented.
When Anna rates Sparrow's responses, she is supposed to judge their accuracy, helpfulness, and harmlessness, while also checking that the model isn't giving medical or financial advice, anthropomorphizing itself, or violating other criteria. To be useful as training data, the model's responses have to be put in a quantitative order: Is a bot that helpfully tells you how to make a bomb "better" than a bot so harmless it refuses to answer any question?
In one DeepMind paper, when Sparrow's makers took a turn annotating, four researchers ended up debating whether their bot had assumed the gender of a user who asked it for emotional advice. According to Geoffrey Irving, a research scientist at DeepMind, the company's researchers hold weekly annotation meetings in which they review data themselves and discuss ambiguous cases. When a case is particularly tricky, they consult ethics or subject-matter experts.
Anna often finds herself having to choose between two poor options. "Even if they're both horribly wrong answers, you still have to figure out which one is better and write down the text explaining why," she says. Sometimes, when neither answer is good, she is encouraged to write a better answer herself, which she does about half the time during training.
06 Annotation increasingly requires specific skills and expertise
Because feedback data is difficult to collect, it sells for a higher price. The kind of basic preference data Anna is generating sells for about $1 per item, according to people with knowledge of the industry. But if you want to train a model to do legal research, you need someone with legal training, and that drives up costs. No one involved will say exactly how much they pay, but generally speaking, a specialist written example can cost a few hundred dollars, while an expert rating can cost $50 or more. One engineer said he once paid $300 for a sample of a Socratic dialogue.
OpenAI, Microsoft, Meta, and Anthropic did not disclose how many people contribute annotations to their models, how much they are paid, or where in the world they are located. Annotators working on Sparrow are paid at least the hourly minimum wage for wherever they are located, said Irving of DeepMind, Google's sister company. Anna knows "nothing" about Remotasks, but more about Sparrow: she knows it is DeepMind's artificial intelligence assistant and that its creators trained it using RLHF.
Until recently, it was relatively easy to spot bad output from a language model: it looked like gibberish. But as models get better, this becomes harder, a problem known as "scalable oversight." Google inadvertently demonstrated how difficult it is to spot the mistakes of a modern language model with the debut of its AI assistant, Bard. This trajectory means that annotation increasingly requires specific skills and expertise.
Last year, a man named Lewis was working on Mechanical Turk when, after completing a task, he received a message inviting him to join a platform he had never heard of. It was called Taskup.ai, and the site was strikingly simple: just a navy background with the words "Pay as you go." Lewis signed up.
The work paid far better than anything he had done before, usually around $30 an hour. It was also more challenging: devising complex scenarios to trick chatbots into giving dangerous advice, testing a model's ability to stay in character, and having detailed conversations about scientific topics so technical they required extensive research. Lewis found the work "satisfying and stimulating." While checking one model's attempts at coding in Python, Lewis was learning too. He couldn't work more than four hours at a stretch, lest he get mentally exhausted and make mistakes, and he wanted to keep the job.
"If there was one thing I could change, I would just like to know more about what happens on the other end," Lewis said. "We only know as much as we need to get the work done, but if I knew more, maybe I could make more of it, and maybe consider it as a career."
Tech investigative reporter Dzieza interviewed eight other people, most of them in the U.S., who had similar experiences of answering surveys or completing tasks on other platforms and then finding themselves hired by Taskup.ai or several similar sites, such as DataAnnotation.tech or Gethybrid.io. Their work often involves training chatbots, though the chatbots are of higher quality and the purposes more specialized than at other sites they have worked for. One was asked to demonstrate spreadsheet macros; another just has to hold conversations and rate responses according to whatever criteria she likes. She often asks the chatbot the kinds of questions that also come up when chatting with her 7-year-old daughter, such as "What's the biggest dinosaur?" and "Write a story about a tiger."
Taskup.ai, DataAnnotation.tech, and Gethybrid.io all appear to belong to the same company: Surge AI. Its chief executive, Edwin Chen, would neither confirm nor deny the connection, but he was willing to talk about his company and how he sees annotation evolving.
"I've always felt that the labeling field is treated too simplistically," said Edwin, who founded Surge AI in 2020 after working on AI research at Google, Facebook, and Twitter, convinced that crowdsourced labels were not enough. "We want AI to tell jokes, write good marketing copy, or help me when I need therapy. But not everyone can tell a joke or solve a Python programming problem. We want to transform this low-quality, low-skill mindset into something much richer that captures the human skills, creativity, and values we want AI systems to have."
07 Machine learning systems are too weird to ever be fully trusted
Last year, Surge AI relabeled a Google dataset that classified Reddit posts by emotion. Google had stripped each post of context and sent them to annotators in India for labeling. Surge AI employees familiar with American Internet culture found that 30 percent of the labels were wrong. Posts like "Hell, my brother" had been classified as "hate," while "Cold McDonald's, my favorite" had been classified as "love."
Edwin said Surge AI vets annotators' qualifications, checking, for instance, whether people doing creative writing tasks have experience in creative writing, but exactly how it finds workers is a "secret." As with Remotasks, workers usually have to complete a training course, though unlike at Remotasks, they are paid for taking on tasks during training. Having fewer, better-trained workers producing higher-quality data allows Surge AI to pay better than its peers, but he declined to elaborate, saying only that employees are paid at a "fair and ethical level." Such annotators make between $15 and $30 an hour, but they represent a tiny fraction of all annotators, a group that now numbers 100,000. The secrecy, he explained, stems from clients' requests.
These new models are so impressive that they have inspired a new wave of predictions that annotation is about to be automated. The financial pressure to do so is high given the costs involved. Anthropic, Meta, and others have recently made strides in using AI to reduce the amount of human annotation needed to guide models, and other developers have begun using GPT-4 to generate training data.
However, a recent paper found that models trained on GPT-4 output may be learning to mimic GPT's authoritative style with less accuracy. So far, whenever improvements in AI have made one form of labeling obsolete, demand for other, more sophisticated kinds has risen. The debate went public earlier this year, when Scale AI's CEO tweeted his prediction that AI labs will soon spend as many billions of dollars on human data as they do on computing power. OpenAI CEO Sam Altman responded that as artificial intelligence advances, the need for data will decrease.
Edwin doubts AI will ever reach the point where human feedback is no longer needed, but he does see labeling becoming harder as models improve. Like many researchers, he believes the way forward will involve AI systems helping humans oversee other AIs. Surge AI recently partnered with Anthropic on a proof of concept in which human annotators answered questions about a lengthy text with the help of an unreliable AI assistant, the theory being that the humans would have to sense their AI assistant's weaknesses and reason cooperatively with it to find the correct answer.
Another possibility is two AIs debating each other, with a human rendering the final verdict. OpenAI research scientist John Schulman said in a recent talk at Berkeley: "We haven't really seen the practical potential of this stuff yet, but it's starting to become necessary because it's getting hard for annotators to keep up with the models' advancement."
Edwin said: "I think you will always need a human to monitor what the AIs are doing, simply because they are such alien things. Machine learning systems are too weird to ever fully trust. Even today's most impressive models have weaknesses that seem very strange to humans. Although GPT-4 can generate complex and convincing text, it can't tell you which words are adjectives."
08 ChatGPT helps a lot with getting tasks done
As 2022 drew to a close, Joe started hearing from his students that their task queues were often empty. Then he got an email informing him that the training camp in Kenya was closing. He continued training taskers online, but he began to worry about the future.
"All the signs suggested this wouldn't last," Joe said. Annotation work was leaving Kenya. From colleagues he had met online, he heard that such tasks were being sent to Nepal, India, and the Philippines. "Companies move from one region to another," Joe said. "They don't have infrastructure locally, so they have the flexibility to move to wherever the operating costs suit them better."
One way the AI industry differs from phone and car manufacturers is its fluidity. The work is constantly changing, constantly being automated and replaced by new demands for new types of data. It's a pipeline, but one that can be constantly and rapidly reconfigured, moving wherever the right combination of skills, bandwidth, and wages can be found.
Recently, the best-paying annotation work has returned to the United States. In May, Scale AI began listing annotation jobs on its website, seeking people with experience in nearly every field AI is expected to conquer. The listings sought AI trainers with expertise in fitness coaching, human resources, finance, economics, data science, programming, computer science, chemistry, biology, accounting, taxes, nutrition, physics, travel, K-12 education, sports journalism, and self-help.
You can teach robots law for $45 an hour; teach them poetry for $25 an hour. There are also listings for people with security clearances, presumably to help train military AI. Scale AI recently unveiled a defense-oriented language model called Donovan, which company executives called "ammunition in the AI war," and it won a contract to work on the Army's robotic combat vehicle program.
Anna is still training chatbots in Texas. Some colleagues have been turned into reviewers and Slack moderators; she isn't sure why, but it has given her hope that the gig could become a long-term career. One thing she isn't worried about is being automated out of a job. "I mean, chatbots can do a lot of amazing things, but they can also do some really weird things," she said.
When Remotasks first arrived in Kenya, Joe thought annotation could be a good career. Even after the work moved elsewhere, he was determined to stick with it. There were thousands of people in Nairobi who knew how to do the work, he reasoned; after all, he had trained many of them himself. Joe rented an office in the city and began looking for outsourcing contracts: a job annotating blueprints for a construction company, another labeling fruit damaged by insects for an agricultural project, plus routine labeling work for self-driving-car and e-commerce companies.
But Joe found that his vision was difficult to achieve. He now has just one full-time employee, down from two before. “We haven’t had a steady flow of work,” he said. There was nothing to do for weeks because clients were still collecting data. When the client finished collecting the data, he had to bring in short-term contractors to meet their deadlines: "The client didn't care if we had ongoing work. As long as the dataset labeling was done, it would be fine."
Rather than let their skills go to waste, other taskers have decided to chase the work wherever it goes. They rent proxy servers to disguise their locations and buy fake IDs to pass security checks so they can pretend to be working from Singapore, the Netherlands, Mississippi, or wherever the tasks are flowing. It's a risky business. Scale AI has become increasingly aggressive about suspending accounts caught hiding their location, according to multiple taskers.
"We've gotten a little smarter these days because we noticed that in other countries they're paying good wages," said Victor. He earns twice as much tasking "in" Malaysia as in Kenya, "but you have to be careful."
Another Kenyan annotator said that after his account was suspended for mysterious reasons, he decided to stop playing by the rules. Now he runs multiple accounts in multiple countries, tasking wherever the pay is best. He works fast and gets high marks for quality, he said, thanks to ChatGPT. The bot is wonderful, he said, letting him knock out $10 tasks in a matter of minutes.