Chinese schools are testing AI that grades papers almost as well as teachers

Some schools in China have incorporated paper-grading artificial intelligence into their classrooms, according to the South China Morning Post. One in every four schools, or about 60,000 institutions, is quietly testing a machine learning-powered system that can score students' work automatically, and even offer suggestions where appropriate.

The AI can be accessed through various online portals, and the report describes it as similar to the system used by the Educational Testing Service in the U.S. It uses an evolving "knowledge base" to interpret the "general logic" and "meaning" of pupils' essays and to highlight stylistic, structural, and thematic areas that need improvement. It can read both English and Chinese, and it's reportedly perceptive enough to notice when paragraphs veer too far off topic.

It is also self-improving. The 10-year-old grading software leverages deep learning algorithms to "compare notes" with human teachers' scores, suggestions, and comments. An engineer involved in the project compared its capabilities to those of AlphaGo, the record-breaking AI Go player developed by Google subsidiary DeepMind.

In a test involving 120 million people, the South China Morning Post claims, the AI agreed with human graders about 92 percent of the time. But it isn't perfect. A user on Chinese question-and-answer website Zhihu posted a screenshot showing the AI's evaluation of a 2015 article from the Washington Post, "Why is Obama sticking it to stay-at-home moms?" The op-ed scored a 71.5 out of 100, with major point deductions for "flow," "focus," and length.

Still, researchers on the AI project hope that it, and software like it, will one day help minimize grading inconsistencies, cut down on the amount of time teachers spend scoring essays, and improve the writing skills of students in remote areas of the country.

In the U.S., similar projects have achieved success on a smaller scale. For example, online grading application Gradescope, which was developed at the University of California, Berkeley, claims to reduce grading times by as much as 90 percent.

But the Chinese project's origins and opacity are raising concerns among some educators.

The government is said to be heavily involved. Project lead Zhou Jianshe, a professor in China's language intelligence research center at Capital Normal University, has contributed data mining and natural language processing research to the military. Additionally, some of the teams responsible for designing the software have contributed to surveillance programs for the Chinese government and military, according to the South China Morning Post.

Another troubling aspect is the air of secrecy that surrounds it. Parents of children whose papers were subjected to the system were not informed beforehand, and test results are strictly classified. In some classes, students weren't even told that AI, rather than a human, had graded their work.

"There is no law forbidding AI from grading student essays," Yu Yafeng, a professor at the institute of educational theories at Beijing Normal University, told the South China Morning Post, "but this practice should raise ethical questions."
