Jovita Vytask presents us with a new topic modeling software application designed to help writers with revision. The project is part of her dissertation research and uses probabilistic inference to detect and flag topics throughout a text, as well as to point out sentences where the writer may have missed something, packed too many topics into one paragraph, or introduced a non sequitur.
Jovita connected with topic modeling while working on a learning analytics team designing tools for massive open online courses, better known to the world as MOOCs. “Learning analytics is part of a newly developed field,” says Jovita, “designed to use data analysis to support teaching and learning.” Learning analysts focus on the process and context of learning, with the goal of helping learners discover their optimal learning methods. In this case, the research question, formulated by Dr. Alyssa Wise and Yi Cui, was how to best navigate conversations in online discussion boards to help learners find answers to their questions.
Jovita decided to take on the challenge of automated content analysis and began to study natural language processing. That led to a research project testing and comparing three different ways to detect and classify posts through topic modeling, a project that won her and fellow authors Dr. Wise and Sonya Woloshen the best research poster prize at the 2017 International Learning Analytics and Knowledge Conference. “From there,” she says, “I started researching more about NLP techniques and I was fascinated by the research others were doing in providing Writing Analytics.” That’s how she began to look into how these natural language processing techniques could improve the writing process.
Some time has passed since the forum experiment, and she’s been working with natural language processing ever since. Although this wasn’t an interest she had before graduate school, writing analytics became a passion once she began. “Perhaps [that’s] because I've always struggled as a former science student to write anything beyond a lab report,” says Jovita, “and I have seen in my own work, and as an instructor working with students, how much can be learned from iterative feedback and revision support -- but equally, how challenging it is to provide opportunities for this type of support in large classes.” She began to design a tool that could offer that kind of support: helping writers revise their work and helping writing instructors give quick, targeted feedback. “I thought the idea of being able to provide automated feedback to support students learning to write in a systematic way addressed a real need in education, and great potential for the development of learning technology. From there I started researching writing analytics, which became my dissertation research and this project.”
Jovita’s software is an application of new topic-modeling methods. Most writing analytics approaches have been built to do one of three things: automatically mark papers; give feedback based on rhetorical moves; or, newest among the three, analyze the semantic cohesion of the writing. Jovita’s does the third. By analyzing how words correlate across a text, her software can accurately mark an essay’s topics and point you to places where the organization doesn’t hold together. Because it analyzes semantic patterns within a text instead of comparing it against other texts, it doesn’t require a large training dataset and it works across different assignments. You can use it on essays written on different topics, allowing students the creativity to develop their own theses. The tool is designed for expository writing and could be extended to a wide range of subjects.
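To give a concrete sense of what text-internal cohesion analysis can look like, here is a minimal sketch that scores lexical overlap between adjacent sentences and flags pairs with unusually low similarity. It is a simple stand-in written for illustration only; the article does not describe the specific algorithm Jovita's tool uses, and the example sentences, threshold, and library choice below are all assumptions.

    # A hypothetical cohesion check: compare each sentence to the next one
    # using TF-IDF vectors built only from this text -- no external corpus.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    sentences = [
        "Solar panels have become much cheaper over the past decade.",
        "Cheaper panels make rooftop solar attractive to homeowners.",
        "My grandmother bakes excellent sourdough bread.",  # deliberate non sequitur
        "Homeowners who install rooftop panels often recover the cost quickly.",
    ]

    # Vectorize the sentences using only the vocabulary of this text.
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(sentences)
    sims = cosine_similarity(tfidf)

    # Flag adjacent sentence pairs whose similarity is unusually low.
    for i in range(len(sentences) - 1):
        score = sims[i, i + 1]
        note = "  <-- possible break in cohesion" if score < 0.1 else ""
        print(f"Sentences {i + 1}-{i + 2}: similarity {score:.2f}{note}")

Running this flags the third sentence against both of its neighbours, which is roughly the kind of signal a writer or instructor could use when deciding where to revise.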
Much traditional topic modeling software uses hundreds or thousands of papers as training data in order to ‘learn’ what topics look like. Jovita’s, however, can detect topics in a paper on a new subject, or even in a new language, right off the bat. That’s because the program uses sampling methods that compare the paper to itself. No database or training is needed.
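As a hypothetical illustration of how a model can be fit to nothing but the essay itself, the sketch below runs a small topic model on a single text by treating each paragraph as its own mini-document. It uses scikit-learn's LDA implementation, which relies on variational inference rather than the sampling methods mentioned above, so read it as an analogy rather than the actual program; the toy paragraphs, number of topics, and library choice are assumptions.

    # A hypothetical within-document topic model: each paragraph of one essay
    # is treated as its own "document", so no external corpus is needed.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    # Paragraphs of a single, made-up essay.
    paragraphs = [
        "Solar panels have become much cheaper over the past decade. "
        "Falling manufacturing costs have driven most of that change.",
        "Cheaper panels make rooftop solar attractive to homeowners. "
        "Many households now recover the installation cost within a few years.",
        "Battery storage remains the main obstacle to wider adoption. "
        "Without storage, excess daytime power is often wasted.",
    ]

    # Build word counts from the essay alone and fit a small topic model.
    counts = CountVectorizer(stop_words="english").fit_transform(paragraphs)
    lda = LatentDirichletAllocation(n_components=3, random_state=0)
    doc_topics = lda.fit_transform(counts)

    # A paragraph whose weight is spread thinly across topics, or whose
    # dominant topic breaks from its neighbours, is a candidate for revision.
    for i, weights in enumerate(doc_topics):
        print(f"Paragraph {i + 1}: dominant topic {weights.argmax()},"
              f" weights {weights.round(2)}")

With only three toy paragraphs the topics are crude, but the same recipe scales to a full essay, which is the level at which feedback like "too many topics in one paragraph" becomes meaningful.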
Some functions of natural language software do raise questions about machines replacing human readers and writers. Software to grade essays is nothing new. What is unique about Jovita’s approach is that her software is designed to provide formative, not summative, feedback. We’ve covered how this tool can help students revise their own work, but it can also support TAs, instructors, and writing support centers in augmenting the formative feedback they give students. The feedback the model provides may help an instructor identify why a section of an essay was confusing to read, or where an argument went off topic, allowing them to more quickly give specific, actionable feedback. Additionally, the topic coding the model provides could help an instructor review the arguments a student is making in a paper and the evidence offered as support, whether to comment on the overall work or to help the student craft a better thesis statement. The goal is to make it easier for instructors to provide formative feedback so that students have the opportunity to learn to revise their work and improve their writing.
Now that natural language processing is becoming more accessible, topic modeling and other tools for writing support are becoming a more common feature of the landscape. The right choice of analysis and statistical methods on today’s computers lets a person do with one essay and a little mathematical magic what used to require volumes of training texts. Software like this can support teaching and learning in ways we wouldn’t even have thought of 10 years ago. It’s a great time for researchers, and for all of us, to see software and mathematics supporting writing and teaching in creative new ways.