Sentiment analysis is a research technique used to identify posts with highly emotive content. However, in the study led by Abrahams, the technique failed to distinguish defects from non-defects and safety defects from performance defects.
Investigating further to determine why positive and negative keywords used in conventional sentiment analysis are not predictive of vehicle safety issues or defects, the researchers found that “users were prone to be negative about performance issues even if these did not affect their safety, and users were also prone to use negative sentiment words even if not reporting a defect with the vehicle.”
The study notes that “a thread poster may be more aggrieved by a malfunctioning air conditioner than with a sticky accelerator pedal, yet the latter is almost certainly a more serious defect.” In one post laden with negative sentiment words, the writer grumbled about rear-ending a vehicle after the driver in front slammed on the brakes. In contrast, a post about a malfunctioning brake light, which is a safety defect, contained only two negative sentiment words.
“Smoke word” analysis provides the full picture
Moreover, sentiment alone is not enough in the motor vehicle industry, the researchers note. “To enable proper investigation, the defect must be associated with the troublesome component, so hazard analysis can be performed. Defects must be prioritized, so that those that threaten safety can be sieved from those that are merely a nuisance.”
Abrahams and his research team rejected what they call “junk” or non-defect posts, including those seeking vehicle or routine service information, questions from hobbyists interested in after-market vehicle modifications, and complaints about human mishandling of a vehicle.
“In some domains — for example, retail, hospitality, and retail box office — sentiment analysis has been successfully used to find product complaints,” says Abrahams. “In the automotive domain, however, we discovered that conventional sentiment determination may be a poor indicator of whether a defect exists and how critical it is for safety.
“We compiled an alternative set of automotive ‘smoke’ words that have higher relative prevalence in defects versus non-defects, and in safety issues versus other postings,” he says. “These smoke words, discovered from Honda and Toyota postings, generalize well to a third brand, Chevrolet, which was used for validation.”
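The idea of "higher relative prevalence" can be illustrated with a small Python sketch. The article does not give the researchers' actual scoring formula, so everything here is an assumption for illustration: the function names `tokenize` and `relative_prevalence`, the smoothing constant, and the sample posts (which are made up, not taken from the study). The sketch simply compares the share of defect posts that contain a word with the share of non-defect posts that contain it.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a post and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def relative_prevalence(target_posts, other_posts, smoothing=1.0):
    """Score each word by how much more often it appears in target posts.

    For every word, compute the (smoothed) fraction of target posts that
    contain it, divided by the (smoothed) fraction of other posts that
    contain it. Scores above 1 mean the word is relatively more common
    in the target group. This is an illustrative measure, not the
    formula used in the study.
    """
    # Count each word at most once per post (document frequency).
    target_df = Counter(w for post in target_posts for w in set(tokenize(post)))
    other_df = Counter(w for post in other_posts for w in set(tokenize(post)))

    n_target, n_other = len(target_posts), len(other_posts)
    scores = {}
    for word in set(target_df) | set(other_df):
        # Additive smoothing avoids division by zero for unseen words.
        p_target = (target_df[word] + smoothing) / (n_target + 2 * smoothing)
        p_other = (other_df[word] + smoothing) / (n_other + 2 * smoothing)
        scores[word] = p_target / p_other
    return scores


# Hypothetical, hand-written posts for illustration only.
defect_posts = [
    "engine stalls on the highway",
    "the brake pedal sticks",
    "engine stalls when idling",
]
other_posts = [
    "where can I buy the floor mats",
    "what oil should I use",
    "the dealer was rude and awful",
]

scores = relative_prevalence(defect_posts, other_posts)

# List the candidate "smoke words": highest relative prevalence first.
for word, score in sorted(scores.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{word}: {score:.2f}")
```

In this toy data, a word such as "stalls" scores above 1 because it appears only in the defect posts, while a strongly negative word such as "awful" scores below 1 because it appears only in a non-defect post. That mirrors the article's point that negative sentiment and defect reporting are not the same signal. The same comparison could be run on safety-related posts versus all other posts.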