Cornell students, professors design software to spot fake hotel reviews

By Danny King / August 08, 2011

Students and professors at Cornell University said last week that they had created a software program that could sniff out bogus positive hotel reviews on user-review websites such as TripAdvisor and Yelp.

Fake reviews have been a source of concern within the industry, with many suppliers voicing suspicions that at least some reviews are attempts by companies to bolster the reputations of their own products or disparage competitors’ products.

The Cornell project -- a collaboration between one current computer-science student, one former computer-science student, one computer-science professor and one communications and information-science professor -- was designed to spot only fake positive reviews.

The decision to design and test the software reflects both the increasing importance of user reviews and the skepticism that some hoteliers and other travel businesses have expressed regarding the credibility of user-generated Internet content.

In a report titled “Finding Deceptive Opinion Spam by Any Stretch of the Imagination,” the Cornell writers claimed to have designed software that can detect what they dubbed “opinion spam,” de facto destination advertisements offered in the guise of a user review. The writers claimed that in tests, their software could spot fake reviews with 90% accuracy, compared with 50% accuracy for human judges.

“With the ever-increasing popularity of review websites that feature user-generated opinions, there comes an increasing potential for monetary gain through opinion spam -- inappropriate or fraudulent reviews,” the report said. “Opinion spam can range from annoying self-promotion of an unrelated website or blog to deliberate review fraud.”

To test their software, the writers paid people to write 400 fake positive reviews about TripAdvisor’s 20 most popular hotels in the Chicago area.
Those reviews, which averaged about 115 words each, were matched against 400 actual reviews in which the same hotels had been given the maximum five-star rating.

The test revealed that certain words were giveaways. For example, “hotel,” “Chicago,” “my,” “experience” and “vacation” tended to show up in fake reviews, while legitimate reviews contained words such as “floor,” “bathroom” and “small” as well as the “$” sign.

While such differentiators might seem somewhat intuitive, the writers reported that humans were only about half as successful as the program at spotting a bogus review. Overall, human judges estimated that about 12% of the reviews they read were fake, when in fact half of them were.

Neither TripAdvisor nor Yelp responded to requests to comment on their efforts to remove potentially fake reviews from their sites.

The study was undertaken at a time when the influence of user reviews on potential travelers is growing rapidly. The number of monthly visitors to travel-review sites, online travel agencies’ review pages and travel blogs jumped 35% between 2008 and 2010, research firm PhoCusWright reported last month. (PhoCusWright, like Travel Weekly, is owned by Northstar Travel Media.)

TripAdvisor, which was founded in 2000, last month posted its 50 millionth user review. Its parent, Expedia, reported last week that TripAdvisor’s Q2 revenue had jumped 35% compared with a year earlier, to $169 million, while the division’s profit, excluding some items, rose 23%, to $89 million.

In 2010, TripAdvisor’s revenue rose 38% from a year earlier, to $484.6 million, while profit from the unit rose 35%, to $138.8 million. Largely because of such rapid growth, Expedia announced in April that it would spin off TripAdvisor as a separately traded public company by the end of September.
“Travel companies now operate in a landscape where everybody has something to say and ample means to say it,” said Douglas Quinby, senior director of research at PhoCusWright. “Their customers are sharing their experiences, airing grievances and inspiring friends.”

It is the airing of grievances that prompted Expedia to warn potential TripAdvisor investors of possible liabilities in the form of legal claims stemming from such user reviews. In a 184-page document filed with the Securities and Exchange Commission on July 27, Expedia said TripAdvisor could be subject to legal expenses related to claims ranging from defamation and libel to negligence.

“These claims, whether brought in the United States or abroad, could divert management time and attention away from TripAdvisor’s business and result in significant costs to investigate and defend,” Expedia stated in the document. “If TripAdvisor becomes subject to these or similar claims and is not successful in its defense, TripAdvisor may be forced to pay substantial damages.”

Last September, a group representing owners of about 120 U.K. hotels said it would file a class action against TripAdvisor, alleging that the site hurts sales at many smaller hotels and other businesses in the hospitality industry by publishing libelous and potentially false reviews without verifying them.

U.K.-based KwikChex, which had been soliciting companies to pay 35 pounds ($50) to join a group legal action, said it presented its findings to the U.S. Federal Trade Commission in March.

Forrester Research principal analyst Henry Harteveldt said Expedia’s warning in SEC filings didn’t surprise him.

“However, I wouldn’t expect it’s something they’d want to voluntarily disclose,” Harteveldt said. “I suspect that it was deemed important enough by the firms’ lawyers and investment bankers as a risk to the business that it needed to be disclosed.”

CORRECTION: The software program was created by students and professors.