Drowning in the literature? These smart software tools can help
Whenever Eddie Smolyansky had a few moments to himself, he tried to keep up with new publications in his field. But in 2016, the Tel Aviv, Israel-based computer vision researcher was receiving hundreds of automated literature recommendations a day. “At one point, bathroom breaks weren’t enough,” he says. The recommendations were “far too many and impossible to follow”.
Smolyansky’s “feed fatigue” will be familiar to many academics. Automated literature-alert tools, originally designed to draw attention to relevant articles, have themselves become a hindrance, flooding the inboxes of scientists around the world.
“I haven’t even read my automated PubMed searches lately because it’s really overwhelming,” says Craig Kaplan, a biologist at the University of Pittsburgh in Pennsylvania. “Honestly, I can’t keep up with the literature.”
But change is afoot. In 2019, Smolyansky co-founded Connected Papers, one of a new generation of visual literature-mapping and recommendation tools. Other services that promise to tame the information overload, including Twitter feeds and daily digests of news and research, are also available.
Origin story
Instead of offering a daily list of new articles by email, Connected Papers uses a single, user-chosen “origin” article to create a map of related research, based in part on overlapping citations. The service recently surpassed one million users, says Smolyansky.
Papers on the map are color-coded by publication date, and users can switch between earlier, foundational “prior works” and later “derivative works” that build on them. The idea is that scientists can search for an origin article that interests them and see from the resulting map which recent articles have caused a stir in their field, how they relate to other research and how many citations they have accumulated.
“You don’t have to sit on the feed of papers and watch every new paper that comes out for fear of missing it,” says Smolyansky. The tool is also useful when scientists want to delve into an entirely new field, he adds, because it offers an overview of the essential literature.
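For readers curious about the mechanics, the citation-overlap idea can be sketched in a few lines of Python. This is not Connected Papers’ actual algorithm, which the company has not published; it is a minimal illustration in which candidate papers are scored against a chosen origin paper by how much their reference lists overlap, and every paper identifier is invented.

```python
# Toy illustration only: not Connected Papers' published method.
# Candidate papers are scored against an "origin" paper by the Jaccard
# overlap of their reference lists, one simple citation-based signal of
# relatedness. All paper identifiers below are invented.

def citation_overlap(refs_a: set, refs_b: set) -> float:
    """Jaccard similarity between two sets of cited-paper IDs."""
    if not refs_a or not refs_b:
        return 0.0
    return len(refs_a & refs_b) / len(refs_a | refs_b)

# Hypothetical corpus: paper ID -> set of papers it cites.
corpus = {
    "origin": {"p1", "p2", "p3", "p4"},
    "candidate_A": {"p2", "p3", "p4", "p9"},  # shares three references
    "candidate_B": {"p1", "p8"},              # shares one reference
    "candidate_C": {"p7", "p8", "p9"},        # shares none
}

origin_refs = corpus["origin"]
ranked = sorted(
    ((paper, citation_overlap(origin_refs, refs))
     for paper, refs in corpus.items() if paper != "origin"),
    key=lambda item: item[1],
    reverse=True,
)
for paper, score in ranked:
    print(f"{paper}: overlap {score:.2f}")
```

In this sketch, candidate_A, which shares most of the origin paper’s references, ranks highest; a real mapping tool layers many more signals on top of this kind of score.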
Another visual mapping tool is Open Knowledge Maps, a service offered by a Vienna-based non-profit organization of the same name. It was founded in 2015 by Peter Kraker, a former scholarly communication researcher at the Graz University of Technology in Austria.
Open Knowledge Maps creates its maps from keywords rather than from a central article, and relies on the similarity of texts and metadata to determine how articles relate to one another. The tool groups 100 relevant articles into bubbles representing subfields, whose relative positions suggest how similar they are; a search for articles on “climate change”, for example, might generate a related bubble on “risk cognition”.
Maps can be built in about 20 seconds, and users can edit them to include the 100 most recently published relevant articles, or other resources. Open Knowledge Maps covers not only journal articles, but also content such as datasets and research software. Its users have created more than 400,000 maps to date, says Kraker.
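A rough sense of how keyword-driven bubbles might emerge can be conveyed with off-the-shelf text-similarity and clustering tools. The Python sketch below is not Open Knowledge Maps’ own code: the abstracts are invented, and TF-IDF vectors with k-means clustering are simply one plausible way to group similar articles into subtopics.

```python
# Toy illustration only: not Open Knowledge Maps' implementation.
# Group a handful of invented abstracts into subtopic "bubbles" by
# text similarity, using TF-IDF vectors and k-means clustering.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

abstracts = [
    "Projected sea level rise under climate change scenarios",
    "Climate change impacts on coastal flooding and sea level",
    "Public risk perception of climate change communication",
    "How media framing shapes risk perception of global warming",
]

# Represent each abstract as a TF-IDF vector, then cluster into bubbles.
vectors = TfidfVectorizer(stop_words="english").fit_transform(abstracts)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

for bubble, text in zip(labels, abstracts):
    print(f"bubble {bubble}: {text}")
```

Run on these four abstracts, the sea-level papers and the risk-perception papers fall into separate clusters, which is the kind of subfield grouping the map visualizes.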
Amie Fairs, who studies language at Aix-Marseille University in France, is a self-proclaimed enthusiast of Open Knowledge Maps. “One particularly cool thing about Open Knowledge Maps is that you can search very broad topics, like ‘language production’, and it can group articles into topics that you might not have considered,” Fairs explains. For example, when she searched for “phonological brain regions” – areas of the brain that process the sounds of language – Open Knowledge Maps suggested a subfield of research into age-related processing differences. “I hadn’t considered looking in the aging literature for information on this before, but now I will,” she says.
Yet despite her enthusiasm for the service, Fairs still tends to find new articles through alerts from Google Scholar, the dominant tool in the field; it’s easier to go “down the rabbit hole”, she explains, following a chain of papers that cite one another.
Click to recommend
Google Scholar recommends articles on the basis of articles that users have written and listed in their profiles. The algorithm is not public, but the company says the recommendations draw on “the topics of your articles, the places you publish, the authors you work with and cite, the authors who work in the same areas as you and the citation graph”. Users can also manually configure email alerts based on specific keyword searches or authors.
Aaron Tay, a librarian at Singapore Management University who studies academic research tools, gets literature recommendations from both Twitter and Google Scholar, and finds that the latter often highlights the same articles as his human colleagues, although only a few days later. Google Scholar “is almost always on the right track,” he says.
In addition to published articles, Google Scholar also retrieves preprints, as well as “low-quality theses and dissertations”, says Tay. Even so, “you get gems that you might not have seen”, he says. (Scopus, a competing literature database run by Amsterdam-based publisher Elsevier, began incorporating preprints earlier this year, a spokesperson says, but it does not index the theses and dissertations covered by Google Scholar.)
Google Scholar does not disclose the size of its database, but it is widely regarded as the largest of its kind, with nearly 400 million articles according to one estimate (M. Gusenbauer Scientometrics 118, 177–214; 2019). Open Knowledge Maps, meanwhile, is built on the open-source Bielefeld Academic Search Engine (BASE), which contains more than 270 million documents, including preprints, and is designed to filter out spam.
Connected Papers uses the publicly available corpus compiled by Semantic Scholar – a tool launched in 2015 by the Allen Institute for Artificial Intelligence in Seattle, Washington – covering approximately 200 million articles, including preprints. Smolyansky acknowledges that this size gap means that, “very rarely”, Google Scholar will surface “a niche article from the 1970s” that Semantic Scholar does not.
Semantic Scholar’s alert system, an adaptive research feed, builds a list of recommended articles that users can shape by “liking” or “disliking” the papers they see. To decide which articles are similar, it uses a machine-learning model trained on shared citations and on which papers Semantic Scholar users viewed in sequence. The service has some 8 million monthly users.
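The like-and-dislike idea can be illustrated with a toy classifier, sketched below in Python. It is not Semantic Scholar’s model, which the company has not released; it uses only invented titles and text similarity, whereas the real system also draws on citations and viewing behaviour, as described above.

```python
# Toy illustration only: not Semantic Scholar's feed model.
# Learn a simple preference model from "liked" and "disliked" papers,
# then use it to score new candidates. All titles are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

liked = [
    "Self-supervised learning for medical image segmentation",
    "Vision transformers for dense prediction tasks",
]
disliked = [
    "Macroeconomic effects of interest-rate policy",
    "Soil chemistry of temperate forest ecosystems",
]

titles = liked + disliked
labels = [1] * len(liked) + [0] * len(disliked)  # 1 = liked, 0 = disliked

vectorizer = TfidfVectorizer()
features = vectorizer.fit_transform(titles)
model = LogisticRegression().fit(features, labels)

candidates = [
    "Contrastive pretraining improves image segmentation",
    "Fertilizer run-off and forest soil acidity",
]
scores = model.predict_proba(vectorizer.transform(candidates))[:, 1]
for title, score in sorted(zip(candidates, scores), key=lambda x: -x[1]):
    print(f"{score:.2f}  {title}")
```

Even this crude sketch pushes the segmentation paper above the soil-chemistry one for a user who liked computer-vision work, which is the basic behaviour such feeds aim for.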
No more FOMO
Feedly, launched in 2008, also uses up- and down-votes to learn which new academic research is most relevant to a user, and features an AI assistant that can be trained on specific keywords or topics. But Feedly isn’t aimed specifically at researchers – it aims to be a comprehensive dashboard for monitoring news, RSS feeds (which provide a way to alert users to new content on websites), the online forum Reddit, Twitter and podcasts. A free version is available, but extra features, such as the ability to track more than 100 sources and hide advertisements, cost US$6 or more per month. (Most of the other tools mentioned here are completely free; another paid option is ResearchGate + Plus, which boosts user visibility and offers advanced statistics.)
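Because Feedly builds on RSS, researchers who prefer to roll their own alerts can poll feeds directly. The Python sketch below, which uses the open-source feedparser library, is a bare-bones example rather than anything Feedly itself does; the feed URL and keywords are assumptions chosen for illustration, and any journal or preprint-server feed could be substituted.

```python
# Minimal do-it-yourself literature alert, not Feedly's implementation.
# Requires the third-party feedparser package (pip install feedparser).
import feedparser

FEED_URL = "http://export.arxiv.org/rss/cs.CV"    # assumed example feed
KEYWORDS = {"segmentation", "3d reconstruction"}  # topics to watch for

feed = feedparser.parse(FEED_URL)
for entry in feed.entries:
    # Match keywords against the title and summary of each new item.
    text = (entry.get("title", "") + " " + entry.get("summary", "")).lower()
    if any(keyword in text for keyword in KEYWORDS):
        print(entry.get("title", "(untitled)"), "->", entry.get("link", ""))
```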
ResearchRabbit, which was fully launched in August 2021, describes itself as “Spotify for Papers”. Users start by saving relevant articles to a collection. With each article added, ResearchRabbit updates its list of recommended papers, mirroring the way the music-streaming platform makes recommendations on the basis of the songs that users add to their playlists. The Seattle, Washington-based company behind it hasn’t revealed exactly how it assesses relevance, although it says that it focuses on specific recommendations rather than floods of alerts. “We only want to send the most relevant articles to our users,” says managing director Michael Ma.
Amber Brown Ruiz, a doctoral student in special education and disability policy at Virginia Commonwealth University in Richmond, finds ResearchRabbit’s alerts more personalized than Google Scholar’s, which sometimes send her articles that are superficially similar to her own work but turn out to be well outside her discipline.
Ruiz also uses Connected Papers to find new articles. She finds it less automated than Google Scholar, which sends new articles by email, “but you can go in manually and see which articles are the most recent”, she says.
What all of these tools have in common is that they use some sort of artificial intelligence to craft their recommendations. But some researchers appreciate the human touch, valuing recommendations from colleagues and contacts on Twitter, for example. ResearchGate, the long-standing platform that touts itself as a sort of social network for scientists, says it offers the best of both worlds (ResearchGate is in a content-sharing partnership with Springer Nature, which publishes Nature).
Founded in 2008, ResearchGate sends article recommendations by email and delivers them in a continuous stream when users are logged in. (Users can also see a chronological news feed of articles posted by their ResearchGate contacts.) Although it does not make its algorithm public, it uses information about a user’s publications, and the publications they have viewed on the platform, to work out their interests. It then calculates related articles on the basis of shared citations and extracted topics and keywords. ResearchGate currently hosts some 149 million publication pages and has 20 million users.
“ResearchGate’s secret sauce is the combination of an active social network and a huge research graph,” says Joseph Debruin, ResearchGate’s director of product management, who is based in Los Angeles, California.
Five years after realizing that he was drowning in new articles, Smolyansky is finally able to shake off that scientific “fear of missing out”. “You don’t need to have that FOMO feeling,” he says.