Workshop on Open Research Knowledge Graph on November 22 at TIB

On 22 November, TIB organized the second workshop on the Open Research Knowledge Graph (ORKG). It was held as a post-conference event of DILS2018. A total of 47 participants represented numerous organizations, including Universität Augsburg, University of Bonn, Osnabrück University, Zoologisches Forschungsmuseum Alexander Koenig, Inria / ICM, TH Brandenburg, GWDG, Leibniz Institute for Psychology Information, Brandenburg University of Applied Sciences, ZB MED, and University of Göttingen.

Prof. Sören Auer kicked off the workshop and described the ORKG vision. Prof. Dietrich Rebholz-Schuhmann (ZB MED) then gave an inspiring keynote titled “Knowledge Graphs in Life Sciences”, showing how knowledge graphs can help researchers uncover hidden relations in life science data. Next, Markus Stocker introduced recent ORKG developments, and Viktor Kovtun presented preliminary findings from an experiment that TIB conducted with authors of the DILS2018 conference proceedings to evaluate their response to the ORKG prototype.

A series of lightning talks on state-of-the-art research on knowledge graphs in science concluded the morning session:

  • Vera G. Meister, Wenxin Hu: Semi-automatic Knowledge Graph Population Focusing on Qualitative Analysis of Scholarly Papers
  • Sebastian Hellmann: How to build an Open Research Knowledge Graph by self-publishing following metadata standards
  • Vincent Henry, Ivan Moszer, Olivier Colliot: Contribution of logical reasoning to deal with biomedical data analysis and quality
  • Péter Király: Researching Metadata Quality

The afternoon was dedicated to group work along the following thematic lines:

  1. Technical infrastructures: architecture and technologies. This group discussed technological aspects of the ORKG, such as backend and UI development. The group suggested a decentralized architecture that supports flexible integration of knowledge graphs published by various institutions. Linking to existing resources and content is essential.
  2. Acquiring content: retrospective approaches. This group discussed ways to extract content from published literature, specifically NLP (possibly with a human supervisor), crowdsourcing, and manual article revision by professionals (e.g., librarians). The group underscored that automated methods based on text mining will require human curation. Simple and fast yes/no classification questions combined with gamification may be viable approaches to engage the crowd.
  3. Acquiring content: prospective approaches. This group discussed ways to create ORKG content in various phases of the research lifecycle, such as during data analysis or when writing or submitting articles. A web-based or desktop WYSIWYG editor could support annotation at the time of writing or submission. The group also discussed information granularity and suggested that abstracts are not sufficient because they often omit important information.
  4. Using content: applications and use cases. This group assumed a production ORKG and discussed how it could be used. The prospect of precise search is a key driver. Other possibilities include suggesting related publications based on semantic similarity, e.g., at the time of article writing. Semantic plagiarism detection was raised as a possible use. The ORKG may also facilitate reviewer selection.
  5. Engaging stakeholders. Stakeholders include researchers, publishers, reviewers, librarians, journalists, and the general public. Attribution and credit are key instruments for engaging researchers. Gamification may be explored to engage citizen scientists and the public at large. The group also discussed up/down voting on contribution quality as an idea worth considering. Journalists and the public may benefit from the ORKG because it makes it easier to check and share science facts (e.g., on social media). The role of librarians in content curation and validation was also discussed.

The workshop ended with concluding remarks on the next steps, including the possibility of organizing a hackathon. The event raised many questions among participants but also clarified direction, highlighted possible collaborations, and demonstrated collective enthusiasm for developing open digital libraries for semantic scientific knowledge.