Distributed Collaboration on Versioned Decentralized RDF Knowledge Bases
Abstract
The aim of this thesis is to support the development of RDF knowledge bases in a distributed collaborative setup. To this end, a new methodology for distributed collaborative knowledge engineering, called Quit, is presented. It follows the premise that it is necessary to be able to express dissent throughout a collaboration process and to provide an individual workspace for each collaborator. The approach is inspired by and based on the Git methodology for collaboration in software engineering. The state-of-the-art analysis shows that no existing system consistently transfers the Git methodology to knowledge engineering. The key features of the Quit methodology are independent workspaces for each user and a shared distributed workspace for the collaboration. Data provenance plays an important role throughout the whole collaboration process. To support the methodology, the Quit Stack is implemented as a collection of microservices that allow the Semantic Web data structure and standard interfaces to be integrated with the distributed collaborative process. To complement the distributed data authoring, appropriate methods to support the data management process are researched. These management processes are, in particular, the creation and authoring of data as well as the publication and exploration of data. The application of the methodology is shown in various use cases for distributed collaboration on organizational data and on research data. Further, the implementation is quantitatively compared to the related work. Finally, it is concluded that the consistent approach followed by the Quit methodology enables a wide range of distributed Semantic Web knowledge engineering scenarios.