Semantic Big Data (SBD 2019)

Workshop @ ACM SIGMOD 2019


International Workshop on
Semantic Big Data (SBD 2019)

Aims of the Workshop

The current World Wide Web enables easy, instant access to a vast amount of online information. However, content on the Web is typically intended for human consumption and is not tailored for machine processing. The Semantic Web is hence intended to establish a machine-understandable Web, and its technologies are now also used in many domains beyond the Web. The World Wide Web Consortium (W3C) has developed a number of standards around this vision. Among them is the Resource Description Framework (RDF), which is used as the data model of the Semantic Web. The W3C has also defined SPARQL as the RDF query language, RIF as the rule language, and the ontology languages RDFS and OWL to describe schemas of RDF data. The use of common ontologies increases interoperability between heterogeneous data sets, and proprietary ontologies, with their additional abstraction layer, facilitate the integration of these data sets. Therefore, we can argue that the Semantic Web is ideally designed to work in heterogeneous Big Data environments.
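As a rough illustration of RDF as a data model, the following minimal sketch stores statements as subject-predicate-object triples and matches them against a simple pattern. The `http://example.org/` namespace, the resource names and the `match` helper are assumptions made up for this example; they are not part of any W3C standard or library.

```python
# The rdf:type predicate IRI is defined by the RDF vocabulary.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical namespace and resources, used only for illustration.
EX = "http://example.org/"

# An RDF graph is a set of (subject, predicate, object) triples.
triples = {
    (EX + "sbd2019", RDF_TYPE, EX + "Workshop"),
    (EX + "sbd2019", EX + "colocatedWith", EX + "sigmod2019"),
    (EX + "sigmod2019", RDF_TYPE, EX + "Conference"),
}


def match(graph, s=None, p=None, o=None):
    """Return all triples that agree with the given pattern.

    A value of None acts as a wildcard, similar in spirit to a
    variable in a SPARQL triple pattern.
    """
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# All statements that assign a type to some resource.
typed = match(triples, p=RDF_TYPE)
# The event that the hypothetical workshop resource is co-located with.
colocated = match(triples, s=EX + "sbd2019", p=EX + "colocatedWith")
```

Real triple stores index such triples in several orders so that patterns like these can be answered without scanning the whole graph.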

We define Semantic Big Data as the intersection of Semantic Web data and Big Data. Masses of Semantic Web data are freely available to the public, thanks to the efforts of the linked data initiative. Many of these datasets are accessible via SPARQL query servers called SPARQL endpoints. Anyone can submit SPARQL queries to a SPARQL endpoint via a standardized protocol; the endpoint processes the queries on its datasets and sends the query results back in a standardized format. Hence, not only is Semantic Big Data freely available, but distributed execution environments for Semantic Big Data are freely accessible as well. This makes the Semantic Web an ideal playground for Big Data research.
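The interaction with a SPARQL endpoint can be sketched as follows, assuming the "query via GET" operation of the SPARQL 1.1 Protocol and the SPARQL JSON results format. The endpoint URL, the query and the sample response are hypothetical and chosen only for illustration; the request is built but deliberately not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(endpoint, query):
    """Build (but do not send) a SPARQL Protocol query-via-GET request."""
    # The query string is passed URL-encoded in the 'query' parameter.
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for results in the SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings_to_rows(results_json):
    """Flatten a SPARQL JSON results document into a list of dicts."""
    doc = json.loads(results_json)
    variables = doc["head"]["vars"]
    return [
        {v: binding[v]["value"] for v in variables if v in binding}
        for binding in doc["results"]["bindings"]
    ]


# Hypothetical endpoint and query, for illustration only.
ENDPOINT = "https://example.org/sparql"
QUERY = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1"
request = build_sparql_request(ENDPOINT, QUERY)

# Hand-written sample of what an endpoint might return for the query.
SAMPLE_RESPONSE = """
{"head": {"vars": ["s"]},
 "results": {"bindings": [
   {"s": {"type": "uri", "value": "http://example.org/a"}}
 ]}}
"""
rows = bindings_to_rows(SAMPLE_RESPONSE)
```

To actually run the query, the request object could be passed to `urllib.request.urlopen` and the response body handed to `bindings_to_rows`; this step is left out here because it needs a live endpoint.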

The goal of this workshop is to bring together academic researchers and industry practitioners to address the challenges of Semantic Big Data and to report and exchange research findings, including new approaches, techniques and applications. The workshop thereby aims to make substantial theoretical and empirical contributions to the state of the art of Semantic Big Data and to advance it significantly.

Categories of Papers

The workshop solicits papers of different categories:

  • Research Papers propose new approaches, theories or techniques related to Semantic Big Data including new data structures, algorithms and whole systems. They should make substantial theoretical and empirical contributions to the research field.

  • Experiments and Analysis Papers focus on the experimental evaluation of existing approaches, including data structures and algorithms for Semantic Big Data, and bring new insights through the analysis of these experiments. Such papers can, for example, show benefits of well-known approaches in new settings and environments, open new research problems by demonstrating unexpected behavior or phenomena, or compare a set of traditional approaches in an experimental survey.

  • Application Papers report practical experiences with applications of Semantic Big Data. Application Papers might describe how to apply Semantic Web technologies to specific application domains with big data demands, such as social networks, web search, e-business, collaborative environments, e-learning, medical informatics, bioinformatics and geographic information systems. They might also describe applications that use linked data in a new way.

  • Vision Papers identify emerging new or future research issues and directions, and describe new research visions having demands for Semantic Big Data. The new visions will potentially have great impacts on society.

  • Demo Papers deal with innovative systems and applications for Semantic Big Data. These papers describe a showcase of the proposed system/application, but may also explain the novelty of the system's architecture. We are especially interested in demonstrations having a WOW-effect.

For all categories (except for demo papers), we accept two different types of papers: Short and Full papers. The length of full papers cannot exceed 6 pages. The length of all other papers (i.e., short and demo papers) cannot exceed 4 pages. Accepted full and short papers will be presented in oral presentations. Demo papers will be presented as part of a combined demo and poster session. All accepted full and short papers will also be presented as posters in the combined demo and poster session in order to increase interactivity and discussion with the audience.

Topics of Interest

We welcome papers on the following topics:

  • Semantic Data Management, Query Processing and Optimization in

    • Big Data
    • Cloud Computing
    • Internet of Things
    • Graph Databases
    • Federations
    • Spatial and Spatio-Temporal Data

  • Evaluation Strategies for Rule-based Languages like RIF and SWRL on Semantic Big Data
  • Ontology-based Approaches for Modeling, Mapping, Evolution and Real-World Ontologies in the Context of Semantic Big Data
  • Reasoning Approaches (Real-World Applications, Efficient Algorithms) especially designed for Semantic Big Data environments
  • Linked Data

    • Integration of Heterogeneous Linked Data (linking algorithms, heuristics, identity resolution, schema matching, clustering)
    • Real-World Applications (data browsers, search engines, marketplaces, aggregators, indexes, enterprise applications using LOD, LOD applications for social sciences, digital humanities, life-sciences)
    • Statistics and Visualizations
    • Quality Assessment (evaluating the quality and trustworthiness, tracking the provenance, profiling and change tracking)
    • Cleansing (data fusion, truth discovery, conflict resolution, crowdsourcing)
    • Ranking Techniques
    • Provenance
    • Mining and Consuming Linked Data (large-scale derivation of implicit knowledge, using LOD as background knowledge in data mining)

  • Semantic Web stream processing (Dynamic Data, Temporal Semantics)
  • Semantic Internet of Things
  • Semantic Smart Homes/Companies/Cities
  • Performance, Evaluation and Benchmarking of Semantic Web Technologies, Applications and Databases
  • Semantic Web Services
  • Semantic Big Data Archives

    • Efficient Archiving and Preservation Techniques
    • Evolution Representation
    • Compression Approaches
    • Querying Techniques

  • Semantic Big Data on Emergent Hardware Technologies

    • FPGA
    • GPU
    • SSD
    • Main-Memory Databases

  • Semantic Wikis

    • Verification of Content
    • Bias in Content/Gaps of Knowledge
    • Detection of Incorrect or Low-Quality Content, Fake News
    • Collaborative Content Creation and Editor Decisions
    • Dynamics of Discussion, of Collaborative Content Creation and of Reuse
    • Detection of Hidden Knowledge
    • Ontology Learning

Important Dates

Time Schedule
Submission (extended): March 18, 2019
Notification: April 16, 2019
Workshop: July 5, 2019

Diversity Considerations of the Program Committee

We have currently recruited 32 PC members and chairs, listed below, who are experts in the topics of interest of our workshop. They come from 18 nations all over the world, as the map below also shows. While most PC members are from academia, 4 are experts from industry (13%). Of the PC members and chairs, 8 are women (25%).

[Map legend: number of program committee members and chairs, scale from 1 to 9]

Program Committee Chairs

Program Committee

  • Muhammad Intizar Ali, Insight, National University of Ireland, Galway
  • Carlos Buil Aranda, Universidad Técnica Federico Santa María, Chile
  • Mithun Balakrishna, Lymba Corporation, USA
  • Paulo Rupino da Cunha, University of Coimbra, Portugal
  • Melike Şah Direkoglu, Near East University, North Cyprus
  • Julian Dolby, IBM Research, USA
  • Vadim Ermolayev, Zaporizhzhia National University, Ukraine
  • Javier D. Fernández, Vienna University of Economics and Business, WU Vienna, Austria
  • Carlos Juiz García, Universitat de les Illes Balears, Spain
  • Katja Gilly de La Sierra-Llamazares, Miguel Hernandez University, Spain
  • Ekaterini Ioannou, Open University of Cyprus
  • Prudhvi Janga, University of Cincinnati and Amazon Web Services, USA
  • Herbert Kuchen, University of Münster, Germany
  • Isaac Lera, Universitat de les Illes Balears, Spain
  • Xiang Lian, Kent State University, USA
  • Qing Liu, Data61, CSIRO, Australia
  • Ioana Manolescu, INRIA and Ecole Polytechnique, France
  • Daniel Miranker, The University of Texas at Austin, USA
  • Grażyna Paliwoda-Pękosz, Cracow University of Economics, Poland
  • Alfredo Pulvirenti, University of Catania, Italy
  • Praveen Rao, University of Missouri-Kansas City, USA
  • Arjun Satish, Confluent Inc., USA
  • Omair Shafiq, Carleton University, Canada
  • Marta Tatu, Lymba Corporation, USA
  • Martin Theobald, University of Luxembourg, Luxembourg
  • Konstantinos Tserpes, Harokopio University of Athens, Greece
  • Dimitrios Tsoumakos, Department of Informatics, Ionian University, Greece
  • Xiang Zhao, National University of Defense Technology, China
  • Weiguo Zheng, Chinese University of Hong Kong, China
  • Dimitrios Zissis, University of the Aegean, Greece

Evaluation of Papers

To verify the originality of submissions, we will use plagiarism detection tools to check the content of the submitted manuscripts against previous publications.

Papers will be evaluated according to the following aspects:

  • Relevance to the Workshop
  • Novelty and practical impact
  • Technical soundness
  • Appropriateness and adequacy of:
    • Literature review
    • Background discussion
    • Analysis of issues
  • Presentation, including:
    • Overall organization and structure
    • Correctness of English language
    • Readability

Accepted Papers

The proceedings are available in the ACM Digital Library.
  • Gerald Haesendonck, Wouter Maroy, Pieter Heyvaert, Ruben Verborgh, Anastasia Dimou:
    Parallel RDF Generation from Heterogeneous Big Data
    DOI: 10.1145/3323878.3325802
  • Ahmed Al-Ghezi, Lena Wiese:
    UniAdapt: Universal Adaption of Replication and Indexes in Distributed RDF Triples Store
    DOI: 10.1145/3323878.3325803
  • Victor Anthony Arrascue Ayala, Polina Koleva, Anas Alzogbi, Matteo Cossu, Michael Färber, Patrick Philipp, Guilherme Schievelbein, Io Taxidou, Georg Lausen:
    Relational Schemata for Distributed SPARQL Query Processing
    DOI: 10.1145/3323878.3325804
  • Georgios M. Santipantakis, Apostolos Glenis, Christos Doulkeridis, Akrivi Vlachou, George A. Vouros:
    stLD: Towards a Spatio-temporal Link Discovery Framework
    DOI: 10.1145/3323878.3325805
  • Oliver Lehmberg, Christian Bizer:
    Profiling the Semantics of N-ary Web Table Data
    DOI: 10.1145/3323878.3325806
  • Irena Holubova, Stefanie Scherzinger:
    Unlocking the Potential of NextGen Multi-Model Databases for Semantic Big Data Projects
    DOI: 10.1145/3323878.3325807
  • Tobias Zeimetz, Ralf Schenkel:
    Analyzing Online Data Summarization Approaches for Linked Data Knowledge Bases
    DOI: 10.1145/3323878.3325808
  • Pawel Guzewicz, Ioana Manolescu:
    Parallel Quotient Summarization of RDF Graphs
    DOI: 10.1145/3323878.3325809

Program

Session 1 (Keynote)

Time Type Description
9:00: keynote Stefan Schlobach (Vrije Universiteit Amsterdam, The Netherlands):
The Spirits that I Called: Semantic Web as Big Data!
Abstract:

The Semantic Web technology stack has proven a powerful enabler for publishing and consuming data on the Web, with more and more mature tools around, such as robust and scalable triple-stores, as well as established standardised languages and protocols. In this talk I will introduce some new technology developed in our group that has enabled linking and publishing hundreds of thousands of online datasets seamlessly in an integrated and unified way. This way we made a huge number of integrated knowledge graphs accessible on consumer hardware.

As a consequence of these recent successes, the publicly available knowledge graphs, in particular when integrated, constitute instances of Big Data themselves, particularly with respect to veracity and variety, but increasingly also with respect to volume and velocity. They vary in size and format, reaching billions of triples, all made possible by the use of formal semantics and standardised knowledge representation languages like RDF(S), SKOS or OWL.

Not surprisingly, the success of the semantic methods for integrating data comes at a cost. Once Linked Data becomes Big Data itself, its formal methods become too rigid. It is well-known that classical logics fail in the presence of inconsistency, as knowledge graphs become formally meaningless as soon as they contain even a single contradiction. None of the existing semantic standards takes contextual information into account, nor any other pragmatic information about the data itself, including human errors, popularity, interpretation or uncertainty. In this presentation, I will present some typical examples of knowledge that is not captured with the classical semantics in large knowledge graphs. All this should lead to some ideas and a discussion on how to make the current semantic web formalisms more robust against Big Data phenomena.

Bio: Stefan Schlobach is an Associate Professor at the Knowledge Representation and Reasoning group in the Artificial Intelligence Section of the Department of Computer Science of the Vrije Universiteit Amsterdam. He received a PhD from King’s College, London, on the combination of Learning and Reasoning in Description Logics. A formal logician at heart, he spends his working life balancing the beauty and simplicity of formal systems and their well-understood semantics with the messiness of real-life data, e.g. large-scale Web data. This has led to work on explanation of reasoning and ontology integration, as well as alternative, more robust and context-aware methods for reasoning and querying, often based on some kind of emerging semantics.
10:30: break Coffee Break

Session 2 (Parallel and Distributed Processing)

Time Type Description
11:00: paper Gerald Haesendonck, Wouter Maroy, Pieter Heyvaert, Ruben Verborgh, Anastasia Dimou:
Parallel RDF Generation from Heterogeneous Big Data
DOI: 10.1145/3323878.3325802
11:30: paper Ahmed Al-Ghezi, Lena Wiese:
UniAdapt: Universal Adaption of Replication and Indexes in Distributed RDF Triples Store
DOI: 10.1145/3323878.3325803
12:00: paper Victor Anthony Arrascue Ayala, Polina Koleva, Anas Alzogbi, Matteo Cossu, Michael Färber, Patrick Philipp, Guilherme Schievelbein, Io Taxidou, Georg Lausen:
Relational Schemata for Distributed SPARQL Query Processing
DOI: 10.1145/3323878.3325804
12:30: break Lunch Break

Session 3 (Misc)

Time Type Description
14:00: paper Georgios M. Santipantakis, Apostolos Glenis, Christos Doulkeridis, Akrivi Vlachou, George A. Vouros:
stLD: Towards a Spatio-temporal Link Discovery Framework
DOI: 10.1145/3323878.3325805
14:30: paper Oliver Lehmberg, Christian Bizer:
Profiling the Semantics of N-ary Web Table Data
DOI: 10.1145/3323878.3325806
15:00: paper Irena Holubova, Stefanie Scherzinger:
Unlocking the Potential of NextGen Multi-Model Databases for Semantic Big Data Projects
DOI: 10.1145/3323878.3325807
15:30: poster Workshop Poster Session

Session 4 (Summarization)

Time Type Description
16:30: paper Tobias Zeimetz, Ralf Schenkel:
Analyzing Online Data Summarization Approaches for Linked Data Knowledge Bases
DOI: 10.1145/3323878.3325808
17:00: paper Pawel Guzewicz, Ioana Manolescu:
Parallel Quotient Summarization of RDF Graphs
DOI: 10.1145/3323878.3325809
17:30: break End of Workshop

Manuscript Preparation

Authors are invited to submit original, unpublished research papers that are not being considered for publication in any other forum.

Manuscripts should be submitted electronically as PDF files using this webpage and be formatted using the camera-ready templates in the ACM proceedings double-column format according to the "sigconf" proceedings template. Papers cannot exceed 6 pages in length.

Accepted papers will be published online in the ACM digital library. The papers must include the standard ACM copyright notice on the first page.

The PDF version of your paper should meet the following requirements:

  • The pdf should be optimized for fast web viewing.

  • The pdf should apply the ACM Computing Classification categories and terms (CCS concepts). The ACM templates provide space for this indexing; please consult the Computing Classification Scheme when choosing the terms.

  • The pdf should contain the keywords.

  • The pdf should have the rights management statement and bibliographic strip on the bottom of the first page left column.

  • Please start numbering your paper with page number 1.

  • The pdf should have Type 1 fonts (scalable), not Type 3 (bit-mapped). All fonts MUST be embedded within the PDF file (to be corrected in the source files before the PDF is generated according to ACM documentation).

Submission

Submission is currently closed. Please check the Important Dates.
