The International Workshop on Semantic Big Data (SBD 2016)

In conjunction with ACM SIGMOD 2016

Aims of the Workshop

The current World Wide Web enables easy, instant access to a vast amount of online information. However, content on the Web is typically intended for human consumption and is not tailored for machine processing. The Semantic Web is hence intended to establish a machine-understandable Web, and its technologies are now also used in many domains beyond the Web. The World Wide Web Consortium (W3C) has developed a number of standards around this vision. Among them is the Resource Description Framework (RDF), which is used as the data model of the Semantic Web. The W3C has also defined SPARQL as the RDF query language, RIF as the rule language, and the ontology languages RDFS and OWL to describe schemas of RDF. The use of common ontologies increases interoperability between heterogeneous data sets, and proprietary ontologies, with their additional abstraction layer, facilitate the integration of these data sets. Therefore, we can argue that the Semantic Web is ideally designed to work in heterogeneous Big Data environments.
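
To make the RDF data model mentioned above more concrete, the following is a minimal Python sketch, not an implementation of any W3C standard. It assumes a simplified view in which every statement is a (subject, predicate, object) triple of plain strings; real RDF additionally distinguishes IRIs, literals and blank nodes. The `Triple` type, the `match` helper and the `http://example.org/` resources are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object (simplified to strings)."""
    subject: str
    predicate: str
    obj: str


def match(triples, subject=None, predicate=None, obj=None):
    """Return all triples matching a pattern; None acts as a wildcard position."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


EX = "http://example.org/"                      # hypothetical namespace
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"  # property from the FOAF vocabulary
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

graph = [
    Triple(EX + "alice", FOAF_KNOWS, EX + "bob"),
    Triple(EX + "bob", FOAF_KNOWS, EX + "carol"),
    Triple(EX + "alice", FOAF_NAME, "Alice"),
]

# Whom does alice know? Roughly what a single SPARQL triple pattern would ask.
known_by_alice = [t.obj for t in match(graph, subject=EX + "alice", predicate=FOAF_KNOWS)]
print(known_by_alice)
```

Using shared vocabulary terms such as the FOAF properties as predicates is what lets independently produced data sets be combined into one graph, which is the interoperability argument made above.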

We define Semantic Big Data as the intersection of Semantic Web data and Big Data. Masses of Semantic Web data are freely available to the public, thanks to the efforts of the linked data initiative. According to http://stats.lod2.eu/, the currently freely available Semantic Web data amounts to approximately 90 billion triples in over 3,300 datasets, many of which are accessible via SPARQL query servers called SPARQL endpoints. Anyone can submit SPARQL queries to SPARQL endpoints via a standardized protocol: the queries are processed on the datasets of the SPARQL endpoints, and the query results are sent back in a standardized format. Hence, not only is Semantic Big Data freely available, but distributed execution environments for Semantic Big Data are freely accessible as well. This makes the Semantic Web an ideal playground for Big Data research.
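
As an illustration of the standardized protocol and result format described above, the following Python sketch builds a "query via GET" request in the style of the SPARQL 1.1 Protocol and parses a response in the SPARQL 1.1 Query Results JSON Format. It is a sketch under stated assumptions: the endpoint URL `https://example.org/sparql`, the sample response and the helper names `build_sparql_request` and `parse_select_results` are hypothetical, and no network request is actually sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(endpoint_url, query):
    """Build a GET request carrying the query in the 'query' URL parameter.

    Assumes endpoint_url has no query string of its own.
    """
    url = endpoint_url + "?" + urlencode({"query": query})
    # Ask the endpoint for SELECT results serialized as JSON.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def parse_select_results(payload):
    """Turn a SPARQL JSON result document into a list of {variable: value} rows."""
    doc = json.loads(payload)
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # A variable may be unbound in a row (e.g. after OPTIONAL), so skip missing keys.
        rows.append({v: binding[v]["value"] for v in variables if v in binding})
    return rows


ENDPOINT = "https://example.org/sparql"  # hypothetical endpoint, not a real service
QUERY = (
    "SELECT ?s ?name WHERE { ?s a <http://xmlns.com/foaf/0.1/Person> . "
    "OPTIONAL { ?s <http://xmlns.com/foaf/0.1/name> ?name } } LIMIT 2"
)

# Hand-written sample of what an endpoint could return for the query above.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["s", "name"]},
    "results": {"bindings": [
        {"s": {"type": "uri", "value": "http://example.org/alice"},
         "name": {"type": "literal", "value": "Alice"}},
        {"s": {"type": "uri", "value": "http://example.org/bob"}},
    ]},
})

request = build_sparql_request(ENDPOINT, QUERY)
rows = parse_select_results(SAMPLE_RESPONSE)
print(request.full_url)
print(rows)
```

Against a real endpoint, the request could be passed to `urllib.request.urlopen` and the response body handed to `parse_select_results`; availability, timeouts and result size limits differ between endpoints and are not handled here.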

The goal of this workshop is to bring together academic researchers and industry practitioners to address the challenges of Semantic Big Data, to report and exchange research findings, including new approaches, techniques and applications, and to make substantial theoretical and empirical contributions that significantly advance the state of the art of Semantic Big Data.

Types of Papers

The workshop solicits papers of different types:

Research Papers propose new approaches, theories or techniques related to Semantic Big Data, including new data structures, algorithms and whole systems. They should make substantial theoretical and empirical contributions to the research field.
Experiments and Analysis Papers focus on the experimental evaluation of existing approaches, including data structures and algorithms for Semantic Big Data, and bring new insights through the analysis of these experiments. Such papers can, for example, show the benefits of well-known approaches in new settings and environments, open new research problems by demonstrating unexpected behavior or phenomena, or compare a set of traditional approaches in an experimental survey.
Application Papers report practical experiences on applications of Semantic Big Data. Application Papers might describe how to apply Semantic Web technologies to specific application domains with big data demands like social networks, web search, e-business, collaborative environments, e-learning, medical informatics, bioinformatics and geographic information systems. They might also describe applications using linked data in a new way.
Vision Papers identify emerging or future research issues and directions, and describe new research visions having demands for Semantic Big Data. The new visions can potentially have a great impact on society.

Topics of Interest

We welcome papers on the following topics:

Semantic Data Management, Query Processing and Optimization in
- Big Data
- Cloud Computing
- Internet of Things
- Graph Databases
- Federations
- Spatial and Spatio-Temporal Data
Evaluation Strategies for Rule-based Languages like RIF and SWRL on Semantic Big Data
Ontology-based Approaches for Modeling, Mapping, Evolution and Real-World Ontologies in the context of Semantic Big Data
Reasoning Approaches (Real-World Applications, Efficient Algorithms) especially designed for Semantic Big Data environments
Linked Data
- Integration of Heterogeneous Linked Data
- Real-World Applications
- Statistics and Visualizations
- Quality
- Ranking Techniques
- Provenance
- Mining and Consuming Linked Data
Semantic Web stream processing (Dynamic Data, Temporal Semantics)
Semantic Internet of Things
Semantic Smart Homes/Companies/Cities
Performance, Evaluation and Benchmarking of Semantic Web Technologies, Applications and Databases
Semantic Web Services
Semantic Big Data Archives
- Efficient Archiving and Preservation Techniques
- Evolution Representation
- Compression Approaches
- Querying Techniques
Semantic Big Data on Emergent Hardware Technologies
- FPGA
- GPU
- SSD
- Main-Memory Databases

Important Dates

Time Schedule
Submission (extended):	February 29, 2016
Notification:	April 22, 2016
Workshop:	July 1, 2016

Diversity Considerations of the Program Committee

We have currently recruited 46 PC members and chairs, listed below, who are experts in the topics of interest of our workshop. They come from 17 nations all over the world, as the map below also shows. While most PC members are from academia, 5 are experts from industry (11%). 8 of the PC members and chairs are women (17%).

[Map: number of program committee members and chairs per country; legend scale from 1 to 10]

Program Committee Chairs

Sven Groppe, University of Lübeck, Germany
Le Gruenwald, University of Oklahoma, USA

Program Committee

Muhammad Intizar Ali, DERI, National University of Ireland, Ireland
Carlos Buil Aranda, Universidad Técnica Federico Santa María, Chile
Feng Cao, IBM China Research Laboratory, China
Isabel Cruz, University of Illinois at Chicago, USA
Paulo Rupino da Cunha, University of Coimbra, Portugal
Melike Şah Direkoglu, Near East University, North Cyprus
Julian Dolby, IBM Research, USA
Vadim Ermolayev, Zaporozhye National University, Ukraine
Javier D. Fernández, Vienna University of Economics and Business, WU Vienna, Austria
Carlos Juiz García, Universitat de les Illes Balears, Spain
Panagiotis Germanakos, University of Cyprus, Cyprus
Katja Gilly de La Sierra-Llamazares, Miguel Hernandez University, Spain
Ekaterini Ioannou, Technical University of Crete, Greece
Prudhvi Janga, University of Cincinnati and Amazon Web Services, USA
Ioannis Konstantinou, National Technical University of Athens, Greece
Nectarios Koziris, National Technical University of Athens, Greece
Herbert Kuchen, University of Münster, Germany
Wookey Lee, Inha University, Korea
Isaac Lera, Universitat de les Illes Balears, Spain
Xiang Lian, University of Texas - Pan American, Texas, USA
Qing Liu, CSIRO, Australia
Nuno Lopes, Smarter Cities Technology Centre, IBM Research, Dublin, Ireland
Fadi Maali, National University of Ireland Galway, Ireland
Ioana Manolescu, INRIA and Université Paris-Sud, France
Daniel Miranker, The University of Texas at Austin, USA
Z. Meral Özsoyoglu, Case Western Reserve University, USA
Grażyna Paliwoda-Pękosz, Cracow University of Economics, Poland
Nikolaos Papailiou, National Technical University of Athens, Greece
Richard Picking, Glyndwr University, UK
Alfredo Pulvirenti, University of Catania, Italy
Louiqa Raschid, University of Maryland, USA
Sherif Sakr, School of Computer Science and Engineering, University of New South Wales, Australia
Ismael Sanz, Universitat Jaume I, Spain
Stephan Seufert, Trifacta, Inc., USA
Rudi Studer, Institute AIFB, Karlsruhe Institute of Technology (KIT), Germany
Dezhao Song, Research and Development of Thomson Reuters, USA
Martin Theobald, University of Ulm, Germany
Dimitrios Tsoumakos, Department of Informatics, Ionian University, Greece
Juergen Umbrich, Vienna University of Economics and Business, Vienna, Austria
Dongyan Zhao, Peking University, Beijing, China
Xiang ZHAO, National University of Defense Technology, China
Weiguo Zheng, Chinese University of Hong Kong, China
Dimitrios Zissis, University of the Aegean, Greece
Lei Zou, Peking University, China

Evaluation of Papers

To verify the originality of submissions, we will use plagiarism detection tools to check the content of the submitted manuscripts against previous publications.

Papers will be evaluated according to the following aspects:

Relevance to the Workshop
Novelty and practical impact
Technical soundness
Appropriateness and adequacy of:
- Literature review
- Background discussion
- Analysis of issues
Presentation, including:
- Overall organization and structure
- Correctness of English language
- Readability

Accepted Papers

The proceedings are available in the ACM Digital Library.

Sangkeun Lee, Supriya Chinthavali, Sisi Duan, Mallikarjun Shankar:
Utilizing Semantic Big Data for realizing a National-scale Infrastructure Vulnerability Analysis System
DOI: 10.1145/2928294.2928295

Richard M. Keller, Shubha Ranjan, Mei Y. Wei, Michelle M. Eshow:
Semantic Representation and Scale-up of Integrated Air Traffic Management Data
DOI: 10.1145/2928294.2928296

Stefano Bortoli, Flavio Pompermaier, Paolo Bouquet, Andrea Molinari:
Semantic Big Data for Tax Assessment
DOI: 10.1145/2928294.2928297

Mohammad Sadnan Al Manir, Alexandre Riazanov, Harold Boley, Artjom Klein, Christopher J.O. Baker:
Automated Generation of SADI Semantic Web Services for Clinical Intelligence
DOI: 10.1145/2928294.2928298

Hassan Issa, Ludger van Elst, Andreas Dengel:
Using Smartphones for Prototyping Semantic Sensor Analysis Systems
DOI: 10.1145/2928294.2928299

Shohreh Hosseinzadeh, Natalia Díaz Rodríguez, Seppo Virtanen, Johan Lilius:
A semantic security framework and context-aware role-based access control ontology for Smart Spaces
DOI: 10.1145/2928294.2928300

Rafael Peixoto, Thomas Hassan, Christophe Cruz, Aurélie Bertaux, Nuno Silva:
An unsupervised classification process for large datasets using web reasoning
DOI: 10.1145/2928294.2928301
Extended Version: URN: urn:nbn:de:101:1-201705194907

Marta Tatu, Steven Werner, Mithun Balakrishna, Tatiana Erekhinskaya, Dan Moldovan:
Semantic Question Answering on Big Data
DOI: 10.1145/2928294.2928302
Extended Version: URN: urn:nbn:de:101:1-201705194921

Pieter Pauwels, Tarcisio Mendes de Farias, Chi Zhang, Ana Roxin, Jakob Beetz, Jos De Roo, Christophe Nicolle:
Querying and reasoning over large scale building data sets: an outline of a performance benchmark
DOI: 10.1145/2928294.2928303

Dieter De Witte, Laurens De Vocht, Ruben Verborgh, Kenny Knecht, Filip Pattyn, Hans Constandt, Erik Mannens, Rik Van de Walle:
Big Linked Data ETL Benchmark on Cloud Commodity Hardware
DOI: 10.1145/2928294.2928304

Sagnik Ray Choudhury, Shuting Wang, C. Lee Giles:
Scalable Algorithms for Scholarly Figure Mining and Semantics
DOI: 10.1145/2928294.2928305

Jian Wu, Chen Liang, Huaiyu Yang, C. Lee Giles:
CiteSeerX Data: Semanticizing Scholarly Papers
DOI: 10.1145/2928294.2928306

Program

Session 1
Time	Type	Description
8:30:	keynote	Pascal Hitzler: Semantic Technologies for Big Data Integration Abstract: Increasing amounts of data are shared, often publicly on the World Wide Web, for reuse by third parties. Such reuse usually necessitates the integration of this data with other data, or with software, in order to enable data-based applications, fine-grained search, data analytics, etc. This integration is often a significant cost factor due to the wide variance regarding representational choices for data, ranging from syntactic data formats to semantic heterogeneity stemming from different viewpoints of data providers. In this presentation, we will shed light on the role of knowledge modeling for data sharing and reuse. In particular, we will discuss how Semantic Web Technologies make it easier to integrate and thus reuse heterogeneous data. Bio: Pascal Hitzler is (full) Professor and Director of Data Science at the Department of Computer Science and Engineering at Wright State University in Dayton, Ohio, U.S.A. His research record lists over 300 publications in such diverse areas as semantic web, neural-symbolic integration, knowledge representation and reasoning, machine learning, denotational semantics, and set-theoretic topology. He is Editor-in-chief of the Semantic Web journal by IOS Press, and of the IOS Press book series Studies on the Semantic Web. He is co-author of the W3C Recommendation OWL 2 Primer, and of the book Foundations of Semantic Web Technologies by CRC Press, 2010, which was named one of seven Outstanding Academic Titles 2010 in Information and Computer Science by the American Library Association's Choice Magazine, and has translations into German and Chinese. He is on the editorial board of several journals and book series and is a founding steering committee member of the Web Reasoning and Rule Systems (RR) conference series, of the Neural-Symbolic Learning and Reasoning (NeSy) workshop series, and of the Association for Ontology Design and Patterns (ODPA). For more information, see http://www.pascal-hitzler.de.
9:15:	paper	Sagnik Ray Choudhury, Shuting Wang, C. Lee Giles: Scalable Algorithms for Scholarly Figure Mining and Semantics DOI: 10.1145/2928294.2928305
9:40:	paper	Jian Wu, Chen Liang, Huaiyu Yang, C. Lee Giles: CiteSeerX Data: Semanticizing Scholarly Papers DOI: 10.1145/2928294.2928306
10:05:	break	Coffee Break
Session 2
Time	Type	Description
10:30:	paper	Sangkeun Lee, Supriya Chinthavali, Sisi Duan, Mallikarjun Shankar: Utilizing Semantic Big Data for realizing a National-scale Infrastructure Vulnerability Analysis System DOI: 10.1145/2928294.2928295
10:55:	paper	Richard M. Keller, Shubha Ranjan, Mei Y. Wei, Michelle M. Eshow: Semantic Representation and Scale-up of Integrated Air Traffic Management Data DOI: 10.1145/2928294.2928296
11:20:	paper	Stefano Bortoli, Flavio Pompermaier, Paolo Bouquet, Andrea Molinari: Semantic Big Data for Tax Assessment DOI: 10.1145/2928294.2928297
11:45:	paper	Mohammad Sadnan Al Manir, Alexandre Riazanov, Harold Boley, Artjom Klein, Christopher J.O. Baker: Automated Generation of SADI Semantic Web Services for Clinical Intelligence DOI: 10.1145/2928294.2928298
12:10:	break	Lunch Break (lunch on your own)
Session 3
Time	Type	Description
13:30:	keynote	Ivan Bercovich: General purpose semantic platform as an information retrieval system Abstract: Over the past couple of decades, information retrieval systems could be roughly categorized into two groups: keyword search and faceted search. Keyword search is the most popular offering, primarily driven by giant search engines like Google and Bing. Faceted search applications tend to be more narrow, and focused on specific verticals, such as e-commerce, travel, cars, etc. While traditional search provides convenience, breadth, and flexibility, it falls short when it comes to the precision and structure of the results. On the other hand, faceted search is more constrained, but the results often convey a higher degree of structure and context. Roughly, we can say keyword search retrieves documents, whereas faceted search returns records/entities. In essence, traditional search provides a more natural interface to prompt queries, while faceted search provides better results. Therefore, the ideal experience would combine a natural language approach to query construction with a structured knowledge base to power the results. In this presentation we will show a working product, powered by a comprehensive knowledge graph (data) and the corresponding knowledge platform (software), which leverages insights from the fields of data ingestion, semantic data, natural language processing, and faceted search, to create a hybrid information retrieval experience. In order to achieve this experience, we had to build a vast knowledge graph, with billions of entities and relationships and hundreds of billions of facts. We cover dozens of verticals, from politics, to sports, to health, and have hundreds of entity-collections for each one. Our knowledge graph is seen by over 300 million eyeballs a month, both through our owned and operated websites, as well as through our partnerships with publishers and other enterprises. Bio: Ivan Bercovich is Vice President of Engineering at Graphiq. For more information, see https://team.graphiq.com/l/49/Ivan-Bercovich.
14:10:	paper	Hassan Issa, Ludger van Elst, Andreas Dengel: Using Smartphones for Prototyping Semantic Sensor Analysis Systems DOI: 10.1145/2928294.2928299
14:35:	paper	Shohreh Hosseinzadeh, Natalia Díaz Rodríguez, Seppo Virtanen, Johan Lilius: A semantic security framework and context-aware role-based access control ontology for Smart Spaces DOI: 10.1145/2928294.2928300
15:00:	break	Coffee Break
Session 4
Time	Type	Description
15:30:	paper	Rafael Peixoto, Thomas Hassan, Christophe Cruz, Aurélie Bertaux, Nuno Silva: An unsupervised classification process for large datasets using web reasoning DOI: 10.1145/2928294.2928301 Extended Version: URN: urn:nbn:de:101:1-201705194907
15:55:	paper	Marta Tatu, Steven Werner, Mithun Balakrishna, Tatiana Erekhinskaya, Dan Moldovan: Semantic Question Answering on Big Data DOI: 10.1145/2928294.2928302 Extended Version: URN: urn:nbn:de:101:1-201705194921
16:20:	paper	Pieter Pauwels, Tarcisio Mendes de Farias, Chi Zhang, Ana Roxin, Jakob Beetz, Jos De Roo, Christophe Nicolle: Querying and reasoning over large scale building data sets: an outline of a performance benchmark DOI: 10.1145/2928294.2928303
16:45:	paper	Dieter De Witte, Laurens De Vocht, Ruben Verborgh, Kenny Knecht, Filip Pattyn, Hans Constandt, Erik Mannens, Rik Van de Walle: Big Linked Data ETL Benchmark on Cloud Commodity Hardware DOI: 10.1145/2928294.2928304
17:10:	break	End of Workshop

Manuscript Preparation

Authors are invited to submit original, unpublished research papers that are not being considered for publication in any other forum.

Manuscripts should be submitted electronically as PDF files via the submission webpage and be formatted using the camera-ready templates for the ACM proceedings double-column format ("sigconf" proceedings template). Papers must not exceed 6 pages in length.

Accepted papers will be published online in the ACM digital library. The papers must include the standard ACM copyright notice on the first page.

The PDF version of your paper should meet the following requirements:

The PDF should be optimized for fast web viewing.
The PDF should apply the ACM Computing Classification categories and terms (CCS concepts). The ACM templates provide space for this indexing; please consult the Computing Classification Scheme.
The PDF should contain the keywords.
The PDF should have the rights management statement and bibliographic strip at the bottom of the left column of the first page.
Please start numbering your paper with page number 1.
The PDF should use Type 1 (scalable) fonts, not Type 3 (bit-mapped) fonts. All fonts MUST be embedded within the PDF file (fix this in the source files before generating the PDF, according to the ACM documentation).

Submission

Submission is currently closed. Please check the Important Dates.

Contact Program Chairs

Please contact the program chairs for any further information.

Editions

Please use the following links for further information on the edition of the given year of the International Workshop on Semantic Big Data (SBD):

2016
2017
2018
2019
2020
