Big Data in Emergent Distributed Environments (BiDEDE 2024)

Workshop @ ACM SIGMOD 2024

The International Workshop on Big Data in Emergent Distributed Environments (BiDEDE 2024)

In conjunction with ACM SIGMOD 2024

This workshop is organized in cooperation with the International Federation for Information Processing (IFIP) Working Group WG2.6 Database.

Authors of selected papers will be invited to submit an extended version to a special issue of the Springer Journal of Big Data.

Aims of the Workshop

Today, new forms of distributed environments beyond Cloud Computing are emerging. They enable new kinds of applications but pose new challenges for data management. Recent efforts in serverless computing aim to simplify the deployment of code into production in the Cloud by hiding scaling, capacity planning and maintenance operations from the developer or operator. Other initiatives avoid communication with the Cloud by deploying and running data processing environments near the data sources in Internet-of-Things scenarios (e.g., fog and edge computing) for large-scale smart homes, companies and cities, and near the applications (e.g., Cloudlets for mobile applications and Offline First technologies for web applications).

Research on distributed data management is evolving to address the challenges specific to these new environments. Properties of emergent distributed environments, such as the capabilities of nodes, communication bandwidth, battery lifetime of nodes, reliability of nodes and communication, and heterogeneity of configurations, impact data management mechanisms and approaches, including those for fault tolerance, replication, resource provisioning, buffer management, query processing and optimization, and transaction management. In addition, federated approaches and polystores spanning several emergent distributed environments remain research challenges. They stem from the need to combine these different distributed environments into one distributed runtime environment that handles Big Data in different models easily and globally optimizes data management tasks across these environments.

The goal of this workshop is to bring together academic researchers and industry practitioners to discuss the challenges and solutions, including new approaches, techniques and applications, that would significantly advance the state of the art of Big Data in emergent distributed environments.

Categories of Papers

The workshop solicits papers of the following categories:

  • Research Papers propose new approaches, theories or techniques related to Big Data in emergent distributed environments including new data structures, protocols and algorithms. They should make substantial theoretical and empirical contributions to the research field.

  • System Papers describe new data management tools, stream processing engines, databases and other systems, which are able to handle Big Data in emergent distributed environments.

  • Experiments and Analysis Papers focus on the experimental evaluation of existing approaches, including data structures and algorithms for Big Data in emergent distributed environments, and bring new insights through the analysis of these experiments. Such papers can, for example, show the benefits of well-known approaches in new settings and environments, open new research problems by demonstrating unexpected behavior or phenomena, or compare a set of traditional approaches in an experimental survey.

  • Application Papers report practical experiences with applications of Big Data in emergent distributed environments. Application Papers might describe how to apply technologies to specific application domains with big data demands in emergent distributed environments, such as social networks, web search, e-business, collaborative environments, e-learning, medical informatics, bioinformatics and geographic information systems.

  • Vision Papers identify emerging or future research issues and directions, and describe new research visions with demands for Big Data in emergent distributed environments. These visions should have the potential for great impact on society.

  • Demo Papers deal with innovative systems and applications for Big Data in emergent distributed environments. These papers describe a showcase of the proposed system/application, but may also explain the novelty of the system's architecture. We are especially interested in demonstrations having a WOW-effect.

Papers must be 4 to 6 pages long. Accepted papers will be published in the ACM Digital Library and presented orally at the workshop.

Topics of Interest

We are interested in all issues concerning the management of data to be processed in emergent environments, such as the following:

  • Serverless Computing

    • Cloud Functions
    • App Engines
    • Cloud Runs

  • Post-Cloud Computing

    • Cloudlet
    • Fog Computing
    • Edge Computing
    • Cloud-Edge Continuum
    • Dew Computing
    • Offline First
    • Smart Home/Companies/Cities
    • Heterogeneous Computing
    • In-network Computing

  • Hardware/Software Co-Design for Distributed Computing

  • Disaggregated Architectures for Processing of Big Data

The Data Management issues to be solved in the emergent environments include, but are not limited to, the following:

  • Query Processing and Optimization
  • Transaction Management
  • Fault Tolerance Mechanisms
  • Cloud Data Warehouses
  • Distributed Databases
  • Federation/Polystore Architectures
  • Data Lakes
  • Artificial Intelligence in Big Data Environments
  • Interactive Data Analytics and Big Data Science
  • 5G/6G Impact on Data Management

Important Dates

Time Schedule
Submission (extended): April 7, 2024
Notification: April 24, 2024
Workshop: June 9, 2024

Diversity Considerations of the Program Committee

We have recruited 17 PC members and chairs, listed below, who are experts in the workshop's topics of interest. They come from 12 countries around the world, as the map below also shows. While most PC members are from academia, 2 are experts from industry (12%), and 4 of the PC members and chairs are women (24%).

[Map of program committee members and chairs by country; legend range: 1 to 4]

Program Committee Chairs

Web Chair

Steering Committee

Program Committee

  • Ahmed S. Abdelhamid, Purdue University, USA
  • Paweł Boiński, Poznan University of Technology, Poland
  • Katja Gilly de La Sierra-Llamazares, Miguel Hernandez University, Spain
  • Ekaterini Ioannou, Tilburg University, The Netherlands
  • Ioannis Kontopoulos, Harokopio University of Athens, Greece
  • Tunç Durmuş Medeni, Ankara Yıldırım Beyazıt University, Turkey
  • Amira Mouakher, University of Perpignan, France
  • Grażyna Paliwoda-Pękosz, Cracow University of Economics, Poland
  • Alfredo Pulvirenti, University of Catania, Italy
  • Sanjay Vishwakarma, IBM Quantum - Almaden, USA
  • Xikui Wang, Google, USA
  • Adeleh Asemi Zavareh, University of Malaya, Malaysia
  • Steffen Zeuch, Technische Universität Berlin, Germany
  • Zhuoyue Zhao, University at Buffalo, USA

Evaluation of Papers

To verify the originality of submissions, we will use plagiarism detection tools to check the submitted manuscripts against previous publications.

Papers will be evaluated according to the following aspects:

  • Relevance to the Workshop
  • Novelty and practical impact
  • Technical soundness
  • Appropriateness and adequacy of:
    • Literature review
    • Background discussion
    • Analysis of issues
  • Presentation, including:
    • Overall organization and structure
    • Correctness of English language
    • Readability

Accepted Papers

The proceedings are available here.

Program

Keynote (keynote talk for BiDEDE)

Time Type Description
14:00-14:50: keynote Michał Bodziony (IBM Software Lab Kraków, Poland):
Building a data store and data fabric on a data lakehouse architecture for scaling AI workloads
Bio: With over two decades of professional experience in software architecture and data integration, Michał is renowned for his expertise in performance optimization, security, and software development. Currently serving as the Software Architect responsible for the connectivity layer of IBM Cloud Pak for Data, Michał has played a pivotal role in shaping cutting-edge technologies to empower businesses with robust data integration capabilities. Michał holds the prestigious title of IBM Master Inventor, with a remarkable portfolio of over 30 patents to his name. In addition to his role as a software architect, Michał serves as the Chair of the Committee tasked with evaluating disclosures before patenting, underscoring his commitment to fostering innovation and intellectual property development. He is a member of Working Group WG2.6 of the International Federation for Information Processing. Throughout his career, Michał has been at the forefront of research and development, spearheading initiatives to develop next-generation data integration solutions that meet the evolving needs of modern enterprises. His unwavering dedication to excellence and his passion for innovation continue to drive transformative outcomes, cementing his reputation as a leader in the field of data integration software architecture.
Abstract: This 45-minute talk delves into constructing a scalable data infrastructure for AI workloads using a data lakehouse architecture. It explores principles like ephemeral engines, moving computation toward data, and connectivity layers with connectors as a service. The approach integrates vectorized embedding capabilities for enhanced AI analytics. This architecture combines data lakes and warehouses, prioritizing agility and resource optimization while minimizing data movement and latency. Practical strategies for optimizing storage, ensuring data quality and governance, and orchestrating AI workflows at scale will be covered. Real-world use cases will illustrate how organizations leverage this architecture for innovation and transformation.

Session 1

Time Type Description
14:50: paper Steven Purtzel, Samira Akili, Matthias Weidlich:
On-Demand Pattern Aggregation in Event Networks
DOI: https://doi.org/10.1145/3663741.3664781
15:10: paper Suvam Kumar Das, Ronnit Peter, Xiaozheng Zhang, Suprio Ray:
FunDa: Towards Serverless Data Analytics and In Situ Query Processing
DOI: https://doi.org/10.1145/3663741.3664788
15:30: break coffee break

Session 2

Time Type Description
16:00: paper Carlos Ordonez, Wojciech Macyna, Ladjel Bellatreche:
Energy-Aware Analytics in the Cloud
DOI: https://doi.org/10.1145/3663741.3664789
16:20: paper Lynsey Lin, Jamie Chen, Ricky Sun, Jason Zhang, Victor Wang:
A Unified Graph Framework for Storage-Compute Coupled Cluster and High-Density Computing Cluster
DOI: https://doi.org/10.1145/3663741.3664790
16:40: paper Valerio Bellandi, Paolo Ceravolo, Stefano Siccardi:
Enhancing Semantic Exploration: A Distributed Repository Approach
DOI: https://doi.org/10.1145/3663741.3664792
17:00: paper Sylvio Barbon Junior, Paolo Ceravolo, Sven Groppe, Mustafa Jarrar, Samira Maghool, Florence Sèdes, Soror Sahri, Maurice van Keulen:
Are Large Language Models the New Interface for Data Pipelines?
DOI: https://doi.org/10.1145/3663741.3664785
17:20: paper Sandro Bimonte, Gianni Bellocchi, François Pinet, Guillaume Charrier, Dimitris Sacharidis, Mahmoud Sakr, Ronan Tournier, Gentian Jakllari, Gerard Chalhoub, Tahar Kechadi, Boualem Benatallah, Francesco Marinello, Roberto Oberti, Jérôme Bindelle, Ginta Majore, Piotr Skrzypczyński:
Technological and Research Challenges in Data Engineering for Sustainable Agriculture
DOI: https://doi.org/10.1145/3663741.3664786
17:40: paper Hanh Nguyen Phuong, Asefeh Asemi, Mutaz Alshafeey:
Predicting Bitcoin price movement through Sentiment Analysis: A Comprehensive Study
DOI: https://doi.org/10.1145/3663741.3664791
18:00: break End of Workshop

Manuscript Preparation

Authors are invited to submit original, unpublished research papers that are not being considered for publication in any other forum.

Manuscripts should be submitted electronically as PDF files using this webpage and formatted with the camera-ready templates in the ACM proceedings double-column format, following the "sigconf" proceedings template. Papers cannot exceed 6 pages in length.

Accepted papers will be published online in the ACM digital library. The papers must include the standard ACM copyright notice on the first page.

The PDF version of your paper should meet the following requirements:

  • The PDF should be optimized for fast web viewing.

  • The PDF should apply the ACM Computing Classification categories and terms (CCS concepts). The ACM templates provide space for this indexing; please follow the Computing Classification Scheme.

  • The PDF should contain the keywords.

  • The PDF should have the rights management statement and bibliographic strip at the bottom of the left column of the first page.

  • Please start numbering your paper with page number 1.

  • The PDF should use Type 1 fonts (scalable), not Type 3 (bit-mapped). All fonts MUST be embedded within the PDF file (this must be corrected in the source files before the PDF is generated; see the ACM documentation).

Contact Program Chairs

Please contact us for any further information:
