Big Data in Emergent Distributed Environments-Workshop: EDITIONS

The International Workshop on Big Data in Emergent Distributed Environments (BiDEDE 2024)

In conjunction with ACM SIGMOD 2024

This workshop is organized in cooperation with the International Federation for Information Processing (IFIP) Working Group WG2.6 Database.

Authors of selected papers will be invited to submit an extended version to a special issue of the Springer Journal of Big Data.

Aims of the Workshop

Today, new forms of distributed environments beyond Cloud Computing are emerging that enable new kinds of applications but pose new challenges for data management. Recent efforts in serverless computing aim to simplify the deployment of code in the Cloud into production by hiding scaling, capacity planning and maintenance operations from the developer or operator. Other initiatives work on avoiding communication with the Cloud by deploying and running data processing environments near data sources in Internet-of-Things scenarios (e.g., fog and edge computing) for large-scale smart homes, companies and cities, and near the applications (e.g., Cloudlets for mobile applications and Offline First technologies for web applications).

Research on distributed data management is evolving to address new challenges specific to these environments. Properties of emergent distributed environments, such as node capabilities, communication bandwidth, battery lifetime of nodes, reliability of nodes and communication, and heterogeneity of configurations, impact data management mechanisms and approaches, such as those for fault tolerance, replication, resource provisioning, buffer management, query processing and optimization, and transaction management. In addition, federated approaches and polystores spanning several emergent distributed environments remain research challenges, driven by the need to combine these different distributed environments into one distributed runtime environment for easy handling of Big Data in different models and for globally optimizing data management tasks across these environments.

The goal of this workshop is to bring together academic researchers and industry practitioners to discuss the challenges and solutions, including new approaches, techniques and applications, that would significantly advance the state of the art of Big Data in emergent distributed environments.

Categories of Papers

The workshop solicits papers in the following categories:

Research Papers propose new approaches, theories or techniques related to Big Data in emergent distributed environments including new data structures, protocols and algorithms. They should make substantial theoretical and empirical contributions to the research field.
System Papers describe new data management tools, stream processing engines, databases and other systems, which are able to handle Big Data in emergent distributed environments.
Experiments and Analysis Papers focus on the experimental evaluation of existing approaches, including data structures and algorithms for Big Data in emergent distributed environments, and bring new insights through the analysis of these experiments. Such papers can, for example, show the benefits of well-known approaches in new settings and environments, open new research problems by demonstrating unexpected behavior or phenomena, or compare a set of traditional approaches in an experimental survey.
Application Papers report practical experiences on applications of Big Data in emergent distributed environments. Application Papers might describe how to apply technologies to specific application domains with big data demands in emergent distributed environments like social networks, web search, e-business, collaborative environments, e-learning, medical informatics, bioinformatics and geographic information systems.
Vision Papers identify emerging new or future research issues and directions, and describe new research visions having demands for Big Data in emergent distributed environments. The new visions will potentially have great impacts on society.
Demo Papers deal with innovative systems and applications for Big Data in emergent distributed environments. These papers describe a showcase of the proposed system/application, but may also explain the novelty of the system's architecture. We are especially interested in demonstrations having a WOW-effect.

Papers must be 4 to 6 pages long. Accepted papers will be published in the ACM Digital Library and presented as oral presentations.

Topics of Interest

We are interested in all issues concerning the management of data to be processed in emergent environments, such as the following:

Serverless Computing
- Cloud Functions
- App Engines
- Cloud Runs
Post-Cloud Computing
- Cloudlet
- Fog Computing
- Edge Computing
- Cloud-Edge Continuum
- Dew Computing
- Offline First
- Smart Home/Companies/Cities
- Heterogeneous Computing
- In-network Computing
Hardware/Software Co-Design for Distributed Computing
Disaggregated Architectures for Processing of Big Data

The Data Management issues to be solved in the emergent environments include, but are not limited to, the following:

Query Processing and Optimization
Transaction Management
Fault Tolerance Mechanisms
Cloud Data Warehouses
Distributed Databases
Federation/Polystore Architectures
Data Lakes
Artificial Intelligence in Big Data Environments
Interactive Data Analytics and Big Data Science
5G/6G Impact on Data Management

Important Dates

Time Schedule
Submission (extended):	April 7, 2024
Notification:	April 24, 2024
Workshop:	June 9, 2024

Diversity Considerations of the Program Committee

We have currently recruited 17 PC members and chairs, listed below, who are experts in the workshop's topics of interest. They come from 12 nations all over the world, as also shown by the map below. While most PC members are from academia, 2 experts are from industry (12%). 4 of the PC members and chairs are women (24%).

[Map of program committee members and chairs by nation; legend range: 1 to 4]

Program Committee Chairs

Robert Wrembel, Poznan University of Technology and Artificial Intelligence and Cybersecurity Center, Poland
Andrea Kő, Corvinus University of Budapest, Hungary
Philippe Cudré-Mauroux, University of Fribourg, Switzerland

Web Chair

Sven Groppe, University of Lübeck, Germany

Steering Committee

Nik Bessis, Edge Hill University, U.K.
Pedro Garcia Lopez, Universitat Rovira i Virgili, Spain
Claudio Agostino Ardagna, Università degli Studi di Milano, Italy
Schahram Dustdar, TU Wien, Austria
Konstantinos Karanasos, Meta, USA
Sanju Mishra Tiwari, Universidad Autonoma de Tamaulipas, Mexico
Sven Groppe, University of Lübeck, Germany
Le Gruenwald, University of Oklahoma, USA

Program Committee

Ahmed S. Abdelhamid, Purdue University, USA
Paweł Boiński, Poznan University of Technology, Poland
Katja Gilly de La Sierra-Llamazares, Miguel Hernandez University, Spain
Ekaterini Ioannou, Tilburg University, The Netherlands
Ioannis Kontopoulos, Harokopio University of Athens, Greece
Tunç Durmuş Medeni, Ankara Yıldırım Beyazıt University, Turkey
Amira Mouakher, University of Perpignan, France
Grażyna Paliwoda-Pękosz, Cracow University of Economics, Poland
Alfredo Pulvirenti, University of Catania, Italy
Sanjay Vishwakarma, IBM Quantum - Almaden, USA
Xikui Wang, Google, USA
Adeleh Asemi Zavareh, University of Malaya, Malaysia
Steffen Zeuch, Technische Universität Berlin, Germany
Zhuoyue Zhao, University at Buffalo, USA

Evaluation of Papers

To verify the originality of submissions, we will use plagiarism detection tools to check the content of submitted manuscripts against previous publications.

Papers will be evaluated according to the following aspects:

Relevance to the Workshop
Novelty and practical impact
Technical soundness
Appropriateness and adequacy of:
- Literature review
- Background discussion
- Analysis of issues
Presentation, including:
- Overall organization and structure
- Correctness of English language
- Readability

Accepted Papers

The proceedings are available here.

Steven Purtzel, Samira Akili, Matthias Weidlich:
On-Demand Pattern Aggregation in Event Networks
DOI: https://doi.org/10.1145/3663741.3664781
Suvam Kumar Das, Ronnit Peter, Xiaozheng Zhang, Suprio Ray:
FunDa: Towards Serverless Data Analytics and In Situ Query Processing
DOI: https://doi.org/10.1145/3663741.3664788
Carlos Ordonez, Wojciech Macyna, Ladjel Bellatreche:
Energy-Aware Analytics in the Cloud
DOI: https://doi.org/10.1145/3663741.3664789
Lynsey Lin, Jamie Chen, Ricky Sun, Jason Zhang, Victor Wang:
A Unified Graph Framework for Storage-Compute Coupled Cluster and High-Density Computing Cluster
DOI: https://doi.org/10.1145/3663741.3664790
Valerio Bellandi, Paolo Ceravolo, Stefano Siccardi:
Enhancing Semantic Exploration: A Distributed Repository Approach
DOI: https://doi.org/10.1145/3663741.3664792
Sylvio Barbon Junior, Paolo Ceravolo, Sven Groppe, Mustafa Jarrar, Samira Maghool, Florence Sèdes, Soror Sahri, Maurice van Keulen:
Are Large Language Models the New Interface for Data Pipelines?
DOI: https://doi.org/10.1145/3663741.3664785
Sandro Bimonte, Gianni Bellocchi, François Pinet, Guillaume Charrier, Dimitris Sacharidis, Mahmoud Sakr, Ronan Tournier, Gentian Jakllari, Gerard Chalhoub, Tahar Kechadi, Boualem Benatallah, Francesco Marinello, Roberto Oberti, Jérôme Bindelle, Ginta Majore, Piotr Skrzypczyński:
Technological and Research Challenges in Data Engineering for Sustainable Agriculture
DOI: https://doi.org/10.1145/3663741.3664786
Hanh Nguyen Phuong, Asefeh Asemi, Mutaz Alshafeey:
Predicting Bitcoin price movement through Sentiment Analysis: A Comprehensive Study
DOI: https://doi.org/10.1145/3663741.3664791

Program

Keynote (keynote talk for BiDEDE)
Time	Type	Description
14:00-14:50:	keynote	Michał Bodziony (IBM Software Lab Kraków, Poland): Building a data store and data fabric on a data lakehouse architecture for scaling AI workloads
Bio: With over two decades of professional experience in software architecture and data integration, Michał is an expert renowned for his expertise in performance optimization, security, and software development. Currently serving as the Software Architect responsible for the connectivity layer of IBM Cloud Pak for Data, Michał has played a pivotal role in shaping cutting-edge technologies to empower businesses with robust data integration capabilities. Michał holds the prestigious title of IBM Master Inventor, with a remarkable portfolio of over 30 patents to his name. In addition to his role as a software architect, Michał serves as the Chair of the Committee tasked with evaluating disclosures before patenting, underscoring his commitment to fostering innovation and intellectual property development. He is a member of Working Group WG2.6 of the International Federation for Information Processing. Throughout his career, Michał has been at the forefront of research and development, spearheading initiatives to develop next-generation data integration solutions that meet the evolving needs of modern enterprises. His unwavering dedication to excellence and his passion for innovation continue to drive transformative outcomes, cementing his reputation as a leader in the field of data integration software architecture.
Abstract: This 45-minute talk delves into constructing a scalable data infrastructure for AI workloads using a data lakehouse architecture. It explores principles like ephemeral engines, moving computation toward data, and connectivity layers with connectors as a service. The approach integrates vectorized embedding capabilities for enhanced AI analytics. This architecture combines data lakes and warehouses, prioritizing agility and resource optimization while minimizing data movement and latency. Practical strategies for optimizing storage, ensuring data quality and governance, and orchestrating AI workflows at scale will be covered. Real-world use cases will illustrate how organizations leverage this architecture for innovation and transformation.
Session 1
Time	Type	Description
14:50:	paper	Steven Purtzel, Samira Akili, Matthias Weidlich: On-Demand Pattern Aggregation in Event Networks DOI: https://doi.org/10.1145/3663741.3664781
15:10:	paper	Suvam Kumar Das, Ronnit Peter, Xiaozheng Zhang, Suprio Ray: FunDa: Towards Serverless Data Analytics and In Situ Query Processing DOI: https://doi.org/10.1145/3663741.3664788
15:30:	break	coffee break
Session 2
Time	Type	Description
16:00:	paper	Carlos Ordonez, Wojciech Macyna, Ladjel Bellatreche: Energy-Aware Analytics in the Cloud DOI: https://doi.org/10.1145/3663741.3664789
16:20:	paper	Lynsey Lin, Jamie Chen, Ricky Sun, Jason Zhang, Victor Wang: A Unified Graph Framework for Storage-Compute Coupled Cluster and High-Density Computing Cluster DOI: https://doi.org/10.1145/3663741.3664790
16:40:	paper	Valerio Bellandi, Paolo Ceravolo, Stefano Siccardi: Enhancing Semantic Exploration: A Distributed Repository Approach DOI: https://doi.org/10.1145/3663741.3664792
17:00:	paper	Sylvio Barbon Junior, Paolo Ceravolo, Sven Groppe, Mustafa Jarrar, Samira Maghool, Florence Sèdes, Soror Sahri, Maurice van Keulen: Are Large Language Models the New Interface for Data Pipelines? DOI: https://doi.org/10.1145/3663741.3664785
17:20:	paper	Sandro Bimonte, Gianni Bellocchi, François Pinet, Guillaume Charrier, Dimitris Sacharidis, Mahmoud Sakr, Ronan Tournier, Gentian Jakllari, Gerard Chalhoub, Tahar Kechadi, Boualem Benatallah, Francesco Marinello, Roberto Oberti, Jérôme Bindelle, Ginta Majore, Piotr Skrzypczyński: Technological and Research Challenges in Data Engineering for Sustainable Agriculture DOI: https://doi.org/10.1145/3663741.3664786
17:40:	paper	Hanh Nguyen Phuong, Asefeh Asemi, Mutaz Alshafeey: Predicting Bitcoin price movement through Sentiment Analysis: A Comprehensive Study DOI: https://doi.org/10.1145/3663741.3664791
18:00:	break	End of Workshop

Manuscript Preparation

Authors are invited to submit original, unpublished research papers that are not being considered for publication in any other forum.

Manuscripts should be submitted electronically as PDF files using this webpage and be formatted using the camera-ready templates in the ACM proceedings double-column format according to the "sigconf" proceedings template. Papers cannot exceed 6 pages in length.

Accepted papers will be published online in the ACM Digital Library. The papers must include the standard ACM copyright notice on the first page.

The PDF version of your paper should meet the following requirements:

The PDF should be optimized for fast web viewing.
The PDF should apply the ACM Computing Classification categories and terms (CCS concepts). The ACM templates provide space for this indexing; please consult the Computing Classification Scheme.
The PDF should contain the keywords.
The PDF should have the rights management statement and bibliographic strip at the bottom of the left column of the first page.
Please start numbering your paper with page number 1.
The PDF should use Type 1 fonts (scalable), not Type 3 (bit-mapped). All fonts MUST be embedded within the PDF file (to be corrected in the source files before the PDF is generated, according to ACM documentation).

Contact Program Chairs

Please contact us for any further information:

Editions

Please use the following links for further information on the edition of the given year of the International Workshop on Big Data in Emergent Distributed Environments (BiDEDE):

2021
2022
2023
2024
