layout: true
--- class: title-slide, center, middle
--- # Platform-specific .darkblue[types of DBMS]
---
.more-condensed[ .bold[Polyglot Persistence]
-
.darkred[data sources: integration] at application level
-
.darkred[performance] of data processing cannot be fully optimized
-
.darkred[fault-tolerance] cannot be transparently offered across the different databases
-
.darkred[zoo of query languages]
+
.darkgreen[features of different] types of .darkgreen[databases] can be used ]
.more-condensed[ .bold[Multi-Model DBMS (MM-DBMS)]
+
full and uniform .darkgreen[data integration] at database level
+
.darkgreen[performance]: fully optimized across different data models
+
transparent .darkgreen[fault-tolerance]
+
SQL .darkgreen[standards]:
.small-font22[relational ('87), XML ('03), temporal ('11), JSON ('16), Multi-dimensional Arrays ('19), schemaless ('19), streams ('20?), property graphs ('21?)]
-
.darkred[features of different] types of .darkred[databases can.bold[not] be used] ]
--- # .darkblue[Federated] DBMS
.more-condensed.no-padding.no-margin[
.darkblue[Bottom-up integration of] existing .darkblue[databases]
mostly .darkblue[independent DBMS] with private conceptual database schemas
partially enabling .darkblue[external access] (.darkblue[in cooperation])
.darkblue[heterogeneity of data models and transaction management possible] (but in .darkblue[most] cases .darkblue[relational DBMS])
-
problems with .darkblue[semantic heterogeneity]
-
.darkblue[transparency in distribution only partially] achievable
]
---
# .darkblue[One Size]-Approach
- M. Stonebraker, U. Cetintemel. .citation.darkblue["One Size Fits All": An Idea Whose Time Has Come and Gone]. ICDE 2005
  - .citation[The last 25 years of commercial DBMS development can be summed up in a single phrase: "One size fits all".]
  - .citation[...this concept is no longer applicable to the database market...]
- .darkblue[Our approach: .bold[Enlarge the size!]]
  - .darkblue[Over the boundaries and limitations of single platforms and] their .darkblue[specialized approaches]
  - .darkblue[Increase transparency, performance and ease of use]
---
# .darkblue[Hybrid Multi-Model Multi-Platform (HM3P) Database]
.reference[S. Groppe, J. Groppe, Hybrid Multi-Model Multi-Platform (HM3P) Databases, DATA 2020.]
.more-condensed[
+
full and uniform .darkgreen[data integration] at database level
+
.darkgreen[performance]: fully optimized across different data models
+
transparent .darkgreen[fault-tolerance]
+
SQL .darkgreen[standards]: .small-font22[relational ('87), XML ('03), temporal ('11), JSON ('16), Multi-dimensional Arrays ('19), schemaless ('19), streams ('20?), property graphs ('21?)]
+
.darkgreen[features of different] types of .darkgreen[databases running on different platforms] can be used
] --- # Variant: .darkblue[Semantic] HM3P (.darkblue[S]HM3P) DB .reference[S. Groppe, Semantic Hybrid Multi-Model Multi-Platform (SHM3P) Databases, ISIC 2021.] .more-condensed.no-padding.no-margin[
Semantic layer as glue between other models and platforms
New challenges, like integrating different types of reasoners in a transparent global reasoner
+
.darkgreen[Features of HM3P] databases
+
Easier .darkgreen[data integration]
-
.darkblue[Performance issues] may occur due to semantic layer ] --- # .darkblue[Types of DBMS] .more-condensed.no-padding.no-margin[
] --- # .darkblue[Multi-Platform Development] of DBMS .more-more-condensed[ -
.darkblue[Native Binaries] via C/C++
  - support of a new platform: .darkblue[porting code] is necessary
  - code .darkblue[close to hardware, fast execution]
  - direct access to .darkblue[native libraries]
  - .darkred[doesn't run in browser]
  - .darkblue[most server DBMS]: C/C++ code
- .darkblue[Java]/Java Virtual Machine (JVM)
  - runs on .darkblue[many platforms (without porting code)]
  - interpreted bytecode; with Just-In-Time compilation, .darkblue[speed comparable to native] execution
  - .darkred[no] direct access to .darkred[native libraries]
  - .darkred[runs neither on iPhone nor in browser]
  - .darkblue[many NoSQL/NewSQL/Cloud DBMS]: Java (or JVM language like Scala) code
- .darkblue[Code generation for query processing] via C/C++ or Janino-Compiler (JVM)
]
---
# .darkblue[Multi-Platform Development] with
.bold[Targets:]
.more-condensed[
- .darkblue[Most target platforms] are supported
- Splitting the project into .darkblue[platform-.bold[in]dependent and platform-dependent code]
- Platform-dependent code can be partly written .darkblue[in the programming language of the target platform] .darkgray[(e.g., Java for JVM, JS for Web)]
- Enables .darkblue[one code repository for various target platforms]
- Sharing of code between server & .darkgray[(various)] clients
- .darkblue[Avoids efforts to port code] .darkgray[(into other programming languages)]
]
--- # .darkblue[Multi-Platform Development] with
.more-more-condensed[
- .darkblue[Common Module]
  - .darkblue[Code independent of platforms] containing declarations for platform-dependent code without implementation.darkgray[, e.g.:]

.small-font.small-margin-bottom.no-text-indent[```kotlin2
// Declarations only: every platform module must provide a matching `actual`.
expect fun formatString(source: String, vararg args: Any): String
expect annotation class Test
```]
- .darkblue[Platform Module]
  - .darkblue[Implementation of] the .darkblue[platform-dependent code declared] within the common module .darkgray[(and other platform-dependent code), e.g.:]

.small-font.small-margin-bottom.no-text-indent[```kotlin2
// JVM implementation; the spread operator (*) passes the varargs on individually.
actual fun formatString(source: String, vararg args: Any) = String.format(source, *args)
actual typealias Test = org.junit.Test
```]
- .darkblue[Regular Module]
  - .darkblue[depends on platform modules, or platform modules depend on this module]
- .bold[However:] .darkred.bold[High compilation times], .darkgreen.bold[faster]: Including different sets of source code directories for different targets and configurations (e.g., centralized, Cloud, P2P, browser, ...)
]
---
# .darkblue[The Power of Multi-Platform: LUPOSDATE3000]
.more-more-condensed[
.more-more-condensed[
- .darkblue[ultra-fast in JVM]...
.smaller-font[
B. Warnke, M.W. Rehan, S. Fischer, S. Groppe: Flexible data partitioning schemes for parallel merge joins in semantic web queries in: BTW'21
]
]
.more-more-condensed[ - ...but also .darkblue[enabling web demos running completely in] the .darkblue[browser!]
S. Groppe, R. Klinckenberg, B. Warnke. Sound of Databases: Sonification of a Semantic Web Database Engine. PVLDB, 14(12), 2021
]
] --- # Using .darkblue[Hardware Accelerator] for optimizing Transaction Schedules .reference[
T. Bittner, S. Groppe, Avoiding Blocking by Scheduling Transactions using Quantum Annealing, IDEAS 2020
]
--- # .small-font32.bold[.darkblue[2 Phase Locking (2PL)] versus .darkblue[Strict Conservative 2PL]] .reference[
T. Bittner, S. Groppe, Avoiding Blocking by Scheduling Transactions using Quantum Annealing, IDEAS 2020
]
.more-condensed[
- .darkblue[required locks] to be determined by
  - .darkblue[static analysis] of transaction, .darkgray[or if static analysis is not possible:]
  - an .darkblue[additional phase at runtime] before transaction processing
- A. Thomson et al., "Calvin: Fast distributed transactions for partitioned database systems", SIGMOD 2012.
]
---
# .em[Optimizing] .darkblue[Transaction Schedules]
.reference[$^*$ M.R. Garey, D.S. Johnson and R. Sethi, The Complexity of Flowshop and Jobshop Scheduling, Mathematics of Operations Research 1(2):117-129, 1976]
.more-more-condensed[
- Variant of the job shop scheduling problem (JSSP):
  - Multi-core CPU
  - Process whole job (here: transaction) on core X
  - .darkblue[Schedule: $\forall$ cores: sequence of jobs] to be processed
  - What is the .darkblue[optimal schedule] for minimal overall processing time?
  - .darkgray[In addition to JSSP:] .darkblue[Blocking transactions not] to be processed .darkblue[in parallel]
- Example (figure): transaction schedule, with blocking transactions shown in black

.more-more-condensed[
- JSSP is among the .darkblue[hardest combinatorial optimization problems]$^*$
- $\Rightarrow$ .darkblue[Hardware accelerating] the optimization of transaction schedules
]
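
To make the search space concrete, here is a minimal brute-force sketch in Kotlin. It is not the implementation from the cited work: the names `Slot`, `permutations`, `bestSchedule` and `makespan` are hypothetical, transactions are numbered from 0, and it assumes that the transactions on one core run back to back without idle gaps.

```kotlin
// Time slot of one transaction on one core: [start, end).
data class Slot(val core: Int, val start: Int, val end: Int)

// All orderings of the given (distinct) items; fine for the tiny inputs of a sketch.
fun <T> permutations(items: List<T>): List<List<T>> =
    if (items.size <= 1) listOf(items)
    else items.flatMap { head -> permutations(items - head).map { rest -> listOf(head) + rest } }

// Overall processing time of a schedule.
fun makespan(schedule: Map<Int, Slot>): Int = schedule.values.maxOf { it.end }

// Exhaustive search over all orders and all core assignments.
// Assumption: transactions on a core run back to back, without idle gaps.
fun bestSchedule(lengths: List<Int>, cores: Int, blocking: Set<Pair<Int, Int>>): Map<Int, Slot> {
    require(lengths.isNotEmpty() && cores >= 1)
    val n = lengths.size
    var assignments = 1
    repeat(n) { assignments *= cores }            // cores^n ways to map transactions to cores
    var best: Map<Int, Slot> = emptyMap()
    var bestMakespan = Int.MAX_VALUE
    for (order in permutations((0 until n).toList())) {
        for (code in 0 until assignments) {
            val free = IntArray(cores)             // next free time per core
            val slots = mutableMapOf<Int, Slot>()
            for (t in order) {
                var core = code
                repeat(t) { core /= cores }
                core %= cores                      // digit t of `code` in base `cores`
                slots[t] = Slot(core, free[core], free[core] + lengths[t])
                free[core] += lengths[t]
            }
            // Blocking transactions must not overlap in time.
            val conflict = blocking.any { (a, b) ->
                slots.getValue(a).start < slots.getValue(b).end &&
                    slots.getValue(b).start < slots.getValue(a).end
            }
            val current = makespan(slots)
            if (!conflict && current < bestMakespan) {
                bestMakespan = current
                best = slots
            }
        }
    }
    return best
}

fun main() {
    // Lengths 2, 1, 1 on two cores; transactions 1 and 2 block each other.
    val schedule = bestSchedule(listOf(2, 1, 1), cores = 2, blocking = setOf(1 to 2))
    println("makespan = ${makespan(schedule)}")   // prints "makespan = 2"
}
```

The number of candidates grows factorially with the number of transactions, which is the motivation for hardware acceleration.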
] --- # Architectures of .darkblue[Emergent Hardware] .reference[Extended from
C. Plessl, Accelerating Scientific Computing with Massively Parallel Computer Architectures, IMPRS Winter School, Wroclaw, 2012
]
--- #
.darkblue[Quantum Computer] .reference[
T. Bittner, S. Groppe, Avoiding Blocking by Scheduling Transactions using Quantum Annealing, IDEAS 2020
] .more-more-condensed[
- use of .darkblue[quantum-mechanical phenomena such as superposition and entanglement] to perform computation
- Different types of quantum computer, e.g.
  - .darkblue[Universal Quantum Computer]
    - uses quantum logic gates arranged in a circuit to do computation
    - measurement (sometimes called observation) assigns the observed variable to a single value
  - .darkblue[Quantum Annealing]
    - metaheuristic for finding the global minimum of a given objective function over a given set of candidate solutions
    - i.e., some way to solve a special type of mathematical optimization problem
]
.smaller-font.darkblue[Quantum Circuit (Full Adder)]
.smaller-font.darkblue[Simulated versus Quantum Annealing]
] --- #
.darkblue[Quantum] versus .darkblue[Simulated Annealing] .reference[
T. Bittner, S. Groppe, Avoiding Blocking by Scheduling Transactions using Quantum Annealing, IDEAS 2020
]
--- # .em[Optimizing] .darkblue[Transaction Schedules] via .darkblue[Quantum Annealing] .reference[
T. Bittner, S. Groppe, Avoiding Blocking by Scheduling Transactions using Quantum Annealing, IDEAS 2020
] .no-margin.no-padding.more-more-condensed[
.more-more-condensed[
- .darkblue[Transaction Model]
  - $T$: .darkblue[set of transactions] with $|T| = n$
  - $M$: .darkblue[set of machines] with $|M| = k$
  - $O \subseteq T \times T$: set of .darkblue[blocking] transactions
  - $l_i$: .darkblue[length of transaction] $i$
  - $R$: .darkblue[maximum execution time]
  - .darkblue[upper bound] $r_i = R - l_i$ .darkblue[for start time of transaction] $i$
]
.more-more-condensed[
- Example
  - $T = \{t_1, t_2, t_3\}$, $n = 3$
  - $M = \{m_1, m_2\}$, $k = 2$
  - $O = \{(t_2, t_3)\}$
  - $l_1 = 2$, $l_2 = 1$, $l_3 = 1$
  - $R = 2$
  - $r_1 = 0$, $r_2 = 1$, $r_3 = 1$
]
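
The upper bounds in the example can be recomputed with a few lines of Kotlin. This is only a sketch: the function names are hypothetical, and the variable count assumes that start times are the integers 0 to $r_i$, which the slide does not state explicitly.

```kotlin
// Upper bound r_i = R - l_i for the start time of each transaction.
fun startUpperBounds(lengths: List<Int>, maxExecutionTime: Int): List<Int> =
    lengths.map { maxExecutionTime - it }

// Assumption (not stated on the slide): start times are the integers 0..r_i,
// so transaction i needs (r_i + 1) binary variables per machine.
fun variableCount(lengths: List<Int>, machines: Int, maxExecutionTime: Int): Int =
    startUpperBounds(lengths, maxExecutionTime).sumOf { machines * (it + 1) }

fun main() {
    val lengths = listOf(2, 1, 1)   // l_1, l_2, l_3 from the example
    println(startUpperBounds(lengths, maxExecutionTime = 2))             // [0, 1, 1]
    println(variableCount(lengths, machines = 2, maxExecutionTime = 2))  // 10
}
```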
- .darkblue[Quadratic unconstrained binary optimization (QUBO)] problems (solving is NP-hard) - A QUBO-problem is defined by N weighted binary variables $X_1, ..., X_N\in\{0, 1\}$, either as linear or quadratic term .darkblue[to be minimized]:
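
In a common notation, such an objective is a weighted sum of linear and quadratic terms. The weight names $c_i$ and $c_{i,j}$ below are chosen for illustration and are not taken from the slide:

$$\min_{X \in \{0,1\}^N} \; \sum_{i=1}^{N} c_i X_i + \sum_{1 \le i < j \le N} c_{i,j} X_i X_j$$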
] --- # .em[Optimizing] .darkblue[Transaction Schedules] via .darkblue[Quantum Annealing] .reference[
T. Bittner, S. Groppe, Avoiding Blocking by Scheduling Transactions using Quantum Annealing, IDEAS 2020
] .more-more-condensed[
- Multi-core CPU
  - Process whole transaction on core X
- Solution formulated as set of binary variables
  - $X_{i,j,s}$ is 1 iff transaction $t_i$ is started at time $s$ on machine $m_j$, otherwise 0
- Example (figure): transaction schedule, with blocking transactions shown in black

.more-more-condensed[
- Solution: $X_{1,1,0}$, $X_{3,1,2}$, $X_{4,2,0}$, $X_{7,2,1}$, $X_{6,2,3}$, $X_{5,2,6}$, $X_{2,3,0}$, $X_{8,3,5}$
]
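
The variables set to 1 can be read back as a schedule per machine. A minimal Kotlin sketch (the class `X` and the function `scheduleByMachine` are hypothetical names used for illustration):

```kotlin
// One variable X_{i,j,s} with value 1: transaction i starts at time s on machine j.
data class X(val transaction: Int, val machine: Int, val start: Int)

// Groups the variables set to 1 by machine and orders each group by start time.
fun scheduleByMachine(ones: List<X>): Map<Int, List<X>> =
    ones.groupBy { it.machine }
        .toSortedMap()
        .mapValues { (_, xs) -> xs.sortedBy { it.start } }

fun main() {
    // The solution listed above.
    val solution = listOf(
        X(1, 1, 0), X(3, 1, 2), X(4, 2, 0), X(7, 2, 1),
        X(6, 2, 3), X(5, 2, 6), X(2, 3, 0), X(8, 3, 5)
    )
    for ((machine, xs) in scheduleByMachine(solution)) {
        println("m$machine: " + xs.joinToString { "t${it.transaction}@${it.start}" })
    }
    // m1: t1@0, t3@2
    // m2: t4@0, t7@1, t6@3, t5@6
    // m3: t2@0, t8@5
}
```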
] --- # .em[Optimizing] .darkblue[Transaction Schedules] via .darkblue[Quantum Annealing] .reference[
T. Bittner, S. Groppe, Avoiding Blocking by Scheduling Transactions using Quantum Annealing, IDEAS 2020
] .more-more-condensed[ - .darkblue[Valid] Solution - A: each
transaction starts exactly once
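
One common way to express this condition as a QUBO penalty is a squared one-hot term. The following is a sketch under the assumption that start times range over $0, ..., r_i$; it may differ in detail from the formula in the cited paper:

$$A = \sum_{i=1}^{n} \Big( \sum_{j=1}^{k} \sum_{s=0}^{r_i} X_{i,j,s} - 1 \Big)^2$$

The term for transaction $t_i$ is 0 exactly when one of its variables is 1, and at least 1 otherwise.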
] --- # .em[Optimizing] .darkblue[Transaction Schedules] via .darkblue[QA] .reference[
T. Bittner, S. Groppe, Avoiding Blocking by Scheduling Transactions using Quantum Annealing, IDEAS 2020
] .more-more-condensed.no-margin[ - .darkblue[Valid] Solution - B:
transactions cannot be executed at the same time on the same machine
] --- # .em[Optimizing] .darkblue[Transaction Schedules] via .darkblue[Quantum Annealing] .reference[
T. Bittner, S. Groppe, Avoiding Blocking by Scheduling Transactions using Quantum Annealing, IDEAS 2020
] .more-more-condensed[ - .darkblue[Valid] Solution - C:
transactions that block each other cannot be executed at the same time
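
The validity conditions A, B and C can also be checked directly on a candidate solution. The Kotlin sketch below does this on a classical computer; it does not build the QUBO formulas, and all names (`Start`, `overlaps`, `violatesA`, ...) are hypothetical.

```kotlin
// A variable X_{i,j,s} set to 1: transaction i starts at time s on machine j.
data class Start(val transaction: Int, val machine: Int, val time: Int)

// True if the execution intervals [time, time + length) of a and b intersect.
fun overlaps(a: Start, b: Start, length: Map<Int, Int>): Boolean =
    a.time < b.time + length.getValue(b.transaction) &&
        b.time < a.time + length.getValue(a.transaction)

// A: each transaction starts exactly once.
fun violatesA(starts: List<Start>, transactions: Set<Int>): Boolean =
    transactions.any { t -> starts.count { it.transaction == t } != 1 }

// B: no two transactions run at the same time on the same machine.
fun violatesB(starts: List<Start>, length: Map<Int, Int>): Boolean =
    starts.indices.any { a ->
        (a + 1 until starts.size).any { b ->
            starts[a].machine == starts[b].machine && overlaps(starts[a], starts[b], length)
        }
    }

// C: transactions that block each other never run at the same time.
fun violatesC(starts: List<Start>, length: Map<Int, Int>, blocking: Set<Pair<Int, Int>>): Boolean =
    starts.any { a ->
        starts.any { b -> (a.transaction to b.transaction) in blocking && overlaps(a, b, length) }
    }

fun isValid(starts: List<Start>, length: Map<Int, Int>, blocking: Set<Pair<Int, Int>>): Boolean =
    !violatesA(starts, length.keys) && !violatesB(starts, length) && !violatesC(starts, length, blocking)

fun main() {
    val length = mapOf(1 to 2, 2 to 1, 3 to 1)   // l_1 = 2, l_2 = 1, l_3 = 1
    val blocking = setOf(2 to 3)                  // O = {(t_2, t_3)}
    val good = listOf(Start(1, 1, 0), Start(2, 2, 0), Start(3, 2, 1))
    val bad = listOf(Start(1, 1, 0), Start(2, 2, 0), Start(3, 1, 0))
    println(isValid(good, length, blocking))      // true
    println(isValid(bad, length, blocking))       // false: violates B and C
}
```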
] --- # .em[Optimizing] .darkblue[Transaction Schedules] via .darkblue[Quantum Annealing] .reference[
T. Bittner, S. Groppe, Avoiding Blocking by Scheduling Transactions using Quantum Annealing, IDEAS 2020
] .more-more-condensed[ - .darkblue[Optimal] Solution - D:
minimizing the maximum execution time
- Increasing weights: the weight of step $n$ is larger than those of all preceding steps 1 to $n-1$ $\Rightarrow$ preferring transactions ending earlier
- Weights in A, B and C $\geq$ 1 $\Rightarrow$ first priority is validity, second priority is optimality
] --- # .em[Optimizing] .darkblue[Transaction Schedules] via .darkblue[Quantum Annealing] .reference[
T. Bittner, S. Groppe, Avoiding Blocking by Scheduling Transactions using Quantum Annealing, IDEAS 2020
] .more-more-condensed[ - .darkblue[Overall] Solution - Minimize $P = A + B + C + D$
] --- # .em[Optimizing] .darkblue[Transaction Schedules] via .darkblue[Quantum Annealing] .reference[
T. Bittner, S. Groppe, Avoiding Blocking by Scheduling Transactions using Quantum Annealing, IDEAS 2020
] .more-more-condensed[
.more-more-condensed[
- Experiments on real .darkblue[Quantum Annealer] (D-Wave 2000Q cloud service)
  - first minute free (afterwards too expensive for our budget)
- Versus .darkblue[Simulated Annealing on CPU]
- .darkblue[Preprocessing time/Number of qubits]: $O((n\cdot k\cdot R)^2)$
]
] --- # .darkblue[Pipelining] for further Speedup .reference[
T. Bittner, S. Groppe, Hardware Accelerating the Optimization of Transaction Schedules via Quantum Annealing by Avoiding Blocking, OJCC 7(1)
]
--- # .darkblue[Caching of & Reusing] generated formulas to minimize preprocessing time .reference[
T. Bittner, S. Groppe, Hardware Accelerating the Optimization of Transaction Schedules via Quantum Annealing by Avoiding Blocking, OJCC 7(1)
] .more-more-condensed[
- .darkblue[The following parameters are fixed:]
  - .darkblue[the number $k$ of machines] (system does not change during runtime)
  - .darkblue[the number $n$ of transactions] (for batches of the same size)
- .darkblue[We observe:]
  - The (maximal) execution time .darkblue[$R$ and hence] the upper bounds of start times .darkblue[$r_1, ..., r_n$ depend on the lengths of the transactions],
  - the formulas .darkblue[$A$, $B$ and $D$ depend on] the fixed parameters .darkblue[$k$ and $n$, and on the lengths of the transactions], and
  - the formula .darkblue[$C$ is a sum of sub-formulas depending on $k$ and the lengths of blocking transactions as well as the identifiers of blocking transactions].

$\Rightarrow$ .darkblue[Caching $A$, $B$ and $D$, and sub-formulas of $C$] with the lengths of the transactions (ordered by length) as key (example: using $(1,1,2)$ as key)

Further information in paper!
]
---
# .darkblue[Optimizing Transaction Schedules] via .darkred.bold[Quantum Computing]
.reference[
S. Groppe, J. Groppe: Optimizing Transaction Schedules on Universal Quantum Computers via Code Generation for Grover’s Search Algorithm. IDEAS'21
]
--- # .darkblue[Grover]'s Search Algorithm
.more-more-condensed[
- .darkblue[Black box] function $f:${$0,...,2^b -1$} $\mapsto$ {$true, false$}
- Grover's search algorithm finds one $x \in${$0,...,2^b -1$}, such that $f(x)=true$
  - if there is only .darkblue[one solution]: $\frac{\pi}{4}\cdot\sqrt{2^b}$ basic steps, each of which calls $f$.
    Let $f'(b)$ be the runtime complexity of $f$ for testing whether $f(x)=true$: .darkblue[$\Rightarrow O(\sqrt{2^b}\cdot f'(b))$]
  - if there are .darkblue[$k$] possible .darkblue[solutions]: .darkblue[$O(\sqrt{\frac{2^b}{k}}\cdot f'(b))$]
]
---
# .darkblue[Overview] of Optimizing Transaction Schedules via Quantum Computing
--- # .darkblue[Encoding Scheme] .small-font36[of Transaction Schedules]
.more-more-condensed[
- $29 =$ .darkgreen[$000111$] .darkred[$01$] $_{binary} \equiv$ Core 0 .darkgreen[$[3,1]$] .darkred[$\mu_1=1$] .darkgreen[$[0,2]$] Core 1, some bits for .darkgreen[permutation] and some for .darkred[separators]
- $b = (m-1)\cdot \left \lceil log_2(n-1)\right \rceil + \left \lceil log_2(n!-1)\right \rceil$
]
---
# Generated .darkblue[Black Box] Function
.more-more-condensed[
- Quantum computation: .darkblue[circuit of quantum logic gates $\Rightarrow$ circuit must be generated] depending on the concrete problem instance; there is no general circuit that solves all instances of a problem
- Sketch of algorithm:
.smaller-font[
1. Determine separators and permutation: $O(m+n)$
2. Check validity of separators: $O(m)$
3. $\forall i$: Determine length of $i$-th transaction in permutation: $O(n\cdot log_2(n))$ with decision tree over transaction number
4. Check: Which separator configuration? For current case: $O(n)$
   - 4a. Determine total runtime of core and check if it is below the given limit: $O(n)$
   - 4b. Determine start and end times of conflicting transactions: $O(n\cdot log_2(min(n,c))+c)$ with decision tree over conflicting transactions (for $n>>c$) or transaction numbers (for $c>>n$)
5. Check: Do conflicting transactions overlap? $O(c)$

.darkblue[$\sum:O(n\cdot log_2(n)+c)$]
]] --- # .darkblue[Complexity Analysis] .more-more-condensed.smaller-font[
| Approach | CPU | Quantum Computer | Quantum Annealing |
|---|---|---|---|
| Preprocessing | $O(1)$ | $O(n^2\cdot c)$ | $O(m\cdot R^2\cdot(c\cdot m + n^2))$ |
| Execution | $O(\frac{(m+n-1)!}{(m-1)!}\cdot (n+c))$ | $O(\sqrt{\frac{n!\cdot n^m}{k}}\cdot(n\cdot log_2(n)+c))$ | $O(1)$ |
| Space | $O(n+m+c)$ | $O((n+m)\cdot log_2(n))$ | $O(m\cdot R^2\cdot(c\cdot m + n^2))$ |
| Code | $O(1)$ | $O(n^2\cdot c)$ | $O(m\cdot R^2\cdot(c\cdot m + n^2))$ |

.bold[$m$:] number of machines .bold[$n$:] number of transactions .bold[$c$:] number of conflicts .bold[$R$:] max. runtime .bold[$k$:] number of solutions
] --- # .darkblue[Number of Solutions]
.more-more-condensed.smaller-font[
| | $m=2$ | $m=4$ |
|---|---|---|
| $N$ | 8,589,934,592 | 2,199,023,255,552 |
| $k$ | 48,384,000 | 559,872 |
| $k$ for $\leq 1.25\cdot R_{opt}$ | 1,472,567,040 | 2,047,306,752 |
]
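
The figures for $N$ and $k$ can be related to the encoding formula for $b$ and to Grover's iteration count with a short Kotlin sketch. All function names are hypothetical. The value $n = 12$ is an assumption, since the slides do not state $n$; with it, the formula yields 33 and 41 bits, and $2^{33}$ and $2^{41}$ equal the $N$ values for $m=2$ and $m=4$. The factor $\frac{\pi}{4}$ for $k$ solutions is the usual textbook estimate.

```kotlin
import kotlin.math.PI
import kotlin.math.pow
import kotlin.math.sqrt

// Smallest e with 2^e >= v, i.e. ceil(log2(v)) for v >= 1.
fun ceilLog2(v: Long): Int {
    require(v >= 1)
    var bits = 0
    var power = 1L
    while (power < v) { power *= 2; bits++ }
    return bits
}

// n! as Long (sufficient for n <= 20).
fun factorial(n: Int): Long = (1..n).fold(1L) { acc, i -> acc * i }

// b = (m-1) * ceil(log2(n-1)) + ceil(log2(n!-1)), for n >= 2.
fun encodingBits(m: Int, n: Int): Int =
    (m - 1) * ceilLog2((n - 1).toLong()) + ceilLog2(factorial(n) - 1)

// (pi/4) * sqrt(N/k): approximate number of Grover iterations for k solutions among N candidates.
fun groverIterations(candidates: Double, solutions: Double): Double =
    PI / 4 * sqrt(candidates / solutions)

fun main() {
    val n = 12  // assumption: not stated on the slides
    for ((m, k) in listOf(2 to 48_384_000.0, 4 to 559_872.0)) {
        val bits = encodingBits(m, n)
        val candidates = 2.0.pow(bits)
        println("m=$m: b=$bits, about ${groverIterations(candidates, k).toInt()} Grover iterations")
    }
}
```

Accepting schedules within $1.25\cdot R_{opt}$ raises $k$ and therefore lowers the estimated iteration count.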
--- # .darkblue[QC4DB: .small-font36[Accelerating Relational Database Management Systems via Quantum Computing]] .more-more-condensed[ -
Project Website@Quantentechnologien
- Project .darkblue[funded by BMBF]
  - Duration 3 years, 1.8M Euros
- Topics: .darkblue[Query Optimization] and .darkblue[Optimizing Transaction Schedules] of an open source relational database management system
- Partners
  - .darkblue[University of Lübeck] (Coordinator Sven Groppe)
    - Hardware-Acceleration of Databases
    - Website: https://www.ifis.uni-luebeck.de/~groppe/
  - .darkblue[Quantum Brilliance] GmbH
    - Room Temperature Diamond Quantum Accelerators
    - Website: https://quantumbrilliance.com/
]
---
# .darkblue[Summary & Conclusions]
.more-more-condensed[
- .darkblue[Scheduling transactions as a variant of the job shop problem,] additionally considering .darkblue[blocking transactions]
  - .darkblue[Hard combinatorial optimization problem $\Rightarrow$ hardware acceleration]
  - .darkblue[Enumeration of all possible transaction schedules] for finding an optimal one
- .darkblue[Hardware acceleration via quantum .bold[annealing]]
  - Formulating the transaction schedule problem as a quadratic unconstrained binary optimization (QUBO) problem
  - Constant execution time in contrast to simulated annealing on classical computers
  - Preprocessing time increasing with larger problem sizes
- .darkblue[.bold[Grover]'s search]: $\approx$ .darkblue[quadratic speedup] on Universal Quantum Computers
  - Estimation of number of solutions for a further speedup
  - Estimation of speedup for suboptimal solutions being a guaranteed factor away from optimal solution
  - Code Generator available at https://github.com/luposdate/OptimizingTransactionSchedulesWithSilq
]