Finalized: Monday, April 13, 2015
Author(s): Pham, Q., T. Malik, B. Glavic and I. Foster
We present a light-weight database virtualization (LDV) system that allows users to share and re-execute applications that operate on a relational database (DB). Previous methods for sharing DB applications, such as companion websites and virtual machine images (VMIs), support neither easy and efficient re-execution nor the sharing of only a relevant DB subset. LDV addresses these issues by monitoring application execution, including DB operations, and using the resulting execution trace to create a lightweight re-executable package. An LDV package includes, in addition to the application, either the DB management system (DBMS) and relevant data or, if the DBMS and/or data cannot be shared, just the application-DBMS communications for replay during re-execution. We introduce a linked DB-operating system provenance model and show how to infer data dependencies based on temporal information about the DB operations performed by the application's process(es). We use this model to determine the DB subset that needs to be included in a package in order to enable re-execution. We compare LDV with other sharing methods in terms of package size, monitoring overhead, and re-execution overhead. We show that LDV packages are often more than an order of magnitude smaller than a VMI for the same application, and have negligible re-execution overhead.
Q. Pham, T. Malik, B. Glavic and I. Foster, "LDV: Light-weight database virtualization," 2015 IEEE 31st International Conference on Data Engineering, Seoul, 2015, pp. 1179-1190. doi: 10.1109/ICDE.2015.7113366
This material is based upon work supported by the National Science Foundation under Grant No. 1440327. Opinions, findings, conclusions or recommendations expressed are those of the authors and do not reflect the views of the NSF.