Indiana University Bloomington

Luddy School of Informatics, Computing, and Engineering

Technical Report TR556:
Dynamic Querying of Streaming Data with the dQUOB System

Beth Plale and Karsten Schwan
(Sep 2001), 22 pages
Abstract:
Data streaming has established itself as a viable communication abstraction in data-intensive parallel and distributed computations, occurring in applications such as scientific visualization, performance monitoring, and large-scale data transfer. A known problem in large-scale event communication is tailoring the data received at the consumer. This is an instance of the general problem of extracting data of interest from a data source, a problem that the database community has successfully addressed with SQL queries, a time-tested, user-friendly way for non-computer scientists to access data.

Leveraging the efficiency of query processing provided by relational queries, the dQUOB system provides a conceptual relational data model and SQL query access over distributed data streams. Queries can extract data, combine streams, and create new streams. The language augments queries with an action to enable more complex data transformations, such as Fourier transforms. The dQUOB system has been applied to two large-scale distributed applications: a safety-critical autonomous robotics simulation, and scientific software visualization for global atmospheric transport modeling. In this paper we present the dQUOB system and the results of a performance evaluation undertaken to assess its applicability in data-intensive wide-area computations, where the benefit of portable data transformation must be weighed against the cost of continuous query evaluation.

Index terms: wide-area computations, grid computing, data streams, publish-subscribe event channels, SQL, relational data model, database query processing.
