The contribution of this paper is threefold. The first is a synthetic database benchmark/workload: its queries are meaningful in a grid context and assume an underlying data model that is made up of realistic grid resources and populated with realistic data. The second is the application of this benchmark to three vastly heterogeneous ``database'' platforms: MySQL 4.0, a relational database that uses SQL as its query language; Xindice 1.1, a native XML database that uses XPath as its query language; and MDS2, an LDAP database that uses LDAP as its query language. The third is a metric that captures both tangible and less tangible aspects of information retrieval. Some of the results we obtain are unexpected; others quantify and substantiate suspicions that the grid community already holds.
The appendix gives extensive details of the benchmark, including the representation of queries and scenarios in English, SQL, XPath, and LDAP.