|
|
|
Project coordinator
|
Prof. Giorgio Ghelli |
|---|---|
|
Research team
|
Giovanni Conforti, Paolo Manghi, Carlo Sartiani |
|
Development team
|
Davide Basile, Emanuela Cardigli, Federico De Faveri, Nicole Lazzeri, Christian Tiralosi |
| Past Members | Giovanni Cambria, Giovanni Pardini, Giovanni Battaglia, Paolo Tomei |
Project funded by the FIRB GRID.IT project.
This project presents the architecture and the self-administration algorithms of XPeer, a P2P XML database system. The architecture has a two-layer structure: at the lower level peers export local XML data through an overlay network organized as a super-peer tree hierarchy. Such infrastructure allows XQuery XML queries to be compiled, optimized, and run only over a set of relevant peers, thereby avoiding broadcast.
Unlike existing P2P systems, XPeer is capable of self-organizing its administrative layer so as to adapt to changes in the network topology and in the workload. By exploiting these key features, the actual workload processing power of XPeer can scale linearly in the number of peers in the system.
The huge popularity of p2p systems is mainly due the diffusion of some file-sharing and file-transfer protocols, which proved that such systems can actually be efficient and robust in face of very high volatility. However, such systems are extremely limited in the kind of queries they can support, in general key-matching broadcast queries. Much of the research on the construction of real p2p databases is currently aimed at the p2p decentralization of data integration mediators (e.g., Piazza [10], and CoDB [8]). These systems, usually based on the GLAV paradigm [9], enable each peer to reformulate queries according to a given set of mappings, with no need of centralized mediator. Schema translation is crucial in many application fields, but implies constant human administrative work. Moreover, due to the effort required to define the translation and partake the system, such architectures are usually apt for long-connected peers. Other systems, like XPeer [13], address application fields where schema integration is not an issue, such as communities where a well-known common schema is exploited, or situations where nobody is going to define a schema mapping anyway, hence any query has to be exploratory. In this context, dynamicity and scalability are the central concerns. Our Contribution This paper describes the architecture of XPeer [13], a p2p database system for XML data. XPeer differs from other p2p systems for two main key ideas. First, XPeer decouples query compilation, demanded to an overlay network, from query execution, which is directly managed by peers. Compilation and execution decoupling allows for a better handling of complex FLWR queries (e.g., nested queries), and allows the overlay network to perform distributed optimizations on the query plan and avoid broadcast. Unlike other p2p systems, for XPeer to work, no human intervention is needed and no nodes have to be permanently connected with special administration duties. Indeed, nodes in the overlay network are themselves peers, selected by the system itself to perform extra administrative tasks. XPeer adopts selfadministration policies, namely root cloning protocols and network protocols, to provide workload balance and robustness to network failures. The protocols dynamically optimize the number of administrative nodes required by the overlay network to handle the query demand of the connected peers. XPeer builds p2p databases where peers expose their local XML repository only during their connection time and where human intervention is limited to the explicit registration of a peer to the overlay network. Peers can specify complex XQuery queries, which will be compiled by the overlay network and run on relevant peers only. The system will automatically maintain the overlay network during its lifetime, counting on the available peers. An example of application scenario is given by GRID networks. Here applications are run over a network of peers, each exposing some of its resources, such as CPU or disk space, for some time. For an application to work, a minimum amount of resources is needed and must be guaranteed during application lifetime. To this aim, applications require a knowledge discovery infrastructure to identify, among the connected nodes, those that offer the required resources. XPeer would permit connected nodes to export information about the resources they provide and could be used by applications to identify the set of nodes that may contribute to their run.
In this section, we will describe the architecture of the overlay network of XPeer as well as its selfadministration policies. The overlay network of XPeer is devoted to the compilation of queries submitted by peers, which are directly executed by peers. This is in sharp departure from usual p2p systems, where query execution and compilation are interleaved, and, as shown below, is inspired by data integration systems. The overlay network is formed by a two-level tree hierarchy. The lower level of the tree comprises a set of super-peers, each one managing a cluster of peers. The upper level of the tree is formed by a root, which links clusters together. Peers connect to the system by registering to a super-peer and transmitting their schema to it. The union of all peer schemas is used by the super-peer as its own schema, which is in turn exported to the root, which thus gathers a view of all the data in the system. Any peer schema modification or peer deregistration is communicated as an update message to the responsible super-peer, which in turn notifies the root. We shall see that the root is essentially a set of administrative peers, namely clones, that replicate the same metadata and that are kept synchronized by means of a loose synchronization algorithm. Since the separation between query compilation and execution, the use of a tree-shaped hierarchy as well as the presence of clones are quite unfamiliar in p2p systems, a detailed discussion about them is necessary. The overall architecture is shown in Figure 1.
Figure 1. XPeer Architecture
The separation between compilation and execution is borrowed from data integration systems like YAT [6], where the query mediator first compiles a given query and then dispatches subqueries to the involved data sources. In XPeer, when a peer pi wants to execute a query q, it submits q to the overlay network through its super-peer spi. The overlay network matches the query structure with the schema information available at the root node. Such process identifies the super-peers responsible of potentially relevant peers, and asks them to return the corresponding addresses to pi to complete the compilation of q. The algebraic expression resulting from q is executed by pi, which directly contacts the peers involved according to the execution plan strategy. The decoupling of query execution from query compilation is not frequent in p2p systems. We opt for this solution for two main reasons. In the first place, most peers that do not contain relevant data can be discarded during query compilation, if fairly detailed schema information is used. As a consequence, the number of peers to be contacted during query execution significantly decreases. A similar effect can be obtained in usual p2p systems by using, at run-time, detailed routing tables; unfortunately, peers usually have routing information about their neighbors only, so the exploration of a wide fragment of the peer network is still necessary. In second place, complex queries, like those of the FLWR core of XQuery, need to be optimized for an efficient execution in a p2p system. Distributed query optimization is nearly impossible when a flooding scheme is used. On the contrary, the overlay network, by accessing schema information and statistics, can easily perform distributed optimization choices; for instance, when replicas are allowed, the overlay network can choose the best replica for a given dataset. Same considerations hold for nested query optimization: by relying on schema information, the overlay network can apply unnesting techniques, similar to those of [7], that are precluded in the absence of schema information. ... ... ... root super-peers peers clones Fig. 1. Overlay network architecture
The XPeer architecture described above is clearly prone to workload issues, for all query compilation requests issued by the peers reach the root, via the responsible super-peers1. This problem, which hindered the adoption of tree overlays in p2p systems, is solved in XPeer by enforcing network protocols and root cloning protocols, as described below.
XPeer is capable of adapting its overlay to changes in the topology and, in particular, in the workload of the system. The administrative nodes of the two levels of the hierarchy use different policies to react to a change, most notably an increasing workload. Consider, for instance, a super-peer spi managing a cluster ci, and assume that peers in the cluster start sending too many query requests to spi. spi discovers this situation by monitoring its message queues; hence, when the length of the query queues exceeds a given threshold for a given period of time, spi can decide to split the cluster. Cluster splitting consists in the creation of a new cluster, managed by a new super-peer and containing a subset of ci peers. Cluster splitting, hence, requires spi to find a peer pj willing to become super-peer, and to identify an adequate subset of ci peers to move in the newly created cluster. If no peer wants to adopt peers in ci, spi just disconnects these peers. At this time, the self-management algorithm of spi just selects a minimal set of peers of ci to be discarded from the cluster; in future versions of XPeer, we plan to implement more sophisticated choice strategies based on schemas too.
Cluster splitting allows a super-peer to distribute its load to another (new) super-peer; however, the splitting does not change the overall load reaching the root of the hierarchy, since the combined load of the new cluster and of the existing cluster remains unchanged. Changes in the root workload are addressed by generating root clones. Under this technique, the root of the hierarchy is a virtual node, comprising a collection of nodes (clones) hosting copies of the union schemas. Clones are synchronized through a lazy synchronization algorithm, which distributes synchronization messages among distinct time intervals, so that the synchronization overhead for each clone is constant. Of course, a lazy synchronization may lead to inconsistencies in the clone schemas, which are, in general, not aligned. These inconsistencies do not represent a real problem, since p2p systems are usually assumed to return incomplete or even approximate responses to user queries. Moreover, the synchronization protocol ensures, under reasonable assumptions, that every single inconsistency is eventually resolved after some time intervals. By monitoring its message queues, a clone can determine if its load is too heavy; in that case, it searches for a peer willing to perform the administrative tasks of a clone. Since super-peers distribute query and update requests among clones in a random way, the average queue length is almost the same for all clones, hence race conditions in the generation of new clones may arise. To overcome this issue, 1 first-n queries, i.e., queries asking for a number n of results only, may not reach the root and could be introduced as an optimization. the system adopts a twofold strategy: first, each clone has a different threshold for the activation of the cloning process, obtained by using a semi-random technique, so to decrease the probability of concurrent cloning activations; second, the cloning process is governed by a distributed consensus algorithm[11] that allows the system o take almost consistent decisions.
We briefly compare XPeer with three notable p2p data sharing systems: Piazza [10], Maier’s system [12], and KadoP [5]. These systems are representative of unstructured and structured data sharing systems. For an ample review of p2p database systems, we refer the reader to [3].
Piazza is a decentralized data integration system for XML data. The system is formed by a network of loosely interconnected and autonomous peers, which share XML data and schemas. Each peer is connected to a very limited set of neighbors (usually one or two) through a set of schema mappings, that are used for reformulating queries. Peer schema mappings are designed by a local administrator, who is also in charge of maintaining and updating local mappings. The presence of mappings limits the scope of Piazza to almost static environments, where the network topology changes rarely. Query processing is based on a flooding algorithm that, even if optimized, severely limits the scalability of the system. Hence, Piazza focuses on sophisticated schema integration but does not address our core issues of high dynamicity, zero administration, and query routing.
In [12], the authors describe a coordinator-free architecture for distributed XML query processing in the context of p2p systems. The architecture is based on two key ideas: mutant query plans (MQP), and multi-hierarchic namespaces. An MQP is a logical query plan, where leaf nodes may consist of URN/URL references, or of materialized XML data. An MQP traverses the system, carrying partial results and unevaluated sub-plans, until it is fully evaluated, i.e., it becomes a constant XML fragment. MQPs are routed in the system according to information derived from multi-hierarchic namespaces. Indeed, authors assume that data contributed by peers are semantically connected, i.e., they are part of the same namespace, where a namespace is formed by several category hierarchies. This assumption makes the system not adequate when data are semantically heterogeneous. Moreover, since MQPs browse the network on the basis of namespace information, the number of messages required for query evaluation is comparable with that of flooding systems.
KadoP is a p2p content sharing system based on the use of DHT tables (in the form of FreePastry [1]) and Active XML documents [4]. KadoP allows users to share XML documents, web services, as well as AXML documents (e.g., XML documents with embedded service calls); resource sharing is improved by the use of ontologies. Peers are organized in a DHT ring, where a distributed full-text index about documents and web services is stored; this index is used during query processing for locating interesting data and services. The presence of the index makes peer connections quite expensive, since newly shared documents must be indexed by the system; for the same reason, document updates, while automatically managed by service call-backs, may require expensive index updates, which imposes a clear limit to the volatility of the system. On the contrary, XPeer processes updates, both in the topology and in the data, with a constant number of messages. Queries are processed in KadoP by (1) locating interesting resources on the network, and by (2) directly contacting involved peers, as happens with XPeer. Resource location requires the system to perform many key lookups in the DHT index, with a significant messaging cost, which depends on the dimension of the query, as well as on the data and services involved by the query.