SlideShare a Scribd company logo
1 of 8
Download to read offline
Distributed Databases for Challenged Networks
                                             Vinoth Chandar, Ashwini Athalye


    Most delay tolerant networking research has focussed          for query processing in a distributed Database for DTN
on developing routing protocols that optimize data delivery       environments. The benefits that an application can enjoy from
of a single data unit from a source to potentially multiple       such a system are location transparency and data decoupling.
destinations. However, distributed applications built on top of   Achieving location transparency in a DTN setting obviously
such intermittently connected networks, typically involve more    results in more flexible applications.
complex data access mechanisms. Distributed databases are
widely used to capture such interactions. Thus, exploring the       To our knowledge, no previous work has examined this
feasibility of building a distributed database system for DTNs    problem of distributed query processing in a completely
is an intriguing research problem. Applications can benefit        decentralized DTN environment. This work tries to put into
immensely from the resulting location transparency and data       perspective the issues that are to be addressed in order to
decoupling.                                                       build a practical system, over which multitude of interesting
   The paper proposes a distributed database system for such      applications can be written. The problem requires additional
networks. The key contributions of this work are 1) practical     mechanisms to improve latency involved in answering queries,
heuristics for query scheduling 2) a prereplication scheme        due to the inherently time varying nature of the links. Such a
that reduces the cost of on-demand retrieval by actively pre-     scheme is vital to ensure the practical usefulness of the system.
caching data. To our knowledge, this is the first work to
examine these issues in an unified framework. We describe             The paper is organized as follows. Section 2 details the
the results of several simulations conducted on a wide variety    relevant background and the other related research in this
of artificial and real world traces.                               area. Section 3 presents the motivations and challenges to
                                                                  solving this problem.Section 4 discusses the problem of query
                     I. I NTRODUCTION                             scheduling in DTNs. Section 5 presents the prereplication
   DTN is a category of networks that is characterized by         scheme proposed. Section 6 and 7 contain details about meta-
intermittent connectivity, irregular connectivity patterns,       data exchanges involved and the packet scheduling criteria
unpredictable communication opportuntity and diverse              involved. Section 8 presents detailed results from simulations
devices. As explained in [1], these types of networks are         of the scheme under various scenarios. This is followed by a
becoming important with the pervasiveness of wireless             short observations section. Section 10 concludes after outlining
technology. For example, the campus wide deployment               some of the future directions that we intend to pursue.
of wireless technology coupled with the large number of
users having laptops/cell phones with wireless capability,                  II. BACKGROUND AND R ELATED WORK
provides enormous opportunities for exchange of ’good-to-            Traditional query processing schemes in distributed
know’ information amongst students. Such good-to-know             databases construct an execution plan for an input query based
information can substantially augment the quality of decision     on the sites participating in the query and the size of the
making in daily life. In the past, there have also been many      data set that needs to be moved from one site to another
real world DTN deployments like Zebranet[11], DakNet[12]          for the execution of the query. Algorithms that optimize for
and DieselNet[13]. Zebranet tries to study the movement           response time typically do so by reducing the amount of data
patterns and the behaviour of zebras, by attaching sensor         shipped between sites, to execute the query. These schemes
nodes to the Zebras. Daknet is one of the earliest practical      [5] however assume a fixed network topology which means
DTN systems, that provides Internet connectivity to rural         that there are statically defined paths between nodes and that
villages in India. DieselNet is a DTN infrastructure deployed     they are always up. This is contrary to what DTNs possess.
over the bus transport system in Amherst. These practical         In a DTN, simple optimization for communication cost may
systems have clearly proved the utility of delay tolerant         not minimize the response time, since there can be schedules
networking.                                                       which involve earlier contacts between sites, leading to lower
                                                                  delays, even with larger data shipping. Hence query scheduling
   Thus, the wide applicability of DTNs call for a deeper look    is consequently more complex for such an environment. This
into the needs of the applications that can be supported over     is the focus of the research conducted in this paper.
it. [2] stresses on the need to develop robust mechanisms            ICEDB [8] proposes a system, where in users query a web
for DTN based collaborative applications to improve data          portal, connected to devices on the road. Devices are sensor
propagation rates in realistic deployments. Using this            nodes that build databases with the information of interest. The
information as a guiding light, we propose mechanisms             portal is responsible for sending the user queries and getting
the results back from the devices. The queries are annotated       track the nodes that carry interesting information, increasing
with priorities and the devices schedule results to be moved       the complexity of the application.
to the portal based on the priorities. This work illustrates          Hence, if applications query using an indirection of database
the utility of using database queries to retrieve content from     tables, which are then mapped to the nodes that contain
intermittently connected nodes, since the application can work     fragments of the global information, it can lead to reduced
with a simpler abstraction of a database table. [6] builds a       overhead for applications and seamless integration of different
federated database over Bluetooth. However, it assumes that        applications to work with each other. Moreover multiple
there is no mobility and hence, does not apply to most practical   applications can be built on top of this layer of abstraction
scenarios. [9] is a mobile surveillance system, where buses        since they no longer need to have node specific information.
upload spatially and temporally tagged images to a base station    In essence, building a distributed database system over DTNs
placed in the bus depot. These data stores are then queried        would foster the development of myriad of interesting appli-
from a central cloud server.                                       cations, that in turn enrich the practical benefits of conducting
   A survey of the related work gives insight into some            delay tolerant networking research.
practical aspects of DTNs. [7] formally analyzes the problem
of DTN routing at various degrees of knowledge of the future                         IV. Q UERY P ROCESSING
and the network. MaxProp[10] is a DTN routing protocol that
builds relative meeting probabilities and uses it to forward          In conventional distributed databases, selectivity factors are
packets to a destination. RAPID [3] is a DTN routing protocol      used to perform reductions on JOIN operations and optimize
that can be optimized specifically for a routing metric such as     for the response time of the query. Computing the selectivity
worst-case delay, average delivery delay. In this paper DTN        factor is a linear time operation that can be performed instan-
routing has been formulated as a resource allocation problem       taeneously. However, the traditional schemes break in a DTN
under a realistic scenario where both storage and bandwidth        since there is no way of maintaining selectivity factors in a
are limited. The selected routing metric is used to prioritize     DTN. Most traditional schemes reduce the response time by
a packet at each hop and locally optimize the delivery of          reducing the amount of data shipped across the network. Since
packets in the order of the change in their utility function.      the data size shipped, directly contributed to the delay involved
Another interesting work on DTN routing [4], DPSP, is tuned        in processsing the query , in a DTN setting, our approach
for applications that use a publish scheme to advertise content,   largely pursues this path.
on topics and the nodes subscribe to different topics. This           We work with approximations of contact times/probabilities
scenario does not require either the content provider or the       through metadata exchanges between the nodes. Deadlines are
content subscriber to have knowledge about each other. Also,       useful in identifying the time until the data is valid. This
the knowledge given out by the publisher and the subscriber is     correlates closely with the type of ’good-to-know’ informa-
made use of to effectively allocate the resources and to prevent   tion that the system intends to work with. Queries can be
flooding of the network with replicated packets.                    tagged with deadlines to indicate the amount of time that the
   To summarize, the approach to query processing in DTNs          results will be useful for. We also intend to reduce the query
has been largely centralized i.e one central server holds the      processing delay, by sending results as and when intermediate
node-to-data mapping and uses it to perform query scheduling.      results are available, without waiting for the completion of the
Most practical DTN routing mechanisms work with heuristics         entire query, that potentially provides additional results. The
and rely on empirical results to prove their usefullness, in an    duplicate results can be filtered out through a delta detection
attempt to keep the problem tractable.                             mechanism at either the query initiator or the intermediate
                                                                   nodes, using standard techniques.
              III. M OTIVATION AND C ONTEXT                           We assume that applications that talk to each other have
                                                                   knowledge about the database tables. The knowledge about
   DTNs offer opportunities to develop some exciting dis-          the horizontal fragmentation, including cardinalities, can be
tributed applications. For example, a newcomer to Austin           built based on metadata exchanges. The packets that are
might want to know about good places to dine, live music           sent through the network can be either those carrying the
events etc. A distributed peer to peer application can help        query or packets that carry the results. We divide the set
him to query the people around for this information, even          of queries that are issued by applications into Unions and
without Internet connectivity. Typically, this good-to-know        Joins. Unions consist of queries on single tables that are
information is not very time sensitive and hence, can be           horizontally fragmented across various nodes. Unions are
delivered over DTNs. Existing routing protocols in DTNs            simply sent to all the nodes that the query initiator knows
typically operate with a known source and destination. The         of, which contain the target query. Joins consist of queries on
application developer needs to schedule packets to different       multiple tables that can be horizontally fragmented and have
destinations, according to the needs of the application. The       at least one common attribute. Joins require a much more
problem becomes worse in case of open networks, where not          complicated query scheduling mechanism. Joins, which are
every node knows each other, even though unknown nodes             costly operations even in a conventional distributed database
may carry relevant information. Thus, the application needs to     system, are far more expensive in DTN environments.
For distributed databases, a number of factors contribute              For each t in T:
to heuristics that determine how the query gets processed.                    B[q] += cardinality(t)
These are typically the cardinality of a database at a node,          Processing node = i , such that
the size of a tuple in the database, and together the two              B[i] = max(B[q]), for all q in Q.
numbers give the size of the data that needs to be moved
between sites for complex queries. Query processing in DTN            C. Nearest site
would require heuristics to be designed such that they take              This heuristic picks the site which all of the relevant
into account, packet scheduling criteria for DTN as well as           sites (sites that house the relevant data) are more likely to
database query optimization parameters. In this paper, we             meet. Note that this processing site may not be amongst the
discuss two heuristics for scheduling Joins 1)Heaviest site           relevant sites. The relevant sites may accumulate data at some
2)Nearest site.                                                       intermediate node, that every node is most likely to meet.
                                                                      This heuristic tries to directly optimize for the time spent in
A. Optimal scheme
                                                                      getting the required data to the processing site. The meeting
   Assuming the knowledge of contact times and contact                probabilities used in MaxProp[10] are used directly to compute
durations of the nodes helps in improving the performance             the nearest site.
of routing mechanisms [7]. In many real life scenarios, such
accurate contact information can be obtained, such as schedule        N : set of nodes known to current node
of students in a campus. But, when this information is not            Q : set of all nodes that are relevant
available,statistical approximations can be used in their place,       to this query.
based on contact histories. Even with this complete informa-          C[n] : cummulative meeting probability
tion, computing an optimal path is NP Hard [3].Given that the           of nodes in Q to each node n in N
DTN routing problem, even with a perfect Oracle is NP Hard,           Pseudo-code:
the problem of scheduling the queries, which involves moving          For node n in N:
relevant data to the processing site in the best possible manner          For each q in Q:
also becomes NP hard, since the query processing problem is                   C[n] += meetingprobability(q,n)
composed of independent subproblems of the routing problem.           Processing site = i such that
   The non availability of selectivity factors, precludes any             C[i] = max(C[n] ), for all n in N
chances of reducing joins at intermediate sites. Also, with                               V. P REREPLICATION
the prereplication scheme discussed in section 6, even floating
approximate and possibly stale values of the selectivity factors         As mentioned earlier, the cost of on-demand retrieval is
in the network might not perform well, since the prereplication       much higher in a delay tolerant network, since the links may
would invalidate the selectivity factors very quickly due to          not become available again for a long time. In such cases,
addition of new content at the node. Moreover, selectivity            actively moving content to nodes that are likely to be involved
factors are very specific to a particular query. For a selectivity     in relevant queries, would help reduce the latency involved in
factor to be useful, the new query needs to match very closely        answering the queries. Each node builds a popularity index
to the original query. Hence, the relevant data must be brought       for table combinations that are involved in the queries that it
to a single common processing site, for query evaluation. In          sees. Nodes also build the popularity index through exchange
this paper, we primarily focus on choosing this processing site.      of this information amongst themselves. Thus, a global pop-
Hence, we try to approach the problem using heuristics, local         ularity metric is built for all the table combinations. Over a
optimizations, and techniques from existing work on DTN               longer duration of system operation, this metric captures the
routing and distributed database query optimization.                  probability that a future query will be generated for each of
                                                                      the table combinations. Let us define the servability, of a node
B. Heaviest site                                                      as the amount of popular content possessed by the node. When
   This heuristic tries to optimize for lower delays, by moving       two nodes meet, they exchange more popular content amongst
the data to the site which possesses the largest amount of            themselves.
data relevant to the query. The heaviest site is chosen as the
                                                                      When a node meets another,
processing site. Note that the processing site is one among
the sites that have the relevant data. The intuition is that if the    * Updates popularities from other node
heaviest site is prevented from shipping its data elsewhere,           * Computes a Integer Knapsack to deter
                                                                         -mine the transfers that are to be
the lesser bandwidth consumption would result in significant
                                                                         made to increase servability
savings in the response time.
                                                                       * Utility of transfering a table frag
Q : set of all nodes that are relevant to                                -ment to another node is the amount
 this query.                                                             of increase in servability due to
T : set of tables that are required.                                     that transfer
Pseudo-code:                                                           * Cost of transfer is the bandwidth
For each node q in Q:                                                    consumed
* The maximum cost that can be incurr                               -ed by deadline
   -ed is total bandwidth of the conta                              Exchange prereplication content.
   -ct Schedules the transfers that it
                                                                       Note that the prereplication content is sent at the last, in
   needs to make to the other node
                                                                    effect , using only any amount of spare bandwidth that may
A. Returning results                                                be available. Thus, prereplication is used as an add-on and it
                                                                    does not affect the data traffic.
   As more content is prereplicated, the more popular content
is widely spread across the nodes in the network. As a result,                               VIII. E VALUATION
the number of relevant sites in a join query increases with
                                                                       We have implemented our scheme on a DTN simulator
the amount of prereplication. Hence, the latency involved in
                                                                    called [14] The ONE [see figure 1]. This Simulator provides
processing the join also increases prohibitively. To keep the
                                                                    the following desirable features which led us to eventually
latency low, two mechanisms can be used to return the results.
                                                                    choosing it for our project:
   • Best : The processing site waits for all table fragments
                                                                       • Various Routing mechanisms like Epidemic Routing,
     from relevant sites to arrive, to begin processing the query
                                                                         MaxProp Routing, Spray and Wait etc. are supported.
   • Fast: The processing site processes the query and sends
                                                                       • Different mobility models for simuating movement of
     the result out, as soon as atleast one portion of all the
                                                                         nodes.
     tables involved in the join are available.
                                                                       • A visualization tool to view node movement as well as
   The two mechanisms are a tradeoff between accuracy of                 data exchanges.
results (best) and response time for the query (fast). We
considered two mechanisms at the two extremes for simplicity
of analysis. Obviously, an intermediate scheme that balances
between the two can also be used.
                VI. M ETADATA EXCHANGES
   The unstructured nature of DTNs requires that a mechanism
be in place for each node to either have a global view of the
nodes in the system or have some way of updating its local
information based on interactions with other nodes. The latter
involves learning state of neighbors by exchanging relevant
metadata between nodes in contact. Gradually as more and
more node exchanges happen, the metadata grows and gets
replaced with newer state.
   The following meta data are exchanged
   • Meeting probability, as described earlier
   • List of tables possessed by each node, and their cardinal-
     ities
   • Popularity indexes
                                                                                   Fig. 1.    Screenshot of ONE Simulator
                 VII. PACKET SCHEDULING
                                                                       We have implemented our scheme on top of the MaxProp
  The packet scheduling scheme used must reflect the goals           Router, provided by the simulator. The router maintains meet-
of the database layer above. It needs to decide on specific          ing probabilities of each node with other nodes it meets, which
per-packet and per-packet-per-hop metrics to replicate packets      is useful for nearest site selection.
in the network. We chose replication over forwarding since             As mentioned earlier, there is no existing work in this
replication has better chances of delivering the packets quickly,   space to compare our scheme against. Hence we have tried to
even though it involves more resources. The following sum-          evaluate our scheme by testing it against different movement
marizes the packet scheduling scheme used in the paper.             models. We have also varied the number of nodes in the
When node A meets node B                                            network to see the effect of node size on the performance
Exchange metadata                                                   of the proposed scheme. Through our evaluation, we hope to
Exchange directly deliverable content                               get an insight into the goodput of the scheme as well as its
  ordered by deadline                                               applicability to different DTN scenarios.
For all packets p in buffer:                                           We have analysed our scheme against 5 movement models:
    if B has higher meeting probability                                • Random Way Point: This movement model captures
       to dest[p] than A:                                                completely randomized node movement.
          Pick p for transfer                                          • Shortest Path Based: This model simulates the movement
Send all picked packets in buffer order                                  of people. The nodes move along fixed paths, however
their movement along these fixed paths is random and           real world events are roughly deterministic, it serves as a good
     based on certain points of interest.                          starting point for our analysis. The graphs in figure 2,3 convey
  • Map Based: This movement model simulates DTN where             results generated for 20 and 40 nodes in a Random Way Point
     nodes have periodicity in their movement. We have             model.
     utilized the movement of trams in Helsinki as part of
     this model.
  • DieselNet: This is a bus system that has been deployed
     by UMass consisting of 34 buses.
  • ZebraNet: This is a trace of a network of 5 zebras which
     have sensors deployed on them.
  The first 3 movement models are provided by the simulator.
Due to simulation time constraints, we have performed anal-
ysis on these models with 20 and 40 nodes. The remaining
2 movement models - DieselNet and ZebraNet are real traces
which were converted into a format that was acceptable to the
simulator. Hence we have been able to run our scheme on
simulation models as well as real traces.
A. Metrics
   To evaluate the performace of our scheme, we consider the
following metrics.
                                                                      Fig. 2.   Query processing throughput for Randomly moving nodes
   • Percentage of answered queries: This metric was choosen
     to get the throughput of the system for the total of
     forward(path along which query is sent) and reverse
     paths(path along which query result is sent).
   • Percentage of Unions: This metric gives the peformance
     for single table queries.
   • Percentage of Joins: This metric gives the peformance
     for multi table queries. This metric is especially useful
     to verify our scheme since we expect performance to
     improve for joins, especially with our processing site
     selection scheme and the prereplicaton scheme.
   • Average query latency: This metric is again important
     to understand the behavior of our scheme. We expect
     our scheme to reduce the query latency i.e. decrease the
     delay between the time when the query was issued by a
     node and the time that it received the results. The average
     query latency also provides a useful insight into the kind
     of deadlines that each DTN scenario is typcially able to
     handle.                                                                     Fig. 3.   Latency for Randomly moving nodes

B. Scenarios
  We present the results of the metrics introduced in the          D. Map based Tram movements
previous section, for three scenarios.                                This model simulates the movement of trams in Helsinki,
  • No repl - This represents the behavior of the system, with     using WKT files that were generated from a real GIS system.
     no prereplication, ’best’ return of results and heaviest      It simulates periodicity in meeting patterns. Figures 4,5 show
     site query processing. This serves as the vanilla scenario    the performance of the scheme for 20 and 40 people in the
     against which we compare the next two scenarios.              system.
  • Repl(Heavy,Fast) - This involves prereplication, ’fast’
                                                                   E. Shortest path People movement
     return of results and heviest site query processing.
  • Repl(Near,Fast) - This is same as Repl(Heavy,Fast) ex-            With the popularity of portable devices like hand held PDA’s
     cept that nearest site query processing scheme is used.       and cell phones, DTNs with such devices as nodes is another
                                                                   potential scenario. Hence the movement patterns for people
C. Random Way point                                                give an idea of connectivity in such an environment. The
  The random way point model simulates randomly moving             scheme was evaluated for 20 and 40 nodes in the system [see
nodes. Although, it is a unrealistic mobility model since most     figure 6,7].
Fig. 4.   Query processing throughput for Tram system                        Fig. 7.    Latency for People interactions



                                                                     were generated by contacts between metro buses, in Amherst.
                                                                     The number of nodes in the DieselNet traces is 34. Figures
                                                                     8,9 present the results of the simulation run for about a week.




                   Fig. 5.   Latency for Tram system




                                                                               Fig. 8.   Query processing throughput for DieselNet



                                                                     G. ZebraNet
                                                                        Monitoring wildlife and natural phenomena is another in-
                                                                     teresting area for DTNs. It also presents a good case, where
                                                                     a query processing scheme could be invaluable. ZebraNet is
                                                                     a real world deployment of sensors on 5 zebras. See figure
                                                                     10,11 for results.
                                                                                               IX. O BSERVATIONS
                                                                        In general, Pre-replication provides improvements in latency
     Fig. 6.   Query processing throughput for People interactions
                                                                     as well as the success ratio of the queries.However, the extent
                                                                     of benefit varies from one movement model to another.
F. DieselNet                                                            • The pre-replication benefits are around 30-40% for suc-
   DieselNet is a realtime trace which exemplifies the appli-               cess rate.
cability of our scheme for a real world scenario. The traces            • Benefits for latency are around 30-50%.
latency.
                                                                          X. C ONCLUSION AND F UTURE W ORK
                                                                  This is the first work to address the problem of query
                                                               scheduling in a completely decentralized and adhoc , delay
                                                               tolerant network. This is a very interesting research area with
                                                               enormous scope for improvement. We intend to pursue further
                                                               research along these lines.
                                                                  • Building a DB Aware packet replication scheme with
                                                                    multicast capability would help improve performance
                                                                    since the query issuals are inherently multicast traffic.
                                                                  • If the intermediate nodes cache previous results, queries
                                                                    could be answered from these caches, improving latency.
                                                                    However, this would involve complex tradeoffs in buffer
                                                                    management. Also, the applications should be willing to
                                                                    accept some degree of staleness in the results.
                   Fig. 9.    Latency for DieselNet
                                                                  • Given that many real world DTN environments have near
                                                                    periodic events, a local prereplication scheme could be
                                                                    used wherein the nodes perform the prereplication based
                                                                    on the queries that they answer and route. This is different
                                                                    from the current scheme where the global popularities are
                                                                    used.
                                                                  • A very important contribution would be to improve the
                                                                    heuristics used for query scheduling. Ways to reduce
                                                                    multi table joins, parallelly in this environment, is a open
                                                                    problem.
                                                                  • In our scheme we were not able to use bandwidth between
                                                                    nodes as a metric since the simulator doesnot provide a
                                                                    way to extract contact durations between nodes. Looking
                                                                    into this aspect can yield better more practical heuristics.
                                                                  • With better simulation results, the system needs to be im-
                                                                    plemented on real hardware to exemplify its real benefits.
                                                                  The work identifies an area that has not received due
        Fig. 10.   Query processing throughput for ZebraNet    attention before. We have proposed query scheduling and
                                                               prereplication mechanisms to enable applications to manage
                                                               data in a flexible manner, in a DTN setting. The simulation
                                                               results prove the validity of the approaches, followed in the
                                                               paper. We strongly believe that such a distributed solution,
                                                               will really foster the growth of killer applications that can
                                                               popularize the notion of DTNs.
                                                                                     XI. R EFERENCES
                                                                  [1] Kevin Fall, Delay Tolerant Network Architecture for
                                                               Challenged Internets, Sigcomm 03, August 25-29
                                                               [2] Xuwen Yu and Surendar Chandra, Delay tolerant
                                                               collaborations among campus-wide wireless users, IEEE
                                                               Infocom 2008
                                                               [3]Aruna Balasubramanian, Brian Neil Levine and Arun
                                                               Venkataramani, DTN Routing as a Resource Allocation
                                                               Problem, Sigcomm 07, August 27-31
                                                               [4] Janico Greifenberg and Dirk Kutscher, Efficient
                   Fig. 11.   Latency for ZebraNet             Publish/Subscribe-based    Multicast  for    Opportunistic
                                                               Networking with Self Organized Resource Utilization,
                                                               IEEE 2008
•   In most cases the Nearest site heuristic performs better   [5] Donald Kossmann,The State of the Art in Distributed
    than Heaviest Site in terms of sucess rate as well as      Query Processing, ACM 2001
[6] Hassan Artail, Haidar Safa and Manal Shihab,
Implementation of a Federated Database on Bluetooth-
Enabled Mobile Devices, ICPS 2008, July 6-10, ACM 2008
[7] S. Jain, K Fall, R Patra, Routing in delay tolerant network,
ACM SIGCOMM 2004, pg 145-158
[8] Yang Zhang, Bret Hull, Hari Balakrishnan and Samuel
Madden,ICEDB: Intermittently connected continuous query
processing
[9] Stewart GreenHill, Svetha Venkatesh, Distributed query
processing for mobile surveillance, Proceedings of the 15th
international conference on Multimedia, 2007
[10] J. Burgess, B. Gallagher, D. Jensen, and B. N. Levine,
MaxProp: Routing for Vehicle-Based Disruption-Tolerant
Networks. In Proc. IEEE Infocom, April 2006
[11] Philo Juang and Hidekazu Oki and Yong Wang and
Margaret Martonosi and Li-shiuan Peh and Daniel Rubenstein,
Energy-Efficient Computing for Wildlife Tracking: Design
Tradeoffs and Early Experiences with ZebraNet,ASPLOS-X
conference,San Jose,Oct 2002
[12]Pentland, A.; Fletcher, R.; Hasson, A., ”DakNet:
rethinking connectivity in developing nations,” Computer ,
vol.37, no.1, pp. 78-83, Jan. 2004
[13]            UMass              DieselNet             Project:
http://prisms.cs.umass.edu/dome/umassdieselnet
[14]              The              ONE                Simulator:
http://www.netlab.tkk.fi/tutkimus/dtn/theone/

More Related Content

What's hot

An efficient hybrid peer to-peersystemfordistributeddatasharing
An efficient hybrid peer to-peersystemfordistributeddatasharingAn efficient hybrid peer to-peersystemfordistributeddatasharing
An efficient hybrid peer to-peersystemfordistributeddatasharingambitlick
 
ODRS: Optimal Data Replication Scheme for Time Efficiency In MANETs
ODRS: Optimal Data Replication Scheme for Time Efficiency In  MANETsODRS: Optimal Data Replication Scheme for Time Efficiency In  MANETs
ODRS: Optimal Data Replication Scheme for Time Efficiency In MANETsIOSR Journals
 
Improving the Proactive Routing Protocol using Depth First Iterative Deepenin...
Improving the Proactive Routing Protocol using Depth First Iterative Deepenin...Improving the Proactive Routing Protocol using Depth First Iterative Deepenin...
Improving the Proactive Routing Protocol using Depth First Iterative Deepenin...Yayah Zakaria
 
AN EFFICIENT ROUTING PROTOCOL FOR DELAY TOLERANT NETWORKS (DTNs)
AN EFFICIENT ROUTING PROTOCOL FOR DELAY TOLERANT NETWORKS (DTNs)AN EFFICIENT ROUTING PROTOCOL FOR DELAY TOLERANT NETWORKS (DTNs)
AN EFFICIENT ROUTING PROTOCOL FOR DELAY TOLERANT NETWORKS (DTNs)cscpconf
 
A Neighbourhood-Based Trust Protocol for Secure Collaborative Routing in Wire...
A Neighbourhood-Based Trust Protocol for Secure Collaborative Routing in Wire...A Neighbourhood-Based Trust Protocol for Secure Collaborative Routing in Wire...
A Neighbourhood-Based Trust Protocol for Secure Collaborative Routing in Wire...IJCSIS Research Publications
 
Peer to peer cache resolution mechanism for mobile ad hoc networks
Peer to peer cache resolution mechanism for mobile ad hoc networksPeer to peer cache resolution mechanism for mobile ad hoc networks
Peer to peer cache resolution mechanism for mobile ad hoc networksijwmn
 
An Extensive Literature Review of Various Routing Protocols in Delay Tolerant...
An Extensive Literature Review of Various Routing Protocols in Delay Tolerant...An Extensive Literature Review of Various Routing Protocols in Delay Tolerant...
An Extensive Literature Review of Various Routing Protocols in Delay Tolerant...IRJET Journal
 
In network aggregation techniques for wireless sensor networks - a survey
In network aggregation techniques for wireless sensor networks - a surveyIn network aggregation techniques for wireless sensor networks - a survey
In network aggregation techniques for wireless sensor networks - a surveyGungi Achi
 
A Comprehensive Study on Vehicular Ad-Hoc Delay Tolerant Networking for Infra...
A Comprehensive Study on Vehicular Ad-Hoc Delay Tolerant Networking for Infra...A Comprehensive Study on Vehicular Ad-Hoc Delay Tolerant Networking for Infra...
A Comprehensive Study on Vehicular Ad-Hoc Delay Tolerant Networking for Infra...inventionjournals
 
Content Sharing over Smartphone-Based Delay-Tolerant Networks
Content Sharing over Smartphone-Based Delay-Tolerant NetworksContent Sharing over Smartphone-Based Delay-Tolerant Networks
Content Sharing over Smartphone-Based Delay-Tolerant NetworksIJERA Editor
 
A Platform for Large-Scale Grid Data Service on Dynamic High-Performance Netw...
A Platform for Large-Scale Grid Data Service on Dynamic High-Performance Netw...A Platform for Large-Scale Grid Data Service on Dynamic High-Performance Netw...
A Platform for Large-Scale Grid Data Service on Dynamic High-Performance Netw...Tal Lavian Ph.D.
 
Congestion control, routing, and scheduling 2015
Congestion control, routing, and scheduling 2015Congestion control, routing, and scheduling 2015
Congestion control, routing, and scheduling 2015parry prabhu
 
Cloak-Reduce Load Balancing Strategy for Mapreduce
Cloak-Reduce Load Balancing Strategy for MapreduceCloak-Reduce Load Balancing Strategy for Mapreduce
Cloak-Reduce Load Balancing Strategy for MapreduceAIRCC Publishing Corporation
 
COMPARATIVE STUDY OF CAN, PASTRY, KADEMLIA AND CHORD DHTS
COMPARATIVE STUDY OF CAN, PASTRY, KADEMLIA  AND CHORD DHTS COMPARATIVE STUDY OF CAN, PASTRY, KADEMLIA  AND CHORD DHTS
COMPARATIVE STUDY OF CAN, PASTRY, KADEMLIA AND CHORD DHTS ijp2p
 

What's hot (16)

An efficient hybrid peer to-peersystemfordistributeddatasharing
An efficient hybrid peer to-peersystemfordistributeddatasharingAn efficient hybrid peer to-peersystemfordistributeddatasharing
An efficient hybrid peer to-peersystemfordistributeddatasharing
 
ODRS: Optimal Data Replication Scheme for Time Efficiency In MANETs
ODRS: Optimal Data Replication Scheme for Time Efficiency In  MANETsODRS: Optimal Data Replication Scheme for Time Efficiency In  MANETs
ODRS: Optimal Data Replication Scheme for Time Efficiency In MANETs
 
Improving the Proactive Routing Protocol using Depth First Iterative Deepenin...
Improving the Proactive Routing Protocol using Depth First Iterative Deepenin...Improving the Proactive Routing Protocol using Depth First Iterative Deepenin...
Improving the Proactive Routing Protocol using Depth First Iterative Deepenin...
 
AN EFFICIENT ROUTING PROTOCOL FOR DELAY TOLERANT NETWORKS (DTNs)
AN EFFICIENT ROUTING PROTOCOL FOR DELAY TOLERANT NETWORKS (DTNs)AN EFFICIENT ROUTING PROTOCOL FOR DELAY TOLERANT NETWORKS (DTNs)
AN EFFICIENT ROUTING PROTOCOL FOR DELAY TOLERANT NETWORKS (DTNs)
 
Ijcatr04071003
Ijcatr04071003Ijcatr04071003
Ijcatr04071003
 
A Neighbourhood-Based Trust Protocol for Secure Collaborative Routing in Wire...
A Neighbourhood-Based Trust Protocol for Secure Collaborative Routing in Wire...A Neighbourhood-Based Trust Protocol for Secure Collaborative Routing in Wire...
A Neighbourhood-Based Trust Protocol for Secure Collaborative Routing in Wire...
 
Peer to peer cache resolution mechanism for mobile ad hoc networks
Peer to peer cache resolution mechanism for mobile ad hoc networksPeer to peer cache resolution mechanism for mobile ad hoc networks
Peer to peer cache resolution mechanism for mobile ad hoc networks
 
An Extensive Literature Review of Various Routing Protocols in Delay Tolerant...
An Extensive Literature Review of Various Routing Protocols in Delay Tolerant...An Extensive Literature Review of Various Routing Protocols in Delay Tolerant...
An Extensive Literature Review of Various Routing Protocols in Delay Tolerant...
 
In network aggregation techniques for wireless sensor networks - a survey
In network aggregation techniques for wireless sensor networks - a surveyIn network aggregation techniques for wireless sensor networks - a survey
In network aggregation techniques for wireless sensor networks - a survey
 
A Comprehensive Study on Vehicular Ad-Hoc Delay Tolerant Networking for Infra...
A Comprehensive Study on Vehicular Ad-Hoc Delay Tolerant Networking for Infra...A Comprehensive Study on Vehicular Ad-Hoc Delay Tolerant Networking for Infra...
A Comprehensive Study on Vehicular Ad-Hoc Delay Tolerant Networking for Infra...
 
Content Sharing over Smartphone-Based Delay-Tolerant Networks
Content Sharing over Smartphone-Based Delay-Tolerant NetworksContent Sharing over Smartphone-Based Delay-Tolerant Networks
Content Sharing over Smartphone-Based Delay-Tolerant Networks
 
A Platform for Large-Scale Grid Data Service on Dynamic High-Performance Netw...
A Platform for Large-Scale Grid Data Service on Dynamic High-Performance Netw...A Platform for Large-Scale Grid Data Service on Dynamic High-Performance Netw...
A Platform for Large-Scale Grid Data Service on Dynamic High-Performance Netw...
 
Congestion control, routing, and scheduling 2015
Congestion control, routing, and scheduling 2015Congestion control, routing, and scheduling 2015
Congestion control, routing, and scheduling 2015
 
Hm2413291336
Hm2413291336Hm2413291336
Hm2413291336
 
Cloak-Reduce Load Balancing Strategy for Mapreduce
Cloak-Reduce Load Balancing Strategy for MapreduceCloak-Reduce Load Balancing Strategy for Mapreduce
Cloak-Reduce Load Balancing Strategy for Mapreduce
 
COMPARATIVE STUDY OF CAN, PASTRY, KADEMLIA AND CHORD DHTS
COMPARATIVE STUDY OF CAN, PASTRY, KADEMLIA  AND CHORD DHTS COMPARATIVE STUDY OF CAN, PASTRY, KADEMLIA  AND CHORD DHTS
COMPARATIVE STUDY OF CAN, PASTRY, KADEMLIA AND CHORD DHTS
 

Viewers also liked (13)

Dbms9
Dbms9Dbms9
Dbms9
 
Database Management System 1
Database Management System 1Database Management System 1
Database Management System 1
 
Database management system
Database management systemDatabase management system
Database management system
 
File Processing System
File Processing SystemFile Processing System
File Processing System
 
Database , 1 Introduction
 Database , 1 Introduction Database , 1 Introduction
Database , 1 Introduction
 
Unit 6
Unit 6Unit 6
Unit 6
 
Lecture 11 - distributed database
Lecture 11 - distributed databaseLecture 11 - distributed database
Lecture 11 - distributed database
 
Distributed database management systems
Distributed database management systemsDistributed database management systems
Distributed database management systems
 
Distributed System
Distributed System Distributed System
Distributed System
 
Distributed Database Management System
Distributed Database Management SystemDistributed Database Management System
Distributed Database Management System
 
Distributed database
Distributed databaseDistributed database
Distributed database
 
Distributed Database System
Distributed Database SystemDistributed Database System
Distributed Database System
 
Dbms ppt
Dbms pptDbms ppt
Dbms ppt
 

Similar to Distributeddatabasesforchallengednet

A novel cache resolution technique for cooperative caching in wireless mobile...
A novel cache resolution technique for cooperative caching in wireless mobile...A novel cache resolution technique for cooperative caching in wireless mobile...
A novel cache resolution technique for cooperative caching in wireless mobile...csandit
 
A NOVEL CACHE RESOLUTION TECHNIQUE FOR COOPERATIVE CACHING IN WIRELESS MOBILE...
A NOVEL CACHE RESOLUTION TECHNIQUE FOR COOPERATIVE CACHING IN WIRELESS MOBILE...A NOVEL CACHE RESOLUTION TECHNIQUE FOR COOPERATIVE CACHING IN WIRELESS MOBILE...
A NOVEL CACHE RESOLUTION TECHNIQUE FOR COOPERATIVE CACHING IN WIRELESS MOBILE...cscpconf
 
A Survey of File Replication Techniques In Grid Systems
A Survey of File Replication Techniques In Grid SystemsA Survey of File Replication Techniques In Grid Systems
A Survey of File Replication Techniques In Grid SystemsEditor IJCATR
 
A Survey of File Replication Techniques In Grid Systems
A Survey of File Replication Techniques In Grid SystemsA Survey of File Replication Techniques In Grid Systems
A Survey of File Replication Techniques In Grid SystemsEditor IJCATR
 
Distributed Three Hop Routing Protocol for Enhancing Routing Process in WSN
Distributed Three Hop Routing Protocol for Enhancing Routing Process in WSNDistributed Three Hop Routing Protocol for Enhancing Routing Process in WSN
Distributed Three Hop Routing Protocol for Enhancing Routing Process in WSNpaperpublications3
 
Iaetsd a secured based information sharing scheme via
Iaetsd a secured based information sharing scheme viaIaetsd a secured based information sharing scheme via
Iaetsd a secured based information sharing scheme viaIaetsd Iaetsd
 
QoS_Aware_Replica_Control_Strategies_for_Distributed_Real_time_dbms.pdf
QoS_Aware_Replica_Control_Strategies_for_Distributed_Real_time_dbms.pdfQoS_Aware_Replica_Control_Strategies_for_Distributed_Real_time_dbms.pdf
QoS_Aware_Replica_Control_Strategies_for_Distributed_Real_time_dbms.pdfneju3
 
A02120201010
A02120201010A02120201010
A02120201010theijes
 
Transfer reliability and congestion control strategies in opportunistic netwo...
Transfer reliability and congestion control strategies in opportunistic netwo...Transfer reliability and congestion control strategies in opportunistic netwo...
Transfer reliability and congestion control strategies in opportunistic netwo...IEEEFINALYEARPROJECTS
 
JAVA 2013 IEEE NETWORKING PROJECT Transfer reliability and congestion control...
JAVA 2013 IEEE NETWORKING PROJECT Transfer reliability and congestion control...JAVA 2013 IEEE NETWORKING PROJECT Transfer reliability and congestion control...
JAVA 2013 IEEE NETWORKING PROJECT Transfer reliability and congestion control...IEEEGLOBALSOFTTECHNOLOGIES
 
An Integrated Inductive-Deductive Framework for Data Mapping in Wireless Sens...
An Integrated Inductive-Deductive Framework for Data Mapping in Wireless Sens...An Integrated Inductive-Deductive Framework for Data Mapping in Wireless Sens...
An Integrated Inductive-Deductive Framework for Data Mapping in Wireless Sens...M H
 
An Efficient Approach for Data Gathering and Sharing with Inter Node Communi...
 An Efficient Approach for Data Gathering and Sharing with Inter Node Communi... An Efficient Approach for Data Gathering and Sharing with Inter Node Communi...
An Efficient Approach for Data Gathering and Sharing with Inter Node Communi...cscpconf
 
Java Abs Peer To Peer Design & Implementation Of A Tuple S
Java Abs   Peer To Peer Design & Implementation Of A Tuple SJava Abs   Peer To Peer Design & Implementation Of A Tuple S
Java Abs Peer To Peer Design & Implementation Of A Tuple Sncct
 
A TIME INDEX BASED APPROACH FOR CACHE SHARING IN MOBILE ADHOC NETWORKS
A TIME INDEX BASED APPROACH FOR CACHE SHARING IN MOBILE ADHOC NETWORKS A TIME INDEX BASED APPROACH FOR CACHE SHARING IN MOBILE ADHOC NETWORKS
A TIME INDEX BASED APPROACH FOR CACHE SHARING IN MOBILE ADHOC NETWORKS cscpconf
 
Scalable and adaptive data replica placement for geo distributed cloud storages
Scalable and adaptive data replica placement for geo distributed cloud storagesScalable and adaptive data replica placement for geo distributed cloud storages
Scalable and adaptive data replica placement for geo distributed cloud storagesVenkat Projects
 
Synthesis of Non-Replicated Dynamic Fragment Allocation Algorithm in Distribu...
Synthesis of Non-Replicated Dynamic Fragment Allocation Algorithm in Distribu...Synthesis of Non-Replicated Dynamic Fragment Allocation Algorithm in Distribu...
Synthesis of Non-Replicated Dynamic Fragment Allocation Algorithm in Distribu...IDES Editor
 
DWDM-RAM: An Architecture for Data Intensive Service Enabled by Next Generati...
DWDM-RAM: An Architecture for Data Intensive Service Enabled by Next Generati...DWDM-RAM: An Architecture for Data Intensive Service Enabled by Next Generati...
DWDM-RAM: An Architecture for Data Intensive Service Enabled by Next Generati...Tal Lavian Ph.D.
 
Traffic-aware adaptive server load balancing for softwaredefined networks
Traffic-aware adaptive server load balancing for softwaredefined networks Traffic-aware adaptive server load balancing for softwaredefined networks
Traffic-aware adaptive server load balancing for softwaredefined networks IJECEIAES
 
Delay Tolerant Network
Delay Tolerant NetworkDelay Tolerant Network
Delay Tolerant NetworkMichel Chamat
 

Similar to Distributeddatabasesforchallengednet (20)

Ax34298305
Ax34298305Ax34298305
Ax34298305
 
A novel cache resolution technique for cooperative caching in wireless mobile...
A novel cache resolution technique for cooperative caching in wireless mobile...A novel cache resolution technique for cooperative caching in wireless mobile...
A novel cache resolution technique for cooperative caching in wireless mobile...
 
A NOVEL CACHE RESOLUTION TECHNIQUE FOR COOPERATIVE CACHING IN WIRELESS MOBILE...
A NOVEL CACHE RESOLUTION TECHNIQUE FOR COOPERATIVE CACHING IN WIRELESS MOBILE...A NOVEL CACHE RESOLUTION TECHNIQUE FOR COOPERATIVE CACHING IN WIRELESS MOBILE...
A NOVEL CACHE RESOLUTION TECHNIQUE FOR COOPERATIVE CACHING IN WIRELESS MOBILE...
 
A Survey of File Replication Techniques In Grid Systems
A Survey of File Replication Techniques In Grid SystemsA Survey of File Replication Techniques In Grid Systems
A Survey of File Replication Techniques In Grid Systems
 
A Survey of File Replication Techniques In Grid Systems
A Survey of File Replication Techniques In Grid SystemsA Survey of File Replication Techniques In Grid Systems
A Survey of File Replication Techniques In Grid Systems
 
Distributed Three Hop Routing Protocol for Enhancing Routing Process in WSN
Distributed Three Hop Routing Protocol for Enhancing Routing Process in WSNDistributed Three Hop Routing Protocol for Enhancing Routing Process in WSN
Distributed Three Hop Routing Protocol for Enhancing Routing Process in WSN
 
Iaetsd a secured based information sharing scheme via
Iaetsd a secured based information sharing scheme viaIaetsd a secured based information sharing scheme via
Iaetsd a secured based information sharing scheme via
 
QoS_Aware_Replica_Control_Strategies_for_Distributed_Real_time_dbms.pdf
QoS_Aware_Replica_Control_Strategies_for_Distributed_Real_time_dbms.pdfQoS_Aware_Replica_Control_Strategies_for_Distributed_Real_time_dbms.pdf
QoS_Aware_Replica_Control_Strategies_for_Distributed_Real_time_dbms.pdf
 
A02120201010
A02120201010A02120201010
A02120201010
 
Transfer reliability and congestion control strategies in opportunistic netwo...
Transfer reliability and congestion control strategies in opportunistic netwo...Transfer reliability and congestion control strategies in opportunistic netwo...
Transfer reliability and congestion control strategies in opportunistic netwo...
 
JAVA 2013 IEEE NETWORKING PROJECT Transfer reliability and congestion control...
JAVA 2013 IEEE NETWORKING PROJECT Transfer reliability and congestion control...JAVA 2013 IEEE NETWORKING PROJECT Transfer reliability and congestion control...
JAVA 2013 IEEE NETWORKING PROJECT Transfer reliability and congestion control...
 
An Integrated Inductive-Deductive Framework for Data Mapping in Wireless Sens...
An Integrated Inductive-Deductive Framework for Data Mapping in Wireless Sens...An Integrated Inductive-Deductive Framework for Data Mapping in Wireless Sens...
An Integrated Inductive-Deductive Framework for Data Mapping in Wireless Sens...
 
An Efficient Approach for Data Gathering and Sharing with Inter Node Communi...
 An Efficient Approach for Data Gathering and Sharing with Inter Node Communi... An Efficient Approach for Data Gathering and Sharing with Inter Node Communi...
An Efficient Approach for Data Gathering and Sharing with Inter Node Communi...
 
Java Abs Peer To Peer Design & Implementation Of A Tuple S
Java Abs   Peer To Peer Design & Implementation Of A Tuple SJava Abs   Peer To Peer Design & Implementation Of A Tuple S
Java Abs Peer To Peer Design & Implementation Of A Tuple S
 
A TIME INDEX BASED APPROACH FOR CACHE SHARING IN MOBILE ADHOC NETWORKS
A TIME INDEX BASED APPROACH FOR CACHE SHARING IN MOBILE ADHOC NETWORKS A TIME INDEX BASED APPROACH FOR CACHE SHARING IN MOBILE ADHOC NETWORKS
A TIME INDEX BASED APPROACH FOR CACHE SHARING IN MOBILE ADHOC NETWORKS
 
Scalable and adaptive data replica placement for geo distributed cloud storages
Scalable and adaptive data replica placement for geo distributed cloud storagesScalable and adaptive data replica placement for geo distributed cloud storages
Scalable and adaptive data replica placement for geo distributed cloud storages
 
Synthesis of Non-Replicated Dynamic Fragment Allocation Algorithm in Distribu...
Synthesis of Non-Replicated Dynamic Fragment Allocation Algorithm in Distribu...Synthesis of Non-Replicated Dynamic Fragment Allocation Algorithm in Distribu...
Synthesis of Non-Replicated Dynamic Fragment Allocation Algorithm in Distribu...
 
DWDM-RAM: An Architecture for Data Intensive Service Enabled by Next Generati...
DWDM-RAM: An Architecture for Data Intensive Service Enabled by Next Generati...DWDM-RAM: An Architecture for Data Intensive Service Enabled by Next Generati...
DWDM-RAM: An Architecture for Data Intensive Service Enabled by Next Generati...
 
Traffic-aware adaptive server load balancing for softwaredefined networks
Traffic-aware adaptive server load balancing for softwaredefined networks Traffic-aware adaptive server load balancing for softwaredefined networks
Traffic-aware adaptive server load balancing for softwaredefined networks
 
Delay Tolerant Network
Delay Tolerant NetworkDelay Tolerant Network
Delay Tolerant Network
 

More from Vinoth Chandar

[Pulsar summit na 21] Change Data Capture To Data Lakes Using Apache Pulsar/Hudi
[Pulsar summit na 21] Change Data Capture To Data Lakes Using Apache Pulsar/Hudi[Pulsar summit na 21] Change Data Capture To Data Lakes Using Apache Pulsar/Hudi
[Pulsar summit na 21] Change Data Capture To Data Lakes Using Apache Pulsar/HudiVinoth Chandar
 
Hoodie - DataEngConf 2017
Hoodie - DataEngConf 2017Hoodie - DataEngConf 2017
Hoodie - DataEngConf 2017Vinoth Chandar
 
Hoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on SparkHoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on SparkVinoth Chandar
 
Hadoop Strata Talk - Uber, your hadoop has arrived
Hadoop Strata Talk - Uber, your hadoop has arrived Hadoop Strata Talk - Uber, your hadoop has arrived
Hadoop Strata Talk - Uber, your hadoop has arrived Vinoth Chandar
 
Voldemort : Prototype to Production
Voldemort : Prototype to ProductionVoldemort : Prototype to Production
Voldemort : Prototype to ProductionVinoth Chandar
 
Voldemort on Solid State Drives
Voldemort on Solid State DrivesVoldemort on Solid State Drives
Voldemort on Solid State DrivesVinoth Chandar
 
Composing and Executing Parallel Data Flow Graphs wth Shell Pipes
Composing and Executing Parallel Data Flow Graphs wth Shell PipesComposing and Executing Parallel Data Flow Graphs wth Shell Pipes
Composing and Executing Parallel Data Flow Graphs wth Shell PipesVinoth Chandar
 
Triple-Triple RDF Store with Greedy Graph based Grouping
Triple-Triple RDF Store with Greedy Graph based GroupingTriple-Triple RDF Store with Greedy Graph based Grouping
Triple-Triple RDF Store with Greedy Graph based GroupingVinoth Chandar
 

More from Vinoth Chandar (9)

[Pulsar summit na 21] Change Data Capture To Data Lakes Using Apache Pulsar/Hudi
[Pulsar summit na 21] Change Data Capture To Data Lakes Using Apache Pulsar/Hudi[Pulsar summit na 21] Change Data Capture To Data Lakes Using Apache Pulsar/Hudi
[Pulsar summit na 21] Change Data Capture To Data Lakes Using Apache Pulsar/Hudi
 
Hoodie - DataEngConf 2017
Hoodie - DataEngConf 2017Hoodie - DataEngConf 2017
Hoodie - DataEngConf 2017
 
Hoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on SparkHoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on Spark
 
Hadoop Strata Talk - Uber, your hadoop has arrived
Hadoop Strata Talk - Uber, your hadoop has arrived Hadoop Strata Talk - Uber, your hadoop has arrived
Hadoop Strata Talk - Uber, your hadoop has arrived
 
Voldemort : Prototype to Production
Voldemort : Prototype to ProductionVoldemort : Prototype to Production
Voldemort : Prototype to Production
 
Voldemort on Solid State Drives
Voldemort on Solid State DrivesVoldemort on Solid State Drives
Voldemort on Solid State Drives
 
Composing and Executing Parallel Data Flow Graphs wth Shell Pipes
Composing and Executing Parallel Data Flow Graphs wth Shell PipesComposing and Executing Parallel Data Flow Graphs wth Shell Pipes
Composing and Executing Parallel Data Flow Graphs wth Shell Pipes
 
Triple-Triple RDF Store with Greedy Graph based Grouping
Triple-Triple RDF Store with Greedy Graph based GroupingTriple-Triple RDF Store with Greedy Graph based Grouping
Triple-Triple RDF Store with Greedy Graph based Grouping
 
Bluetube
BluetubeBluetube
Bluetube
 

Recently uploaded

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 

Recently uploaded (20)

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 

Distributeddatabasesforchallengednet

  • 1. Distributed Databases for Challenged Networks Vinoth Chandar, Ashwini Athalye Most delay tolerant networking research has focussed for query processing in a distributed Database for DTN on developing routing protocols that optimize data delivery environments. The benefits that an application can enjoy from of a single data unit from a source to potentially multiple such a system are location transparency and data decoupling. destinations. However, distributed applications built on top of Achieving location transparency in a DTN setting obviously such intermittently connected networks, typically involve more results in more flexible applications. complex data access mechanisms. Distributed databases are widely used to capture such interactions. Thus, exploring the To our knowledge, no previous work has examined this feasibility of building a distributed database system for DTNs problem of distributed query processing in a completely is an intriguing research problem. Applications can benefit decentralized DTN environment. This work tries to put into immensely from the resulting location transparency and data perspective the issues that are to be addressed in order to decoupling. build a practical system, over which multitude of interesting The paper proposes a distributed database system for such applications can be written. The problem requires additional networks. The key contributions of this work are 1) practical mechanisms to improve latency involved in answering queries, heuristics for query scheduling 2) a prereplication scheme due to the inherently time varying nature of the links. Such a that reduces the cost of on-demand retrieval by actively pre- scheme is vital to ensure the practical usefulness of the system. caching data. To our knowledge, this is the first work to examine these issues in an unified framework. We describe The paper is organized as follows. Section 2 details the the results of several simulations conducted on a wide variety relevant background and the other related research in this of artificial and real world traces. area. Section 3 presents the motivations and challenges to solving this problem.Section 4 discusses the problem of query I. I NTRODUCTION scheduling in DTNs. Section 5 presents the prereplication DTN is a category of networks that is characterized by scheme proposed. Section 6 and 7 contain details about meta- intermittent connectivity, irregular connectivity patterns, data exchanges involved and the packet scheduling criteria unpredictable communication opportuntity and diverse involved. Section 8 presents detailed results from simulations devices. As explained in [1], these types of networks are of the scheme under various scenarios. This is followed by a becoming important with the pervasiveness of wireless short observations section. Section 10 concludes after outlining technology. For example, the campus wide deployment some of the future directions that we intend to pursue. of wireless technology coupled with the large number of users having laptops/cell phones with wireless capability, II. BACKGROUND AND R ELATED WORK provides enormous opportunities for exchange of ’good-to- Traditional query processing schemes in distributed know’ information amongst students. Such good-to-know databases construct an execution plan for an input query based information can substantially augment the quality of decision on the sites participating in the query and the size of the making in daily life. In the past, there have also been many data set that needs to be moved from one site to another real world DTN deployments like Zebranet[11], DakNet[12] for the execution of the query. Algorithms that optimize for and DieselNet[13]. Zebranet tries to study the movement response time typically do so by reducing the amount of data patterns and the behaviour of zebras, by attaching sensor shipped between sites, to execute the query. These schemes nodes to the Zebras. Daknet is one of the earliest practical [5] however assume a fixed network topology which means DTN systems, that provides Internet connectivity to rural that there are statically defined paths between nodes and that villages in India. DieselNet is a DTN infrastructure deployed they are always up. This is contrary to what DTNs possess. over the bus transport system in Amherst. These practical In a DTN, simple optimization for communication cost may systems have clearly proved the utility of delay tolerant not minimize the response time, since there can be schedules networking. which involve earlier contacts between sites, leading to lower delays, even with larger data shipping. Hence query scheduling Thus, the wide applicability of DTNs call for a deeper look is consequently more complex for such an environment. This into the needs of the applications that can be supported over is the focus of the research conducted in this paper. it. [2] stresses on the need to develop robust mechanisms ICEDB [8] proposes a system, where in users query a web for DTN based collaborative applications to improve data portal, connected to devices on the road. Devices are sensor propagation rates in realistic deployments. Using this nodes that build databases with the information of interest. The information as a guiding light, we propose mechanisms portal is responsible for sending the user queries and getting
  • 2. the results back from the devices. The queries are annotated track the nodes that carry interesting information, increasing with priorities and the devices schedule results to be moved the complexity of the application. to the portal based on the priorities. This work illustrates Hence, if applications query using an indirection of database the utility of using database queries to retrieve content from tables, which are then mapped to the nodes that contain intermittently connected nodes, since the application can work fragments of the global information, it can lead to reduced with a simpler abstraction of a database table. [6] builds a overhead for applications and seamless integration of different federated database over Bluetooth. However, it assumes that applications to work with each other. Moreover multiple there is no mobility and hence, does not apply to most practical applications can be built on top of this layer of abstraction scenarios. [9] is a mobile surveillance system, where buses since they no longer need to have node specific information. upload spatially and temporally tagged images to a base station In essence, building a distributed database system over DTNs placed in the bus depot. These data stores are then queried would foster the development of myriad of interesting appli- from a central cloud server. cations, that in turn enrich the practical benefits of conducting A survey of the related work gives insight into some delay tolerant networking research. practical aspects of DTNs. [7] formally analyzes the problem of DTN routing at various degrees of knowledge of the future IV. Q UERY P ROCESSING and the network. MaxProp[10] is a DTN routing protocol that builds relative meeting probabilities and uses it to forward In conventional distributed databases, selectivity factors are packets to a destination. RAPID [3] is a DTN routing protocol used to perform reductions on JOIN operations and optimize that can be optimized specifically for a routing metric such as for the response time of the query. Computing the selectivity worst-case delay, average delivery delay. In this paper DTN factor is a linear time operation that can be performed instan- routing has been formulated as a resource allocation problem taeneously. However, the traditional schemes break in a DTN under a realistic scenario where both storage and bandwidth since there is no way of maintaining selectivity factors in a are limited. The selected routing metric is used to prioritize DTN. Most traditional schemes reduce the response time by a packet at each hop and locally optimize the delivery of reducing the amount of data shipped across the network. Since packets in the order of the change in their utility function. the data size shipped, directly contributed to the delay involved Another interesting work on DTN routing [4], DPSP, is tuned in processsing the query , in a DTN setting, our approach for applications that use a publish scheme to advertise content, largely pursues this path. on topics and the nodes subscribe to different topics. This We work with approximations of contact times/probabilities scenario does not require either the content provider or the through metadata exchanges between the nodes. Deadlines are content subscriber to have knowledge about each other. Also, useful in identifying the time until the data is valid. This the knowledge given out by the publisher and the subscriber is correlates closely with the type of ’good-to-know’ informa- made use of to effectively allocate the resources and to prevent tion that the system intends to work with. Queries can be flooding of the network with replicated packets. tagged with deadlines to indicate the amount of time that the To summarize, the approach to query processing in DTNs results will be useful for. We also intend to reduce the query has been largely centralized i.e one central server holds the processing delay, by sending results as and when intermediate node-to-data mapping and uses it to perform query scheduling. results are available, without waiting for the completion of the Most practical DTN routing mechanisms work with heuristics entire query, that potentially provides additional results. The and rely on empirical results to prove their usefullness, in an duplicate results can be filtered out through a delta detection attempt to keep the problem tractable. mechanism at either the query initiator or the intermediate nodes, using standard techniques. III. M OTIVATION AND C ONTEXT We assume that applications that talk to each other have knowledge about the database tables. The knowledge about DTNs offer opportunities to develop some exciting dis- the horizontal fragmentation, including cardinalities, can be tributed applications. For example, a newcomer to Austin built based on metadata exchanges. The packets that are might want to know about good places to dine, live music sent through the network can be either those carrying the events etc. A distributed peer to peer application can help query or packets that carry the results. We divide the set him to query the people around for this information, even of queries that are issued by applications into Unions and without Internet connectivity. Typically, this good-to-know Joins. Unions consist of queries on single tables that are information is not very time sensitive and hence, can be horizontally fragmented across various nodes. Unions are delivered over DTNs. Existing routing protocols in DTNs simply sent to all the nodes that the query initiator knows typically operate with a known source and destination. The of, which contain the target query. Joins consist of queries on application developer needs to schedule packets to different multiple tables that can be horizontally fragmented and have destinations, according to the needs of the application. The at least one common attribute. Joins require a much more problem becomes worse in case of open networks, where not complicated query scheduling mechanism. Joins, which are every node knows each other, even though unknown nodes costly operations even in a conventional distributed database may carry relevant information. Thus, the application needs to system, are far more expensive in DTN environments.
  • 3. For distributed databases, a number of factors contribute For each t in T: to heuristics that determine how the query gets processed. B[q] += cardinality(t) These are typically the cardinality of a database at a node, Processing node = i , such that the size of a tuple in the database, and together the two B[i] = max(B[q]), for all q in Q. numbers give the size of the data that needs to be moved between sites for complex queries. Query processing in DTN C. Nearest site would require heuristics to be designed such that they take This heuristic picks the site which all of the relevant into account, packet scheduling criteria for DTN as well as sites (sites that house the relevant data) are more likely to database query optimization parameters. In this paper, we meet. Note that this processing site may not be amongst the discuss two heuristics for scheduling Joins 1)Heaviest site relevant sites. The relevant sites may accumulate data at some 2)Nearest site. intermediate node, that every node is most likely to meet. This heuristic tries to directly optimize for the time spent in A. Optimal scheme getting the required data to the processing site. The meeting Assuming the knowledge of contact times and contact probabilities used in MaxProp[10] are used directly to compute durations of the nodes helps in improving the performance the nearest site. of routing mechanisms [7]. In many real life scenarios, such accurate contact information can be obtained, such as schedule N : set of nodes known to current node of students in a campus. But, when this information is not Q : set of all nodes that are relevant available,statistical approximations can be used in their place, to this query. based on contact histories. Even with this complete informa- C[n] : cummulative meeting probability tion, computing an optimal path is NP Hard [3].Given that the of nodes in Q to each node n in N DTN routing problem, even with a perfect Oracle is NP Hard, Pseudo-code: the problem of scheduling the queries, which involves moving For node n in N: relevant data to the processing site in the best possible manner For each q in Q: also becomes NP hard, since the query processing problem is C[n] += meetingprobability(q,n) composed of independent subproblems of the routing problem. Processing site = i such that The non availability of selectivity factors, precludes any C[i] = max(C[n] ), for all n in N chances of reducing joins at intermediate sites. Also, with V. P REREPLICATION the prereplication scheme discussed in section 6, even floating approximate and possibly stale values of the selectivity factors As mentioned earlier, the cost of on-demand retrieval is in the network might not perform well, since the prereplication much higher in a delay tolerant network, since the links may would invalidate the selectivity factors very quickly due to not become available again for a long time. In such cases, addition of new content at the node. Moreover, selectivity actively moving content to nodes that are likely to be involved factors are very specific to a particular query. For a selectivity in relevant queries, would help reduce the latency involved in factor to be useful, the new query needs to match very closely answering the queries. Each node builds a popularity index to the original query. Hence, the relevant data must be brought for table combinations that are involved in the queries that it to a single common processing site, for query evaluation. In sees. Nodes also build the popularity index through exchange this paper, we primarily focus on choosing this processing site. of this information amongst themselves. Thus, a global pop- Hence, we try to approach the problem using heuristics, local ularity metric is built for all the table combinations. Over a optimizations, and techniques from existing work on DTN longer duration of system operation, this metric captures the routing and distributed database query optimization. probability that a future query will be generated for each of the table combinations. Let us define the servability, of a node B. Heaviest site as the amount of popular content possessed by the node. When This heuristic tries to optimize for lower delays, by moving two nodes meet, they exchange more popular content amongst the data to the site which possesses the largest amount of themselves. data relevant to the query. The heaviest site is chosen as the When a node meets another, processing site. Note that the processing site is one among the sites that have the relevant data. The intuition is that if the * Updates popularities from other node heaviest site is prevented from shipping its data elsewhere, * Computes a Integer Knapsack to deter -mine the transfers that are to be the lesser bandwidth consumption would result in significant made to increase servability savings in the response time. * Utility of transfering a table frag Q : set of all nodes that are relevant to -ment to another node is the amount this query. of increase in servability due to T : set of tables that are required. that transfer Pseudo-code: * Cost of transfer is the bandwidth For each node q in Q: consumed
  • 4. * The maximum cost that can be incurr -ed by deadline -ed is total bandwidth of the conta Exchange prereplication content. -ct Schedules the transfers that it Note that the prereplication content is sent at the last, in needs to make to the other node effect , using only any amount of spare bandwidth that may A. Returning results be available. Thus, prereplication is used as an add-on and it does not affect the data traffic. As more content is prereplicated, the more popular content is widely spread across the nodes in the network. As a result, VIII. E VALUATION the number of relevant sites in a join query increases with We have implemented our scheme on a DTN simulator the amount of prereplication. Hence, the latency involved in called [14] The ONE [see figure 1]. This Simulator provides processing the join also increases prohibitively. To keep the the following desirable features which led us to eventually latency low, two mechanisms can be used to return the results. choosing it for our project: • Best : The processing site waits for all table fragments • Various Routing mechanisms like Epidemic Routing, from relevant sites to arrive, to begin processing the query MaxProp Routing, Spray and Wait etc. are supported. • Fast: The processing site processes the query and sends • Different mobility models for simuating movement of the result out, as soon as atleast one portion of all the nodes. tables involved in the join are available. • A visualization tool to view node movement as well as The two mechanisms are a tradeoff between accuracy of data exchanges. results (best) and response time for the query (fast). We considered two mechanisms at the two extremes for simplicity of analysis. Obviously, an intermediate scheme that balances between the two can also be used. VI. M ETADATA EXCHANGES The unstructured nature of DTNs requires that a mechanism be in place for each node to either have a global view of the nodes in the system or have some way of updating its local information based on interactions with other nodes. The latter involves learning state of neighbors by exchanging relevant metadata between nodes in contact. Gradually as more and more node exchanges happen, the metadata grows and gets replaced with newer state. The following meta data are exchanged • Meeting probability, as described earlier • List of tables possessed by each node, and their cardinal- ities • Popularity indexes Fig. 1. Screenshot of ONE Simulator VII. PACKET SCHEDULING We have implemented our scheme on top of the MaxProp The packet scheduling scheme used must reflect the goals Router, provided by the simulator. The router maintains meet- of the database layer above. It needs to decide on specific ing probabilities of each node with other nodes it meets, which per-packet and per-packet-per-hop metrics to replicate packets is useful for nearest site selection. in the network. We chose replication over forwarding since As mentioned earlier, there is no existing work in this replication has better chances of delivering the packets quickly, space to compare our scheme against. Hence we have tried to even though it involves more resources. The following sum- evaluate our scheme by testing it against different movement marizes the packet scheduling scheme used in the paper. models. We have also varied the number of nodes in the When node A meets node B network to see the effect of node size on the performance Exchange metadata of the proposed scheme. Through our evaluation, we hope to Exchange directly deliverable content get an insight into the goodput of the scheme as well as its ordered by deadline applicability to different DTN scenarios. For all packets p in buffer: We have analysed our scheme against 5 movement models: if B has higher meeting probability • Random Way Point: This movement model captures to dest[p] than A: completely randomized node movement. Pick p for transfer • Shortest Path Based: This model simulates the movement Send all picked packets in buffer order of people. The nodes move along fixed paths, however
  • 5. their movement along these fixed paths is random and real world events are roughly deterministic, it serves as a good based on certain points of interest. starting point for our analysis. The graphs in figure 2,3 convey • Map Based: This movement model simulates DTN where results generated for 20 and 40 nodes in a Random Way Point nodes have periodicity in their movement. We have model. utilized the movement of trams in Helsinki as part of this model. • DieselNet: This is a bus system that has been deployed by UMass consisting of 34 buses. • ZebraNet: This is a trace of a network of 5 zebras which have sensors deployed on them. The first 3 movement models are provided by the simulator. Due to simulation time constraints, we have performed anal- ysis on these models with 20 and 40 nodes. The remaining 2 movement models - DieselNet and ZebraNet are real traces which were converted into a format that was acceptable to the simulator. Hence we have been able to run our scheme on simulation models as well as real traces. A. Metrics To evaluate the performace of our scheme, we consider the following metrics. Fig. 2. Query processing throughput for Randomly moving nodes • Percentage of answered queries: This metric was choosen to get the throughput of the system for the total of forward(path along which query is sent) and reverse paths(path along which query result is sent). • Percentage of Unions: This metric gives the peformance for single table queries. • Percentage of Joins: This metric gives the peformance for multi table queries. This metric is especially useful to verify our scheme since we expect performance to improve for joins, especially with our processing site selection scheme and the prereplicaton scheme. • Average query latency: This metric is again important to understand the behavior of our scheme. We expect our scheme to reduce the query latency i.e. decrease the delay between the time when the query was issued by a node and the time that it received the results. The average query latency also provides a useful insight into the kind of deadlines that each DTN scenario is typcially able to handle. Fig. 3. Latency for Randomly moving nodes B. Scenarios We present the results of the metrics introduced in the D. Map based Tram movements previous section, for three scenarios. This model simulates the movement of trams in Helsinki, • No repl - This represents the behavior of the system, with using WKT files that were generated from a real GIS system. no prereplication, ’best’ return of results and heaviest It simulates periodicity in meeting patterns. Figures 4,5 show site query processing. This serves as the vanilla scenario the performance of the scheme for 20 and 40 people in the against which we compare the next two scenarios. system. • Repl(Heavy,Fast) - This involves prereplication, ’fast’ E. Shortest path People movement return of results and heviest site query processing. • Repl(Near,Fast) - This is same as Repl(Heavy,Fast) ex- With the popularity of portable devices like hand held PDA’s cept that nearest site query processing scheme is used. and cell phones, DTNs with such devices as nodes is another potential scenario. Hence the movement patterns for people C. Random Way point give an idea of connectivity in such an environment. The The random way point model simulates randomly moving scheme was evaluated for 20 and 40 nodes in the system [see nodes. Although, it is a unrealistic mobility model since most figure 6,7].
  • 6. Fig. 4. Query processing throughput for Tram system Fig. 7. Latency for People interactions were generated by contacts between metro buses, in Amherst. The number of nodes in the DieselNet traces is 34. Figures 8,9 present the results of the simulation run for about a week. Fig. 5. Latency for Tram system Fig. 8. Query processing throughput for DieselNet G. ZebraNet Monitoring wildlife and natural phenomena is another in- teresting area for DTNs. It also presents a good case, where a query processing scheme could be invaluable. ZebraNet is a real world deployment of sensors on 5 zebras. See figure 10,11 for results. IX. O BSERVATIONS In general, Pre-replication provides improvements in latency Fig. 6. Query processing throughput for People interactions as well as the success ratio of the queries.However, the extent of benefit varies from one movement model to another. F. DieselNet • The pre-replication benefits are around 30-40% for suc- DieselNet is a realtime trace which exemplifies the appli- cess rate. cability of our scheme for a real world scenario. The traces • Benefits for latency are around 30-50%.
  • 7. latency. X. C ONCLUSION AND F UTURE W ORK This is the first work to address the problem of query scheduling in a completely decentralized and adhoc , delay tolerant network. This is a very interesting research area with enormous scope for improvement. We intend to pursue further research along these lines. • Building a DB Aware packet replication scheme with multicast capability would help improve performance since the query issuals are inherently multicast traffic. • If the intermediate nodes cache previous results, queries could be answered from these caches, improving latency. However, this would involve complex tradeoffs in buffer management. Also, the applications should be willing to accept some degree of staleness in the results. Fig. 9. Latency for DieselNet • Given that many real world DTN environments have near periodic events, a local prereplication scheme could be used wherein the nodes perform the prereplication based on the queries that they answer and route. This is different from the current scheme where the global popularities are used. • A very important contribution would be to improve the heuristics used for query scheduling. Ways to reduce multi table joins, parallelly in this environment, is a open problem. • In our scheme we were not able to use bandwidth between nodes as a metric since the simulator doesnot provide a way to extract contact durations between nodes. Looking into this aspect can yield better more practical heuristics. • With better simulation results, the system needs to be im- plemented on real hardware to exemplify its real benefits. The work identifies an area that has not received due Fig. 10. Query processing throughput for ZebraNet attention before. We have proposed query scheduling and prereplication mechanisms to enable applications to manage data in a flexible manner, in a DTN setting. The simulation results prove the validity of the approaches, followed in the paper. We strongly believe that such a distributed solution, will really foster the growth of killer applications that can popularize the notion of DTNs. XI. R EFERENCES [1] Kevin Fall, Delay Tolerant Network Architecture for Challenged Internets, Sigcomm 03, August 25-29 [2] Xuwen Yu and Surendar Chandra, Delay tolerant collaborations among campus-wide wireless users, IEEE Infocom 2008 [3]Aruna Balasubramanian, Brian Neil Levine and Arun Venkataramani, DTN Routing as a Resource Allocation Problem, Sigcomm 07, August 27-31 [4] Janico Greifenberg and Dirk Kutscher, Efficient Fig. 11. Latency for ZebraNet Publish/Subscribe-based Multicast for Opportunistic Networking with Self Organized Resource Utilization, IEEE 2008 • In most cases the Nearest site heuristic performs better [5] Donald Kossmann,The State of the Art in Distributed than Heaviest Site in terms of sucess rate as well as Query Processing, ACM 2001
  • 8. [6] Hassan Artail, Haidar Safa and Manal Shihab, Implementation of a Federated Database on Bluetooth- Enabled Mobile Devices, ICPS 2008, July 6-10, ACM 2008 [7] S. Jain, K Fall, R Patra, Routing in delay tolerant network, ACM SIGCOMM 2004, pg 145-158 [8] Yang Zhang, Bret Hull, Hari Balakrishnan and Samuel Madden,ICEDB: Intermittently connected continuous query processing [9] Stewart GreenHill, Svetha Venkatesh, Distributed query processing for mobile surveillance, Proceedings of the 15th international conference on Multimedia, 2007 [10] J. Burgess, B. Gallagher, D. Jensen, and B. N. Levine, MaxProp: Routing for Vehicle-Based Disruption-Tolerant Networks. In Proc. IEEE Infocom, April 2006 [11] Philo Juang and Hidekazu Oki and Yong Wang and Margaret Martonosi and Li-shiuan Peh and Daniel Rubenstein, Energy-Efficient Computing for Wildlife Tracking: Design Tradeoffs and Early Experiences with ZebraNet,ASPLOS-X conference,San Jose,Oct 2002 [12]Pentland, A.; Fletcher, R.; Hasson, A., ”DakNet: rethinking connectivity in developing nations,” Computer , vol.37, no.1, pp. 78-83, Jan. 2004 [13] UMass DieselNet Project: http://prisms.cs.umass.edu/dome/umassdieselnet [14] The ONE Simulator: http://www.netlab.tkk.fi/tutkimus/dtn/theone/