1 Introduction
Today more and more products are being equipped with sensors and actuators,
and many of these are connected to the Internet. In the Internet of Things (IoT)
vision, these sensors and actuators become part of an IoT infrastructure that is
accessible to a variety of applications, typically through IoT services hosted in the
cloud. This enables a new class of opportunistic IoT applications that are not
configured or hard-wired for specific IoT services; they need to discover relevant
ones, e.g. in the current physical environment of a user, making discovery a key
functionality for an IoT infrastructure, as identified in the functional view of the
Architectural Reference Model (IoT ARM) [1]. (The IoT ARM was developed in the
European project IoT-A [2].)
R. Hervás et al. (Eds.): UCAmI 2014, LNCS 8867, pp. 424–431, 2014.
© Springer International Publishing Switzerland 2014
Geographic Service Discovery for the Internet of Things 425
With the number of connected IoT devices growing into the billions – e.g.
Cisco forecasts 50 billion devices connected by 2020 [3] – the discovery function-
ality needs to be highly scalable. To achieve the required scalability, a distributed
discovery approach is needed since the throughput of a single node in the cloud
is always limited. The number of nodes involved in each discovery operation
should be limited as well, as this may become the bottleneck for aggregation.
Geographic location based on geographic coordinates is a selective criterion
for the distribution, i.e. one node is responsible for a certain geographic area.
Geographic coordinates are easy to determine, e.g. by using GPS or selecting a
location on a map, and are also highly selective: there may be millions of services
providing air quality information, but only a few are related to the location of
interest. Within a single node, efficient access can be achieved by using a spatial
index structure. Some spatial index structures, like quadtrees [4] or kd-trees, are
used for indexing point locations, whereas R-trees [5] and their variants can also
be used for indexing area locations, which we need for indexing service areas.
In this paper, we propose a cloud-based distributed service discovery for the
Internet of Things, based on geographic scopes. Geographic Information Systems
(GIS) have been using spatial data infrastructures with catalogues for geograph-
ical information [6]. These are utilized for storing and accessing large amounts
of relatively static geographic information like roads or buildings, but not for
the discovery of services. On the other hand, there have been proposals for
ontology-based service discovery using symbolic locations [7]. Our approach
instead focuses on geographic scopes based on geographic coordinates. Our core
contribution is the measurement of key performance aspects to evaluate the
practical feasibility of such an approach, something we have not been able to
find elsewhere.
Section 2 gives an overview of the geographic service discovery architecture
and the functionalities provided. Section 3 presents the evaluation of our
prototype. We first look at the performance of a single node. We then analyze
the distributed case, both with a single provider, which allows a perfect
geographic partitioning where each node is responsible for a distinct geographic
area, and with multiple providers, whose geographic service areas overlap.
Finally, we provide a conclusion and an outlook on future work in Section 4.
2 Approach
The core idea of geographic discovery is to find information related to a
geographic area. The geographic area is given as a geographic scope. In addition,
the information to be discovered needs to be specified. The result of a geographic
discovery request is all the information whose geographic location matches the
geographic scope. Geographic location can be given as a point location or an
area location. For example, a point location may be suitable for describing the
location of small objects, whereas services may have larger service areas, e.g. the
area covered by a video camera.
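To make the matching of locations against a scope concrete, the following is a minimal sketch, not the prototype's code. It assumes that scopes and area locations are axis-aligned rectangles in longitude/latitude, that a point location matches when it lies inside the scope, and that an area location matches when it overlaps the scope. The class name Rect, its methods, and all coordinates are hypothetical, and rectangles crossing the 180° meridian are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in degrees (hypothetical helper type)."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains_point(self, lon: float, lat: float) -> bool:
        # A point location matches if it lies inside the scope (borders included).
        return (self.min_lon <= lon <= self.max_lon
                and self.min_lat <= lat <= self.max_lat)

    def intersects(self, other: "Rect") -> bool:
        # Assumption: an area location matches if it overlaps the scope at all.
        # Two rectangles overlap unless one lies entirely to one side of the other.
        return not (other.min_lon > self.max_lon or other.max_lon < self.min_lon
                    or other.min_lat > self.max_lat or other.max_lat < self.min_lat)


scope = Rect(8.60, 49.35, 8.75, 49.45)           # geographic scope of the request
sensor_point = (8.67, 49.41)                     # point location of a small object
camera_area = Rect(8.74, 49.44, 8.80, 49.50)     # service area partly inside the scope
far_area = Rect(9.00, 50.00, 9.10, 50.10)        # service area outside the scope

print(scope.contains_point(*sensor_point))
print(scope.intersects(camera_area))
print(scope.intersects(far_area))
```

Whether partial overlap or full containment should count as a match is a design choice; the sketch uses overlap so that a camera covering only part of the scope is still found.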
426 M. Bauer and S. Longo
2.1 Functionality
Overall, the approach follows the service-oriented architecture (SOA) paradigm [8].
In a typical interaction, a client queries the geographic discovery for service
descriptions (represented in RDF/XML), providing a service specification, which
specifies what services are of interest to the client, and a geographic scope, which
describes the geographic area for which the services are requested. The geographic
scope is then matched against the geographic service areas, and the service
descriptions are filtered according to the service specification. The fitting service
descriptions are returned to the client. Subsequently, the client may call one or
more of the services using the information provided.
To serve the different needs of applications, we see the requirement to sup-
port synchronous one-time discovery requests, as well as requests for continuous
asynchronous notifications informing about changes.
In addition, management operations for inserting, updating and deleting ser-
vice descriptions with the respective service areas are needed in order to update
index structures. For the purpose of this paper we only evaluate synchronous
discovery requests (with rectangles specified by the coordinates of two diagonal
vertices as scopes).
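A synchronous discovery request of this kind can be sketched as follows. This is an illustration under assumptions, not the prototype's interface: the names ServiceDescription, scope_from_diagonal and discover, the reduction of the service specification to a single service type, and the example URLs are all hypothetical. A linear scan stands in for the spatial index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDescription:
    locator: str        # URL where the service description is stored
    service_type: str   # simplified stand-in for the service specification
    area: tuple         # service area as (min_lon, min_lat, max_lon, max_lat)


def scope_from_diagonal(v1, v2):
    """Normalise two diagonal vertices (lon, lat) into a rectangle tuple."""
    (lon1, lat1), (lon2, lat2) = v1, v2
    return (min(lon1, lon2), min(lat1, lat2), max(lon1, lon2), max(lat1, lat2))


def overlaps(a, b):
    """True if two (min_lon, min_lat, max_lon, max_lat) rectangles overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def discover(services, service_type, v1, v2):
    """Return locators of services of the given type whose area overlaps the scope."""
    scope = scope_from_diagonal(v1, v2)
    return [s.locator for s in services
            if s.service_type == service_type and overlaps(s.area, scope)]


services = [
    ServiceDescription("http://example.org/air/1", "air_quality", (8.6, 49.3, 8.7, 49.4)),
    ServiceDescription("http://example.org/cam/1", "camera", (8.6, 49.3, 8.7, 49.4)),
    ServiceDescription("http://example.org/air/2", "air_quality", (10.0, 50.0, 10.1, 50.1)),
]

# The two vertices may be given in any order; the scope is normalised first.
result = discover(services, "air_quality", (8.75, 49.45), (8.65, 49.35))
print(result)
```

Normalising the two vertices up front means the client does not have to know which corner is the lower-left one.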
2.2 Architecture
Geographic Index Server. The geographic index server implements the discovery
and management operations described above, using a REST-like binding.
The internal subcomponents are the discovery indexer, based on the spatial index,
and the object information index. The discovery indexer implements the core
logic of the geographic index server. It indexes the geographic information using
either an in-memory spatial index based on an R-Tree [5] or a persistent spatial
index implementation, which internally also uses an R-Tree index. The in-memory
object information index stores other information associated with the
services, like the output of a service or the service type. We decided to use the
standard R-Tree data structure because we need a spatial index structure that
can handle rectangular geographic areas, as we are indexing service areas.
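The split between the spatial index and the object information index can be illustrated with the sketch below. It is a simplification under stated assumptions: a dictionary with a linear scan replaces the R-Tree (so it shows the interface, not the lookup performance), the REST-like binding is omitted, and the class and method names are hypothetical.

```python
from dataclasses import dataclass, field


def overlaps(a, b):
    """True if two (min_lon, min_lat, max_lon, max_lat) rectangles overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class GeographicIndexServer:
    # service id -> service area; a plain dict stands in for the R-Tree here
    spatial_index: dict = field(default_factory=dict)
    # service id -> other attributes such as service type or output
    object_info: dict = field(default_factory=dict)

    def insert(self, service_id, area, **info):
        self.spatial_index[service_id] = area
        self.object_info[service_id] = dict(info)

    def update(self, service_id, area=None, **info):
        if service_id not in self.spatial_index:
            raise KeyError(service_id)
        if area is not None:
            self.spatial_index[service_id] = area
        self.object_info[service_id].update(info)

    def delete(self, service_id):
        # Both indexes are kept in sync on every management operation.
        del self.spatial_index[service_id]
        del self.object_info[service_id]

    def discover(self, scope, **required):
        """Spatial match first, then filter on the object information."""
        hits = []
        for service_id, area in self.spatial_index.items():
            if not overlaps(area, scope):
                continue
            info = self.object_info[service_id]
            if all(info.get(key) == value for key, value in required.items()):
                hits.append(service_id)
        return hits


server = GeographicIndexServer()
server.insert("cam-1", (0, 0, 2, 2), service_type="camera")
server.insert("air-1", (1, 1, 3, 3), service_type="air_quality")
print(server.discover((1.5, 1.5, 4, 4), service_type="air_quality"))
```

In a real index server, only the body of the spatial lookup would change when the dictionary is replaced by an R-Tree; the management operations and the attribute filter keep the same shape.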
Distributed Architecture. Due to the large number of IoT services and the
required throughput to serve the expected number of application requests, a
single geographic index server will not be sufficient. Therefore, we propose a
distributed hierarchical architecture as shown in Figure 1. We introduce cata-
logue servers that do not store the service areas of IoT services, but rather the
service areas of geographic index servers. To discover IoT services, a client
first contacts the top-level catalogue server, which then uses the geographic
scope to identify the (small) set of geographic index servers that have
overlapping service areas. The request is then forwarded to this subset and the
results are aggregated.
In principle, a hierarchy of catalogue servers can be used as indicated in
Figure 1, since catalogue servers can transparently be used instead of geographic
index servers.
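The forwarding and aggregation logic can be sketched as follows. This is a simplified in-process model, assuming that catalogue and index servers share one discover interface and that a catalogue's own area is the bounding box of its children. The actual prototype communicates over a REST-like binding, which is omitted here, and all class names, server names, and URLs are hypothetical.

```python
def overlaps(a, b):
    """True if two (min_lon, min_lat, max_lon, max_lat) rectangles overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class IndexServer:
    def __init__(self, name, area, services):
        self.name = name
        self.area = area            # area this index server is responsible for
        self.services = services    # list of (locator, service_area)

    def discover(self, scope, contacted):
        contacted.append(self.name)
        return [loc for loc, area in self.services if overlaps(area, scope)]


class CatalogueServer:
    def __init__(self, name, children):
        self.name = name
        self.children = children    # index servers or further catalogue servers
        # Assumption: the catalogue's own area is the bounding box of its children.
        self.area = (min(c.area[0] for c in children), min(c.area[1] for c in children),
                     max(c.area[2] for c in children), max(c.area[3] for c in children))

    def discover(self, scope, contacted):
        contacted.append(self.name)
        results = []
        for child in self.children:
            # Forward only to children whose area overlaps the scope.
            if overlaps(child.area, scope):
                results.extend(child.discover(scope, contacted))
        return results


west = IndexServer("west", (0, 0, 5, 10), [("http://example.org/s1", (1, 1, 2, 2)),
                                           ("http://example.org/s2", (4, 4, 5, 5))])
east = IndexServer("east", (6, 0, 10, 10), [("http://example.org/s3", (7, 7, 8, 8))])
provider_b = IndexServer("provider-b", (4, 0, 8, 10), [("http://example.org/s4", (4.5, 4, 6, 6))])
root = CatalogueServer("root", [west, east, provider_b])

contacted = []
print(root.discover((4.2, 4.2, 4.8, 4.8), contacted))
print(contacted)
```

Because both classes expose the same discover method and an area attribute, a CatalogueServer can be passed as a child of another CatalogueServer, which mirrors the hierarchy described above. The overlapping area of the hypothetical provider-b server shows why overlap increases the number of servers contacted per request.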
3 Evaluation
This section evaluates our prototype of the geographic service discovery
based on geographic scopes. We show how our approach performs with respect
to throughput in different settings. For a single geographic index server, the
parameters we vary are the number of service descriptions stored, the number of
requests executed, the available network bandwidth, the size of the result set, and
the use of persistent and in-memory spatial index implementations. Finally, we
evaluate a distributed setting with partitioned as well as overlapping geographic
areas.
5 locators, i.e. the URLs where the service descriptions are stored, were returned
per request, encapsulated in an XML message with a response body size of
1578 bytes). The achieved average throughput was around 2,000 requests/second.
The tests demonstrated that the number of inserted service descriptions has
little if any influence on the discovery throughput. Tests performed on the
internal index structure show that, whether 500 or 100,000 services are
inserted in the geographic index server, the geographic discovery operation is
only marginally affected. Therefore, all following tests are based on a
geographic index server pre-populated with 10,000 services.
The next step was to analyze how the available network bandwidth limits
the geographic discovery throughput using different network configurations. To
change the network configuration, we used the netem [10] tool with the
following Ethernet configurations: 1 Mb, 10 Mb and 100 Mb uplink/downlink.
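A rough upper bound on the throughput a link can sustain follows from dividing the link rate by the response size. The sketch below is a back-of-the-envelope estimate, not a measured result. It assumes that "Mb" means Mbit/s, that each response carries the 1578-byte body mentioned above, and that HTTP headers, TCP/IP overhead, and request traffic can be ignored, so real limits would be lower. The function name is hypothetical.

```python
RESPONSE_BODY_BYTES = 1578  # response body size reported for 5 locators


def max_responses_per_second(link_mbit_per_s, payload_bytes=RESPONSE_BODY_BYTES):
    """Upper bound: link rate in bit/s divided by payload size in bits."""
    return link_mbit_per_s * 1_000_000 / (payload_bytes * 8)


for rate in (1, 10, 100):
    bound = max_responses_per_second(rate)
    print(f"{rate:>3} Mbit/s: at most about {bound:.0f} responses/second")
```

Under these assumptions, the two slower links cannot carry the roughly 2,000 responses per second that a single server achieved, so the network rather than the index would be the limiting factor there.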
The evaluation of the distributed approach with one catalogue server took
into consideration the single and multi-domain approaches. In the first case there
is a single operator that will serve a particular geographic area that could be
partitioned as shown on the right side of the tree in Figure 1. In this case
each geographic index server could be assigned to a specific area without any
overlaps. In the case of the multi-domain approach the overlap between areas
cannot be prevented as shown on the left side of the tree in Figure 1. The
tests were performed using the same testbed configuration with one catalogue
and four geographic index servers running on the server. Results are shown in
Figure 4. Compared to the single geographic index server evaluation, the
overall performance decreased, which seems reasonable because we introduced
an additional layer between the test client and the geographic index server.
The maximum catalogue throughput achieved in this environment was about
650 requests per second for the single-domain approach. The penalty is almost
2/3 of the throughput achieved on a single server (2,000 requests/second). The
performance comparison shows that the penalty for having overlapping service
areas is visible, but limited, as shown in Figure 4.
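The "almost 2/3" figure can be checked directly from the two throughput values reported above. The short calculation below uses only those two numbers; the variable names are illustrative.

```python
single_server = 2000   # requests/second on a single geographic index server
via_catalogue = 650    # requests/second through the catalogue, single-domain case

# Relative throughput lost by introducing the catalogue layer.
penalty = (single_server - via_catalogue) / single_server
print(f"relative throughput penalty: {penalty:.3f}")
```

The result, 0.675, is slightly above 2/3, so roughly one third of the single-server throughput remains.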
4 Conclusion
As can be seen from the evaluation, the available network bandwidth plays an
important role for the overall performance of the geographic discovery
infrastructure. This shows that a high selectivity of the request, i.e. limiting
the result set early in the process, is important. Using geographic scopes already
provides relatively high selectivity as compared to other parts of the service
description.
In addition, the network bandwidth should be taken into account when choosing
the representation of the information, e.g. a plain RDF/XML-based
representation is relatively verbose and thus has a negative impact on the throughput.
A distributed setting with a single catalogue server has a lower performance,
because the catalogue server has to wait for and aggregate responses from the
geographic index servers. On the positive side, it is comparatively cheap to
replicate catalogue servers, as the set of geographic index servers is expected to be
relatively stable compared to the set of IoT services, so the overhead of keeping
replicas synchronized is low. Based on the measurements we took, we believe that
it will be possible to build a scalable geographic discovery infrastructure for the
Internet of Things. As a next step, we plan to analyze a large-scale IoT scenario
with respect to the discovery request load it generates and evaluate what
geographic discovery infrastructure configuration is needed to support such a load
and whether such a configuration seems viable from a business perspective.
References
1. Bassi, A., Bauer, M., Fiedler, M., Kramp, T., van Kranenburg, R., Lange, S.,
Meissner, S. (eds.): Enabling Things to Talk: Designing IoT solutions with the IoT
Architectural Reference Model. Springer, Heidelberg (2013)
2. IoT-A European Project, http://www.iot-a.eu
3. Cisco IoT Forecast, http://share.cisco.com/internet-of-things.html
4. Finkel, R., Bentley, J.: Quad trees a data structure for retrieval on composite keys.
Acta Informatica 4(1), 1–9 (1974)
5. Guttman, A.: R-trees: a dynamic index structure for spatial searching. In: Pro-
ceedings of the 1984 ACM SIGMOD International Conference on Management of
Data, SIGMOD 1984, pp. 47–57. ACM, New York (1984)
6. Groot, R., McLaughlin, J.: Geospatial data infrastructure - Concepts, cases, and
good practice. Oxford University Press (2000)
7. Lutz, M.: Ontology-based descriptions for semantic discovery and composition of
geoprocessing services. Geoinformatica 11(1), 1–36 (2007)
8. Papazoglou, M.P., Traverso, P., Dustdar, S., Leymann, F.: Service-oriented com-
puting: State of the art and research challenges. Computer 40(11), 38–45 (2007)
9. Apache Benchmark Tool, http://httpd.apache.org/docs/2.2/programs/ab.html
10. Netem, Network Emulator Tool,
http://www.linuxfoundation.org/collaborate/workgroups/networking/netem