KID Model-Driven Things-Edge-Cloud Computing Paradigm For Traffic Data As A Service
[…]ing for traffic data, and there is a need to expand them for TDaaS applications.

Graph is a knowledge representation that is composed of vertices and links [11]. A vertex represents an entity, and a link holds a relation between two entities. The whole Graph is a set of entities and their relations, which are defined in Vocabulary. Entities in Graph can be inputs of Algorithm, and outputs of Algorithm enrich Graph in return.

Algorithm is a pool that holds a collection of executable programs. Although they are regarded as a type of knowledge that can extract features, discover moving patterns, or even support decision making, they cannot be represented in Graph. They are deployed in a separate runtime environment and can be exported to other systems as services.

For better understanding, examples of Vocabulary, Graph, and Algorithm are given in Fig. 2. In the Vocabulary, GPSSensor is a data type definition, longitude is a property, and embeddedOn is a relation; in the Graph, GPSSensor#001 and Taxi#002 are entities; and in the Algorithm, the presented instance is called vehicleSpeedEstimation, which takes vehicle position records as input and outputs vehicle speeds as a time series.

[Figure 2 content: diagram linking Vocabulary, Graph, and Algorithm. Entities of Graph are inputs of Algorithms; outputs of Algorithms enrich the Graph. Graph example: entities GPSSensor#001 (Longitude 139.0011, Latitude 39.0022), Taxi#002, and EdgeServer#003, with relations embeddedOn and servedBy. Algorithm example: URL: algorithms/vehicleSpeedEstimation/; Input Demo: {values:[{lng: 139.0011, lat: 39.0022, t: 08081322}]}; Output Demo: {values:[{t:08081322, speed:36.5}]}]
FIGURE 2. Knowledge repository structure and examples.

Overview of the KID-Driven TDaaS System
When the TEC paradigm is embedded with the KID model, it is enhanced with the KID model’s cognitive ability to smartly:
• Direct data to the cloud or edge server for processing
• Direct information (i.e., data endowed with meaning) to the global or local K-store for assimilation
• Direct knowledge to the cloud for sharing or to the local K-store for fusion.

[…] of the KID-driven TDaaS system under the TEC computing paradigm. As Fig. 3 shows, the KID model is embedded in the edge server. Knowledge triples 〈Vocabulary, Graph, Algorithm〉 in the K-store are retrieved to support the three cognitive processes, that is, interpretation, assimilation, and instantiation. Data are therefore transformed into information, and […] processes, some data, information, and knowledge are stored in the local D-store, I-store, and K-store, but some are sent to the cloud. The edge server has service workers that provide services directly back to the IoT devices if information and knowledge from other edge servers are not required. That is, future IoT devices are transparent clients that can be both data collectors and service receivers in a local processing mode. In this mode, data processing and service provision are conducted completely locally without going through the Internet to the cloud.

For some cases, the source data or the intermediate results of local processing have to be sent to the cloud. Of course, the cloud generally holds more computational and storage resources; balancing these resources considering network bandwidth and network traffic conditions is a key issue. The KID model aims to optimize the balance between resource allocation and computation with its growing knowledge as it interprets external data, similar to humans’ recognition of the external world.

Determining what data to send, where to send it, and how to send it is done by the KID model based on its awareness of both data resources and data service situations while it continuously interprets incoming data using prior knowledge in the K-Store. Knowledge in a local K-Store can be updated to the global K-Store in the cloud, and any local K-Store can share the knowledge in the global K-Store with permission.

In a TDaaS system, the data providers are IoT devices and the service receivers are Internet of Things (IoT) devices/things or human users. Service providers can be the edge or cloud servers. The data providers do not need to know how data are collected and processed; these processes are transparent to them. Similarly, how the service is formed is transparent to service receivers. Three levels (data collection, processing and integration, and services) are transparent to data providers and service receivers. In the next section, we describe how the KID model enables the three-level transparency.

TDaaS Example
When requesting a TDaaS, such as an estimation of whether there are any traffic events, a user first posts the request to a URL on the cloud server. The cloud server parses the URL and maps it to the service algorithm trafficEvents(). Moreover, algorithm parameters to specify the time span and road sections for the required traf-
[Figure: system architecture diagram; caption not recovered. Cloud server labels: Data as a service (real-time traffic condition, history event influence, traffic flow distribution); Big data service engine (service parsing, knowledge matching, data mining); Global K-store (global vocabulary, global graphDB, global algorithm). Edge server labels: Service worker; Sub-task parsing; Scheduling; D-store (streaming data, HDFS files); Interpretation; I-store; RDBMS; Assimilation; Local K-store (local vocabulary, local algorithm, local graphDB). The arrows are marked D, I, and K.]
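As an illustration of the vehicleSpeedEstimation instance shown in Fig. 2, the following is a minimal Python sketch, not the authors' implementation. It follows the input and output shapes of the figure ({values:[...]} records with lng, lat, and t), but several points are assumptions: the timestamp t is treated as plain seconds (the encoding of values such as 08081322 is not specified in the text), the speed unit is taken to be km/h, and the distance between consecutive positions is computed with the haversine formula. The function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lng1, lat1, lng2, lat2):
    """Great-circle distance in metres between two (lng, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def vehicle_speed_estimation(payload):
    """Turn position records into a speed time series.

    Assumption: each record is {"lng": degrees, "lat": degrees, "t": seconds}
    and the output speed is in km/h.
    """
    records = sorted(payload["values"], key=lambda r: r["t"])
    speeds = []
    for prev, cur in zip(records, records[1:]):
        dt = cur["t"] - prev["t"]
        if dt <= 0:
            continue  # skip records that share a timestamp
        dist = haversine_m(prev["lng"], prev["lat"], cur["lng"], cur["lat"])
        speeds.append({"t": cur["t"], "speed": round(dist / dt * 3.6, 1)})
    return {"values": speeds}
```

Each output entry carries the timestamp of the later of two consecutive records, so n input records yield at most n - 1 speed values.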
KID Model Enables Transparency
The KID model [12] was inspired by human cognitive processes: signals that stimulate the brain are defined as data and denoted as D; subsequently, the nervous system transforms data into information, denoted as I; finally, the brain understands the information and transforms it into knowledge, denoted as K, and stores it in memory by establishing connections with other related knowledge. Figure 4 shows an example of these processes of transforming data into knowledge. Driven by KID, the TEC computing paradigm is transparent at three levels, data, model, and services, which correspond to three abstract functions: interpretation(), assimilation(), and instantiation(), respectively. These three processes are defined as follows. For clarity, the notations in the equations are consistent with those in Fig. 4.

Interpretation for Data Transparency
After data are collected by IoT devices, they are outside the information systems. Here, interpretation() means endowing the data with meaning by finding associative knowledge, which is stored in the K-Store, to interpret it. This process can be expressed as

\[ \mathrm{interpretation}() : \{ D_1^1, D_2^1, \ldots, D_m^1 \} \xrightarrow{K} I_1 \tag{1} \]

where $D_1^1, D_2^1, \ldots, D_m^1$ represent input data, $I_1$ represents the newly produced information, and the $K$ on the arrow represents the predefined knowledge in the K-Store. This indicates that when producing new information, more than one data source may be used.

This process is transparent with respect to data because the data model is predefined in the vocabulary, and each data model is bound with a specific interpretation(). The system can intelligently recognize the incoming data and pass it to the specific interpretation() that produces the wanted information.

Assimilation for Model Transparency
The newly generated information is assimilated by assimilation() into the body of the K-store by linking it to existing relevant knowledge. This means that not only are new entities added, but also new relations are established. This process can be expressed as

\[ \mathrm{assimilation}() : \{ I_1, I_2, \ldots, I_t \} \xrightarrow{K} K_1 \tag{2} \]

where $I_1, I_2, \ldots, I_t$ represent the information produced in the interpretation process and $K_1$ is the new knowledge, which is added into the K-Store, enriching it.

This process is transparent with respect to the model because it is triggered by the former process, automatically absorbs information as knowledge, and puts knowledge in the correct place, on either the edge or cloud servers.
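To make the two mappings in Eqs. (1) and (2) concrete, the following is a minimal Python sketch under stated assumptions; it is not the implementation used in the paper. The KStore class, the dictionary-based vocabulary binding, and the gps_to_information interpreter are all hypothetical names introduced for illustration. The sketch shows interpretation() selecting the interpreter bound to a data type in the vocabulary, and assimilation() linking the resulting information into the graph as new triples.

```python
from dataclasses import dataclass, field


@dataclass
class KStore:
    """Toy K-store holding a <Vocabulary, Graph, Algorithm> triple (hypothetical)."""
    vocabulary: dict = field(default_factory=dict)  # data type -> interpreter name
    algorithms: dict = field(default_factory=dict)  # interpreter name -> callable
    graph: set = field(default_factory=set)         # (subject, relation, object) triples


def interpretation(data_items, k_store):
    """D -> I: look up the interpreter bound to the data type and apply it."""
    data_type = data_items[0]["type"]
    interpreter = k_store.algorithms[k_store.vocabulary[data_type]]
    return interpreter(data_items)


def assimilation(information, k_store):
    """I -> K: add unseen triples to the graph and return only the new ones."""
    new_triples = set()
    for info in information:
        triple = (info["subject"], info["relation"], info["object"])
        if triple not in k_store.graph:
            new_triples.add(triple)
    k_store.graph |= new_triples
    return new_triples


def gps_to_information(data_items):
    """Hypothetical interpreter: states which vehicle a GPS sensor is mounted on."""
    first = data_items[0]
    return {"subject": first["sensor"], "relation": "embeddedOn", "object": first["vehicle"]}
```

With a K-store built as KStore(vocabulary={"GPSSensor": "gps"}, algorithms={"gps": gps_to_information}), a GPS record is first turned into one piece of information by interpretation() and then stored as a graph triple by assimilation(). Calling assimilation() again with the same information returns an empty set, since nothing new is learned.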
[Figure: four plot panels, (a) to (d); caption and y-axis labels not recovered. Panels (a) to (c): x-axis "Data size (Mb)", ticks 130.1, 247.4, 385.1, 515.9, 639.2, 763.5, 930.1. Panel (d): x-axis "Entities in the local knowledge graph", ticks 10,000 to 70,000. Legend entries: TC1, TC2, TEC, KID-TEC.]
[…]ers. Data are transformed into knowledge before transmission to the cloud. This process reduces the computational burden of the cloud.

Data Description
There are five kinds of data collected in the system: GPS locations, highway pass records, videos, events, and weather data. GPS locations and video data are complex data, and to handle them, high-capacity and high-performance systems are required to meet real-time requirements. For GPS locations, a complex model is needed to link and match them on the road. About 1000 vehicles pass through a road section; hence, millions of records must be processed because there are thousands of road sections. A surveillance camera with 720p resolution (1280 × 720) produces about 42 GB of video per day, which is sent to the edge servers as streaming data. To estimate the performance of the proposed KID-driven TEC, the simulation experiment takes video files ranging from about 100 MB to about 1 GB in size. Knowledge is extracted from these unstructured videos.

For the algorithm, our video processing algorithm counts the number of vehicles in each frame. Using the OpenCV framework, this function first recognizes the background and then picks out moving areas, which should be vehicles. Morphological algorithms are also applied to eliminate interference caused by noise. The algorithm is implemented in Python and contained in a Python sandbox by Anaconda so that the dependencies can be transmitted along with the codes. The sandbox uses 54 Mb.

Experiments
Three Windows Server 2012 64-bit systems (12 CPUs, 128 GB main memory) were used for simulating the edge servers. A Tencent Cloud (https://cloud.qq.com/) server was used as the cloud. The proposed method was compared with a cloud computing model without edge servers. To simulate a surveillance video camera with computational ability, a smartphone with an Android operating system was utilized. The network bandwidth within the local area network was 128 Mb/s. The related algorithms were implemented with Java using OpenCV for video processing.

Within the TEC, the surveillance cameras were only responsible for collecting and transmitting raw data. The edge servers received and processed the video stream and then discovered the different moving patterns caused by traffic accidents. Next, the edge servers transferred the results to the cloud as knowledge.

The experiments considered the following scenarios (Fig. 6):
• IoT devices collect data that are then transmitted to the cloud, which deals with the data for decision making (TC1).