
2017 International Conference on Mathematics and information Technology, Adrar, Algeria – December 4 - 5, 2017 196

A new approach based mobile agent system for ensuring secure Big Data transmission and storage

KASSIMI Dounya, KAZAR Okba, and SAOULI Hamza
LINFI Laboratory
Department of Computer Science, University of Mohamed KHIDER, Biskra, Algeria
dounya_kassimi@yahoo.fr, kazarokba@yahoo.fr, hamza_saouli@yahoo.fr

SAIFI Safa and HASSANI Iman
Department of Computer Science, University of Mohamed KHIDER, Biskra, Algeria
safa16saif@gmail.com, rose.rose.hass@gmail.com

BOUSSAID Omar
ERIC Laboratory - Warehouse, Knowledge Representation
and Engineering
Department of Psychology of Health, Education and
Development (PSED), University Lumiere Lyon 2
Bron, Rhone-Alpes, France
omar.boussaid@univ-lyon2.fr

978-1-5386-3269-7/17/$31.00 ©2017 IEEE

Abstract—Big Data dwarfs all the knowledge we have accumulated in this decade, and will continue to do so for the rest of our natural lives. It is more than just a lot of data; it marks the beginning of the end of industry experience as a core competitive advantage. In this paper we treat the problem of Big Data security and privacy using mobile and stationary agent technologies. Section 2 presents some related works and Section 3 explains the proposed architecture, which is mainly based on an Integrity agent for verifying data concordance, a Path agent to ensure data transmission, and an Intrusion detection agent to scan the stored data. In Section 4, we implement a prototype and use a real Big Data base to test the effectiveness of using multi-agent systems to improve data security and privacy.

Keywords—Big data; Security; Multi-Agent system; Pentaho

I. INTRODUCTION

Data protection has become a major problem that needs our attention, especially since the emergence of the notion of Big Data. This notion involves volume, variety and velocity constraints [1], which increase the number and types of threats.

Big Data is based on a set of technologies such as Hadoop, MapReduce and Spark, which must secure and protect the petabytes of data produced each day.

By security we essentially mean confidentiality, integrity, availability and data utility. To satisfy all these security constraints, we need to take into consideration data routing, intrusion detection and the access level of each user.

In this paper we propose a new agent-based approach to ensure Big Data security. The main objectives are: ensuring secure data transmission, avoiding data loss, scanning for and detecting any intrusion, and protecting the data already stored by taking into consideration the access level of system users.

The remainder of the paper is organised as follows: Section 2 reviews related works and their drawbacks, Section 3 presents the proposed architecture, Section 4 covers realisation and implementation, and Section 5 concludes this work.

II. RELATED WORKS

In [2], the authors presented a system called Crypsis for analyzing and preserving information in Big Data. Crypsis uses homomorphic cryptography and is based on both an extended program perspective and a system perspective. This method has its risks because it is not always available, and Crypsis does not address integrity and availability issues.

In [3], the authors presented a set of methods for protecting privacy that were previously proposed by computer scientists and statisticians. They also presented the problems of those methods, for example aggregation (which suffers from the problem of ecological inference) and deletion (which can delete integer values, producing data that is incomplete and difficult to parse).

In [4], the authors addressed multi-level security (MLS), which is generally based on a formal model called the Bell-LaPadula model. MLS changes how privacy is protected in SELinux. The weakness of MLS is that it implements a single security objective, protecting the confidentiality of sensitive data, and uses the model of government document classification in a strict, inflexible manner.
In [5], the authors propose a new hybrid-cloud MapReduce architecture (IntegrityMR) to assure integrity, which benefits from the control offered by a private cloud while using the computing capacity of public clouds. However, this architecture has some problems, such as the cost of inter-cloud communications. Hybrid clouds also require channels between the private cloud and the public clouds, which is a security weakness because these channels increase the likelihood of a malicious attack on the private cloud.
In [6], the authors use a strategy based on the gradual perturbation of a curve while keeping track of the error introduced. This allows them to refine the PL curve, if necessary, and to avoid erroneous topological changes. This work enables high-performance simulation and visualization.
In [7], the authors review the current achievements of several representative approaches to integrity verification by external parties. They single out the cryptographic algorithms because these have many drawbacks: they are not well suited to Big Data, the preprocessing time can sometimes be considerable for incremental data sets, and the malicious control procedure consumes a lot of computing resources at the expense of the data owner.

III. PROPOSED ARCHITECTURE

To design our architecture we use the agent paradigm. The motivation for using a multi-agent system comes from the characteristics that agents provide: autonomy, intelligence, parallel processing and cooperation.

The security process must be smart and efficient. The previous works cited in Section 2 have limits that can be overcome by using the agent paradigm.

In our proposal we define a solution for the four criteria considered (integrity, authentication, privacy and access control): two agents for integrity (the Mobile agent and the Integrity agent), one agent for authentication (the Authentication agent), and two agents for privacy and access control (the Access Level agent and the Interface agent). We also have an agent for intrusion detection.

The overall architecture of the proposed system is depicted in Figure 1.

Fig. 1. Multi-agent system architecture.

External security system: this system is represented by four agents named MOBILE AGENT, PATH AGENT, AUTHENTICATION AGENT and INTEGRITY AGENT. These agents make sure that the file or the information to be stored is safe and arrives complete.

Internal security system: it contains three agents: ACCESS LEVEL AGENT, INTERFACE AGENT and INTRUSION DETECTION AGENT. Their job is to protect the privacy of the users and the information stored in the data center. Their principle is based on references [9], [10] and [11].

A. External security system

In this section, we select one agent to represent the external security system. We name it MOBILE AGENT, and its role is:
• Protecting the data path;
• Guaranteeing that no data is lost in transit;
• Ensuring that there is no outside intervention;
• Working with the integrity agent and the virtual router agent.


The MOBILE AGENT is represented in Figure 2.

Fig. 2. Mobile agent architecture.

B. Internal security system

Of the agents in this part of the system, we present one. The role of the INTRUSION DETECTION AGENT is:
• Scanning the stored, segmented data itself (looking for viruses and performing the redundancy check), because we are trying to solve the following problems:
─ Large amount of data (processing capacity is limited) and variety (easy intrusion);
─ Real time (speed);
• Releasing processing capacity.

It is represented in Figure 3.

Fig. 3. Intrusion detection agent architecture.

C. Modeling the operational space

The sequence diagram in Figure 4 is explained as follows.

External security: the mobile agent is activated on receipt of a message from the web server that seeks to store its information in the Big Data base.

The mobile agent sends an activation message to the virtual router agent, which in turn sends an activation message to the authentication agent. The authentication agent verifies the data source (it checks the certificate of the connected server). It then sends a notification of acceptance or rejection of the connection to the virtual router agent, which sends a message to the mobile agent.

• If the received message is a notification of acceptance, the virtual router agent protects the connection port and sends a message to the mobile agent to protect the path over which the data is transferred. Then, it sends an activation message to the integrity agent to analyze the data. It must connect to Hadoop (HDFS) to store the data using the Name Node and the Data Nodes. When data storage has finished, Hadoop notifies the mobile agent so that it releases the connection port.
• If the received message is a notification of rejection, the virtual router agent and the mobile agent simply cut the connection to the server and release the port.

Internal security: this part has two components:

• User part: the interface agent is activated on receipt of the client request. It then sends an activation message, together with the request to be processed, to the access level agent, which looks for the response to the client's query by communicating with the Data Nodes, the Name Node, the Job Tracker and the Task Tracker.
• System part: the intrusion detection agent connects to the Job Tracker (J.T) and the Task Tracker (T.T) to scan the data stored in the Data Nodes, and also searches for problems in the content of the Secondary Name Node and the Name Node.
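The external security exchange described in this subsection can be sketched as a small simulation. The following Python sketch is illustrative only and is not the JADE implementation of the prototype: the names Upload and ExternalSecurity, the message strings, the use of a SHA-256 digest declared by the sender as the integrity test, the dictionary standing in for HDFS, and the choice to release the port without storing when the digest does not match are all assumptions made for this example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Upload:
    """A storage request from a web server (hypothetical structure)."""
    server: str
    certificate_valid: bool   # outcome of the server certificate check
    payload: bytes            # data as received over the protected path
    declared_sha256: str      # digest announced by the sender (assumption)


@dataclass
class ExternalSecurity:
    """Simulates the mobile, virtual router, authentication and integrity agents."""
    trace: list = field(default_factory=list)   # ordered log of exchanged messages
    store: dict = field(default_factory=dict)   # stand-in for HDFS (assumption)

    def handle(self, upload: Upload) -> bool:
        log = self.trace.append
        log("webserver -> mobile: store request")
        log("mobile -> router: activate")
        log("router -> authentication: activate")
        verdict = "accept" if upload.certificate_valid else "reject"
        log(f"authentication -> router: {verdict}")
        log(f"router -> mobile: {verdict}")
        if verdict == "reject":
            # Rejected source: cut the connection and free the port.
            log("mobile: cut connection, release port")
            return False
        log("router: protect port")
        log("mobile: protect path")
        log("integrity: activated")
        # Assumed integrity test: received bytes must match the declared digest.
        digest = hashlib.sha256(upload.payload).hexdigest()
        intact = digest == upload.declared_sha256
        if intact:
            self.store[upload.server] = upload.payload
            log("integrity -> hdfs: store")
            log("hdfs -> mobile: storing finished")
        else:
            # Behaviour on mismatch is not specified in the text (assumption).
            log("integrity: digest mismatch, data not stored")
        log("mobile: release port")
        return intact
```

For a request whose certificate is valid and whose payload matches the declared digest, handle returns True and the trace ends with the release of the port; a rejected certificate stops the exchange before any data is stored.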

Fig. 4. Sequence diagram of the proposed system.

IV. REALISATION AND RESULTS

To evaluate the performance of the proposed system, we use a machine running the Windows operating system with 4 GB of RAM and a 2.6 GHz microprocessor. We also downloaded [12, 13, 14, 15, 16] and generated the Big Data base rows in order to use them with the JADE platform to implement the system components.

We also used five main classes to implement the proposed prototype:

1) MultipartUploadUtility: implements the functionality to upload a file to a server through an HTTP multipart request.
2) UploadTask: this class extends the javax.swing.SwingWorker class to perform the upload in a background thread so that the GUI does not freeze while the progress bar is being updated by Swing's event dispatching thread (EDT).
3) SwingFileUploadHTTP: this is the main application, which is a JFrame and displays the GUI. It implements the java.beans.PropertyChangeListener interface to be notified about the upload progress sent by the UploadTask class. This class also uses the JFilePicker class to show a file chooser component. The JFilePicker, in turn, uses the FileTypeFilter class [8].
4) DNA: based on the DNA algorithm for encrypting and decrypting files.
5) Path: creates the connection port used to transmit the files.

Fig. 5. Class diagram of our system.

Figure 6 shows the Pentaho platform, on which we launch the Hadoop cluster (the Big Data platform used with various projects). We also use this platform to transform XML, SQL and JSON data and load it into the Hadoop distributed file system (HDFS) in order to store it in Hadoop. Before this, the files are processed by the proposed external security system and checked by the Integrity agent.

Fig. 6. Pentaho job controller interface.

TABLE I. COMPARISON TABLE

Approach / Criteria                          Integrity   Authentication   Privacy Policy   Access control
Preservation and Analysis of Big Data [2]    X           √                √                X
Privacy Policy [3]                           √           √                √                √
Privacy protection in SELinux [4]            √           √                √                X
External integrity check [5]                 √           X                √                (-)
Integrity assurance framework [6]            √           (-)              √                (-)
Big data viewing [7]                         (-)         (-)              (-)
Proposed approach                            √           √                √                √

A. Comparison of the related works and the proposed approach

Using MAS, the four criteria can be treated as follows:

Integrity: we treat the problem of integrity using the mobile agent and the Integrity agent, so we can control all the information that we receive and find out the real sender.

Authentication: using the Authentication agent and the virtual router agent, we can decide whether or not to trust the source of the information.

Privacy: we protect the privacy of the information and the privacy of our users with both the Access Level agent and the Interface agent; with them we ensure that no one sees information that he is not allowed to see.

Access control: using the Access Level agent and the Interface agent, we control which users access the information. We decide, for each user, the information that he can consult.
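The access control criterion can be illustrated with a minimal sketch. It assumes that each stored item carries a numeric minimum access level and that each user is assigned a level; the names Record, consultable and RECORDS, as well as the sample items and levels, are hypothetical and do not come from the prototype.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Record:
    key: str
    required_level: int   # minimum access level needed to read (assumption)


def consultable(user_level: int, records: List[Record]) -> List[str]:
    """Return the keys of the records that a user of this level may consult."""
    return [r.key for r in records if user_level >= r.required_level]


# Hypothetical sample data, ordered from least to most restricted.
RECORDS = [
    Record("public_stats", 0),
    Record("customer_rows", 1),
    Record("audit_log", 2),
]
```

In this sketch the filter plays the part of the Access Level agent, while the Interface agent would only ever show the user the keys that the filter returns.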

B. Evaluation

The architecture that we propose allows us to reduce network traffic and bandwidth requirements by using the mobile agent, and also provides robustness and fault tolerance. Using the DNA algorithm for deciphering the files, we can process a large number of files, because the DNA algorithm is one of the popular algorithms used by users. We need to extend it to support the integration of hundreds of access control policies. At the same time, we can automatically grant permissions to large data sets. We can ensure that the data is error-free and protected from malicious parties, but we are aware that the solution is not comprehensive; we need to extend it to handle more kinds of malicious behaviour. With the policies that we are using, we can ensure data confidentiality while at the same time making the data available to everyone; this is inefficient, and we need to integrate more policies into our system.

A comparative study based on user experience [17] compares Pentaho with two other Business Intelligence software solutions available on the market. For this comparison the tools themselves are examined. It is also possible to compare their scores (8.4 for Jaspersoft vs. 8.1 for Pentaho vs. 9.7 for Sisense) and user satisfaction levels (100% for Jaspersoft vs. 95% for Pentaho vs. 99% for Sisense).

Fig. 7. Comparative study between Business Intelligence software [17].

V. CONCLUSION

In this paper we have proposed a new agent-based architecture for Big Data security and safety. The proposed system offers a set of solutions to some of the disadvantages of related works by using: mobile and virtual router agents to protect the data paths, an intrusion detection agent to detect malicious programs, and authentication and integrity agents to make sure that the stored Big Data conforms to the originally sent data.

We have also used the Pentaho platform to deploy Hadoop clusters to manage the Big Data base and to deploy the multi-agent system.

Finally, we plan to address the problem of third-party trust, especially when the system is deployed on a cloud platform, to ensure Big Data integrity and confidentiality.

REFERENCES

[1] H. Saouli, O. Kazar, D. Kassimi, "Applications et enjeux des Big Data dans le contexte des défis mondiaux", Conférence sur Les Avancées des Systèmes Décisionnels (ASD), 2016, pp. 41-52.
[2] J. J. Stephen, S. Savvides, R. Seidel, P. Eugster, "Practical Confidentiality Preserving Big Data Analysis", Proceedings of the 6th USENIX Conference on Hot Topics in Cloud Computing (HotCloud'14), pp. 10-10, Philadelphia, PA, June 17-18, 2014.
[3] A. Machanavajjhala, J. P. Reiter, "Big Privacy: Protecting Confidentiality in Big Data", XRDS: Crossroads, The ACM Magazine for Students, Big Data: Volume 19, Issue 1, Fall 2012.
[4] X. Xu, C. Xiao, Chaoqin Gao and Guozhong Tian, "A Study on Confidentiality and Integrity Protection of SELinux", International Conference on Networking and Information Technology, 2010.
[5] Yongzhi Wang, Jinpeng Wei, Mudhakar Srivatsa, Yucong Duan, Wencai Du, "IntegrityMR: Integrity Assurance Framework for Big Data Analytics and Management Applications", IEEE International Conference on Big Data, 2013, pp. 33-40.
[6] Hugh P. Cassidy, Thomas J. Peters, Horea Ilies, and Kirk E. Jordan, "Topological Integrity for Dynamic Spline Models During Visualization of Big Data", Springer International Publishing Switzerland, 2014, pp. 167-183.
[7] Chang Liu, Chi Yang, Xuyun Zhang, Jinjun Chen, "External integrity verification for outsourced big data in cloud and IoT: A big picture", Future Generation Computer Systems, Volume 49, Issue C, August 2015, pp. 58-67.
[8] http://www.codejava.net/coding/swing-application-to-upload-files-to-http-server-with-progress-bar. Consulted on 13/04/2016.
[9] D. Boukhlouf, O. Kazar, "Hybrid Approach based Mobile Agent for Intrusion Detection System: HAMA-IDS", Journal of Information Security Research, ISSN: 0976-4143, Volume 3, pp. 30-41, March 2012.
[10] D. Boukhlouf, O. Kazar, "Intrusion Detection System: Hybrid Approach based Mobile Agent", ICEELI'2012: International Conference on Education and E-Learning Innovations, July 1-3, Sousse, Tunisia, 2012.
[11] D. Boukhlouf, O. Kazar, "Hybrid Approach based Mobile Agent for Distributed Intrusion Detection System", in Proceedings of ICDSD'2012: International Conference on Distributed Systems and Decision, ISSN 2335-1012, Oran, Algeria, 2012.
[12] http://aws.amazon.com/fr/publicdatasets/. Consulted on 10/03/2016.
[13] https://archive.ics.uci.edu/ml/datasets.html. Consulted on 25/03/2016.
[14] http://www.kdnuggets.com/datasets/. Consulted on 25/03/2016.
[15] https://www.opensciencedatacloud.org/publicdata/. Consulted on 12/03/2016.
[16] http://en.wikipedia.org/wiki/Wikipedia:Database_download. Consulted on 20/03/2016.
[17] https://comparisons.financesonline.com/jaspersoft-vs-pentaho. Consulted on 20/04/2016.
