
2017 29th International Conference on Microelectronics (ICM)

Hybrid Compression Technique with Data Segmentation for Electroencephalography Data
Madyan Alsenwi¹,³, Mohamed Saeed¹,², Tawfik Ismail¹, Hassan Mostafa³,⁴ and Salam Gabran⁵
¹ National Institute of Laser Enhanced Science, Cairo University, Egypt.
² Institute of Aviation Engineering and Technology, Giza, Egypt.
³ Faculty of Engineering, Cairo University, Egypt.
⁴ Center for Nanoelectronics and Devices, AUC and Zewail City of Science and Technology, New Cairo 11853, Egypt.
⁵ Novela Incorporated, Canada.

Abstract: In medical applications, Electroencephalography (EEG) produces a large amount of data because of long recording times, high sampling rates, and the large number of electrodes. Efficient transmission and storage of this data therefore require considerable bandwidth and space, which makes EEG data compression an important problem. This paper introduces an efficient algorithm for EEG compression. First, the EEG data are segmented into N segments and then transformed through the Discrete Cosine Transform (DCT). The transformed coefficients are passed through a thresholding process, and the values below the threshold are set to zero. Finally, the resulting coefficients are coded using the Run-Length Encoding (RLE) scheme. The EEG signal can be recovered by the inverse process. The total time for compression and reconstruction (T), the Compression Ratio (CR), and the Percentage Root Mean Difference (PRD) are evaluated to check the effectiveness of the proposed algorithm. Simulation results show a significant improvement in the compression time when data segmentation is used.

I. INTRODUCTION
Currently, the transmission of biomedical signals through communication channels is one of the most important issues in medical applications. One example is the transmission of EEG signals. Recording EEG signals for several hours generates a large amount of data. Therefore, data compression techniques are required for efficient communication.
Data compression techniques can be classified into lossy and lossless compression. In lossless compression, the original data can be perfectly reconstructed without any distortion. In lossy compression, some of the data can be lost, which causes an imperfect reconstruction.
Compression of the EEG signal is a difficult task because of the randomness in the EEG signal. Therefore, it is difficult to achieve a high CR with lossless compression techniques [1].
In this paper, an efficient hybrid compression technique based on DCT and RLE is developed. The original EEG data is segmented into N segments before the compression process starts, in order to reduce the compression and reconstruction time (T) and increase the efficiency of the developed algorithm.
Since we have no EEG database of our own for testing and simulation, we use an EEG data set that is publicly available on the Internet [2]. The EEG data used in this paper is a collection from one differential electrode containing 32 channels. The data is sampled at 1000 Hz.
Several works have focused on EEG data compression. The work in [3] considered the use of the DCT algorithm alone for lossy EEG compression, but this algorithm is unable to achieve a high CR. In [4], DCT, RLE, and Huffman encoding are combined into an ECG data compression algorithm. A high CR can be obtained with this algorithm, but the compression and decompression processes take a long time. The study in [5] is based on three transformation methods: DCT, Discrete Wavelet Transform (DWT), and a hybrid (DCT+DWT) transform. Because DCT and DWT are both lossy, a higher distortion can occur in the reconstructed signal.
The main contribution of this work is to reduce the compression time by segmenting the EEG signal into N parallel data streams. Furthermore, a hybrid compression technique that uses DCT and RLE is applied. The results show an improvement in the compression time when data segmentation is used.
The rest of this paper is organized as follows. Section II introduces EEG compression techniques; DCT and RLE are briefly described in that section. Section III discusses the implementation of the proposed system and the performance measures. Section IV presents the simulation results. Finally, Section V concludes the paper.

II. DATA COMPRESSION TECHNIQUES
In this section, an overview of the data compression techniques is given. DCT, which is a type of lossy compression, and RLE, which is a type of lossless compression, are presented here.

A. Discrete Cosine Transform (DCT)
DCT is a transformation method used to convert a time series signal into frequency components. The main feature of the DCT is that the first few coefficients contain most of the energy of the input signal.
Let f(x) be the input of the DCT, a set of n data values (EEG samples), and let Y(u) be the output of the DCT, which

978-1-5386-4049-4/17/$31.00 ©2017 IEEE
Authorized licensed use limited to: CZECH TECHNICAL UNIVERSITY. Downloaded on July 27,2022 at 12:15:47 UTC from IEEE Xplore. Restrictions apply.

is a set of n DCT coefficients. For n real numbers, the one-dimensional DCT is expressed as follows [3], [5], [6]:

$$Y(u) = \sqrt{\frac{2}{n}}\,\alpha(u) \sum_{x=0}^{n-1} f(x) \cos\left(\frac{\pi(2x+1)u}{2n}\right) \qquad (1)$$

where

$$\alpha(u) = \begin{cases} \frac{1}{\sqrt{2}}, & u = 0 \\ 1, & u > 0 \end{cases}$$

Here Y(0) is the DC coefficient and the remaining coefficients are referred to as AC coefficients. The Y(0) coefficient is proportional to the mean value of the original signal.

The inverse DCT takes the transform coefficients Y(u) as input and converts them back into the time series f(x). For a list of n DCT coefficients, the inverse transform is expressed as follows [7]:

$$f(x) = \sqrt{\frac{2}{n}} \sum_{u=0}^{n-1} \alpha(u)\, Y(u) \cos\left(\frac{\pi(2x+1)u}{2n}\right) \qquad (2)$$

Most of the n coefficients produced by the DCT are small numbers or zeros. These small numbers are usually set to zero.

B. Run Length Encoding (RLE)
RLE is a type of lossless compression. The main idea of RLE is to find consecutive repeated occurrences of a data value and replace each such run by a single occurrence followed by the number of occurrences, as shown in Fig. 1. This is most useful on data that contains many such runs [4], [8], [9].

Fig. 1: The idea of RLE

III. IMPLEMENTATION
This section introduces the implementation of the proposed algorithm and the performance measures. Compression with and without data segmentation is presented to show the difference between the two cases.

A. DCT with RLE
First, we propose a system that consists of two main units, a compression unit and a reconstruction unit, as shown in Fig. 2. The algorithm in [10] shows the compression and reconstruction process for DCT with RLE.
1) Compression Unit: The first step in the compression unit is reading the EEG data file and transforming it with the DCT. After that, a thresholding step is applied to obtain high redundancy in the transformed data. By varying the threshold value, the number of zero coefficients can be increased or decreased. The transformation and thresholding steps together increase the probability of redundancies in the transformed data. Finally, RLE achieves a high compression ratio because of this high redundancy [10].
2) Reconstruction Unit: First, the compressed data is decoded using inverse RLE. Then the inverse DCT is applied to reconstruct the EEG data.

B. Data Segmentation and Compression
The first step, in this case, is reading the EEG data and segmenting it into N segments (samples). One segment is taken every $T_s$, as shown in Fig. 3. We can reduce the total time (compression and reconstruction) by decreasing $T_s$. However, $T_s$ must be kept above a threshold value to guarantee that each unit completes the current segment before a new segment arrives, according to the following condition:

$$T_s \geq \max(T_{DCT}, T_{thr}, T_{RLE}, T_{IRLE}, T_{IDCT}) \qquad (3)$$

where $T_{DCT}$ is the DCT time, $T_{thr}$ is the thresholding time, $T_{RLE}$ is the RLE time, $T_{IRLE}$ is the inverse RLE time, and $T_{IDCT}$ is the inverse DCT time. Therefore, the minimum sampling time $T_{min}$ can be obtained from the following equation:

$$T_{min} = \max(T_{DCT}, T_{thr}, T_{RLE}, T_{IRLE}, T_{IDCT}) \qquad (4)$$

This algorithm achieves the smallest compression and reconstruction time if $T_s = T_{min}$. After that, the segmented data goes through the compression unit and the reconstruction unit, respectively. The final step is combining the reconstructed data, which is the inverse of the data segmentation. Algorithm 1 shows the compression and reconstruction process in the case of compression with data segmentation.

C. Performance Metrics
The three performance metrics used in this paper are presented here.
1) Percentage Root Mean Difference (PRD): PRD measures the distortion between the original signal and the reconstructed signal. It can be defined as [11]:

$$PRD = \sqrt{\frac{\sum_{i=1}^{n} (y_i - y'_i)^2}{\sum_{i=1}^{n} y_i^2}} \times 100 \qquad (5)$$

where $y'$ and $y$ are the reconstructed and original signals, respectively.

2) Compression Ratio (CR): The second performance measure is the CR, which is defined as:

$$CR = \frac{OriginalData - CompData}{OriginalData} \times 100 \qquad (6)$$

where OriginalData and CompData are the sizes of the original and compressed data, respectively.
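The two quality metrics defined so far translate directly into a few lines of code. The following is a minimal Python sketch of Eq. (5) and Eq. (6), not the authors' MATLAB implementation; the function names `prd` and `compression_ratio` and the sample values are assumptions made for illustration, and the data sizes can be in any unit as long as both use the same one.

```python
import math


def prd(original, reconstructed):
    """Percentage root mean difference between two signals, as in Eq. (5)."""
    if len(original) != len(reconstructed):
        raise ValueError("signals must have the same length")
    # Numerator: energy of the reconstruction error.
    error_energy = sum((y - yr) ** 2 for y, yr in zip(original, reconstructed))
    # Denominator: energy of the original signal.
    signal_energy = sum(y ** 2 for y in original)
    return math.sqrt(error_energy / signal_energy) * 100.0


def compression_ratio(original_size, compressed_size):
    """Compression ratio in percent, as in Eq. (6)."""
    return (original_size - compressed_size) / original_size * 100.0


# Hypothetical values, chosen only to make the arithmetic easy to follow.
original = [3.0, 4.0]
reconstructed = [3.0, 3.0]
print(prd(original, reconstructed))    # error energy 1, signal energy 25: about 20 %
print(compression_ratio(1000, 250))    # 75.0
```

With this definition, a CR of 0 % means no size reduction, and values closer to 100 % mean stronger compression.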


Fig. 2: Block diagram of DCT with RLE

Fig. 3: Block diagram of compression with data segmentation

3) Compression and Reconstruction Time (T): The final metric is the time T, which is the total time of the compression and decompression processes:

$$T = T_{comp} + T_{reconst} \qquad (7)$$

where $T_{comp}$ is the total compression time and $T_{reconst}$ is the total reconstruction time, defined as follows:

$$T_{comp} = T_{DCT} + T_{thr} + T_{RLE} \qquad (8)$$

$$T_{reconst} = T_{IRLE} + T_{IDCT} \qquad (9)$$

Therefore, the total time can be written as:

$$T = T_{DCT} + T_{thr} + T_{RLE} + T_{IRLE} + T_{IDCT} \qquad (10)$$

IV. SIMULATION RESULTS
The performance of the proposed compression algorithm is studied using MATLAB running on an Intel(R) Core(TM) i3 CPU at 2.27 GHz with 4 GB of RAM. The size of the EEG data is 1 MB.
Fig. 4 shows the CR for different values of PRD in the case of compression without data segmentation. The value of PRD can be changed by varying the threshold value.

Fig. 4: CR versus PRD (x-axis: CR (%), y-axis: PRD (%))

Fig. 5 shows the total time of the compression and reconstruction processes against PRD in the case of compression without data segmentation. If the threshold value is increased, more coefficients are set to zero. Therefore, the PRD increases and the time decreases.

Fig. 5: Time versus PRD (x-axis: Time (s), y-axis: PRD (%))

Fig. 6 shows the total time for different values of N in the case of compression with data segmentation. The total time is reduced considerably when the original EEG data is segmented. The minimum value of T is achieved at N = 200, as shown in this figure.

Fig. 6: Time versus Number of samples (N) (x-axis: N, y-axis: Time (s))

Finally, a comparison between compression without segmentation and with segmentation (N = 200) regarding CR is shown in Fig. 7. For the same PRD, compression with segmentation has a higher CR than compression without segmentation.
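The processing chain evaluated in this section (segmentation, DCT, thresholding, RLE, and the inverse steps) can be sketched in a few dozen lines. The Python code below is only an illustrative sketch under stated assumptions, not the MATLAB implementation used for the reported results: all function names are hypothetical, the DCT is a direct O(n²) evaluation of Eq. (1) and Eq. (2), the threshold is applied to absolute coefficient magnitudes (a simplification), and the segments are processed one after another rather than by parallel units.

```python
import math


def dct(f):
    """Forward DCT, a direct evaluation of Eq. (1)."""
    n = len(f)
    out = []
    for u in range(n):
        alpha = 1 / math.sqrt(2) if u == 0 else 1.0
        s = sum(f[x] * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                for x in range(n))
        out.append(math.sqrt(2 / n) * alpha * s)
    return out


def idct(Y):
    """Inverse DCT, a direct evaluation of Eq. (2)."""
    n = len(Y)
    out = []
    for x in range(n):
        s = 0.0
        for u in range(n):
            alpha = 1 / math.sqrt(2) if u == 0 else 1.0
            s += alpha * Y[u] * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
        out.append(math.sqrt(2 / n) * s)
    return out


def threshold(coeffs, thr):
    """Set every coefficient with magnitude below thr to zero (assumed rule)."""
    return [c if abs(c) >= thr else 0.0 for c in coeffs]


def rle_encode(values):
    """Replace each run of equal values by a [value, count] pair."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def rle_decode(runs):
    """Expand [value, count] pairs back into the full sequence."""
    out = []
    for v, count in runs:
        out.extend([v] * count)
    return out


def split(data, n_segments):
    """Split data into n_segments pieces; the last piece takes the remainder."""
    sp = len(data) // n_segments
    segments = [data[k * sp:(k + 1) * sp] for k in range(n_segments - 1)]
    segments.append(data[(n_segments - 1) * sp:])
    return segments


def compress(data, n_segments, thr):
    """Segment, transform, threshold, and run-length encode each segment."""
    return [rle_encode(threshold(dct(seg), thr))
            for seg in split(data, n_segments)]


def reconstruct(compressed):
    """Decode and inverse-transform each segment, then concatenate."""
    out = []
    for runs in compressed:
        out.extend(idct(rle_decode(runs)))
    return out


# Synthetic test signal (not EEG data), used only to exercise the functions.
signal = [math.sin(0.1 * i) for i in range(40)]
packed = compress(signal, n_segments=4, thr=0.05)
restored = reconstruct(packed)
print(len(restored) == len(signal))  # True
```

Raising `thr` zeroes more coefficients, which lengthens the zero runs that RLE collapses and increases the distortion of `restored`; with `thr = 0` the round trip is exact up to floating-point error.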


Algorithm 1 Compression and Reconstruction Algorithm with Data Segmentation
▷ Data Splitter
N ← NumberOfRequiredSamples
L ← LengthOfEEGData
sp ← floor(L/N)
while k ≤ N do
    if k = 1 then
        Data ← EEGData(1 : sp)
    else
        initial ← (k − 1) ∗ sp + 1
        final ← k ∗ sp
        Data ← EEGData(initial : final)
    end if
    if k = N then
        vector ← EEGData(k ∗ sp + 1 : L)
        Data ← [Data vector]
    end if

    ▷ DCT Compression
    TransData ← DCT(Data)

    ▷ Thresholding
    Thr ← ThresholdValue
    [SortedData, index] ← sort(abs(Data))
    for i ← 1 to LengthOfData do
        if abs(x(i)/x(1)) > Thr then
            i ← i + 1
            continue
        else
            break
        end if
    end for
    TransData(index(i + 1 : end)) ← 0

    ▷ RLE Compression
    n ← 1
    d(n) ← TransData(1)
    c(n) ← 1
    i ← 2
    while i ≤ LengthOfData do
        if TransData(i − 1) = TransData(i) then
            c(n) ← c(n) + 1
        else
            n ← n + 1
            d(n) ← TransData(i)
            c(n) ← 1
        end if
    end while
    CompressedData ← [d, c]

    ▷ Reconstruction
    ▷ RL Decoding
    d ← CompressedData(:, 1)
    c ← CompressedData(:, 2)
    RLDec ← [ ]
    i ← 1
    while i ≤ LengthOfData do
        RLDec = [RLDec d(i) ∗ ones(1, c(i))]
    end while
    ▷ Inverse DCT
    ReconstructedData ← IDCT(RLDec)

    ▷ Data combining
    FinalOutput ← [FinalOutput ReconstructedData]
end while

Fig. 7: CR versus PRD in case of N=1 and N=200 (x-axis: PRD (%), y-axis: CR (%); curves: Original Case, Splitting Case)

V. CONCLUSION
In this work, a new compression algorithm for EEG data is proposed. First, the EEG data is segmented into N segments, and then each segment is transformed using the DCT. After that, all the coefficients below the threshold are set to zero. Finally, the resulting data is compressed using RLE. The inverse process is applied to recover the original EEG data. CR, PRD, and T are evaluated to check the effectiveness of the proposed algorithm. Compression with data segmentation achieves a higher CR and a shorter time than compression without segmentation.

ACKNOWLEDGEMENT
This work was supported by the Egyptian Information Technology Industry Development Agency (ITIDA) under ITAC Program CFP 96.

REFERENCES
[1] L. J. Hadjileontiadis, “Biosignals and compression standards,” in MHealth, pp. 277-292. Springer, 2006.
[2] https://sccn.ucsd.edu/~arno/fam2data/publicly_available_EEG_data.html, 2017.
[3] D. Birvinskas, I. Jusas, and Damasevicius, “Fast DCT algorithms for EEG data compression in embedded systems,” Computer Science and Systems, vol. 12, no. 1, pp. 49-62, 2015.
[4] S. Akhter and M. Haque, “ECG compression using run length encoding,” in Signal Processing Conference, 2010 18th European. IEEE, 2010.
[5] A. Deshlahra, G. Shirnewar, and A. Sahoo, “A comparative study of DCT, DWT & hybrid (DCT-DWT) transform,” International Conference on Emerging Trends in Computer and Image Processing (ICETCIP), 2013.
[6] S. Fauvel and Ward, “An energy efficient compressed sensing framework for the compression of electroencephalogram signals,” Sensors, vol. 14, no. 1, pp. 1474-1496, 2014.
[7] Z. T. Drweesh and L. E. George, “Audio compression based on discrete cosine transform, run length and high order shift encoding,” International Journal of Engineering and Technology (IJEIT), vol. 4, issue 1, pp. 45-51, 2014.
[8] Y.-S. Chen, H.-Y. Lin, H.-C. Chiu, and H.-P. Ma, “A compressive sensing framework for electromyogram and electroencephalogram,” in Medical Measurements and Applications (MeMeA), IEEE International Symposium on, pp. 1-6. IEEE, 2014.
[9] R. Mahajan and D. Bansal, “Hybrid multichannel EEG compression scheme for tele-health monitoring,” in Reliability, Infocom Technologies and Optimization (ICRITO) (Trends and Future Directions), 3rd International Conference on, pp. 1-6. IEEE, 2014.
[10] M. Alsenwi, T. Ismail, and H. Mostafa, “Performance analysis of hybrid lossy/lossless compression techniques for EEG data,” in 28th International Conference on Microelectronics (ICM), IEEE, 2016.
[11] G. Higgins, S. Faul, R. P. McEvoy, B. McGinley, M. Glavin, W. P. Marnane, and E. Jones, “ECG compression using jpeg2000: How much loss is too much?” in Annual International Conference of the IEEE Engineering in Medicine and Biology, pp. 614-617. IEEE, 2010.
