
INTRODUCTION TO SENSOR DATA, ANALYTICS AND RESEARCH
Dr. Firoz Anwar
CONTENTS
 Introduction
 Basics/Descriptive Statistics
o Scales of measurement
o Graphical exploration of data
o Descriptive characteristics for a variable
 Estimation
o Characteristics of an estimator
o Confidence interval
 Statistical hypothesis testing
o Statistical testing principle
o Testing errors
o Power analysis
 Why multivariate analysis?

Source: https://www.edureka.co/blog/what-is-data-science/
SENSORS
 Development of miniature Sensors
 GPS-enabled devices
 Pedometers
 Accelerometers
 RFID
 Variety of other Sensors
 Cheap and cost-effective
 Different types of data
 Enormous scale of data
OPPORTUNITIES
 GPS-enabled devices allow context- and location-aware applications, e.g. map
services, location-based stock reporting, and location-based reviews.
 Social sensing.
 The decreasing cost of RFID tags has led to tremendous volumes of RFID data. An RFID
tag can cost less than 5 cents.
 Heavy usage of RFID in supply chain.
 Medical research and patient monitoring.
OPPORTUNITIES
 Military applications use a wide variety of sensors to track unusual events or
activity. These could include visual or audio sensors, or seismometers for tracking
the movement of large objects.
 Wide variety of environmental applications, such as detecting weather and climate
trends, and tracking pollution levels in water networks.
RESEARCH QUESTIONS
 How can sensor data be processed?
 How can features be extracted from sensors?
 What are the parameters that can be extracted from sensors?
 How to conserve sensor battery life?
 How to find the best transmission route?
 Push or Pull?
 Query optimisation.
CHALLENGES
 Real-time processing from massive volumes.
o Design of efficient methods for stream processing.
o One pass of the data.
 Limited battery life.
 Large volumes lead to huge challenges in terms of storage and processing of the data.
 Data collection - natural errors and incompleteness in the collection process.
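The "one pass of the data" constraint above can be illustrated with a running mean and variance that never stores the stream. This is a minimal sketch using Welford's update rule; the `RunningStats` class and the temperature readings are illustrative assumptions, not part of the course material.

```python
class RunningStats:
    """One-pass (streaming) mean and variance using Welford's update."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new reading into the statistics without storing it."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Sample variance; reported as 0.0 when fewer than two readings were seen.
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


stats = RunningStats()
for reading in [20.1, 20.4, 19.8, 20.0, 35.2]:  # made-up temperature stream
    stats.update(reading)

print(stats.count, round(stats.mean, 2), round(stats.variance, 2))
```

Each reading is touched exactly once and memory use is constant, which is the property stream-processing methods generally aim for.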
CHALLENGES
 In-network processing, wherein the data is processed within the network itself, rather
than at a centralized service.
 Possibly uncertain data: errors in the underlying data may lead to uncertainty in
the data representation.
HOW CAN SENSOR DATA BE PROCESSED?
TEMPORAL-BASED SEGMENTATION
 Time-interval-based, and
 Sliding-window-based.
 The time-interval-based approach divides the sensor datasets into equal time
durations. This is commonly used for breaking down the temporal stream data
obtained by accelerometer and gyroscope sensors.
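The two temporal approaches can be sketched as follows. This assumes a uniformly sampled signal, so a fixed number of samples corresponds to a fixed time duration; the function names and the toy signal are hypothetical.

```python
def fixed_windows(samples, size):
    """Split samples into consecutive, non-overlapping windows of `size` items.

    A trailing remainder shorter than `size` is discarded.
    """
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def sliding_windows(samples, size, step):
    """Windows of `size` items whose start advances by `step` samples.

    Consecutive windows overlap whenever step < size.
    """
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, step)]


signal = list(range(10))                  # stand-in for accelerometer samples
fixed = fixed_windows(signal, 4)          # two windows; the last two samples are dropped
sliding = sliding_windows(signal, 4, 2)   # four windows with 50% overlap
```

With a sampling rate of, say, 50 Hz (an assumed figure), `size=100` would give 2-second windows.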
ACTIVITY-BASED SEGMENTATION
 The sensory data stream is divided into multiple segments by identifying the
start and end points of each activity.
 The success of this method depends on correctly identifying the boundary
instants.
 Various methods are proposed to identify these limits, distinguishing between
static activities (such as standing and sitting) and mobility (such as walking
and running).
 A threshold is set to identify the change points of static activities, while
changes in the frequency domain are analysed to determine the start and end
points of movement-related activities.
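A rough sketch of the threshold idea only (the frequency-domain analysis is not covered here): each short window of acceleration magnitude is labelled static or moving by its standard deviation, and a boundary is reported wherever the label changes. The window size, the threshold value and the sample data are assumptions chosen for illustration.

```python
from statistics import pstdev


def label_windows(magnitudes, size, threshold):
    """Label each non-overlapping window 'static' or 'moving' by its standard deviation."""
    labels = []
    for start in range(0, len(magnitudes) - size + 1, size):
        window = magnitudes[start:start + size]
        labels.append("moving" if pstdev(window) > threshold else "static")
    return labels


def boundaries(labels, size):
    """Sample indices at which the label changes from one window to the next."""
    return [i * size for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# Made-up acceleration magnitudes (in g): still, then walking, then still again.
magnitudes = [1.0, 1.01, 0.99, 1.0,
              1.4, 0.6, 1.5, 0.5,
              1.3, 0.7, 1.4, 0.6,
              1.0, 1.0, 1.01, 0.99]

labels = label_windows(magnitudes, size=4, threshold=0.1)
change_points = boundaries(labels, size=4)
```

Boundaries found this way are only as precise as the window size, so a smaller window trades noise robustness for finer boundary resolution.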
SENSOR-EVENT-BASED SEGMENTATION
 A grouping of movements, events, or activities that occur in a specific time
order and that may be interleaved with other events.
 Example: “household” or “dinner preparation” activities.
 Challenge:
 Compared with temporal-based segmentation, the events that make up an
activity may not be evenly distributed in time and may occur sporadically;
therefore, the size of the windows is not fixed.
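One simple heuristic that produces variable-size windows is to start a new segment whenever the gap between consecutive sensor events exceeds a limit. This is only one possible approach, shown as a sketch; the sensor names, timestamps and the 60-second gap are hypothetical.

```python
def segment_by_gap(events, max_gap):
    """Group (timestamp, sensor_id) events into segments.

    A new segment starts whenever the time since the previous event
    exceeds `max_gap` seconds, so segment lengths are not fixed.
    """
    segments = []
    for event in sorted(events):
        if segments and event[0] - segments[-1][-1][0] <= max_gap:
            segments[-1].append(event)
        else:
            segments.append([event])
    return segments


events = [(0, "kitchen_door"), (5, "fridge"), (12, "stove"),
          (300, "tv"), (310, "sofa")]
segments = segment_by_gap(events, max_gap=60)
```

A gap rule like this does not separate interleaved activities; that generally needs knowledge of which sensors belong to which activity.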
DATA EXTRACTION
RESEARCH IN SENSOR PROCESSING
 Data Collection and Cleaning Issues
 Data Management Issues
 Sensor Data Mining and Processing
 Application-Specific Issues
DATA COLLECTION AND CLEANSING
 Sensor data is inherently noisy and uncertain
 It may contain many missed or redundant readings, depending on the application
domain.
 For example, in the context of RFID data, almost 30% of the readings are dropped, and multiple
sensors may track the same RFID object.
 Readings may be erroneous: errors may arise during data transmission, and the data may
also be significantly incomplete because of limited battery life.
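Redundant readings, such as several readers reporting the same RFID tag, can be collapsed with a small time-window rule. The sketch below assumes each read is a `(timestamp, reader, tag)` tuple and that reads of the same tag within one second are duplicates; both the record layout and the window length are assumptions.

```python
def deduplicate(reads, window):
    """Keep one read per tag within `window` seconds, regardless of which reader saw it."""
    last_kept = {}  # tag -> timestamp of the most recent read that was kept
    kept = []
    for timestamp, reader, tag in sorted(reads):
        if tag not in last_kept or timestamp - last_kept[tag] >= window:
            kept.append((timestamp, reader, tag))
            last_kept[tag] = timestamp
    return kept


reads = [
    (0.0, "reader_A", "tag_1"),
    (0.2, "reader_B", "tag_1"),  # same tag seen by a second reader
    (0.3, "reader_A", "tag_2"),
    (5.0, "reader_A", "tag_1"),  # a genuinely new observation
]
clean = deduplicate(reads, window=1.0)
```

This handles redundancy only; dropped readings have to be dealt with separately, for example by interpolation or by flagging gaps.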
DATA MANAGEMENT
 Very large volumes of collected data.
 Sometimes, it may be impractical to store the entire raw data.
 Often the data is compressed, or portions of it are dropped.
 The errors and uncertainty in sensor data have spurred the development of algorithms for
uncertain database management.
QUERY PROCESSING OF SENSOR DATA
 Challenging from the perspective of indexing and query processing.
 Event detection
o Wherein continuous queries are posed on the sensor data in order to detect the underlying events.

 Challenge
o High level semantic events are often a complex function of the underlying raw sensor data.
o Sometimes, the event-query cannot be posed exactly, since the event detection process is ambiguously
related to the underlying data.
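A continuous query can be pictured as a condition that is re-evaluated on every new reading. The sketch below flags an event whenever the mean of the last few readings exceeds a threshold; the event definition, window size and readings are made up for illustration, and real high-level events are usually far harder to express.

```python
from collections import deque


def detect_events(stream, size, threshold):
    """Yield the index of each reading at which the mean of the
    last `size` readings exceeds `threshold`."""
    window = deque(maxlen=size)  # old readings fall out automatically
    for index, value in enumerate(stream):
        window.append(value)
        if len(window) == size and sum(window) / size > threshold:
            yield index


readings = [21, 22, 21, 30, 35, 36, 22, 21]
alerts = list(detect_events(readings, size=3, threshold=28))
```

Because `detect_events` is a generator, it can consume an unbounded stream and emit alerts as they occur.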
MINING SENSOR DATA
 Traditional data mining methods
o Clustering,
o Classification,
o Frequent pattern mining, and
o Outlier detection.

 Data usually needs to be compressed and filtered for more effective mining and analysis.
 Challenge
 Conventional mining algorithms are often not designed for real time processing of the data.
 The sensor scenario may often require in-network processing, wherein the data is processed to higher
level representations before further processing. This reduces the transmission costs, and the data
overload from a storage perspective.
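In-network processing can be as simple as a node sending a compact summary of each window instead of every raw reading. The following is a sketch under that assumption; the summary fields and the window length of three are arbitrary choices.

```python
def summarise(window):
    """Reduce a window of raw readings to a compact summary for transmission."""
    return {
        "n": len(window),
        "min": min(window),
        "max": max(window),
        "mean": sum(window) / len(window),
    }


raw = [20.0, 20.5, 21.0, 20.5, 35.0, 20.0]
packets = [summarise(raw[i:i + 3]) for i in range(0, len(raw), 3)]
```

Here six readings become two packets. The saving grows with the window length, at the cost of losing detail inside each window.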
APPLICATIONS - SOCIAL SENSING APP
 Socially-aware data
 Crowdsensing
o According to Wikipedia – “A technique where a large group of individuals having mobile
devices capable of sensing and computing collectively share data and extract information to
measure, map, analyse, estimate or infer (predict) any processes of common interest.”
 Social Sensing
o “Social sensing broadly refers to a set of sensing and data collection paradigms where data are
collected from humans or devices on their behalf.” – Wang, D. et al., in Social Sensing, 2015
APPLICATIONS - OTHERS
 RFID Data and the Internet of Things
 Software Bug Tracing in Sensor Networks
 Healthcare Applications
 Environmental and Climate Applications
DATA COLLECTION
 Smartphone sensor: accelerometer data collection.
PREPARATION
 Noise reduction
o Fourier transformation

o Dropping
o Fill-in
o Visualisation
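Two of the preparation steps above, fill-in and Fourier-based noise reduction, can be sketched with the standard library alone. The DFT here is a naive O(n²) version written out for clarity; in practice a library FFT such as `numpy.fft` would normally be used. The cut-off `keep=2` and the synthetic signal are assumptions.

```python
import cmath
import math


def fill_in(values):
    """Replace interior None gaps by linear interpolation between known neighbours.

    Leading or trailing None values are left as they are (candidates for dropping).
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            fraction = (i - left) / (right - left)
            filled[i] = filled[left] + fraction * (filled[right] - filled[left])
    return filled


def low_pass(signal, keep):
    """Zero every DFT bin whose frequency index exceeds `keep`, then invert."""
    n = len(signal)
    spectrum = [sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
                for k in range(n)]
    for k in range(n):
        if min(k, n - k) > keep:  # bins k and n - k carry the same frequency
            spectrum[k] = 0
    return [sum(c * cmath.exp(2j * math.pi * k * t / n)
                for k, c in enumerate(spectrum)).real / n
            for t in range(n)]


filled = fill_in([1.0, None, None, 4.0, 5.0])

# Slow sine (1 cycle) plus fast "noise" (8 cycles) over 32 samples.
n = 32
noisy = [math.sin(2 * math.pi * t / n) + 0.3 * math.sin(2 * math.pi * 8 * t / n)
         for t in range(n)]
smooth = low_pass(noisy, keep=2)
```

In this synthetic case the filter recovers the slow sine, because the noise sits entirely in a bin above the cut-off. Real accelerometer noise is spread across many frequencies, so the cut-off has to be chosen by inspecting the spectrum.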
BASIC PLOT
HEATMAP
WINDOWING
REFERENCES
 Suresh, A., R Udendran and Ahmed (2021). Sensor data analysis and management: the role
of deep learning. Hoboken: John Wiley & Sons, Inc.
PRACTISE
1. Week_4_Lab_Sensor_Data_Cleaning_lab.ipynb [Make Submission for 1 mark]
2. Accelerometer-analysis.ipynb [Includes solution, will be discussed]
