
Computer Science - BotPro Case Study Notes

Bundle Adjustment
• Technique used in computer vision and photogrammetry
• Refines parameters of a 3D reconstruction system
• Jointly refines the 3D scene points and camera poses so that multiple views of the scene are consistent
• “Bundle” - a collection of features observed in multiple images of the same scene.
• The bundles are then optimised while taking into account camera parameters such as distortion, calibration and relative positioning, to reconstruct a consistent scene.
• Iterative minimisation of reprojection error (the difference between a point’s observed image position and its predicted projection) to provide a more precise representation of the scene
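
A minimal structure-only sketch of the idea in Python, for illustration only: the two camera poses and intrinsics below are synthetic and held fixed, and only the 3D points are refined by iterative least squares; full bundle adjustment would optimise the poses as well.

import numpy as np
from scipy.optimize import least_squares

def project(points, R, t, f):
    # Project 3D world points through a simple pinhole camera.
    cam = points @ R.T + t               # world -> camera coordinates
    return f * cam[:, :2] / cam[:, 2:3]  # perspective division

# Two synthetic cameras (assumed known and fixed for this sketch).
f = 500.0
R1, t1 = np.eye(3), np.zeros(3)
theta = 0.2                              # small rotation about the Y axis
R2 = np.array([[np.cos(theta), 0.0, np.sin(theta)],
               [0.0, 1.0, 0.0],
               [-np.sin(theta), 0.0, np.cos(theta)]])
t2 = np.array([-1.0, 0.0, 0.2])

true_points = np.random.RandomState(0).uniform(-1, 1, (20, 3)) + [0, 0, 5]
obs1 = project(true_points, R1, t1, f)   # the observed "bundle"
obs2 = project(true_points, R2, t2, f)

def residuals(x):
    # Reprojection error: predicted minus observed image positions.
    pts = x.reshape(-1, 3)
    return np.concatenate([(project(pts, R1, t1, f) - obs1).ravel(),
                           (project(pts, R2, t2, f) - obs2).ravel()])

# Start from noisy point estimates and iteratively minimise the error.
x0 = (true_points + 0.1 * np.random.RandomState(1).randn(20, 3)).ravel()
refined = least_squares(residuals, x0).x.reshape(-1, 3)
print("mean point error after refinement:", np.abs(refined - true_points).mean())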

Computer Vision
• Field of study and research focusing on enabling computers to understand, interpret and analyse information from images and videos
• Development of algorithms, techniques and models to extract meaningful information from visual data to
replicate human vision capabilities
• Steps in computer vision
• Accepts digital images and video frames as input
• Extracts high-level information from visual data using complex mathematical and statistical algorithms that analyse patterns, shapes, colours and textures to recognise objects.

Dead reckoning

• The technique of estimating the current position, orientation or velocity of an object or vehicle from a previously determined position and measured kinematic properties
• The data gathered about the position using dead reckoning is called dead reckoning data
• Initial position: starting position, reference point
• Acceleration: measured in the X, Y and Z axes by an accelerometer to estimate changes in velocity and position
• Rotation: provided by gyroscopes or inertial measurement units (IMU) to estimate changes in orientation or angular velocity
• Time: the interval over which the other measured parameters are integrated
• Prone to accumulating errors due to sensor drift, noise and imprecise measurements, leading to inaccurate position estimates if not periodically corrected using GPS or visual tracking
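
A toy sketch of the integration in Python; the sample values, time step and flat-ground assumption are all made up for illustration:

import math

# (dt seconds, forward acceleration m/s^2, heading radians) samples
samples = [(0.1, 0.5, 0.00), (0.1, 0.5, 0.05), (0.1, 0.0, 0.10)]

x, y = 0.0, 0.0          # initial position: the reference point
speed = 0.0

for dt, accel, heading in samples:
    speed += accel * dt                    # integrate acceleration -> speed
    x += speed * math.cos(heading) * dt    # integrate speed -> position
    y += speed * math.sin(heading) * dt

print(f"estimated position: ({x:.3f}, {y:.3f})")
# Each step compounds sensor error, which is why dead reckoning drifts
# unless corrected by GPS or visual tracking.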

Edge Computing
• Distributed computing paradigm
• Brings computation and data storage to the edge of the network where data is generated or consumed, rather than relying solely on centralised cloud computing infrastructure
• Processing is located near the edge devices or sensors, which reduces the need to transmit data to remote cloud servers for computation
• The idea is to enable real-time processing close to the data source
• Advantages
• Low latency: shorter access times; good for time-sensitive applications such as autonomous vehicles and real-time analytics
• Bandwidth Optimisation: reduces the amount of data that needs to be transferred across the internet, alleviating congestion and high bandwidth costs
• Privacy and security: data does not need to traverse external networks, allowing localised data storage and processing
• Offline operation: Allows operation during times of limited cloud/network connectivity
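
A small sketch of the idea in Python: raw readings are summarised on the edge device and only a compact payload is sent upstream. The threshold and payload fields are invented for illustration.

def process_at_edge(readings, alarm_threshold=75.0):
    # Reduce a raw window of sensor readings to a small summary payload.
    summary = {
        "count": len(readings),
        "mean": sum(readings) / len(readings),
        "max": max(readings),
    }
    summary["alarm"] = summary["max"] > alarm_threshold   # local decision
    return summary

raw_window = [70.2, 71.0, 80.5, 69.9]    # e.g. one second of sensor data
print(process_at_edge(raw_window))        # only this reaches the cloud
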
Global Map Optimisation
• Technique used to refine and improve accuracy of 3D map reconstruction
• Involves simultaneous optimisation of the 3D positions of landmarks (scene points) and the corresponding camera poses
• Employed in SLAM algorithms to create a map of the environment while simultaneously determining the position of the sensor within the map
• Minimises discrepancy between predicted positions of 3D points and actual observed points.
• Bundle adjustment or other non-linear least-squares optimisation is used (see the sketch under Bundle Adjustment above)

GPS Signal
• Refers to radio frequency signals transmitted by GPS satellites that provide information to receivers on
earth
• Allows receivers to calculate their precise position and to synchronise their clocks
• The system consists of satellites that continuously transmit signals containing their orbital parameters and exact time
• Components of the GPS Signal
• Navigation Message - information about satellite orbit, clock errors and other parameters, transmitted at 50 bits per second
• Carrier Wave - radio wave carrying the navigation message on the L1 or L2 frequency band as a modulated signal
• Spread Spectrum Signal - the signal is spread over a wider frequency band to enhance signal quality and resistance to interference and/or disruption
• Receivers intercept the signal and analyse the time delay between sending and receiving to calculate distance
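
The distance calculation itself is simple; a worked example in Python with invented timestamps (a real receiver must also solve for its own clock error, which is why signals from at least four satellites are needed):

C = 299_792_458.0            # speed of light, m/s

t_transmit = 0.000           # satellite broadcast time (s), made up
t_receive = 0.073            # arrival time at the receiver (s), made up

pseudorange = C * (t_receive - t_transmit)
print(f"distance to satellite: {pseudorange / 1000:.0f} km")   # ~21885 km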

GPS-degraded environment
• The GPS signals are severely compromised or degraded, leading to challenges/limitations in accurate positioning and navigation
• Causes
• Signal obstruction - physical obstructions along the line of sight between the GPS receiver and satellites
• Multipath interference - signals reflect off buildings and other terrain before reaching the receiver, interfering with the direct signals and causing errors/inaccuracies
• Signal jamming - intentional or unintentional interference that disrupts or blocks GPS using electromagnetic waves; a concern in highly electronically active areas
• Alternative positioning methods are needed, such as an inertial navigation system (INS) using accelerometers and gyroscopes, or other satellite systems.

GPS-denied environment
• Situation where GPS signals are too weak or completely unavailable
• Indoor, underground or densely built-up areas where signals may be weakened, distorted or blocked
• Again, alternative positioning methods or technologies are required

Human Pose Tracking
• Computer vision task that involves estimating position and orientation of human body joints or body parts
• Understand and analyse human movement and posture
• Algorithms apply machine learning techniques to visual data to estimate the 2D and 3D positions or orientations of body joints and so represent human poses
• Can employ deep-learning, graphical or optimisation-based methodologies
• Leverages CNNs (convolutional neural networks) or recurrent neural networks (RNNs) to learn features and relationships between body key points
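
A short sketch using the MediaPipe Pose solution, one off-the-shelf CNN-based option (the image path is a placeholder, and the exact API may differ between MediaPipe versions):

import cv2
import mediapipe as mp

image = cv2.imread("person.jpg")                 # placeholder input image
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)     # MediaPipe expects RGB

with mp.solutions.pose.Pose(static_image_mode=True) as pose:
    results = pose.process(rgb)

if results.pose_landmarks:
    # Each landmark is a normalised (x, y) image position plus a depth
    # estimate z, one per body joint.
    for idx, lm in enumerate(results.pose_landmarks.landmark[:5]):
        print(idx, round(lm.x, 3), round(lm.y, 3), round(lm.z, 3))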

Inertial Measurement Unit (IMU)
• Electronic sensor device combining multiple sensors to measure the linear and angular motion of an object
• Integrates multiple devices including accelerometer, gyroscope and magnetometer into one single compact
unit
• Provides complete insight into the object’s kinematic state
• Used in navigation systems to estimate changes in position to enable precise control and localisation

Keyframe Selection
• Video processing technique involving the identification and selection of key frames that represent the entire scene from a sequence of videos / images
• Capture essential information from the content of the visual sequence
• Reduces amount of data to be processed or analysed while limiting the data to only relevant information
• Criteria for a frame to be considered a keyframe
• Visual Saliency: only capture visually salient regions or objects in the video
• Content Diversity: represent different scenes, perspectives and/or actions to provide a comprehensive overview of the content
• Temporal Significance: select specific points in time of significance
• Motion Characteristics: based on motion analysis
• Redundancy: select only frames that offer unique information compared to neighbouring frames
• Computational efficiency: strike a balance between accuracy and complexity
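
A minimal sketch in Python using just the redundancy criterion: a frame becomes a keyframe when it differs enough from the last one kept. The frames and threshold are synthetic.

import numpy as np

def select_keyframes(frames, threshold=20.0):
    keyframes = [0]                           # always keep the first frame
    for i in range(1, len(frames)):
        # Mean absolute pixel difference to the last kept keyframe.
        diff = np.abs(frames[i] - frames[keyframes[-1]]).mean()
        if diff > threshold:                  # novel enough to keep
            keyframes.append(i)
    return keyframes

rng = np.random.default_rng(0)
base = rng.integers(0, 255, (48, 64)).astype(float)
video = [base + rng.normal(0, 2.0, base.shape) for _ in range(5)]           # static scene
video += [255.0 - base + rng.normal(0, 2.0, base.shape) for _ in range(5)]  # scene cut
print(select_keyframes(video))                # [0, 5]: the cut is kept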

Key points / pairs
• Distinctive or informative locations or regions in a set of images
• Identified based on unique visual characteristics including corners, edges and blobs
• Serve as landmarks or reference points for computer vision tasks
• Detected using feature extraction algorithms that analyse local properties such as intensity gradients, texture or scale-space representation.
• After detection, they are described using feature descriptors that encode the local appearance around the key point
• Key pairs - corresponding key points detected in two or more images
• Matching key pairs allows movement to be tracked by observing changes in the environment
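
A sketch of detection and matching with OpenCV's ORB detector; the image paths are placeholders:

import cv2

img1 = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=500)             # corner-like key points
kp1, des1 = orb.detectAndCompute(img1, None)    # detect + describe
kp2, des2 = orb.detectAndCompute(img2, None)

# Hamming distance suits ORB's binary descriptors; crossCheck keeps only
# mutually-best matches, i.e. the key pairs.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

for m in matches[:5]:                           # strongest key pairs
    print(kp1[m.queryIdx].pt, "->", kp2[m.trainIdx].pt)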

Light Detection and Ranging (LiDAR)
• Remote sensing technology using laser light to measure distances and create precise 3D representations of
surrounding environment
• Emits laser pulses and measures the time taken for them to bounce back; the delay is used to calculate distance (a worked example follows this section).
• Components
• Laser source: emits pulsed bursts of light in rapid succession
• Scanner / Receiver: steers the laser beam across the scene and detects the reflected pulses
• Timing and Positioning: measures the time between emitting and detecting laser pulses to enable calculation of distance
• Applications
• Mapping and Surveying
• Autonomous Vehicles
• Environmental Monitoring
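
The worked example mentioned above, with a hypothetical echo time: the pulse travels out and back, so the one-way distance is half of c times the round-trip time.

C = 299_792_458.0                    # speed of light, m/s

def lidar_distance(round_trip_seconds):
    return C * round_trip_seconds / 2.0    # halve for the return leg

echo = 66.7e-9                       # a ~66.7 ns round trip (made up)
print(f"target at {lidar_distance(echo):.2f} m")   # ~10 m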

Object Occlusion
• Phenomenon in which an object positioned in front of another obstructs the visibility of the obscured object from the viewpoint of the observer
• Affects tracking, segmentation and recognition
• Causes complexities due to partial visibility, making it difficult to determine an object’s true nature
• Disadvantages
• Full extent and boundary of object not visible
• Loss of tracking on an object - maintaining continuity requires complex algorithms
• Exhibit limited visual cues or fragmented appearance
• Depth relationships between occluding and occluded objects are not directly visible

Odometer Sensor
• Device used to measure the movement and displacement of a mobile robot or vehicle
• Provides information about the vehicle’s change in position based on its motion
• Use rotational encoders or sensors on wheels / motor shafts to measure movement
• Combined with other information such as from IMU or GPS to improve accuracy and reliability of vehicle
pose estimation and localisation
• Provide real-time feedback to allow for precise control and monitoring
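
A wheel-encoder odometry sketch for a differential-drive robot in Python; the tick counts and geometry are hypothetical:

import math

TICKS_PER_REV = 1024        # encoder resolution (assumed)
WHEEL_RADIUS = 0.05         # metres
WHEEL_BASE = 0.30           # distance between the wheels, metres

def update_pose(x, y, theta, left_ticks, right_ticks):
    # Advance the pose from one pair of encoder readings.
    per_tick = 2 * math.pi * WHEEL_RADIUS / TICKS_PER_REV
    d_left = left_ticks * per_tick              # distance each wheel rolled
    d_right = right_ticks * per_tick
    d = (d_left + d_right) / 2                  # forward displacement
    d_theta = (d_right - d_left) / WHEEL_BASE   # change in heading
    return (x + d * math.cos(theta + d_theta / 2),
            y + d * math.sin(theta + d_theta / 2),
            theta + d_theta)

pose = (0.0, 0.0, 0.0)
for left, right in [(100, 100), (100, 120), (80, 80)]:
    pose = update_pose(*pose, left, right)
print("pose:", tuple(round(v, 3) for v in pose))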

Optimisation
• Process of finding the best possible solution by maximising or minimising an objective function within a given set of constraints
• Involves systematic exploration and performance improvement
• Process
• Defining the problem and its constraints
• Identification of the space of possible solutions through range and bound determination
• Objective function to measure quality of solution based on optimisation goal
• Reference to any constraints of the problem that the solution must satisfy
• Developing algorithms and techniques to solve the problem
• Assessing the optimised solution and evaluating its performance against the defined objectives and constraints

Re-localization
• Re-establishing position when a camera or robot loses track or encounters environmental change due to sensor drift, occlusion, movement or lighting
• Successful re-localization allows the system to accurately recover its pose estimate and continue operation
• Steps to re-localization
• Map of the environment is created with key reference points for pose estimation
• Extraction of visual features including key points and key frames
• Features matched against map
• Camera or robot pose estimated using the matched features
• Refinement or verification of estimation to improve accuracy and reliability
• Most useful in SLAM situations where the robot/camera needs to constantly update its location
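
A sketch of the pose-estimation step with OpenCV's solvePnP: once 2D features are matched to known 3D map points, the camera pose follows. The points and intrinsics below are synthetic and chosen to be exactly consistent.

import cv2
import numpy as np

# Four known 3D map points (coplanar) and their matched 2D observations.
object_points = np.array([[0., 0., 0.], [1., 0., 0.],
                          [0., 1., 0.], [1., 1., 0.]])
image_points = np.array([[320., 240.], [420., 240.],
                         [320., 340.], [420., 340.]])

K = np.array([[500., 0., 320.],        # assumed pinhole intrinsics
              [0., 500., 240.],
              [0., 0., 1.]])

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, None)
print("rotation vector:", rvec.ravel())    # ~0: camera not rotated
print("translation:", tvec.ravel())        # ~(0, 0, 5): five units away
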
Rigid Pose Estimation (RPE)
• Process of determining precise position and orientation of rigid object in 3D space
• Estimates six degrees of freedom (6DoF) transformation
• Rigid - object does not deform or change its shape during pose estimation
• Process steps
• Feature detection - distinctive key points and frames detected
• Feature matching - matching detected features with their corresponding features in the reference data
• Pose estimation - solving for the 6DoF transformation and estimating the pose using key points in the reference frame
• Refinement - to improve accuracy
• Performed using various algorithmic techniques including PnP (Perspective-n-Point), ICP (Iterative Closest Point) or RANSAC
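
A minimal sketch of one such technique in Python: the SVD-based Kabsch method, recovering the rigid transform between two matched 3D point sets (the point data is synthetic):

import numpy as np

def rigid_transform(A, B):
    # Least-squares R, t such that B ~= A @ R.T + t (Kabsch method).
    ca, cb = A.mean(axis=0), B.mean(axis=0)     # centroids
    H = (A - ca).T @ (B - cb)                   # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cb - R @ ca

rng = np.random.default_rng(1)
A = rng.normal(size=(10, 3))                    # matched key points, frame 1
theta = 0.3                                     # ground-truth rotation (Z axis)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
B = A @ R_true.T + np.array([1.0, -2.0, 0.5])   # frame 2 observations

R, t = rigid_transform(A, B)
print("rotation error:", np.abs(R - R_true).max())   # ~0
print("recovered translation:", t.round(3))          # ~(1, -2, 0.5)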

Robot Drift
• Robot’s estimated position gradually deviates from actual position over time
• Factors that contribute
• Sensor noise: inaccuracies or anomalies
• Calibration errors: misalignment of components, incorrect calibration
• Environmental changes: terrain, lighting or magnetic field affect readings
• Cumulative integration: repeatedly integrating sensor measurements propagates and accumulates errors
• Uncertainty: complex or dynamic environments
• Methods for mitigation
• Sensor fusion: integrating data from multiple sensors measuring the same quantity
• Kalman Filtering: to mitigate noise and uncertainties
• Loop closure: correction mechanisms to correct accumulated error
• Environmental Constraints: correct drift by aligning the estimated pose with the actual environment
• Online calibration/recalibration: reduce systematic errors
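
A toy 1D Kalman filter in Python showing the mitigation idea: an error-prone motion estimate is repeatedly reined in by noisy external measurements. All noise levels and motion values are invented.

import numpy as np

rng = np.random.default_rng(2)
Q, R = 0.01, 0.25        # process and measurement noise variances (assumed)
x_est, p = 0.0, 1.0      # state estimate and its uncertainty
true_x = 0.0

for step in range(50):
    true_x += 0.1                  # the robot actually moves 0.1 per step
    x_est += 0.1                   # predict: motion model says the same
    p += Q                         # ...but uncertainty grows (drift)
    z = true_x + rng.normal(0.0, np.sqrt(R))   # noisy external fix (e.g. GPS)
    k = p / (p + R)                # Kalman gain
    x_est += k * (z - x_est)       # update: blend prediction and measurement
    p *= (1 - k)

print(f"true {true_x:.2f}  estimated {x_est:.2f}")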

Simultaneous Localisation and Mapping (SLAM)
• Technique used to enable a robot or an autonomous system to build a map of an unknown environment while estimating its own position
• Process
• Data Acquisition: images, range measurements, visual cues
• Extraction: key frames, key pairs and reference points
• Data association: identify common features across different positions and viewpoints to create a consistent map
• Mapping: construct map using point clouds, occupancy grids and feature-based maps
• Localisation: use of IMU and GPS data to estimate robot position and orientation
• Loop Closure: recognise previously visited areas to correct accumulated error and improve consistency

Sensor Fusion Model
• Process of combining information from multiple sensors to obtain a more accurate and comprehensive understanding of the environment/state
• Integration of data from different sensors to overcome the limitations of any single sensor
• Process
• Sensor selection: choosing appropriate sensors for the application
• Data acquisition: from selected sensors including measurements, images, point clouds
• Preprocessing: to remove noise, anomalies and align data spatially and temporally
• Data Fusion Algorithms: combine the data using statistical methods such as Kalman filters or Bayesian networks
• Fusion output: generation of data that provides a more accurate and comprehensive representation
• Benefits
• Improved accuracy and reliability
• More robust to individual sensor noise or failure
• Enhanced situational awareness
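
A simple fusion sketch in Python, a complementary filter: the gyroscope is accurate short-term but drifts, the accelerometer's gravity-based tilt is drift-free but noisy, and a weighted blend combines them. The blend weight and samples are invented.

import math

ALPHA, DT = 0.98, 0.01       # blend weight and sample period (assumed)
angle = 0.0                  # fused tilt estimate, radians

# (gyro rate rad/s, accelerometer (ax, az)) samples, hypothetical
samples = [(0.5, (0.05, 0.99)), (0.5, (0.10, 0.99)), (0.4, (0.14, 0.99))]

for gyro_rate, (ax, az) in samples:
    gyro_angle = angle + gyro_rate * DT     # integrate the gyro (drifts)
    accel_angle = math.atan2(ax, az)        # tilt from gravity (noisy)
    angle = ALPHA * gyro_angle + (1 - ALPHA) * accel_angle
print(f"fused tilt estimate: {angle:.4f} rad")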

Visual Simultaneous Localisation and Mapping (vSLAM)
• Technique of using visual information from cameras to simultaneously estimate pose and construct a map
• Step 1: Initialisation
• Camera calibration: estimation of intrinsic parameters (distortion, focal length)
• Feature extraction: visual features or key points from initial frames to serve as reference points.
• Pose estimation: estimate initial position relative to initial reference frame using PnP algorithm
• Map initialisation: a sparse map of the surroundings as a starting point
• Scale estimation: obtaining absolute size using LiDAR or depth cameras for accuracy
• Step 2: Local Mapping
• Feature extraction: visual features or key points from current camera frames that are distinctive and unique, using SIFT, SURF or ORB
• Feature tracking: track the position of the key points across consecutive frames to maintain consistency over time
• Triangulation: estimate the 3D position of key points and calculate spatial coordinates (see the sketch at the end of this section)
• Map representation: Triangulated key points placed on map
• Update: as per new processing of camera frames
• Step 3: Loop closure
• Feature matching: compares current frames with previous to find similarities or matches
• Similarity detection: determination of whether or not current frames contain similarities
• Hypotheses Generation: the algorithm generates candidate loop-closure hypotheses to correct accumulated error
• Verification and Consistency: to determine true loop closure
• Update and Correction: based on changes to revisited area
• Step 4: Re-localization (if any)
• Image/frame matching: find match between current frames and existing map
• Hypothesis Generation: about camera pose or position in environment
• Hypothesis verification: using RANSAC or geometric verification
• Map Re-association: the camera’s current position is re-associated with the map so the mapping process can continue
• Step 5: Tracking
• Feature extraction: of visual key points or features from frames
• Feature matching: with corresponding features from previous frames
• Motion estimation: using IMU / kinematic data
• Pose Update: based on pose estimation using features of current frame
• Robustness and error handling: recover from errors and give real-time updates
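
The triangulation sketch referenced in Step 2, using OpenCV; the projection matrices and pixel coordinates are synthetic and exactly consistent:

import cv2
import numpy as np

K = np.array([[500., 0., 320.],
              [0., 500., 240.],
              [0., 0., 1.]])

# Camera 1 at the origin; camera 2 shifted one unit along X (a stereo pair).
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

# The same key point observed in each image (pixel coordinates).
pt1 = np.array([[370.0], [290.0]])     # view from camera 1
pt2 = np.array([[270.0], [290.0]])     # shifted by the baseline parallax

X_h = cv2.triangulatePoints(P1, P2, pt1, pt2)   # homogeneous 4-vector
X = (X_h[:3] / X_h[3]).ravel()
print("triangulated 3D point:", X.round(3))     # ~(0.5, 0.5, 5.0)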
