
A Machine Learning Approach to Georeferencing
D Sudheer Reddy, D Rajesh Reddy, R Usha, Ankit Chaudhary, SS Solanki
Photogrammetric Cell, Advanced Data Processing Research Institute, Dept. of Space, Secunderabad, India.

sudheer@adrin.res.in, rajesh@adrin.res.in, usha@adrin.res.in, ankit@adrin.res.in, solanki@adrin.res.in

Abstract: Imaging from space involves complications that are quite different from those of airborne platforms such as MAVs, UAVs and drones. All these platforms require mathematical models that represent the geometry of image acquisition and support georeferencing of the acquired image. Conventionally, a Rigorous Sensor Model (RSM) involving mission-critical parameters and a sequence of rotations serves the purpose; alternatively, Rational Functional Models (RFM) are developed that empirically mimic the RSM to an acceptable degree of accuracy. In this paper, a machine learning approach is proposed for georeferencing satellite images, and its results are compared with those of RFM and RSM.

Keywords: Georeferencing, satellite imaging, image to ground, neural networks

I. INTRODUCTION
The geometry of image acquisition is an important aspect of any imaging platform, whether spaceborne or airborne. At the time of imaging, as shown in Fig. 1a (not to scale), the position and attitude deviate from their nominal values due to several perturbations caused by temperature variation, irregularity of the Earth's gravity, and the Moon and Sun. Further, the least count of the instruments that measure these parameters is not within the required specifications [1], [2], [3]. For example, a linear pushbroom scanner acquires a line in the order of a few microseconds, but the position and velocity are available only at intervals of the order of a few milliseconds. Consequently, these values have to be interpolated at the required times, with a trade-off on the residual error. Accurately modeling the geometry under the above concerns is a serious task for all mapping agencies. Different models have been developed by different space agencies over many decades to describe the relationship between image (2D) and ground (3D) coordinates for a given sensor at a given time. Rigorous Sensor Models (RSM) and Rational Functional Models (RFM) have been thoroughly studied and are the most widely employed industry standards for map production and GIS applications [4], [5].

Fig. 1a. Central projection model with sensor coordinate system to ground coordinate system.

II. RIGOROUS SENSOR MODELS FOR PUSHBROOM SENSORS
Remote sensing optical imaging missions usually carry a linear pushbroom array and use the spacecraft velocity to acquire the ground image by scanning line after line (Ground Sampling Distance = velocity × dwell time), as shown in Fig. 1b.
A rigorous sensor model attempts to describe the physical characteristics of the image acquisition by deriving a mathematical relationship between the image and the ground coordinate systems [6]. The observation that the image point, the perspective centre and the ground point are collinear can be exploited to form the collinearity equations, expressed as follows:

$$\begin{bmatrix} X_p \\ Y_p \\ Z_p \end{bmatrix} = \lambda \mathbf{R} \begin{bmatrix} x_s \\ y_s \\ -f \end{bmatrix} + \begin{bmatrix} X_s \\ Y_s \\ Z_s \end{bmatrix} \tag{1}$$

where $(X_p, Y_p, Z_p)$ and $(X_s, Y_s, Z_s)$ are the ground coordinates of the point and the camera perspective centre respectively, $(x_s, y_s)$ are the image coordinates of the corresponding ground point, $f$ is the principal distance of the camera and $\lambda$ is the scaling factor. $\mathbf{R}$ is the $3 \times 3$ orthogonal rotation matrix comprising the component rotations given by

$$\mathbf{R} = R_{GI} \cdot R_{IO} \cdot R_{OB} \cdot R_{BP} \cdot R_{PC},$$

where
$R_{PC}$: CCD to payload transformation matrix
$R_{BP}$: Payload to body transformation matrix
$R_{OB}$: Body to orbit matrix
$R_{IO}$: Orbit to inertial matrix
$R_{GI}$: Inertial to ground matrix, which is a function of liberation angle.

Reversing the operations of the rotation matrices, or using partial derivatives with further necessary computations, gives the inverse mapping from ground coordinate to image

978-1-5386-6678-4/18/$31.00 ©2018 IEEE
coordinate [7]. From here onwards, we shall refer to equation (1) as $I2G_{RSM}$ and to the inverse mapping as $G2I_{RSM}$.
The errors in satellite ephemeris data, which include orbit, attitude and interior orientation parameters, can affect the pointing accuracy of the line-of-sight vector of each pixel. To compensate for these errors, the orbital parameters are modeled by a 3rd degree polynomial and the attitude parameters by a 9th degree polynomial.

Grid Generation
Typically, a scene with a swath of 10 km and a strip length of 10 km at a ground sampling distance of 0.5 m results in an image size of 20,000 × 20,000 pixels. Geolocating each of these image pixels using the RSM is computationally expensive, as it involves solving equation (1) for each and every pixel (i.e., calling $I2G_{RSM}$ $4 \times 10^{8}$ times). This motivates an approximate representation of the RSM for per-pixel computation. One way to overcome this computational load is to solve equation (1) at discretely selected points with a constant spacing along both scan lines and pixels; this process is called grid generation, and from here onwards we refer to this grid as the RSM grid. A spacing of 500 in scan lines and pixels reduces the number of $I2G_{RSM}$ function calls to just 1600 without losing much accuracy. The geolocation of every pixel within a 500 × 500 pixel grid cell is obtained by interpolating from the grid cell corners. However, depending on the resolution, the complexity of the terrain and the ephemeris errors, a finer grid with smaller spacing may result in better accuracy than a sparse grid.

Fig. 1b. Image formation on CCD from line by line acquisition of ground scene by a linear pushbroom array.

Rigorous models developed by different space agencies use specific reference systems to describe the exact geometry of a sensor. The critical parameters such as orbit, attitude, imaging time, temperature, etc., are stored as metadata in specific proprietary formats and unit systems [8]. These formats lack consistency among different agencies as they are not standardized, and in some cases agencies refrain from sharing certain critical data with their users. These limitations demand the designer's valuable time for data preparation and pre-processing.

III. RATIONAL FUNCTIONAL MODEL
The RFM provides a transformation from 2D image space coordinates $(p, s)$ to 3D object space coordinates $(\phi, \lambda, h)$ in a geographic reference system. The RFM is considered a generic sensor model and in practice it serves as an approximation to the RSM [9]. The mathematical form of the RFM is given by

$$\phi = \frac{P_1(p, s, h)}{P_2(p, s, h)}, \qquad \lambda = \frac{P_3(p, s, h)}{P_4(p, s, h)} \tag{2}$$

where $p, s, \phi, \lambda, h$ are the normalized pixel, scan, latitude, longitude and height respectively. Normalization by scaling and offset is a numerical conditioning step that restricts the range of the coordinates to between -1.0 and 1.0 to avoid overflow errors during computations. The inverse mapping from object space to image space is given by

$$p = \frac{P_5(\phi, \lambda, h)}{P_6(\phi, \lambda, h)}, \qquad s = \frac{P_7(\phi, \lambda, h)}{P_8(\phi, \lambda, h)} \tag{3}$$

with

$$P_w(\alpha, \beta, \gamma) = \sum_{i=0}^{3} \sum_{j=0}^{i} \sum_{k=0}^{j} a^{w}_{ijk}\, \alpha^{i-j} \beta^{j-k} \gamma^{k},$$

where $a^{w}_{ijk}$ is called the rational polynomial coefficient (RPC) corresponding to polynomial $P_w$ for $w = 1, \ldots, 8$. $(\alpha, \beta, \gamma)$ denotes $(p, s, h)$ for $w = 1, \ldots, 4$ and $(\phi, \lambda, h)$ for $w = 5, \ldots, 8$ accordingly. We shall refer to equations (2) and (3) as $I2G_{RFM}$ and $G2I_{RFM}$ respectively.
The RSM grid discussed in Section II, relating $(p, s, h)$ to $(\phi, \lambda)$, was initially calculated with a constant height value $h_0$, i.e., $I2G_{RSM}(p, s, h_0) \Rightarrow (\phi_0, \lambda_0)$. By slicing the height range of the given scene into five intervals, say $h_i$, and calculating the corresponding $(\phi_i, \lambda_i)$ for each grid point, i.e., $I2G_{RSM}(p, s, h_i) \Rightarrow (\phi_i, \lambda_i)$ for $i = 1, 2, 3, 4$, we densify the RSM grid points at each height value. These are called anchor points and are used to derive the RPCs. Bias compensated RFMs are a successful approximation of the RSM and are currently the best method to geolocate each pixel in reasonable time, depending of course on the length of the strip. The anchor points represent the characteristics of the sensor and platform.

The RFMs are generic in the sense that they can be used for any sensor and any coordinate system. They also serve as a sensor independent universal metadata standard; consequently they are supported by almost all commercial

92 8th International Advance Computing Conference (IACC).


software. Several studies have shown that RFMs are sufficient for map making and other applications, though they are limited by their physical description [10]. Space agencies that do not want to disclose the details of their proprietary sensor properties resort to distributing RPCs as a primary level of protection for their sensor data. Recently, there have been attempts to recover some sensor information from RPCs [11]. In this context, a neural network model offers a much more secure way of distributing the sensor models.
Besides the advantages, there is also the risk of unstable solutions and over-parameterisation for longer strips. These limitations of RFMs inspire us to consider a machine learning paradigm as an alternative approach for RSM approximation.

IV. MACHINE LEARNING APPROACH
Approximating nonlinear functions of several variables by superpositions and sums of functions of one variable dates back to 1927 and is famously known as Hilbert's 13th problem [13]. Several classical approaches to approximation using polynomials, Fourier series, wavelets, radial basis functions, multivariate splines or ridge functions have been developed for approximating univariate functions. However, much of univariate approximation theory does not generalize well to higher dimensional spaces. Neural networks have emerged as a convenient method for such representation because they are universal approximators that can learn functions from sample data [13]. Many of their applications have proven successful in the areas of pattern recognition, pattern classification and function approximation where sufficient sample/training data was available [12]. This motivates us to look at the problem of georeferencing as a multivariate function approximation problem within the neural network framework.
In the last decade, ANNs have emerged as the most popular data driven algorithms capable of modeling complex nonlinear relationships between input and output data sets [14]. A typical network architecture for the problem under consideration is shown in Fig. 2. For the I2G function we take the inputs to be (scan, pixel, height) and the outputs to be (latitude, longitude). Similarly, for the G2I function we take the inputs to be (latitude, longitude, height) and the outputs to be (scan, pixel). The number of hidden layers and neurons need not be the same for both problems; further, we observe that I2G needs fewer layers and neurons for the forward approximation than the G2I function approximation does. The mathematical form of the input-output relationship for I2G using neural networks is as follows:

$$[lat_k, lon_k] = g_2\Big[\sum_{j} w_{kj}\, g_1\big(w_{j,0} + w_{j,1}\, scan + w_{j,2}\, pixel + w_{j,3}\, height\big) + w_{k0}\Big]$$

where $j$ indexes the hidden neurons, $k$ indexes the outputs, and $g_1$ and $g_2$ are the activation functions of the hidden and output layers. The variables in the inputs and outputs change accordingly for I2G or G2I respectively.

Network Architecture
For the experimentation with the proposed neural network based model, we have taken Cartosat-2E satellite data of different orbits and their corresponding RSM and RFM. The neural network architectures chosen for the forward (I2G) and backward (G2I) transformations are different. We experimented with different architectures and observed that a simple network architecture of two hidden layers with 25 neurons each was sufficient for the forward approximation, denoted by $I2G_{NN}$. The inverse transformation, denoted by $G2I_{NN}$, required more epochs to converge to a reasonably good solution with two hidden layers of 50 neurons each. In some cases, different configurations resulted in the same performance but showed varied levels of inconsistency across different data sets. The proportions of data used for training, validation and testing are 70%, 15% and 15% respectively.

Fig. 2. A typical NN architecture for three inputs and two outputs with two hidden layers.

V. RESULTS AND COMPARISONS
Small (<30 km) and long (>2000 km) image strips are taken from the Cartosat-2E satellite. For very long strips of length more than 2000 km with a GSD of 0.6 m, the number of scan lines scales to 36 lakhs (3.6 million) or more accordingly. Several image strips of length 28 km acquired over the Delhi area are studied, along with very long strips extending from Delhi to Karnataka. An RSM grid was constructed with a spacing of 500 in both scan and pixel directions, with heights extracted from SRTM-30 corresponding to the given strip. An anchor point grid at each (scan, pixel) was generated with 5 heights obtained by slicing the SRTM region for this strip. RPCs are generated from these anchor points. The grid points that fall in each scan line are stacked and indexed as a vector, and the errors between $I2G_{RSM}$ and $I2G_{RPC}$ and between $I2G_{RSM}$ and $I2G_{NN}$, at corresponding coordinates of the I2G and G2I responses, are plotted as shown in Fig. 3, Fig. 4, Fig. 5 and Fig. 6.
The following evaluation metrics are used to measure the prediction accuracy of the neural network generated model [15].
(1) Mean absolute error (MAE) $= \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$

(2) Mean absolute percentage error (MAPE) $= \frac{1}{n} \sum_{i=1}^{n} \frac{|y_i - \hat{y}_i|}{y_i}$

(3) Mean squared error (MSE) $= \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

where $\hat{y}_i$ is the predicted value of the variable at input $i$ and $y_i$ is the corresponding reference value.
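The three metrics above translate directly into code. The following minimal Python sketch implements them as written (MAPE is kept as a fraction, without a factor of 100 and without an absolute value in the denominator, to match the formula). The function names and the sample values are illustrative assumptions and are not taken from the experiments.

```python
def mae(y_true, y_pred):
    """Mean absolute error: (1/n) * sum(|y_i - yhat_i|)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction: (1/n) * sum(|y_i - yhat_i| / y_i)."""
    return sum(abs(t - p) / t for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error: (1/n) * sum((y_i - yhat_i)^2)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical reference values and predictions, chosen only for illustration.
reference = [2.0, 4.0, 8.0]
predicted = [1.0, 5.0, 6.0]

print(mae(reference, predicted))   # (1 + 1 + 2) / 3
print(mape(reference, predicted))  # (0.5 + 0.25 + 0.25) / 3
print(mse(reference, predicted))   # (1 + 1 + 4) / 3
```

Because MAPE divides by the reference value, it is undefined when a reference value is zero, which is worth keeping in mind for normalized coordinates.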

Fig. 3. The error plot between $G2I_{RPC}$ and $G2I_{RSM}$ models at the grid points.

For smaller image strips, a grid spacing of 500 in the scan and pixel directions produces sufficient samples to fit an RPC model, whereas it is quite inadequate for NN fitting. Hence we experimented by generating finer grids with spacings from 100 to 500, in increments of 100, in both the pixel and scan line directions. We observe that, for smaller strips of less than 30 km and a grid spacing of 500, the parametric RPC model does well compared to the NN models. However, the NN model can potentially be improved by increasing the number of grid points through a reduced grid spacing. For longer strips of more than 2000 km the RPC fitting deteriorates slightly; in this case NNs performed almost ten times better than RPCs. The $R^2$ values for the NN and RFM models are 99.8% and 98.6% in I2G and 99.6% and 99.6% in G2I respectively. The results are evaluated in terms of MAE, MAPE and MSE and are provided in Table I. The performance of NNs can be improved further, to match the RSM more closely, given more training data. Network architecture, number of iterations and tuning parameters such as learning rate/step size can also alter the performance; hence choosing the right architecture plays an important role.
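To make the effect of grid spacing on the number of training samples concrete, the sketch below counts anchor points for spacings of 100 to 500. It assumes the 20,000 × 20,000 pixel example scene from Section II and five height slices per grid location; the function name `anchor_point_count` and these scene dimensions are illustrative assumptions, not the configuration of the reported experiments.

```python
def anchor_point_count(n_scan, n_pixel, spacing, n_heights=5):
    """Number of anchor points on a regular grid with the given spacing.

    Grid locations are counted as (n_scan // spacing) * (n_pixel // spacing),
    which reproduces the 1600 locations quoted for a spacing of 500.
    """
    locations = (n_scan // spacing) * (n_pixel // spacing)
    return locations * n_heights


# Spacings of 100, 200, ..., 500 in both scan and pixel directions.
counts = {s: anchor_point_count(20_000, 20_000, s) for s in range(100, 501, 100)}
for spacing, count in counts.items():
    print(f"spacing {spacing}: {count} anchor points")
```

Under these assumptions, reducing the spacing from 500 to 100 multiplies the number of available samples by 25, which is the kind of increase the NN fitting benefits from.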

TABLE I. PREDICTIVE ACCURACIES OF RFM AND NN MODELS FOR I2G AND G2I.

Model      Side     MAE            MAPE            MSE
I2G_RFM    Lat      3.93051e-05    1.43130e-06     2.35079e-09
I2G_RFM    Lon      0.00012        1.06869e-06     2.38639e-08
I2G_NN     Lat      8.27606e-06    3.125000e-07    1.08563e-10
I2G_NN     Lon      1.35245e-05    1.201884e-07    2.83385e-10
G2I_RFM    Scan     0.000893       5.869709e-08    1.090000e-05
G2I_RFM    Pixel    0.000359       0.000939        2.501425e-06
G2I_NN     Scan     0.001780       1.557714e-04    6.922583e-06
G2I_NN     Pixel    0.001978       0.003463        6.824851e-06

Fig. 4. The error plot between $I2G_{RPC}$ and $I2G_{RSM}$ models at the grid points.

Fig. 5. The error plot between $G2I_{NN}$ and $G2I_{RSM}$ models at the grid points.
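The relative performance of the two models can be read off Table I by dividing the RFM error by the NN error for each output. The short sketch below does this arithmetic for the MAE and MSE columns. The dictionary layout and the function name `improvement_factor` are illustrative assumptions, while the numbers are copied from Table I.

```python
# (MAE, MSE) per transformation and output, copied from Table I.
TABLE_I = {
    ("I2G", "Lat"): {"RFM": (3.93051e-05, 2.35079e-09), "NN": (8.27606e-06, 1.08563e-10)},
    ("I2G", "Lon"): {"RFM": (0.00012, 2.38639e-08), "NN": (1.35245e-05, 2.83385e-10)},
    ("G2I", "Scan"): {"RFM": (0.000893, 1.090000e-05), "NN": (0.001780, 6.922583e-06)},
    ("G2I", "Pixel"): {"RFM": (0.000359, 2.501425e-06), "NN": (0.001978, 6.824851e-06)},
}

MAE, MSE = 0, 1  # positions of the metrics inside each tuple


def improvement_factor(row, metric_index):
    """RFM error divided by NN error; values above 1 mean the NN error is smaller."""
    return row["RFM"][metric_index] / row["NN"][metric_index]


for (transform, side), row in TABLE_I.items():
    print(
        f"{transform} {side}: "
        f"MAE factor {improvement_factor(row, MAE):.2f}, "
        f"MSE factor {improvement_factor(row, MSE):.2f}"
    )
```

A factor is computed per output rather than averaged, since the I2G and G2I outputs are expressed in different quantities (geographic versus image coordinates).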



Fig. 6. The error plot between $I2G_{NN}$ and $I2G_{RSM}$ models at the grid points.

CONCLUSION
In this paper we proposed a machine learning approach as an alternative method for georeferencing satellite images. The methodology is tested on various data sets, and the result for one typical data set with limited terrain undulations is presented. It is observed from our analysis that the proposed method overcomes the shortfalls of RFM in georeferencing long strips. Further, it can be noticed that distributing the sensor model as a neural network model provides a layer of protection for the sensor parameters. We anticipate that deep neural network models will provide better representations of rigorous sensor models for highly undulating terrains; hence there is scope for improving the NN model using deep architectures, which we envisage in our future work.

REFERENCES
[1] George Joseph, Building Earth Observation Cameras, pp. 225-227, Taylor & Francis Group, CRC Press, 2015.
[2] Manual of Photogrammetry, 5th ed., Chris McGlone, ASPRS, 2004, chapter 3, pp. 181-316.
[3] Changno Lee, H.J. Theiss, J.S. Bethel and Edward M. Mikhail, Rigorous mathematical modelling of airborne pushbroom imaging systems, PERS, April 2000, pp. 385-392.
[4] Daniela Poli, Review of developments in geometric modelling for high resolution satellite pushbroom sensors, The Photogrammetric Record, 27(137), pp. 58-73, March 2012.
[5] Ayman Habib, Sung Woong Shin, Kyungok Kim, Changjae Kim, Ki-In Bang, Eui-Myoung Kim, Dong-Cheon Lee, Comprehensive analysis of sensor modeling alternatives for high resolution imaging satellites, Photogrammetric Engineering & Remote Sensing, Vol. 73, No. 11, pp. 1241-1251, November 2007.
[6] D. Poli, A rigorous model for spaceborne linear array sensors, Photogrammetric Engineering & Remote Sensing, 73(2), pp. 187-196, 2007.
[7] P.V. Radhadevi, S.S. Solanki, V. Nagasubramanian, Archana Mahapatra, D. Sudheer Reddy, M.V. Jyothi, Krishna Sumanth, J. Saibaba, and Geeta Varadan, New era of Cartosat satellites for large scale mapping, Photogrammetric Engineering & Remote Sensing, pp. 1031-1040, Sep. 2010.
[8] T. Weser, F. Rottensteiner, J. Willneff, J. Poon and C.S. Fraser, Development and testing of a generic sensor model for pushbroom satellite imagery, The Photogrammetric Record, 23(123), pp. 255-274, 2008.
[9] C. Vincent Tao, Yong Hu, A comprehensive study of the rational function model for photogrammetric processing, PERS, Vol. 67, No. 12, pp. 1347-1357, Dec. 2001.
[10] M. Madani, Real-time sensor-independent positioning by rational functions, Proceedings of ISPRS Workshop on Direct Versus Indirect Methods of Sensor Orientation, Barcelona, Spain, pp. 64-75, 1999.
[11] Wen-chao Huang, Guo Zhang, and Deren Li, Robust approach for recovery of rigorous sensor model using rational function model, IEEE Trans. on Geoscience and Remote Sensing, 2016.
[12] Simon Haykin, Neural Networks and Learning Machines, 3rd ed., Pearson Education, Inc., Upper Saddle River, New Jersey 07458, 2009.
[13] V. Brattka, From Hilbert's 13th problem to the theory of neural networks: constructive aspects of Kolmogorov's superposition theorem, Chap. 13, Heritage in Mathematics, Springer, Heidelberg, 2007.
[14] Silvia Ferrari, Robert F. Stengel, Smooth function approximation using neural networks, IEEE Trans. on Neural Networks, Vol. 16, No. 1, January 2005.
[15] Usha A. Kumar, Comparison of neural networks and regression analysis: A new insight, Expert Systems with Applications, Vol. 29, Issue 2, pp. 424-430, 2005.
