Chapters Internship

Land-Cover and crop Identification using RS satellite Images
CHAPTER-1
Introduction of land-cover and identification of crops
using RS satellite images
Department of Computer Science and Engineering 2022-23 Page 1

CHAPTER 1
INTRODUCTION
1.1 State of the Art Developments

Significant progress has been made in recent years in crop and land cover
detection using RS satellite imagery. A major driver of this progress is the availability of
high-resolution satellite imagery, such as that from Sentinel-2, which records detailed spatial
and spectral data of the Earth's surface. This rich data makes it possible to identify and
categorise different land use types and crop varieties precisely.
The detection of crops and land cover has benefited greatly from ML techniques. Extreme
Gradient Boosting (XGB) and Convolutional Neural Networks (CNN) are two methods that
have performed well at extracting complex patterns from satellite imagery. These algorithms
can capture the non-linear relationships between spectral bands and land cover or crop
classes, which makes the identification more accurate and reliable.
The ability to identify land cover and crops has also been substantially improved by
integrating data from several sources. Radar data, geographic information systems (GIS), and
RS data can be used together to better understand the dynamics of land cover. This
integration improves discrimination between land cover classes, particularly in complex and
varied environments.
Overall, these developments have significantly advanced the identification of crops and land
cover from RS satellite images. Accuracy and efficiency have increased substantially thanks
to the accessibility of high-resolution data, the adoption of advanced image processing
techniques, and the use of ML algorithms. These developments have considerable potential
for use in sustainable development, environmental monitoring, land management, and
agriculture.
1.2 Motivation
The need for precise and efficient monitoring and management of land resources drives
research into land cover and crop identification using RS satellite imagery.
RS offers a unique vantage point that enables extensive and up-to-date studies of crop
distribution and land cover change. For a variety of applications, including agriculture,
environmental monitoring, urban planning, and emergency management, precise
identification of land cover classes and crops is essential. With RS satellite imagery, it
is possible to improve agricultural methods, identify signs of land degradation, and make
well-informed decisions for sustainable land management. This motivates the investigation
of advanced methods to improve the identification of crops and land cover with RS
technology.

1.3 Problem Statement

The problem addressed by this project is land cover and crop identification using
RS satellite images, a challenging task in the field of agricultural and environmental
monitoring. The problem entails accurately classifying different land cover types and
identifying specific crops based on satellite imagery data. The main challenge lies in
processing and analyzing the vast amount of data collected by satellites and extracting
meaningful information for land cover and crop classification. Additionally, factors such as
varying spectral characteristics, spatial resolution, and temporal changes in the satellite
images pose further complexities. Therefore, there is a need for robust and efficient
techniques that can effectively process and analyze satellite imagery to accurately identify
land cover classes and distinguish between different crop types. Addressing this problem will
contribute to improved land management, agricultural planning, and environmental
monitoring, facilitating informed decision-making for sustainable land use practices.
1.4 Objectives
● To identify the different types of land cover and their distribution within a specific region
using images in different spectral bands from the Sentinel-2 satellite of the European Space Agency.
● To recognize the various crops present in a region using the vegetation indices NDVI
and EVI calculated from RS data.
● To establish a reliable and cost-effective system for crop recognition and monitoring
using RS technology, which could be used by farmers, agronomists, and policymakers to
support decision-making.
1.5 Scope
The scope of the topic "land cover identification and crop identification using RS satellite
images" encompasses various aspects. It involves exploring and implementing advanced
image processing techniques, ML algorithms, and DL models to accurately classify and
identify land cover classes and different crop types. The scope extends to data acquisition,
preprocessing, feature extraction, and classification methodologies tailored for RS satellite
imagery. Additionally, the topic includes the integration of multi-source data, such as GIS and
climate data, to enhance classification accuracy. The outcomes of this research have
applications in agriculture, environmental monitoring, land management, and
decision-making processes, contributing to sustainable land use practices and effective
resource management.
1.6 Methodology
The methodology is shown in Fig 1 and consists of the following steps.
• Obtaining a dataset of Sentinel-2 satellite images along with the corresponding ground
truth data, so that each image can be mapped to its labels pixel by pixel.
• Training the system to identify the land cover shown on a satellite image, such as barren land,
crop land, and water bodies, using the ground truth.
• Calculating NDVI (Normalized Difference Vegetation Index) and EVI (Enhanced
Vegetation Index) from the remote sensing data from the satellite.
• Training the system to identify the different types of crops based on the values of these
indices, since each crop responds differently to the different spectral bands.
• Using the trained model to predict the crops in any particular area from the
satellite image of that area.

Fig 1: Methodology of Land cover and crop Identification
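The index calculation step can be sketched as follows. This is a minimal illustration rather than the project's actual code: it uses the standard NDVI and EVI definitions, assumes reflectance values scaled to the 0-1 range, and assumes the usual Sentinel-2 band assignment (B8 for near-infrared, B4 for red, B2 for blue). The function names and the sample pixel values are hypothetical.

```python
import numpy as np


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    denom = nir + red
    # Pixels with a zero denominator get NaN instead of raising a warning.
    return np.divide(nir - red, denom,
                     out=np.full_like(denom, np.nan), where=denom != 0)


def evi(nir, red, blue):
    """Enhanced Vegetation Index with the standard coefficients
    (gain 2.5, C1 = 6, C2 = 7.5, L = 1)."""
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    blue = np.asarray(blue, dtype=np.float64)
    denom = nir + 6.0 * red - 7.5 * blue + 1.0
    return np.divide(2.5 * (nir - red), denom,
                     out=np.full_like(denom, np.nan), where=denom != 0)


# Hypothetical reflectances for three pixels:
# dense vegetation, bare soil, water.
nir = np.array([0.50, 0.30, 0.02])
red = np.array([0.05, 0.25, 0.04])
blue = np.array([0.03, 0.15, 0.06])

print("NDVI:", ndvi(nir, red))
print("EVI: ", evi(nir, red, blue))
```

Both functions work element-wise, so the same calls accept whole band rasters as two-dimensional arrays.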
1.7 Summary
Land cover and crop identification play crucial roles in various fields such as agriculture,
environmental monitoring, and land management. RS satellite images provide a valuable
source of data for analyzing and understanding the Earth's surface, allowing for the
identification and classification of different land cover types and specific crops. The
advancements in RS technology have enabled the acquisition of high-resolution imagery with
multispectral capabilities, providing rich and detailed information about the Earth's surface.
RS satellite images offer a unique perspective, capturing the Earth's surface at different
wavelengths of the electromagnetic spectrum. By analyzing the spectral signatures of
different land cover types and crops, it is possible to differentiate between various features
such as forests, water bodies, urban areas, and agricultural fields. This information is vital for
monitoring changes in land use, assessing the health and productivity of crops, and
understanding the impact of human activities on the environment.
Accurate land cover and crop identification using RS data present numerous challenges. The
complexity arises from factors such as the variability in spectral responses due to different
atmospheric conditions, surface properties, and temporal changes. Additionally, the presence
of mixed pixels, where a single pixel represents multiple land cover types, further
complicates the classification process. Moreover, the classification accuracy can be
influenced by the choice of classification algorithms, feature extraction techniques, and the
availability of ground truth data for training and validation.

Addressing these challenges requires the development and implementation of advanced
methodologies and algorithms that can effectively extract meaningful information from RS
satellite images. ML and data mining techniques, such as decision trees, support vector
machines, and neural networks, have shown promising results in land cover and crop
identification tasks. These techniques enable the learning of patterns and relationships
between spectral signatures and land cover classes, leading to accurate classification results.
The benefits of accurate land cover and crop identification using RS satellite images are
far-reaching. They contribute to improved agricultural planning, land management, and
environmental monitoring. They enable policymakers to make informed decisions regarding
land use, resource allocation, and conservation efforts. Furthermore, accurate identification of
crop types can support precision agriculture practices, facilitating targeted interventions and
optimizing crop productivity.
In this project, the aim is to explore and evaluate various techniques and algorithms for land
cover and crop identification using RS satellite images. The project also investigates the
challenges and limitations associated with these techniques and proposes novel approaches to
enhance the accuracy and efficiency of the classification process. The findings from this research will
contribute to advancing our understanding of land cover dynamics and crop patterns,
providing valuable insights for sustainable land management and agricultural practices.

CHAPTER-2
Overview of land-cover and identification of crop using RS
satellite images

CHAPTER 2
Overview of Land-Cover and Crop Identification using RS
Satellite Images
2.1 Introduction
The overview provides a comprehensive understanding of the domain of land cover
identification and crop identification using RS satellite images. It delves into the key
concepts, methodologies, and applications related to this field. The overview discusses the
importance of accurate land cover and crop identification for effective resource management,
environmental monitoring, and decision-making processes. It explores the advancements in
RS technology and image processing techniques that have revolutionized the analysis of
satellite images. Additionally, it highlights the role of ML and DL algorithms in enhancing
classification accuracy. The overview sets the stage for further exploration of the subtopics
within this domain.
2.2 Relevant information

Remote sensing [1] is one of the more advanced methods for mapping crops across
regions. The most common approach to crop classification in remote sensing uses optical
images. As the spatial, temporal, and spectral resolutions of remote sensing data have
improved, classification results have become more reliable [2]. McNairn et al. [3] reported
successful results from integrating optical and SAR images to provide annual crop
inventories. Joshi et al. [4], in a study of 112 different land use areas that investigated the
integration of optical and radar data, concluded that the two are complementary and are
effective in mapping land use in detail and with high accuracy. Pereira et al. [5] evaluated
different methods of integrating optical and multi-polarization radar data for land mapping in
Brazil. They concluded that radar information improves user's accuracy and that HH
polarization (horizontal transmission and reception) contributes more than HV polarization
(horizontal transmission and vertical reception) to differentiating land use classes, but that
the integration of radar and optical data gave the best statistical results for land mapping.
Zhou et al. [6] used SAR images, optical images, and the integration of both data types to
evaluate the feasibility of winter wheat mapping. The classification map was produced from
a combination of Sentinel-1 data and optical images using a random forest method.
Due to their synoptic acquisitions and high revisit frequency, remote sensing data can
contribute significantly to providing periodic and accurate pictures of the agricultural sector
[8]. This is especially relevant because a new era of land cover analysis has emerged, enabled
by free and open access data (e.g., Sentinel-2 or Landsat 8 images), analysis-ready data,
high-performance computing, and rapidly developing data processing and analysis
capabilities [9]. For instance, a combination of data from Sentinel-2 (2A and 2B) and
Landsat 8 provides a global median average revisit interval of 2.9 days [10]. The growing
interest in the Copernicus programme within agriculture is also leading to the use of other
satellites, such as Sentinel-1 or Sentinel-3, for example to monitor agricultural droughts with
high consistency (Hu et al., 2019) [11]. With the improvement of satellite-based imaging and
image analysis protocols, satellite-based phenotyping applications can be expected to become
more common in agriculture [12].
In a recent review, Liu et al. presented many studies indicating that the combined use of
optical and SAR data can improve the accuracy of crop identification. Crop identification
from multitemporal and multispectral satellite imagery is usually done using Machine
Learning (ML); this is, specifically, the proposal of the EU Joint Research Centre (Devos et
al., 2018). NDVI time series can also be generated from data of previous years, and the
generated data can be used for early crop identification from NDVI time series when the
learning sample of the current year is small [13].
2.3 Summary
The overview provides a comprehensive summary of the domain of land cover identification
and crop identification using RS satellite images. This field holds significant importance in
various domains, such as agriculture, environmental monitoring, and land management.
Accurate identification of land cover classes and crop types plays a crucial role in effective
resource management, land-use planning, and decision-making processes. The overview
highlights the advancements in RS technology, including high-resolution satellite sensors that
provide detailed spatial and spectral information. It also emphasizes the role of advanced
image processing techniques in enhancing the accuracy of land cover and crop identification.
Furthermore, the overview discusses the relevance of ML and DL algorithms in this domain.
Ensemble methods and convolutional neural networks have emerged as powerful tools for
classification tasks, demonstrating remarkable performance in identifying land cover and
crop patterns. These developments enable improved understanding, monitoring, and
management of land cover dynamics and crop distribution using RS satellite images.
Overall, the overview sets the stage for exploring the subtopics within this domain, shedding
light on the advancements and potential applications of land cover identification and crop
identification using RS satellite images.

CHAPTER-3
Software Requirements Specification of land-cover and
identification of crop using RS satellite images

CHAPTER 3
Software Requirements Specification
3.1 Overall Description

The Software Requirements Specification (SRS) outlines the functional and nonfunctional
requirements for the land cover identification and crop identification software system. It
includes modules for data acquisition, preprocessing, and model training using specific
algorithms: Extreme Gradient Boosting (XGB) for land cover identification and
Random Forest for crop identification. The system should be capable of acquiring satellite
images, ground truth data, and spectral reflectance values. Preprocessing and the generation
of classification reports are also key functionalities. The SRS provides a detailed roadmap for the
development of the software system.
3.1.1 Product Perspective

The software system for land cover identification and crop identification operates as a
standalone product, independent of other systems. It interacts with satellite imagery, ground
truth data, and spectral reflectance values to perform data processing, model training, and
classification. The system functions autonomously and does not rely on external systems for
its operation.
3.1.2 Product Functions

● Acquire satellite images and corresponding ground truth data for land cover identification
from the provided dataset.
● Preprocess the acquired satellite images to enhance their quality and remove noise or
artifacts.
● Normalize and preprocess the dataset containing spectral reflectance values for crop
identification.
● Calculate vegetation indices (e.g., NDVI, EVI) to quantify vegetation density and health.
● Train the land cover identification model using the Extreme Gradient Boosting (XGB)
algorithm.
● Train the crop identification model using the Random Forest algorithm.
● Generate classification reports for evaluating the accuracy and performance of the trained
models.
● Provide a user-friendly interface to interact with the system, allowing users to input
parameters, view results, access relevant information, and analyze vegetation indices.
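The normalization step listed above could look like the following sketch. Min-max scaling is one common choice and is an assumption here, since the report does not state which normalization the project uses. The function name and the sample reflectance values are hypothetical.

```python
import numpy as np


def minmax_normalize(features):
    """Scale each column (one spectral band) of a (samples, bands) array to [0, 1]."""
    features = np.asarray(features, dtype=np.float64)
    lo = features.min(axis=0)
    hi = features.max(axis=0)
    span = hi - lo
    # A constant column would divide by zero; use a span of 1 so it maps to 0.
    safe_span = np.where(span == 0, 1.0, span)
    return (features - lo) / safe_span


# Hypothetical reflectance table: 3 samples x 3 bands.
raw = np.array([
    [1200.0, 800.0, 3000.0],
    [1500.0, 800.0, 3400.0],
    [1350.0, 800.0, 4200.0],
])
print(minmax_normalize(raw))
```

When a model is trained on the scaled values, the minimum and maximum would normally be computed on the training split only and then reused for validation and prediction data.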
3.1.3 User Characteristics

Researchers, agricultural experts, and environmental monitoring organisations are among the
intended users of the software system for identifying land cover and crops. Users should be
familiar with the fundamentals of RS and image processing. They should be at ease using
satellite images and be familiar with ideas related to crop and land cover categorization. For
optimal system use, familiarity with ML techniques and statistical analysis is recommended.

To accommodate users with varied degrees of technical competence, the system should have
a user-friendly interface.
3.1.4 Constraints and Dependencies

The land cover identification and crop identification software system is subject to certain
constraints and dependencies. It relies on the availability of satellite imagery, ground truth
data, and spectral reflectance datasets for accurate classification. The system's performance
may be influenced by factors such as image resolution, data quality, and the availability of
up-to-date training datasets. Additionally, the system's computational capabilities and
memory requirements should be considered to ensure efficient processing and storage of
large-scale satellite imagery and datasets.
3.2 Specific Requirements

The specific requirements for the land cover identification and crop identification software
system encompass the functional and non-functional aspects necessary for accurate
classification. These requirements include data acquisition, preprocessing, model training
using specific algorithms, calculation of vegetation indices, generation of classification
reports, and a user-friendly interface to facilitate user interaction and analysis of results.
3.2.1 Functional Requirements

The following functional requirements must be met:
● The system should be able to acquire satellite images and corresponding ground truth data
for land cover identification.
● It should preprocess the acquired satellite images to enhance their quality, remove noise
or artifacts, and normalize the spectral reflectance values for crop identification.
● The system should train the land cover identification model using the Extreme Gradient
Boosting (XGB) algorithm and the crop identification model using the Random Forest
algorithm.
● It should calculate vegetation indices, such as NDVI and EVI, to assess vegetation
density and health.
● The system should generate comprehensive classification reports, including accuracy
metrics and performance evaluation, to assess the effectiveness of the trained models.
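The training and reporting requirements above can be illustrated with scikit-learn. The sketch below covers only the Random Forest crop model and runs on synthetic data: the two features stand in for NDVI and EVI values, and the three class names, the class means, and the hyperparameters are all assumptions made for the example, not values from the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for real samples: two features per sample
# (think NDVI and EVI) for three hypothetical crop classes.
class_means = np.array([[0.2, 0.1], [0.5, 0.35], [0.8, 0.6]])
X = np.vstack([rng.normal(m, 0.03, size=(100, 2)) for m in class_means])
y = np.repeat(np.arange(3), 100)

# Hold out a stratified test split so the report reflects unseen samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print(classification_report(
    y_test, y_pred, target_names=["crop_a", "crop_b", "crop_c"]
))
print(f"accuracy = {accuracy:.3f}")
```

The classification report lists precision, recall, and F1-score per class, which is more informative than a single accuracy figure when some classes have few samples.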
3.2.2 Performance Requirements

● The system should be capable of processing satellite images and spectral reflectance
datasets in a timely manner, providing efficient and responsive performance.
● The classification models should demonstrate high accuracy in land cover identification
and crop identification, aiming for a minimum accuracy threshold of 90%.
● The system should handle large-scale datasets and satellite images without compromising
performance or experiencing significant delays during preprocessing, model training, and
classification stages.
● It should efficiently utilize computational resources, optimizing memory usage and
processing speed to ensure smooth operation even with large datasets.

● The program should provide a seamless and responsive experience, allowing users to
interact with the system and visualize classification results with minimal latency or lag.
3.2.3 Supportability
The program must be easy to use and maintain. It must be possible to update the program
with new data and features.
3.2.4 Software Requirements

● Operating System – Windows, macOS, or Linux
● Software – Google Colab (Python)
● Supported Browsers – Google Chrome, Microsoft Edge
● Database – Google Drive
3.2.5 Hardware Requirements

● PROCESSOR : Intel Pentium or Higher Version
● RAM : Minimum 1GB
● HARD DISK : 50GB and above
3.2.6 Design Constraints

● Compatibility: The software system should be designed to be compatible with various
satellite image formats, ground truth data formats, and spectral reflectance datasets. It
should support common file formats used in the RS field to ensure interoperability with
different data sources.
● Scalability: The design should accommodate scalability to handle increasing volumes of
satellite imagery and datasets. The system should be capable of efficiently processing and
analyzing large-scale datasets, allowing for future expansion and integration with
additional data sources or regions.
● Usability: The design should prioritize usability by incorporating a user-friendly interface
and intuitive navigation. It should be designed with clear and concise workflows,
providing user prompts and instructions to guide users through the process of data
acquisition, preprocessing, model training, and result interpretation. The system should
also offer visualizations and interactive tools to facilitate data exploration and analysis.
3.2.7 Interfaces
In the context of land cover and crop identification using remote sensing satellite images,
interfaces play a crucial role in facilitating user interaction and data analysis. User interfaces
can include intuitive web-based dashboards, interactive mapping tools, and data visualization
platforms, allowing users to explore and interpret the identified land cover and crop
information efficiently. These interfaces provide a user-friendly experience, enabling users to
make informed decisions and extract valuable insights from the satellite imagery data.
3.2.7.1 User Interfaces of the System

The user interface is a Google Colab notebook running on Google's cloud servers. No
dedicated UI has been created.

3.2.7.2 Software Interfaces of the System

The software system provides multiple interfaces to facilitate its functionality. The user
interface offers a user-friendly experience, allowing users to input parameters, view results,
and interact with the system. Data interfaces enable integration with various data sources,
while model interfaces integrate ML algorithms. Additionally, visualization and platform
interfaces enhance the user experience and optimize execution on Google Colab.
3.2.8 Non-Functional Requirements

● Performance: The system should demonstrate fast and efficient performance, ensuring
timely processing of satellite images and datasets. It should have low latency and minimal
response time, providing a seamless user experience.
● Reliability: The system should be reliable and robust, minimizing the occurrence of errors
or crashes. It should handle unexpected inputs gracefully and recover from failures,
ensuring the integrity and availability of the system.
● Scalability: The system should be designed to scale seamlessly, accommodating an
increasing volume of data and user demand. It should handle large datasets without
compromising performance and maintain high availability during peak usage.
● Security: The system should prioritize data security, implementing measures to protect
sensitive data, user credentials, and system integrity. It should enforce user authentication,
data encryption, and access controls to prevent unauthorized access or data breaches.
● Portability: The system should be portable, allowing it to be easily deployed and run on
different platforms or environments. It should have minimal dependencies and be
compatible with multiple operating systems, facilitating its usage in various computing
environments.
3.3 Summary
To summarise, the Software Requirements Specification (SRS) gives a thorough overview of
the functional and non-functional requirements for the land cover and crop identification
software system. It emphasises the importance of effective data collection, preprocessing,
model training, and interpretation of results. The SRS addresses user characteristics, design
constraints, and platform requirements, ensuring Google Colab compatibility. To develop a
robust and secure system, non-functional requirements such as scalability, reliability, and
security are prioritised. The SRS is a guiding document for the creation of a user-friendly
software system that uses RS satellite images to accurately identify land cover and crops.

CHAPTER-4
High Level Design of land-cover and identification of crop

CHAPTER 4
High Level Design
4.1 Design Considerations

Design considerations play a crucial role in shaping the high-level design of the land cover
identification and crop identification software system. Modularity ensures maintainability and
reusability, while scalability enables the system to handle increasing data volumes and user
demands. Flexibility allows for seamless integration of different data sources and algorithms,
fostering adaptability and future enhancements. Performance optimization focuses on
efficient processing, while error handling and resilience enhance system stability. User
experience emphasizes intuitive interfaces and informative visualizations. Security measures
safeguard sensitive data and system integrity. Lastly, extensibility ensures the system can
accommodate future feature additions. Careful consideration of these factors ensures a
well-designed, robust, and user-friendly software system.
4.1.1 General Constraints

The following general constraints must be considered when designing the program:
● Resource Limitations: The system should operate within the limitations of available
computational resources, such as processing power, memory, and storage capacity.
Efficient algorithms and data management techniques should be employed to optimize
resource utilization.
● Time Constraints: The system should meet specified time requirements for data
processing, model training, and classification. Timely delivery of results is crucial for
decision-making processes and user satisfaction.
● Data Availability: The system's effectiveness relies on the availability of accurate and
up-to-date satellite imagery and ground truth data. Ensuring access to reliable data
sources and establishing mechanisms for data acquisition and updates are essential.
● Technical Compatibility: The system should be compatible with the hardware, software,
and platforms on which it will be deployed. It should consider compatibility with
different operating systems, software libraries, and tools to ensure seamless integration
and ease of use.
● Regulatory Compliance: The system should adhere to legal and regulatory requirements
governing the use of satellite imagery, data privacy, and any other applicable regulations.
Compliance with laws, standards, and policies ensures ethical and legal use of the system
and protects user privacy and data security.
4.1.2 Development Methods

● Agile Development: Utilize an agile development methodology to facilitate iterative
and incremental development. This allows for frequent feedback, adaptability to
changing requirements, and early delivery of working software components.
● Collaboration and Communication: Foster effective collaboration and communication
among team members, stakeholders, and end-users. Regular meetings, clear
documentation, and open channels of communication promote a shared understanding
of project goals, progress, and challenges.
● Prototyping: Employ prototyping techniques to quickly develop and validate key
functionalities. Prototypes provide tangible representations of the software system,
allowing stakeholders to provide early feedback and refine requirements.
● Version Control: Utilize a version control system, such as Git, to manage source code,
track changes, and facilitate collaboration among developers. This ensures that
changes are tracked, code conflicts are resolved, and previous versions can be easily
accessed if needed.
● Testing and Quality Assurance: Implement comprehensive testing and quality
assurance processes to ensure the reliability and accuracy of the software system. This
includes unit testing, integration testing, and user acceptance testing to identify and
rectify any issues or bugs in the system.
4.2 Architectural Strategies
4.2.1 Programming Language

The following programming language, platform, and libraries are used during the
development of the project:
● Python: A general-purpose programming language used for a variety of tasks, including
data analysis, ML, and web development. It is a popular choice for data scientists because it
is easy to learn and use, and a large number of libraries and tools are available for data
analysis.
● Google Earth Engine: Google Earth Engine is a cloud-based platform for analyzing and
visualizing geospatial data. It provides a vast collection of satellite imagery and
geospatial datasets, along with powerful processing capabilities, allowing users to
perform RS and geospatial analysis tasks at scale.
● EarthPy: EarthPy is a Python library that facilitates the analysis of geospatial data using
open-source tools. It provides a simplified interface for accessing and manipulating
geospatial datasets, along with a suite of functions for common geospatial operations,
making it easier for users to work with raster and vector data in Python.
● ipyleaflet: ipyleaflet is a Python library that enables interactive mapping and visualization
within Jupyter notebooks. It provides a set of interactive mapping widgets based on the
Leaflet JavaScript library, allowing users to create interactive maps, add markers,
overlays, and layers, and interact with geospatial data in a Jupyter notebook environment.
● Rasterio: Rasterio is a Python library for reading, writing, and manipulating geospatial
raster datasets. It provides an intuitive and efficient API for working with raster data,
enabling tasks such as reading and writing raster files, cropping, reprojecting, and
extracting information from raster datasets.
● scikit-learn: scikit-learn is a popular Python library for ML. It provides a wide range of
ML algorithms, tools for data preprocessing, model evaluation, and cross-validation.
scikit-learn is widely used for classification, regression, and clustering tasks in various
domains, including RS and geospatial analysis.
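One practical detail when combining these libraries is array layout: Rasterio's dataset.read() returns arrays shaped (bands, rows, cols), while scikit-learn estimators expect a matrix shaped (samples, features). The sketch below shows the reshaping with NumPy only. A random array stands in for real imagery, and the variable names and sizes are hypothetical.

```python
import numpy as np

# Stand-in for the array returned by rasterio's dataset.read():
# shape (bands, rows, cols). Values are random, not real imagery.
rng = np.random.default_rng(42)
bands, rows, cols = 4, 3, 5
image = rng.random((bands, rows, cols))

# Move the band axis last, then flatten the two spatial axes so that
# each row of the matrix is one pixel and each column is one band.
pixels = np.moveaxis(image, 0, -1).reshape(-1, bands)

# A (rows, cols) ground-truth raster flattens in the same row-major
# order, so labels[i] belongs to pixels[i].
ground_truth = rng.integers(0, 3, size=(rows, cols))
labels = ground_truth.reshape(-1)

# Per-pixel predictions can be reshaped back into a map the same way.
label_map = labels.reshape(rows, cols)

print(pixels.shape, labels.shape, label_map.shape)
```

The pixel at row r and column c ends up at index r * cols + c, which is what keeps the features and the ground truth aligned pixel by pixel.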

4.2.2 User Interface Paradigm

For the crop identification and land cover classification using RS satellite images, the user
interface paradigm can be a Graphical User Interface (GUI).
The GUI can consist of visual elements such as interactive maps, image displays, and
controls to allow users to navigate and interact with the satellite images and the classification
results. Users can have the ability to select and zoom into specific areas of interest, overlay
different layers or classifications on the map, and access information about the identified
crops and land cover classes.
The GUI can provide intuitive and user-friendly features, such as dropdown menus or buttons
to select different datasets or classification algorithms, sliders or checkboxes for adjusting
parameters or visualizations, and tooltips or informative pop-ups to provide contextual
guidance.
By adopting a GUI paradigm, users can have a visually appealing and interactive interface
that facilitates easy exploration, analysis, and interpretation of the satellite images and the
crop identification results, enhancing their ability to make informed decisions based on the
provided information.
4.2.3. Error Detection and Recovery

Error detection and recovery mechanisms are crucial components of the system to ensure
robustness and reliability in the crop identification and land cover classification application.
These mechanisms help identify and handle errors or exceptions that may occur during data
processing or user interactions.
● Error Detection: The system should include error detection techniques to identify any
anomalies or inconsistencies in the data. This can involve validating input data, checking
for data integrity, and performing data quality checks. Error detection techniques can
include data validation rules, anomaly detection algorithms, and consistency checks.
● Error Handling: When an error is detected, the system should have mechanisms in place
to handle and recover from the error. This may involve displaying informative error
messages to the user, logging the error details for debugging purposes, and taking
appropriate actions to mitigate the impact of the error.
● Exception Handling: The system should implement exception handling techniques to
catch and handle runtime errors or exceptional conditions. This can involve using
try-except blocks (Python's form of try-catch) to capture and handle exceptions,
providing fallback mechanisms or alternative paths when errors occur, and ensuring
graceful degradation of functionality in case of errors.
● Data Recovery: In case of data loss or corruption, the system should have backup and
recovery mechanisms. Regular data backups, redundant storage, and recovery procedures
can help restore data integrity and minimize the impact of data loss.
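The detection and handling mechanisms listed above can be sketched in Python as follows. This is a minimal illustration, not the project's actual code: the band names in REQUIRED_BANDS, the assumption that reflectance is scaled to the range [0, 1], the logger name, and all function names are hypothetical choices made for the example.

```python
import logging

logger = logging.getLogger("rs_pipeline")  # hypothetical logger name

# Assumed for illustration: Sentinel-2 blue, green, red, and near-infrared bands.
REQUIRED_BANDS = ("B2", "B3", "B4", "B8")


class DataValidationError(ValueError):
    """Raised when an input sample fails a validation rule."""


def validate_sample(sample):
    """Error detection: check that required bands exist and hold valid values."""
    missing = [band for band in REQUIRED_BANDS if band not in sample]
    if missing:
        raise DataValidationError(f"missing bands: {missing}")
    for band in REQUIRED_BANDS:
        value = sample[band]
        # Assumption: surface reflectance has been scaled to the range [0, 1].
        if value is None or not 0.0 <= value <= 1.0:
            raise DataValidationError(f"invalid value for {band}: {value!r}")
    return sample


def filter_valid_samples(samples):
    """Error handling: log and skip bad samples instead of aborting the run."""
    valid, rejected = [], []
    for index, sample in enumerate(samples):
        try:
            valid.append(validate_sample(sample))
        except DataValidationError as error:
            logger.warning("sample %d rejected: %s", index, error)
            rejected.append(index)
    return valid, rejected


samples = [
    {"B2": 0.05, "B3": 0.08, "B4": 0.06, "B8": 0.40},  # valid
    {"B2": 0.05, "B3": 0.08, "B4": 0.06},              # B8 is missing
    {"B2": 0.05, "B3": 0.08, "B4": 1.70, "B8": 0.40},  # B4 is out of range
]
valid_samples, rejected_indices = filter_valid_samples(samples)
```

Skipping a rejected sample while logging the reason is one form of graceful degradation: the rest of the batch is still processed, and the log records which inputs need attention.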

4.2.4. Data Storage Management

In the context of crop identification and land cover classification using RS satellite images,
efficient data storage management is essential to handle the large volumes of data involved in
the process. The following aspects should be considered for effective data storage
management:
● Data Organization: The system should employ a structured approach to organize and store
the satellite images, ground truth data, classification results, and any associated metadata.
This can involve creating a well-defined directory structure, using file naming
conventions, and maintaining a centralized database or file system.
● Data Validation and Integrity: Prior to storing the data, it is crucial to perform data
validation and integrity checks. This includes verifying the accuracy, completeness, and
consistency of the data. Techniques such as checksums or hash functions can be used to
ensure data integrity during storage and retrieval.
● Data Compression and Optimization: Given the large size of satellite images, employing
data compression techniques can significantly reduce storage requirements. Lossless
compression methods, such as ZIP or PNG, can be used to compress the images without
sacrificing data quality. Additionally, optimizing data storage by eliminating redundant or
irrelevant information can help save space and improve overall performance.
● Scalability and Accessibility: The data storage management system should be designed to
handle the scalability requirements of the application. This involves accommodating
increasing volumes of data and ensuring accessibility to multiple users or systems
concurrently. Implementing distributed storage architectures or cloud-based solutions can
help achieve scalability and provide seamless data access.
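The integrity and compression points above can be illustrated with Python's standard library. The sketch below is only an assumption about how such a check could look: it uses SHA-256 as the hash function and zlib (DEFLATE) as the lossless compressor, and the record layout and function names are hypothetical.

```python
import hashlib
import zlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 checksum of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def store(data: bytes) -> dict:
    """Compress the data losslessly and keep the checksum of the original bytes."""
    return {"payload": zlib.compress(data, level=9), "sha256": sha256_digest(data)}


def retrieve(record: dict) -> bytes:
    """Decompress a stored record and verify its integrity before returning it."""
    data = zlib.decompress(record["payload"])
    if sha256_digest(data) != record["sha256"]:
        raise ValueError("checksum mismatch: stored data is corrupted")
    return data


# Stand-in for the raw bytes of a raster tile (repetitive, so it compresses well).
tile = bytes([10, 10, 10, 12] * 1024)
record = store(tile)
restored = retrieve(record)
```

Because the compression is lossless, the restored bytes are identical to the original tile, and the checksum comparison detects any change that occurs between storage and retrieval.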
4.3. System Architecture

The system architecture for crop identification and land cover classification using RS
satellite images involves a layered and modular design to facilitate efficient processing
and analysis of data. The architecture can be divided into the following components:
● Data Acquisition: This component deals with the acquisition of satellite images and
ground truth data. It involves accessing and downloading satellite images from sources
like Google Earth Engine or other data providers. Ground truth data, such as labeled
samples or reference data, may be obtained through field surveys or existing datasets.
● Preprocessing: The acquired data undergoes preprocessing to enhance its quality and
prepare it for analysis. Preprocessing steps can include image calibration, atmospheric
correction, geometric correction, and spectral band alignment. This ensures consistency
and accuracy in the data before further processing.
● Feature Extraction: In this stage, relevant features are extracted from the preprocessed
satellite images. These features may include spectral indices (e.g., NDVI, EVI), texture
measures, or spatial information.
● Model Training and Classification: The extracted features are used to train ML models for
land cover classification and crop identification. Techniques like extreme gradient
boosting (XGB) and random forests are employed. The trained models are then used to
classify the satellite images into different land cover classes or specific crop types.

● Post-processing and Visualization: After classification, post-processing techniques are
applied to refine the results. This can involve filtering, smoothing, or spatial aggregation
to improve the accuracy of the classification. The final results are visualized through
maps, charts, or interactive interfaces to facilitate interpretation and analysis.
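The spectral indices named in the feature extraction step can be computed per pixel from the band reflectances. The sketch below uses the standard definitions NDVI = (NIR - Red) / (NIR + Red) and EVI = 2.5 (NIR - Red) / (NIR + 6 Red - 7.5 Blue + 1). The tiny 2 x 2 band grids, the function names, and the choice to return 0.0 when a denominator is zero are assumptions made for the example.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    denominator = nir + red
    if denominator == 0:
        return 0.0  # assumption: treat pixels with no signal as 0
    return (nir - red) / denominator


def evi(nir, red, blue):
    """Enhanced Vegetation Index for one pixel (standard coefficients)."""
    denominator = nir + 6.0 * red - 7.5 * blue + 1.0
    if denominator == 0:
        return 0.0  # assumption: same fallback as for NDVI
    return 2.5 * (nir - red) / denominator


# Illustrative reflectance values for a 2 x 2 image, one grid per band.
nir_band = [[0.50, 0.45], [0.30, 0.02]]
red_band = [[0.10, 0.08], [0.25, 0.03]]
blue_band = [[0.05, 0.04], [0.20, 0.04]]

ndvi_map = [
    [ndvi(n, r) for n, r in zip(nir_row, red_row)]
    for nir_row, red_row in zip(nir_band, red_band)
]
evi_map = [
    [evi(n, r, b) for n, r, b in zip(nir_row, red_row, blue_row)]
    for nir_row, red_row, blue_row in zip(nir_band, red_band, blue_band)
]
```

Dense vegetation reflects strongly in the near-infrared and weakly in the red band, so such pixels give NDVI values close to 1, while water and bare surfaces give values near or below 0.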
Fig 4.3.1 below shows the flow diagram for land cover identification and Fig 4.3.2 shows the
flow diagram for crop identification. These flowcharts provide a visual representation of the
overall process of land cover and crop identification using remote sensing satellite images.
Each flowchart begins with data acquisition, where satellite images are collected and
preprocessed.
Fig 4.3.1: Flow diagram for land cover Identification
Fig 4.3.2: Flow diagram for crop Identification

Then, the data is fed into the classification algorithm, such as machine learning models or
image processing techniques, to identify land cover classes and crop types. The flowchart
further encompasses steps such as feature extraction, training the model, testing, and
validation. Additionally, post-processing steps, such as accuracy assessment and result
visualization, are incorporated in the flowchart. The flowchart helps to illustrate the
sequential order of operations and the decision-making process involved in the identification
of land cover and crops from satellite imagery.
4.4. Data Flow Diagrams
4.4.1. Data Flow Diagram – Level 0

At level 0 of the data flow diagram for land cover and crop identification, the main
components include the input data sources, such as satellite imagery and ground truth data,
which are processed through various modules. These modules encompass image
preprocessing, feature extraction, classification algorithms, and result evaluation. The output
of the system includes identified land cover classes and crop types. The level 0 data flow
diagram provides a high-level overview of the data flow and interaction between different
components in the system for efficient land cover and crop identification using remote
sensing satellite images. Fig 4.4.1 shows the level 0 DFD with two specific processes, which
are explained in the level 1 DFDs.
Fig 4.4.1: Data Flow Diagram – Level 0
4.4.2. Data Flow Diagram – Level 1

At level 1 of the data flow diagram for land cover and crop identification, the system
components are further detailed, depicting the specific processes and data flows within each
module. It includes modules for image preprocessing, feature extraction, classification, and
evaluation. Inputs, such as satellite imagery and ground truth data, are transformed through
various data processing steps, and the outputs are generated at each module. The level 1 data
flow diagram provides a more detailed representation of the data flow and processes involved
in the land cover and crop identification system, aiding in understanding the system's
functionality and interactions. Fig 4.2.2.1 shows the diagram of process 1, i.e., preprocessing,
and Fig 4.2.2.2 shows the diagram of process 2, i.e., feature extraction.

Fig 4.2.2.1: Data Flow Diagram – Level 1 for process 1
Fig 4.2.2.2: Data Flow Diagram – Level 1 for process 2
4.5. Summary
The high-level design for crop identification and land cover classification using RS satellite
images involves a layered architecture that encompasses data acquisition, preprocessing,
feature extraction, model training and classification, post-processing, and visualization. The
system utilizes advanced techniques such as ML and image processing to accurately identify
land cover classes and crop types. The design ensures scalability, modularity, and efficient
data management, enabling accurate analysis and interpretation of satellite images for
agricultural and environmental applications.

CHAPTER-5
Detailed Design of land-cover and identification of crop

CHAPTER 5
Detailed Design
5.1. Structure Chart

The structure chart for land cover and crop identification illustrates the hierarchical
organization and interrelationships between the different system modules. It depicts the
modules as boxes and shows the flow of control between them through arrows. The structure
chart helps visualize the system's architecture, including the main modules, sub-modules, and
their dependencies.
Fig 5.1.1: Structure chart for land cover Identification
Fig 5.1.2: Structure chart for crop Identification

It provides a clear overview of the system's structure and helps in understanding how the
different components interact and collaborate to achieve the desired functionality of land
cover and crop identification. Fig 5.1.1 shows the diagram of the structure chart of Land
cover identification and Fig 5.1.2 shows the diagram of the structure chart of Crop
identification.
5.2. Functional Description of the Modules
Land Cover Identification Module:
The Land Cover Identification Module is an essential component of the overall system for
land cover and crop identification using remote sensing satellite images. Its primary function
is to analyze the satellite imagery data and classify the land cover classes present in the target
area.
The module performs a series of tasks to achieve accurate land cover identification. Firstly, it
takes input in the form of pre-processed satellite images, which include bands from sensors
like Sentinel-2A. These images capture the spectral reflectance information of the Earth's
surface across various wavelengths.
Next, the module applies image processing techniques to enhance the quality of the images,
remove noise, and normalize the data for further analysis. Feature extraction algorithms are
then utilized to identify relevant features in the images that can discriminate between
different land cover classes. These features may include texture, shape, and spectral
characteristics.
The module employs machine learning algorithms, such as eXtreme Gradient Boosting
(XGB), to train a classification model using labeled ground truth data. The model learns the
relationship between the extracted features and the corresponding land cover classes. This
trained model is then used to classify the land cover in unseen satellite images.
The output of the Land Cover Identification Module is a classified map or raster, where each
pixel is assigned a specific land cover class label. This map provides valuable information
about the distribution and extent of different land cover types in the target area.
Overall, the Land Cover Identification Module plays a crucial role in the system by utilizing
advanced image processing techniques and machine learning algorithms to accurately
identify and classify land cover classes from remote sensing satellite images. Its outputs serve
as a foundation for further analysis and decision-making in applications related to land
management, environmental monitoring, and urban planning.
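The flow described for this module (train on labeled samples, then assign a class label to every pixel to form a classified map) can be sketched as follows. To keep the sketch self-contained, a simple nearest-centroid classifier stands in for the XGB model named in the text; it is not the project's model. The two-feature pixels (red and near-infrared reflectance), the class names, and the function names are illustrative assumptions.

```python
import math


def train_centroids(features, labels):
    """Learn one mean feature vector (centroid) per land cover class."""
    sums, counts = {}, {}
    for vector, label in zip(features, labels):
        counts[label] = counts.get(label, 0) + 1
        total = sums.setdefault(label, [0.0] * len(vector))
        for position, value in enumerate(vector):
            total[position] += value
    return {
        label: tuple(value / counts[label] for value in sums[label])
        for label in sums
    }


def predict(centroids, vector):
    """Return the class whose centroid is closest to the pixel's features."""
    return min(
        centroids,
        key=lambda label: math.dist(centroids[label], tuple(vector)),
    )


def classify_image(centroids, image):
    """Assign a class label to every pixel, producing a classified map."""
    return [[predict(centroids, pixel) for pixel in row] for row in image]


# Labeled ground truth samples: (red, near-infrared) reflectance per pixel.
train_features = [
    (0.03, 0.02), (0.04, 0.03),   # water
    (0.05, 0.45), (0.06, 0.50),   # vegetation
    (0.25, 0.30), (0.28, 0.32),   # built-up
]
train_labels = ["water", "water", "vegetation", "vegetation", "built_up", "built_up"]

centroids = train_centroids(train_features, train_labels)

# An unseen 2 x 2 image; each pixel is a (red, near-infrared) feature vector.
image = [
    [(0.04, 0.02), (0.05, 0.48)],
    [(0.26, 0.31), (0.06, 0.44)],
]
classified_map = classify_image(centroids, image)
```

The output has the same rows and columns as the input image, with a class label in place of each pixel's feature vector, which is the structure of the classified raster described above.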

Crop Identification Module:
The Crop Identification Module is an integral part of the land cover and crop identification
system using remote sensing satellite images. Its main function is to analyze the spectral
information captured by the satellite imagery and accurately identify different crop types in
the target area.
The module begins by receiving pre-processed satellite images, which may include bands
from sensors like Sentinel-2A. These images contain valuable spectral reflectance data that
can be used to differentiate between various crop species based on their unique spectral
signatures.
Next, the module applies image processing techniques to enhance the quality of the images
and extract relevant information. This may involve correcting for atmospheric effects,
normalizing the data, and reducing noise.
The module utilizes machine learning algorithms, such as Random Forest, to train a
classification model using labeled ground truth data. The model learns the complex
relationships between the spectral characteristics of the satellite images and the
corresponding crop types. This trained model is then applied to classify the crop types in
unseen satellite images.
The output of the Crop Identification Module is a classified map or raster, where each pixel is
assigned a specific crop type label. This map provides valuable information about the
distribution and composition of different crops in the target area, enabling better agricultural
planning, yield estimation, and crop monitoring.
Overall, the Crop Identification Module plays a crucial role in the system by leveraging
advanced image processing techniques and machine learning algorithms to accurately
identify and classify crop types from remote sensing satellite images. Its outputs contribute to
improved agricultural management, crop yield forecasting, and decision-making in the
agricultural sector.
5.3. Summary
The detailed design of the land cover and crop identification system involves the
specification of various components and their interactions to achieve the desired
functionality. It includes the design of algorithms, data structures, and interfaces that facilitate
accurate identification and classification of land cover and crops using remote sensing
satellite images.
Advanced machine learning techniques such as eXtreme Gradient Boosting (XGB) and
Convolutional Neural Networks (CNNs) are implemented to handle the complex
relationships and patterns within the satellite imagery. The design incorporates preprocessing
steps to enhance the quality and consistency of the input data, such as image calibration,
normalization, and feature extraction.

Additionally, the design includes the integration of multi-source data, such as geographic
information system (GIS) data, radar data, and thermal data, to improve the accuracy and
robustness of the identification process. Data fusion techniques are applied to combine the
information from different sources and create a comprehensive understanding of the land
cover and crop types.
The detailed design also focuses on optimizing the computational efficiency of the system.
Parallel processing techniques, distributed computing, and cloud computing resources are
leveraged to handle large datasets and enable faster processing times. Furthermore, the design
includes strategies for error detection, handling, and recovery to ensure the reliability and
robustness of the system.
Overall, the detailed design of the land cover and crop identification system provides a
comprehensive blueprint for implementing the necessary algorithms, data structures, and
processing steps to achieve accurate and efficient identification and classification of land
cover and crops using remote sensing satellite images.

CHAPTER-6
Implementation of land-cover and identification of crop

CHAPTER 6
Implementation
6.1. Programming Language Selection

The following PL, platform, and libraries were used during the development of the project:
● Python: Python is a general-purpose PL that is used for a variety of tasks, including data
analysis, ML, and web development. It is a popular choice for data scientists because it is
easy to learn and use, and there are a large number of libraries and tools available for data
analysis.
● Google Earth Engine: Google Earth Engine is a cloud-based platform for analyzing and
visualizing geospatial data. It provides a vast collection of satellite imagery and
geospatial datasets, along with powerful processing capabilities, allowing users to
perform RS and geospatial analysis tasks at scale.
● EarthPy: EarthPy is a Python library that facilitates the analysis of geospatial data using
open-source tools. It provides a simplified interface for accessing and manipulating
geospatial datasets, along with a suite of functions for common geospatial operations,
making it easier for users to work with raster and vector data in Python.
● ipyleaflet: ipyleaflet is a Python library that enables interactive mapping and visualization
within Jupyter notebooks. It provides a set of interactive mapping widgets based on the
Leaflet JavaScript library, allowing users to create interactive maps, add markers,
overlays, and layers, and interact with geospatial data in a Jupyter notebook environment.
● Rasterio: Rasterio is a Python library for reading, writing, and manipulating geospatial
raster datasets. It provides an intuitive and efficient API for working with raster data,
enabling tasks such as reading and writing raster files, cropping, reprojecting, and
extracting information from raster datasets.
● scikit-learn: scikit-learn is a popular Python library for ML. It provides a wide range of
ML algorithms, tools for data preprocessing, model evaluation, and cross-validation.
scikit-learn is widely used for classification, regression, and clustering tasks in various
domains, including RS and geospatial analysis.
6.2. Platform Selection

Google Colab, a cloud-based platform, was chosen as the primary environment for
implementing the land cover and crop identification system. This decision was based on
several key factors that make Google Colab well-suited for the project requirements.
Firstly, Google Colab provides easy access to powerful computational resources, including
high-performance GPUs. This is essential for handling the large volumes of satellite images
and conducting computationally intensive tasks involved in processing and analyzing RS
data. The availability of such resources ensures efficient and timely execution of the
algorithms.
Secondly, Google Colab comes with many libraries and frameworks for ML and data analysis
pre-installed, and any additional packages needed for satellite image processing can be
installed from within a notebook. This minimizes manual setup and configuration, saving
valuable time and effort in preparing the development environment. It allows developers to
focus on implementing and fine-tuning the land cover and crop identification algorithms
rather than dealing with infrastructure setup.
Moreover, Google Colab offers a collaborative environment that facilitates easy sharing and
collaboration on notebooks. This is particularly beneficial for team-based projects, enabling
multiple stakeholders to work together seamlessly, share code, and provide feedback in
real-time.
Additionally, Google Colab integrates well with other Google services, such as Google Drive,
allowing convenient storage and retrieval of large datasets and project files. This ensures data
availability and accessibility, even across different devices and locations.
Furthermore, Google Colab is a reliable and stable platform, with regular updates and
maintenance by Google. This ensures that the system remains up-to-date with the latest
features, bug fixes, and security patches.
In summary, Google Colab's accessibility to powerful computational resources, pre-installed
libraries, collaborative features, integration with Google services, and reliability make it an
ideal platform for implementing the land cover and crop identification system. It provides a
robust and efficient environment for developing, testing, and deploying the necessary
algorithms, ultimately enabling accurate and timely analysis of RS satellite images.
6.3. Code Conventions

Code conventions are a set of rules that govern the style and formatting of code. They are
used to make code more readable, understandable, and maintainable.
6.3.1. Naming Conventions

In the land cover and crop identification project, adhering to clear naming conventions is
crucial for ensuring code clarity and reducing confusion. Some common naming conventions
include:
● Descriptive names: Using meaningful and descriptive names for variables, functions, and
classes helps in understanding their purpose and functionality at a glance. For example,
variables related to satellite images can be named "satelliteImage" or "imageData."
● Camel case or underscores: Choosing a consistent convention for combining words in
names can improve readability. Camel case involves capitalizing the first letter of each
word except the first, while underscores separate words with underscores. For example,
"cropIdentification" or "crop_identification."
● Avoiding ambiguous abbreviations: It is important to avoid using unclear or ambiguous
abbreviations in names. Instead, opt for descriptive terms that convey the purpose of the
component. For instance, use "classificationModel" instead of "clsModel."

● Consistent casing: Decide on a consistent casing convention, such as using all lowercase,
all uppercase, or title case, and apply it consistently throughout the codebase. Consistency
aids in code comprehension and reduces errors.
● Prefixes or suffixes: Consider using prefixes or suffixes to differentiate variables or
functions that serve similar purposes but have different roles. For example, "trainData"
and "testData" or "calculateNDVI" and "calculateEVI" for different vegetation indices.
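As an illustration of these conventions, the short sketch below applies descriptive names and the train/test prefixes to a small helper. The identifiers (split_train_test, train_data, test_data, TEST_FRACTION) are hypothetical and are not taken from the project code. Python's PEP 8 style guide recommends snake_case for variables and functions and CapWords for classes, so the sketch uses that form; the main point is to apply one convention consistently.

```python
import random

TEST_FRACTION = 0.2  # constants: upper case with underscores


def split_train_test(samples, test_fraction=TEST_FRACTION, seed=42):
    """Shuffle a copy of the samples and split it into training and testing parts."""
    shuffled_samples = list(samples)
    random.Random(seed).shuffle(shuffled_samples)
    test_count = round(len(shuffled_samples) * test_fraction)
    test_data = shuffled_samples[:test_count]
    train_data = shuffled_samples[test_count:]
    return train_data, test_data


pixel_samples = list(range(10))  # stand-in for labeled pixel samples
train_data, test_data = split_train_test(pixel_samples)
```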
6.3.2. File Organization

Files should be organized in a way that makes it easy to find the code that is needed. They
should be named using consistent conventions and placed in logical directories. All project
files were stored in Google Drive so that they could be accessed easily from Google Colab.
6.3.3. Declarations
Declarations are used to define variables, functions, and classes.
● Descriptive names were used for variables, functions, and classes.
● Consistent data types were used for variables.
● Comments were used to explain what the declarations do.
6.3.4. Comments
Comments are used to explain what the code is doing. Comments have been used throughout the
implementation to explain the purpose of each part of the code, so that a reader can
understand its relevance.
6.4. Difficulties Encountered and Strategies Used to Tackle Them

During the implementation of the land cover and crop identification system, several
difficulties were encountered. However, effective strategies were employed to tackle these
challenges and ensure the successful development of the project.
● Data preprocessing challenges: The RS satellite images often required extensive
preprocessing to remove noise, handle missing data, and normalize the spectral bands. To
address this, thorough research was conducted to identify appropriate preprocessing
techniques. Techniques such as interpolation, outlier removal, and spectral band
normalization were applied to ensure the quality and consistency of the data.
● Model selection and parameter tuning: Choosing the right ML models and optimizing
their parameters was a significant challenge. To tackle this, a comprehensive evaluation
of various models such as XGBoost, Random Forest, and Support Vector Machines was
performed. Cross-validation and grid search techniques were employed to fine-tune the
model parameters and improve their performance.
● Handling large-scale datasets: Dealing with a large volume of satellite images and
spectral reflectance data required efficient storage and processing. Google Colab's
cloud-based infrastructure and integration with Google Drive helped in managing and
accessing large datasets. Techniques such as data batching and parallel processing were
employed to optimize the performance and ensure timely execution.
● Class imbalance and accuracy evaluation: Addressing class imbalance, particularly in
crop identification, posed a challenge. Techniques like oversampling, undersampling, or
weighted loss functions were used to handle the imbalance and improve model
performance. Additionally, precision, recall, and F1-score were used along with
accuracy to evaluate the model's performance accurately.
● Software compatibility and dependencies: Ensuring compatibility and resolving
dependencies between different libraries, frameworks, and software versions was
challenging. The relevant documentation was studied thoroughly to identify and
address any compatibility issues. Libraries such as scikit-learn, rasterio, and earthpy were
carefully selected to ensure compatibility and smooth integration within the system.
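One common way to obtain the weights for a weighted loss function, mentioned under class imbalance above, is inverse class frequency: weight = n_samples / (n_classes * class_count), which is the heuristic behind scikit-learn's "balanced" class weights. The sketch below is an assumption about how such weights could be computed; the crop names and counts are invented for illustration and are not the project's data.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency in the training labels."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Hypothetical, imbalanced crop labels: one majority and two minority classes.
labels = ["wheat"] * 6 + ["rice"] * 3 + ["mustard"] * 1
weights = balanced_class_weights(labels)
```

Rare classes receive weights above 1 and frequent classes weights below 1, so errors on under-represented crops contribute more to the training loss.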
Overall, the difficulties encountered were mitigated through careful research,
experimentation, and the adoption of appropriate strategies. By addressing data preprocessing
challenges, optimizing model selection and parameters, handling large-scale datasets
efficiently, managing class imbalance, and resolving software compatibility issues, the
implementation of the land cover and crop identification system was successfully
accomplished.
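The combination of cross-validation and grid search used for parameter tuning can be sketched in plain Python. The example below is illustrative only: a one-feature k-nearest-neighbour classifier stands in for the XGBoost, Random Forest, and Support Vector Machine models that were actually evaluated, and the NDVI-like values, labels, candidate values of k, and function names are assumptions. In scikit-learn, GridSearchCV provides this procedure for real estimators.

```python
from collections import Counter


def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, test_indices) pairs for contiguous folds."""
    fold_sizes = [
        n_samples // n_folds + (1 if fold < n_samples % n_folds else 0)
        for fold in range(n_folds)
    ]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_indices = list(range(start, stop))
        train_indices = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_indices, test_indices
        start = stop


def knn_predict(train_x, train_y, x, k):
    """Predict by majority vote among the k nearest training values."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cross_val_accuracy(xs, ys, k, n_folds):
    """Average accuracy over all folds for one candidate parameter value."""
    correct = 0
    for train_indices, test_indices in k_fold_indices(len(xs), n_folds):
        train_x = [xs[i] for i in train_indices]
        train_y = [ys[i] for i in train_indices]
        correct += sum(
            knn_predict(train_x, train_y, xs[i], k) == ys[i] for i in test_indices
        )
    return correct / len(xs)


def grid_search(xs, ys, candidate_ks, n_folds):
    """Score every candidate and return the best one with all scores."""
    scores = {k: cross_val_accuracy(xs, ys, k, n_folds) for k in candidate_ks}
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Illustrative NDVI-like feature values with alternating class labels.
xs = [0.10, 0.70, 0.15, 0.65, 0.20, 0.80, 0.12, 0.75, 0.18, 0.72]
ys = ["bare", "crop"] * 5
best_k, scores = grid_search(xs, ys, candidate_ks=(1, 3, 5), n_folds=5)
```

Each sample is used for testing exactly once, so the score for a candidate reflects performance on data the model did not see during training for that fold.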
6.5. Summary
The implementation of the land cover and crop identification system involved several key
steps. Firstly, the required satellite images and corresponding ground truth data were
obtained. Data preprocessing techniques were applied to handle noise and missing values and
to normalize the spectral bands. ML models, such as XGBoost for land cover identification and
Random Forest for crop identification, were trained using the preprocessed data. The models
were evaluated using classification reports to assess their accuracy and performance. Google
Colab was utilized as the platform for implementation, leveraging its cloud-based
infrastructure and compatibility with libraries such as scikit-learn and rasterio. The
implementation process involved iterative refinement, parameter tuning, and addressing
challenges encountered along the way. The successful implementation of the system provides
accurate land cover and crop identification using RS satellite images.

CHAPTER-7
Software Testing of land-cover and identification of crop

CHAPTER 7
Software Testing
7.1. Test Environment

● Tools: Google Colab, Google Drive
● Test Environment: Windows 10 Pro
● Dataset: satellite image dataset, plus an additional image of an area near Ambala
district in Haryana.
7.2. Testing Process

The testing process employed for the land cover and crop identification system followed a
comprehensive approach to ensure the accuracy and reliability of the implemented solution.
The testing process consisted of the following steps:
● Unit Testing: Individual modules and functions were tested in isolation to verify their
correctness and functionality. This involved writing test cases and conducting tests to
validate the expected behavior of each module. In particular, tests validated the imported
data points and confirmed that the data kept a consistent size after functions were
applied to it.
● Integration Testing: The integration of different modules and components was thoroughly
tested to ensure their seamless interaction and compatibility. This involved testing the
flow of data and the integration of algorithms and models. As a part of integration testing,
graphs were plotted from the normalized data points and checked to verify and better
analyze the data.
● Performance Testing: The system's performance was evaluated under various scenarios,
including processing large datasets, handling multiple requests simultaneously, and
managing memory usage. Performance benchmarks were set, and tests were conducted to
measure response times, resource consumption, and scalability.
● Validation Testing: The accuracy of the land cover and crop identification results was
validated by comparing them with ground truth data. A sample set of satellite images with
known land cover and crop classes was used to assess the system's accuracy and evaluate
its ability to correctly identify and classify the land cover and crops. A classification
report was printed to validate the results against the corresponding ground truth.
● User Acceptance Testing: The system was tested by end-users to ensure its usability,
functionality, and alignment with user requirements. Feedback from users was collected
and incorporated into the system's further refinement.
Throughout the testing process, test cases were documented, and any identified issues or bugs
were tracked and addressed. The iterative testing approach allowed for continuous
improvement and ensured the system's reliability and accuracy in land cover and crop
identification using RS satellite images.
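The size-consistency check described under unit testing can be written as an automated test. The sketch below is a hypothetical example, not the project's test code: fill_missing is an assumed preprocessing helper that fills missing observations (None) by linear interpolation, and the test case verifies that the output has the same size as the input and contains no missing values.

```python
import unittest


def fill_missing(values):
    """Fill None entries by linear interpolation between known neighbours.

    Missing values at either end take the nearest known value.
    """
    known = [i for i, value in enumerate(values) if value is not None]
    if not known:
        raise ValueError("series has no valid observations")
    filled = list(values)
    for i, value in enumerate(values):
        if value is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            fraction = (i - left) / (right - left)
            filled[i] = values[left] + fraction * (values[right] - values[left])
    return filled


class DataConsistencyTest(unittest.TestCase):
    def test_size_is_preserved(self):
        series = [0.2, None, 0.4, None, None, 0.7]
        self.assertEqual(len(fill_missing(series)), len(series))

    def test_no_missing_values_remain(self):
        filled = fill_missing([None, 0.2, None, 0.4])
        self.assertNotIn(None, filled)

    def test_interpolated_value(self):
        filled = fill_missing([0.2, None, 0.4])
        self.assertAlmostEqual(filled[1], 0.3)


# Run the test case explicitly so the sketch also works inside a notebook cell.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DataConsistencyTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```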

7.3. Tests Performed

The testing phase of the land cover and crop identification system involved a series of tests
conducted to ensure the functionality and accuracy of the implemented solution. The
tests performed included:
● Dataset Testing: Verified the integrity and quality of the input datasets by examining their
structure, content, and compatibility with the system requirements. This involved
checking for missing data, outliers, and inconsistencies in the spectral reflectance values.
The dataset was checked multiple times for consistency, including by plotting. The test
case is shown in Table 7.4.1.
● Model Testing: Evaluated the performance of the ML models trained for land cover
identification and crop identification. This included assessing the accuracy of the models
by comparing their predictions with the ground truth data, verifying that the models were
able to correctly classify land cover types and identify specific crop classes. Model
testing took the most time, as the models must learn from a large number of data points.
Each trained model was then applied to a separate set of data points, held out as testing
data, to check how well it had learned, as seen in Table 7.4.2.
● Classification Report Testing: Generated classification reports to assess the performance
metrics of the models, such as accuracy, precision, recall, and F1 score. These reports
provided insights into the effectiveness of the models in classifying land cover and crops
and allowed the results to be compared with expected values. A classification report was
printed for both the land cover identification and crop identification modules as evidence
of the accuracy of the models' predictions, as seen in Tables 7.4.4 and 7.4.5.
● Error Handling Testing: Intentionally introduced errors or anomalies into the input data to
test the system's ability to handle such cases. This involved simulating scenarios such as
missing spectral bands, corrupted images, or incorrect ground truth labels, and verifying
that the system appropriately detected and handled these errors without crashing or
producing inaccurate results. This is an important part of the testing phase because
multiple online platforms, such as Google Drive and Google Earth Engine, were used to
access data in different ways. Connections to such platforms need an error handling
mechanism that reports the connection status, as shown in Table 7.4.3.
By performing these tests, I was able to validate the accuracy, reliability, and robustness of
the land cover and crop identification system. Any issues or discrepancies encountered during
testing were identified and addressed to improve the overall performance and functionality of
the system. The results of the tests are presented as test cases in the next section. Testing
at every phase of the implementation is necessary because a failure in even one module can
cause the entire system to fail or to give unpredictable results.
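The data consistency check (Test Case 1) can be illustrated with a short sketch. The report does not state how missing values were filled, so the column-median imputation below is an assumption, and `fill_missing` and `shape` are hypothetical helper names.

```python
from statistics import median


def fill_missing(rows):
    """Replace None entries with the median of the observed values in that column.

    Median imputation is an assumed strategy for illustration only.
    """
    fills = []
    for column in zip(*rows):
        observed = [v for v in column if v is not None]
        fills.append(median(observed) if observed else 0.0)
    return [[fills[j] if v is None else v for j, v in enumerate(row)] for row in rows]


def shape(rows):
    """Return (number of rows, number of columns) of a rectangular matrix."""
    return (len(rows), len(rows[0]) if rows else 0)


# Illustrative reflectance-like values with two gaps.
raw = [
    [0.12, None, 0.30],
    [0.10, 0.22, None],
    [0.14, 0.26, 0.34],
]
clean = fill_missing(raw)

# The test passes when preprocessing preserves the matrix size and leaves no gaps.
assert shape(clean) == shape(raw)
assert all(v is not None for row in clean for v in row)
```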

7.4 Test Cases

Table 7.4.1 : Test Case 1 - Data Consistency
Sl No. of test case: 1
Name of test case: Data Consistency
Feature being tested: Consistent data points after preprocessing to deal with missing data
Sample Input: Matrix of data points with known size
Expected Output: Same-size output
Actual Output: Same-size output
Remarks: Pass
Table 7.4.2 : Test Case 2 - Model Testing
Sl No. of test case: 2
Name of test case: Model Testing
Feature being tested: Trained model
Sample Input: Testing data set aside during the train/test split
Expected Output: Model predicts in line with what it learned from the training dataset
Actual Output: Prediction with good accuracy
Remarks: Pass
Table 7.4.3 : Test Case 3 - Error Handling
Sl No. of test case: 3
Name of test case: Error Handling
Feature being tested: Connection to online platforms
Sample Input: URL of the platform
Expected Output: Connection successful
Actual Output: Connection successful
Remarks: Pass
Table 7.4.4 : Test Case 4 - Classification Report of Land Cover Identification
Sl No. of test case: 4
Name of test case: Model Accuracy
Feature being tested: Land cover classification
Sample Input: Sentinel-2 images of the Sundarbans and corresponding ground truth
Expected Output: All types of land cover are mapped
Actual Output: Land cover types are mapped according to the colour selected
Remarks: 99% accuracy achieved

Table 7.4.5 : Test Case 5 - Classification Report of Crop Identification
Sl No. of test case: 5
Name of test case: Model Accuracy
Feature being tested: Crop identification
Sample Input: Surface reflectance data of Sentinel-2A with corresponding ground truth
Expected Output: Crops are classified on the image
Actual Output: Crops are classified on the satellite image based on the selected colour convention
Remarks: 93% accuracy achieved
7.5 Summary
The testing process for the land cover and crop identification system is a crucial step in
ensuring its functionality and accuracy. Through unit testing, integration testing, data
validation, performance testing, and accuracy evaluation, the system is thoroughly assessed
for its performance and adherence to requirements. The testing process involves verifying
individual module functionality, seamless integration, handling of diverse datasets,
performance optimization, and comparison of results with ground truth data. Any identified
issues are addressed through debugging and refinement. The testing process aims to validate
the system's reliability, accuracy, and ability to deliver precise land cover and crop
identification results, ensuring a robust and dependable solution.

CHAPTER-8
Experimental Result and Analysis of land-cover and
identification of crop using RS satellite images

CHAPTER 8
Experimental Result and Analysis
8.1. Evaluation Metrics

The evaluation metrics used in this project are the overall accuracy of the predictions and
the classification report, which is printed for both land cover identification and crop
identification.
Table 8.1: Classification report for Land Cover Identification
A classification report can be used to assess the effectiveness of a land cover and crop
identification system. It gives precise information regarding the correctness and efficacy of
the classification process. The report typically includes precision, recall, F1-score, and
support for each land cover or crop type. Precision is the fraction of correctly classified
examples among all instances predicted to belong to a specific class. Recall is the fraction
of correctly classified examples among all instances that genuinely belong to that class. The
F1-score is the harmonic mean of precision and recall and gives a balanced evaluation of
system performance. The support is the number of samples in each class. Analysing the
classification report can provide insight into the system's accuracy and highlight areas for
improvement. Table 8.1 shows the complete classification report for land cover
identification, in which the accuracy is 99 percent, together with the macro average and
weighted average of every metric in the report. Table 8.2 shows the complete classification
report for crop identification, in which the accuracy is 93 percent, together with the macro
average and weighted average of every metric in the report.
Table 8.2: Classification report for Crop Identification
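To make the metric definitions in this section concrete, the sketch below computes precision, recall, F1-score, support, accuracy, and the macro and weighted F1 averages from two label lists. It is a plain-Python illustration, not the code that produced Tables 8.1 and 8.2; the function name `classification_metrics` and the six sample labels are invented for the example.

```python
def classification_metrics(y_true, y_pred):
    """Per-class precision, recall, F1 and support, plus overall accuracy and averages."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    per_class = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": tp + fn,  # number of true samples of this class
        }
    total = len(y_true)
    accuracy = sum(1 for t, p in pairs if t == p) / total
    # Macro average: every class counts equally.
    macro_f1 = sum(m["f1"] for m in per_class.values()) / len(labels)
    # Weighted average: every class counts in proportion to its support.
    weighted_f1 = sum(m["f1"] * m["support"] for m in per_class.values()) / total
    return per_class, accuracy, macro_f1, weighted_f1


# Invented labels for illustration only.
y_true = ["water", "water", "forest", "forest", "cropland", "cropland"]
y_pred = ["water", "water", "forest", "cropland", "cropland", "cropland"]

per_class, accuracy, macro_f1, weighted_f1 = classification_metrics(y_true, y_pred)
for label, m in per_class.items():
    print(label, {k: round(v, 3) for k, v in m.items()})
print("accuracy:", round(accuracy, 3))
```

In this example one forest pixel is predicted as cropland, so forest keeps a precision of 1.0 but its recall drops to 0.5, while cropland keeps a recall of 1.0 but its precision drops to 2/3. This is why the report lists both metrics per class instead of accuracy alone.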
8.2 Experimental Dataset
Fig 8.2.1 : Land cover Dataset

The experimental dataset used for land cover identification consists of S2A satellite images
of an area in the Sundarbans, together with the corresponding ground truth. Fig 8.2.1 shows
the chosen area of the Sundarbans on a wider map, alongside its satellite image, to give a
rough idea of the data.
The image library contains images of the same area for all 13 spectral bands of S2A. Fig 8.2.2
shows the satellite images for 11 of the 13 bands. The central wavelengths of the S2A bands
are:
● Band 1 (Coastal aerosol): 442.7 nm
● Band 2 (Blue): 492.4 nm
● Band 3 (Green): 559.8 nm
● Band 4 (Red): 664.6 nm
● Band 5 (Vegetation red edge): 704.1 nm
● Band 6 (Vegetation red edge): 740.5 nm
● Band 7 (Vegetation red edge): 782.8 nm
● Band 8 (NIR): 832.8 nm
● Band 8A (Narrow NIR): 864.7 nm
● Band 9 (Water vapor): 945.1 nm
● Band 10 (SWIR - Cirrus): 1373.5 nm
● Band 11 (SWIR - Atmospheric window): 1613.7 nm
● Band 12 (SWIR - Atmospheric window): 2202.4 nm
Fig 8.2.2: RS Images of Dataset of the Sundarbans area
For the crop identification part, a dataset was obtained containing surface reflectance
values for an area near the Sundarbans, along with the corresponding ground truth, covering a
period of four months. The median of the surface reflectance values over this period was
calculated, and the vegetation indices were computed from these median values.
Below are two tables describing the data points that are essential for crop identification.
Table 8.2.1 lists the surface reflectance values of the different crops across the entire
image. These data points are then used to calculate the vegetation indices shown in Table
8.2.2. Together with the ground truth points, which record the crop in each pixel (or
"unknown" where the crop is not known), they form a strong dataset on which to train our
machine learning models.
Table 8.2.1 : Surface Reflectance Values of Crops Identified
Table 8.2.2 : Vegetation Index of the crops Identified
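The median compositing and vegetation index calculation described in this section can be sketched as follows. The NDVI and EVI formulas are the standard ones; the mapping of Blue, Red and NIR to Sentinel-2 bands B2, B4 and B8, the 0 to 1 reflectance scale, and the four sample observations are assumptions made for illustration and are not values from the project dataset.

```python
from statistics import median


def median_composite(observations):
    """Per-band median over a time series of observations (band name -> reflectance)."""
    bands = observations[0].keys()
    return {band: median(obs[band] for obs in observations) for band in bands}


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    return (nir - red) / total if total else 0.0


def evi(nir, red, blue):
    """Enhanced Vegetation Index with the standard coefficients (G=2.5, C1=6, C2=7.5, L=1)."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


# One pixel observed once a month for four months (illustrative values only).
# Assumed band mapping: B2 = Blue, B4 = Red, B8 = NIR.
observations = [
    {"B2": 0.04, "B4": 0.06, "B8": 0.40},
    {"B2": 0.05, "B4": 0.08, "B8": 0.50},
    {"B2": 0.03, "B4": 0.05, "B8": 0.45},
    {"B2": 0.04, "B4": 0.07, "B8": 0.48},
]

composite = median_composite(observations)
ndvi_value = ndvi(composite["B8"], composite["B4"])
evi_value = evi(composite["B8"], composite["B4"], composite["B2"])
print(round(ndvi_value, 3), round(evi_value, 3))
```

Taking the median before computing the indices reduces the influence of a single cloudy or otherwise anomalous acquisition on the features passed to the classifier.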
8.3 Performance analysis

For the performance analysis, the areas of different land cover were marked on the satellite
images, and the accuracy of this mapping was calculated at 99%. Fig 8.3.1 shows the marked
areas, in which light green represents croplands, orange represents barren land, blue
represents water, and dark green represents forest.

Fig 8.3.1: Land Cover Plotting
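The colour convention used in Fig 8.3.1 can be expressed as a simple lookup from predicted class to display colour. The class ids and exact RGB values below are hypothetical; only the pairing of colour name and land cover type comes from the text.

```python
# Hypothetical class ids and RGB triples; the colour names follow the convention in the text.
CLASS_COLOURS = {
    0: ("cropland", (144, 238, 144)),    # light green
    1: ("barren land", (255, 165, 0)),   # orange
    2: ("water", (0, 0, 255)),           # blue
    3: ("forest", (0, 100, 0)),          # dark green
}


def colourise(label_grid):
    """Convert a 2-D grid of predicted class ids into a grid of RGB triples."""
    return [[CLASS_COLOURS[class_id][1] for class_id in row] for row in label_grid]


# A tiny 2 x 2 prediction grid for illustration.
predicted = [
    [2, 3],
    [0, 1],
]
rgb_image = colourise(predicted)
print(rgb_image)
```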
For crop identification, a dataset was procured for an area near the Sundarbans containing
farmland with a variety of crops. It holds the surface reflectance values of each band of the
satellite image over a period of four months, along with the ground truth points mapped for
each crop in the region. The median of the surface reflectance values was calculated.
From these medians, the vegetation indices NDVI and EVI were calculated. Their values are
shown in Table 8.2.2, and their variation is plotted in Fig A2 of Appendix 1. The vegetation
index is an important metric for predicting crops from remote sensing data. Once the
vegetation indices have been calculated, a Random Forest model is trained on the data.
The model is then ready to predict crops for any region for which remote sensing data is
available and where the same types of crops are grown. Sentinel-2A data was obtained for a
nearby region whose ground truth points were available, and the model was run on it to predict
the crops grown. The mapped satellite image is shown in Fig 8.3.2, and the classification
report for it gives an accuracy of 93%.

Fig 8.3.2 : Crop Identification plotting on image
8.4 Summary
The experimental results for land cover identification and crop identification using remote
sensing satellite images are promising. The system achieved an accuracy of 99% in identifying
land cover classes, indicating that it distinguishes the different land cover types
effectively. The crop identification component achieved an accuracy of 93% in classifying the
crop types within the satellite imagery. These results indicate that the developed models
provide reliable information about land cover and crop patterns on the datasets tested, and
they underscore the potential of remote sensing technology in supporting land management and
agricultural decision-making processes.

CHAPTER-9
Conclusion of land-cover and identification of crop using
RS satellite images

CHAPTER 9
Conclusion
9.1 Limitation of the project

While the land cover and crop identification project using RS satellite images has shown
promising results, there are certain limitations that need to be acknowledged. These
limitations include:
● Data Availability: The project heavily relies on the availability and quality of satellite
images and ground truth data. Limited access to high-resolution satellite imagery or
ground truth data can restrict the accuracy and coverage of the identification process.
● Sensor Limitations: Different satellite sensors have varying spectral bands and
resolutions, which may affect the accuracy and consistency of the identification results.
The project's performance may be influenced by the specific satellite sensor used.
● Cloud Cover: Cloud cover can obstruct satellite images and affect the visibility of land
cover and crops, leading to incomplete or inaccurate identification results. Cloud masking
techniques can help mitigate this limitation, but it remains a challenge in regions with
frequent cloud cover.
● Training Data Representativeness: The accuracy of the identification models heavily
relies on the representativeness and diversity of the training data. Biases or imbalances in
the training dataset can affect the generalization capabilities of the models and lead to
reduced accuracy.
● Sensitivity to Environmental Factors: The identification of land cover and crops can be
influenced by environmental factors such as seasonal changes, weather conditions, and
inter-class spectral variations. Adapting the models to account for these factors can be
challenging and may require frequent retraining or updates.
Understanding these limitations is crucial for setting realistic expectations and identifying
areas for improvement in the land cover and crop identification system. It highlights the need
for ongoing research and development to address these challenges and enhance the accuracy
and applicability of the system.
9.2 Future enhancements

Several future enhancements can further improve the land cover and crop identification system
using RS satellite images. These include integrating multiple satellite data sources,
exploring advanced ML techniques such as DL, incorporating phenological information for crop
identification, leveraging cloud computing and parallel processing for scalability, and
developing user-friendly interfaces and visualization tools. These enhancements would improve
the accuracy, efficiency, and usability of the system, leading to more effective land
management, agricultural planning, and environmental monitoring.

9.3 Summary
A classification report can be used to assess the effectiveness of a land cover and crop
identification system. It gives precise information regarding the correctness and efficacy of
the classification process. The report typically includes precision, recall, F1-score, and
support for each land cover or crop type. Precision is the fraction of correctly classified
examples among all instances predicted to belong to a specific class.
Recall is the fraction of correctly classified examples among all instances that genuinely
belong to that class. The F1-score is the harmonic mean of precision and recall and gives a
balanced evaluation of system performance. The support is the number of samples in each class.
Analysing the classification report can provide insight into the system's accuracy and
highlight areas for improvement.
However, it is critical to identify the project's limitations and constraints, such as data
availability, sensor limitations, and susceptibility to environmental conditions. These
constraints emphasise the importance of continued research and development to overcome
these issues and improve the system's accuracy and resilience.
Future developments, including combining several satellite data sets, employing sophisticated
machine learning techniques, adding phenological data, and upgrading the user interface,
provide prospects for significant improvements in the system's performance and usability.
Overall, the land cover and crop identification system demonstrated in this study is a useful
tool for land management, agricultural, and environmental applications. With continued
improvement and extension of its capabilities, the system may contribute to informed
decision-making, sustainable land use, and effective resource management in a variety of
fields.

References
1. Kosari, A.; Sharifi, A.; Ahmadi, A.; Khoshsima, M. Remote sensing satellite’s
attitude control system: Rapid performance sizing for passive scan imaging mode.
Aircr. Eng. Aerospace. Technol. 2020, 92, 1073–1083.
2. Wu, F.; Wu, B.; Zhang, M.; Zeng, H.; Tian, F. Identification of crop type in
crowdsourced road view photos with deep convolutional neural network. Sensors
2021, 21, 1165.
3. McNairn, H.; Champagne, C.; Shang, J.; Holmstrom, D.; Reichert, G. Integration of
optical and Synthetic Aperture Radar (SAR) imagery for delivering operational
annual crop inventories. ISPRS J. Photogramm. Remote Sens. 2009, 64, 434–449.
4. Joshi, N.; Baumann, M.; Ehammer, A.; Fensholt, R.; Grogan, K.; Hostert, P.; Jepsen,
M.R.; Kuemmerle, T.; Meyfroidt, P.; Mitchard, E.T.A.; et al. A review of the
application of optical and radar remote sensing data fusion to land use mapping and
monitoring. Remote Sens. 2016, 8, 70.
5. De Oliveira Pereira, L.; Da Costa Freitas, C.; Anna, S.J.S.S.; Lu, D.; Moran, E.F.
Optical and radar data integration for land use and land cover mapping in the
Brazilian Amazon. GIScience Remote Sens. 2013, 50, 301–321.
6. Zhou, T.; Pan, J.; Zhang, P.; Wei, S.; Han, T. Mapping winter wheat with
multi-temporal SAR and optical images in an urban agricultural region. Sensors
2017, 17, 1210.
7. Wu, C.F.; Deng, J.S.; Wang, K.; Ma, L.G.; Tahmassebi, A.R.S. Object-based
classification approach for greenhouse mapping using Landsat-8 imagery. Int. J.
Agric. Biol. Eng. 2016, 9, 79–88
8. Song, X.-P.; Potapov, P.V.; Krylov, A.; King, L.; Di Bella, C.M.; Hudson, A.; Khan,
A.; Adusei, B.; Stehman, S.V.; Hansen, M.C. National-scale soybean mapping and
area estimation in the United States using medium resolution satellite imagery and
field survey. Remote Sens. Environ. 2017, 190, 383–395.
9. Wulder, M.A.; Coops, N.C.; Roy, D.P.; White, J.C.; Hermosilla, T. Land cover 2.0.
Int. J. Remote Sens. 2018, 39, 4254–4284.
10. Li, J.; Roy, D.P. A Global Analysis of Sentinel-2A, Sentinel-2B and Landsat-8 Data
Revisit Intervals and Implications for Terrestrial Monitoring. Remote Sens. 2017, 9,
902.
11. Zhang, C.; Marzougui, A.; Sankaran, S. High-resolution satellite imagery applications
in crop phenotyping: an overview. Computers and Electronics in …, 2020. Elsevier.
12. Hu, X., Ren, H., Tansey, K., Zheng, Y., Ghent, D., Liu, X., & Yan, L. (2019).
Agricultural drought monitoring using European Space Agency Sentinel 3A land
surface temperature and normalized difference vegetation index images. Agricultural
and Forest Meteorology, 279, 107707.
13. Vorobiova, N.; Chernov, A. Curve fitting of MODIS NDVI time series in the task of
early crop identification by satellite images. Procedia Engineering, 2017. Elsevier.
14. De Oliveira Pereira, L.; Da Costa Freitas, C.; Anna, S.J.S.S.; Lu, D.; Moran, E.F.
Optical and radar data integration for land use and land cover mapping in the
Brazilian Amazon. GIScience Remote Sens. 2013, 50, 301–321.
15. Zhou, T.; Pan, J.; Zhang, P.; Wei, S.; Han, T. Mapping winter wheat with
multi-temporal SAR and optical images in an urban agricultural region. Sensors
2017, 17, 1210.

16. Kumar, S., & Srinivasan, D. (2019). Crop classification from satellite images using
deep learning. Computers and Electronics in Agriculture, 162, 583-590.
17. Liu, X., Li, Y., Liao, M., Chen, L. C., Yu, L., & Ye, Y. (2020). Crop classification
using multi-source remote sensing data and deep learning techniques. Remote Sensing,
12(7), 1145.
18. Mountrakis, G., Im, J., & Ogole, C. (2011). Support vector machines in remote sensing:
A review. ISPRS Journal of Photogrammetry and Remote Sensing, 66(3), 247-259.
19. Pal, M., & Mather, P. M. (2003). An assessment of the effectiveness of decision tree
methods for land cover classification. Remote Sensing of Environment, 86(4), 554-565.
20. Patel, N. R., & Jaiswal, D. K. (2018). Crop classification using remote sensing and
machine learning techniques: A review. Geocarto International, 33(8), 829-850.

Appendix -1
Fig A1 : Plot of surface reflectance values of various crops
Fig A1 represents the plotted surface reflectance values of various crops at various bands of
the S2A satellite. Fig A2 represents the plotted values of the vegetation indices calculated
from the bands of the S2A satellite.
Fig A2 : Plot of vegetation index values of various crops

Fig A3 : Satellite image of a piece of land near Ambala
We tested the land cover classification model on a randomly selected satellite image obtained
from the ESA website. The image, shown in Fig A3, is of a piece of land near Ambala District
in Haryana. The corresponding mapping is shown in Fig A4, but without ground truth for this
area the result cannot be evaluated quantitatively.
Fig A4 : Corresponding land cover mapping of piece of land near Ambala

Appendix -2
Publication Details
Paper submitted to the International Journal of Innovative Research in Technology (IJIRT)

UGC Approved

