
Republic of Tunisia
Ministry of Higher Education and Scientific Research
University of Tunis

Code: GSP-RS-03-00
Creation date: 16-06-2023

GRADUATION PROJECT REPORT

Department of Computer Engineering

Presented in order to obtain the Engineering Degree in Computer Science

Speciality: GLID

By: Mr. Zioudi Ahmed

Table Extraction and Structure Recognition

Carried out within Axe Finance

Defended on September 29th, 2023, in front of the jury composed of:

President: Mr. KAMMOUNE Slim

Reporter: Mr. BOULARES Mehrez

University supervisor: Mrs. KOUKI Zoulel

Industrial supervisor: Mr. BOUDHHIR Maher

Academic Year 2022-2023

ENSIT, 5 avenue Taha Hussein, Montfleury, 1008 Tunis. Website: http://www.ensit.tn E-mail: direction.stage@ensit.rnu.tn
Tel: (+216) 71 49 68 96 / 71 49 68 80 / 71 39 25 91 Fax: (+216) 71 39 11 66
Dedications

To my dear parents and all my family,

To all my friends,

To everyone who has helped me,

To everyone I love,

Kindly, I dedicate this work to you.


Acknowledgment

At the end of this modest work, I would like to express my deep gratitude to all those who gave me their support and helped me to accomplish this project in the best conditions, especially under exceptional and critical circumstances.

First of all, I would like to express my gratitude to Mr. Ayad Abdelmajid, team lead at Axe Finance, for allowing me to live, within his team, an experience full of interest.

I would like to sincerely thank my technical supervisor, Mr. BOUDHHIR Maher, for having guided me throughout my graduation project and for sharing his broad experience with me. I could not have imagined having a better advisor for this work.

I would also like to present my sincere thanks to Mrs. KOUKI Zoulel, my supervisor at the Higher National Engineering School of Tunis (ENSIT), for her support, permanent help, and precious guidance.

Finally, I address my most devoted thanks to the members of the jury for having honored me by agreeing to evaluate this work, hoping that they will find in it the qualities of clarity and motivation that they expect. From the bottom of my heart: thank you!
General Introduction

Artificial Intelligence (AI) refers to the simulation of human intelligence in machines

or computer systems. AI technologies enable machines to perform tasks that typically

require human intelligence, such as understanding natural language, recognizing patterns,

solving complex problems, and making decisions. AI has a wide range of applications

across various industries and has the potential to transform the way we live and work.

As organizations seek to gain a competitive advantage and respond to changing customer demands,

digital transformation has impacted almost every industry. The banking industry has

also recognized the game-changing effects of innovative technologies such as Artificial

Intelligence. Its capacity to identify and process data across diverse formats has reshaped

conventional workflows. The work described in this report explores the application of an

AI-based table structure recognition model.

This report comprises four chapters, each dedicated to a specific facet of table structure recognition. The first chapter presents our host company and delineates the project scope. The second chapter exposes the popular methodologies used in table recognition. The third chapter deals with data annotation and environment setup, and the final chapter undertakes an examination of the modeling, evaluation, and deployment process.

Chapter 1

Host Company Presentation and


Project Scope

1.1 Introduction

This chapter is devoted to presenting in detail the host organization and its characteristics, the project scope, and an overview of the work methodology.

1.2 Company presentation

Axe Finance, founded in 2004, is a global software vendor specializing in loan automation

for financial institutions (including traditional and Islamic banking) with over 20,000 users

in 20 countries seeking a competitive advantage in efficiency and customer service across all

client segments: commercial, retail, corporate, and so on. Axe Finance has offices in Tunis,

Amsterdam, Abu Dhabi, and Mumbai, among other places. The graph below depicts some

key figures from Axe Finance’s activity.


Figure 1.1: Key numbers of the activity of Axe Finance

Axe Finance employs about 200 individuals across four departments, each of which is

organized into multiple sub-departments, as illustrated in the organizational chart (Figure

1.2):

Figure 1.2: Axe Finance Organizational chart

1.2.1 Main Services

Axe Credit Portal is an end-to-end, comprehensive credit automation system from Axe Finance, offered as a locally hosted solution. Société Générale, Al Rajhi Bank, Banque


Internationale de Luxembourg, and First Abu Dhabi Bank are among Axe Finance's trusted

global financial partners.

Axe Limit Management handles multi-level facility structures of various types throughout all of the bank's divisions, including Corporate & Commercial Lending, Treasury, Trade Finance, Specialized Lending, and others; it can also be delivered as software accessed online via a subscription. The wholesale clients served include large and medium-sized businesses, retail investment banks, public sector groups (PSGs), SMEs, non-bank financial institutions (NBFIs), and high-net-worth individuals (HNIs).

Axe Collateral Management is a service that proactively monitors the institution's whole portfolio of collateral by sending out early alerts and notifications about coverage shortfalls, deferment expiration, documentation renewals, margin calls (equities), product disposals, and other events. The complexity of collateral management is efficiently managed by a number of advanced business rules and adaptable processes across the financial institution's operations and departments, including remote locations and international subsidiaries.

Axe Retail Lending benefits Credit Risk Analysts, Relationship Managers, Risk Managers, Credit Administrators, Collateral and Collection Officers, Legal Officers, Sustainability and Environmental Offices, and Portfolio Teams. It uses a single cooperative solution to streamline their tasks and build on their input, running all automated processes on a single platform of risk and credit data.

Axe Collection and Provisioning is a system aimed at improving recovery methods by increasing efficiency throughout the recovery process, ensuring consistent data flow and streamlining processes. It is a low-cost, end-to-end solution that provides a high return on investment in the short, medium, and long term. The solution collects all of the information needed to compute expected losses and individual or group provisions for good, bad, and damaged assets.


1.2.2 Customers

The most well-known consumers and those who trust the company’s services are depicted

in the diagram below:


Figure 1.3: Customers

1.3 Project Scope

The project is titled "Table Extraction and Structure Recognition." Its core objective is the examination of existing table recognition approaches, followed by the selection and implementation of a model designed to extract the structure of tables from images.


1.3.1 Problem Definition

The problem involves the detection, recognition, and precise retention of table contents,

along with maintaining the original layout and structure. This aims to enable the storage of

tabular data in editable formats like Docx, Excel, and more. This problem holds significant

relevance in digitizing documents, facilitating editing, and enabling efficient document

search and retrieval processes.

Typical sub-problems of table data extraction:

• Table Detection (TD): determines the position of the table in the image.

• Table Structure Recognition (TSR): identifies, reconstructs, and stores the relative position information of the cells in a table.

• Table Recognition (TR): similar to TSR, but also includes reading information, recognizing the characters in the table, and mapping them accurately to each cell.

1.3.2 Basic Definitions

• Table: the tabular region as a whole.

• Row, Column: a row or column of the table.

• Grids: the smallest units representing coordinates in a table; each grid belongs to exactly one row and one column.

• Cells: larger than a grid; a cell can include multiple sub-grids (it is then called a span-cell) or only one grid (a single-cell). A cell contains location information and the text content inside it.

• Single-cells: cells corresponding to exactly one grid.

• Span-cells: cells corresponding to many grids, i.e. stretched over several rows and/or columns, covering multiple sub-rows or sub-columns.

• Relative Position: the relative position of the cells/grids in the table, represented by indices 0, 1, 2, 3, ... with the origin at the top-left, including four parameters (start-row, end-row, start-column, end-column).
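These definitions can be made concrete with a small sketch (the class and method names here are our own illustration, not part of any model's API):

```python
from dataclasses import dataclass

@dataclass
class Cell:
    """A table cell located by its grid span (0-indexed, inclusive)."""
    start_row: int
    end_row: int
    start_col: int
    end_col: int
    text: str = ""

    def is_single_cell(self) -> bool:
        # A single-cell covers exactly one grid.
        return self.start_row == self.end_row and self.start_col == self.end_col

    def grid_count(self) -> int:
        # Number of grids covered; more than 1 means a span-cell.
        return (self.end_row - self.start_row + 1) * (self.end_col - self.start_col + 1)

# A header cell stretched over two columns is a span-cell:
header = Cell(start_row=0, end_row=0, start_col=1, end_col=2, text="D1")
assert not header.is_single_cell() and header.grid_count() == 2
```

This representation is also the one used later when describing relative positions and span-cell detection.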

1.4 Methodology of the project

To better organize the structure and steps of our project, we used the CRISP-DM method, which is an agile and iterative process [7]. Let us first clarify what an agile method is: it is a strategy that advocates setting short-term objectives. The project is therefore divided into several sub-projects; once an objective is reached, we move on to the next one until the final objective is reached. This method is more flexible: as it is impossible to predict and anticipate everything, it allows for the acceptance of unexpected events and changes.

CRISP-DM (Cross-Industry Standard Process for Data Mining) is, as the name implies, an open standard process framework specifically for planning data mining projects. It is important to note that the process is highly non-linear, and moving back and forth between steps is the norm rather than the exception. It is divided into six major steps:

• Business Understanding: this phase is concerned with determining what problems the company wishes to solve. According to data science project management, the most important tasks in this phase are to determine the business question and goal, and to make a detailed plan for each project phase, including the tools we will use.

• Data Understanding: after comprehending the business perspective, it is time to determine which data is available to us and concentrate on understanding it in order to solve the business problem. In other words, we demonstrate everything we know about the data and how it relates to the business question. This phase may include the following steps: collecting the data, describing it at a glance, and exploring it.



• Data Preparation: once the data and the business problem are clear and understood, it is time to prepare the collected data for modeling. In this step, we select the data that will be used for analysis, then clean and pre-process it. Indeed, this is the key to a great modeling process that sets a data science project apart.

• Modeling: in this phase, we choose and apply various modeling techniques and calibrate their parameters to optimal values. We develop our ML or DL model by experimenting with many models before settling on a specific algorithm; we then design our modeling test setup by dividing the data into training, testing, and validation sets; and finally we define the technical success measures and select the best viable model(s) to solve the business question.
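The train/test/validation split mentioned in the Modeling step can be sketched as follows (a generic standard-library sketch; the 70/15/15 ratios are our illustrative choice, not a figure from the project):

```python
import random

def split_dataset(samples, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle the samples reproducibly and split them into
    train / validation / test subsets."""
    items = list(samples)
    random.Random(seed).shuffle(items)          # fixed seed -> reproducible split
    n_train = int(len(items) * train_frac)
    n_val = int(len(items) * val_frac)
    return (items[:n_train],                    # training set
            items[n_train:n_train + n_val],     # validation set
            items[n_train + n_val:])            # test set (the remainder)

# 230 annotated images, as in our curated dataset:
train_set, val_set, test_set = split_dataset(range(230))
assert len(train_set) + len(val_set) + len(test_set) == 230
assert set(train_set).isdisjoint(val_set)       # no sample appears in two subsets
```

In practice the same split would be applied to image/annotation file lists rather than integers.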

• Evaluation: Before proceeding with the model’s final deployment, it is critical to

further evaluate the model and review the steps taken to build the model to ensure

that it is meeting the business objectives. A key goal is to identify any significant

business issues that have not been adequately addressed. A decision on how to use

the data mining results should be made at the end of this phase.

• Deployment: This is the last step in the procedure. It entails putting the obtained

models into production for the final users. Its goal is to present the results in an

appropriate format and incorporate them into the decision-making process.


Figure 1.4: CRISP-DM Process Diagram

1.5 Conclusion

To conclude, this chapter laid the foundation for our work by introducing the host company and delineating the project scope. The company presentation provided a comprehensive overview of our collaborating entity, offering insight into its objectives, expertise, and industry standing. Simultaneously, the project scope demarcated the boundaries and objectives of our endeavor.

Chapter 2

Current Methods for Table


Recognition

2.1 Introduction

Within this chapter, we embark on a comprehensive exploration of contemporary

methodologies employed for table structure recognition. Our aim is to provide an insightful

analysis of these existing techniques, shedding light on their strengths, limitations, and

practical implications. By juxtaposing these methods, we strive to offer a holistic perspective

that aids in understanding the evolving landscape of table structure recognition. This

exploration equips us with a nuanced understanding, enabling informed decision-making

and laying the groundwork for the model choice.

2.2 LGPMA

This method is built on Mask-RCNN, an instance-segmentation technique that performs pixel-level segmentation on detected objects; when applied for table recognition purposes, it outputs a bounding box and a mask segment. The model consists of four main modules: Aligned Bounding-box Detection, LPMA (Local Pyramid Mask Alignment), GPMA (Global Pyramid Mask Alignment), and Aligned Bounding-box Refinement [5].

Figure 2.1: LGPMA model

• The Aligned Bounding-box Detection module uses features extracted with Region of Interest Align (RoI-Align), a region of interest being an image region where a potential object might be located. It includes two output branches, bounding-box classification and bounding-box regression, similar to Mask-RCNN, used to identify non-empty cells. However, empty cells cannot be easily recognized; therefore the authors of LGPMA propose the following workflow for aligning and refining these empty cells.

• LPMA is applied at the scale of each single cell and consists of two sub-branches. The first performs binary segmentation to identify text regions. The second performs a local pyramid mask regression task: it creates masks with a descending gradient from the center of the text, defined for both a vertical mask and a horizontal mask. In figure 2.2, (a) shows the original aligned bounding box (blue) and the text region box (red); (b) shows the pyramid mask labels in the horizontal and vertical directions, respectively:

Figure 2.2: LPMA subbranches
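The descending-gradient idea can be illustrated in one dimension (a toy sketch for intuition only; the paper's exact pyramid label definition differs in detail):

```python
def pyramid_mask_1d(width, center):
    """Toy horizontal pyramid mask: value 1.0 at the text center,
    decreasing linearly towards the borders of the box."""
    half = max(center, width - 1 - center)      # distance to the farthest border
    return [1.0 - abs(x - center) / half for x in range(width)]

mask = pyramid_mask_1d(width=5, center=2)
# Peak at the center, symmetric linear descent towards both edges:
assert mask == [0.0, 0.5, 1.0, 0.5, 0.0]
```

The vertical mask is defined analogously along the other axis, so each pixel carries a soft cue of how far it is from the text center.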

• GPMA is also a segmentation module, with two sub-modules: global segmentation and global pyramid mask regression. The first simply performs a binary segmentation to identify aligned non-empty cells and empty cells. The second identifies the entire set of non-empty cells with only two outputs, a global horizontal pyramid mask and a global vertical pyramid mask.

• The last module is Aligned Bounding-box Refinement, which refines the cell boundaries. It is based on merging the local masks and global masks of the previously presented modules to create a voting area. The process starts with cell matching to identify cells of the same row and same column, then identifies the empty cells and merges them together if needed.

Figure 2.3 visualizes an example that is successfully refined: (a) shows the aligned bounding boxes before refinement; (b) gives the LPMA output (horizontal); (c) displays the GPMA output (horizontal); (d) presents the global binary segmentation; (e) unveils the final result after refinement and empty-cell merging.

Figure 2.3: Example of LGPMA application


2.3 Split-Embed-Merge

This model uses a divide-and-conquer approach: the authors divide the model into three smaller submodules, respectively Split, Embed, and Merge, as shown in figure 2.4.

Figure 2.4: Split-Embed-Merge model

• Split-model: defines the row/column separation areas. It takes a segmentation approach; the output is two masks, one for the row separators and one for the column separators.

Figure 2.5: Split model
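The separator masks produced by the Split-model can be collapsed into grid coordinates roughly as follows (an illustrative post-processing sketch, not the authors' implementation):

```python
def mask_to_boundaries(separator_mask):
    """Collapse a 1-D binary separator mask (1 = separator pixel)
    into boundary positions: the midpoint of each run of 1s."""
    boundaries, run_start = [], None
    for i, v in enumerate(separator_mask + [0]):   # sentinel 0 closes a final run
        if v and run_start is None:
            run_start = i                          # a separator run begins
        elif not v and run_start is not None:
            boundaries.append((run_start + i - 1) // 2)  # midpoint of the run
            run_start = None
    return boundaries

# Two separator runs -> two row boundaries, i.e. a three-row grid:
rows = mask_to_boundaries([0, 0, 1, 1, 0, 0, 0, 1, 0, 0])
assert rows == [2, 7]
```

Applying the same function to the column mask yields the column boundaries, and together they define the grid passed to the Embed-model.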

• Embed-model: with the grid specified by the Split-model, uses RoI-Align to extract the features of each grid area of the image (the Vision Module). At the same time, Embed uses a Text Module with BERT as a feature extractor to extract additional text features. The two modules are then combined to form the input of the third submodule.

• Merge-model: same concept as Deep-Split-Merge. However, this model uses a GRU with attention: at each timestep it outputs one merged-map of dimension MxN (MxN being the grid size from the Split-model), indicating which grids need to be merged together at this timestep (denoted 1, and 0 otherwise). From there, the table is restructured similarly to Deep-Split-Merge.

Figure 2.6: Merge model
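One timestep's merged-map can be interpreted as follows (a simplified sketch of the idea, with the map given as nested lists of 0/1):

```python
def merged_map_to_cell(merged_map):
    """Collect all grids flagged 1 in an MxN merged-map and return the
    spanning cell they form, as (start_row, end_row, start_col, end_col)."""
    coords = [(r, c) for r, row in enumerate(merged_map)
                     for c, v in enumerate(row) if v]
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), max(rows), min(cols), max(cols))

# Grids (0,1) and (0,2) are merged into one cell spanning two columns:
span = merged_map_to_cell([[0, 1, 1],
                           [0, 0, 0]])
assert span == (0, 0, 1, 2)
```

Repeating this over all timesteps recovers the span-cells, after which the table can be restructured from the grid.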

2.4 GraphTSR

The modeling part is designed in graph form and modeled with a Graph Neural Network (GNN). GNNs are a type of neural network architecture designed to work with graph-structured data: a graph has nodes (vertices) and edges connecting them. GNNs process and analyze data in this graph format, making them suitable for tasks that involve relationships between entities.

Figure 2.7 presents an overview of the method: (a) Preprocessing: obtaining the cell contents and their corresponding bounding boxes from the image; (b) Graph construction: building an undirected graph on these cells; (c) Relation prediction: predicting adjacent relations with the proposed GraphTSR; (d) Post-processing: recovering the table structure from the labeled graph.

Figure 2.7: Overview of the method

Using a Graph Neural Network (GNN), the graph's nodes represent the text-containing bounding boxes. For instance, text labels like "Method," "D1," "D2," "P," "R," and "F1" correspond to nodes in the graph. Edges represent relationships between nodes, visualized as connecting lines.

In the study, the Table Structure Recognition (TSR) issue is approached as an edge-classification challenge: the goal is to categorize edge relationships. Given existing vertices and edges, each edge is assigned one of three labels:

• Label "0": No significant relationship between connected nodes.

• Label "1": Horizontal relationship, like a green line, indicating a side-by-side connec-

tion.

• Label "2": Longitudinal relationship, akin to a red line, implying a vertical or

hierarchical link.

By defining the relationships between each pair of cells (0/1/2) in this way, we can identify the span-cells and perform table restructuring as shown above: the row-span cell "Method" covers 2 sub-rows, and the 2 column-span cells "D1" and "D2" each cover 3 subcolumns (P, R, F1).
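The three edge labels can be illustrated with a toy rule over cell spans (our own sketch for intuition; GraphTSR predicts these labels with a GNN rather than a hand-written rule):

```python
def edge_label(a, b):
    """Label the edge between two cells, each given as a
    (start_row, end_row, start_col, end_col) grid span.
    1 = horizontal relationship, 2 = vertical, 0 = neither."""
    rows_overlap = not (a[1] < b[0] or b[1] < a[0])   # row intervals intersect
    cols_overlap = not (a[3] < b[2] or b[3] < a[2])   # column intervals intersect
    if rows_overlap and not cols_overlap:
        return 1          # side-by-side cells in the same row band
    if cols_overlap and not rows_overlap:
        return 2          # stacked cells in the same column band
    return 0              # no significant relationship

method = (0, 1, 0, 0)     # "Method": a cell spanning two rows
d1 = (0, 0, 1, 2)         # "D1": a cell spanning two columns
p = (1, 1, 1, 1)          # "P": a single-cell under D1
assert edge_label(method, d1) == 1   # horizontal, like the green line
assert edge_label(d1, p) == 2        # vertical, like the red line
```

Once every edge is labeled, span-cells follow from which row/column bands each cell shares with its neighbors.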

2.5 Comparison of methods

2.5.1 TEDS Metric (ICDAR 2021 Competition)

TEDS (Tree-Edit-Distance-based Similarity) is a metric employed to quantify the simi-

larity between two tree structures. It measures the minimum number of edit operations

required to transform one tree into another. These edit operations typically include insert-

ing, deleting, or substituting nodes in one tree to make it match the structure of the other

tree.

When calculating similarity using Tree-Edit-Distance-based Similarity (TEDS), the idea

is to use the tree edit distance as a basis to determine how similar or dissimilar two tree

structures are. Smaller tree edit distances indicate higher similarity, while larger distances

indicate greater dissimilarity.

The process generally involves the following steps:

• Tree Representation: Convert the input trees or hierarchical structures into appro-

priate representations for tree edit distance calculations. This representation often

includes nodes, edges, labels, and their relationships.

• Tree Edit Distance Calculation: Calculate the tree edit distance between the two tree

structures. This involves finding the minimum sequence of edit operations (insertions,

deletions, substitutions) needed to transform one tree into the other.

• Similarity Calculation: Convert the tree edit distance into a similarity score. This

can be done by inversely scaling the distance or applying a transformation function

to map it to a similarity scale (e.g., 0 to 1).

• Interpretation: The resulting similarity score provides information about how similar

the two trees are. Higher similarity scores indicate greater structural resemblance,

while lower scores indicate more structural differences.

LGPMA, GraphTSR, and Split-Embed-Merge were put to the test in the ICDAR 2021 Competition and were evaluated using the TEDS metric with the following settings:

• Cost of Operations: The cost of insertion and deletion operations is set to 1.

• Substitution Cost:

– When a substitution operation is performed, if either of the nodes being substituted (replaced) is not "td" (table data), the cost is 1.

– If both nodes are "td," the substitution cost depends on whether the column span or row span of the nodes differs. If they differ, the cost is 1; if the column span and row span of both nodes are the same, the substitution cost is calculated using the normalized Levenshtein similarity between the contents of the nodes (ranging from 0 to 1).

• TEDS Calculation: The TEDS similarity between two trees (or tables) is calculated

using the formula:

TEDS(Ta, Tb) = 1 - EditDist(Ta, Tb) / max(|Ta|, |Tb|)

where EditDist is the tree-edit distance between Ta and Tb, and |Ta| and |Tb| represent the number of nodes in the trees.

• Table Recognition Performance: The performance of a method on a set of test samples

is defined as the mean TEDS score between the recognition result produced by the

method and the corresponding ground truth for each sample.
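Given the tree-edit distance, the TEDS score and the per-method performance are simple to compute (a sketch; computing EditDist itself requires a tree-edit-distance algorithm, which we do not reproduce here):

```python
def teds(edit_dist, n_a, n_b):
    """TEDS(Ta, Tb) = 1 - EditDist(Ta, Tb) / max(|Ta|, |Tb|)."""
    return 1.0 - edit_dist / max(n_a, n_b)

def mean_teds(scores):
    """Performance on a test set: the mean TEDS over all samples."""
    return sum(scores) / len(scores)

# A predicted tree that is 5 edits away from a 20-node ground truth:
assert teds(edit_dist=5, n_a=18, n_b=20) == 0.75
# Method performance is then the mean over all test samples:
assert abs(mean_teds([1.0, 0.75, 0.95]) - 0.9) < 1e-9
```

Identical trees give a score of 1, and the score decreases towards 0 as more edits are needed.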

The results are shown in table 2.1:


Method               TEDS Simple   TEDS Complex   TEDS All

LGPMA                97.88         94.78          96.36
Split-Embed-Merge    97.60         94.89          96.27
GraphTSR             97.18         92.40          94.84

Table 2.1: TEDS results

The table displays the results of the different methods applied to three datasets: TEDS Simple (tables with a simple structure), TEDS Complex (tables with a complex structure), and TEDS All (the combination of the two previous datasets). Here is an interpretation of the results:

• TEDS Simple:

– LGPMA: achieved the highest performance on the TEDS Simple dataset, with a score of 97.88%.

– Split-Embed-Merge: also performed well on the TEDS Simple dataset, with a score of 97.60%.

– GraphTSR: had a slightly lower performance score of 97.18% on the TEDS Simple dataset.

• TEDS Complex:

– LGPMA: scored 94.78% on the TEDS Complex dataset, slightly below Split-Embed-Merge.

– Split-Embed-Merge: achieved the best performance score on the TEDS Complex dataset, 94.89%.

– GraphTSR: had a lower performance score of 92.40% on the TEDS Complex dataset.

• TEDS All:

– LGPMA: LGPMA maintained its lead on the TEDS all dataset with a score

of 96.36%.

– Split-Embed-Merge: Split-Embed-Merge achieved a performance score of

96.27% on the TEDS all dataset.

– GraphTSR: GraphTSR had a lower performance score of 94.84% on the TEDS

all dataset.

In summary, based on these results:

• The LGPMA method consistently performed well across all three datasets, making it a strong candidate for table structure recognition tasks. The Split-Embed-Merge method also demonstrated strong performance across the datasets, with results close to those of LGPMA.

• The GraphTSR method, while still achieving reasonable performance, had slightly lower scores compared to the other methods, particularly on the more complex TEDS Complex and TEDS All datasets.

2.5.2 Results and Interpretation

In the table presented below, we delineate the strengths and limitations of the examined models:


LGPMA
  Advantages: empty cells can be located easily; very good at detecting spanning cells; weights are available for a model pretrained on 750,000 tabular images; holds first place in the ICDAR 2021 Competition.
  Limitations: the model is built for distributed training; it has a substantial GPU resource demand.

Split-Embed-Merge
  Advantages: has the highest TEDS score for complex structures; simple to implement; top 3 in the ICDAR 2021 Competition.
  Limitations: a significant dataset is required to properly train the model; substantial GPU resource demand.

GraphTSR
  Advantages: excels at detecting spanning cells; leverages the context of neighboring nodes, allowing it to capture local patterns and propagate information across the graph, leading to improved predictions and insights.
  Limitations: graph models can become computationally intensive and memory-hungry as the size of the graph increases; can suffer from overfitting if the dataset is too small.

Table 2.2: Current methods: advantages and limitations

The table presents an evaluation of three different methods for table structure recognition: LGPMA, Split-Embed-Merge, and GraphTSR. These methods are assessed based on their respective advantages and limitations, providing insights into their practical applicability [1].

LGPMA:

• Advantages: LGPMA excels in locating empty cells within tabular structures, a crucial task in table recognition. It demonstrates strong performance in detecting spanning cells, which are common in complex tables. The availability of pretrained model weights, trained on a substantial dataset of 750,000 tabular images, enhances its recognition capabilities. LGPMA's first place in the ICDAR 2021 Competition highlights its effectiveness.

• Limitations: The method is designed for distributed training, which places substantial demands on GPU resources, making it resource-intensive.

Split-Embed-Merge:

• Advantages: Split-Embed-Merge achieves the highest TEDS score for complex table structures, indicating its suitability for intricate layouts. It is relatively simple to implement, making it accessible to a wide range of users. The method ranked among the top three in the ICDAR 2021 Competition.

• Limitations: Proper training of Split-Embed-Merge requires a significant dataset, which may pose challenges in data-constrained scenarios. Similar to LGPMA, it exhibits considerable GPU resource demands.

GraphTSR:

• Advantages: GraphTSR is effective in detecting spanning cells, a valuable feature in tabular recognition. It leverages graph-based models to capture local patterns and propagate information across the graph, contributing to improved predictions and insights [9].

• Limitations: Graph-based models like GraphTSR can become computationally intensive and memory-hungry as the graph size increases, potentially limiting scalability. The risk of overfitting exists, particularly if the dataset used for training is small.

In summary, LGPMA stands out in terms of empty and spanning cell detection. Split-Embed-Merge performs well on complex structures but requires substantial data and GPU resources. GraphTSR leverages graph models for improved predictions but may face scalability and overfitting challenges. These insights assisted us in selecting LGPMA as the most suitable method for our task.

2.6 Conclusion

Following a meticulous analysis of the various methodologies discussed in this chapter,

a clear path has emerged. Our rigorous investigation has led us to the resolute decision of

adopting the LGPMA model for our table structure recognition task. The comprehensive

evaluation of existing techniques has affirmed the suitability of this model for our specific

objectives.

Chapter 3

Data Proficiency and Environment


Setup: Navigating Understanding,
Preparation, and Configuration

3.1 Introduction

This chapter is dedicated to introducing our image dataset and aims to provide a concise overview of the data through exploratory insights and visualizations. Additionally, we will delve into the process of data annotation, a crucial step to adapt the data into the format required by our LGPMA model.

Furthermore, we will present the configuration of the environment, detailing the setup

that enables our project’s smooth execution.


3.2 Data Understanding and Preparation

3.2.1 Image Dataset

Data exploration, or Exploratory Data Analysis (EDA), is a critical initial phase when working with any form of data, including the data processed in computer vision tasks; it involves a systematic and in-depth examination of the dataset. Our dataset comprises 250 tabular images featuring a diverse array of tables, each possessing a distinct structure. These images are sourced directly from authentic employee documents, providing a genuine representation of the data encountered in real-world scenarios. However, during the curation process, the dataset was refined to 230 images. This reduction was necessitated by the exclusion of certain images deemed unsuitable for model training: they exhibited notable noise and irregularities, such as dotted table lines or unclear words, making them less conducive to effective learning. This careful curation ensures that the dataset maintains a higher quality and relevance, thereby enhancing the reliability and performance of the model during training and subsequent applications. Figure 3.1 shows examples from our image dataset:


Figure 3.1: Images from Dataset


3.2.2 Data annotation

Data annotation, also referred to as data labeling, tagging, or classification, involves the

essential task of assigning pertinent labels (such as tags, annotations, or classes) to individual

data samples. This procedure holds significant influence over a model’s performance. In the

context of our project, image annotation was meticulously undertaken using LabelImage

( image annotation tool set in an anaconda environment ). This meticulous annotation

process serves as the foundation for generating the training dataset, enabling supervised AI

models to acquire knowledge. The manually annotated images establish a baseline dataset

crucial for training the LGPMA model effectively. This preparatory step plays a pivotal

role in facilitating the model’s ability to comprehend and make predictions accurately. The figure below shows the annotation process.

Figure 3.2: Annotation process

The annotation process is a fundamental step in preparing our dataset for effective model

training. In this process, each image is labeled using bounding boxes, where each bounding box represents an individual cell within the table, which is made possible with the help of the annotation tool.


Figure 3.3: Annotation tool

An image annotation tool is a software application or platform specifically designed for

adding labels, shapes, text, or other metadata to images. These tools are commonly used

in computer vision, machine learning, and data annotation tasks. Image annotation tools

help in creating labeled datasets for training and testing machine learning models. In our case, we chose LabelImg, an open-source graphical image annotation tool that allows us to draw bounding boxes around objects in images. It is commonly used for object detection tasks and supports both the PASCAL VOC and YOLO formats.

The figure 3.4 represents the output of the annotation tool:


Figure 3.4: XML file generated by the labelImg tool

This approach allows us to accurately map the tabular structure present in the images.

The resulting annotations are compiled into XML files, which encapsulate essential infor-

mation including the height, width, and coordinates of each bounding box. This structured

XML representation forms the cornerstone of our annotated dataset. Subsequently, these XML annotations are transformed into the format required by the model. Figure 3.5 shows the structure of the JSON file used as input for the model:


Figure 3.5: Format required for LGPMA model

Description of Data Parameters:

• filename: This parameter corresponds to the name of the sample image.

• height: Specifies the height of the image.

• width: Denotes the width of the image.

• content-ann: This dictionary encapsulates three critical aspects of table information:

– bboxes: A list of coordinates outlining the text area within a cell. The format employed is [x1, y1, x2, y2], representing the upper-left and lower-right coordinates.


– cells: A list indicating the row and column information for each cell. The format

is [Start Row Index, Start Column Index, End Row Index, End Column Index].

– labels: A list of labels for each cell, with values of 0 indicating a header cell and

1 representing a non-header cell.

It is important to pay attention to the following nuances in the above data format:

• For cells with no text area, the bboxes list should be represented as an empty list.

• The row and column indexes in the cells parameter begin from 0.

• It is imperative to maintain the order of the three lists (bboxes, cells, and labels)

precisely, ensuring that they correspond one-to-one without any mismatch.
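As a concrete illustration of this conversion step, the following Python sketch parses a LabelImg (PASCAL VOC) XML file and assembles a record following the parameter description above. The helper name `voc_to_lgpma`, the sample XML, and the placeholder row/column spans and labels are our own illustrative assumptions; real `cells` and `labels` values require annotation beyond what LabelImg stores.

```python
import xml.etree.ElementTree as ET

# A minimal LabelImg (PASCAL VOC) file, inlined here for illustration.
VOC_XML = """<annotation>
  <filename>table_001.png</filename>
  <size><width>800</width><height>600</height></size>
  <object><name>cell</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>"""

def voc_to_lgpma(xml_text):
    """Convert one LabelImg (PASCAL VOC) annotation into an LGPMA-style record.

    Row/column indices and header labels are placeholders: LabelImg only
    stores bounding boxes, so the cells/labels lists must be completed
    from additional annotation before training."""
    root = ET.fromstring(xml_text)
    bboxes = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        # [x1, y1, x2, y2]: upper-left and lower-right corners of the text area
        bboxes.append([int(bb.find(t).text)
                       for t in ("xmin", "ymin", "xmax", "ymax")])
    return {
        "filename": root.findtext("filename"),
        "height": int(root.findtext("size/height")),
        "width": int(root.findtext("size/width")),
        "content_ann": {
            "bboxes": bboxes,
            "cells": [[0, 0, 0, 0] for _ in bboxes],  # placeholder row/col spans
            "labels": [1 for _ in bboxes],            # 1 = non-header cell
        },
    }
```

Note that the three lists are built in the same loop order, which keeps the one-to-one correspondence between bboxes, cells, and labels required by the format.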

3.3 Environment Configuration

In response to the considerable resource demands imposed by the model, we made

the strategic decision to transition to a Linux machine provided by a hosting company,

equipped with hardware capabilities that align with our computational requirements. This

new setup boasts 32GB of RAM, coupled with a powerful GP102 GPU, specifically the GeForce GTX 1080 Ti, featuring 11GB of dedicated graphics memory. This shift ensures

that our computational infrastructure is robust enough to handle the complexities and

demands of our model, enabling us to achieve optimal performance and efficiency.


Figure 3.6: Environment Info log

3.3.1 Software

In this section, we will discuss the different components of the software environment

and explain the underlying reasons for their use.

• Python: Python is a popular high-level, general-purpose programming language. It was created in 1991 by Guido van Rossum and is developed by the Python Software Foundation.[3] It is widely used in the scientific and research communities; many data scientists rank it as their preferred programming language because it provides a large collection of open-source libraries that help to easily solve complex business problems, build robust systems, and, in particular, build applications such as table structure recognition.

• Visual Studio Code: A powerful code editor that runs on any operating system (Windows, Linux, macOS). It comes with built-in support for JavaScript, TypeScript, and Node.js and has a rich ecosystem of extensions for other languages (such as C++, C, Java, Python, etc.) and runtimes.

• Google Colaboratory, or Colab, is a free Google tool for developing data science projects. It is a free Jupyter notebook environment that runs on Google’s cloud servers, allowing the user to leverage backend hardware like GPUs and TPUs.

• Jupyter is an open-source project that provides a web-based interactive computing

environment. It’s particularly popular in the field of data science and scientific

computing. The name "Jupyter" is a combination of three core programming languages

it initially supported: Julia, Python, and R.

The key component of Jupyter is the Jupyter Notebook, which allows users to create

and share documents that contain live code, equations, visualizations, and narrative

text. These notebooks are a versatile tool for data analysis, scientific research,

machine learning, and more. Users can write and execute code in a notebook, see the

results immediately, and document their work in a coherent and interactive manner.

Jupyter supports a wide range of programming languages beyond the original three,

thanks to "kernels," which are language-specific computational engines that execute

the code within the notebooks. This flexibility makes Jupyter an invaluable tool

for researchers, data scientists, and educators working with various programming

languages and data analysis tasks.

• PyTorch is an open-source deep learning framework developed by Facebook’s AI

Research lab (FAIR). It provides a flexible and dynamic computational graph, making

it widely adopted in both research and production for various machine learning tasks.

Main Features:

– Dynamic Computation Graph: PyTorch uses a dynamic computational graph,



which allows for more flexible and intuitive model design and debugging.

– GPU Support: It has native support for GPUs, making it efficient for training

deep neural networks on GPU hardware.

– Rich Ecosystem: PyTorch has a rich ecosystem of libraries and tools, including torchvision for computer vision tasks and PyTorch Lightning for streamlining the training process.

3.3.2 Libraries

In this section, we list the packages incorporated in our project:

• OpenCV: OpenCV is a computer vision library originally developed by Intel, specialized in image processing. It provides more than 2,500 computer vision algorithms that can be used to process images. These algorithms are primarily based on complex mathematical operations on matrices, since an image is represented as a matrix of pixels.

• Spacy: spaCy is a Python-based free and open-source natural language processing

(NLP) library with many built-in features such as NER, POS tagging, dependency

parsing, sentence segmentation, text classification, lemmatization, morphological

analysis, entity linking, and more. Because of its cutting-edge speed and rigorously

tested accuracy, it is becoming increasingly popular for NLP processing and analysis.

• TensorFlow: TensorFlow is an open-source library for dataflow programming across a range of tasks. It is a symbolic math library, also used for machine learning applications such as neural networks. Created by the Google Brain team, it is a toolbox for solving extremely complex mathematical problems with ease. [8]

• MMCV (Multimedia Common Vision): MMCV is an open-source deep learning

library primarily focused on computer vision tasks. It is developed and maintained

by the Multimedia Laboratory at the Chinese University of Hong Kong.



– MMCV offers a wide range of pre-processing, data augmentation, model architecture, and evaluation tools specifically designed for computer vision tasks.

– Integration with Other Libraries: MMCV is often used in conjunction with other popular libraries such as PyTorch and MMDetection.

– Use Cases: MMCV is commonly used in computer vision research and applica-

tions, such as image classification, object detection, instance segmentation, and

more.

• MMDetection: is an open-source deep learning framework specifically designed for

object detection, instance segmentation, and other related computer vision tasks.

It is built on top of the PyTorch deep learning framework and is developed and

maintained by the Multimedia Laboratory at the Chinese University of Hong Kong.[4]

– Modular Design: MMDetection follows a modular design philosophy, making

it highly customizable and adaptable to different computer vision tasks. Users

can easily configure and extend the framework to suit their specific needs.[2]

– Wide Range of Models: The framework provides a collection of state-of-

the-art object detection and instance segmentation models. These models are

pre-implemented and can be easily used for various tasks, including Faster

R-CNN, Mask R-CNN, RetinaNet, and many others.

– Efficient Training and Evaluation: MMDetection includes efficient training

and evaluation pipelines, making it straightforward to train models on custom

datasets and evaluate their performance using standard metrics.

– Rich Set of Features: It offers a rich set of features such as multi-scale

training, data augmentation, anchor generation, and more, which are crucial for

achieving high performance in object detection and related tasks.

– Integration with MMCV: MMDetection is closely integrated with the MMCV

library (Multimedia Common Vision), which provides additional computer vision



utilities and tools for data pre-processing, visualization, and evaluation.

– Community and Development: MMDetection has gained popularity in the

computer vision community and is actively developed and maintained by a

community of researchers and engineers. It benefits from contributions from the

open-source community, ensuring its continued improvement and enhancement.

3.4 Conclusion

In conclusion, this chapter encompassed the exploration and transformation of our dataset, alongside a detailed discussion of the environment configuration.

Chapter 4

Modeling, Evaluation, and Deployment

4.1 Introduction

In this chapter, we will address the principles of modeling, evaluation, and deployment

in the field of data science, providing a scientific analysis of these essential processes in our

project.

4.2 Modeling

4.2.1 Parameter Configuration

Model-related configuration parameters are specified as dictionaries in the /demo/table_recognition/lgpma/config/lgpma_base.py and lgpma_pub.py files.

1. The lgpma_base.py file configures the model training parameters: backbone, neck, optimizer, and batch size, as shown in Figure 4.1.


Figure 4.1: Model training parameters: backbone, neck, optimizer, batch size

• batch size: the number of samples processed each time the model is updated.

• backbone: the set of parameters for the network that performs feature extraction on the input data and, through the feature pyramid network, transforms it into a representation compatible with the RoI-Align input.

• optimizer: its purpose is to adjust the model weights so as to minimize the loss function.

• neck: the set of parameters for the feature pyramid component, including the channel counts for the input and output feature maps. In our case, we chose to process grayscale images, which use only a single (black) channel.
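To make the bullet points above concrete, here is a hypothetical fragment in the style of an mmdetection config dictionary. The exact keys and values belong to the real lgpma_base.py in the DavarOCR demo; the numbers below are placeholders, not our actual settings.

```python
# Illustrative mmdetection-style config fragment (placeholder values,
# not the actual contents of lgpma_base.py).
model = dict(
    backbone=dict(type="ResNet", depth=50),   # feature extractor
    neck=dict(                                # feature pyramid network
        type="FPN",
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
    ),
)
data = dict(samples_per_gpu=2)                # batch size per GPU
optimizer = dict(type="SGD", lr=0.01, momentum=0.9, weight_decay=0.0001)
runner = dict(type="EpochBasedRunner", max_epochs=25)
```

In mmdetection, such config files are plain Python modules of nested dictionaries, which is why each parameter group above is a `dict`.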

2. In the lgpma_pub.py file, we configure the training data path, the model storage path, and the log storage path.


Figure 4.2: Training data path, model storage path, and log storage path

3. In the lgpma_pub.py file, we can also configure the number of epochs, the number of GPUs, and the weights of the pretrained model.

Figure 4.3: Training configuration

Figures 4.2 and 4.3 represent our configuration of the model.

4.2.2 LPMA

For the pyramid mask regression, we assign the pixels in the proposal bounding-box regions a soft label in both the horizontal and vertical directions. The middle point of the text has the largest regression target, 1, which is the darkest level. Specifically, we assume the proposal-aligned bounding box has shape H × W. The top-left and bottom-right points of the text region are denoted (x1, y1) and (x2, y2), respectively, where 0 < x1 < x2 ≤ W and 0 < y1 < y2 ≤ H. Therefore, the target of the pyramid mask has shape 2 × H × W with values in [0, 1], in which the two channels represent the target maps of the horizontal and vertical masks, respectively. For every pixel (w, h), these two targets can be formed as:

$$
t(w,h)=
\begin{cases}
\dfrac{w}{x_{mid}} & \text{if } w \le x_{mid}\\[6pt]
\dfrac{W-w}{W-x_{mid}} & \text{if } w > x_{mid}
\end{cases}
\qquad
v(w,h)=
\begin{cases}
\dfrac{h}{y_{mid}} & \text{if } h \le y_{mid}\\[6pt]
\dfrac{H-h}{H-y_{mid}} & \text{if } h > y_{mid}
\end{cases}
$$
In this way, every pixel in the proposal region takes part in predicting the boundaries.
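The piecewise targets above can be generated directly with NumPy. The sketch below is our own illustration of the formulas; the function name and the integer-grid discretization are assumptions, not the model's actual code.

```python
import numpy as np

def pyramid_targets(H, W, x1, y1, x2, y2):
    """Soft pyramid-label targets for a proposal of shape H x W.

    (x1, y1)-(x2, y2) is the text region; the targets peak at 1 at its
    midpoint and decay linearly toward the proposal borders."""
    xmid, ymid = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    w = np.arange(W, dtype=float)
    h = np.arange(H, dtype=float)
    # horizontal mask: w / xmid rising, (W - w) / (W - xmid) falling
    t = np.where(w <= xmid, w / xmid, (W - w) / (W - xmid))
    # vertical mask: h / ymid rising, (H - h) / (H - ymid) falling
    v = np.where(h <= ymid, h / ymid, (H - h) / (H - ymid))
    # broadcast to the 2 x H x W target described in the text
    return np.stack([np.tile(t, (H, 1)), np.tile(v[:, None], (1, W))])
```

For example, with a 10 × 20 proposal and text region (4, 2)-(12, 8), the horizontal channel reaches 1 at column x_mid = 8 in every row, matching the piecewise definition.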

4.2.3 GPMA

Although LPMA allows the predicted mask to break through the proposal bounding boxes, the receptive fields of the local regions are limited. To determine the accurate coverage area of a cell, the global feature might also provide some visual clues: learning the offsets of each pixel from a global view could help locate more accurate boundaries. However, cell-level bounding boxes can vary in width-height ratio, which leads to an imbalance problem in regression learning. Therefore, we use the pyramid labels as the regression targets for each pixel, a technique named Global Pyramid Mask Alignment (GPMA). The ground truths of empty cells are generated according to the maximum height/width of the non-empty cells in the same row/column. Only this task learns empty-cell division information, since empty cells have no visible text texture, which might otherwise influence the region proposal network to some extent. We want the model to capture the most reasonable cell division pattern during the global boundary segmentation according to human reading habits, which are reflected by the manually labeled annotations. For the global pyramid mask regression, since only the text regions can provide information about distinct cells, all non-empty cells are assigned the soft labels. All ground truths of aligned bounding boxes in GPMA are shrunk by 5% to prevent boxes from overlapping.
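The 5% shrink mentioned above can be read, for example, as trimming each box symmetrically about its center. This small helper is our own hedged interpretation, not the model's exact implementation.

```python
def shrink_box(box, ratio=0.05):
    """Shrink an aligned bounding box [x1, y1, x2, y2] by `ratio` of its
    width and height, keeping the center fixed, so that the ground-truth
    boxes of adjacent cells no longer touch or overlap."""
    x1, y1, x2, y2 = box
    dx = (x2 - x1) * ratio / 2.0   # half the width reduction per side
    dy = (y2 - y1) * ratio / 2.0   # half the height reduction per side
    return [x1 + dx, y1 + dy, x2 - dx, y2 - dy]
```

Applied to a 100 × 40 box at the origin, this yields [2.5, 1.0, 97.5, 39.0]: the width and height each lose 5% overall.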


4.3 Evaluation

4.3.1 IOU(Intersection Over Union)

Intersection over Union (IoU) is a metric commonly used in computer vision and image

segmentation tasks to evaluate the accuracy of an object detection or image segmentation

algorithm. It measures the overlap between the predicted and ground truth regions in an

image. IoU is particularly useful for tasks where you need to assess how well a model’s

predictions match the actual objects or regions of interest in an image.

Here’s the formula for calculating IoU:

$$\text{IoU} = \frac{\text{Intersection Area}}{\text{Union Area}}$$
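A direct implementation of this formula for axis-aligned [x1, y1, x2, y2] boxes (our own sketch, not the evaluation code we used) is:

```python
def iou(box_a, box_b):
    """Intersection over Union of two [x1, y1, x2, y2] boxes."""
    # corners of the intersection rectangle
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # clamp to zero when the boxes do not overlap
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
             + (box_b[2] - box_b[0]) * (box_b[3] - box_b[1]) - inter)
    return inter / union if union else 0.0
```

Identical boxes give an IoU of 1, disjoint boxes give 0, and partial overlaps fall in between.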

4.3.2 Precision, Recall, and F1 score

• Precision: Precision is a measure of how accurate the positive predictions made by a

model are. In the context of object detection or segmentation with IoU, precision

is the ratio of true positives (correctly predicted object instances with IoU above a

certain threshold) to the total number of positive predictions (both true positives

and false positives). Here’s the formula for Precision:

$$\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}$$

• Recall:

Recall, also known as sensitivity or true positive rate, measures the model’s ability

to identify all the relevant positive instances in the dataset. In the context of IoU,

recall is the ratio of true positives to the total number of actual positive instances.


Recall (Sensitivity or True Positive Rate):

$$\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}}$$

• F1 score:

The F1 score is a metric commonly used in binary classification tasks to measure the

model’s accuracy in terms of both precision and recall. It is the harmonic mean of

precision and recall. Here’s the formula for the F1 score :

$$F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$
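Computed from counts of true positives, false positives, and false negatives (for instance, predicted cells whose IoU with a ground-truth cell exceeds a chosen threshold), the three metrics can be sketched as:

```python
def prf1(tp, fp, fn):
    """Precision, recall, and F1 from detection counts, with guards
    against division by zero when a count combination is empty."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Because F1 is the harmonic mean, it equals precision and recall whenever the two are equal, and otherwise sits closer to the smaller of them.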

4.3.3 Evaluation Results

In this subsection, we discuss the results of the evaluation process:

Precision   Recall   F1 score
  0.70       0.79      0.74

Table 4.1: Evaluation results

For a table structure recognition task, the performance metrics can be interpreted as

follows:

• F1 Score (0.74):

– An F1 score of 0.74 indicates a good balance between precision and recall,

suggesting effective table structure recognition.

• Recall (0.70):

– A recall of 0.70 signifies that the model correctly identifies 70% of the actual

tables in the dataset.



• Precision (0.79):

– A precision of 0.79 implies that when the model predicts a table, it is correct

approximately 79% of the time.

In summary, these performance metrics suggest that the model is reasonably effective at

recognizing table structures in the data. It correctly identifies a significant portion of

the tables (recall of 0.70) while maintaining a high level of accuracy in its predictions

(precision of 0.79). However, the specific interpretation may depend on the requirements

and objectives of the table structure recognition task and whether certain trade-offs between

precision and recall are acceptable in the given application. The obtained results might have been more accurate with a bigger dataset, as ours is considered very low volume for this type of model.

4.4 Deployment

FastAPI is a modern, fast (high-performance), web framework for building APIs with

Python. It is designed to be easy to use, while also being highly efficient and providing

automatic validation, serialization, and documentation of APIs. FastAPI has gained

popularity for its simplicity and performance, making it an excellent choice for building

RESTful APIs and web applications.[6]

Here are some key features and concepts associated with FastAPI:

• Automatic API Documentation: FastAPI automatically generates interactive API

documentation using the OpenAPI standard. You can access this documentation

through a web browser, making it easy for developers to understand and test your

API.


• Async Support: FastAPI fully supports asynchronous programming with Python’s

async and await syntax. This allows you to write non-blocking, high-performance

code.

• File Uploads: It supports handling file uploads from clients with ease.

Our FastAPI application is designed to process base64-encoded images and generate JSON output. The input typically involves sending the image data in base64 format as part of an HTTP request; once received, FastAPI decodes this input and processes it. In our case, the JSON output comprises three key components. First, it includes HTML representing the structure of the table, outlining its rows, columns, and cells. Second, the JSON output contains the coordinates of these cells, providing information about their precise positioning within the image. Finally, the third component consists of the content extracted from each cell, enabling users to access the actual data within the recognized table. By organizing and presenting these three parts within a structured JSON response, our model is deployed effectively. Figures 4.4 and 4.5 present the input and output of the model, respectively.


Figure 4.4: Input of Fast-API

Figure 4.5: Output of Fast-API


4.5 Conclusion

In conclusion, the evaluation of our modeling efforts in this chapter has yielded highly

positive results, aligning closely with the objectives and criteria we set out to achieve. The

comprehensive analysis and assessment of our models have provided valuable insights into

their performance and effectiveness.

Throughout this chapter, we have systematically examined various aspects of our models,

ranging from their accuracy and precision to their ability to generalize beyond the training

data. We have also considered their computational efficiency, scalability, and robustness in

real-world scenarios.

The results obtained indicate that our model has met the predefined expectations. We believe it is ready to successfully address the specific tasks and challenges we set out to tackle.

General Conclusion

This report presents an end-of-study project carried out at Axe Finance in order to obtain the national diploma of computer science engineering from the National Superior Engineering School of Tunis, and aims to implement the table structure recognition model LGPMA.

The initial chapter laid the foundation with a detailed presentation of the host company and a concise problem definition, establishing the need for an effective solution. Recognizing the significance of choosing the right model, we evaluated various options and ultimately selected LGPMA. This decision was driven by its track record of delivering excellent results in the ICDAR 2021 competition and its remarkable capability to handle complex data structures effectively.

Following this, we carried out data annotation and environment setup, crucial steps in preparing our data and creating a conducive modeling environment. This approach ensured alignment with our objectives and the compatibility of our dataset with the LGPMA model. We then moved into the modeling phase, implementing the LGPMA model and evaluating its performance. The model consistently met the predefined expectations, affirming the effectiveness of this approach. However, it is worth noting that a larger dataset could potentially enhance the model’s performance.

Finally, we successfully deployed our solution using FastAPI, making it accessible and practical for real-world applications.

Bibliography

[1] C. Tensmeyer, V. I. Morariu, B. Price, S. Cohen, and T. Martinez (Adobe Research, San Jose, USA). Deep splitting and merging for table structure decomposition. IEEE, page 12, 20-25 September 2019.

[2] PyTorch Contributors. PyTorch, 2021. [Accessed May 14, 2023].

[3] Python Software Foundation. Python documentation, 2021. [Accessed April-June 2023].

[4] IBM. Overview of CRISP-DM, Year. [Accessed February 2, 2023].

[5] L. Qiao, Z. Li, Z. Cheng, P. Zhang, S. Pu, Y. Niu, W. Ren, W. Tan, and F. Wu. LGPMA: Complicated table structure recognition with local and global pyramid mask alignment. Journal Name, page 17, 2022.

[6] MIT. FastAPI, 2021. [Accessed June 10, 2023].

[7] OpenMMLab. MMDetection’s documentation, 2021. [Accessed April 10, 2023].

[8] TensorFlow team. TensorFlow, 2022. [Accessed May 15, 2023].

[9] Z. Chi, H. Huang, H.-D. Xu, H. Yu, W. Yin, and X.-L. Mao. Complicated table structure recognition. page 9, 13 August 2019.

Abstract

This report presents an end-of-study project carried out at Axe Finance in order to obtain the national diploma of computer science engineering from the National Superior Engineering School of Tunis, and aims to implement the table structure recognition model LGPMA. The results obtained after systematic work on data annotation, environment setup, modeling, evaluation, and deployment indicate that our model has met the predefined expectations.

Keywords: LGPMA, modeling, data annotation, deployment

Résumé
Ce rapport représente un projet de fin d'études réalisé au sein d'Axe Finance en vue de l'obtention du
diplôme national d'ingénieur informatique de l'Ingénieur National Supérieur de Tunis et vise à mettre
en œuvre le modèle de reconnaissance de tables et de structures LGPMA. Les résultats obtenus après
un travail systémique de l'annotation des données, la mise en place de l'environnement, la
modélisation, l'évaluation et le déploiement indiquent que notre modèle a répondu aux attentes
prédéfinies

Mots clés : LGPMA, modeling, data annotation, deployment

‫الملخص‬
‫نم رتويبمكلا مولع ةسدنهل ينطولا مولبدلا ىلع لوصحلا لجأ نم يف هذيفنت مت يذلا ةساردلا ةياهن عورشم ريرقتلا اذه لثمي‬
‫لمعلا دعب اهيلع لوصحلا مت يتلا جئاتنلا لكيهلاو لودجلا ىلع فرعتلا جذومن ذيفنت ىلإ فدهيو سنوتب ايلعلا ةينطولا ةسدنهلا‬
‫اًقبسم ةددحملا تاعقوتلا ىفوتسا دق انجذومن نأ ىلإ رشنلاو مييقتلاو ةجذمنلاو ةئيبلا دادعإو تانايبلا حرش ريشي نم يجهنملا‬

LGPMA, modeling, data annotation, deployment : ‫الكلمات المفاتيح‬
