
CTCORP Link.ML

Process Steps:

1. Website Crawling
Websites are crawled using Sequentum. All information for an
entity is crawled and harvested, then stored in an Excel file.
This Excel file is an intermediate file used for further processing.
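
As an illustration of this hand-off, the following is a minimal
sketch of loading the intermediate Excel file, assuming pandas is
available; the file name is hypothetical and not part of the actual
crawl configuration.

    # Minimal sketch: load the Sequentum Excel output as the intermediate
    # dataset for the uploading and normalization step.
    import pandas as pd

    harvest = pd.read_excel("sequentum_harvest.xlsx")  # hypothetical file name
    print(harvest.head())  # inspect the raw, per-state field names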

2. Database Uploading and Normalization

The output of Sequentum (in Excel format) is uploaded to a
database. The field names for each entity are normalized across
the different states for easy implementation. For example, the
Entity Name field may be labeled differently from state to state;
in the normalized database it is assigned a single canonical name,
and the same applies to the other fields. During the normalization
process, the application detects which records are new, which are
updates, and which are marked for removal. This is made possible
by the detection functionality in Sequentum.

The uploading and normalization process runs on the back end.
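
To make the normalization and change detection concrete, here is a
minimal sketch assuming a hand-maintained mapping from state-specific
field labels to one canonical name; the mapping entries, record
layout, and classification logic are assumptions, not the actual
Sequentum detection mechanism.

    # Hypothetical mapping from state-specific field labels to one
    # canonical field name; a real mapping would cover every state.
    from typing import Optional

    FIELD_MAP = {
        "Entity Name": "entity_name",
        "Business Name": "entity_name",
        "Company Name": "entity_name",
        "File Number": "registration_id",
        "Entity Number": "registration_id",
    }

    def normalize(record: dict) -> dict:
        """Rename state-specific fields to their canonical names."""
        return {FIELD_MAP.get(key, key): value for key, value in record.items()}

    def classify(crawled: dict, existing: Optional[dict]) -> str:
        """Label a normalized record as new, an update, or unchanged.
        Removals are entities in the database absent from the crawl."""
        if existing is None:
            return "new"
        return "update" if crawled != existing else "unchanged"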

3. Entity Searching (Link.ML)

This is a web-based search engine conceptualized around project
needs. It is highly configurable and easy to change or enhance, as
it is developed in-house.
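
Since Link.ML is developed in-house, its internals are not
documented here; the following is only a minimal sketch of the kind
of lookup it might run against the normalized database. The table
and column names are assumptions.

    # Minimal sketch of an entity search against the normalized database.
    # The 'entities' table and its columns are assumed, not documented.
    import sqlite3

    def search_entities(conn: sqlite3.Connection, term: str):
        """Case-insensitive substring match on the canonical entity name."""
        cur = conn.execute(
            "SELECT entity_name, state, registration_id "
            "FROM entities WHERE entity_name LIKE ?",
            (f"%{term}%",),
        )
        return cur.fetchall()
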
4. On Demand Crawling

On Demand Crawling is used when an entity needs to be crawled
outside the prescribed crawling schedule because its updated
information must be harvested. Once an entity name is marked for
‘On Demand Crawling’, the user can extract the list of entity
names and feed it to Connotate using the existing Connotate
set-up and license. The user can do this on a defined schedule or
at any time of day for an urgent need.
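
A minimal sketch of that extraction step follows, assuming an
‘on_demand’ flag column in the normalized database and a simple CSV
hand-off; the actual Connotate input format is not specified here.

    # Minimal sketch: pull entities flagged for on-demand crawling and
    # write them to a CSV for hand-off to Connotate. The flag column
    # and CSV layout are assumptions.
    import csv
    import sqlite3

    def export_on_demand_list(conn: sqlite3.Connection, path: str) -> None:
        rows = conn.execute(
            "SELECT entity_name, state FROM entities WHERE on_demand = 1"
        ).fetchall()
        with open(path, "w", newline="") as fh:
            writer = csv.writer(fh)
            writer.writerow(["entity_name", "state"])
            writer.writerows(rows)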
