This paper motivates and describes the data integration component of INDUS, a system and environment for data-driven information extraction and integration from heterogeneous, distributed, autonomous information sources.
INDUS employs ontologies and inter-ontology mappings to enable a user or an application to view a collection of physically distributed, autonomous, semantically heterogeneous data sources, regardless of location, internal structure and query interfaces, as though they were a collection of tables structured according to an ontology supplied by the user. This allows INDUS to answer user queries against distributed, semantically heterogeneous data sources without the need for a centralized data warehouse or a common global ontology. The design of INDUS is motivated by the requirements of applications such as scientific discovery, in which it is desirable for users to be able to access, flexibly interpret, and analyze data from diverse sources from different perspectives in different contexts. INDUS implements a federated, query-centric approach to data integration using user-specified ontologies. More than 13 systems were studied, and INDUS was found to be the preferred system for information extraction, integration, and knowledge acquisition from heterogeneous, distributed and autonomous information sources. PROSITE, MEROPS, SWISSPROT, and MEME are examples of data sources used by computational biologists.
INDUS (Intelligent Data Understanding System), Query-centric approach, PROSITE, MEROPS, SWISSPROT, MEME, MIPS2GO, EC2GO.
INDUS is a modular, extensible, platform-independent environment for information integration and data-driven knowledge acquisition from heterogeneous, distributed, autonomous information sources. When combined with machine learning algorithms for ontology-guided knowledge acquisition, INDUS can accelerate the pace of discovery in emerging data-rich domains such as biological sciences, atmospheric sciences, economics, defense, and social sciences by enabling scientists and decision makers to rapidly and flexibly explore and analyze vast amounts of data from disparate sources. IBM provides a family of data management products that enable a systematic approach to solving the information integration challenges that businesses face today.
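To make the query-centric idea concrete, the following minimal sketch shows how a query posed in a user's own vocabulary might be rewritten through inter-ontology mappings and answered against two autonomous sources without first copying them into a central warehouse. It is not INDUS code: the two sources, their attribute names and term codes, the `MAPPINGS` table, and the `answer_query` helper are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

Row = Dict[str, str]

# Two hypothetical autonomous sources. Each uses its own attribute names
# and its own vocabulary for the same underlying concepts.
SOURCES: Dict[str, List[Row]] = {
    "A": [
        {"prot_id": "P1", "fam": "serine protease"},
        {"prot_id": "P2", "fam": "kinase"},
    ],
    "B": [
        {"accession": "Q7", "class": "S1"},  # assumed code for serine proteases
        {"accession": "Q8", "class": "K2"},  # assumed code for kinases
    ],
}

# Assumed inter-ontology mappings: user attribute -> source attribute,
# and user term -> source term, one entry per source.
MAPPINGS: Dict[str, Dict[str, Dict[str, str]]] = {
    "A": {
        "attrs": {"protein": "prot_id", "family": "fam"},
        "terms": {"SerineProtease": "serine protease", "Kinase": "kinase"},
    },
    "B": {
        "attrs": {"protein": "accession", "family": "class"},
        "terms": {"SerineProtease": "S1", "Kinase": "K2"},
    },
}


def answer_query(user_attr: str, user_value: str) -> List[Row]:
    """Answer 'user_attr == user_value' in the user's vocabulary."""
    results: List[Row] = []
    for name, rows in SOURCES.items():
        mapping = MAPPINGS[name]
        # Rewrite the query into this source's own vocabulary.
        src_attr = mapping["attrs"][user_attr]
        src_value = mapping["terms"][user_value]
        # Inverse mappings translate matching rows back for the user.
        to_user_attr = {v: k for k, v in mapping["attrs"].items()}
        to_user_term = {v: k for k, v in mapping["terms"].items()}
        for row in rows:
            if row[src_attr] == src_value:
                translated = {
                    to_user_attr[k]: to_user_term.get(v, v) for k, v in row.items()
                }
                translated["source"] = name
                results.append(translated)
    return results


if __name__ == "__main__":
    for hit in answer_query("family", "SerineProtease"):
        print(hit)
```

Each source is queried in place and only matching rows are translated back into the user's terms. A different user could supply a different `MAPPINGS` table and see the same sources through another ontology.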
Data integration systems attempt to provide users with seamless and flexible access to information from multiple autonomous, distributed and heterogeneous data sources through a unified query interface. Ideally, a data integration system should allow users to specify what information is needed without having to provide detailed instructions on how or from where to obtain the information. A data integration system must provide mechanisms for the following: communication and interaction with each data source as needed; specification of a query, expressed in terms of a user-specified vocabulary, across multiple heterogeneous and autonomous data sources; specification of mappings between the user ontology and the data-source-specific ontologies; transformation of a query into a plan for extracting the needed information by interacting with the relevant data sources; and integration and presentation of the results in terms of a vocabulary known to the user. There are two broad classes of approaches to data integration: data warehousing and database federation.
Figure 1. Data Integration Layer
INDUS allows users to,
A Review on Ontology-Driven Query-Centric Approach for INDUS Framework
L. Senthilvadivu, Dept of Software Technology, SSM College of Engineering, Komarapalayam, Tamilnadu, India
Dr. K. Duraiswamy, Dean (Academic), K.S.R College of Technology, Tiruchengode, Tamilnadu, India
lsvadivu.firstname.lastname@example.org@yahoo.co.in
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 5, August 2010. http://sites.google.com/site/ijcsis/ ISSN 1947-5500