
UNSTRUCTURED DATA ANALYSIS
Contents
1. Process Flow
2. Data Transformation Studio
3. DT-PowerCenter
4. Flow Chart
5. Analysis

Process Flow
Unstructured data is loaded to HDFS or any other file system.
The data is fetched in the same format for processing.
DT Studio converts the PDF/Word documents into a format easily readable by the ETL/reporting tool.
Useful information is extracted using DT and loaded to an XML or flat file.
Reports are generated on this information to depict the overall career graph of a resource.
Further analysis uses a predictive model to assess a candidate's utility to the firm.
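
The extraction step above can be illustrated with a minimal Python sketch. It is not how DT Studio works internally; it only shows the idea of pulling a few fields out of resume text and writing them to XML. The field names, the regular expressions, and the sample text are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical field patterns; in the real flow the parser is built in DT Studio.
FIELD_PATTERNS = {
    "name": re.compile(r"^Name:\s*(.+)$", re.MULTILINE),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "experience_years": re.compile(r"(\d+)\s+years", re.IGNORECASE),
}

def extract_fields(text):
    """Return a dict of the fields that could be found in the text."""
    fields = {}
    for key, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            # Use the capture group when the pattern has one, else the whole match.
            value = match.group(1) if pattern.groups else match.group(0)
            fields[key] = value.strip()
    return fields

def to_xml(fields):
    """Serialize the extracted fields as a flat <candidate> element."""
    root = ET.Element("candidate")
    for key, value in fields.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")

# Made-up sample text standing in for the content of a PDF/Word resume.
sample = "Name: Jane Doe\nContact: jane.doe@example.com\nSummary: 6 years in ETL development."
record = extract_fields(sample)
xml_out = to_xml(record)
print(xml_out)
```

A flat element per candidate keeps the output easy to consume by an ETL mapping or a reporting tool.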
Data Transformation Studio
Informatica B2B Data Transformation provides access to complex file and message formats through a comprehensive, enterprise-class transformation solution.

It features technology for extracting data from any file, document, or message, regardless of format, complexity, or size, and transforming it into a usable form.
Data Transformation-PowerCenter
Data Transformation has been set up on the INFA server to process the output file from DT.
The UDT transformation fetches the files from the folder where the output XML/flat file is placed.
These files act as an input to the mapping or the report.
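
As a rough analogy for picking up DT output files from a folder and feeding them to a mapping or report, the Python sketch below reads every XML file in a directory into a list of rows. It does not use any Informatica API; the function name `load_records`, the file name, and the XML layout are assumptions for illustration only.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

def load_records(folder):
    """Parse every *.xml file in the folder into a flat dict of tag -> text."""
    rows = []
    for path in sorted(Path(folder).glob("*.xml")):
        root = ET.parse(str(path)).getroot()
        row = {child.tag: (child.text or "").strip() for child in root}
        row["source_file"] = path.name  # keep track of where the row came from
        rows.append(row)
    return rows

# Demo: a temporary folder stands in for the DT output directory (hypothetical).
with tempfile.TemporaryDirectory() as out_dir:
    Path(out_dir, "cand_001.xml").write_text(
        "<candidate><name>Jane Doe</name>"
        "<experience_years>6</experience_years></candidate>",
        encoding="utf-8",
    )
    rows = load_records(out_dir)

print(rows)
```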
Flow Chart
HDFS (Hadoop Distributed File System)
-> Data Transformation Studio
-> XML/Flat File (Readable Format)
-> ETL (INFORMATICA) / Reporting (QlikView)
Analysis
After processing, there are several options:

1. Complete statistics for a candidate are available in a structured form and can be queried as needed.

2. Predictive analysis can be run over these statistics to draw conclusions about demand and recruitment, using languages or tools such as R or SAS.
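
The first option, querying structured candidate statistics, might look like the following Python sketch using an in-memory SQLite table. The table layout, column names, and the three rows are made-up illustration data, not output from the actual pipeline.

```python
import sqlite3

# Made-up candidate rows: (name, skill, experience_years).
rows = [
    ("Jane Doe", "ETL", 6),
    ("Ravi Kumar", "Reporting", 3),
    ("Ana Silva", "ETL", 9),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE candidate (name TEXT, skill TEXT, experience_years INTEGER)"
)
conn.executemany("INSERT INTO candidate VALUES (?, ?, ?)", rows)

# Candidates with a given skill and at least five years of experience.
senior_etl = conn.execute(
    "SELECT name FROM candidate "
    "WHERE skill = ? AND experience_years >= ? "
    "ORDER BY experience_years DESC",
    ("ETL", 5),
).fetchall()

# Average experience per skill, a simple statistic a report could show.
avg_by_skill = dict(
    conn.execute(
        "SELECT skill, AVG(experience_years) FROM candidate GROUP BY skill"
    )
)
conn.close()

print(senior_etl)
print(avg_by_skill)
```

The same structured table could then serve as input to a predictive model built in R or SAS, as the second option suggests.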
