srnreddy.nalla@gmail.com
Data Engineer | +91-9676231247
Working with Mitratech as a Senior Software Engineer, with 10+ years of IT experience in Data Warehousing and Data Analytics.
My main areas of experience are information gathering, requirement analysis, ETL design, development, deployment and on-time delivery.
Worked extensively on data integration and data engineering projects for data warehouses and data lakes using Oracle Data Integrator, Informatica PowerCenter, Oracle SQL/PL-SQL, Sqoop, Hive, PySpark, Sisense, Pentaho, Snowflake, AWS S3, Azure Databricks and Azure Data Factory.
Strong data integration experience with data warehouse and data lake concepts such as dimensional modeling, star and snowflake schemas, data warehouses, data marts and ad-hoc analytics.
Experience in SQL and PL/SQL programming.
Experience with the Hadoop ecosystem: HDFS, Hive, Sqoop, Spark and Python.
Knowledge of CI/CD mechanisms.
Experience working in Agile development methodology.
Knowledge of Machine Learning.
Technical Skills:
Area                                   Skills
Operating Systems                      Windows, Linux
Programming Languages                  SQL, PL/SQL and Python
Databases & Tools                      Oracle 11g, 12c & MS SQL Server
Version Control Tools                  Tortoise SVN, Sourcetree & Bitbucket
Automation Tool                        Automic UC4
ETL Tools                              ODI 11g, ODI 12c, Informatica PowerCenter 8.x/9.x and Pentaho
Data Reporting & Visualization Tools   SAP BO & Sisense 8.2
Bigdata Cloud Stack                    HDFS, Sqoop, Hive, Spark, Python, PySpark, Databricks, Azure, Azure Data Factory (ADF), AWS and Snowflake
Employment History:
Project Experience:
Project Description:
Team Connect is a proven end-to-end platform for Legal Operations to deliver more efficient legal services to the rest of the organization, combining matter management, e-billing, legal spend management, document management, reporting and analytics, and process automation.
Roles & Responsibilities:
Created DB scripts that work on both Oracle and MS SQL Server, as the product supports both databases.
Created Transformations and Jobs using the Pentaho ETL tool.
Implemented object-level restartability for initial and refresh data loads to eliminate manual intervention between loads.
Implemented a batch process mechanism to run initial and refresh loads.
Converted Pentaho jobs to Databricks PySpark jobs.
Created widgets in Databricks to parameterize file paths and table names dynamically (a widget sketch follows this list).
Implemented ADF pipelines for data-driven workflows in the cloud, orchestrating and automating data movement and data transformation.
Implemented pipelines enforcing business rules using ADF activities such as If Condition, ForEach, Get Metadata, Set Variable, Filter and Switch (a minimal pipeline sketch follows this list).
Implemented PySpark Databricks notebooks to ingest and transform data from the source (raw) container to the target (output) container, applying business logic.
Created Hive tables with partitions over target folders in Parquet and Delta file formats (a partitioned-table sketch follows this list).
Implemented code reusability by creating separate notebooks for variable initialization and PySpark functions, and calling them from each dimension and fact table notebook (a notebook-reuse sketch follows this list).
Created an Azure Logic App resource, invoked via the ADF Web activity, to send automated emails to business/project stakeholders acknowledging pipeline successes and failures.
Prepared the data extraction scripts required for an ML project.
Created data pipelines and cleansed data using the Python pandas and spaCy libraries.
Used spaCy to generate named entities from one of the DataFrame columns (a spaCy sketch follows this list).
Created a Docker image that runs shell scripts connecting to the remote database, eliminating the need for a Linux dev setup.
Created SAP BO users from the CMC.
Created and validated SAP BO reports.
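
A minimal sketch of the Databricks widget pattern referenced above; the widget names, paths and table names are hypothetical, and dbutils/spark are provided by the Databricks notebook runtime:

    # Widgets let the same notebook run against different files and tables.
    dbutils.widgets.text("source_path", "/mnt/raw/input.parquet")
    dbutils.widgets.text("target_table", "analytics.dim_customer")

    source_path = dbutils.widgets.get("source_path")
    target_table = dbutils.widgets.get("target_table")

    # Read the dynamic file and write it to the dynamic table.
    df = spark.read.parquet(source_path)
    df.write.mode("overwrite").saveAsTable(target_table)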
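
A sketch of one such ADF pipeline defined through the azure-mgmt-datafactory Python SDK; the pipelines here may equally have been authored in the ADF designer, and the subscription, resource group, factory and child pipeline names are placeholders:

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        Expression, ExecutePipelineActivity, ForEachActivity,
        ParameterSpecification, PipelineReference, PipelineResource,
    )

    client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    # ForEach iterates over a table list supplied as a pipeline parameter
    # and triggers a per-table child pipeline for each entry.
    for_each = ForEachActivity(
        name="ForEachTable",
        items=Expression(value="@pipeline().parameters.tableList"),
        activities=[
            ExecutePipelineActivity(
                name="LoadOneTable",
                pipeline=PipelineReference(reference_name="LoadSingleTable"),
            )
        ],
    )

    pipeline = PipelineResource(
        activities=[for_each],
        parameters={"tableList": ParameterSpecification(type="Array")},
    )
    client.pipelines.create_or_update("<resource-group>", "<factory>", "IngestAll", pipeline)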
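
A sketch of the partitioned-table DDL, issued as Spark SQL from a Databricks notebook; database, table and folder names are illustrative:

    # Delta table partitioned by load date over a target folder.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS analytics.fact_invoice (
            invoice_id BIGINT,
            amount     DECIMAL(18,2),
            load_date  DATE
        )
        USING DELTA
        PARTITIONED BY (load_date)
        LOCATION '/mnt/output/fact_invoice'
    """)

    # Parquet variant over an existing folder; register partitions already on disk.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS analytics.fact_invoice_pq (
            invoice_id BIGINT,
            amount     DECIMAL(18,2),
            load_date  DATE
        )
        USING PARQUET
        PARTITIONED BY (load_date)
        LOCATION '/mnt/output/fact_invoice_pq'
    """)
    spark.sql("MSCK REPAIR TABLE analytics.fact_invoice_pq")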
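
A sketch of the notebook-reuse pattern, assuming illustrative notebook paths; %run is a Databricks magic (shown as comments here, since it must sit alone in a cell) that pulls a shared notebook's variables and functions into the current session:

    # In each dimension/fact notebook, first bring in the shared code:
    # %run /Shared/init_variables
    # %run /Shared/pyspark_functions

    # Alternatively, run a parameterized notebook as a child job and
    # capture its exit value.
    result = dbutils.notebook.run(
        "/Shared/load_dimension", 3600, {"table_name": "dim_customer"}
    )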
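
A sketch of the pandas + spaCy step; the column name, sample text and model choice are illustrative (the small English model must first be installed with: python -m spacy download en_core_web_sm):

    import pandas as pd
    import spacy

    nlp = spacy.load("en_core_web_sm")

    df = pd.DataFrame({"notes": ["Acme Corp opened a clinic in Boston in 2019."]})

    # Basic cleansing, then per-row named-entity extraction.
    df["notes"] = df["notes"].fillna("").str.strip()
    df["entities"] = df["notes"].apply(
        lambda text: [(ent.text, ent.label_) for ent in nlp(text).ents]
    )
    print(df["entities"].iloc[0])  # e.g. [('Acme Corp', 'ORG'), ('Boston', 'GPE'), ('2019', 'DATE')]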
Organization : Cognizant Technology Solutions
Project : Data Lake Implementation of PG&E(Pacific Gas and Electric)
Duration : June 2018 – Apr 2020
Role : Technical Lead
Environment : Oracle, MS-SQL Server, ODI (Oracle Data Integrator), Snowflake and Python
This project deals with building a data lake for Pacific Gas and Electric (PG&E), a utility organization. It involves migrating on-premise Oracle, MS-SQL Server and CSV data to an AWS data lake (S3) and ingesting the data into Snowflake. Designed and coded data pipelines, such as the Sqoop application and the cleanse and match-merge logic, to ingest data into the data lake (a Snowflake load sketch follows).
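
A sketch of the S3-to-Snowflake ingest step using the snowflake-connector-python package; the account details, external stage and table names are placeholders:

    import snowflake.connector

    conn = snowflake.connector.connect(
        account="<account>", user="<user>", password="<password>",
        warehouse="<warehouse>", database="<database>", schema="<schema>",
    )

    # Load Parquet files landed in the S3 data lake via an external stage.
    conn.cursor().execute("""
        COPY INTO raw_meter_readings
        FROM @s3_lake_stage/meter_readings/
        FILE_FORMAT = (TYPE = PARQUET)
        MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
    """)
    conn.close()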
Roles & Responsibilities:
Project Description:
Millennium/Takeda is one of the world's leading biotechnology companies. The aim of the Data Management Platform (DMF) is to load Prescriber/Patient/Drug information into the EDW (Enterprise Data Warehouse). The EDW is loaded with information on all drugs manufactured by Takeda as well as competitor drugs, enabling users to compare Takeda's products with competitors' in the market. Reports are created using Tableau on the EDW layer to analyze performance along the Prescriber/Patient/Payer/Drug dimensions.
Roles and Responsibilities:
Converted ETL queries to Hive queries.
Set up Sqoop jobs (a Sqoop import sketch follows this list).
Data integration, aggregation and representation.
Data visualization.
Data validation of ETL jobs.
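
A sketch of one Sqoop import, wrapped in Python for consistency with the other examples here; the JDBC URL, credentials file and table names are placeholders:

    import subprocess

    # Import one source table into HDFS with 4 parallel mappers.
    subprocess.run([
        "sqoop", "import",
        "--connect", "jdbc:oracle:thin:@//dbhost:1521/ORCL",
        "--username", "etl_user",
        "--password-file", "/user/etl/.sqoop.pwd",  # keeps the password off the command line
        "--table", "PRESCRIPTIONS",
        "--target-dir", "/data/raw/prescriptions",
        "--num-mappers", "4",
    ], check=True)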
Project : Oracle Healthcare Foundation (OHF) Release 7.0 and 7.1
Client : MD Anderson, Mount Sinai
Organization : Oracle India Ltd.
Duration : May 2015-Nov 2016
Environment : ODI 12c, Informatica 9.6.1 and Oracle SQL/PL-SQL.
Project Description:
OHF is the next-generation version of the EHA platform, in which multiple EHA components are unified along with major enhancements to the individual components. All the components are integrated and deployed through Oracle Universal Installer. Administration tasks like creating database users, repositories and integration services, and configuring supporting files, are all automated to bring down installation time and provide a better experience to users.
Roles & Responsibilities:
Identified an alternative method to install the ODI and Informatica components using a data dump mechanism, which helped package and automate the whole OHF installer.
Performed performance benchmarking exercises: billions of near-real-time source records were auto-generated and the ETLs run to measure SLAs.
Created external tables to read flat-file data and load it into intermediate tables (an external-table sketch follows this list).
To eliminate manual processes, wrote PL/SQL packages and procedures for data cleanup and schema maintenance tasks such as dropping and recreating indexes and gathering statistics.
Upgraded ODI 11g repositories to ODI 12c and modified the ETLs to run without any issues.
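
A sketch of the external-table pattern, executed here through the python-oracledb driver; the connection details, directory object and column names are illustrative:

    import oracledb

    conn = oracledb.connect(user="etl", password="<password>", dsn="dbhost/ORCLPDB1")
    cur = conn.cursor()

    # External table exposing a flat file as a queryable table; DATA_DIR is
    # an Oracle directory object pointing at the file's location.
    cur.execute("""
        CREATE TABLE stg_patients_ext (
            patient_id NUMBER,
            full_name  VARCHAR2(100)
        )
        ORGANIZATION EXTERNAL (
            TYPE ORACLE_LOADER
            DEFAULT DIRECTORY data_dir
            ACCESS PARAMETERS (
                RECORDS DELIMITED BY NEWLINE
                FIELDS TERMINATED BY ','
            )
            LOCATION ('patients.csv')
        )
    """)

    # Load from the external table into the intermediate table.
    cur.execute("INSERT INTO stg_patients SELECT patient_id, full_name FROM stg_patients_ext")
    conn.commit()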
Project : TRC (Translational Research Center) Releases 2.0, 3.0 and 3.1
Client : Mayo Clinic, UPMC.
Organization : Oracle India Ltd.
Duration : Mar 2012-Apr 2015
Environment : ODI 11g/ODI 12c, Informatica 9.6.1 and Oracle SQL/PL-SQL.
Project Description:
Oracle Health Sciences Translational Research Center is a comprehensive platform that normalizes, aggregates, controls and analyzes all the diverse clinical and molecular data needed to support the complete biomarker lifecycle. With it, researchers have real-time access to internal proprietary data as well as external public data such as The Cancer Genome Atlas, simplifying analysis and increasing statistical power. The platform enables secondary use of electronic health records and omics data to help accelerate biomarker identification for drug discovery, clinical development and translational medicine. It enables answering complex clinical and genomic research questions by combining clinical data with cross-platform genomic data, and allows visualization of the data using both in-house business intelligence tools and integration with genomic viewers developed by the research community.
Project Description:
Rolta OneView is a web-based business intelligence application that empowers personnel to make on-time decisions at all levels of the organization. Rolta OneView provides a platform to improve performance and overall operational effectiveness by aligning the work-process efforts of personnel with the goals of the business, and provides a center for collaborative decision making, leading to lower risk. Rolta OneView establishes a cross-functional paradigm shift, uniting users around one view of their business. The OneView Data Warehouse Common Object Model provides an informational backbone for all facets of OneView; this single point of access provides a cross-functional view of multi-sourced information based on ISA and other industry standards and best practices.