Komalfinal 99
I further affirm, to the best of my knowledge, that the structure and content of this report are entirely original and have not been previously submitted for any purpose whatsoever.
Sincerely,
Komal
MCA 4th Semester
M.K.M. College of Management and Information Technology for Girls, Hodal
Approval Letter
Acknowledgement
I would like to take this opportunity to express my heartfelt gratitude to NCMRWF (Noida) for providing
me with the invaluable opportunity to complete my internship with your esteemed organization.
My time at NCMRWF has been a remarkable and enriching experience. I am deeply thankful to the entire team
for their warm welcome, guidance, and support throughout my internship journey. Your professionalism,
dedication, and commitment to excellence have been truly inspiring.
I would like to extend my special thanks to Dr. Indira Rani S. for her mentorship, valuable insights, and
continuous encouragement. Her guidance played a pivotal role in my personal and professional development
during this internship.
The knowledge and skills I gained from working with each of you have been invaluable.
I would like to express my appreciation towards Anadi J. (Dean) for providing me with exposure to diverse projects and tasks, which enhanced my learning experience and allowed me to contribute meaningfully to the organization.
Once again, thank you, NCMRWF, for this incredible opportunity, and for being an integral part of my academic
and career development.
Sincerely,
Ms. Komal
Index
S. No.  Contents
1. Organizational Profile
   1.1 Name
   1.2 Location
   1.3 Background
   1.4 Functions and Objectives
   1.5 Research Activity
   1.6 Technological Advancements
   1.7 Collaboration
   1.8 Achievement
   1.9 Challenges
   1.10 Future Directions
2. Internship Details
   2.1 Internship Domain
   2.2 Internship Goal
   2.3 Internship Outcome
   2.4 Internship Duration
3. Technology
   3.1 Python
   3.2 Shell Scripting
   3.3 Excel
6. Project Detail
Organizational Profile
• The center operates under the Ministry of Earth Sciences, Government of India.
Functions and Objectives:
• It plays a crucial role in issuing forecasts for up to 10 days in advance, helping various sectors,
including agriculture, disaster management, and transportation.
• The center conducts research and development in atmospheric and oceanic sciences.
• It focuses on improving the accuracy of weather forecasts, climate modeling, and disaster
management.
Research Activities:
• NCMRWF is involved in a wide range of research activities related to atmospheric and oceanic
sciences.
• It conducts research on numerical weather prediction (NWP) models, data assimilation techniques, and
climate modeling.
• The center also conducts studies on monsoons, cyclones, and extreme weather events.
Technological Advancements:
• NCMRWF uses advanced technologies and high-performance computing systems for weather
modeling and forecasting.
• The center runs its own configuration of the Unified Model for medium-range weather prediction, known as the NCMRWF Unified Model (NCUM).
Collaborations:
• NCMRWF collaborates with various national and international meteorological organizations and
research institutions.
Achievements:
• NCMRWF has made significant contributions to improving the accuracy of medium-range weather
forecasts in India.
• It has successfully predicted weather events, including monsoons, cyclones, and extreme weather
conditions.
• The center's research has led to advancements in weather modeling and data assimilation techniques.
Challenges:
• NCMRWF faces challenges related to data assimilation, model improvement, and the need for constant
technological upgrades.
• Predicting extreme weather events, such as cyclones and heavy rainfall, remains a complex task.
Future Directions:
• NCMRWF continues to work on improving its NWP models and expanding its forecasting
capabilities.
• It aims to enhance its climate modeling efforts and contribute to climate change research.
• Collaboration with other organizations and the development of state-of-the-art technologies are part of
its future plans.
Conclusion :
NCMRWF plays a crucial role in enhancing India's weather forecasting capabilities and contributes
significantly to various sectors, including agriculture, disaster management, and climate research.
Internship Details :
Internship Goal :
Internship Outcomes :
• Industrial training is beneficial for students: it improves both personal attitude and work attitude.
PYTHON
Python is a popular programming language that is easy to learn and use. It has a rich ecosystem of open-source
libraries and tools that allow scientists to build sophisticated applications. Python is particularly popular in data
science because it can perform complex analyses on various data sets.
The uses of Python are varied and quite impactful. Here is a list of fields where Python is commonly used:
Web Development
As a web developer, you have the option to choose from a wide range of web frameworks while using Python
as a server-side programming language. Both Django and Flask are popular among Python programmers.
Django is a full-stack web framework for Python to develop complex large web applications, whereas Flask is
a lightweight and extensible Python web framework to build simple web applications as it is easy to learn and
is more Python-based. It is a good start for beginners.
Instagram uses the Django framework, whereas Airbnb, Netflix, Uber, and Samsung use the Flask framework.
Machine Learning:-
As Python is a very accessible language, you have a lot of great libraries on top of it that make your work
easier. A large number of Python libraries that exist help you to focus on more exciting things than reinventing
the wheel. Python is also an excellent wrapper language for working with more efficient C/ C++
implementations of algorithms and CUDA/cuDNN, which is why existing machine learning and deep learning
libraries run efficiently in Python. This is also super important for working in the fields of machine learning
and AI.
Data Analysis :
“Data is Everywhere”, in sheets, in social media platforms, in product reviews and feedback, everywhere. In this
latest information age it’s created at blinding speeds and, when data is analyzed correctly, can be a company’s
most valuable asset. “To grow your business even to grow in your life, sometimes all you need to do is Analysis!”
If your business is not growing, then you have to look back, recognize your mistakes, and make a plan again
without repeating those mistakes. And even if your business is growing, then you have to look forward to making
the business grow more.
All you need to do is analyze your business data and business processes. Data analysis is the process of studying data to find out how and why things happened in the past. Usually, the result of data analysis is a final dataset, i.e. a pattern, or a detailed report that you can further use for data analytics.
While there are many libraries available to perform data analysis in Python, here are a few to get you started:
• NumPy: For scientific computing with Python, NumPy is essential. It supports large, multi-
dimensional arrays and matrices and includes an assortment of high-level mathematical functions to
operate on these arrays.
• SciPy: This works with NumPy arrays and provides efficient routines for numerical integration and
optimization.
• Pandas: This is also built on top of NumPy, and offers data structures and operations for manipulating
numerical tables and time series.
• Matplotlib: A 2D plotting library that can generate data visualizations as histograms, power spectra,
bar charts, and scatterplots with just a few lines of code.
• Subplots: Matplotlib's plt.subplots function is used to create multiple subplots (in this case, two
vertically stacked subplots) within a single figure to display 'Reception' and 'Used' data side by side.
• Legends: Legends are added to the subplots to label the lines in the graph. They are created using
Matplotlib's legend function.
• Custom Labels: Custom labels for the x-axis and y-axes are set using functions like set_xlabel and
set_ylabel.
• Data Calculation: Basic calculations are performed to calculate the percentage of 'Reception' data that
is 'Used'. This is done using arithmetic operations on DataFrame columns.
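Taken together, the bullet points above amount to a short script. A minimal sketch, using made-up sample values for the 'Reception' and 'Used' columns:

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # draw off-screen; no display required
import matplotlib.pyplot as plt

# Toy stand-in for the 'Reception'/'Used' data (values are made up)
df = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-31", "2021-02-28",
                            "2021-03-31", "2021-04-30"]),
    "reception": [1000, 1200, 900, 1100],
    "used": [800, 900, 450, 990],
})

# Data calculation: percentage of received data that was actually used
df["percentage_used"] = df["used"] / df["reception"] * 100

# Two vertically stacked subplots sharing one x-axis
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(8, 6), sharex=True)
ax1.plot(df["date"], df["reception"], label="Reception")
ax2.plot(df["date"], df["used"], label="Used")

# Custom labels and legends
ax1.set_ylabel("Reception")
ax2.set_ylabel("Used")
ax2.set_xlabel("Date")
ax1.legend()
ax2.legend()
```

The `sharex=True` option keeps the two panels aligned on the same date axis, which is what makes a side-by-side comparison of the series meaningful.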
Desktop Applications:-
Desktop applications are software programs designed to run on personal computers and laptops, typically within
an operating system environment like Windows, macOS, or Linux.
Platform-Specific: These applications are generally developed for specific operating systems and take
advantage of platform-specific features, such as Windows APIs, macOS frameworks, or Linux libraries.
Installation: These applications are installed directly on the user's system, which involves copying files, creating shortcuts, and potentially updating system configurations.
Performance: Desktop applications often have direct access to system resources, which can result in
faster performance compared to web-based
applications.
Offline Capabilities: Since they're installed on a device, desktop applications can often function
without an internet connection, providing a seamless user experience.
Shell Scripting
A shell script is a text file that contains a sequence of commands for a UNIX-based operating system. It is
called a shell script because it combines a sequence of commands, which would otherwise have to be typed into
the keyboard one at a time, into a single script. The shell is the operating system's command-line interface
(CLI) and interpreter for the set of commands that are used to communicate with the system.
A shell script is usually created for command sequences that a user needs to run repeatedly, in order to
save time. Like other programs, a shell script can contain parameters, comments, and subcommands that the
shell must follow. Users initiate the sequence of commands in the shell script by simply entering the file name
on a command line.
The basic steps involved with shell scripting are writing the script, making the script accessible to the shell, and
giving the shell execute permission.
Shell scripts contain ASCII text and are written using a text editor, word processor, or graphical user interface
(GUI). The content of the script is a series of commands in a language that can be interpreted by the shell.
Functions that shell scripts support include loops, variables, if/then/else statements, arrays, and shortcuts. Once
complete, the file is typically saved with a .txt or .sh extension in a location that the shell can access.
Bash (Bourne Again SHell): The most widely used shell on Linux systems.
Sh (Bourne Shell): A simpler shell, often used as the default in many UNIX systems.
Zsh (Z Shell): An advanced shell with additional features, like better scripting capabilities, command-line completion, and customization.
Csh/Tcsh (C Shell/TENEX C Shell): Shells with a syntax similar to the C programming language, popular in some UNIX environments.
Shebang: The first line of the script, indicating which shell should interpret the script. For Bash, it is usually #!/bin/bash.
Comments: Lines starting with # are comments and are ignored by the shell.
Variables: Shell scripts use variables to store values for later use.
Control Structures: Shell scripts support loops (for, while) and conditionals (if/then/else, case).
• File Manipulation: Creating, modifying, moving, and deleting files and directories.
• Text Processing: Processing and transforming text data using tools like awk, sed, grep, and cut.
• System Administration: Managing system services, user accounts, and system configurations.
• Batch Processing: Running multiple commands or scripts in sequence to process large amounts of data.
• Follow Consistent Naming Conventions: Use clear and descriptive variable and function names.
• Error Handling: Include checks for errors and handle them appropriately.
• Permissions: Ensure scripts have the correct permissions for execution (chmod +x script.sh).
• Security: Be cautious with user input and avoid common security vulnerabilities, such as command
injection.
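A small script combining these concepts and best practices might look like the following sketch (the directory layout and file names are illustrative):

```shell
#!/bin/bash
# Count the lines in each .txt file in a directory, with basic error handling.
set -u  # treat unset variables as errors

# Variable holding the target directory (first argument, defaulting to ".")
target_dir="${1:-.}"

# Error handling: check that the directory exists before proceeding
if [ ! -d "$target_dir" ]; then
    echo "Error: directory '$target_dir' not found" >&2
    exit 1
fi

# Loop over matching files and report each line count
for f in "$target_dir"/*.txt; do
    [ -e "$f" ] || continue          # skip if no .txt files match
    lines=$(wc -l < "$f")
    echo "$f: $lines lines"
done
```

After saving it as script.sh, make it executable with `chmod +x script.sh` and run it as `./script.sh /path/to/data`.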
EXCEL
Excel is frequently used for data analysis because of its superb data visualization features, which enable the
creation of illuminating graphics. Each Excel chart has a specific meaning, and Excel comes with a significant
selection of built-in charts, which may be elegantly used to make the best use of data. Data visualization is the
graphic depiction of data, which simplifies understanding of the data. It can also be done with dedicated
tools such as Datawrapper, Google Charts, and others. As a spreadsheet, Excel is likewise used to
organize data and visualize it.
Data visualization in Excel can be done with its various charts and graphs, or by starting from Excel templates. Excel has charts of many kinds, including column charts, bar charts, pie charts, line charts, area charts, scatter charts, surface charts, and many more.
Microsoft Excel is a versatile and widely used spreadsheet software that serves a
variety of purposes, from simple data entry to complex data analysis. Here's a theoretical overview of Excel
and its key components:
1. Spreadsheet Structure
Cells: Each cell is an intersection of a row and a column, identified by a combination of letters (for
columns) and numbers (for rows). For example, cell A1 is in the first column and the first row.
Workbooks: A workbook is a complete Excel file that can contain one or more worksheets.
2. Basic Operations
Data Entry: You can enter text, numbers, dates, and formulas into cells.
Formulas: Formulas are mathematical expressions used for calculations within Excel. They typically
begin with an equals sign (=) and can involve basic arithmetic operations, functions, and references to
other cells.
Functions: Excel offers a wide range of functions for various tasks, such as mathematical, statistical,
financial, and logical operations. Examples include SUM, AVERAGE, IF, and VLOOKUP.
3. Data Analysis
Sorting and Filtering: Excel allows you to sort data based on specific criteria and filter rows that meet certain conditions.
PivotTables: PivotTables are powerful tools for summarizing and analyzing large data sets. They
enable you to create customizable reports, group data, and summarize it with counts, sums, and averages.
Data Validation: This feature allows you to restrict data entry in cells to specific types, ranges, or
lists.
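Outside Excel itself, the kind of summary a PivotTable produces can be sketched in Python with pandas (the sales figures below are invented for illustration):

```python
import pandas as pd

# Made-up data resembling a small sales sheet
df = pd.DataFrame({
    "region":  ["North", "North", "South", "South"],
    "product": ["A", "B", "A", "B"],
    "sales":   [100, 150, 200, 50],
})

# A PivotTable-style summary: total sales per region and product
pivot = df.pivot_table(index="region", columns="product",
                       values="sales", aggfunc="sum")
print(pivot)
```

As in an Excel PivotTable, one column supplies the row labels, another the column labels, and an aggregation function (here a sum) fills the cells.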
4. Data Visualization
Charts: Excel provides a variety of chart types (e.g., bar, line, pie, scatter) to
visually represent data. Charts can be customized with labels, titles, and other elements.
5. Advanced Features
Macros and VBA: Excel supports automation through macros, which are
sequences of instructions that can be recorded and replayed. VBA (Visual Basic for Applications) is a
programming language that allows for more complex automation and customizations.
Data Connections: Excel can connect to external data sources, such as databases, CSV files, and online
services, allowing for real-time data analysis and integration.
Collaboration: With Excel Online and cloud-based solutions like Microsoft 365, multiple users can
collaborate on the same workbook in real time, making it easier to share and work on data.
Report: Data Analysis and Visualization
Introduction:-
This report shows the analysis and visualization of data obtained from 'Reception' and 'Used' sources within
two years (2021-22). The data was processed using Python and Matplotlib to create informative graphs. This
document outlines the purpose, methodology, and results of the analysis.
1. Purpose
The purpose of this report is to analyze data from 'Reception' and 'Used' sources over the course of two years, from 2021 to 2022. It seeks to provide insights into trends, patterns, or noteworthy findings from these datasets.
2. Methodology
The analysis involved the use of Python, a powerful programming language for
data science, and Matplotlib, a popular library for creating visualizations. Key stages in the methodology
might include:
Data Collection: Gathering data from the specified sources (e.g., 'Reception' and 'Used') and ensuring that it is accurate, complete, and relevant.
Data Cleaning: Removing inconsistencies, missing values, and outliers to ensure data quality.
Data Visualization: Plotting the cleaned data with Matplotlib to surface trends and insights.
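The cleaning stage might be sketched as follows, assuming hypothetical values and a simple 1.5×IQR outlier rule:

```python
import pandas as pd

# Hypothetical raw data with a missing value and an obvious outlier
raw = pd.DataFrame({
    "reception": [1000, 1200, None, 1100, 90000],
    "used":      [800, 900, 700, 950, 1000],
})

# Drop rows with missing values
clean = raw.dropna()

# Drop outliers: keep rows below the upper 1.5*IQR fence
q1, q3 = clean["reception"].quantile([0.25, 0.75])
iqr = q3 - q1
clean = clean[clean["reception"] <= q3 + 1.5 * iqr]
```

The IQR fence is only one possible rule; for real reception counts a domain-specific threshold may be more appropriate.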
3. Results
The report would likely contain various visualizations to showcase the results of
variables.
4. Discussion
In this section, the report may discuss the key findings and their implications. This
5. Conclusion
The report would summarize the main takeaways from the data analysis and
visualization. It might also suggest next steps for further research or action based on the findings.
6. Appendices
Any additional information, such as the Python scripts used for analysis, data tables, or raw data, might be
included in the appendices for reference or further
study.
7. Raw Data
Sources: The origin of the data. In this case, the data comes from 'Reception' and 'Used' sources.
Reception Sources: This could refer to incoming data, like goods or observations received.
Used Sources: This could indicate data representing how resources or products are utilized, distributed,
or consumed.
Format: The raw data could be in various formats, like CSV, Excel, SQL
databases, or text files. Understanding the structure of these files is essential for processing.
Timeframe: The data spans a period from 2021 to 2022. You'd need to know if it's continuous (daily,
weekly) or collected at specific intervals.
Continuous Data: Collected at regular intervals, like daily or weekly, providing consistent snapshots
over time.
Specific Intervals: Collected at irregular intervals or specific events, requiring a different approach to
identify trends.
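Reading such headerless, space-separated raw data and checking whether it is continuous can be sketched like this (the sample rows are made up; the column names follow the report's 'date', 'reception', 'used' headers):

```python
import io
import pandas as pd

# Inline sample standing in for a raw space-separated text file
raw_text = """20210101 1000 800
20210102 1200 900
20210104 1100 950
"""

# Read headerless, space-separated data and name the columns
df = pd.read_csv(io.StringIO(raw_text), sep=" ", header=None,
                 names=["date", "reception", "used"])
df["date"] = pd.to_datetime(df["date"].astype(str), format="%Y%m%d")

# Check whether the data is continuous (daily) or has gaps
gaps = df["date"].diff().dropna()
is_daily = (gaps == pd.Timedelta(days=1)).all()
print("continuous daily data:", is_daily)
```

A dataset collected at specific intervals would fail this daily-continuity check, signalling that a resampling or gap-aware approach is needed before trend analysis.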
Chapter 1
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Define the folder path and the list of files to be processed
# (the actual path and file names are not reproduced in the report)
folder_path = ...
file_list = [...]

# Read the first file, then loop through the remaining files and concatenate data
main_dataframe = pd.DataFrame(pd.read_table(file_list[0]))
for i in range(1, len(file_list)):
    data = pd.read_table(file_list[i])
    df = pd.DataFrame(data)
    main_dataframe = pd.concat([main_dataframe, df], axis=1)

# Save the combined data, then read it back from the text file without headers
main_dataframe.to_csv('ahiclr_06.txt', header=None, sep=' ')
df = pd.read_csv('ahiclr_06.txt', header=None, sep=' ')

# Add headers to the DataFrame
df.columns = ['date', 'reception', 'used']

# Create three separate subplots
fig, (ax1, ax2, ax3) = plt.subplots(3, 1, figsize=(10, 8), sharex=True)

# Calculate the percentage of reception data that is used
df['percentage_used'] = (df['used'] / df['reception']) * 100

# Define y_ticks and set ytick labels for ax3
# (only the values 0, 700000, and 800000 survive in the report)
y_ticks = [0, 700000, 800000]

ax3.legend()
ax3.grid(True)

The same pattern is repeated for the other synoptic-hour files, reading 'ahiclr_12.txt', 'ahiclr_18.txt', and 'ahiclr11_00.txt' in place of 'ahiclr_06.txt'.
1. Importing Libraries: pandas, matplotlib.pyplot, and matplotlib.dates are imported for data handling and plotting.
2. Defining Folder Path and File List:
   - `folder_path` specifies the directory where the data files are located.
   - `file_list` is a list of three file names to read and combine.
3. Combining the Files:
   - `main_dataframe` is created as an empty DataFrame and used to store the combined data from the files.
   - Each file's data is stored in a temporary DataFrame called `df`, and `main_dataframe` is updated by concatenating these `df` objects horizontally using `pd.concat`.
   - After combining all the data, `main_dataframe` is saved to a text file named "ahiclr_xx.txt" without headers, using a space (' ') as the separator.
4. Reading the Combined Data: The code reads the "ahiclr_XY.txt" file into a new DataFrame named `df`, specifying that there are no headers and that the separator is a space (' ').
5. Adding Headers: The script adds headers to the DataFrame `df` for clarity. The columns are named 'date', 'reception', and 'used'.
6. Creating Subplots: Three separate subplots (`ax1`, `ax2`, and `ax3`) are created using `plt.subplots`. These subplots will be used to visualize different aspects of the data.
7. Plotting 'reception' and 'used' Data:
   - Data from the 'reception' and 'used' columns are plotted on the first and second subplots, respectively.
   - Axis labels and tick parameters are set for each subplot.
   - The x-axis of the third subplot (`ax3`) is formatted to display the date at 6-month intervals.
   - Legends are added to the first and second subplots to label the data series.
8. Calculating and Plotting the Percentage:
   - The code calculates the percentage of 'used' data relative to 'reception' data and stores it in a new column 'percentage_used' in the DataFrame.
   - The 'percentage_used' data is plotted on the third subplot (`ax3`), labelled 'Percentage Used', and the y-axis is formatted as percentages.
9. Saving the Figure: The resulting plot is saved as "Percentage_Used_XY.png".
10. Displaying the Plot: Finally, the script displays the plot on the screen.
This code essentially reads, combines, and visualizes data from multiple text files and then saves the resulting
plot as an image file.
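The 6-month date formatting described in the walkthrough can be reproduced in isolation; the monthly series below is made up:

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # off-screen rendering
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Made-up monthly series spanning 2021-2022
dates = pd.date_range("2021-01-01", "2022-12-01", freq="MS")
values = range(len(dates))

fig, ax = plt.subplots(figsize=(8, 3))
ax.plot(dates, values, label="sample")
ax.legend()
ax.grid(True)

# Show a tick every 6 months, formatted as year-month
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=6))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m"))
fig.autofmt_xdate()  # rotate the tick labels for readability
```

Using a `MonthLocator` rather than fixed tick positions keeps the ticks correct even when the data span changes.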
Tasks and Activities: Detail the tasks carried out during the project.
Quality Assurance: Explain how quality was ensured throughout the project, including testing, reviews, and audits.
Data Analysis: Present relevant data, statistics, and analysis that support the project's outcomes.
Lessons Learned: Reflect on the project, including what worked well and what could be improved in future projects.
Clarity and Conciseness: Ensure the report is clearly written and concise. Use plain language where possible, and avoid jargon unless necessary.
Visual Elements: Incorporate visual aids such as charts, graphs, and tables to help convey information more effectively.
Consistency: Maintain consistent formatting and structure throughout the report, including headings, font styles, and numbering.
Risk Management Plan: Provide a detailed list of potential risks, their likelihood, impact, and mitigation strategies. Include a risk register if applicable.
Problem Statement: Describe the problem or opportunity that prompted the project. Explain why the project was initiated and the context surrounding it.
Project Justification: Outline the reasons for undertaking the project, including expected benefits, return on investment, or strategic alignment with organizational goals.