
Data Processing

• What Is Data Processing?


Data processing is the method of collecting raw data and translating it
into usable information.

The raw data is collected, filtered, sorted, processed, analyzed, stored,
and then presented in a readable format.

Data processing is essential for organizations to create better business
strategies and increase their competitive edge.
• It is usually performed in a step-by-step process by a team of data
scientists and data engineers in an organization

• By converting the data into readable formats like graphs, charts, and
documents, employees throughout the organization can understand
and use the data.
• The data processing cycle consists of a series of steps where raw data
(input) is fed into a system to produce actionable insights (output).
Each step is taken in a specific order, but the entire process is
repeated in a cyclic manner.
Data Processing Cycle
• Generally, there are six main steps in the data processing cycle:
➢Step 1: Collection
The type of raw data collected has a huge impact on the output
produced.

Hence, raw data should be gathered from defined and accurate sources
so that the subsequent findings are valid and usable.

Raw data can include monetary figures, website cookies, profit/loss
statements of a company, user behavior, etc.
➢Step 2: Preparation
Data preparation, or data cleaning, is the process of sorting and filtering
the raw data to remove unnecessary and inaccurate data.

Raw data is checked for errors, duplication, miscalculations, or missing
data, and transformed into a suitable form for further analysis and
processing.

This is done to ensure that only the highest quality data is fed into the
processing unit.
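
The checks described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the record fields (`customer`, `amount`), the sample data, and the rule that two rows with the same customer and amount count as duplicates are all assumptions made for the example.

```python
def prepare(records):
    """Return cleaned records: drop incomplete rows, bad amounts, and duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        customer = (rec.get("customer") or "").strip()
        raw_amount = rec.get("amount")
        # Missing data: skip rows without a customer or an amount.
        if not customer or raw_amount in (None, ""):
            continue
        # Errors: skip rows whose amount is not a valid number.
        try:
            amount = float(raw_amount)
        except (TypeError, ValueError):
            continue
        # Duplication: keep only the first copy of each (customer, amount) pair.
        key = (customer.lower(), amount)
        if key in seen:
            continue
        seen.add(key)
        # Transformation: store a normalized, typed record.
        cleaned.append({"customer": customer, "amount": amount})
    return cleaned


# Hypothetical raw data with the problems named in this step.
raw = [
    {"customer": "Asha", "amount": "120.50"},
    {"customer": "asha ", "amount": "120.50"},  # duplicate of the first row
    {"customer": "Juma", "amount": "n/a"},      # amount is not a number
    {"customer": "", "amount": "75"},           # missing customer
    {"customer": "Neema", "amount": 40},
]
clean = prepare(raw)
```

Only the first and last rows survive, with their amounts converted to numbers, so the processing step receives consistent input.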
➢Step 3: Input
The raw data is converted into machine-readable form and fed into the
processing unit.

This can be in the form of data entry through a keyboard, scanner, or
any other input source.
➢Step 4: Data Processing
The input data is subjected to various data processing methods, such as
machine learning and artificial intelligence algorithms, to generate the
desired output.

This step may vary slightly from process to process depending on the
source of data being processed and the intended use of the output.
➢Step 5: Output
The data is finally transmitted and displayed to the user in a readable
form like graphs, tables, vector files, audio, video, documents, etc.

This output can be stored and further processed in the next data
processing cycle.
➢Step 6: Storage
The last step of the data processing cycle is storage, where data and
metadata are stored for further use.
This allows for quick access and retrieval of information whenever
needed, and also allows it to be used as input in the next data
processing cycle directly.
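
The six steps can be put together as a small pipeline. The sketch below is only an illustration under stated assumptions: the function names, the "region,amount" text format, the hard-coded sample rows, and the in-memory list used as storage are all hypothetical, and a simple per-region total stands in for the processing step.

```python
import json
from collections import defaultdict


def collect():
    # Step 1: Collection. Raw rows from a defined source (hard-coded here).
    return ["north,100", "south,250", "north,50", "bad-row", "south,250"]


def prepare_rows(rows):
    # Step 2: Preparation. Drop malformed rows and exact duplicates, keeping order.
    valid = [r for r in rows if r.count(",") == 1 and r.split(",")[1].isdigit()]
    return list(dict.fromkeys(valid))


def to_input(rows):
    # Step 3: Input. Convert text into a machine-readable form (typed tuples).
    return [(region, int(amount)) for region, amount in (r.split(",") for r in rows)]


def process(pairs):
    # Step 4: Processing. Total the amounts per region.
    totals = defaultdict(int)
    for region, amount in pairs:
        totals[region] += amount
    return dict(totals)


def output(totals):
    # Step 5: Output. Present the result in a readable form.
    return "\n".join(f"{region}: {total}" for region, total in sorted(totals.items()))


def store(totals, storage):
    # Step 6: Storage. Keep the result as JSON text for the next cycle.
    storage.append(json.dumps(totals, sort_keys=True))


storage = []
totals = process(to_input(prepare_rows(collect())))
report = output(totals)
store(totals, storage)
```

Because the stored JSON can be loaded again with `json.loads(storage[0])`, the result of one cycle can serve as input to the next, which is what makes the process cyclic.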
Types of Data Processing
There are different types of data processing, based on the source of data and
the steps taken by the processing unit to generate an output.

Batch Processing:
Data is collected and processed in batches. Used for large amounts of data.
E.g.: payroll system

Real-time Processing:
Data is processed within seconds of the input being given. Used for small
amounts of data.
E.g.: withdrawing money from an ATM
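
The difference between the two can be sketched in Python. This is a simplified illustration with hypothetical names and figures: `process_batch` stands in for a payroll run over salaries that were collected beforehand, and `RealTimeAccount` stands in for an ATM that must answer each request as it arrives.

```python
def process_batch(transactions):
    """Batch: everything is collected first, then processed in one pass."""
    return sum(transactions)


class RealTimeAccount:
    """Real-time: each input is processed as soon as it is given."""

    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            return False  # rejected immediately, balance unchanged
        self.balance -= amount
        return True


# Batch: the month's salaries are totalled together.
payroll_total = process_batch([1200, 950, 1430])

# Real-time: each withdrawal gets its own immediate answer.
account = RealTimeAccount(balance=500)
first = account.withdraw(200)
second = account.withdraw(400)
```

The batch function sees the whole data set at once, while the account object has to decide on every request using only the state it holds at that moment.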
Online Processing:
Data is automatically fed into the CPU as soon as it becomes available.
Used for continuous processing of data.
E.g.: barcode scanning

Multiprocessing:
Data is broken down into frames and processed using two or more
CPUs within a single computer system. Also known as parallel
processing.
E.g.: weather forecasting
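
As a rough sketch of multiprocessing, the Python standard library's `multiprocessing.Pool` can spread frames of data across worker processes. The frame size, the number of workers, and the use of a simple sum as the per-frame work are assumptions chosen for illustration; real workloads such as weather models are far heavier.

```python
from multiprocessing import Pool


def split_into_frames(data, frame_size):
    # Break the data into consecutive frames of at most frame_size items.
    return [data[i:i + frame_size] for i in range(0, len(data), frame_size)]


def frame_total(frame):
    # The work done on one frame; each call may run on a different CPU.
    return sum(frame)


def parallel_total(data, frame_size=4, workers=2):
    frames = split_into_frames(data, frame_size)
    # Each worker process handles some of the frames in parallel.
    with Pool(processes=workers) as pool:
        partials = pool.map(frame_total, frames)
    # Combine the partial results into the final answer.
    return sum(partials)


if __name__ == "__main__":
    readings = list(range(1, 11))
    print(parallel_total(readings))
```

The `if __name__ == "__main__":` guard matters here: on platforms that start worker processes by re-importing the script, it prevents each worker from launching a pool of its own.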
Time-sharing:
Allocates computer resources and data in time slots to several users
simultaneously.
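
One common way to hand out time slots is round-robin scheduling, sketched below. The source does not name a scheduling policy, so round-robin is an assumption, as are the user names and the amounts of work each user needs.

```python
from collections import deque


def time_share(jobs, time_slot):
    """jobs maps each user to the units of work needed.

    Returns the list of (user, units_used) slots in the order they were granted.
    """
    queue = deque(jobs.items())
    schedule = []
    while queue:
        user, remaining = queue.popleft()
        used = min(time_slot, remaining)
        schedule.append((user, used))
        remaining -= used
        if remaining > 0:
            # Unfinished work goes to the back of the queue for another slot.
            queue.append((user, remaining))
    return schedule


schedule = time_share({"alice": 5, "bob": 2, "carol": 3}, time_slot=2)
```

No user waits for another user's whole job to finish; each gets a short slot in turn, which is why the system appears to serve everyone at the same time.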
Examples of Data Processing

✓A stock trading application that converts millions of stock data points
into a simple graph
✓An e-commerce company uses the search history of customers to
recommend similar products
✓A digital marketing company uses demographic data of people to
strategize location-specific campaigns
✓A self-driving car uses real-time data from sensors to detect if there
are pedestrians and other cars on the road
Access to all Learning Materials

elearning.mzumbe.ac.tz
Data Processing Methods
• Manual Data Processing

This data processing method is handled manually. The entire process of
data collection, filtering, sorting, calculation, and other logical
operations is done by people, without the use of any electronic device
or automation software. It requires few or no tools and little up-front
investment, but it is error-prone, slow, and tedious, and it incurs high
labor costs.
• Mechanical Data Processing

Data is processed mechanically through the use of devices and
machines. These can include simple devices such as calculators,
typewriters, and printing presses. Simple data processing operations can
be achieved with this method. It produces far fewer errors than manual
data processing, but growing data volumes have made this method more
complex and difficult.
• Electronic Data Processing

Data is processed with modern technologies using data processing
software and programs. A set of instructions is given to the software to
process the data and yield output. This method is the most expensive
but provides the fastest processing speeds with the highest reliability
and accuracy of output.
Moving From Data Processing to Analytics
• The most significant game-changer in today’s business world is big
data.

• Although it involves handling a staggering amount of information, the
rewards are undeniable.

• Analytics, the process of finding, interpreting, and communicating
meaningful patterns in data, is the next logical step after data
processing.
• But no matter which of these processes data scientists are using, the
sheer volume of data and the analysis of its processed forms require
greater storage and access capabilities, which leads us to the next
section!

• The future of data processing can best be summed up in one short
phrase: cloud computing.
Cloud Computing
• While the six steps of data processing remain unchanged, cloud
technology has brought major advances in data processing that give
data analysts and scientists the fastest, most advanced, most
cost-effective, and most efficient data processing methods available
today.

• In the simplest terms, cloud computing means storing and accessing
data and programs on remote servers hosted on the internet instead of
on the computer’s hard drive or a local server.
Cloud computing is also referred to as Internet-based computing.

Hosting a cloud: There are three layers in cloud computing. Companies
use these layers based on the service they provide.
➢Infrastructure
➢Platform
➢Application
Now, let’s have a look at hosting
Let’s say you have a company with a website, and members exchange a lot
of messages through it.
You start with a few members talking with each other, and then the
number of members gradually increases.
As time passes and the number of members grows, there is more traffic
on the network and your server slows down. This causes a problem. A few
years ago, websites were hosted on a physical server somewhere, so to
handle the growth you had to buy and set up additional servers
yourself.
• This costs a lot of money and takes a lot of time. You pay for these
servers both when you are using them and when you are not.
• Cloud hosting overcomes this problem. With cloud computing, you have
access to computing power when you need it. Your website is placed on
a cloud server just as you would place it on a dedicated server. When
people start visiting your website and you suddenly need more
computing power, you scale up according to the need.
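
The cost argument can be made concrete with a small calculation. The sketch below compares a fixed fleet sized for the busiest hour with a fleet that is resized every hour. All numbers are hypothetical, and it assumes the same price per server-hour in both cases, which real providers do not guarantee.

```python
import math


def servers_needed(requests_per_hour, capacity_per_server):
    # At least one server stays up, even when traffic is very low.
    return max(1, math.ceil(requests_per_hour / capacity_per_server))


def dedicated_cost(hourly_load, capacity, price):
    # Fixed fleet sized for the peak hour and paid for every hour.
    peak = max(servers_needed(load, capacity) for load in hourly_load)
    return peak * price * len(hourly_load)


def cloud_cost(hourly_load, capacity, price):
    # Fleet resized each hour; pay only for the servers in use that hour.
    return sum(servers_needed(load, capacity) * price for load in hourly_load)


# Hypothetical requests per hour, with a spike in the fourth hour.
load = [200, 300, 1500, 4000, 900, 250]
fixed = dedicated_cost(load, capacity=1000, price=1.0)
elastic = cloud_cost(load, capacity=1000, price=1.0)
```

With these figures the fixed fleet needs four servers for all six hours (24 server-hours), while the scaled fleet uses 10 server-hours, because it only grows to four servers during the spike.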
Types of Cloud Computing Services
Most cloud computing services fall into three categories:
➢Software as a service (SaaS)
➢Platform as a service (PaaS)
➢Infrastructure as a service (IaaS)
Software as a Service (SaaS)
Software-as-a-Service (SaaS) is a way of delivering services and
applications over the Internet. Instead of installing and maintaining
software, we simply access it via the Internet, freeing ourselves from
complex software and hardware management.

It removes the need to install and run applications on our own
computers or in our own data centers, eliminating the expense of
hardware as well as software maintenance.
• SaaS provides a complete software solution that you purchase on a
pay-as-you-go basis from a cloud service provider. Most SaaS
applications can be run directly from a web browser without any
downloads or installations required.

• SaaS applications are sometimes called web-based software,
on-demand software, or hosted software.
Advantages of SaaS
• Cost-Effective: Pay only for what you use.

• Reduced time: Users can run most SaaS apps directly from their web
browser without needing to download and install any software. This
reduces the time spent on installation and configuration and can
reduce the issues that come with them.

• Accessibility: App data can be accessed from anywhere.


Platform as a Service
PaaS is a category of cloud computing that provides a platform and
environment to allow developers to build applications and services
over the internet. PaaS services are hosted in the cloud and accessed
by users simply via their web browser.

A PaaS provider hosts the hardware and software on its own
infrastructure. As a result, PaaS frees users from having to install
in-house hardware and software to develop or run a new application.
Thus, the development and deployment of the application take place
independent of the hardware.
Advantages of PaaS:
Simple and convenient for users: It provides much of the infrastructure
and other IT services, which users can access from anywhere.

Cost-Effective: It charges for the services provided on a per-use basis,
thus eliminating the expenses one may have for on-premises hardware
and software.

Efficiency: It allows for higher-level programming with reduced
complexity; thus, the overall development of the application can be
more effective.
Infrastructure as a Service (IaaS)
• Infrastructure as a service (IaaS) is a service model that delivers
computer infrastructure on an outsourced basis to support various
operations. Typically, IaaS provides enterprises with outsourced
infrastructure such as networking equipment, devices, databases, and
web servers.
• It simply provides the underlying operating systems, security,
networking, and servers for developing applications and services and
for deploying development tools, databases, etc.

Advantages of IaaS:
• Cost-Effective: Eliminates capital expense and reduces ongoing cost.
IaaS customers pay on a per-use basis, typically by the hour,
week, or month.
• Website hosting: Running websites using IaaS can be less expensive
than traditional web hosting.
• Security: The IaaS Cloud Provider may provide better security than
your existing software.
• Maintenance: There is no need to manage the underlying data center
or the introduction of new releases of the development or underlying
software. This is all handled by the IaaS cloud provider.
• Providers of infrastructure as a service include Amazon Web
Services, Bluestack, IBM, OpenStack, Rackspace, and VMware.
Communication Technologies
• Communication technology refers to all the tools used to send,
receive, and process information. In today’s fast-paced climate,
efficiency and convenience are the keys to successful communication
technology.

• Communication technology is the transmission of messages between
people by means of a machine. This transfer of information can assist
humans in making decisions, solving issues, and controlling machines.
Types of Communication technologies
➢Telegraph
➢Telephone
➢Radio
➢Satellite
➢Internet
Computerisation and Digitalisation
• Computerization refers to the process of automating manual tasks
and replacing them with computer systems or software. It involves
the use of computers to streamline and optimize various operations,
such as data entry, calculations, and record-keeping. Essentially,
computerization aims to improve efficiency and accuracy by
eliminating the need for human intervention in repetitive or time-
consuming tasks.
• Computerization refers to the process of automating manual tasks or
processes by using computers and computer software. It involves the
conversion of analog information into digital format, enabling
machines to perform tasks that were previously done manually.
Computerization focuses on the mechanization and streamlining of
specific tasks or operations.
• With computerization, businesses and organizations can replace
human labor with automated systems, reducing the potential for
human error and increasing efficiency. This approach typically
involves the use of specialized software or systems that are designed
to carry out specific functions, such as data processing, inventory
management, or financial calculations.
• For example, in a manufacturing setting, computerization may involve
the implementation of computer-controlled machines or robots to
perform repetitive tasks, thereby increasing productivity and
precision.
• Digitalization, on the other hand, goes beyond mere automation. It
encompasses the transformation of analog information into digital
formats, enabling the storage, processing, and transmission of data in
a digital environment. Digitalization involves the integration of digital
technologies and platforms to revolutionize business processes,
communication, and information management. It encompasses a
wide range of activities, including digitizing documents, adopting
cloud computing, implementing data analytics, and leveraging
artificial intelligence.
• Digitalization encompasses a broader scope than computerization. It
refers to the transformation of analog information, processes, and
systems into digital form, enabling the integration and utilization of
digital technologies across various aspects of an organization.
• Through digitalization, businesses can harness the power of
technologies such as cloud computing, artificial intelligence, big data
analytics, and the Internet of Things (IoT) to transform their
operations, business models, and customer interactions. This
transformation enables organizations to gain valuable insights, make
data-driven decisions, and adapt to the rapidly evolving digital
landscape.
• In summary, while computerization focuses on automating specific
tasks or processes using computers and software, digitalization
encompasses a broader transformation towards digital technologies
and strategies that revolutionize entire systems and operations.
