Learning Outcomes
Define Data Operation Management.
Understand its purpose, principles, and key components.
Know the tools for Data Operation Management.
Time Frame
1 hour
Introduction
The DataOps concept draws heavily on DevOps, according to which infrastructure and
development teams should work together so that projects can be managed efficiently.
DataOps covers many subjects within its field of action, for example data acquisition
and transformation, cleaning, storage, backup, scalability, governance, security, and
predictive analysis.
Analysis
1. What is Data Operation Management?
2. What is the purpose of creating Data Operation Management?
3. How do the key components of data operations help data management?
Abstraction
The benefits of DataOps extend across the enterprise. For example:
1. Supports the entire software development life cycle and increases DevTest
speed by the fast and consistent supply of environments for the development
and test teams.
2. Improves quality assurance through the provision of “production-like
data” that enables testing to exercise the test cases effectively before
clients encounter errors.
3. Helps organizations to move safely to the cloud by simplifying and speeding
up the process of data migration to the cloud or other destinations.
4. Supports both data science and machine learning. Any organization’s data
science and artificial intelligence endeavors are only as good as the
information available, so DataOps ensures a reliable flow of data for
ingestion and learning.
5. Helps with compliance and establishes standardized data security policies and
controls, so that data flows smoothly without putting clients at risk.
Put All Steps into Version Control – Many stages of processing turn raw data into
useful information for stakeholders. To be valuable, data must progress through these
steps, linked together in some way, with the ultimate goal of producing a
data-analytics output.
Branch & Merge – Branching and merging are a major productivity boost for a Data
Analytics team, because they let several people change the same source code files at
the same time. Each team member controls a private work environment in which to test
programs, make changes, and take risks.
Use Multiple Environments – Every Data Analytics team member has development tools on
a laptop. Version control tools allow each person to work on a private copy of the code
while coordinating with other team members. A team member cannot be productive,
however, without the data that the work requires.
Reuse and Containerize – In DataOps, the analytics team moves at lightning speed by
using highly optimized tools and processes. One of these productivity techniques is to
reuse and containerize. Reusing code means reusing Data Analytics components, which
saves time. A container packages an application's code so that it can run on a
platform such as Docker.
Utilizing Version Control for Data Science Projects – DataOps applies this concept to
data science, where hundreds of data scientists may work together or separately on
many different projects. When data scientists work on their local machines, the data
is saved locally, which slows down productivity. A common repository solves this
problem.
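Two of the practices above, linking every processing step into one tracked pipeline and reusing analytics components, can be sketched together in Python. This is a minimal illustration under stated assumptions, not the method of any particular DataOps product: the step functions (drop_missing, to_float, total_by), the run_pipeline helper, and the sample records are all hypothetical names invented for this example. Because each step is plain code in a file, the whole path from raw data to output can live in a shared repository and be branched, merged, and reviewed like any other source code.

```python
"""Sketch: analytics steps as reusable functions linked into pipelines."""


def drop_missing(rows, required):
    """Reusable step: keep rows that have a value for every required field."""
    return [
        row for row in rows
        if all(row.get(field) not in (None, "") for field in required)
    ]


def to_float(rows, field):
    """Reusable step: convert one field from text to a number."""
    return [{**row, field: float(row[field])} for row in rows]


def total_by(rows, key, field):
    """Reusable step: sum a numeric field for each distinct key value."""
    totals = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0.0) + row[field]
    return totals


def run_pipeline(rows, steps):
    """Apply each step in order, feeding its output to the next one."""
    for step in steps:
        rows = step(rows)
    return rows


# Two different pipelines built from the same three components.
sales_pipeline = [
    lambda rows: drop_missing(rows, ["region", "amount"]),
    lambda rows: to_float(rows, "amount"),
    lambda rows: total_by(rows, "region", "amount"),
]
survey_pipeline = [
    lambda rows: drop_missing(rows, ["site", "score"]),
    lambda rows: to_float(rows, "score"),
    lambda rows: total_by(rows, "site", "score"),
]

# Hypothetical raw inputs, including incomplete records.
raw_sales = [
    {"region": "north", "amount": "10.5"},
    {"region": "south", "amount": ""},
    {"region": "north", "amount": "4.5"},
]
raw_surveys = [
    {"site": "A", "score": "3"},
    {"site": "B", "score": "4"},
    {"site": "A", "score": None},
]

sales_totals = run_pipeline(raw_sales, sales_pipeline)
survey_totals = run_pipeline(raw_surveys, survey_pipeline)
print(sales_totals)
print(survey_totals)
```

In a containerized setup, a module like this would typically be packaged into an image, for instance with Docker, so that it runs the same way on every machine. That packaging step is not shown here.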
o Versioning
o Self-service
o Democratize data
o Platform Approach
o Be open source
o Team makeup and organisation
o Unified platform for all data: historical and real-time production
o Multi-tenancy and Resource Utilisation.
o Access Model and Single Security for governance and self-service access.
There are four key software components of a DataOps Platform: data pipeline
orchestration, testing and production quality, deployment automation, and data science
model deployment / sandbox management. Below is our running list of the vendors in
each group.
dbt (data build tool) is a command-line tool that enables data analysts
and engineers to transform data in their warehouse more effectively.
Learning Outcomes
Define Data Security Management.
Identify security threats and how to manage them.
Know the best practices in data protection.
Understand the use of security tools.
Time Frame
1 hour
Introduction
Data security has become even more complicated with today’s hybrid
environments. Coordinated security management is essential to a range of critical
tasks, including ensuring that each user has exactly the right access to data and
applications, and that no sensitive data is overexposed.
Analysis
1. In your own words, define Data Security Management.
2. How can you protect your data from security threats?
3. What is your approach to securing data?
Abstraction
What is Data Security Management?
Data security management involves a variety of techniques, processes and
practices for keeping business data safe and inaccessible to unauthorized parties. Data
security management systems focus on protecting sensitive data, like personal
information or business-critical intellectual property. For example, data security
management can involve creating information security policies, identifying security
risks, and spotting and assessing security threats to IT systems. Another critical
practice is sharing knowledge about data security best practices with employees
across the organization — for example, exercising caution when opening email
attachments.
Data security threats and how to manage them
There are many different threats to data security, and they are constantly evolving, so
no list is authoritative. But here are the most common threats you need to keep an eye
on and teach your users about:
DDoS attack — Distributed denial of service attacks attempt to make your servers
unusable. To mitigate the risk, consider investing in an intrusion detection system
(IDS) or intrusion prevention system (IPS) that inspects network traffic and logs
potentially malicious activity.
Phishing scams — This common social engineering technique attempts to trick users
into opening malicious attachments in phishing emails. Solutions include establishing
a cybersecurity-centric culture and using a tool to automatically block spam and
phishing messages so users never see them.
Hackers — This is an umbrella term for the actors behind the attacks listed above.
Third parties — Partners and contractors who lack sufficient network security can
leave interconnected systems open to attacks, or they can directly misuse the
permissions they’ve been granted in your IT environment.
Malicious insiders — Some employees steal data or damage systems deliberately, for
example, to use the information to set up a competing business, sell it on the black
market or take revenge on the employer for a real or perceived problem.
Mistakes — Users and admins can also make innocent but costly mistakes, such as
copying files to their personal devices, accidentally attaching a file with sensitive
data to an email, or sending confidential information to the wrong recipient.
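The DDoS entry above mentions systems that inspect network traffic and log potentially malicious activity. As a rough sketch of that idea only, the Python below counts requests per source address inside a sliding time window and flags any source that exceeds a limit. The function name find_flooding_sources, the thresholds, and the sample events are assumptions made for illustration. Real IDS and IPS products use far richer signals, and a truly distributed attack spreads its traffic over many sources, so per-source counting is a deliberate simplification.

```python
from collections import defaultdict, deque


def find_flooding_sources(events, max_requests, window_seconds):
    """Return source addresses that send more than max_requests
    within any sliding window of window_seconds.

    events is an iterable of (timestamp_in_seconds, source_address) pairs.
    """
    recent = defaultdict(deque)  # source -> timestamps still inside the window
    flagged = set()
    for timestamp, source in sorted(events):
        window = recent[source]
        window.append(timestamp)
        # Forget requests that have fallen out of the window.
        while window and timestamp - window[0] >= window_seconds:
            window.popleft()
        if len(window) > max_requests:
            flagged.add(source)
    return flagged


# Hypothetical traffic log: one source sends a burst, another is quiet.
events = [
    (0.0, "203.0.113.5"),
    (0.1, "203.0.113.5"),
    (0.2, "203.0.113.5"),
    (0.3, "203.0.113.5"),
    (0.0, "198.51.100.7"),
    (5.0, "198.51.100.7"),
]

suspects = find_flooding_sources(events, max_requests=3, window_seconds=1.0)
print(suspects)
```

A flagged source would normally be logged for review or rate-limited rather than blocked outright, since a busy legitimate client can also exceed a simple threshold.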
Data protection best practices
To build a layered defense strategy, it’s critical to understand your cybersecurity risks
and how you intend to reduce them. It’s also important to have a way to measure the
business impact of your efforts, so you can ensure you are making appropriate
security investments.
The following operational and technical best practices can help you mitigate data
security risks:
Classify data based on its value and sensitivity. Get a comprehensive inventory of all
the data you have, both on premises and in the cloud, and classify it. Like most data
security methods, data classification is best when it’s automated. Instead of relying on
busy employees and error-prone manual processes, look for a solution that will accurately
and reliably classify sensitive data like credit card numbers or medical records.
Run vulnerability assessments. Proactively look for security gaps and take steps to
reduce your exposure to attacks.
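The classification practice above recommends automation for sensitive data such as credit card numbers. The Python sketch below shows one simple way such a check might work: it searches text for digit sequences of card-number length and confirms each candidate with the Luhn checksum before labelling the text. The labels "sensitive" and "general", the function names, and the sample strings are assumptions for illustration. A production classifier would cover many more data types, such as medical records, and would not rely on one pattern.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def passes_luhn(digits):
    """Return True if a string of digits satisfies the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def classify(text):
    """Label text 'sensitive' if it contains a plausible card number."""
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if passes_luhn(digits):
            return "sensitive"
    return "general"


# 4111 1111 1111 1111 is a widely used test number that passes the checksum.
print(classify("Card on file: 4111 1111 1111 1111."))
print(classify("Meeting moved to room 204 at 3 pm."))
```

The checksum step reduces false positives: a long order or tracking number matches the digit pattern but usually fails the Luhn test, so it is not labelled sensitive.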