• Data collection should be “opt-in” rather than “opt-out,” or in other words, data
collection should not be turned on by default and no data should be collected without
specific user approval.
• If data will be sent to third parties for processing/storage, the user should also be
informed of this upfront.
• A smart data collection policy goes a long way in establishing goodwill and customer
satisfaction.
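An opt-in policy like the one described above can be sketched as a settings object whose defaults are all "off". This is a minimal illustration, not a prescribed design; the class and field names (ConsentSettings, collect_usage_data, share_with_third_parties) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConsentSettings:
    """Per-user data collection preferences (hypothetical field names)."""
    collect_usage_data: bool = False        # opt-in: off until the user enables it
    share_with_third_parties: bool = False  # separate, explicit approval


def may_collect(settings: ConsentSettings) -> bool:
    """Collect only when the user has explicitly switched collection on."""
    return settings.collect_usage_data


def may_send_to_third_party(settings: ConsentSettings) -> bool:
    """Third-party processing needs both collection and sharing consent."""
    return settings.collect_usage_data and settings.share_with_third_parties


new_user = ConsentSettings()
print(may_collect(new_user))  # False
```

Because every default is False, a user who never opens the settings page is never tracked, which is the behavior that distinguishes opt-in from opt-out.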
Data Governance
Encryption
• Encrypting sensitive information such as credit card information has
become a standard industry practice.
• Access to data must be controlled, and only approved users should be granted
access.
Anonymizing the Data
• If the data needs to be sent to third parties or even other less secure internal
groups, all potentially identifying information, such as names, addresses,
telephone numbers, and IP addresses, should be scrubbed from the data.
• No data should be shared with third parties without sufficient consent being
obtained from the users who are featured in the data being shared.
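The scrubbing step described above might look like the following sketch. The field names in IDENTIFYING_FIELDS and the IPv4 pattern are assumptions for illustration; a real schema will differ, and the pattern is deliberately rough.

```python
import re

# Hypothetical field names; adjust to the actual schema.
IDENTIFYING_FIELDS = {"name", "address", "telephone", "ip_address"}

# Matches dotted-quad IPv4 addresses embedded in free text (a rough pattern).
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def scrub_record(record: dict) -> dict:
    """Return a copy of the record without identifying fields or inline IPs."""
    cleaned = {}
    for key, value in record.items():
        if key in IDENTIFYING_FIELDS:
            continue  # drop the field entirely
        if isinstance(value, str):
            value = IPV4_PATTERN.sub("[redacted]", value)
        cleaned[key] = value
    return cleaned


record = {
    "name": "Jane Doe",
    "ip_address": "203.0.113.7",
    "plan": "premium",
    "note": "Logged in from 203.0.113.7 twice",
}
print(scrub_record(record))
# {'plan': 'premium', 'note': 'Logged in from [redacted] twice'}
```

Dropping named fields is only a first pass: combinations of the remaining fields can sometimes still identify a person, so the output should be reviewed before it leaves the organization.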
Creating a Data Governance Board
• In order to develop the initial data governance policies, a data
governance board can be constituted.
• The board should be formed with people who can drive these big
decisions. The necessity may arise for the board to push through
difficult decisions that are at odds with the aims of the organization in
order to protect the rights of the people whose data is at risk.
Initiating Data Governance
• It is always easier to start with an existing set of data governance
rules and then adapt the rules to fit your organization.
• Your data governance board will aid in making key decisions for which
policies may not yet have been established, setting precedents, and
then instating newer policies as the organization evolves and grows to
handle more data.
• Such a process will help to ensure that the costs of governing the data
do not exceed the benefits derived from it.
HIPAA
• The Health Insurance Portability and Accountability Act (HIPAA) dictates the
procedures to be followed and the safeguards to be adopted regarding medical
data. If you are dealing with medical data, it is critical to be compliant with these
laws and regulations from the get-go.
• The Health Information Technology for Economic and Clinical Health Act
(HITECH Act), enacted in 2009 and implemented through the HIPAA Omnibus Rule in
2013, makes it mandatory to report breaches that affect 500 or more people to the
U.S. Department of Health and Human Services, the media, and the persons
affected. Only authorized entities are allowed to access patients' medical data.
• With this in mind, an organization should be careful that the data being sourced is
not violating any provisions of HIPAA or HITECH and that it is ethically and legally
sourced.
GDPR
• In 2018, the European Union's General Data Protection Regulation (GDPR), a new
set of privacy rules, took effect. These rules treat the user as the data owner,
irrespective of where the data is stored. Under GDPR, consent to data collection
must be explicit, and any implicit consent, such as “fine print” stating that
signing up for an account implies that your data can automatically be collected,
is in contravention of GDPR.
• GDPR also mandates that requesting deletion of user data should be as simple as
giving consent. Users must be made aware of their rights under the regulation, as
well as how their data is processed, what data is being collected, and how long it
will be retained, among other things. GDPR is a step in the right direction for
user privacy, aimed at protecting users from data harvesters and unethical data
collection.
• The responsibility and accountability have been put squarely on the shoulders of
the data collectors under GDPR. The data controller is responsible for the
nondisclosure of data to unauthorized third parties. The data controller is
required to report any breaches of privacy to the supervisory authority; however,
notifying users is not a mandatory requirement if the data was disclosed in an
encrypted format.
• Although these regulations are only legally applicable to users in the European
Union, adopting GDPR policies for users across the globe will put your
organization at the forefront for compliance and data governance practices.
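The retention and deletion points above can be made concrete with a small selection routine. This is a sketch under stated assumptions: the record fields (collected_at, deletion_requested) are hypothetical, and the 365-day period is an arbitrary example, not a value taken from GDPR.

```python
from datetime import datetime, timedelta, timezone


def is_past_retention(collected_at: datetime, retention_days: int,
                      now: datetime) -> bool:
    """True when a record has been held longer than the stated retention period."""
    return now - collected_at > timedelta(days=retention_days)


def records_to_delete(records: list, retention_days: int,
                      now: datetime) -> list:
    """Select records due for deletion: expired, or deletion was requested."""
    return [
        r for r in records
        if r["deletion_requested"]
        or is_past_retention(r["collected_at"], retention_days, now)
    ]


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
records = [
    {"user": "u1", "collected_at": datetime(2024, 5, 1, tzinfo=timezone.utc),
     "deletion_requested": False},
    {"user": "u2", "collected_at": datetime(2023, 1, 1, tzinfo=timezone.utc),
     "deletion_requested": False},
    {"user": "u3", "collected_at": datetime(2024, 5, 20, tzinfo=timezone.utc),
     "deletion_requested": True},
]
print([r["user"] for r in records_to_delete(records, retention_days=365, now=now)])
# ['u2', 'u3']
```

Treating a user's deletion request and an expired retention period as the same trigger keeps the deletion path simple, which matches the requirement that deleting data be as easy as consenting to its collection.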
Data-Responsible AI-Oriented Organization
• Data governance might seem like a daunting task, but with the help of
a solid plan, it can be managed just like everything else.
Pitfall 1: Insufficient Data Licensing
• Using unlicensed data for your use case is the quickest way to derail a
system just as it is about to launch.
• In the worst case, you find the issue when the data owners bring legal
action against your organization.
• Data stores should be carefully designed from the start of the project.
Data leakage can lead to major trust issues among your customers
and can prove to be very costly.
Pitfall 3: Insufficient Data Security
• Customer data should be stored only in an encrypted format. This will ensure that even if
the entire database is leaked, the data will be meaningless to the hackers.
• Confirm that the selected encryption method has sufficient key strength and is an
industry standard, such as RSA (Rivest–Shamir–Adleman) or Advanced Encryption
Standard (AES).
• The keys should not be stored in the same location as the data store. Otherwise, you
could have the most advanced encryption in the world and it would still be useless.
• Specialists in security testing think like hackers and use the same tools that
hackers use to try to break into your system, then give you precise recommendations
to improve your security.
• Although getting security right on the first attempt might not be possible, it is
nonetheless necessary to take the first steps and consider security from the
beginning of the design phase.
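The advice above about keeping keys away from the data store can be illustrated by loading the key from the process environment rather than from the database or its configuration. This is a minimal sketch: the variable name DATA_ENCRYPTION_KEY is hypothetical, the 32-byte length assumes an AES-256 key, and a dedicated secrets manager would be a common alternative to an environment variable.

```python
import base64
import os
from typing import Mapping, Optional

KEY_ENV_VAR = "DATA_ENCRYPTION_KEY"  # hypothetical variable name
AES_256_KEY_BYTES = 32               # assumes an AES-256 key


def load_encryption_key(env: Optional[Mapping[str, str]] = None) -> bytes:
    """Read a base64-encoded key from the environment, not from the data store."""
    env = os.environ if env is None else env
    encoded = env.get(KEY_ENV_VAR)
    if encoded is None:
        raise RuntimeError(f"{KEY_ENV_VAR} is not set; refusing to start")
    key = base64.b64decode(encoded)
    if len(key) != AES_256_KEY_BYTES:
        raise ValueError(
            f"expected a {AES_256_KEY_BYTES}-byte key, got {len(key)} bytes"
        )
    return key


# Usage with an explicit mapping standing in for the real environment.
example_env = {KEY_ENV_VAR: base64.b64encode(os.urandom(32)).decode("ascii")}
print(len(load_encryption_key(example_env)))  # 32
```

Failing loudly when the key is missing or the wrong length is a deliberate choice here: a service that silently falls back to storing plaintext would defeat the purpose of encrypting the data store.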
Pitfall 4: Ignoring User Privacy
• Dark designs are design choices that trick the user into giving away their privacy.
These designs work in such a way that a user might have given consent for their
data to be analyzed/stored without the user understanding what they have
consented to.
• A quick way to judge whether your design choices are ethical is to check whether
answering “no” to data collection and analysis imposes any penalty on the user
beyond losing the results of the analysis.
• If third-party vendors are used for data analysis, it becomes imperative to ensure
that anonymization of the data has taken place. This is to lessen the likelihood
that the third party will misuse the data.
• Cambridge Analytica was able to abuse Facebook's terms of service because Facebook
merely relied on the good nature and assumed integrity of Cambridge Analytica's
practices. This could have been avoided if Facebook had taken proper security
measures in advance.
Pitfall 5: Backups
• Testing the backups is a step that is frequently missed, which leads to
problems when the system actually breaks. Untested backups may fail to
recover lost data, produce errors, or take a long time to restore, costing
the organization time and money.
• With cloud storage becoming so commonplace, it is essential to remember that the
cloud is “just another person's computer” and it can go down, too.
• Although cloud solutions are typically more stable than a homegrown solution because
they are able to rely on the economies of scale and the intelligence of industry experts,
they can still have issues.
• Relying only on cloud backups may make your life easier in the short term, but it is a bad
long-term strategy. Cloud providers could turn off their systems, or they could have
downtime just when you need to run a critical data recovery procedure.
• With encrypted backups, you will have peace of mind and your
customers will sleep soundly, knowing their data is safe.
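The warning about untested backups can be turned into a routine check: restore the backup to a scratch location and compare it with the original. The sketch below uses only the Python standard library; the function names are hypothetical, and the file copy is a stand-in for whatever restore procedure the real backup system uses.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(original: Path, backup: Path) -> bool:
    """Restore the backup to a scratch directory and compare it to the original."""
    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / original.name
        shutil.copy2(backup, restored)  # stand-in for the real restore procedure
        return sha256_of(restored) == sha256_of(original)


with tempfile.TemporaryDirectory() as workdir:
    source = Path(workdir) / "customers.db"
    source.write_bytes(b"example data")
    copy = Path(workdir) / "customers.db.bak"
    shutil.copy2(source, copy)
    print(verify_backup(source, copy))  # True
    copy.write_bytes(b"corrupted")
    print(verify_backup(source, copy))  # False
```

Running a check like this on a schedule surfaces a broken backup long before it is needed, and it also gives a rough measure of how long a restore takes.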
Questions?