
How to protect personal information in an application. Part 2.

Posted on: May 29, 2022 at 02:21 PM

In Part 1, I talked about the general principles of protecting personal data, in particular about the “onion” pattern, and briefly touched on the physical and the administrative types of protection.

Technical protection type: general

Now it is time to move on to the most interesting part: the technical protection of your apps.

This subject can be divided into several parts:

  1. General methods of protecting PII.
  2. Special methods.

Let’s start with the first group.

Password Manager

Before we begin, ask yourself these questions:

  1. Do I store my passwords in a secure place, and not in a text file, in a notebook or on a piece of paper next to the computer?
  2. Are all my passwords complex enough (> 12 characters, with at least 1–2 digits, 1–2 special characters, and at least 2–3 capital letters)?
  3. Do I use a different password for each account?
  4. Do I protect information that is shared with me (accesses, keys, etc.)?

If you have answered “No” to any of these questions, it is imperative to make a change. Luckily, there is special software that solves these problems:

Software for the secure storage of passwords and sensitive information

  1. 1Password. It requires a monthly subscription, but it is a convenient password manager available on all platforms. Synchronization goes through the cloud (you can use either their cloud or third-party solutions such as Dropbox, iCloud, etc.). A user account is not required for this service.
  2. LastPass. The basic functionality is free and available on all platforms. Additional features (sharing, user management, etc.) are paid, but the price is low: anywhere from $2 to $5. The data is stored in encrypted form in the cloud. This service requires a user account.
  3. KeePass. A free, open-source password manager, with the pros and cons typical of almost any open-source project.
  4. Dashlane. A shareware password manager. Available on all platforms, but synchronization is paid ($3.33). Additional functions are also available, but only with a paid subscription.
  5. Zoho Vault. Part of the Zoho package; it requires a Zoho user account.
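
Password managers usually include a generator, so nobody has to invent complex passwords by hand. As a rough illustration of the idea (not the algorithm of any product listed above), here is a minimal Python sketch that generates a password satisfying the rule from question 2. The set of special characters and the function names are assumptions made for the example, and the lower bound of each range in the rule is used.

```python
import secrets
import string

# Assumption: which characters count as "special" is chosen for this example.
SPECIALS = "!@#$%^&*()-_=+"


def meets_policy(password: str) -> bool:
    """Check the rule: > 12 characters, >= 1 digit, >= 1 special, >= 2 capitals."""
    digits = sum(c.isdigit() for c in password)
    specials = sum(c in SPECIALS for c in password)
    capitals = sum(c.isupper() for c in password)
    return len(password) > 12 and digits >= 1 and specials >= 1 and capitals >= 2


def generate_password(length: int = 16) -> str:
    """Draw random passwords until one satisfies the policy."""
    if length <= 12:
        raise ValueError("the rule asks for more than 12 characters")
    alphabet = string.ascii_letters + string.digits + SPECIALS
    while True:
        # secrets uses the operating system's cryptographic random source.
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        if meets_policy(candidate):
            return candidate
```

Calling `generate_password()` returns a different 16-character password each time; the result belongs in the password manager, not in a text file.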

Sharing secret information

How do you share your personal information, passwords or accesses with friends and clients? Through Slack? Through Skype? By e-mail? None of these methods is reliable enough: there is always a risk that attackers will find a back door into the service’s database and gain access to important information.

Therefore, instead of sending the information itself, it makes sense to send a link where this information can be obtained. To open the link, you need to enter a password, and after some time the link expires and is no longer available. The password for opening the link can also be the answer to a personal question. Remember the last part of Harry Potter, when the characters interrogated their friends with personal questions to find out whether they were under the Polyjuice potion?

You can use such software in the same way: for example, “the password for the link is the name of your first project in the company”, etc.
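
To make the flow concrete, here is a minimal in-memory Python sketch of an expiring, password-protected link. It is an illustration under stated assumptions, not how the tools listed below are implemented: the class and method names are hypothetical, the secret is kept unencrypted in process memory, and a real service would also encrypt the payload and limit the number of attempts.

```python
import hashlib
import hmac
import secrets
import time


class SecretLinkStore:
    """Hypothetical store: a secret is fetched by token + password until it expires."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable clock, convenient for testing
        self._items = {}

    @staticmethod
    def _hash(password: str, salt: bytes) -> bytes:
        # Only a salted, slow hash of the password is kept, never the password itself.
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def put(self, secret: str, password: str, ttl_seconds: int) -> str:
        """Store a secret and return the random token to put into the link."""
        token = secrets.token_urlsafe(16)
        salt = secrets.token_bytes(16)
        self._items[token] = {
            "secret": secret,
            "salt": salt,
            "password_hash": self._hash(password, salt),
            "expires_at": self._clock() + ttl_seconds,
        }
        return token

    def get(self, token: str, password: str):
        """Return the secret, or None if the link is unknown, expired or the password is wrong."""
        item = self._items.get(token)
        if item is None:
            return None
        if self._clock() >= item["expires_at"]:
            del self._items[token]  # expired links are removed for good
            return None
        candidate = self._hash(password, item["salt"])
        if not hmac.compare_digest(item["password_hash"], candidate):
            return None
        return item["secret"]
```

The sender calls `put(...)` and shares only the token as part of a link; the recipient supplies the agreed answer as the password before the time-to-live runs out.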

Examples of similar software:

  1. Caesarapp has a nice interface, lets you set how many times a message can be decrypted and when it expires, and allows you to attach a file. Encryption and decryption take place in the browser, so the server does not handle the plaintext. It is even nicer that the software is developed by the 4xxi team on a Symfony (backend) and Vue.js (frontend) stack.
  2. PasteVault is the simplest vault for exchanging textual information, with the ability to set an expiration date. Unfortunately, it has not been updated for the last 5 years, but it still works well.

Installing such software on a server takes from 5 minutes (Docker for Caesarapp) to 60 minutes (PasteVault, LEMP stack).

Storing access to source code repositories

Please, never store passwords/accesses in repositories on GitHub and Bitbucket. Seriously. The probability that an attacker will gain access to a private repository is not 0%. Moreover, there are special programs that scan code for secrets such as AWS keys. If a hacker finds a key, voilà: EC2 instances get created on your account. If you don’t believe me, read the story of the developer who almost got hit with a $6,000 bill.
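
The same kind of scan can be run on your own repository before an attacker does it. The Python sketch below is a minimal illustration, not a replacement for a dedicated secret scanner: it only looks for strings shaped like AWS access key IDs (the prefix `AKIA` followed by 16 uppercase letters or digits), and the function names are made up for the example.

```python
import re
from pathlib import Path

# Long-term AWS access key IDs look like "AKIA" + 16 uppercase letters/digits.
AWS_KEY_ID = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")


def find_aws_key_ids(text: str) -> list[str]:
    """Return every substring that looks like an AWS access key ID."""
    return AWS_KEY_ID.findall(text)


def scan_tree(root: str) -> dict[str, list[str]]:
    """Scan all files under root (skipping .git) and map file path -> suspicious strings."""
    findings = {}
    for path in Path(root).rglob("*"):
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file, nothing to report
        hits = find_aws_key_ids(text)
        if hits:
            findings[str(path)] = hits
    return findings
```

Running `scan_tree(".")` in a checkout before pushing gives a quick list of files to clean up; anything it finds should be treated as already leaked and rotated.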

Sharing code and screenshots

Another rule that I usually go over with all new developers: never, you hear, never pass code and/or screenshots of anything through public services.

You will be surprised how many server passwords and WordPress configs with real data (and access to servers) can be found, for example, on an open Pastebin in just 10 seconds of searching.

An example of a real screenshot with information from Pastebin: top left, access to a real SendGrid account; on the right, access to a server with Symfony; bottom left, access to WordPress. The search took less than a minute.

How to share code?

  1. Through Slack snippets. That is already better.
  2. Through private snippets on Bitbucket.
  3. Secret code: through the vault.

How to share screenshots?

Just do not upload them to a public cloud and do not use third-party hosting. Monosnap, for example, can upload screenshots to Dropbox or to S3, where they can easily be deleted after a couple of weeks and where public access can be restricted. Setting up the upload to S3 takes about an hour, plus 10 minutes for each new employee to read the instructions.
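
Both S3 settings mentioned above (automatic deletion and restricted public access) can be expressed as bucket configuration. The Python sketch below builds the two configuration documents and applies them with boto3, the third-party AWS SDK. The bucket name is a placeholder, the 14-day period is an assumption standing in for "a couple of weeks", and the helper names are made up for the example.

```python
def screenshot_lifecycle(days: int = 14) -> dict:
    """Lifecycle configuration that deletes every object `days` days after upload."""
    return {
        "Rules": [
            {
                "ID": f"expire-screenshots-after-{days}-days",
                "Filter": {"Prefix": ""},  # empty prefix: the rule covers the whole bucket
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


# Turns off every form of public access to the bucket.
PUBLIC_ACCESS_BLOCK = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}


def apply_to_bucket(bucket: str, days: int = 14) -> None:
    """Apply both settings; needs boto3 installed and AWS credentials configured."""
    import boto3  # imported here so the builders above work without the SDK

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=screenshot_lifecycle(days)
    )
    s3.put_public_access_block(
        Bucket=bucket, PublicAccessBlockConfiguration=PUBLIC_ACCESS_BLOCK
    )
```

With fully blocked public access, screenshots would be shared through time-limited pre-signed URLs or only with people who have access to the bucket; whether that fits depends on how your team uses them.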

Technical protection type: software

Obfuscation

Very often you need real data from production to find bugs. However, this is not always possible: for example, in a FinTech / MedTech project developers are barred from the real data, while test data is not always sufficient to reproduce bugs.

To solve this, you can apply data obfuscation: a method in which real data is replaced by fake data while the structure is preserved. Often the relationships between the data are kept as well.

That is, if you had a transaction in the name of Petya Ivanov worth $1,000, then after obfuscation it would be John Doe for $234.
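
As a minimal Python sketch of this idea (not the approach of any specific library): each real name is mapped to a fake one through a keyed hash, so the same customer always gets the same fake name and the relationships between rows survive, while amounts are replaced with pseudo-random values. The row layout, the name lists and the function names are assumptions made for the example.

```python
import hashlib
import hmac
import random

FIRST_NAMES = ["John", "Jane", "Alex", "Maria", "Sam"]
LAST_NAMES = ["Doe", "Smith", "Brown", "Lee", "Garcia"]


def _rng(secret: bytes, value: str) -> random.Random:
    """Random generator seeded by a keyed hash: same input + same key -> same output."""
    digest = hmac.new(secret, value.encode(), hashlib.sha256).digest()
    return random.Random(int.from_bytes(digest, "big"))


def fake_name(real_name: str, secret: bytes) -> str:
    rng = _rng(secret, "name:" + real_name)
    return f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"


def fake_amount(row_id: int, secret: bytes) -> float:
    # The fake amount depends on the row id only, never on the real amount.
    rng = _rng(secret, f"amount:{row_id}")
    return round(rng.uniform(1, 2000), 2)


def obfuscate(rows: list[dict], secret: bytes) -> list[dict]:
    """Return rows with the same keys and ids, but fake customers and amounts."""
    return [
        {
            "id": row["id"],
            "customer": fake_name(row["customer"], secret),
            "amount": fake_amount(row["id"], secret),
        }
        for row in rows
    ]
```

The secret key stays on the production side: without it, the mapping from real names to fake ones cannot be recomputed. With such a small name list different customers can collide on the same fake name, so a real setup would use a much larger dictionary.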

How to do it? Of course, there is already a large set of libraries, of which I would highlight two:

  1. Ruby gem “data-anonymization”. It has a very simple and understandable syntax, which makes it easy to describe how production data should be obfuscated. Examples are in the repository itself. Obfuscating a simple database of 30–50 tables takes only 1–2 hours. After that it is a matter of routine: set up the dump and the obfuscation to run automatically. As a result, developers work with almost real data, there is no leak, and everyone is happy.
  2. PHP library neutralizer. Its functionality is similar to the previous one, but it is less flexible. Useful for those who need serious customization and work in PHP.

What are the pros and cons of obfuscation?

Pros:

  1. The data is not artificial test data: it is close to the real thing.
  2. It is easy to reproduce production bugs without real access to the production.
  3. Developers do not interact with PII, which means that we significantly reduce the risk of data loss.
  4. There are no territorial restrictions: developers can be anywhere, while the data stays in the right country. For example, in a MedTech service cross-border transfers of medical data are usually prohibited (in the US and Canada at least), so without obfuscation you would have to place developers in that country to avoid cross-border transfers.

Cons:

  1. Additional time for setting up and supporting obfuscation. Not a lot of time, but still.
  2. Aggregated and denormalized data is difficult to obfuscate.
  3. Sometimes debugging needs real-time data, which the dump-based approach above cannot provide.

Automated checks

There are a number of services that let you quickly and easily set up automated checks of your application for obvious problems such as XSS. I would like to draw your attention to two of them:

OWASP (Open Web Application Security Project). A non-commercial project that aims to improve the security of applications. It has a set of ready-made test cases that cover most common situations. I will not say that the set is very big, but it will certainly help you find and fix a few dozen or a few hundred mistakes. As a result, it produces a nice-looking report. An additional advantage is that it is installed on your own server, so you control the report data.

OWASP Report

Burp. A paid service for automated checks of a server for obvious errors. It is inexpensive, and it adds one more layer of protection (following the “onion” principle).

I am sure that there are other services that solve similar problems. They deserve attention: the investment is minimal, and in return you can be sure that your software does not contain obvious mistakes.

Storage server for secret data

Any modern application contains a number of secrets: database credentials, AWS keys, tokens for social networks, etc.

Often, all this data is stored in configuration files. I already explained why such data does not belong in repositories, where so many people have access, so the configuration files are created and managed only on the server side. However, an attacker who gains access to the server’s source code will be able to get hold of them.

You can move this data to environment variables (by the way, this method will soon be used by default in Symfony). However, there was recently a scandal when some npm packages sent environment variables to outside parties. Thus, this method is not super-reliable either.
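
For illustration, reading secrets from environment variables can look like the minimal Python sketch below. The list of required variables is an assumption made for the example. Note that any code running in the same process, including third-party packages, can read the same variables.

```python
import os

# Assumption: the variables this hypothetical application needs.
REQUIRED = ("DATABASE_URL", "AWS_SECRET_ACCESS_KEY")


def missing_settings(env) -> list[str]:
    """Names of required variables that are absent or empty."""
    return [name for name in REQUIRED if not env.get(name)]


def load_settings(env=os.environ) -> dict[str, str]:
    """Read the required secrets from the environment, failing fast if any is missing."""
    missing = missing_settings(env)
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}
```

Failing at startup with a list of missing names, rather than later with an obscure connection error, keeps the secrets out of the code and out of the error messages.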

An alternative is to use a dedicated server for storing sensitive data. One example of such a server is HashiCorp Vault.

Instead of storing data in configuration files and/or environment variables, the application asks a dedicated vault for each parameter when it needs it.
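
As a minimal sketch of such a request, the Python code below reads a secret from Vault’s key-value (version 2) secrets engine over its HTTP API using only the standard library. The Vault address, the token, the mount name `secret` and the secret path are placeholders chosen for the example; in a real setup the application would obtain a short-lived token through one of Vault’s authentication methods.

```python
import json
import urllib.request


def build_kv2_request(vault_addr: str, token: str, path: str, mount: str = "secret"):
    """Build a GET request for a secret stored in a KV version 2 engine."""
    url = f"{vault_addr.rstrip('/')}/v1/{mount}/data/{path.lstrip('/')}"
    # Vault authenticates API calls by the token in the X-Vault-Token header.
    return urllib.request.Request(url, headers={"X-Vault-Token": token})


def extract_kv2_secret(response_body: bytes) -> dict:
    """KV v2 wraps the stored key/value pairs in data.data of the JSON response."""
    return json.loads(response_body)["data"]["data"]


def read_secret(vault_addr: str, token: str, path: str, mount: str = "secret") -> dict:
    """Fetch a secret from a running Vault server."""
    request = build_kv2_request(vault_addr, token, path, mount)
    with urllib.request.urlopen(request, timeout=5) as response:
        return extract_kv2_secret(response.read())
```

With this in place, a call such as `read_secret("https://vault.example.com:8200", token, "myapp/database")` (hypothetical address and path) replaces a line in a configuration file, and every such read can be recorded on the Vault side.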

Working with HashiCorp Vault

What are the advantages of this method?

  1. Audit log. You can always see who requested which secrets and when. One of the most important requirements in security is information, and the vault gives it to you out of the box. From the logs you can reconstruct an incident and understand which information leaked and which did not.
  2. Ability to quickly close the vault. If you realize that there is a data leak, the vault can be closed. Moreover, the keys for closing the vault can be distributed to different people. These keys cannot reopen the vault, but they can quickly close it and stop the leak.
  3. Dynamic access to AWS. Vault is well integrated with Amazon by default, so you can configure it so that when credentials are requested, a dedicated user is created that has rights only to perform the specified operation, and this user is deactivated right afterwards. The principle of least privilege in action.
  4. Simple management of environments. If you have several production environments and/or DEV / UAT, the application can function normally without developers being given access to its secrets.

To summarize: for large projects that integrate with AWS and/or have a set of different environments, the vault adds value. But even without that, it is one more layer of protection that will not let an attacker quickly obtain the data they are after. While the attacker is trying to get it, you can take countermeasures.

Encryption of data

The best way to protect sensitive information is not to store it at all. This works well, for example, with payment services, when you delegate the secure storage of credit card information to, say, PayPal.

But there are situations when you cannot avoid storing some information. For example, if you need server-to-server synchronization with a third-party service that does not have a good API and requires authorization with the user’s login and password. In this case, you will have to store the necessary information in your database.

Storing such data in plain text is unacceptable. Therefore, it must be encrypted. There can also be several levels here:

  1. Encryption of the instance. For example, AWS offers encryption of the entire database instance out of the box for the RDS service. It does not protect against a leak, of course, but if someone physically steals the server, they will not be able to do anything with the data.
  2. Encryption of PII. Here we encrypt the data itself. You can do this at the database level (for example, using pgcrypto for PostgreSQL), at the ORM level (for example, DoctrineEncryptionBundle for Symfony) or implement it yourself on top of the built-in encryption libraries. The choice is yours.
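
As an illustration of the database-level option, the Python sketch below builds parameterized queries that use pgcrypto’s `pgp_sym_encrypt` and `pgp_sym_decrypt` functions. The table and column names are hypothetical (the encrypted column would be of type `bytea`), the `%s` placeholders assume a psycopg-style driver, and the pgcrypto extension must be enabled in the database. With this approach the key travels to the database server in every query, so it has to be kept out of query logs.

```python
def encrypt_insert(login: str, password: str, key: str) -> tuple[str, tuple]:
    """Query that stores a third-party password encrypted with a symmetric key."""
    sql = (
        "INSERT INTO third_party_credentials (login, password_encrypted) "
        "VALUES (%s, pgp_sym_encrypt(%s, %s))"
    )
    return sql, (login, password, key)


def decrypt_select(login: str, key: str) -> tuple[str, tuple]:
    """Query that reads the password back, decrypting it inside the database."""
    sql = (
        "SELECT pgp_sym_decrypt(password_encrypted, %s) "
        "FROM third_party_credentials WHERE login = %s"
    )
    return sql, (key, login)
```

With a database cursor at hand, the queries would be executed as `cursor.execute(*encrypt_insert("alice", "s3cr3t", key))`. Passing the values as parameters rather than formatting them into the SQL string keeps the secret and the key out of the query text.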

Summary

Of course, within two articles it is difficult to cover such a big topic as security. Especially when it comes to details. But, whatever one may say, the most important thing in security is to start thinking about it.

Think about how secure the things you do really are. How can I enhance the security of the application? Where are the weak spots? Which tools improve the security of the app? What shall we do when security is breached?

Security is a way of thinking in which, as in chess, we have to think a few moves ahead.

The examples mentioned in the article are extremely simple and seem obvious. However, I assure you, 99.9% of us do not follow them. We do not think that we can be hacked, and therefore, through us, our apps can be hacked too.

I hope that by starting to apply these methods, you will at least be able to raise the security level of both your projects and your company. Introducing these simple tips into your practice will be a serious step towards an information-safe world.

And yes, remember: constant vigilance!

P.S. Presentation:

Presentation from SymfonyLive London 2017.