Encrypt/Decrypt some fields.

Parameters:

Encryption and decryption are based on a symmetric key (i.e. a single key is used for both encryption and decryption). The key is saved inside a “Key File”. You need a “Key File” to encrypt or decrypt your data.
To select a “Key File” (using the “Browse” button) or to create a new “Key File” (using the “Create New Key File” button), you need to switch to “expert-user-mode”. To switch to expert-user-mode, click the button in the main toolbar of the application. Once you are in “expert-user-mode”, the “Browse” button and the “Create New Key File” button are enabled.
By moving your mouse randomly inside this window, you generate the random numbers that are used to create a random key. This encryption key is saved either inside a “Key File” or inside a string (the latter option lets you use a “Global Parameter” to specify the key when decrypting the data).
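The two storage options above (key file or string) can be sketched as follows. This is only an illustration under stated assumptions: the real “Key File” format is not described here, so the sketch stores the key as hexadecimal text, it uses the operating system's random generator instead of mouse movements, and the names create_key_file, load_key_file and key_as_string are hypothetical.

```python
import secrets
import tempfile
from pathlib import Path

KEY_BYTES = 24  # assumption: 24 bytes, the size of a three-key 3DES key


def create_key_file(path: Path, n_bytes: int = KEY_BYTES) -> bytes:
    """Generate a random key and save it, hex-encoded, in a key file."""
    key = secrets.token_bytes(n_bytes)  # OS randomness, not mouse movements
    path.write_text(key.hex(), encoding="ascii")
    return key


def load_key_file(path: Path) -> bytes:
    """Read the key back from the key file."""
    return bytes.fromhex(path.read_text(encoding="ascii").strip())


def key_as_string(key: bytes) -> str:
    """Return the key as a plain string, e.g. to pass it as a parameter."""
    return key.hex()


with tempfile.TemporaryDirectory() as tmp:
    key_path = Path(tmp) / "example.key"  # hypothetical file name
    key = create_key_file(key_path)
    # The same key is recovered from the file and from the string form.
    assert load_key_file(key_path) == key
    assert bytes.fromhex(key_as_string(key)) == key
```

Whichever form is used, the key must be protected in the same way: anyone who obtains the file or the string can decrypt the data.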
NOTE:
Never lose your “Key File”. If you lose your key file, you’ll never be able to decrypt your data later.
NOTE:
Never send your “Key File” to third parties.
NOTE:
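The encryption algorithm used is DES (for the short keys) and 3DES (for the long keys). Both are well-studied encryption algorithms. Note, however, that the 56-bit key of single DES is short enough to be found by brute force with modern hardware, so prefer the long (3DES) keys when confidentiality matters.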
The encryption algorithm used inside ETL is a symmetric, reversible cipher: for a given key, every encrypted value decrypts back to exactly one original value. This guarantees that there will never be any “collisions”. For example, assume that you are encrypting many MSISDN (i.e. many phone numbers): because there are no collisions, the number of distinct MSISDN before and after encryption is the same. Two different un-encrypted MSISDN are never “mapped” to the same encrypted MSISDN.
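The absence of collisions follows from reversibility, which a small experiment can illustrate. The sketch below is not DES and not the code used by the Encrypt action: it is a toy Feistel cipher (the same general structure that DES is built on) written with the Python standard library only, and it must not be used to protect real data. The key, the MSISDN values and the function names are made up for the example.

```python
import hashlib
import hmac

ROUNDS = 8
HALF_BITS = 32
MASK = (1 << HALF_BITS) - 1


def _round(key: bytes, i: int, half: int) -> int:
    """Keyed round function; reversibility comes from the Feistel structure."""
    msg = bytes([i]) + half.to_bytes(4, "big")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def encrypt(key: bytes, value: int) -> int:
    """Encrypt a 64-bit integer into another 64-bit integer."""
    left, right = value >> HALF_BITS, value & MASK
    for i in range(ROUNDS):
        left, right = right, left ^ _round(key, i, right)
    return (left << HALF_BITS) | right


def decrypt(key: bytes, value: int) -> int:
    """Undo encrypt() by running the rounds in reverse order."""
    left, right = value >> HALF_BITS, value & MASK
    for i in reversed(range(ROUNDS)):
        left, right = right ^ _round(key, i, left), left
    return (left << HALF_BITS) | right


key = b"demo key, not a real one"
msisdns = [32470000000 + n for n in range(10_000)]  # made-up numbers
encrypted = [encrypt(key, m) for m in msisdns]

# Same number of distinct values before and after encryption.
assert len(set(encrypted)) == len(set(msisdns))
# Every encrypted value decrypts back to its original MSISDN.
assert [decrypt(key, c) for c in encrypted] == msisdns
```

Because decrypt() recovers exactly one input for every output, two different inputs can never share an output under the same key.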
Since there are no collisions, you can safely use the Encrypt action to anonymize your datasets. Note, however, that when anonymizing datasets containing MSISDN numbers, you lose, after encryption, some precious information about the MSISDN. The lost information is:
These pieces of information are very important when analyzing communication-graphs using SNA (Social Network Analysis) algorithms. You can use:
NOTE:
Anonymizing a dataset using a non-reversible encoding (i.e. a hash such as MD5) can lead to “collisions”. Non-reversible encodings (such as MD5) are thus bad and dangerous alternatives when anonymizing a dataset. Let’s take an example. Assume that you are anonymizing 2 million different MSISDN using a 5-character MD5 code. A 5-character MD5 code (5 hexadecimal characters) can only have, at most, about 1 million different values (16^5 = 1,048,576). This means that you will have a catastrophic number of collisions that will make your anonymized dataset completely useless. (Actually, even if you use, on the same population, a 6-character MD5 code, there is more than a 99% chance that you’ll also have so many collisions that your anonymized dataset is also useless.)
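The collision problem can be checked with a short experiment. The sketch below truncates MD5 hexadecimal digests and counts how many distinct codes remain. The MSISDN values are made up, and the function names distinct_codes and expected_distinct are hypothetical. With 5 characters there are only 16^5 = 1,048,576 possible codes, so at least 2,000,000 - 1,048,576 = 951,424 of the 2 million MSISDN necessarily lose their own code, whatever the hash.

```python
import hashlib


def distinct_codes(n_values: int, n_chars: int) -> int:
    """Count the distinct codes obtained by truncating MD5 hex digests."""
    codes = set()
    for n in range(n_values):
        msisdn = str(32470000000 + n)  # made-up, all-different numbers
        digest = hashlib.md5(msisdn.encode("ascii")).hexdigest()
        codes.add(digest[:n_chars])
    return len(codes)


def expected_distinct(n_values: int, n_chars: int) -> float:
    """Estimate for an ideal hash: m * (1 - (1 - 1/m) ** n), m = 16 ** n_chars."""
    m = 16 ** n_chars
    return m * (1.0 - (1.0 - 1.0 / m) ** n_values)


if __name__ == "__main__":
    n = 2_000_000
    for chars in (5, 6):
        found = distinct_codes(n, chars)
        estimate = n - expected_distinct(n, chars)
        print(f"{chars} characters: {found} distinct codes for {n} MSISDN, "
              f"{n - found} lost to collisions (estimate: {estimate:.0f})")
```

Running the loop on the full population takes a few seconds. The estimate assumes that MD5 spreads values uniformly over the possible codes, which is a simplification.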
