By Roman Kazantsev, Denis Katerinskiy, Thaddeus Letnes
Introduction
The goal of this paper is to demonstrate how Intel® Tamper Protection Toolkit protects critical code and valuable data against reverse-engineering and tampering attacks. In this article we will focus on a single component of Intel Tamper Protection Toolkit called iprot which is used to obfuscate binaries.
Many developers now recognize that their applications contain critical code and valuable data that need protection. For example, the Scrypt Encryption Utility is a simple password-based encryption utility that uses the Scrypt [4] key derivation function. Scrypt is a modern and secure key derivation function that is actively used in security-conscious software such as full disk encryption in Android* 4.4. We will show some of the reverse-engineering and tampering attacks that exist against the utility and how these attacks reduce the utility's security.
Code Obfuscation
Let us consider the sample source code of a function called “sensitive” illustrated in Example 1 and compile it to a dynamic library.
#define MODIFIER (0xF00D)

int __declspec(dllexport) sensitive(const int value)
{
    int result = 0;
    int i;
    for (i = 0; i < value; i++) {
        result += MODIFIER;
    }
    return result;
}
Example 1: The source code of the "sensitive" function
Once the function is converted into machine code we can reverse engineer the code using a disassembler such as IDA Pro. Figure 1 illustrates the control flow of the disassembled code with logic and data used for computation. A hacker can easily find out the modifier value and change it.
Figure 1: Flow graph of disassembled code
Different code obfuscation techniques can help to hide software implementation details from hackers' sight, complicating reverse-engineering and preventing tampering. Code obfuscation is the process of transforming machine code into code that is difficult for a human to disassemble and understand, but does not alter the functional behavior of the program. It is used to prevent theft of software secrets and tampering with its implementation, and guard intellectual property.
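To make the idea concrete, here is a deliberately simple, hand-written sketch of one obfuscation technique: hiding a constant so that the literal 0xF00D no longer appears in the binary. This is only a toy illustration and is not how iprot works; the names sensitive_plain, sensitive_hidden, KEY_MASK, and check_equivalence are hypothetical and introduced here for illustration. The point it shows is the defining property stated above: the transformed function must return exactly what the original returns.

```c
#include <assert.h>

#define MODIFIER (0xF00D)

/* Original form, as in Example 1: the constant is visible in the binary. */
int sensitive_plain(int value)
{
    int result = 0;
    for (int i = 0; i < value; i++) {
        result += MODIFIER;
    }
    return result;
}

/* Toy "hidden" form: the constant is stored XOR-masked and unmasked at
 * run time. KEY_MASK is an arbitrary value chosen for this sketch.
 * volatile keeps the compiler from folding the two values back together. */
#define KEY_MASK (0x5A5A)
static volatile int masked_modifier = MODIFIER ^ KEY_MASK;
static volatile int mask = KEY_MASK;

int sensitive_hidden(int value)
{
    int result = 0;
    for (int i = 0; i < value; i++) {
        result += masked_modifier ^ mask;
    }
    return result;
}

/* Obfuscation must not change functional behavior. */
void check_equivalence(int value)
{
    assert(sensitive_plain(value) == sensitive_hidden(value));
}
```

A masked constant like this is trivial to recover for a determined attacker, which is why real tools rely on much stronger transformations such as the self-modifying, self-encrypting code described later in this article.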
Intel® Tamper Protection Toolkit
Intel Tamper Protection Toolkit is a product focused on runtime code integrity verification and preventing reverse-engineering for Microsoft Windows* and Android binaries on Intel® architecture. Binaries protected with the Intel Tamper Protection Toolkit do not require special loaders and can be run on any Intel® CPU.
Intel Tamper Protection Toolkit Beta can be downloaded at https://software.intel.com/en-us/tamper-protection.
Intel Tamper Protection Toolkit offers an obfuscation tool called “iprot” that obfuscates the most critical parts of code and helps mitigate possible attacks to steal and change secrets like in the code illustrated in Example 1. The tool creates self-modifying and self-encrypted code which means obfuscated code is modified and encrypted/decrypted by itself during runtime.
The obfuscation tool takes as input a binary file that is a dynamic link library (.dll) and a list of exported functions. The output is a dynamic link library containing only the obfuscated versions of the specified exported functions (entry points). Starting from the address of each entry point, all reachable code is parsed and recorded. Branches, jumps, and calls are followed whenever possible, and their destinations are parsed in turn. A few restrictions apply to code that is to be processed by iprot: it must not contain relocations, indirect jumps, or PIC global references (the last restriction is Android-specific).
To obfuscate the dynamic library from the Code Obfuscation section using iprot, enter on the command line:
iprot sensitive.dll sensitive -o sensitive_obf.dll
Using sophisticated tools, such as IDA Pro, we tried to reverse engineer the obfuscated dynamic library. The resulting disassembled code is shown in Example 2.
sensitive PROC NEAR
        jmp     ?_001
?_001:  push    ebp
        push    eax
        call    ?_002
?_002   LABEL NEAR
        pop     eax
        lea     eax, [eax+0FECH]
        mov     dword ptr [eax], 608469404
        mov     dword ptr [eax+4H], 2308
        mov     dword ptr [eax+8H], -443981824
        mov     dword ptr [eax+0CH], 1633409
        mov     dword ptr [eax+10H], -477560832
        mov     dword ptr [eax+14H], 15484359
        mov     dword ptr [eax+18H], -1929379840
        mov     dword ptr [eax+1CH], -1048448<….>
Example 2: Disassembled obfuscated code
You can see obvious differences between disassembly views of the original and obfuscated dynamic library. IDA Pro is unable to show a flow graph of obfuscated code, and the modifier value has disappeared in obfuscated code. Also, obfuscated code is resistant to static and dynamic tampering.
Password-Based Key Derivation Function
A password-based key derivation function (PBKDF) is a function used to convert a user-supplied password into a binary key that can be used in cryptographic algorithms. PBKDFs are very important in the software security domain because using user-supplied passwords directly in cryptographic algorithms is insecure due to insufficient entropy. These functions are widely used in security applications. For example, cryptographic keys produced from passwords by PBKDFs are used in PGP systems to encrypt and decrypt user data on disk. Operating systems also use these functions for user password verification.
The general mathematical form of a PBKDF is
y = F(P, S, d, t1, …, tn)
where y is the derived key, P is the password, S is the salt, d is the length of the derived key, and t1, …, tn are parameters that define the amount of hardware resources, such as CPU time and RAM, that the function requires for computation. The salt S allows different keys to be created from a single password. The parameters t1, …, tn can be tuned to increase the computational cost of the function, which adds protection against brute-force attacks.
A scheme of one PBKDF usage is shown in Figure 2.
Figure 2: Scheme of PBKDF usage
There are two scenarios in which an attacker might discover the user password:
1. The attacker recovers the password from a derived key that has leaked.
2. The attacker recovers the password from encrypted or authenticated data.
Intel Tamper Protection Toolkit can help to prevent scenario 1 by hiding the code in which the derived key is generated and used.
Intel Tamper Protection Toolkit cannot prevent scenario 2 attacks, but it can help ensure that an attacker cannot force the use of insecure parameters for key generation.
Examples of Password-Based Key Derivation Functions Used in Practice
- Password-Based Key Derivation Function 2 (PBKDF2) [1] is a function y = F(P, S, c), where c is the iteration count used to adjust the amount of CPU time required to compute F for any P and S. As shown in [3], PBKDF2 can be implemented on systems with very little RAM, which makes highly parallel brute-force attacks very effective. Despite that, some software continues to use PBKDF2.
- The Bcrypt [5] function is more resistant to such attacks because it uses a larger, fixed amount of RAM.
A more modern and secure key derivation function is Scrypt, developed by Colin Percival [4]. Scrypt has the following mathematical form:
y = F(P,S,d,N,r,p),
where y is the derived key, d is the length of the derived key, P is the user-supplied password, S is the salt, and N, r, and p are cost parameters that together determine the CPU time, RAM size, and degree of parallelism required for the key derivation computation. The values of N, r, p, and d can be considered publicly known; they are usually stored alongside the key or the encrypted or authenticated data.
By varying the parameters N, r, and p, the user can tune the key derivation function to use a chosen amount of CPU time and RAM. For instance, if the parameters are tuned to use about 100 ms and about 20 MB, a brute-force attack against Scrypt is not as effective as one against PBKDF2, because PBKDF2 requires little RAM, so computations for many candidate passwords can easily be run in parallel.
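The memory argument can be put into numbers. The sketch below assumes the usual approximation for Scrypt's working memory, 128·N·r bytes for the large lookup table plus 128·r·p bytes for the parallel blocks, and ignores small scratch buffers. The function names scrypt_mem_bytes and max_parallel_guesses are hypothetical helpers introduced here, not part of any Scrypt library.

```c
#include <assert.h>
#include <stdint.h>

/* Approximate working memory of one Scrypt derivation, in bytes:
 * 128*N*r for the large table plus 128*r*p for the p parallel blocks.
 * Small scratch buffers are ignored in this estimate. */
uint64_t scrypt_mem_bytes(uint64_t N, uint32_t r, uint32_t p)
{
    assert(N > 1 && r > 0 && p > 0);
    return 128u * N * r + 128u * (uint64_t)r * p;
}

/* Number of derivations an attacker can hold in RAM at the same time,
 * given the attacker's total RAM and the memory needed per guess. */
uint64_t max_parallel_guesses(uint64_t total_ram, uint64_t per_guess)
{
    assert(per_guess > 0);
    return total_ram / per_guess;
}
```

For example, with N = 16384, r = 8, and p = 1, one derivation needs roughly 16 MiB, so a machine with 1 GiB of RAM can keep only a few dozen password guesses in flight at once. A function that needs only a few kilobytes per guess would allow many thousands on the same machine.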
The Scrypt Encryption Utility
The Scrypt Encryption Utility performs AES encryption in CTR mode of input files with a key derived from a user-supplied password using the Scrypt function.
The utility's target OS is Linux*. You can download the program from http://www.tarsnap.com/scrypt/scrypt-1.1.6.tgz, then install and run it on Linux. To complete the installation, install the OpenSSL* development package, which is available in the standard repositories of all modern Linux distributions, decompress the downloaded archive, and execute the three usual commands: ./configure, make, and make install.
Here are the details to install the software on Ubuntu*:
- Install OpenSSL development package by running
apt-get install libssl-dev
- Decompress the downloaded tarball using
tar xfvz scrypt-1.1.6.tgz
- Create a makefile using
./configure
- Build the utility by launching
make
- Install the program by executing
make install
The utility has both required and optional parameters.
Required parameters:
- Password used by Scrypt to generate a key
- Operation mode: encrypt or decrypt data
- Input file name
Optional parameters:
- -t CPU time in seconds required for key derivation
- -m a fraction of RAM to be used for key derivation
- -M a number of bytes of RAM to be used for key derivation
- Output file name
For example, calling scrypt enc -t 0.1 -M 20971520 infile
will encrypt infile and require at most 100 ms of CPU time and 20 MB of RAM for key derivation. These parameters complicate parallelization of a brute-force attack.
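As a rough illustration of how a memory limit such as 20971520 bytes can be turned into a Scrypt parameter, the sketch below picks the largest power-of-two N whose 128·N·r-byte table fits in the limit for a given r. This is a simplified assumption made for illustration: the utility's own selection also takes the CPU time limit into account and may choose differently. The name pick_log2_N is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Return the largest log2(N) such that the Scrypt table of 128*N*r bytes
 * fits within mem_limit. Returns 0 when not even N = 2 fits, which the
 * caller should treat as "no valid N". */
unsigned pick_log2_N(uint64_t mem_limit, uint32_t r)
{
    assert(r > 0);
    uint64_t max_N = mem_limit / (128u * (uint64_t)r);
    unsigned log2_N = 0;

    /* Grow N by powers of two while the next power still fits. */
    while (log2_N < 62 && ((uint64_t)1 << (log2_N + 1)) <= max_N) {
        log2_N++;
    }
    return log2_N;
}
```

With a limit of 20971520 bytes and r = 8, at most 20480 table entries fit, so the largest usable power of two is N = 2^14 = 16384.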
Figure 3 shows the flow of information in the Scrypt Encryption Utility when a user specifies an input file to encrypt, a passphrase, and parameters defining the amount of required hardware resources.
The encryption flow will happen in this order:
- Parse command line and transform to N, r, p: The program converts the CPU time and RAM size parameters into the format needed by the Scrypt function.
- Scrypt key derivation: Scrypt derives a 64-byte key based on the user-supplied password and the parameters N, r, p computed in the previous step. The low 32 bytes of the key, dk1, are used to calculate the keyed hash for N, r, p, salt, and the encrypted data, which verifies the correctness of the passphrase and the integrity of the encrypted message during decryption. The high 32 bytes of the key, dk2, are used as a 32-byte (256-bit) AES key to encrypt the input message in CTR mode.
- Keyed hash computing for Scrypt parameters: In this step the keyed hash is computed for parameters N, r, p, salt used to derive the key.
- OpenSSL AES encryption with a 32-byte key in CTR mode: The input message is encrypted with dk2 using the AES cipher with a 32-byte (256-bit) key in CTR mode.
- Keyed hash computing for encrypted data: The keyed hash is computed for encrypted data using dk1 to provide an integrity check. The output file consists of encrypted data, parameters N, r, p, salt used for encryption, and keyed hashes that provide integrity of the encrypted data and the parameters.
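The key split described in the steps above can be sketched as follows. The layout follows this article's description (dk1 is the low half, dk2 is the high half); the type derived_key_view and the function split_derived_key are hypothetical names introduced for illustration and are not taken from the utility's source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { DK_LEN = 64, HALF_LEN = 32 };

/* Non-owning view of the two halves of the 64-byte derived key. */
typedef struct {
    const uint8_t *dk1; /* low 32 bytes: key for the keyed hashes */
    const uint8_t *dk2; /* high 32 bytes: AES key for CTR-mode encryption */
} derived_key_view;

/* Split a 64-byte derived key into dk1 and dk2 without copying it,
 * so that no extra copies of key material are left in memory. */
derived_key_view split_derived_key(const uint8_t dk[DK_LEN])
{
    assert(dk != NULL);
    derived_key_view view = { dk, dk + HALF_LEN };
    return view;
}
```

Both halves are equally sensitive: anyone who obtains dk2 can decrypt the data, and anyone who obtains dk1 can forge the keyed hashes.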
Figure 3: Encryption scheme of the Scrypt Encryption Utility
Possible Threats
Analyzing the encryption flow, we see that the values of N, r, p, salt, and the derived key produced in intermediate steps are sensitive and must be protected from being changed at run time. For example, an attacker running the utility under a debugger can set different values for N, r, p to weaken the resistance of the key derivation against brute-force attacks.
Figure 4 illustrates the utility's decryption flow when a user specifies an input file with cipher text, N, r, p, salt, keyed hashes, and password.
The decryption flow will happen in this order:
- Scrypt parameters setup: The input file for decryption mode contains the encrypted data, the keyed hashes hmac1 and hmac2, and the parameters N, r, p, salt used during encryption. In this step these parameters are parsed from the input file and passed on to key derivation.
- Scrypt key derivation: Scrypt derives a key for the input passphrase and N, r, p, salt from the previous step. The low 32 bytes and the high 32 bytes of this key are designated on the figure as dk1 and dk2 respectively.
- Scrypt parameters and password integrity verification: Integrity of N, r, p, salt, and the correctness of password are verified using a keyed hash. To determine if the passphrase is correct the utility computes a keyed hash for parameters N, r, p, salt using dk1 and compares the computed hash with hmac1. If they match, the password is correct.
- Integrity verification of encrypted data: To determine whether the encrypted data is corrupted, the program computes a keyed hash for the data using dk1 and compares the computed hash with hmac2. If they match, the data is not corrupt and can be decrypted in the next step.
- OpenSSL AES decryption with a 32-byte key in CTR mode: Finally, the data is decrypted with dk2 using the AES cipher with a 32-byte (256-bit) key in CTR mode. The output is a file with decrypted data.
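Both verification steps above come down to comparing a computed keyed hash with a stored one. The source does not say how the utility performs this comparison; the sketch below shows one common practice, a comparison whose running time does not depend on where the first mismatching byte is, so that timing reveals nothing about how much of the hash was correct. The function name mac_equal is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compare two keyed hashes of equal length in constant time.
 * Returns 1 if all bytes match and 0 otherwise. Every byte is always
 * examined; there is no early exit on the first difference. */
int mac_equal(const uint8_t *a, const uint8_t *b, size_t len)
{
    assert(a != NULL && b != NULL);
    uint8_t diff = 0;
    for (size_t i = 0; i < len; i++) {
        diff |= (uint8_t)(a[i] ^ b[i]); /* accumulate any differing bits */
    }
    return diff == 0;
}
```

In the decryption flow this would be called twice: once with the hash computed over N, r, p, salt and the stored hmac1, and once with the hash computed over the encrypted data and the stored hmac2. Decryption proceeds only if both calls return 1.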
Figure 4: Decryption scheme of the Scrypt Encryption Utility
Acknowledgments
We thank Raghudeep Kannavara for giving us the idea of applying Intel Tamper Protection Toolkit to the Scrypt Encryption Utility and Andrey Somsikov for many helpful discussions.
References
1. B. Kaliski. PKCS #5: Password-Based Cryptography Specification Version 2.0.
2. C. Percival. The Scrypt Encryption Utility. http://www.tarsnap.com/scrypt/scrypt-1.1.6.tgz
3. C. Percival. Stronger key derivation via sequential memory-hard functions. BSDCan'09, May 2009.
4. C. Percival, S. Josefsson (2012-09-17). The Scrypt Password-Based Key Derivation Function. IETF.
5. N. Provos, D. Mazières (1999). A Future-Adaptable Password Scheme. Proceedings of 1999 USENIX Annual Technical Conference: 81–92.
About the Authors
Roman Kazantsev works in the Software & Services Group at Intel Corporation. Roman has 7+ years of professional experience in software engineering. His professional interests are focused on cryptography, software security, and computer science. He currently holds the position of Software Engineer, where his ongoing mission is to deliver cryptographic solutions and expertise for content protection across all Intel platforms. He received his Bachelor's and Master's degrees in Computer Science with honors at Nizhny Novgorod State University, Russia.
Denis Katerinskiy works in the Software & Services Group at Intel Corporation. He has 2 years of experience in software development. His main interests are programming, performance optimization, algorithm development, mathematics, and cryptography. In his current role as a Software Development Engineer, Denis develops software simulators for Intel architecture. Denis is currently pursuing a Bachelor's degree in Computer Science at Tomsk State University.
Thaddeus Letnes works in the Software & Services Group at Intel Corporation. He has 15+ years of professional experience in software development. His main interests are low-level systems, languages, and engineering practices. In his current role as a Software Engineer developing software development tools, Thaddeus works closely with software developers, architects, and project managers to produce high-quality development tools. Thaddeus holds a Bachelor's degree in Computer Science from Knox College.