How to obfuscate string constants?

We have an application which contains sensitive information and I'm trying my best to secure it. The sensitive information includes:

The main algorithm
The keys for an encryption/decryption algorithm

I've been looking at obfuscating the code, but it doesn't seem to help much as I can still decompile it. However, my biggest concern is that the keys used for encryption of serial numbers etc. are clearly visible when you decompile the code, even when it's obfuscated.

Can anyone suggest how I can secure these strings?

I realise one of the methods might be to remove any decryption from the application itself. While this may be possible in part, there are some features which have to use encryption/decryption, mainly to save a config file and to pass an 'authorisation' token to a DLL to perform a calculation.

There are ways to do what you want, but it isn't cheap and it isn't easy.

Is it worth it?

When looking at whether to protect software, we first have to answer a number of questions:

1. How likely is this to happen?
2. What is the value to someone else of your algorithm and data?
3. What is the cost to them of buying a license to use your software?
4. What is the cost to them of replicating your algorithm and data?
5. What is the cost to them of reverse engineering your algorithm and data?
6. What is the cost to you of protecting your algorithm and data?

If the answers produce a significant economic imperative to protect your algorithm/data, then you should look into doing it. For instance, if the value of the service and the cost to customers are both high, but the cost of reverse engineering your code is much lower than the cost of developing it themselves, then people may attempt it.

So, this leads on to your question

How do you secure your algorithm and data?

Discouragement

Obfuscation

The option you suggest, obfuscating the code, messes with the economics above: it tries to significantly increase the cost to them (5 above) without increasing the cost to you (6) very much. The Center for Encrypted Functionalities has done some interesting research on this. The problem is that, as with DVD encryption, it is doomed to failure: if there is enough of a differential between 3, 4 and 5, then eventually someone will do it.

Detection

Another option might be a form of steganography, which allows you to identify who decrypted your data and started distributing it. For instance, if you have 100 different float values as part of your data, and a 1-bit error in the LSB of each of those values wouldn't cause a problem with your application, encode a unique (to each customer) identifier into those bits. The problem is that if someone has access to multiple copies of your application data, it would be obvious where the copies differ, making it easier to identify the hidden message.
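To make the LSB idea a bit more concrete, here is a minimal Python sketch (Python is only my choice for illustration; the same bit manipulation carries over to .NET). It assumes the data is a list of 64-bit floats and that one bit of the customer identifier is stored per value. The names `embed_id` and `extract_id` are hypothetical helpers, not part of any library.

```python
import struct


def _float_to_bits(value: float) -> int:
    # Reinterpret a 64-bit float as an unsigned 64-bit integer.
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def _bits_to_float(bits: int) -> float:
    # Reinterpret an unsigned 64-bit integer as a 64-bit float.
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def embed_id(values, customer_id: int):
    """Return a copy of values with bit i of customer_id stored in the
    least significant mantissa bit of values[i].

    Bits of customer_id beyond len(values) are silently dropped.
    """
    marked = []
    for i, value in enumerate(values):
        bit = (customer_id >> i) & 1
        bits = (_float_to_bits(value) & ~1) | bit  # clear the LSB, then set it
        marked.append(_bits_to_float(bits))
    return marked


def extract_id(values) -> int:
    """Rebuild the identifier from the LSB of each value."""
    customer_id = 0
    for i, value in enumerate(values):
        customer_id |= (_float_to_bits(value) & 1) << i
    return customer_id
```

Each marked value differs from the original by at most one unit in the last place, which is the "1-bit error" the answer assumes your application can tolerate.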

Protection

SaaS - Software as a Service

A more secure option might be to provide the critical part of your software as a service, rather than include it in your application.

Conceptually, your application would collect all of the data required to run your algorithm and package it up as a request to a server (controlled by you) in the cloud. Your service would then calculate the results and pass them back to the client, which would display them.
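A rough Python sketch of that request/response split, using only the standard library, might look like the following. Everything here is an assumption for illustration: the JSON shape, the `/calculate` path, and `proprietary_algorithm`, which is a trivial stand-in for the real calculation. A real deployment would also need TLS and authentication, which this sketch leaves out.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def proprietary_algorithm(values):
    # Hypothetical stand-in for the confidential calculation. It only ever
    # runs on the server, so it is never shipped to customers.
    return sum(v * v for v in values)


class AlgorithmHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length))
        result = proprietary_algorithm(request["values"])
        body = json.dumps({"result": result}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def start_server(host="127.0.0.1", port=0):
    """Start the service in a background thread (port 0 picks a free port)."""
    server = HTTPServer((host, port), AlgorithmHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def client_calculate(host, port, values):
    """Client side: bundle the inputs, send them, and read back the result."""
    payload = json.dumps({"values": values}).encode("utf-8")
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("POST", "/calculate", body=payload,
                     headers={"Content-Type": "application/json"})
        return json.loads(conn.getresponse().read())["result"]
    finally:
        conn.close()
```

The client only ever sees inputs and outputs; the body of `proprietary_algorithm` and any keys it uses stay on the machine you control.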

This keeps all of your proprietary, confidential data and algorithms within a domain that you control completely, and removes any possibility of a client extracting either.

The obvious downside is that clients are tied into your service provision and are at the mercy of your servers and their internet connection. Unfortunately many people object to SaaS for exactly these reasons. On the plus side, they are always up to date with bug fixes, and your compute cluster is likely to be higher performance than the PC they are running the user interface on.

This would be a huge step to take though, and could carry a huge cost (6 above), but it is one of the few ways to keep your algorithm and data completely secure.

Software Protection Dongles

Although traditional Software Protection Dongles would protect from software piracy, they wouldn't protect against algorithms and data in your code being extracted.

Newer code porting dongles (such as SenseLock†) appear to be able to do what you want though. With these devices, you take code out of your application and port it to the secure dongle processor. As with SaaS, your application would bundle up the data, pass it to the dongle (probably a USB device attached to your computer) and read back the results.

Unlike SaaS, data bandwidth would be unlikely to be an issue, but the performance of your application may be limited by the performance of the dongle.

† This was the first example I could find with a Google search.

Trusted platform

Another option, which may become viable in the future, is to use a Trusted Platform Module and Trusted Execution Technology to secure critical areas of the code. Whenever a customer installs your software, they would provide you with a fingerprint of their hardware and you would provide them with an unlock key for that specific system.

This key would then allow the code to be decrypted and executed within the trusted environment, where the encrypted code and data would be inaccessible outside of the trusted platform. If anything at all about the trusted environment changed, it would invalidate the key and that functionality would be lost.

For the customer this has the advantage that their data stays local, and they don't need to buy a new dongle to improve performance. However, it has the potential to create an ongoing support requirement, and your customers are likely to become frustrated with the hoops they have to jump through to use software they have bought and paid for, losing you goodwill.

Conclusion

What you want to do is not simple or cheap. It could require a big investment in software, infrastructure or both. You need to know that it is worth the investment before you start along this road.

All efforts will be futile if someone is motivated enough to break it. No one has managed to figure this out yet, not even the biggest software companies.

"I'm trying my best to secure it"

I'm not saying this as a scathing criticism; you just need to be aware that what you're trying to achieve is currently assumed to be impossible.

Obfuscation is security through obscurity. It does have some benefit, as it will deter the most incompetent hacking attempts, but largely it is wasted effort that could perhaps be better spent in other areas of development.

In answer to your original question, you are going to run into problems with intelligent compilers: as a compilation optimisation, they might automatically piece the string back together in the compiled application, removing some of your obfuscation efforts. It would be hard to maintain as well, so I would reconsider your risk analysis model and perhaps resign yourself to the fact that it can be cracked and, if it has any value, probably will be.
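You can see the same kind of optimisation in other toolchains. As an analogy only (this is CPython, not the .NET compiler the question is about), the sketch below compiles a snippet that builds a key from three fragments and returns the constants stored in the compiled code. `constants_in_compiled_code` is a hypothetical helper name.

```python
def constants_in_compiled_code(source: str):
    """Compile source and return the constants stored in the code object."""
    code = compile(source, "<demo>", "exec")
    return code.co_consts


# A key "hidden" by splitting it into fragments in the source text.
SPLIT_KEY_SOURCE = "KEY = 'SEC' + 'RET' + 'KEY'"
```

On the CPython versions I'm aware of, `constants_in_compiled_code(SPLIT_KEY_SOURCE)` includes the joined string `'SECRETKEY'`, because the compiler folds the concatenation at compile time. Splitting a literal in the source therefore gains nothing in that case.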

I recently read about a very simple solution to the OP's problem.

Simply declare your constants as readonly string, not const string. That simple. Apparently const values get written to a stack area in the binary as plain text, whereas readonly strings get added to the constructor and are written as a byte array instead of text.

I.e. if you search the binary for the string, you won't find it.

That was the question, right?

Using a custom algorithm (security through obscurity?), combined with storing the key inside the application, is simply not secure.

If you are storing some kind of a password, then you can use a one-way hashing function to ensure that decrypted data is unavailable anywhere in your code.

If you need to use a symmetric encryption algorithm, use a well known and tested one, like AES-256. But the key obviously cannot be stored inside your code.

[Edit]

Since you mentioned encryption of serial numbers, I believe a one-way hashing function (like SHA-256) would really suit your needs better.

The idea is to hash your serial numbers during build time into their hashed representations, which cannot be reversed (SHA-256 is considered to be a pretty safe algorithm, compared to, say, MD5). During run time, you only need to apply the same hash function to the user input, and compare hashed values only. This way none of the actual serial numbers are available to the attacker.
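A minimal Python sketch of that build-time/run-time split follows (Python's `hashlib` is used only for illustration; .NET has an equivalent SHA-256 implementation). The normalization step and the function names are my assumptions, not something the question specifies.

```python
import hashlib


def hash_serial(serial: str) -> str:
    """Return the SHA-256 hex digest of a normalized serial number."""
    normalized = serial.strip().upper()  # assumed normalization rule
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_hash_table(serials):
    """Build step: runs on your machine. Only its output ships with the app."""
    return frozenset(hash_serial(s) for s in serials)


def is_valid_serial(user_input: str, shipped_hashes) -> bool:
    """Run time: hash the user's input and compare hashed values only."""
    return hash_serial(user_input) in shipped_hashes
```

The shipped application contains only the output of `build_hash_table`, never the serial numbers themselves. Note that this only helps if the serial numbers are long and random enough that an attacker cannot simply hash every possible candidate.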

@Tom Gullen has given a proper answer.

I merely have some suggestions on how you can make it harder for users to access your keys and algorithm.

As for the algorithm: Do not compile your algorithm at compile time, but at runtime. To be able to do this you need to specify an interface which contains the methods for the algorithm. The interface is used to run it. Then add the source code for the algorithm as an encrypted string (embedded resource). Decrypt it at runtime and use CodeDom to compile it into a .NET class.

Keys: The usual way is to spread parts of your key across different places in the application. Store each part as byte[] instead of string to make them a bit harder to find.
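One way to sketch the split-key idea in Python is shown below. Instead of plain slices of the key, this uses XOR shares, so that no single part contains any recognisable key bytes; that is my variation, not something the answer prescribes. `split_key` and `join_key` are hypothetical helper names, and the splitting would be done once at build time.

```python
import secrets


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(key: bytes, parts: int = 3):
    """Split key into `parts` byte strings whose XOR equals the key."""
    if parts < 2:
        raise ValueError("need at least two parts")
    shares = [secrets.token_bytes(len(key)) for _ in range(parts - 1)]
    last = key
    for share in shares:
        last = _xor(last, share)
    return shares + [last]


def join_key(shares) -> bytes:
    """Recombine the parts at run time by XOR-ing them together."""
    key = bytes(len(shares[0]))  # all-zero starting value
    for share in shares:
        key = _xor(key, share)
    return key
```

In the application, each share would live in a different class or resource as a byte array, with `join_key` called just before the key is needed. The full key still exists in memory at that point, so this only raises the effort required.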

If all your users have an internet connection: Fetch the algorithm source code and the keys using SSL instead.

Note that everything will be pieced together at runtime, so anyone with a bit more knowledge can inspect/debug your application to find everything.

I don't think you can easily obfuscate string constants, so if possible, don't use them :) You can use assembly resources instead; those you can encrypt however you want.

It depends on what you're trying to do, but can you use asymmetric encryption? That way you only need to store public keys, with no need to obfuscate them.
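One reading of this suggestion is signature verification: you sign serial numbers or tokens with a private key you keep to yourself, and the application only verifies them with the public key. The Python sketch below shows the shape of that using textbook RSA with deliberately tiny numbers (n = 61 × 53). It is a toy for illustration and is not secure; a real application would use a vetted crypto library with proper key sizes and padding. All names and values here are assumptions.

```python
import hashlib

# Toy public key (n, e). This is the only key material the application ships.
PUBLIC_N = 3233  # 61 * 53, far too small for real use
PUBLIC_E = 17

# Toy private exponent. It stays with the vendor and is never shipped.
VENDOR_PRIVATE_D = 2753  # 17 * 2753 = 1 (mod 3120)


def _digest_mod_n(message: bytes, n: int) -> int:
    # Reduce a SHA-256 digest into the range [0, n).
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes, n: int, d: int) -> int:
    """Vendor side: needs the private exponent d."""
    return pow(_digest_mod_n(message, n), d, n)


def verify(message: bytes, signature: int, n: int, e: int) -> bool:
    """Application side: needs only the public values n and e."""
    return pow(signature, e, n) == _digest_mod_n(message, n)
```

Someone who decompiles the application finds `PUBLIC_N` and `PUBLIC_E`, which let them check signatures but (with realistic key sizes) not create new ones.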