Storing data in python_问答_开发者_运维开发者技术经验分享

I need help storing data within a python program. I don't want the u开发者_Python百科ser to be able to touch the data at all. I've looked into pickle but many posts say it is "insecure".

You can't stop the user from touching data. If they're running a program on their system, they can do whatever they want with the bits after you write them to disk. You can obfuscate the data in various ways, possibly even encrypting it, but they can still, eventually get to it if they're determined. If you want proof, look at the absolute failure of every copy protection/DRM system ever invented. There are solutions that are 'good enough' but, without knowing what problem you're actually trying to solve there's no good way to start providing realistic options.

...and Pickle's great if you can trust your data. If you can be reasonably certain that your program wrote a file to disk and that malicious programs aren't actively targeting your application, it's safe. I'd never trust a pickle sent across the network, however - a pickle can potentially execute arbitrary commands during deserialization.

You can try to stop the program from running if it's been tampered with, for example by comparing its (md5) hash to a known good value. Check out the Chrome OS project for an example of a system which does roughly this.

You can try to stop the user from understanding your program and data written to the disk, for example by encrypting it and hiding the decryption subroutine, or by obfuscating the source code.

But you can't stop a determined user from destroying your program and its data or from interrupting it. Once your program is in the air, I think its memory segment is protected from access by other processes. This doesn't stop a user from decompiling and trying to make sense of your program before running it, though.

Security which aims to protect software from its owner is bound to rely on clever hacks. Remeber that your clever hacks are subject to circumvention by other clever hacks. Python was designed as an open language, so you might have better luck with other languages if you intend to design "sneaky" programs.

If the user will be running your program on his machine you just can't hide the data.

The user running the program can access everything the program can, is just a matter of knowing where to look.

If you are dealing with end-users then just encrypt the data and decrypt it at the last point before using it. Just keep in mind that, at some point, the data has to be decrypted on memory and then the user could see it.

Maybe you should look at CouchDB? You basically store data in JSON as document. It works perfectly.

http://code.google.com/p/couchdb-python/

Do you mean within as in within memory, and not on disk? A really simple solution to storing data in memory and making it easily accessible throughout the program would be to make a python file containing a dictionary in your program's root directory.

for example, py/storage.py would contain:

data = {}

Then you could write to it like so:

import storage
storage.data['foo'] = 'bar'

and then read the same value in another file like this:

import storage
foo = storage.data['foo']

The data in the storage module will be accessible to all other modules within the program, but will be erased when the program exits. The user will not be able to touch it without modifying the program. If you need something more database related, sqlite allows you to create databases in memory that only stay in memory as long as the program is running

Have a quick look at this, and this. Those libraries may be of some help. I feel compelled to note that an experienced computer software engineer will probably find a way to circumnavigate whatever implemented method of data concealing by simply reading your source file, deconstruction of your compiled code, or even possibly a feasible brute-force decryption (I could be wrong).

If you need to protect the data from tampering (but you are not worried about someone seeing the data), then deploying your project with a custom loader is the way to go such as signet (http://jamercee.github.io/signet/). Put your data blob in your script, and it's values will be incorporated in it's sha1 calculations. If anyone were to modify the data blob, the loader will detect it as tampering, and refuse to run it.