I'm already checking for content-type, size, and extension (Django (audio) File Validation), but I need a library to read the file and confirm that it is in fact what I hope it is (mp3 and mp4 mostly).
I've been here: http://wiki.python.org/moin/Audio/ but no luck. Been at this one for a while, am a bit lost in the woods. Relying on SO big time for this whole end of things...
Thanks in advance.
EDIT: I'm already (in Django) using UploadedFile.content_type() :
"The content-type header uploaded with the file (e.g. text/plain or application/pdf). Like any data supplied by the user, you shouldn't trust that the uploaded file is actually this type. You'll still need to validate that the 开发者_如何转开发file contains the content that the content-type header claims -- "trust but verify."
So, I'm already reading the header. But how can I validate the actual content of the file?
If just checking the header isn't good enough, I'd recommend using mutagen
to load the file. It should throw an exception if it's not correct.
FYI, I do not think your approach is very scalable. Is it really necessary to read every byte of the file? What is your reason for not trusting the file header?
You can call a unix sub-shell within python like this:
>>> filename = 'Giant Steps.mp3'
>>> import os
>>> type = os.system('file %s' % filename)
Giant Steps.mp3: ISO Media, MPEG v4 system, iTunes AAC-LC
** See man pages for more details on the 'file' command if you want to go this route.
See this post for other options
Use sndhdr
It does a little more than content-type. Reads the file and gets it's headers..of course this is still not foolproof..using ffmpeg is probably then the only option.
精彩评论