Wednesday, September 14, 2011

File entropy calculator

A long time ago, a colleague asked me: given a bunch of files, how could we pick out the ones that might be encrypted?

I remembered that encrypted files tend to have higher entropy than typical files, since good ciphertext looks almost like random bytes. So I wrote a quick Python script (I was learning the language at the time) and ran it on our large dataset. It proved really useful: we were able to quickly find all the encrypted files.
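For readers who want to see the idea in code, here is a minimal sketch of a byte-level Shannon entropy calculation. It is not the script from the repository; the function names `shannon_entropy`, `file_entropy` and `rank_by_entropy` are made up for this illustration, and it assumes each file is small enough to read into memory in one go.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    # H = sum over byte values of p * log2(1 / p), with p = count / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def file_entropy(path: str) -> float:
    """Entropy of a whole file (hypothetical helper; reads the file at once)."""
    with open(path, "rb") as handle:
        return shannon_entropy(handle.read())


def rank_by_entropy(paths):
    """Return (entropy, path) pairs, highest entropy first."""
    return sorted(((file_entropy(p), p) for p in paths), reverse=True)
```

A result close to 8 bits per byte means the bytes are spread almost uniformly, which is what encrypted data tends to look like. Plain text and most structured formats usually score noticeably lower, so sorting by this value pushes the likely candidates to the top of the list.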

There were a few false positives, mostly compressed files, which also have high entropy. Feel free to drop me a line if you find this useful or if you find any bugs.

Git repository



2 comments:

  1. Hey thanks for this. I was thinking about creating a similar app but looks like you beat me to it :)

  2. You are most welcome. If there is any additional feature that would be useful, let me know :-)
