How the Hell did I do that?: File type count

Monday, February 20, 2012

File type count

Say you have a big directory of files spread across tons of subfolders, and you want to find the unique file types present, along with the count for each one. Try this:

find . | sed -e 's/\.[^.]*\.\(.*\)/\1/g' | grep -v '^\.' | sort | uniq -c > ~/files3.txt

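find lists every file and folder under the current directory, recursing into all subdirectories.
sed strips each path down to the extension after the dot in the filename.
grep drops the entries that have no extension (mostly folders); sed leaves them unchanged, so they still begin with a dot.
sort orders the extensions alphabetically; uniq only merges adjacent duplicate lines, so it needs sorted input.
uniq -c collapses the duplicates and prefixes each extension with its count.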
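The reason uniq depends on sort is that it only merges duplicate lines that sit next to each other. Here is a minimal sketch of that behaviour, using a made-up three-line input (the variable names are only for illustration) rather than real find output:

```shell
# Hypothetical input: "txt" appears twice, but not on adjacent lines.
sample='txt
jpg
txt'

# Without sort, uniq sees txt, jpg, txt as three separate runs.
unsorted=$(printf '%s\n' "$sample" | uniq -c | wc -l | tr -d ' ')

# With sort, the two txt lines become adjacent and are merged into one.
sorted=$(printf '%s\n' "$sample" | sort | uniq -c | wc -l | tr -d ' ')

echo "lines without sort: $unsorted, with sort: $sorted"
# prints: lines without sort: 3, with sort: 2
```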

Alternatively,

find . | sed -e 's/\.[^.]*\.\(.*\)/\1/g;/\//d' | sort | uniq -c > ~/files3.txt

Here sed takes over grep's job: after the substitution, it deletes any line that still contains a slash, which is the case for extension-less folders (assuming folder pathnames don't contain dots).
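To see the sed-only variant at work without touching a real directory, it can be fed a few invented paths. This is only a sketch: count_exts is a hypothetical helper name, the sample paths are made up, and the awk step is an addition that trims the padding uniq -c puts in front of each count.

```shell
# Hypothetical helper: reads paths on stdin, prints "count extension" pairs.
count_exts() {
  sed -e 's/\.[^.]*\.\(.*\)/\1/g;/\//d' | sort | uniq -c | awk '{print $1, $2}'
}

# Invented sample of the kind of lines `find .` prints.
result=$(printf '%s\n' ./photos ./photos/a.jpg ./notes.txt ./photos/b.jpg ./README | count_exts)

echo "$result"
# prints:
# 2 jpg
# 1 txt
```

The folder ./photos and the extension-less file ./README both keep their slash after the substitution, so the d command removes them before counting.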

3 comments:

Unknown, July 24, 2012 at 6:09 AM
I prefer to use a MS-Windows GUI tool: Directory Report
http://www.file-utilities.com
After scanning your files select menu:
Largest / Display Largest Type
Antonio, July 26, 2012 at 11:15 AM
A little improvement that keeps only everything after the last slash, and sorts the counts:

find . | sed -e 's/.*\/\(.*\)$/\1/g' | sed -e 's/.*\.\(.*\)$/\1/g' | sort | uniq -c | sort -n