[TriLUG] Search for Duplicate Filename, Ignore Extension

Robert Dale robdale at gmail.com
Wed Nov 2 13:29:46 EDT 2011


Generate database of files:

find . -type f | while IFS= read -r file; do
  base=$(basename "$file")
  printf '%s#%s\n' "${base%.*}" "$file"
done > files.txt

Find dupes:

cut -d '#' -f 1 files.txt | sort | uniq -d |
while IFS= read -r dupe; do
  awk -v d="${dupe}#" 'index($0, d) == 1' files.txt
done
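
If you'd rather skip the intermediate file, the same idea can be sketched as
a single find piped into awk. This is only a rough sketch: the function name
find_dupe_stems is made up for illustration, it assumes no filename contains
a newline, and it treats a leading dot (as in .bashrc) as part of the name
rather than as an extension.

```shell
# Hypothetical helper: print every file under a directory whose name,
# with the final extension removed, is shared with at least one other file.
find_dupe_stems() {
    find "${1:-.}" -type f | awk '
        {
            name = $0
            sub(/.*\//, "", name)          # drop the directory part
            stem = name
            if (stem ~ /.\.[^.]*$/)        # leave dotfiles like .bashrc alone
                sub(/\.[^.]*$/, "", stem)  # drop the final extension only
            paths[stem] = (stem in paths) ? paths[stem] "\n" $0 : $0
            count[stem]++
        }
        END {
            for (stem in count)
                if (count[stem] > 1)
                    print paths[stem]
        }' | sort
}
```

Run it as "find_dupe_stems /path/to/drive". The output is sorted by path, so
the members of one group are not necessarily adjacent; sort on the basename
instead if you want them side by side.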

Caveats:
database delimiter is '#' so if you have those in your filename, well,
shame on you.
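
If you can't vouch for your filenames, one way around the '#' problem is to
use '/' as the delimiter, since a filename can never contain one: everything
before the first '/' on a line is the name, everything after it is the path.
A rough sketch, with made-up function names (build_db, list_dupes) and still
assuming no newlines in filenames:

```shell
# Hypothetical variant of the database step: write "stem/path" lines.
build_db() {
    find "${1:-.}" -type f | while IFS= read -r file; do
        base=${file##*/}
        case $base in
            ?*.*) stem=${base%.*} ;;   # extension after at least one character
            *)    stem=$base ;;        # no extension, or a dotfile like .bashrc
        esac
        printf '%s/%s\n' "$stem" "$file"
    done
}

# Hypothetical lookup: print the path of every entry whose stem occurs
# more than once in the database file given as the first argument.
list_dupes() {
    awk -F/ 'NR == FNR { count[$1]++; next }
             count[$1] > 1 { print substr($0, length($1) + 2) }' "$1" "$1"
}
```

Usage would be "build_db . > /tmp/files.txt" followed by
"list_dupes /tmp/files.txt". Keep the database outside the tree you are
scanning, or find will list it too.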

On Wed, Nov 2, 2011 at 1:22 PM, Jason Evans <jason.s.evans at gmail.com> wrote:
> I thought fslint also has a filename search function or at least it had.  I
> used it to find multiple copies of the same ebook that calibre sometimes
> creates.
>
> Best Regards,
> Jason
>
>
>
> On Wed, Nov 2, 2011 at 1:11 PM, Peter Neilson <neilson at windstream.net> wrote:
>
>> I'd use locate, assuming I have updatedb running. Try
>>
>> info locate
>>
>> and see what you get.
>>
>>
>> On Wed, 02 Nov 2011 12:44:47 -0400, Brian Blater <brb.lists at gmail.com>
>> wrote:
>>
>>> Say I have a drive filled with files, be it images or music or
>>> whatever, and I need to find duplicates of the files, but I want to
>>> only find the dupes by filename and ignore the extension. For example,
>>> say I have an image called picture_of_me.jpg but I'm pretty sure that
>>> was created from an original file with the same name but another
>>> extension. I just want to find all files with the same filename
>>> regardless of its extension.
>>>
>>> I've looked at fslint and fdupes but they just appear to look at files
>>> based on hashes and since a jpg is different than a tiff etc., nothing
>>> is found. I would imagine this could be done with a simple script, but
>>> I'm no scripting genius and any script I created would probably end up
>>> being 100 lines when it only needed to be 1.
>>>
>>> Any ideas from our gurus out there?
>>>
>>> Thanks,
>>> Brian
>>>
>> --
>> This message was sent to: Jason Evans <jason.s.evans at gmail.com>
>> To unsubscribe, send a blank message to trilug-leave at trilug.org from that address.
>> TriLUG mailing list : http://www.trilug.org/mailman/listinfo/trilug
>> Unsubscribe or edit options on the web  : http://www.trilug.org/mailman/options/trilug/jason.s.evans%40gmail.com
>> TriLUG FAQ          : http://www.trilug.org/wiki/Frequently_Asked_Questions
>>



-- 
Robert Dale


