
Get-Unique; please add a -duplicate parameter by b.koehler


Status: Active


Type: Suggestion
ID: 830066
Opened: 3/7/2014 6:20:20 PM
Access Restriction: Public

Description

Please add a -duplicate parameter to Get-Unique.

Getting a list of unique items is easy; getting a list of duplicate items is surprisingly difficult. Across a few hundred thousand lines of text, Get-Content, Sort-Object, and Get-Unique perform quite well. The problem arises when I need only the duplicates, because Group-Object and Compare-Object are extremely slow.

Please add the following parameter:
-duplicate     only output duplicate objects

Ideally I'd be able to do the following:
$file = Get-Content .\file.txt | sort
$duplicates = $file | get-unique -duplicate

Instead I'm doing the following:
$file = Get-Content .\file.txt | sort
$duplicates = $file | uniq.exe -d
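
Until such a parameter exists, the behaviour requested above can be approximated in pure PowerShell. The following is only a minimal sketch: Get-Duplicate is a hypothetical helper name, not a built-in cmdlet, and it assumes the input is already sorted so that equal lines are adjacent. The comparison is ordinal (case-sensitive), which is an assumption chosen to resemble the default behaviour of uniq.exe -d.

```powershell
# Hypothetical helper (not a built-in cmdlet): for already-sorted input,
# emits each value that occurs more than once, one time per run of equal lines.
function Get-Duplicate {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline = $true)]
        [string]$InputObject
    )
    begin {
        $previous = $null    # first line of the current run
        $emitted  = $false   # whether the current run has already been output
    }
    process {
        $same = ($null -ne $previous) -and
                [string]::Equals($InputObject, $previous, [System.StringComparison]::Ordinal)
        if ($same) {
            # Second or later occurrence: output the value only once per run
            if (-not $emitted) {
                $InputObject
                $emitted = $true
            }
        }
        else {
            # A new run starts here
            $previous = $InputObject
            $emitted  = $false
        }
    }
}
```

With that helper defined, the pipeline would read: $duplicates = Get-Content .\file.txt | sort | Get-Duplicate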

#Code examples
#file.txt: 1319998 bytes, 10000 lines of 64-character strings

#uniq.exe example 1
(Measure-Command{$example1 = $file | uniq.exe -d}).TotalSeconds
#0.7767339
#0.7557846
#0.7624836

#hashtable example 2
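(Measure-Command{
#Count the occurrences of each line (keys in a @{} hashtable are case-insensitive)
$hashtable = @{}
$file | ForEach-Object {$hashtable["$_"] += 1}
#Keep only the lines that were counted more than once
$example2 = $hashtable.keys | Where-Object {$hashtable["$_"] -gt 1} | sort
}).TotalSeconds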
#0.8028991
#0.788281
#0.7761327

#Compare-Object example 3
(Measure-Command{
$unique = $file | Select-Object -Unique
$example3 = (Compare-Object $file $unique).InputObject | Get-Unique
}).TotalSeconds
#19.4573219
#19.2388358
#19.2343912

#Group-Object example 4
(Measure-Command{$example4 = ($file | Group-Object | Where-Object {$_.count -gt 1}).name}).TotalSeconds
#19.3467626
#19.2612973
#19.2966829

#Verify our results
Compare-Object $example1 $example2
Compare-Object $example1 $example3
Compare-Object $example1 $example4
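
A possible variation on the hashtable approach in example 2 is sketched below. It is untimed and is offered only as an assumption-laden alternative: Get-DuplicateSet is a hypothetical helper name, and it relies on the .NET HashSet[string] class, whose Add method returns $false when the value is already present. Unlike the @{} hashtable, the default HashSet[string] comparison is case-sensitive, and the input does not need to be sorted first.

```powershell
# Hypothetical helper (not a built-in cmdlet): returns the sorted set of
# lines that appear more than once in $Lines.
function Get-DuplicateSet {
    param([string[]]$Lines)

    $seen       = New-Object 'System.Collections.Generic.HashSet[string]'
    $duplicates = New-Object 'System.Collections.Generic.HashSet[string]'

    foreach ($line in $Lines) {
        # Add() returns $false if the line was already in $seen
        if (-not $seen.Add($line)) {
            [void]$duplicates.Add($line)
        }
    }

    # Sort the result so it can be compared with the other examples
    $duplicates | Sort-Object
}
```

It could be measured in the same way as the other examples, for instance: (Measure-Command{$example6 = Get-DuplicateSet -Lines $file}).TotalSeconds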
Posted by b.koehler on 3/10/2014 at 3:49 PM
#Compare-Object example 5
(Measure-Command{
$unique = $file | get-unique
$example5 = (Compare-Object $file $unique).InputObject | get-unique
}).TotalSeconds
#1.2992229
#1.3230796
#1.3470929

Compare-Object $example1 $example5