Cookie Notice

As far as I know, and as far as I remember, nothing in this page does anything with Cookies.

2017/05/08

count is not uniq...

I recall reading something online about 20 years ago (gasp!) where the authors were looking for a core set of knowledge that would constitute "knowing Unix", and found that there just wasn't. Knowing Unix was like the Humpty Dance, in that no two people do it the same.

And, presumably, you have Unix down when you appear to be in pain.


I have been a Unix/Linux user since the 1990s and I only found out about uniq -c because of the last post. I had been using sort | uniq forever, and have recently stopped in favor of sort -u, which I also learned about recently.

I find that uniq is kinda useless without a sort in front of it; if your input is "foo foo foo bar foo" (with requisite newlines, of course), uniq without sort will give you "foo bar foo" instead of "foo bar" or "bar foo", either of which is closer to what I want.

So, I could see adding alias count=" sort | uniq " to my bash setup, but adding a count program to my ~/bin seems much better to me, much closer to the Right Thing.

marc chantreux suggested an implementation of count that is perhaps better and certainly shorter than the one I posted. There was regex magic that I simply didn't need, because I wanted the count to stand by itself (but I might revisit to remove the awk step, because as a user, I'm still a bit awkward with it.)

B)

my %seen ;

map { $seen{$_}++ } do {
    @ARGV ? @ARGV : map { chomp ; $_ } <>;
    } ;

while ( my ( $k, $v ) = each %seen ) {
    say join "\t", $v, $k ;
    }
I like marc's use of the ternary operator to handle STDIN vs @ARGV, but I'm somewhat inconsistently against map over for. I know people who thing that map not going into an array is a problem, so I don't go back to it often.

I do, however, do for my $k ( keys %seen ) { ... } enough that I'm sort of mad at myself for not encountering each before.

ETA: It's been brought to my attention that using map {} as a replacement for for () {} is not good.