/var/log/rant: 2017

2017/11/09

Blogging Elsewhere

I discovered Jekyll.

Jekyll allows me to write blogs in Markdown and add them with git.

This is much closer to the workflow I want, because it's a very developer-centric blogging style.

So, right now, I am mostly blogging at https://jacoby.github.io/.

2017/08/02

Considering Code Documentation

I was procrastinating about a work project, thinking about magnets, thinking about crushers, thinking about thermite, when I thought about my status tracker.

It goes by the name status.pl because status is already a DBus utility. I made some other decisions relating to command-line interface, moving away from a cool idea I had a few years ago that hasn't aged well. I changed it to record.pl, which isn't similarly overloaded on my system, at least, and wrote a few flags for it, mostly for overkill. I mean, at core, it's because Len Budney convinced me my scripts needed manners, especially it not doing anything when just called.

So, I wrote this. All the details are tucked in Status.pm, which I am not sharing and which needs an overhaul too. All in all, I am reasonably happy with this; If there was a useless use of sub signatures award, I think this'd be a contender, but both that and postderef are used in get_records.pl, so it's okay. (I put no warnings because I'm running 5.26 on my desktop, but it is likely that I'll copy this to cluster machines running our 5.20 or the system 5.10 Perls, which would need the warnings.) I could and probably should go without IO::Interactive, but still.

So, my question is, beyond the user documentation that gets passed through Pod::Usage, what documentation does this program need? I'm always at a loss for examples of what is needed. In general, programmers discount documentation; most editor themes I see make comments gray and muted, which has always angered me, but looking at this (and yes, it's a slight case) I'm at a loss to decide what would be important to add.

So, what comments would you add? What do you find wanting?

(If desired, to add to this conversation, I can add the display section and the module.)

2017/07/26

Using VS Code and Ligatures

Quick history: I started out as a vi man, having the comical "how do I save and exit?" issues with emacs that I see lots of people complain about for vim. After college, my first job's standard editor was UltraEdit. In the lab, I experimented with KomodoEdit, have been a Sublime Text 2 user, and right now, on both Windows and Linux, I'm using Visual Studio Code.

As well as vim. There are occasional things VS Code can't do, or at least can't do with the packages I know about. Line sorting, for one. I like to have my modules sorted; it helps me to see if the one I want is called.

Similarly, over time, I have developed a fondness for specific fonts in my editor. I cannot go through a long history of preference, but I can say that variations of Droid Sans Mono with modifications for a dotted or slashed zero do more easily distinguish them from capital O are normally what I run in terminals and editors.

But, I recently saw Scott Hanselman blog on Monospace Programming Fonts with Ligatures.

What's that?

Consider the code from yesterday's post on Sentiment Analysis. There are a couple places where there are skinny arrows (->), fat arrows (=>) and greater-than-and-equal signs (>=). Look specifically at line 21 for skinny arrows and greater-and-equal. Pretty, isn't it? I will also point out that line 21 also shows the dotted 0 and how easy it is to distinguish it from letters.

Also look at the logical AND (&&) on like 36.

These are just in-the-editor changes; the code from yesterday's post is copied from this editor.

The font I'm using is called Fira Code, and it's on GitHub. There are others, like Monoid and Hasklig that can do this, but I like Fira Code. Within VS Code, you also have to add "editor.fontLigatures": true to your settings.

I have it as my font for the HexChat IRC client, which supports ligatures, and Git for Windows Bash, Windows Subsystem for Linux and PowerShell terms on Windows, which don't, but the font still looks good. I noticed an issue where it swallowed the equal sign in Gnome Terminal, so there I'm back to Droid Sans Mono Slashed, but that might have been a temporary issue.

Hrmm. Yesterday, I wrote on an Azure API. Today, I'm praising a Visual Studio-related editor. Tomorrow, I might get into the Windows Subsystem for Linux.

2017/07/25

Sentimentalizing Twitter: First Pass

It started here.

I need a Twitter that sends all my grumpy thoughts into oblivion rather than to actual people.
— Phil Sands (@PurdueCSPhil) July 10, 2017

I couldn't leave it there.

You could do sentiment analysis as an alternate client or regularly after-the-fact to cut down on that. We have the technology.
— Code By Java (@JacobyDave) July 10, 2017

I can write a tool that goes through all old tweets and deletes ones that don't pass criteria, but I prefer to get out ahead of the issue and leave the record as it is.

And, as a later pass, one can pull a corpus, use a module like Algorithm::NaiveBayes and make your own classifier, rather than using Microsoft Research's Text Analytics API or another service. I was somewhere along that process when the hosting died, so I'm not bringing it here.

I was kinda under the thrall of Building Maintainable Software, or at least the first few chapters of it, so I did a few things differently in order to get their functions less than 15 lines, but the send_tweet function didn't get the passes it'd need, and I could probably give check_status some love, or perhaps even roll it into something like WebService::Microsoft::TextAnalytics. In the mean time, this should allow you to always tweet on the bright side of life.

#!/usr/bin/env perl

use feature qw{ postderef say signatures } ;
use strict ;
use warnings ;
use utf8 ;

no warnings qw{ experimental::postderef experimental::signatures } ;

use Carp ;
use Getopt::Long ;
use IO::Interactive qw{ interactive } ;
use JSON ;
use LWP::UserAgent ;
use Net::Twitter ;
use YAML qw{ LoadFile } ;

my $options = options() ;
my $config  = config() ;

if ( check_status( $options->{ status }, $config ) >= 0.5 ) {
    send_tweet( $options, $config ) ;
    }
else { say qq{Blocked due to negative vibes, man.} }
exit ;

sub send_tweet ( $options, $config ) {
    my $twit = Net::Twitter->new(
        traits          => [ qw/API::RESTv1_1/ ],
        consumer_key    => $config->{ twitter }{ consumer_key },
        consumer_secret => $config->{ twitter }{ consumer_secret },
        ssl             => 1,
        ) ;
    my $tokens = $config->{ twitter }{ tokens }{ $options->{ username } } ;
    if (   $tokens->{ access_token }
        && $tokens->{ access_token_secret } ) {
        $twit->access_token( $tokens->{ access_token } ) ;
        $twit->access_token_secret( $tokens->{ access_token_secret } ) ;
        }
    if ( $twit->authorized ) {
        if ( $twit->update( $options->{ status } ) ) {
            say { interactive } $options->{ status } ;
            }
        else {
            say { interactive } 'FAILED TO TWEET' ;
            }
        }
    else {
        croak( "Not Authorized" ) ;
        }
    }

sub check_status ( $status, $config ) {
    my $j   = JSON->new->canonical->pretty ;
    my $key = $config->{ microsoft }{ text_analytics }{ key } ;
    my $id  = 'tweet_' . time ;
    my $object ;
    push @{ $object->{ documents } },
        {
        language => 'EN',
        text     => $status,
        id       => $id,
        } ;
    my $json    = $j->encode( $object ) ;
    my $api     = 'https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment' ;
    my $agent   = LWP::UserAgent->new ;
    my $request = HTTP::Request->new( POST => $api ) ;
    $request->header( 'Ocp-Apim-Subscription-Key' => $key ) ;
    $request->content( $json ) ;
    my $response = $agent->request( $request ) ;

    if ( $response->is_success ) {
        my $out = decode_json $response->content ;
        my $doc = $out->{ documents }[ 0 ] ;
        return $doc->{ score } ;
        }
    else {
        croak( $response->status_line ) ;
        }
    return 1 ;
    }

sub config () {
    my $config ;
    $config->{ twitter }   = LoadFile( join '/', $ENV{ HOME }, '.twitter.cnf' ) ;
    $config->{ microsoft } = LoadFile( join '/', $ENV{ HOME }, '.microsoft.yml' ) ;
    return $config ;
    }

sub options () {
    my $options ;
    GetOptions(
        'help'       => \$options->{ help },
        'username=s' => \$options->{ username },
        'status=s'   => \$options->{ status },
        ) ;
    show_usage( $options ) ;
    return $options ;
    }

sub show_usage ($options) {
    if (   $options->{ help }
        || !$options->{ username }
        || !$options->{ status } ) {
        say { interactive } <<'HELP';
Only Positive Tweets -- Does text analysis of content before tweeting

    -u  user        Twitter screen name (required)
    -s  status      Status to be tweeted (required)
    -h  help        This screen
HELP
        exit ;
        }
    }
__DATA__

.microsoft.yml looks like this
---
text_analytics:
    key: GENERATED_BY_MICROSOFT

.twitter.cnf looks like this

---
consumer_key: GO_TO_DEV.TWITTER.COM
consumer_secret: GO_TO_DEV.TWITTER.COM
tokens:
    your_username:
        access_token: TIED_TO_YOU_AS_USER_NOT_DEV
        access_token_secret:TIED_TO_YOU_AS_USER_NOT_DEV

I cover access_tokens in https://varlogrant.blogspot.com/2016/07/nettwitter-cookbook-how-to-tweet.html

2017/07/22

The Next Thing

As discussed last time, I had been using my GitHub Pages space as a list to my Repos. I had been considering moving my blogging from here to ... something else, and this looked like an interesting concept.

I've always developed the web with a smart server side, and I've known from the start that this makes you very vulnerable, so I do like the idea of writing markdown, committing it to the repo and having the system take care of it from there. So, that's a win.

But, as far as I can tell, I've followed the "this is how you make an Atom feed" magic and get no magic from it, and that, more than webhooks triggering on push, is how you start putting together the social media hooks that make blogging more than writing things down in a notebook. Which is a lose.

So, I'm not 100% happy with GitHub Pages and Jekyll, but the good thing is that I can write and commit from anywhere. If I used Statocles or another static site generator, I'd have to have that system running on whatever computer I blog from, or transfer it to there.

I would guess that, if I had the whole deal running on any of my machines, some of the small things would work better, but so far, getting a setup that displays pages exactly like github.io on localhost has been less that working, And, I would've like to have this as a project page, jacoby.github.io/blog, so my personal page could be more of a landing page, but alas.

And, ultimately, I do want to not have myblog.<service>.com, but rather myblog.<me>.com. Every time I think about it, I think about the tools I'd build on it, rather than the billboard for me, but

2017/07/13

Github to Perl to Github? Putting your projects on a web page

My projects are on GitHub: https://github.com/jacoby/

I have a page on GitHub: https://jacoby.github.io/

Early in my playing with Bootstrap, I made this as a way to begin to play with it. It is about as simple a GitHub API to LWP to Template Toolkit to Bootstrap tool as I could have written. I'm now thinking about how to make this prettier, but for now, it's what I use to generate, when I remember to redo my GitHub page. Learn and enjoy.

2017/07/07

Temperature based on Current Location for the Mobile Computer

I'm always curious about how people customize their prompt. I put name and machine in, with color-coding based on which machine, because while spend most of my time on one or two hosts, I have reason to go to several others.

I work in a sub-basement, and for most of my work days, I couldn't tell you if it was summer and warm, winter and frozen, or spring and waterlogged outside, so one of the things I learned to check out is current weather information. I used to put the temperature on the front panel of our HP printer, but we've moved to Xerox.

Currently, I use DarkSky, formerly forecast.io. I know my meteorologist friends would recommend more primary sources, but I've always found it easy to work with.

I had this code talking to Redis, but I decided that it was an excuse to use Redis and this data was better suited for storing in YAML, so I rewrote it.

store_temp

#!/usr/bin/env perl

# stores current temperature with YAML so that get_temp.pl can be used
# in the bash prompt to display current temperature

use feature qw{ say state } ;
use strict ;
use warnings ;

use Carp ;
use Data::Dumper ;
use DateTime ;

use IO::Interactive qw{ interactive } ;
use JSON ;
use LWP::UserAgent ;
use YAML::XS qw{ DumpFile LoadFile } ;

my $config = config() ;
my $url
    = 'https://api.darksky.net/forecast/'
    . $config->{apikey} . '/'
    . ( join ',', map { $config->{$_} } qw{ latitude longitude } ) ;
my $agent = LWP::UserAgent->new( ssl_opts => { verify_hostname => 0 } ) ;
my $response = $agent->get($url) ;

if ( $response->is_success ) {
    my $now = DateTime->now()->set_time_zone('America/New_York')->datetime() ;
    my $content  = $response->content ;
    my $forecast = decode_json $content ;
    my $current  = $forecast->{currently} ;
    my $temp_f   = int $current->{temperature} ;
    store( $now, $temp_f ) ;
    }
else {
    say $response->status_line ;
    }

exit ;

# ======================================================================

sub store {
    my ( $time, $temp ) = @_ ;
    say {interactive} qq{Current Time: $time} ;
    say {interactive} qq{Current Temperature: $temp} ;
    my $data_file = $ENV{HOME} . '/.temp.yaml' ;
    my $obj       = {
        curr_time => $time,
        curr_temp => $temp,
        } ;
    DumpFile( $data_file, $obj ) ;
    }

# ======================================================================
# Reads configuration data from YAML file. Dies if no valid config file
# if no other value is given, it will choose current
#
# Shows I need to put this into a module
sub config {
    my $config_file = $ENV{HOME} . '/.forecast.yaml' ;
    my $output      = {} ;
    if ( defined $config_file && -f $config_file ) {
        my $output = LoadFile($config_file) ;
        $output->{current} = 1 ;
        return $output ;
        }
    croak('No Config File') ;
    }

And this is the code that reads the YAML and prints it, nice and short and ready to be called in PS1.

get_temp

#!/usr/bin/env perl

# retrieves the current temperature from YAML to be used in the bash prompt

use feature qw{ say state unicode_eval unicode_strings } ;
use strict ;
use warnings ;
use utf8 ;
binmode STDOUT, ':utf8' ;

use Carp ;
use Data::Dumper ;
use YAML::XS qw{ LoadFile } ;

my $data_file = $ENV{HOME} . '/.temp.yaml' ;
my $output    = {} ;
if ( defined $data_file && -f $data_file ) {
    my $output = LoadFile($data_file) ;
    print $output->{curr_temp} . '°F' || '' ;
    exit ;
    }
croak('No Temperature File') ;

I thought I put a date-diff in there. I wanted to be able say 'Old Data' if the update time was too long ago. I should change that.

I should really put the config files in __DATA__ for show, because it will show that the location is hard-coded. For a desktop or server, that makes sense; it can only go as far as the power plug stretches. But, for other reasons, I adapted my bash prompt on my Linux laptop, and I recently took it to another state, so I'm thinking more and more that I need to add a step, to look up where I am before I check the temperature.

store_geo_temp

#!/usr/bin/env perl

# Determines current location based on IP address using Google
# Geolocation, finds current temperature via the DarkSky API
# and stores it into a YAML file, so that get_temp.pl can be
# in the bash prompt to display current local temperature.

use feature qw{ say state } ;
use strict ;
use warnings ;
use utf8 ;

use Carp ;
use Data::Dumper ;
use DateTime ;
use IO::Interactive qw{ interactive } ;
use JSON::XS ;
use YAML::XS qw{ DumpFile LoadFile } ;

use lib $ENV{HOME} . '/lib' ;
use GoogleGeo ;

my $json     = JSON::XS->new->pretty->canonical ;
my $config   = config() ;
my $location = geolocate( $config->{geolocate} ) ;
croak 'No Location Data' unless $location->{lat} ;

my $forecast = get_forecast( $config, $location ) ;
croak 'No Location Data' unless $forecast->{currently} ;

say {interactive} $json->encode($location) ;
say {interactive} $json->encode($forecast) ;

my $now     = DateTime->now()->set_time_zone('America/New_York')->datetime() ;
my $current = $forecast->{currently} ;
my $temp_f  = int $current->{temperature} ;
store( $now, $temp_f ) ;

exit ;

# ======================================================================
# Reads configuration data from YAML files. Dies if no valid config files
sub config {
    my $geofile = $ENV{HOME} . '/.googlegeo.yaml' ;
    croak 'no Geolocation config' unless -f $geofile ;
    my $keys = LoadFile($geofile) ;

    my $forecastfile = $ENV{HOME} . '/.forecast.yaml' ;
    croak 'no forecast config' unless -f $forecastfile ;
    my $fkeys = LoadFile($forecastfile) ;
    $keys->{forecast} = $fkeys->{apikey} ;
    croak 'No forecast key' unless $keys->{forecast} ;
    croak 'No forecast key' unless $keys->{geolocate} ;
    return $keys ;
    }

# ======================================================================
# Takes the config for the API keys and the location, giving us lat and lng
# returns the forecast object or an empty hash if failing
sub get_forecast {
    my ( $config, $location ) = @_ ;
    my $url
        = 'https://api.darksky.net/forecast/'
        . $config->{forecast} . '/'
        . ( join ',', map { $location->{$_} } qw{ lat lng } ) ;
    my $agent = LWP::UserAgent->new( ssl_opts => { verify_hostname => 0 } ) ;
    my $response = $agent->get($url) ;

    if ( $response->is_success ) {
        my $content  = $response->content ;
        my $forecast = decode_json $content ;
        return $forecast ;
        }
    return {} ;
    }

# ======================================================================

sub store {
    my ( $time, $temp ) = @_ ;
    say {interactive} qq{Current Time: $time} ;
    say {interactive} qq{Current Temperature: $temp} ;
    my $data_file = $ENV{HOME} . '/.temp.yaml' ;
    my $obj       = {
        curr_time => $time,
        curr_temp => $temp,
        } ;
    DumpFile( $data_file, $obj ) ;
    }

A few things I want to point out here. First off, you could write this with Getopt::Long and explicit quiet and verbose flags, but Perl and IO::Interactive allow me to make this context-specific and implicit. If I run it myself, interactively, I am trying to diagnose issues, and that's when say {interactive} works. If I run it in crontab, then it runs silently, and I don't get an inbox filled with false negatives from crontab. This corresponds to my personal preferences; If I was to release this to CPAN, I would likely make these things controlled by flags, and perhaps allow latitude, longitude and perhaps API keys to be put in that way.

But, of course, you should not get in the habit, because then your keys show up in the process table. It's okay if you're the only user, but not best practice.

This is the part that's interesting. I need to make it better/strong/faster/cooler before I put it on CPAN, maybe something like Google::Geolocation or the like. Will have to read some existing Google-pointing modules on MetaCPAN before committing. Geo::Google looks promising, but it doesn't do much with "Where am I now?" work, which is exactly what I need here.

Google's Geolocation API works better when you can point to access points and cell towers, but that's diving deeper than I need; the weather will be more-or-less the same across the widest accuracy variation I could expect.

GoogleGeo

package GoogleGeo ;

# interfaces with Google Geolcation API 

# https://developers.google.com/maps/documentation/geolocation/intro

use feature qw{say} ;
use strict ;
use warnings ;

use Carp ;
use Data::Dumper ;
use Exporter qw(import) ;
use Getopt::Long ;
use JSON::XS ;
use LWP::Protocol::https ;
use LWP::UserAgent ;

our @EXPORT = qw{
    geocode
    geolocate
    } ;

my $json  = JSON::XS->new->pretty ;
my $agent = LWP::UserAgent->new ;

sub geocode {
    my ($Google_API_key,$obj) = @_ ;
    croak unless defined $Google_API_key ;
    my $url = 'https://maps.googleapis.com/maps/api/geocode/json?key='
        . $Google_API_key ;
    my $latlng = join ',', $obj->{lat}, $obj->{lng} ;
    $url .= '&latlng=' . $latlng ;
    my $object = { latlng => $latlng } ;
    my $r = $agent->post($url) ;
    if ( $r->is_success ) {
        my $j = $r->content ;
        my $o = $json->decode($j) ;
        return $o ;
        }
    return {} ;
    }

sub geolocate {
    my ($Google_API_key) = @_ ;
    my $url = 'https://www.googleapis.com/geolocation/v1/geolocate?key='
        . $Google_API_key ;
    my $object = {} ;
    my $r = $agent->post( $url, $object ) ;
    if ( $r->is_success ) {
        my $j = $r->content ;
        my $o = $json->decode($j) ;
        return {
            lat => $o->{location}{lat},
            lng => $o->{location}{lng},
            acc => $o->{accuracy},
            } ;
        }
    return {} ;
    }

'here' ;

If this has been helpful or interesting to you, please tell me so in the comments.

2017/07/06

Working Through Limitations: The Perl Conference 2017 Day 3

A Delorean with a Perl-powered center column and the cutest little Flux Capacitor on the dashboard. Oh, the wonders you can see at a developer conference.

"A man's got to know his limitations."

That's a line from Dirty Harry Callahan in Magnum Force, but it really described my planning for the Perl Conference. Once the calendar was up, I went in, first and foremost thinking "What are skills I need to learn?"

One crucial skill is version control. It's difficult to add to my main workflow, as I develop in production. (I live in fear.) But I'm increasingly adding it to my side projects. It is especially part of the process for maintaining the site for Purdue Perl Mongers, as well as aspects of HackLafayette, but beyond certain basics, I just didn't know much about how to use Git and version control to improve my projects. I learned how CPAN Testers tests your code on many platforms after you upload to CPAN, and how Travis-CL and Appveyor test against Linux, macOS and Windows after pushing to GitHub, but how track changes, align issues with branches, etc., are all new to me. So, I started Wednesday with Genehack and Logs Are Magic: Why Git Workflows and Commit Structure Should Matter To You. (Slides) I fully expect to crib ideas and aliases from this talk for some time to come.

There was a talk at a local software group featuring a project leader from Amazon on the technology involved with Alexa, which involved a lot of how this works, going from speech-to-text to tokenization and identifying of crucial keywords -- "Alexa, have Domino's order me a Pizza" ultimately boiling down to "Domino's" and "Pizza" -- and proceeding from there. It gave a sense of how Amazon is taking several hard problems and turning them into consumer tools.

What came very late in the talk is how to interface my systems with Amazon's "Talking Donkey", and I had a few conversations where we talked about starting the day with "Alexa, what's fresh hell is this?" and getting back a list of systems that broke overnight, but I lacked a strong idea of what is needed to make my systems interact with the Alexa system.

And an Echo.

But, thankfully, Jason Terry's Amazon, Alexa and Perl talk covered this, albeit more in the "Turn my smart lights on" sense than in the "Tell me what my work systems are doing" sense. Still, very much something I had been interested in.

But, as I implied, Amazon does a lot of heavy lifting with Alexa, getting it down to GETs and and POSTs against your API. If you're running this from home, where you have a consistent wired connection, this works. But, if you're running this, for example, in your car, you need it to be small, easy, and self-contained. Chris Prather decided to MAKE New Friends with a Raspberry Pi 3. This was a case of Conference-Driven Development, and he didn't have it ready to demonstrate at this time.

I've been trying to move my lab to the New Hotness over time, and because I will have to tie things back to existing systems, I have avoided learning too much in order to make it work. Joel Berger presented Vue.js, Mojolicious, and PostgreSQL chat in less than 50 lines. I've heard good things about Vue.js, which I heard from the project head was the parts he needed from Angular without the stuff he didn't use. (RFC Podcast) I use jQuery and Template and not much more, so the "only what you need" aspect sounds good to me. I fully expect to bug Joel on Twitter and IRC about this and other things I need to do with Mojolicious over the coming months. (Slides)

But, of course, not every talk needs to speak directly to my short-term needs. Stevan Little has been trying to add an object system to the Perl 5 for quite some time, and in Hold My Beer and Watch This, he talks about Moxie, his latest attempt.

OK, the Moxie talk was in the same room as the Mojolicious talk and this next one. I could have gone to one of the others, but I decided against. Oh well. I would put "do more object-oriented development" as a thing to learn, so I was glad to hear it.

The last full talk was The Variable Crimes We Commit Against JavaScript by Julka Grodel. I would say that I code Javascript half as much as I code Perl, but twice as much as I code in other languages. I knew certain parts, like how let gives lexical scoping like Perl's my. I had heard about let from my JS Twitter list, as well as from Matt Trout's talk, but there were certainly things I didn't know.

And, actually, things I still don't. I had technical difficulties with my laptop, and if I could've worked it out that day, I would've tried to set up a lightning talk. Alas, not only did it not work -- in the end, I swapped out the WiFi, and I think if I switched back, it'd be sane again -- I missed a lot of her talk about arrow functions, which piqued the interest of Perl's functional programming guru, Mark Jason Dominus. (In a not-surprising addition to my limitations, I still haven't gotten to the end of Higher Order Perl.) Anyway, I believe that, after I finish this post, I will go back and watch this again.

After this, there were Lightning Talks, interspersed with ads for various Perl Mongers groups, and urges for you to start a Perl Mongers group. I am likely going to do another post, with these talks from all three days, but to end this one, I'll show off one where M. Allan Noah talks about the SANE Project and how he reverse engineers scanner output to allow him to support scanners without vendor input.

Lack of documentation is a limitation, but knowing the limitation does not mean you have to stop, and lack of documentation will not stop him. We should all draw inspiration from that.

Now that I'm two weeks past the end, most of the talks are up, and I would like to commend the organizers. Conference organization is hard, and this venue made it harder. The US Patent and Trademark Office is a great place, but there are aspects that they as a venue wanted secured and I would've preferred to be more open, but it was a beautiful venue and I be glad to return.

But, the thing I'd like to commend them most on is the high quality and fast turnaround on talk videos. The audio is good, the video is clear and well-lit, and the slides take precedence over the speaker in framing. It's everything I want in a conference video.

2017/06/29

Three Little Words: The Perl Conference 2017 Day 2

I lost the bubble.

I wrote a post at the end of the first day of the Perl Conference, intending to do the same for the second and third, but I went to visit family and play with their puppies on Tuesday night, and puppies are better than blogging.

Puppies!

The beginning of the third day, my laptop hit a weird issue where it couldn't enable the wireless NIC. I fixed it at home, once I had access to precision screwdrivers again, but this meant that I couldn't blog the last day, either.

But waiting affords me the opportunity to add videos of the talks I attended. So, yay.

I started Day 2 with Sam Batschelet's Dancing In The Cloud, which was not what I was expecting, being more a deep dive in another domain. I was hoping for an introduction to using the Dancer framework on cloud providers. It was interesting, and I now know what kubernetes does and that etcd exists, but this is far enough away from where I live that I don't know if it'll give me much more than buzzword recognition when I next attend a technical conference.

Again, I would like to reiterate that it was deep for me. Good talk, good speaker, but I was the wrong audience. Read the talk descriptions, not just the titles!

This was followed by Chad Granum on Test2, which is not yet on YouTube. I came to TPC wanting to up my testing game, and I've been looking through Andy Lester's modules and brian d foy's Stack Overflow posts trying to get into the mindset. This talk is a long-form walk through the diffs between the old and new test frameworks, and by showing how to test using Test2, it showed a lot about how to test in general. Once the talk is up, I am sure I will go through it again and again, trying to get my head around it.

But just because we don't have this talk on YouTube, that doesn't mean we don't have Chad talking about testing.

My next talk was Adventures in Failure: Error handling culture across languages by Andrew Grangaard, which is another point of growth for me. All too often, I just let errors fall, so knowing how they're commonly done is a good thing, well presented.

Damian Conway was the keynote speaker, and his talk was called Three Little Words. Those are, of course, I ❤ Perl.

Also, class has method. Conway proceeded to implement Perl6-style OOP with Dios, or Declarative Inside-Out Syntax, which needed Keyword::Declare to enable the creation of new keywords in Perl 5. To get this going, he needed to more quickly parse Perl, so along comes PPR, the Pattern-based Perl Recognizer. It was in this talk that the previous Lightning Talk from Chad, including his module Test2::Plugin::Source, comes from.

The Lightning Talks were good, but the one that comes up as a good one to put here is Len Budney's Manners for Perl Scripts. This makes me want to look into CLI::Startup.

I hope to get to Day 3 and maybe another post just on Lightning Talks soon. An hour's dive into topics won't get you too deep, but an hours' lightning talks will will open up a number of possibilities for you to follow.

2017/06/27

Reverse Engineering Google Maps at Highway Speed

I was on I-70 in Maryland on Sunday, going to Alexandria, Virginia, along with a lot of others. I was using Google Maps for navigation. When I could look down, the route was looking red, indicating congestion and delay. Eventually, Google said, "We have an alternate route that will save you a half hour. Want to take it?"

Of course I said yes. So I took the next exit, went down some rural highways, through a small town and almost down someone's driveway, it seemed, and back onto I-70. I called it a win.

The Friday after, on my way home, it rained all the way through Maryland, West Virginia, Pennsylvania and Ohio, and well into Indiana. It occasionally rained hard enough that I started seeing other drivers turning on their hazard lights for others to see them, and I followed suit.

A truck had crashed in western Ohio, closing westbound lanes, and Google told me it would add an hour delay, but when it routed me through five miles of county roads to the next on-ramp, it was closed. I knew it -- I saw the chain across the road -- but Google didn't, so I alone, without a line of other vehicles, bird-dogged a route to the next on-ramp with Google urging me to make a u-turn every few hundred feet.

This made me think about what's actually going on inside the navigation feature of Google Maps. It starts in graph theory, establishing the US road system as a directed graph with weighted edges, showing Interstates as preferable to highways to county roads and side streets, and using Dijykstra's shortest path algorithm to find the best route.

In graph theory, everything is either a node or an edge. In this case, a node could be the intersection of 6th and Main, or the fork where I-70 splits off to I-270. The edges would be the road between 5th and 6th on Main St., or the long path between exits on the Interstate. In other uses of graph theory, the nodes are most important -- Facebook Users being nodes and their friendships being edges in Facebook's Social Graph -- but here, the information about the edges is the important part.

This is similar to how we used to do it, looking at a paper map and preferring Interstate for long-haul routes, judging this road or that road as preferable due to scenery or a hundred other criteria, dropping to surface streets only to get through the final few miles. But, Google knows this road gets gridlocked at rush hour and that one is under construction, which you can't tell from your ten-year-old road atlas, and has a huge body of historical data that allows it to present alternate routes and the estimated time difference between.

The increasing amount of data helps it properly weigh edges and make connections. Early in my time with Maps, it suggested I go from Lafayette to Ft Wayne through Indianapolis, which would actually add an hour to the trip, because it had weighted the Interstate so highly. Now, it properly suggests two east-west routes and avoids the southern detour entirely. Similarly, an intersection near work is right-turn only, and it took some time for Maps to not suggest a left turn.

Maps re-calculates the route often, which is why, after a time off its path, it still says "fastest way is behind you; turn around and go back". When you're just stopping for gas, it's a little annoying, but when you see that the highway department has blocked the suggested route, you wish it would just shut up and get with the program. I'd have to take more trips where I'm not the driver to really tell, but I would guess it re-runs shortest-path several times a minute. My guess is there are waypoints, places along the route between here and there, so Maps tries to find the shortest path between them, rather than rethinking every turn in your 600-mile journey, which makes this faster and more predictable.

Maps has a lot of data for each section of road, showing how many drivers use it and their average speed, as well as the posted speed limit. If the drivers are going slower than average and slower than the speed limit, that indicates there is a problem. In the Ohio case, the Department of Transportation must have reported that the reason was an accident, but to paraphrase Scream, causes are incidental. What's important is knowing where things get back to normal, and if cars aren't there, then Maps has no way of knowing where that is and what on-ramp is open. This is guesswork of the highest order, but in a week where my every movement was guided by Maps' algorithms, this was the only point of failure.

When I took that detour in Maryland, I was not alone; I counted at least eight vehicles taking that route along with me. Google offered me the choice, but this can't have been an option for every driver on a backed-up highway. I think there must have been a process of A-B testing, where some were rerouted and some were kept on the main road, and it used this information to decide where to send drivers later.

I don't often take these long drives, so it may be a year or two before I'm so fully in the hands of Google Maps and on such a dynamic journey, but I expect the experience to be even better.

2017/06/19

Feeling "Tipsy": The Perl Conference 2017 Day 1

A few lessons learned before I dive into the talks I attended and what I gained from them:

I had figured that aspects of my Every Day Carry would not be appreciated in a government building, so I left my multitool off my belt, but, even after I had put everything I could think of into the tray, I had still made the machines go beep, causing the guards to pull out the wand. Tomorrow, pack lighter.
I take notes on a paper notebook, but I used my phone and tablet to connect to others via IRC and Twitter, as well as keeping track of my calendar, and I ran through my batteries before the end of the talks. In part, it was switching between the local WiFi and the cellular network, doing well with neither, but I'm not convinced it'd wouldn't be drained regardless. At lunch, I need to find a place to charge. I often come with a power strip for just this purpose, too.
I didn't break out the laptop once. If I don't use it more, I should just leave it and have less to carry.

Hopefully, I will remember this and come prepared for the next one.

Now, to the talks:

I started the day with MetaCPAN, the Grand Tour. I had been convinced earlier that MetaCPAN is better than CPAN for looking into modules, but there's more, including modules I am going to have start using to search CPAN.

Graham Knop's Continuous Integration for CPAN followed. I had been aware of Travis-CI, and had become aware of AppVeyor recently, but the tooling available in Perl to work with these was less familiar to me. I was unaware that you can specify Linux or OSX in Travis, This was something I was thinking and asking questions of other developers about. I have issues on FreeBSD, which I'm told is something that Gitlab-CI can help me with, but somehow, I doubt I can connect Github to Gitlab, but I could be wrong.

Steven Lembark had much more with Dockerizing CPAN Testers: Running an Isolated Test Site than I could fit into my head, and I think I'll have to go back to the tape once it's available, but I think it's a useful addition to the world.

After lunch, I went to Joel Berger's Variables, Scoping and Namespaces, which he set as a talk for beginners. He went so far as to suggest more established developers go elsewhere. Since I never thought I learned all of Perl when I was learning, it was very much a lot of things I did already know, but a little "So that's why we do that", some more "Oooh. I forgot about that", and one weird trick that explains how to mock functions for tests.

(That, fundamentally, is my big item to work on as a developer. Testing.)

After this, I attended Matt S. Trout's ES6: Almost an acceptable Perl 5? and it gave me a sense of treating Javascript like Perl, but since I don't code Perl like Matt does, I probably won't code ES6 like Matt does. My notes peter out about halfway through, but they do include several gems, such as lodash, that might improve the code I do write.

Following this is Lightning Talks, which has a bunch of interesting points, going from "Write for OpenSource.com" to "Try learning Dart" to "Creating Test JSON from JSON Schemas" to "Using Instapaper, IFTTT and Buffer to tweet what you read" to the Alien modules, which I almost understand now. Or maybe not. Certainly, though, I'd be installing and trying Dave Rolsky's RTx-toGitHub right now if I wasn't so tired.

Finally, Sawyer X talked about Perl 5.26 and the changes that came and the changes that are coming. The thing that comes to mind first is that things that have been deprecated since 5.0 are finally being pulled. I understand the critics who think that removing . from @INC is insignificant, but I am still for it. I also like that Perl recognizes unhandled merges and dies now.

Tomorrow, I will be learning about Dancer, Test2 and more with @INC, and visiting with family after

2017/06/02

Tracking your Old Tabs in Chrome over Time

Starting "Has this ever happened to you?" is a very informercial way to start, but it's where my brain has left me: Working on your computer and suddenly, something happens. Blue screen. Kernel panic. Browser crash. Whatever. Restart things, open Chrome and click "Restore Tabs" and it ... does nothing.

Or does something, but not enough.

A user's list of open browser tabs and windows is a map of that user's interests. For me, there's usually 6-20 open windows with 3-12 tabs covering a number of topics that, while interesting to you, are not directly applicable right now.

And Chrome is of no help these days. As Neil Young said "It's all the same song", Google believes, between phones, tablets, laptops, desktops — heck, the talking-donkey called Google Home, for all I know — that it's all the same browser. Pages you saw recently, no matter what device, are at the top of chrome://history, and the tabs that you kept around for when you get back to that topic are way dropped out, and ironically, the tabs you looked at, rejected and closed are right at the top.

Last night, I killed lots of tabs to enhance browser stability. Today, looking for an image, I screwed up my machine, and so I rebooted. It's Linux and that's bad Linux admin work, but it wasn't a hill I wanted to die on, and when I got back on, the six windows and maybe 20 tabs total were gone. There were certain givens (Tweetdeck, Chrome) and certain trashables (work tabs for tasks completed the day before) but the "I was gonna get back to that" windows are gone, and the pages they held are deep deep deep in the communal history behind today's web comics and headline links.

I cannot get those pages and the plans I built on them back. But, I can start to get my house in order to keep it from happening again. And that starts with storing them.

This page shows the default location for Chrome user data. Lifehacker has a how-to on Restoring tabs, detailing what files hold what you need and how to force Chrome into Recovery Mode. the key things are in Current Session, Current Tabs, Last Session and Last Tabs. These files are SNSS format, which could be better (I've found a Github repo with an SNSS parser in Python, but haven't started working with it), but they respond to strings, so you have something to work on without the parser.

#!/usr/bin/env perl

use feature qw{ say state } ;
use strict ;
use warnings ;
use utf8 ;

use DateTime ;
use File::Copy ;
use File::Path qw{ make_path } ;
use IO::Interactive qw{ interactive } ;

# program to back up chrome tabs for easy restore should things go bad
# add to your crontab with something like
#       @hourly ~/bin/chrome_tab_backup.pl

# T
#

my $now = DateTime->now()
                  ->set_time_zone('America/Indiana/Indianapolis')
                  ->set( second => 0 ) ;

my $date = $now->ymd('/') ;
my $time = $now->hms('-') ;

my $source = join '/', $ENV{HOME}, qw{ .config chromium Default } ;
my $target = join '/', $ENV{HOME}, '.chrome_backup', $date, $time ;
say $source ;
say $target ;

if ( !-d $target ) {
    my @dirs = make_path($target) ;
    }

chdir $source ;
for my $f ( grep {m{(Tabs|Session)$}n} glob '*' ) {
    say $f ;
    copy( $f, $target ) or die qq{Copy Failed: $! } ;
    }

There's a bit of generalization in this. I would prefer to make it discern what TZ you're in without hard-coding it, and I could just use UTC and everyone would know to look for the most recent one.

The Lifehacker page points to a Windows-specific (or maybe Windows and Mac, I dunno) Local State file as where you set exited_cleanly to false in order to force recovery, while in Linux it's Preferences.

Where this leaves us is a bunch of files in the directory. I could figure out a way to parse the SNSS. I could create an HTML page showing that time's open tabs, which obviates the need to force Chrome to open the correct tabs. I could begin to start grouping them by window and time, saying "You've been wanting to get back to MQTT for some time; fish or cut bait".

Plus, this example is very Linux-centric, meant to service Chromium and run in crontab. Making it run on Win7 or Win10 and Strawberry Perl and be run in Task Manager is important, as really, my Windows machine is for browsing and testing.

If all I get is "Hey, give me my tabs back!", I'll be happy.

2017/05/22

Testing Perl Modules on Windows: A Cry For Help

I've been working on modules again, after a recent push, and I found a big project whose .travis.yml file only went to Perl 5.20. I thought I'd get some dev brownie points by adding the next two stable versions, and found that my build was dying in the Pull Request submission process.

Specifically, it was dying on the checks. It passes Travis-CI, which runs the tests on Unix-like systems, but it was failing with Appveyor.

"What is Appveyor?", I asked myself.

Perhaps this isn't a direct quote.

Appveyor is a continuous integration service that tests against Windows systems. Scott Hanselman wrote a glowing review of it three years ago.

But there's no .appveyor.yml file in the project, so it runs and fails.

I've mentioned this on the project IRC, and no, I'm not going to name names, because there is movement toward testing on Windows, and even if it doesn't work, I admire the goal.

I wrote this three years ago, in response to a Python conference video:

2) Sure, real programmers use Unix/Linux to run their code, real programmers, but beginner programmers don't come in knowing how to set up an environment and come in with the (Windows) computer they have, and the documentation sucks and they feel lost and they don't like it and they don't feel the power, and they're gone. Even if you dislike Microsoft, good tools for Windows and good documentation are important for new programmers, important for building community, important for inclusiveness.

I run Windows, but I program on and for Linux, and only have one project based on Windows. But I have installed ActiveState and Strawberry Perls, and think that if you write general-purpose modules, you should be sure to test against Windows as well as Linux.

But, Travis-CI has documentation covering Perl projects. Appveyor says you can use Perl 5.20. eserte wrote a post on Appveyor for blogs.perl.org last year, but I'd love to see better documentation from either them, the Perl community or both. Following is the YAML from eserte, with a switch to "check only master branch", but as with Travis, which uses perlbrew and allows testing as far back as 5.8.8, I think having it test against older versions of Perl, both ActivePerl and Strawberry, would be the thing.

branches:
  only:
    - master

skip_tags: true

cache:
  - C:\strawberry



install:

  - if not exist "C:\strawberry" cinst strawberryperl

  - set PATH=C:\strawberry\perl\bin;C:\strawberry\perl\site\bin;C:\strawberry\c\bin;%PATH%

  - cd C:\projects\%APPVEYOR_PROJECT_NAME%

  - cpanm --installdeps .



build_script:

  - perl Makefile.PL

  - dmake test

If you have a more fully-featured .appveyor.yml file you'd like the Perl community to use, especially distinguishing MakeMaker modules from Dist::Zilla modules, I'd love to see it.

2017/05/17

Contact Me! (If you REALLY need to)

A recent comment from Schlomi Fish said:

Hi! I cannot seem to find any contact information on this page. How should I contact you?

And then linked to a FAQ entry explaining his position on the state of email, comparing the futility of hiding addresses and the benefits of being open.

I have to say, I hadn't thought about this in ... years? In general, I'm active on Twitter (@jacobydave), which is good if you are, but not helpful if you aren't. I try to keep track of the comments, but that doesn't fit every message a person would want to send me.

So, a friendly "Hey, you should put your email on your blog" comment makes sense to me.

But, adding more traffic to the mailbox that friends and relatives have access to doesn't. I'm happy to put up an email address, but I'm less than happy to make it my main email address. It goes to context; my coworkers generally don't get that one either.

I had a long, barely touched by me but used enough by others project that was R syntax highlighting in Komodo Edit, which is dead because it's now native, but the ActiveState packaging used an email address to set the id, so, I created rlangsyntax@gmail.com.

So, to the right, in a section called "More Of Me", there is a requested mailto: pointing to rlangsyntax@gmail.com. I will check it. Use it in good health.

2017/05/08

Coffee and Code and Calendars and R

I have what you might call a conflicted relationship with caffeine.

Long story short: I found it necessary to cut down on caffeine, limiting myself to two cups a day, preferrably before noon, and, as sort of a measure of accountability, and as a way to gain more skill with SQL and R, I wrote tools that store when I drink coffee and, at the end of the work day, tweeting the image.

I forget exactly where I found the calendar heatmap code, but it was several years ago, and was one of the first instances of ggplot2 that I found and put into service. I chose a brown-to-black color scheme because, well, obviously it needed to look like coffee.

This image, with "Dave has had n cups of coffee today" is autotweeted every day at 6pm Eastern. Recently, it has drawn interest.

@hoelzro It's very cargo-cult on my end. I should blog or gist it.

Thank @hadleywickham , I just put data into it.
— 'Dev' in Creole (@JacobyDave) May 6, 2017

So here it is, both blogged and gisted.

I started doing that thing with YAML in Perl, to keep database keys out of programs. I'm reasonably okay with my SQL skills, I think, but I am clear that my R code is largely cargo-cult. It'd be good to replace 2,4,6 with the days of the week, and I am reasonably sure that it's reversed from the way you'd normally expect weekdays to go. Some day, I'll have the knowledge required to make those changes.

If you have questions and comments about this, I'd be glad to take them on, but I'm very much the learner when it comes to R.

count is not uniq...

I recall reading something online about 20 years ago (gasp!) where the authors were looking for a core set of knowledge that would constitute "knowing Unix", and found that there just wasn't. Knowing Unix was like the Humpty Dance, in that no two people do it the same.

And, presumably, you have Unix down when you appear to be in pain.

I have been a Unix/Linux user since the 1990s and I only found out about uniq -c because of the last post. I had been using sort | uniq forever, and have recently stopped in favor of sort -u, which I also learned about recently.

I find that uniq is kinda useless without a sort in front of it; if your input is "foo foo foo bar foo" (with requisite newlines, of course), uniq without sort will give you "foo bar foo" instead of "foo bar" or "bar foo", either of which is closer to what I want.

So, I could see adding alias count=" sort | uniq " to my bash setup, but adding a count program to my ~/bin seems much better to me, much closer to the Right Thing.

marc chantreux suggested an implementation of count that is perhaps better and certainly shorter than the one I posted. There was regex magic that I simply didn't need, because I wanted the count to stand by itself (but I might revisit to remove the awk step, because as a user, I'm still a bit awkward with it.)

B)

my %seen ;

map { $seen{$_}++ } do {
    @ARGV ? @ARGV : map { chomp ; $_ } <>;
    } ;

while ( my ( $k, $v ) = each %seen ) {
    say join "\t", $v, $k ;
    }

I like marc's use of the ternary operator to handle STDIN vs @ARGV, but I'm somewhat inconsistently against map over for. I know people who thing that map not going into an array is a problem, so I don't go back to it often.

I do, however, do for my $k ( keys %seen ) { ... } enough that I'm sort of mad at myself for not encountering each before.

ETA: It's been brought to my attention that using map {} as a replacement for for () {} is not good.

2017/05/05

One! One New Utility! Bwa-ha-hahaha!

Classic unix utilities give you a number of great tools, and you can use sed and awk and bash when those aren't enough.

But sometimes ...

I use ~ as a scratch space all too often, which leaves me with a huge amount of files that I stopped playing with a while ago. I can get to the point of knowing what types, sure, as I show here.

$ ls *.* | awk -F. '{ print $NF }' 
jpg
jpg
jpg
jpg
txt
txt
txt
pl
txt
txt
pl
txt
pl
pl
txt
html
pl
pl
gz
mp3
pl
pl
pl
pl
txt
pl
pl
sh
sh
txt
pl
pl
diff
txt
txt
txt
pl
txt
pl
txt
txt
txt
txt
py
...

But this only gets you so far. I can sort and know that there's a LOT of Perl files, perhaps too many, but nothing was immediate about telling me how many.

But hey, I am a programmer, so I wrote a solution.

And here it is in a shell, combined with sort in order to give me the numbers, which includes a lot of throwaway Perl programs.

$ ls *.* | awk -F. '{ print $NF }' | count | sort -nr

95 pl
59 txt
10 sh
10 jpg
8 py
6 html
6 csv
5 js
2 gz
2 diff
1 zip
1 ttf
1 tt
1 svg
1 sql
1 Rd
1 R
1 pub
1 png
1 pdf
1 mp4
1 mp3
1 log
1 json
1 foo
1 conf
1 cnf

I suppose I need to do some cleanup in $HOME today...

2017/03/14

Coding for Pi Day

Today is Pi Day, which is a good day to talk about Pi.

Normally, I'd probably use Pi, sine and cosine to draw things, but instead, I flashed on a couple ways to estimate Pi.

Also, showing you can use Unicode characters in Perl.

#!/usr/bin/env perl

use feature qw{ say } ;
use strict ;
use warnings ;
use utf8 ;

my $π = 3.14159 ;

my $est2  = estimate_2() ;
my $diff2 = sprintf '%.5f',abs $π - $est2 ;
say qq{Estimate 2: $est2 - off by $diff2} ;

my $est1  = estimate_1() ;
my $diff1 = sprintf '%.5f',abs $π - $est1 ;
say qq{Estimate 1: $est1 - off by $diff1} ;

exit ;

# concept here is that the area of a circle = π * rsquared
# if r == 1, area = π. If we just take the part of the circle
# where x and y are positive, that'll be π/4. So, take a random
# point between 0,0 and 1,1 see if the distance between it and 
# 0,0 is < 1. If so, we increment, and the count / the number
# so far is an estimate of π.

# because randomness, this will change each time you run it

sub estimate_1 {
    srand ;
    my $inside = 0.0 ;
    my $pi ;
    for my $i ( 1 .. 1_000_000 ) {
        my $x = rand ;
        my $y = rand ;
        $inside++ if $x * $x + $y * $y < 1.0 ;
        $pi = sprintf '%.5f', 4 * $inside / $i ;
        }
    return $pi ;
    }

# concept here is that π can be estimated by 4 ( 1 - 1/3 + 1/5 - 1/7 ...)
# so we get closer the further we go
sub estimate_2 {
    my $pi = 0;
    my $c  = 0;
    for my $i ( 0 .. 1_000_000 ) {
        my $j = 2 * $i + 1 ;
        if ( $i % 2 == 1 ) { $c -= 1 / $j ; }
        else               { $c += 1 / $j ; }
        $pi = sprintf '%.5f', 4 * $c ;
        }
    return $pi ;
    }

2017/02/28

Having Problems Munging Data in R

#!/group/bioinfo/apps/apps/R-3.1.2/bin/Rscript

# a blog post in code-and-comment form

# Between having some problems with our VMs and wanting 
# to learn Log::Log4perl. I wrote a program that took 
# the load average -- at first at the hour, via 
# crontab -- and stored the value. And, if the load 
# average was > 20, it would send me an alert

# It used to be a problem. It is no longer. Now I 
# just want to learn how to munge data in R

# read in file
logfile = read.table('~/.uptime.log')

# The logfile looks like this:
#
#   2017/01/01 00:02:01 genomics-test : 0.36 0.09 0.03
#   2017/01/01 00:02:02 genomics : 0.04 0.03 0.04
#   2017/01/01 00:02:02 genomics-db : 0.12 0.05 0.01
#   2017/01/01 00:02:04 genomics-apps : 1.87 1.24 0.79
#   2017/01/01 01:02:02 genomics-db : 0.24 0.14 0.05
#   2017/01/01 01:02:02 genomics-test : 0.53 0.14 0.04
#   2017/01/01 01:02:03 genomics : 0.13 0.09 0.08
#   2017/01/01 01:02:04 genomics-apps : 1.66 1.82 1.58
#   2017/01/01 02:02:01 genomics-test : 0.15 0.03 0.01
#   ...

# set column names
colnames(logfile)=c('date','time','host','colon','load','x','y')

# now:
#
#   date       time     host         colon load x y
#   2017/01/01 00:02:01 genomics-test : 0.36 0.09 0.03
#   2017/01/01 00:02:02 genomics : 0.04 0.03 0.04

logfile$datetime <- paste( as.character(logfile$date) , as.character(logfile$time) )
# datetime == 'YYYY/MM/DD HH:MM:SS'
logfile$datetime <- sub('......$','',logfile$datetime)
# datetime == 'YYYY/MM/DD HH'
logfile$datetime <- sub('/','',logfile$datetime)
# datetime == 'YYYYMM/DD HH'
logfile$datetime <- sub('/','',logfile$datetime)
# datetime == 'YYYYMMDD HH'
logfile$datetime <- sub(' ','',logfile$datetime)
# datetime == 'YYYYMMDDHH'

# for every datetime in logfile. I love clean data

# removes several columns we no longer need

logfile$time    <- NULL
logfile$date    <- NULL
logfile$colon   <- NULL
logfile$x       <- NULL
logfile$y       <- NULL

# logfile now looks like this:
#
#   datetime  host             load
#   2017010100 genomics-test    0.36 
#   2017010100 genomics         0.04 
#   2017010100 genomics-db      0.12 
#   2017010100 genomics-apps    1.87 
#   2017010101 genomics-db      0.24 
#   2017010101 genomics-test    0.53 
#   2017010101 genomics         0.13 
#   2017010101 genomics-apps    1.66 
#   2017010102 genomics-test    0.15 
#   ...

# and we can get the X and Y for a big huge replacement table
hosts <- unique(logfile$host[order(logfile$host)])
dates <- unique(logfile$datetime)

# because what we want is something closer to this
#
#   datetime        genomics    genomics-apps   genomics-db     genomics-test
#   2017010100      0.04        1.87            0.12            0.36
#   2017010101      0.13        1.66            0.15            0.53
#   ...

# let's try to put it into a dataframe

uptime.data <- data.frame()
uptime.data$datetime <- vector() ;
for ( h in hosts ) {
    uptime.data[h] <- vector()
    } 

# and here, we have a data frame that looks like 
#
#   datetime        genomics    genomics-apps   genomics-db     genomics-test
#
# as I understand it, you can only append to a data frame by merging.
# I need to create a data frame that looks like
#
#   datetime        genomics    genomics-apps   genomics-db     genomics-test
#   2017010100      0.04        1.87            0.12            0.36
#
# and then merge that. Then do the same with 
#
#   datetime        genomics    genomics-apps   genomics-db     genomics-test
#   2017010101      0.13        1.66            0.15            0.53
#
# and so on.
#
# I don't know how to do that. 
#
# I *think* the way is make a one-vector data frame:
#
#   datetime        
#   2017010101      
#
# and add the vectors one at a time.

for ( d in dates ) {

    # we don't and the whole log here. we just want 
    # this hour's data
    # 
    #   datetime  host             load
    #   2017010100 genomics-test    0.36 
    #   2017010100 genomics         0.04 
    #   2017010100 genomics-db      0.12 
    #   2017010100 genomics-apps    1.87 
    log <- subset(logfile, datetime==d)

    print(d)

    for ( h in hosts ) {
        # and we can narrow it down further
        # 
        #   datetime  host             load
        #   2017010100 genomics         0.04 
        hostv <- subset(log,host==h)
        load = hostv$load 
        # problem is, due to fun LDAP issues, sometimes 
        # the logging doesn't happen
        if ( 0 == length(load) ) { load <- -1 }
        print(paste(h, load ))
    }

    # and here's where I'm hung. I can get all the pieces 
    # I want, even -1 for missing values, but I can't seem  
    # to put it together into a one-row data frame
    # to append to uptime.data. 

    #   [1] "2017010100"
    #   [1] "genomics 0.04"
    #   [1] "genomics-apps 1.87"
    #   [1] "genomics-db 0.12"
    #   [1] "genomics-test 0.36"
    #   [1] "2017010101"
    #   [1] "genomics 0.13"
    #   [1] "genomics-apps 1.66"
    #   [1] "genomics-db 0.24"
    #   [1] "genomics-test 0.53"
    #   [1] "2017010102"
    #   [1] "genomics 0.36"
    #   [1] "genomics-apps 0.71"
    #   [1] "genomics-db 0.08"
    #   [1] "genomics-test 0.15"

}

2017/01/20

Ding! Ding! The Process is Dead!

Starts with a thing I saw on David Walsh's Blog:

I've been working with beefy virtual machines, docker containers, and build processes lately. Believe it or not, working on projects aimed at making Mozilla developers more productive can mean executing code that can take anywhere from a minute to an hour, which in itself can hit how productive I can be. For the longer tasks, I often get away from my desk, make a cup of coffee, and check in to see how the rest of the Walsh clan is doing.

When I walk away, however, it would be nice to know when the task is done, so I can jet back to my desk and get back to work. My awesome Mozilla colleague Byron "glob" Jones recently showed me his script for task completion notification and I forced him to put it up on GitHub so you all can get it too; it's called ding!

OK, that sounds cool. So I go to Github and I see one line that gives me pause.

Requires ding.mp3 and error.mp3 in same directory as script. OSX only.

I can handle the image thing, but I don't own or run an OSX computer. (I have one somewhere, but it's ancient and has no functioning battery. I don't use it.)

"So," I think, "how could I do this on my Linux box? What's the shortest path toward functionality on this concept?"

Well, recently, I have been playing with Text-to-Speech. Actually, I have been a long-time user of TTS, using festival then espeak to tell me the current time and temperature on the hour and half-hour. I switched to Amazon's Polly in December, deciding that the service sounded much better than the on-my-computer choices. (Hear for yourself.) So, I knew how to handle the audio aspects.

The other part required me to get much more familiar with Perl's system function than I had been previously.

I'm not yet 100% happy with this code, but I'm reasonably okay with it so far. Certainly the concept has been proven. (I use the audio files from globau's ding.) With enough interest, I will switch it from being a GitHub gist to being a repo.

Cookie Notice