Archive

Posts Tagged ‘perl’

Dictionary, word pairs code challenge

March 7th, 2010 dwright No comments

A few months ago, I came across a job posting with a code challenge to solve and submit with your resume.

I'm not in the job market but I generally like these sort of challenges and since the job is no longer listed, I will post what I came up with.

The Rules:
Given a dictionary, output all word pairs where both words
share all their letters except the last two, which are
distinct and reversed.

Notes:
- use a reasonable dictionary of your choice
- potential word pairs are: "ear, era" ; "revies, revise" ; "burglaries, burglarise"
- "shear, era" is not a valid pair because "she" != "e"
- "bell, bell" is not a valid pair because the last two letters are not distinct
- this will be benchmarked

Here is the final result I came up with, I don't have anything to compare it with, but I believe it's correct. (filter.pl)

# filter.pl
#to run:
#cat /usr/share/dict/words | perl filter.pl

use strict;
use warnings;
my $save_possible = {};

while (<>) {
   chomp $_;

   # lower case
   $_ = lc $_;

   # skip words smaller then 3 chars
   next unless length $_ >= 3;

   my $word_minus_2 = substr($_, 0, -2);
   my $last_2_chars_from_word = substr($_, length($_)-2, length($_));

   $save_possible->{$_} = $_;

   my $tmp = $_;
   # the last two chars of word, reversed
   my $last_2_chars_from_word_rev = chop($tmp);
   $last_2_chars_from_word_rev .= chop($tmp);

   if ($save_possible->{$word_minus_2.$last_2_chars_from_word_rev} #exists
       and ($_ ne $word_minus_2.$last_2_chars_from_word_rev)){#skip same word
      my $w2_rev=$save_possible->{$word_minus_2.$last_2_chars_from_word_rev};
      print "$_ -> $w2_rev\n";
     $save_possible->{$_} = undef; # skip dups
   }
}
real	0m0.992s
user	0m0.897s
sys	0m0.047s

Results verified with:

$ wc -l  /usr/share/dict/words
234936
$ time cat /usr/share/dict/words | perl filter.pl | sed -e 's/ ->.*//g' > OUT
$ wc -l OUT
565 OUT
$ grep -ciwf OUT  /usr/share/dict/words
565
=~ .2% of the words in this dict fit our criteria




Ok, now that I've posted that I will illustrate how the wrong assumptions can lead you down the wrong path, here was my original approach:

My assumption was that the cases would be consecutive.
i.e. (would exist lexicographically, in succession)
aab
aba

abel
able

ala
aal
...

So, my (as I thought, 'slick') idea was to use the Perl's native sort function, providing my own sorting routine, that should be very fast and I don't have to do much, great!

Here is my first draft in Perl: (yes, it's wrong - and ugly!)

#!/usr/bin/perl -w

#on my mac os x system one run's this as:
#$ cat /usr/share/dict/words | perl filter.pl

use strict;
use warnings;

my %possible = ();

sub _same {
   chomp $a;
   chomp $b;
   return -1 unless length $a > 1 && length $b > 2;

   my $last_two_a   = substr($a, length($a)-2, length($a));
   my $last_two_b_r = substr($b, length($b)-1, length($b));
   $last_two_b_r .= substr($b, length($b)-2, 1);

   if (((substr($a, 0, length($a)-2)
        cmp
         substr($b, 0, length($b)-2)) == 0)
        &&
       (($last_two_a cmp $last_two_b_r) == 0)){

         $possible{lc $a} = lc $b;
         return 0;
   }

   # we dont care about the returned sort, it's a byproduct
   return 1;
}

my @len_sort = sort _same <>;

foreach my $w (keys %possible){
  print "$w => $possible{$w}\n";
}
real    0m0.715s
user    0m0.660s
sys     0m0.053s

Great, results! and great performance, Well, that settles it, it must be working (haha).

One night a few weeks later, this popped into my head again (especially, since I had a feeling my approach was not accurate), so I revisited it.

Well, of course my 'ass'-umption was incredibly WRONG!

This approach does find some cases, I never bothered to really verify it, the quick eyeball said, 'geez, I would think there would be more, but ok,...'. (Ok, this is what I get for working past midnight all week)

After other attempts of trying to utilize Perl's native sort function. (I'll save you the from the example code, since at this point it's messy and complex, which alone is a red flag, since this should be simple!)

Basically by this point, it became clear that the sort method is not providing any benefit, is the wrong approach and is helping me to obfuscating the code in this specific situation.

That is when I came up with the approach I posted at the beginning. (I avoided doing this the first time because I thought, "I just need to compare consecutive words and how would I save the previous word, to compare with the next via an input stream" (i.e. without using global storage, saving all of them; and since I assumed they were consecutive, I would just provide a method to sort, which provides much for free. lazy! in a bad way). Had I realised, of course, that they are not consecutive I would have abandoned the sort method right off the bat.

Categories: perl Tags:

CentOS 5.4 (Final) Weak references are not implemented in the version of perl

November 12th, 2009 dwright No comments

I just ran a sudo yum update on my development CentOS 5.3 box. (upgrade to 5.4)

All went well, except two issues.

1. I had to rollback perl-Apache-DBI.noarch 1.07-1.el5.rf
(luckily, I knew about this one, and had an old package available, and there are no deps issues - whew)

The issue manifests itself as:
[Tue Nov 10 10:43:02 2009] [error] Global $r object is not available. Set:\n\tPerlOptions +GlobalRequest\nin httpd.conf at /usr/lib/perl5/vendor_perl/5.8.8/Apache/DBI.pm line 144.\nCompilation failed in require at (eval 2) line 1.\n
[Tue Nov 10 10:43:02 2009] [error] Can't load Perl file: perl_load.pl for server , exiting...

To rollback:
$ sudo yum remove perl-Apache-DBI
$ sudo rpm -ivh perl-Apache-DBI-1.06-1.el5.rf.noarch.rpm

2. synopsis: "Weak references are not implemented in the version of perl "

The issue manifests itself as:
Weak references are not implemented in the version of perl at /usr/lib/perl5/vendor_perl/5.8.8/SOAP/Lite.pm line 2447\nBEGIN failed--compilation aborted at /usr/lib/perl5/vendor_perl/5.8.8/SOAP/Lite.pm line 2447.\nCompilation failed in require,...
...
...
[Wed Nov 11 16:50:01 2009] [error] Can't load Perl file: perl_load.pl for server, exiting...

which seemed me to (incorrectly) initially look in SOAP::Lite. (it's a Scalar::Util warning) Luckily I had another box I had just updated and knew that the updated did in fact work.

here is a command line test, to check with:
% perl -e"use Scalar::Util qw( weaken );"
%

should return nothing, on my box it returned:
% perl -e"use Scalar::Util qw( weaken );"
Weak references are not implemented in the version of perl at -e line 1
BEGIN failed--compilation aborted at -e line 1.

I couldn't find Scalar::Util installed!?
$ yum list|grep perl |grep -i scalar
perl-Scalar-Defer.noarch 0.20-1.el5.rf rpmforge
perl-Scalar-List-Utils.x86_64 1.21-1.el5.rf rpmforge
perl-Scalar-Properties.noarch 0.13-1.el5.rf rpmforge
perl-Set-Scalar.noarch 1.24-1.el5.rf rpmforge
perl-Tie-Scalar-Timeout.noarch 1.33-1.el5.rf rpmforge

After some searching, it appears that it's part or perl core now.

Eventually, I stumbled onto:
% locate Util.pm
/usr/lib/perl5/5.8.8/CGI/Util.pm
/usr/lib/perl5/5.8.8/Hash/Util.pm
/usr/lib/perl5/5.8.8/List/Util.pm
/usr/lib/perl5/5.8.8/Scalar/Util.pm
/usr/lib/perl5/vendor_perl/5.8.8/CGI/Simple/Util.pm
/usr/lib/perl5/vendor_perl/5.8.8/Crypt/DSA/Util.pm
/usr/lib/perl5/vendor_perl/5.8.8/HTTP/Headers/Util.pm
/usr/lib/perl5/vendor_perl/5.8.8/Mail/Util.pm
/usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/Util.pm
/usr/lib/perl5/vendor_perl/5.8.8/Net/SFTP/Util.pm
/usr/lib/perl5/vendor_perl/5.8.8/Net/SSH/Perl/Util.pm
/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/List/Util.pm
/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/Scalar/Util.pm
/usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi/APR/Util.pm
/usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi/Apache/TestUtil.pm
/usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi/Apache2/ConnectionUtil.pm
/usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi/Apache2/RequestUtil.pm
/usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi/Apache2/ServerUtil.pm
/usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi/Apache2/Util.pm
/usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi/ModPerl/MapUtil.pm
/usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi/ModPerl/Util.pm
/usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi/Params/Util.pm

The only difference in the results of this and the box that works,
/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/List/Util.pm
/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/Scalar/Util.pm
are not in the list, of the working box.

(I have no idea how those got there, or where they are from, except for what the obviously look like, rhel5/centos packages,...)

perl INC is loading those libs first too.

$ perl -e 'print join "\n", @INC'
/usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.8
/usr/lib/perl5/site_perl
/usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.8
/usr/lib/perl5/vendor_perl
/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi
/usr/lib/perl5/5.8.8

so to 'resolve' the issue, I did this.

% sudo mv /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/List/Util.pm /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/List/Util.pm-1.19-broken-with-5.8.8-27
% sudo mv /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/Scalar/Util.pm /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/Scalar/Util.pm-1.19-broken-with-5.8.8-27

This worked for me, although is probably not the 'correct way'. I'm still not sure exactly what happened to get into this state, or I'd file a bug somewhere.

ref: (1)http://www.gossamer-threads.com/lists/modperl/modperl/98113

(1)http://rt.cpan.org/Public/Bug/Display.html?id=36346

(2) http://www.perlmonks.org/?node_id=424737 A different approach to the situation. (in general, we try to do everything through yum/rpm packages)
(2)http://bugs.centos.org/view.php?id=3022 an old bug, which made me realize I had to figure it out :)
(2) http://www.bluequartz.us/phpBB2/viewtopic.php?t=106577&sid=17481305416e016b01a061acb4ac3f45

Load quote content into ACME::QuoteDB

October 1st, 2009 dwright No comments

This is a example for how to load quotes content into the perl module, ACME::QuoteDB

(NOTE: use 0.1.2 or newer: http://search.cpan.org/~dvwright/ACME-QuoteDB-0.1.2/lib/ACME/QuoteDB.pm)
(There is a debian (squeeze/sid) package here: http://www.dwright.us/misc/libacme-quotedb-perl-0.1.2.tar.gz)

For this example we will be loading some quotes via crawling a website, Alan J. Perlis (no pun intended) quotes, available at http://www.cs.yale.edu/quotes.html

file: load_quotes.pl

#!perl

# subclass ACME::QuoteDB::LoadDB and override dbload to do our html parsing
package LoadQuoteDBFromHTML;
use base 'ACME::QuoteDB::LoadDB';
use strict;
use warnings;
use Carp qw/croak/;
use Data::Dumper qw/Dumper/;
use HTML::TokeParser;

sub dbload {
  my ($self, $file) = @_;

  my $p = HTML::TokeParser->new($file) || croak $!;

  while (my $token = $p->get_tag("p")) {
      my $data = $p->get_trimmed_text("p");
      # see $self->set_record in ACME::QuoteDB::LoadDB for attribute fields
      # to populate
      if ($data){
         # axe our beginning line nums: '110. "quote content",,..';
         $data =~ s/\A\d+\.\s+//xms;

         next if $data =~ m/EPIGRAMS IN PROGRAMMING/;
         next if $data =~ m/\AFrom ACM's SIGPLAN publication/;

         $self->set_record(quote => $data);
         $self->set_record(name => q{Alan J. Perlis});

         #$self->debug_record;
         $self->write_record;
      }
  }
}

package main;
use strict;
use warnings;

use ACME::QuoteDB::LoadDB;
use ACME::QuoteDB;
use Carp qw/croak/;
use LWP::Simple;

#my $source = q{Alan J. Perlis - "Epigrams in Programming"};
my $source = q{From ACM's SIGPLAN publication, (September, 1982), };
$source .= q{Article "Epigrams in Programming", };
$source .= q{ by Alan J. Perlis of Yale University};
my $file = '/tmp/q.html';
if (! -e $file){
  if (!is_success(getstore('http://www.cs.yale.edu/quotes.html', $file))){
    croak 'could not retrieve html page';
  }
}

my $load_db = LoadQuoteDBFromHTML->new({
                            file => $file,
                            file_format => 'html',
                            # overwrite all data, this is our first addition
                            create_db   => 1, # first run, create the db
                            # provide a category for all
                            category => [qw(Epigrams Programming)],
                            attr_source => $source,
                            # provide a general quality rating for all quotes
                            rating => 8,
                            # dont actually do anything,
                            #dry_run => 1,
                            #verbose => 1,
                        });

$load_db->data_to_db;

if (!$load_db->success){croak 'Something went wrong'}

my $sq = ACME::QuoteDB->new;
print $sq->get_quote();
#print join "\n", @{$sq->get_quotes({AttrName => 'Perlis', Limit => 3})};
print "\n\n";

if (120 ==  scalar @{$sq->get_quotes({AttrName => 'Perlis'})}){
    print "load complete\n";
}
else {
    print "unexpected amount of quotes loaded\n";
}

Which produces:
[dwright@debian perl]$ perl load_quotes.pl
If a listener nods his head when you're explaining your program, wake him up.
-- Alan J. Perlis

load complete
[dwright@debian perl]$ perl -MACME::QuoteDB -le 'print quote()'
There will always be things we wish to say in our programs that in all known languages can only be said poorly.
-- Alan J. Perlis

And of course my favorite Perlis quote:

#!perl

use strict;
use warnings;

use ACME::QuoteDB;

my $sq = ACME::QuoteDB->new;

print @{$sq->get_quotes_contain({ Contain =>  'parameters', })};
print "\n";

which produces:
If you have a procedure with ten parameters, you probably missed some.
-- Alan J. Perlis

Categories: perl Tags: , , , ,

debian .deb package for perl module criticism

September 14th, 2009 dwright No comments

I'm a big fan of The Perl Best Practices line of thinking stared by Damian Conway.

I am also a fan of the Perl::Critic school of modules available on the CPAN.

So in the tradition of contributing to open source, here is a debian/ubuntu package I just built for criticism

The one non-standard task I did was to build it without the Perl::Critic dependancy, as the docs state. See the bug I filed for details.

It's unsigned signed, and built on debian squeeze/sid, it contains some dependancies that were not covered by the perl modules versions in lenny.

there are a few lintian warnings. I should clean them up, and submit it to debian perl group proper. no lintian warnings

9/29/09: I filed an ITP bug http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=548991

Ruby on rails seemless integration with legacy perl application

August 27th, 2009 dwright No comments

Comment:

This kind of set up should also work with an php, cgi (or any language, running on apache) site.

Overview:

I am (slowly) migrating a (large) legacy perl/cgi application to ruby on rails, the base requirement is that the (legacy) site continues to behave 'as expected' to the end user.

So, I need to run these two applications simultaneously from the same server and have it look like one application to clients. (i.e. transparent integration)

Existing links/sessions need to continue to work seamlessly.  I decided to go the route of keeping the legacy app functioning as before, so for instance http://example.com/admin/admin.pl will still work as expected. All new development (which is in rails) will be located at http://example.com/admin2/

The first thing I needed to do was handle all session/cookies info with rails (I will do a separate post on that another time, as it is involved, although since we use sessions via a database it wasn't that bad)

Setup:
We are using apache2, enterprise ruby and phusion passenger.

Conf:
NOTE: we will use example.com for our http host for demonstration purposes


Ruby on Rails vhost conf file:

# we are now handling *ALL* http requests with mod_rails
<VirtualHost *:80>

# set server name/admin
ServerAdmin you@example.com
ServerName example.com

# be sure to point to 'public'
DocumentRoot "/usr/local/httpd/
admin2/public"
ScriptAlias /admin2/ "/usr/local/httpd/admin2/public/"

RewriteEngine on
RewriteLog "/var/log/httpd/rewrite.log"
RewriteLogLevel 5

RewriteRule ^/$ /admin2/session [R,L]
RewriteRule ^/index.html$ /
admin2/session [R,L]

# pass all images to legacy perl/cgi
RewriteRule ^(.*)(\.)(gif|jpg|png)$ http://example.com:8080$1$2$3 [proxy]

# pass/reroute legacy css/js requests
RewriteRule ^/css/(.*) /stylesheets/$1 [R,L]
RewriteRule ^/js/(.*) /javascripts/$1 [R,L]

### send any /admin/ page request to perl/cgi on port 8080
# /admin/ requests have not yet been migrated to rails
#http://httpd.apache.org/docs/2.0/mod/mod_proxy.html#proxyrequests
ProxyRequests Off
ProxyPreserveHost On
#15 minutes
ProxyTimeout 900

<Proxy *>
Order deny,allow
Allow from all
<Proxy>

# all requests for /admin go to legacy app
ProxyPass /admin/ http://example.com:8080/admin/
#('hides' the port)
ProxyPassReverse /admin/ http://example.com:8080/admin/

SetEnv force-proxy-request-1.0 1
SetEnv proxy-nokeepalive 1
</VirtualHost>


If you have some ajax requests that take 'awhile', you may need to add the:
ProxyTimeout 900
SetEnv force-proxy-request-1.0 1
SetEnv proxy-nokeepalive 1

On some (long running) monthly report type queries we were getting:
Bad Gateway
The proxy server received an invalid response from an upstream server.

and this in apache logs:
[Thu Jun 11 10:58:08 2009] [error] (70014) End of file found: proxy: error reading status line from remote server


legacy perl/cgi vhost file:
# this would be the vhost file for the http://example.com:8080 host above
# we are now running perl/cgi on port 8080, mod_rails is handling
# all http requests
Listen 8080

<VirtualHost *:8080>
DocumentRoot /usr/local/httpd/htdocs

ServerName example.com
AddHandler cgi-script .pl

ScriptAlias /admin/ "/usr/local/httpd/htdocs/"
Alias /images/ "/usr/local/httpd/htdocs/images/"

RewriteEngine on

### handle with rails (/admin2/)
### 'any session page requests'
RewriteRule ^/$ http://%{HTTP_HOST}/admin2/session [L,R]
RewriteRule ^/index.html$ http://%{HTTP_HOST}/
admin2/session [L,R]
RewriteRule ^/admin/session.pl$ http://%{HTTP_HOST}/
admin2/session [L,R]

#### apps we have migrated to be handled with rails need a rule here
# passwd.pl is first
RewriteRule ^/admin/passwd.pl$ http://%{HTTP_HOST}/
admin2/passwd [L,R]
</VirtualHost>


To rollback (or for testing. to disable the rails site) we can just comment out the rails vhost file and update the legacy vhost file to:
#Listen 8080
#<VirtualHost *:80>
<VirtualHost *:80>

ref:
Integrating Ruby on Rails with existing PHP or Perl Based Sites
http://work.rowanhick.com/2008/04/10/rails-php-sharing-the-same-session/

perl command line fu (awk-ish)

August 6th, 2009 dwright No comments

Here's a neat trick I didn't know about to have perl behave in a way that is very 'awk-ish', nice addition  to boost your command line perl-fu:

$> cat file.txt
Name Age Number
xyz 22 123456
abc 21 234567

$> perl -F' ' -lane 'print $F[1]' file.txt
Age
22
21

$> perl -F' ' -lane 'print $F[0]' file.txt
Name
xyz
abc

$> perl -F' ' -lane 'print $F[2]' file.txt
Number
123456
234567

(The awk equivalent is awk '{print $2'} file.txt)

Categories: perl Tags: , ,

perl global search replace many files

July 29th, 2009 dwright No comments

Issue:
Your ssh'd into a remote *NIX box and you have many files (potentially, in many directories) that need a 'minor' text change. (i.e. a simple search/replace)

For example, you want to change an variable name, 'MyVariable' to 'OurVariable'

here is a quick and dirty (careful, very dirty, there is no 'undo' on this, test first! - you have been warned)
$ find . -type f -exec perl -pi -e 's/ReplaceThisString/WithThisString/g' {} \;

this example finds any file under the current dir (and recursivly, through every dir under this path) and does an 'in place edit' of the file replacing 'ReplaceThisString', 'WithThisString'

For instance, recently I had to add binmode(\*STDOUT, ":utf8"); to a bunch of old perl/cgi scripts. I did so like such:
(note necessary escaping; and added comment)
perl -pi -le 's/charset\(\"utf-8\"\);/charset\(\"utf-8\"\);\nbinmode\(\\*STDOUT, ":utf8"\);\# re: 2105 \(db utf8\)/g' *.cgi

The above reads, find: charset("utf-8"); and replace it with: charset("utf-8");(i.e. itself (we need a pattern to find and this appears at the top of the scripts) and)
[Newline]binmode(\*STDOUT, ":utf8");

this combines the find comand (you can use all the functionalty of find) with the power of perl.

enjoy

Categories: perl Tags: , ,

utf8 perl, ruby command line

July 8th, 2009 dwright No comments

String: Iñtërnâtiônàlizætiøn
- 20 characters
- 7 multi byte chars
- ñëâôàæø all have 2 bytes each
- string length is 27 bytes

% perl  -le 'print length "Iñtërnâtiônàlizætiøn"'
27

% ruby -e 'puts "Iñtërnâtiônàlizætiøn".length'
27

% ruby -le  'puts [73,195,177,116,195,171,114,110,195,162,116,105,195,180,110,195,160,108,105,122,195,166,116,105,195,184,110].pack("c*")'
Iñtërnâtiônàlizætiøn

% ruby -le  'puts "I" << [195,177].pack("cc") << "t" << "\303\253"   << "rn" << [195,162].pack("cc") << "ti" << [195,180].pack("cc") << "n" << [195,160].pack("cc") << "liz" << [195,166].pack("cc") << "ti" << [195, 184].pack("cc") << "n" '
Iñtërnâtiônàlizætiøn

% ruby -le  'puts "I\303\261t\303\253rn\303\242ti\303\264n\303\240liz\303\246ti\303\270n"'
Iñtërnâtiônàlizætiøn

% perl  -le 'print length "ñëâôàæø"'
14

% perl  -le 'print  "\xc3\xa6"'
æ
% ruby  -le 'print  "\xc3\xa6"'
æ
% perl  -le 'print  unpack 'U', "\xc3\xa6"'
195
% perl  -le 'print  unpack 'U', "\xa6"'
166
% perl  -le 'print unpack 'UU', "æ"'
195166
% perl  -le 'print unpack 'UUU', "æ"'
195166

% perl  -C2 -le 'print pack "U", 0xc3a6'

% perl  -le 'print pack("UU", 0xc3, 0xa6)'
æ
% perl  -le 'print pack("UU", 195, 166)'
æ

% perl  -le 'print unpack("UU", "ñ")'
195177
% perl  -le 'print pack("UU", 195,177)'
ñ
%  ruby  -le 'print [195,177].pack("cc")'
ñ

% perl -le 'print unpack("U*", "ë")'
195171

% perl -C2  -le 'print "\x{8482}\x{8480}\x{8481}\x{8483}\x{8484}"'
蒂蒀蒁蒃蒄
% ruby -e 'puts [0x8482,0x8480,0x8481,0x8483,0x8484].pack("U*")'
蒂蒀蒁蒃蒄
% perl -C2 -le 'print pack("U*", 0x8482,0x8480,0x8481,0x8483,0x8484)'
蒂蒀蒁蒃蒄

dwright@dwright-OSXL:[1139]:watir% ruby -v
ruby 1.8.7 (2009-04-08 patchlevel 160) [i686-darwin9]
dwright@dwright-OSXL:[1140]:watir% perl -v
This is perl, v5.8.9 built for darwin-2level

Categories: perl, ruby Tags: , , ,

Recreate sql’s ORDER BY (multi column) with perl’s sort

February 22nd, 2009 dwright 1 comment

What:
Recreate sql's ORDER BY (multi column) with perl's sort

Issue:
you have 2 sql statements that utilize a multi column order by statement (they return the same type and number of columns of information, for some reason they cannot be combined into one sql statement) but you need to display them as one (properly ordered) table client side

Data:
2 array refs ordered by columns 'date', 'name', 'id' , in that order.

'name', 'date' ("%mm/%dd/%YYYY" ), 'id'

How:
How do you combine them to one properly ordered list?

Example:

#!/usr/bin/perl

use strict;
use warnings;

use Data::Dumper qw/Dumper/;

# these lists come from a sql statement are are in the order desired but need
# to be combined togther

# one array ref with many records (as array refs)
my $data1 = [
        [qw{Eels   01/11/2009  10955252 field4 field5}],
        [qw{Mapple 01/12/2009  10957336 field4 field5}],
        [qw{Ack    01/12/2009  20005374 field4 field5}],
        [qw{Ack    02/18/2009  20005371 field4 field5}],
];
# access (for the uninitiated)
#warn @{$a->[3]};            # Mapple01/12/200910957336field4field5
#warn join "\t", @{$a->[3]}; # Mapple  01/12/2009  10957336    field4  field5
#warn $a->[3][0];            # Mapple 

# same data format as above, but store in array instead of array ref
# i.e. one array with many records (as array refs)
my @data2 = (
        [qw{Ack    02/19/2008  20005373 field4 field5}],
        [qw{Eels   01/12/2009  10955252 field4 field5}],
        [qw{Mapple 01/12/2009  10985507 field4 field5}],
        [qw{Ack    01/12/2009  20005374 field4 field5}],
        [qw{Zoys   02/09/2009  20004772 field4 field5}],
        [qw{Ack    02/18/2009  20005372 field4 field5}],
);
# access (for the uninitiated)
#warn @{$b[2]};             # Mapple01/12/200910985507field4field5
#warn join "\t", @{$b[2]};  # Mapple  01/12/2009  10985507    field4  field5
#warn $b[2]->[0];           # Mapple
# also
#warn @{[$b[2]->[0]]};      # Mapple 

# combine both lists to one
my @c = (@{$data1},@{[@data2]});
#warn Dumper \@c;

# to confirm
#foreach my $i ( @c ) {
#   warn "$i->[0], $i->[1], $i->[2]\n";
#}
# step 1, now  they are combined: (but not in the correct order)
#Eels, 01/11/2009, 10955252
#Mapple, 01/12/2009, 10957336
#Ack, 01/12/2009, 20005374
#Ack, 02/18/2009, 20005371
#Ack, 02/19/2008, 20005373
#Eels, 01/12/2009, 10955252
#Mapple, 01/12/2009, 10985507
#Ack, 01/12/2009, 20005374
#Zoys, 02/09/2009, 20004772
#Ack, 02/18/2009, 20005372

# try to sort by date
#foreach my $i ( sort { $a->[1] cmp $b->[1] } @c ) {
#   warn "$i->[0], $i->[1], $i->[2]\n";
#}
# hmmm, not what we expect, the names' should be grouped
#Eels, 01/11/2009, 10955252
#Mapple, 01/12/2009, 10957336
#Ack, 01/12/2009, 20005374
#Eels, 01/12/2009, 10955252
#Mapple, 01/12/2009, 10985507
#Ack, 01/12/2009, 20005374
#Zoys, 02/09/2009, 20004772
#Ack, 02/18/2009, 20005371
#Ack, 02/18/2009, 20005372
#Ack, 02/19/2008, 20005373

# ok, let's add some other columns to the sort
# this should sort by the date, name then id field respectivly
#foreach my $i (
#               sort {
#                       $a->[1] cmp $b->[1]
#                                ||
#                       $a->[0] cmp $b->[0]
#                                ||
#                       $a->[2] <=> $b->[2]
#                    } @c
#              ) {
#   warn "$i->[0], $i->[1], $i->[2]\n";
#}
# ok, looks better but still not correct, hmm, why is 2008 first?
#Eels, 01/11/2009, 10955252
#Ack, 01/12/2009, 20005374
#Ack, 01/12/2009, 20005374
#Eels, 01/12/2009, 10955252
#Mapple, 01/12/2009, 10957336
#Mapple, 01/12/2009, 10985507
#Zoys, 02/09/2009, 20004772
#Ack, 02/18/2009, 20005371
#Ack, 02/18/2009, 20005372
#Ack, 02/19/2008, 20005373

# the solution I use is hackish and probably not recommended but seems to work
# in my cases - it appears sort is sorting by the first 2 digits only,
# I am looking for year, month, date, so let's tell sort that,..
# let's split on the '/' and join to a string by year, month, day
#join('', (split '/', $_ 3)[2,0,1]) 

foreach my $i (
               sort {
                      join('', (split '/', $a->[1], 3)[2,0,1])
                           cmp
                      join('', (split '/', $b->[1], 3)[2,0,1])
                                ||
                       $a->[0] cmp $b->[0]
                                ||
                       $a->[2] <=> $b->[2]
                    } @c
              ) {
   warn "$i->[0], $i->[1], $i->[2]\n";
}
# this is what we were looking for
#Ack, 02/19/2008, 20005373
#Eels, 01/11/2009, 10955252
#Ack, 01/12/2009, 20005374
#Ack, 01/12/2009, 20005374
#Eels, 01/12/2009, 10955252
#Mapple, 01/12/2009, 10957336
#Mapple, 01/12/2009, 10985507
#Zoys, 02/09/2009, 20004772
#Ack, 02/18/2009, 20005371
#Ack, 02/18/2009, 20005372

# references:
# http://www.sysarch.com/Perl/sort_paper.html A Fresh Look at Efficient Perl Sorting
# perldoc perldsc -> perl data structures cookbook
# perldoc -f sort -> sort documentation

Categories: perl Tags: , , ,