Playing with a Perl proxy. Part 1: Kitties everywhere

I’ve recently spent some time playing with the Perl module HTTP::Proxy, which lets you create a proxy in a few lines of code. One interesting feature is that it makes it possible to modify content on the fly.

As the documentation of HTTP::Proxy::BodyFilter::simple states, we can do something like:

my $filter = HTTP::Proxy::BodyFilter::simple->new(
    sub { ${ $_[1] } =~ s/foo/bar/g; }
);

and that will replace every ‘foo’ string with ‘bar’.
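
To see what such a callback does to the data, here is a minimal sketch that calls it by hand, outside any proxy. It assumes only what the snippet above shows: the second argument is a reference to the body data. The apply_filter helper is hypothetical and is not part of HTTP::Proxy.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The same callback as above: $_[1] is a reference to the body data,
# so the substitution edits the data in place.
my $callback = sub { ${ $_[1] } =~ s/foo/bar/g; };

# Hypothetical helper (not part of HTTP::Proxy): call a filter callback
# by hand, passing undef where the proxy would pass the filter object.
sub apply_filter {
  my ($cb, $body) = @_;
  $cb->(undef, \$body);
  return $body;
}

print apply_filter($callback, "foo fighters eat food"), "\n";
```

Note that the substitution is not word-aware: ‘food’ becomes ‘bard’ too.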

So I set out to create a proxy that replaces all images with kitties, because kitties are the best of the Internet. With the magic of HTTP::Proxy and placekitten we can write something like:

#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Proxy;
use HTTP::Proxy::BodyFilter::simple;

# Create proxy
my $proxy  = HTTP::Proxy->new(in => { port => 8080 });
my $filter = HTTP::Proxy::BodyFilter::simple->new(\&tamper_image);
$proxy->push_filter(mime => 'text/html', response => $filter);
$proxy->start;

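# Modify images
sub tamper_image {
  my ( $self, $dataref, $message, $protocol, $buffer ) = @_;

  # Collect every <img> tag that has a quoted src attribute
  my @matches = ($$dataref =~ m#(<img\b[^>]*?\bsrc\s*=\s*["'][^"']*["'][^>]*>)#gi);
  foreach my $match (@matches) {
    # List-context captures stay undef when the attribute is missing
    my ($width)  = $match =~ m#\bwidth\s*=\s*["'](\d+)["']#i;
    my ($height) = $match =~ m#\bheight\s*=\s*["'](\d+)["']#i;

    # Only tags with explicit numeric dimensions can be replaced
    if ($width && $height) {
      # \Q...\E makes the tag match literally, even if it contains ? . + etc.
      $$dataref =~ s#\Q$match\E#<img alt="" src="http://placekitten.com/$width/$height" />#;
    }
  }
}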

This code creates a proxy on port 8080 and assigns a filter to all HTML responses. The filter searches for img tags and replaces the source of each image.
If we run this script and configure our web browser to use a proxy on localhost:8080, we will start to see cats instead of some pictures.
Replacement of images by cats
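
The tag rewriting can also be tried without running a proxy. The following is a minimal sketch, not the filter from the script: kittify is a hypothetical helper that takes an HTML string and does the replacement in a single pass with s///e, leaving tags without both dimensions untouched.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: rewrite <img> tags in a string, outside the proxy.
sub kittify {
  my ($html) = @_;
  # One pass over every <img ...> tag; /e runs the code for each match.
  $html =~ s{(<img\b[^>]*>)}{
    my $tag = $1;
    my ($w) = $tag =~ /\bwidth\s*=\s*["'](\d+)["']/i;
    my ($h) = $tag =~ /\bheight\s*=\s*["'](\d+)["']/i;
    # Keep the tag untouched unless both dimensions are present.
    ($w && $h) ? qq{<img alt="" src="http://placekitten.com/$w/$h" />} : $tag;
  }gie;
  return $html;
}

print kittify(q{<p><img src="a.png" width="200" height="300"><img src="b.png"></p>}), "\n";
```

Here only the first tag gets a kitten; the second one has no dimensions and passes through as it was.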

Isn’t that great?

Well, it has some problems. To begin with, I used a regex to parse the HTML instead of a module dedicated to it, so this code messes with the HTML and breaks some things. Besides that, it only alters images that have width and height attributes defined. If the dimensions or the image are defined in CSS, JavaScript or other madness, there is no simple way to know the appropriate size to get the right cat.
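
One concrete pitfall of the regex approach, shown here as an illustration rather than as the exact failure I hit: when a matched tag is interpolated back into a pattern, characters such as ? and . are treated as regex syntax, so the tag may no longer match itself. The replace_tag helper below is hypothetical; it compares interpolating the tag as-is with escaping it first via quotemeta.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: replace one exact tag in $html with $new.
# If $literal is false the tag is interpolated as a pattern;
# if it is true the tag is escaped with quotemeta first.
sub replace_tag {
  my ($html, $tag, $new, $literal) = @_;
  my $pattern = $literal ? quotemeta($tag) : $tag;
  $html =~ s/$pattern/$new/;
  return $html;
}

my $tag  = q{<img src="cat.png?v=1" width="10" height="10">};
my $html = "<p>$tag</p>";

# Unescaped: "g?" means "optional g", so the literal "?" never matches.
print replace_tag($html, $tag, '<img src="kitten">', 0), "\n";
# Escaped: the tag is matched character by character.
print replace_tag($html, $tag, '<img src="kitten">', 1), "\n";
```

The first call leaves the HTML unchanged, while the second one replaces the tag.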

Because of these flaws I’ve tried another approach: tamper with the image data itself and leave the HTML code untouched.

#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Proxy;
use HTTP::Proxy::BodyFilter::simple;
use Imager;
use LWP::Simple qw($ua get);
# Some cats want an appropriate user agent
$ua->agent('Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0');

my($type, $port) = @ARGV;
$type ||= "cats";
$port ||= 8080;

# There are more things than cats on the Internet
my %PLACE_HOLDERS = (
  cats   => 'http://placekitten.com/WIDTH/HEIGHT',
  dogs   => 'http://placedog.com/WIDTH/HEIGHT',
  apes   => 'http://placeape.com/WIDTH/HEIGHT',
  random => 'http://pipsum.com/WIDTHxHEIGHT',
  puppy  => 'http://placepuppy.it/WIDTH/HEIGHT',
  sheen  => 'http://placesheen.com/WIDTH/HEIGHT',
);
$PLACE_HOLDERS{$type} || die "I don't know how to replace that: $type";

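# Create proxy. HTTP::Proxy hands the body to filters in chunks, so buffer the
# whole image first with the "complete" filter before tampering with it.
use HTTP::Proxy::BodyFilter::complete;
my $proxy  = HTTP::Proxy->new(in => { port => $port });
my $filter = HTTP::Proxy::BodyFilter::simple->new(\&tamper_image);
$proxy->push_filter(
  mime     => 'image/*',
  response => HTTP::Proxy::BodyFilter::complete->new,
  response => $filter,
);
$proxy->start;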

# Modify images
sub tamper_image {
  my ( $self, $dataref, $message, $protocol, $buffer ) = @_;

  eval {
    # Get original image data
    my $img = Imager->new(data => $$dataref);
    my ($w, $h) = ($img->getwidth(), $img->getheight());

    # Construct url
    my $url = $PLACE_HOLDERS{$type};
    $url =~ s#WIDTH#$w#;
    $url =~ s#HEIGHT#$h#;

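    # Get image (get() returns undef on failure, so bail out to the handler below)
    my $new_image = get($url);
    die "Could not fetch $url\n" unless defined $new_image;
    $$dataref = $new_image;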
  };
  if ($@) {
    $$dataref = '';
  }
}

This works in a similar way, but it applies the filter to all images and simply swaps the image data for a different one. Besides, there are more things than cats. You can use something like proxy.pl apes to get:
Replaced images by apes
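
The URL construction is the only part of the script that can be checked without a network connection. As a minimal sketch, assuming the same WIDTH/HEIGHT template format as the script, it can be pulled out into a small function. The placeholder_url name is hypothetical, and only two of the templates are repeated here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same template format as in the script above (only a subset of the services).
my %PLACE_HOLDERS = (
  cats   => 'http://placekitten.com/WIDTH/HEIGHT',
  random => 'http://pipsum.com/WIDTHxHEIGHT',
);

# Hypothetical helper: build the placeholder URL for a type and size.
sub placeholder_url {
  my ($type, $w, $h) = @_;
  my $url = $PLACE_HOLDERS{$type}
    or die "I don't know how to replace that: $type\n";
  $url =~ s/WIDTH/$w/;
  $url =~ s/HEIGHT/$h/;
  return $url;
}

print placeholder_url('cats', 200, 300), "\n";    # http://placekitten.com/200/300
print placeholder_url('random', 640, 480), "\n";  # http://pipsum.com/640x480
```

Keeping the templates in one hash means adding a new kind of animal is a one-line change.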

Extra ball

I’ve uploaded the code to GitHub and created a video tutorial:
