One of my current goals is to port all of Term::VT102 to XS (as Term::VT102::XS). Progress has been very slow. In a fit of impatience, though, I took a current copy of Term::VT102 and converted one of its functions to XS to see how much faster it would be. I replaced
    sub attr_unpack {
        shift if ref($_[0]); # called in object context, ditch the object
        my $data = shift;
        my ($num, $fg, $bg, $bo, $fa, $st, $ul, $bl, $rv);
        $num = unpack ('S', $data);
        ($fg, $bg, $bo, $fa, $st, $ul, $bl, $rv) = (
            $num & 7,
            ($num >> 4) & 7,
            ($num >> 8) & 1,
            ($num >> 9) & 1,
            ($num >> 10) & 1,
            ($num >> 11) & 1,
            ($num >> 12) & 1,
            ($num >> 13) & 1
        );
        return ($fg, $bg, $bo, $fa, $st, $ul, $bl, $rv);
    }
...with...
    void
    attr_unpack(self, sv_buf)
        SV *self
        SV *sv_buf
      PPCODE:
        /* self is accepted but unused. Reading buf[0] as the low byte
           matches unpack('S') only on a little-endian machine, and
           sv_buf is assumed to hold at least two bytes. */
        char *buf = SvPV_nolen(sv_buf);
        EXTEND(SP, 8);
        mPUSHs( newSViv( buf[0] & 7 ) );         /* fg */
        mPUSHs( newSViv( (buf[0] >> 4) & 7) );   /* bg */
        mPUSHs( newSViv( buf[1] & 1) );          /* bo */
        mPUSHs( newSViv( (buf[1] >> 1) & 1) );   /* fa */
        mPUSHs( newSViv( (buf[1] >> 2) & 1) );   /* st */
        mPUSHs( newSViv( (buf[1] >> 3) & 1) );   /* ul */
        mPUSHs( newSViv( (buf[1] >> 4) & 1) );   /* bl */
        mPUSHs( newSViv( (buf[1] >> 5) & 1) );   /* rv */
and ran a test script through Devel::NYTProf.
The pure-Perl version ran in this much time (9.44s is the time in line):
The XS version ran in this much time:
That is a speedup that cut the time spent in that sub to nearly one ninth of the original! Believe it or not, this function was one of the more expensive calls in my project code. I know I cheated by requiring $self in the XS function (which makes it completely incompatible with the original, since that one works both as a plain function and as a method, and I don't know how to support both in XS quite yet), but I hope that doesn't make too much of a difference speed-wise. If porting a single function cuts its inclusive time to 12% of the original, imagine how much faster things will be with the entire module ported to XS.