What's That Noise?! [Ian Kallen's Weblog]


20070425 Wednesday April 25, 2007

Intel Migration Pain With Perl

There's a bunch of code that I haven't had to work on in months. Some of it predates my migration from a PPC Powerbook to the Intel-based MacBook Pro. Now that I'm dusting this stuff off, I'm running into binary incompatibilities that are messin' with my head. I recompiled my Apache 1.3/mod_perl installation just fine, but after doing a CVS up on the code I need to work on and updating the installation, there's a new CPAN dependency. No problem, use the CPAN shell. Oh, Class::Std::Utils depends on version.pm and it's ... the wrong architecture. Re-install version.pm. Next, XMLRPC::Lite is unhappy 'cause it depends on XML::Parser::Expat and it's ... the wrong architecture.

Aaaaugh!

The typical error looks like

mach-o, but wrong architecture at /System/Library/Perl/5.8.6/darwin-thread-multi-2level/DynaLoader.pm
I just said "screw it" and typed "cpan -r" ... which looks to be the moral equivalent of "make world" from back in my FreeBSD days. Everything that has an XS interface just needs to be recompiled.

Compiling... compiling... compiling. I guess that'll give me time to write a blog post about it. OK, that's done, seems to have fixed things: back to work.

                 

( Apr 25 2007, 05:19:37 PM PDT ) Permalink


20070423 Monday April 23, 2007

Simple is as simple ... dohs!

I was working on an Evil Plan (tm) to serialize python feedparser results with simplejson.

 parsedFeed = feedparser.parse(feedUrl)
 print simplejson.dumps(parsedFeed) 
Unfortunately, I'm hitting this:
TypeError: (2007, 4, 23, 16, 2, 7, 0, 113, 0) is not JSON serializable 
I'm suspecting there's a dictionary in there that has a tuple as key and that's not allowed in JSON-land. So much for simple! Looks like I'll be writing a custom serializer for this. I was just trying to write a proof-of-concept demo; what I've proven is that just 'cause "simple" is in the name, doesn't mean I'll be able to do everything I want with it very simply.
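
One way such a custom serializer might look is sketched below. It rests on assumptions rather than on anything confirmed here: that the offending value is a time.struct_time (the nine-field tuple in the error has that shape), that the timestamp is UTC, and that flattening unknown objects to strings is acceptable for a demo. It uses the standard library's json module as a stand-in for simplejson, and to_jsonable is a hypothetical helper name.

```python
import json
import time


def to_jsonable(obj):
    """Recursively convert a parsed structure into JSON-friendly values."""
    # Check struct_time first: in modern Python it is a tuple subclass,
    # so the generic tuple branch below would otherwise swallow it.
    if isinstance(obj, time.struct_time):
        # Assumption: the timestamp is UTC, hence the trailing "Z".
        return time.strftime("%Y-%m-%dT%H:%M:%SZ", obj)
    if isinstance(obj, dict):
        # JSON object keys must be strings, so stringify every key
        # (this also covers the tuple-as-key case).
        return {str(key): to_jsonable(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_jsonable(item) for item in obj]
    if obj is None or isinstance(obj, (str, int, float, bool)):
        return obj
    # Fallback for anything else: store its string form.
    return str(obj)


# Hypothetical stand-in for a parsed feed; the timestamp is the one from the error.
sample = {"updated_parsed": time.struct_time((2007, 4, 23, 16, 2, 7, 0, 113, 0))}
print(json.dumps(to_jsonable(sample)))
```

Walking the structure before dumping keeps the conversion rules in one place and does not depend on how a particular JSON library treats tuple-like objects.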

I've had a long day. A good night's sleep and fresh eyes on it tomorrow will probably get this done but if yer reading this tonight and you happen to have something crafty up your sleeve for extending simplejson for things like this, let me know!

     

( Apr 23 2007, 10:50:21 PM PDT ) Permalink


20070422 Sunday April 22, 2007

Linux Virtual Memory versus Apache

I ran into a very peculiar case of an Apache 2.0.x installation with the worker MPM completely failing to spawn its configured thread pool. The hardware and kernel versions weren't significantly different from other systems running Apache with the same configuration. Here are the worker MPM params in use:

ServerLimit         40
StartServers        20
MaxClients        2000
MinSpareThreads     50
MaxSpareThreads   2000
ThreadsPerChild     50
MaxRequestsPerChild  0
But on this installation, with the same version of Apache and RedHat Enterprise Linux 4 as the rest, every time httpd started it would cap the number of threads spawned and leave these remarks in the error log:
[Fri Apr 20 22:54:24 2007] [alert] (12)Cannot allocate memory: apr_thread_create: unable to create worker thread 

It turns out that a virtual memory parameter had been adjusted, vm.overcommit_memory had been set to 2 instead of 0. Here's the explanation of the parameters I found:

overcommit_memory is a value which sets the general kernel policy toward granting memory allocations. If the value is 0, then the kernel checks to determine if there is enough memory free to grant a memory request to a malloc call from an application. If there is enough memory, then the request is granted. Otherwise, it is denied and an error code is returned to the application. If the setting in this file is 1, the kernel allows all memory allocations, regardless of the current memory allocation state. If the value is set to 2, then the kernel grants allocations above the amount of physical RAM and swap in the system as defined by the overcommit_ratio value. Enabling this feature can be somewhat helpful in environments which allocate large amounts of memory expecting worst case scenarios but do not use it all.
From Understanding Virtual Memory
The vm.overcommit_ratio value is set to 50 on all of our systems, but rather than fiddling with that, I set vm.overcommit_memory back to 0, which had the intended effect: Apache started right up and readily stood up to load testing.

So, if you're seeing these kinds of evil messages in your Apache error log, use sysctl and check out the vm parameters. I haven't dug further into why the worker MPM was conflicting with this memory allocation config; next time I run into Aaron, I'm sure he'll have an explanation in his back pocket.
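
One plausible explanation, which nothing above confirms: with vm.overcommit_memory set to 2, the kernel refuses allocations once committed address space would exceed swap plus overcommit_ratio percent of RAM, and each worker thread's stack reservation counts toward that total. The arithmetic can be sketched as below. The RAM and swap sizes and the 8 MiB per-thread stack are hypothetical figures chosen for illustration (the real stack size depends on ulimit and Apache settings), and the helper names are made up.

```python
def commit_limit_kib(ram_kib, swap_kib, overcommit_ratio=50):
    """Approximate commit limit under vm.overcommit_memory=2."""
    return swap_kib + ram_kib * overcommit_ratio // 100


def thread_stack_demand_kib(threads, stack_kib=8 * 1024):
    """Address space reserved for thread stacks (assumed 8 MiB each)."""
    return threads * stack_kib


# Hypothetical box: 4 GiB RAM, 2 GiB swap, overcommit_ratio of 50.
limit = commit_limit_kib(ram_kib=4 * 1024 * 1024, swap_kib=2 * 1024 * 1024)

# MaxClients 2000 = ServerLimit 40 x ThreadsPerChild 50.
demand = thread_stack_demand_kib(threads=40 * 50)

print("commit limit (KiB):", limit)
print("stack demand (KiB):", demand)
print("exceeds limit:", demand > limit)
```

Under those assumed numbers the stack reservations alone are several times the commit limit, which would be consistent with apr_thread_create failing partway through startup.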

                 

( Apr 22 2007, 08:19:57 PM PDT ) Permalink


20070416 Monday April 16, 2007

Character Encoding Foibles in Python

I was recently stymied by an encoding error (the exception thrown was a UnicodeError) on a web page that was detected as utf-8. The W3 Validator said it was utf-8 too, but for all my efforts to get a parsing class derived from python's SGMLParser to process it, the parser consistently bombed out. I tried chardet:

>>> import chardet
>>> import urllib
>>> urlread = lambda url: urllib.urlopen(url).read()
>>> chardet.detect(urlread(theurl))
{'confidence': 0.98999999999999999, 'encoding': 'utf-8'}
...and yet the parser insisted that it had hit the "'ascii' codec can't decode byte XXXX in position YYYY: ordinal not in range(128)" error. WTF?!

On a hunch, I decided to try forcing it to be treated as utf-16 and then coercing it back to utf-8, like this

parser.feed(pagedata.encode("utf-16", "replace").encode("utf-8"))
That worked!

I hate it when I follow an intuited hunch and it pans out, but I don't have any explanation as to why. I just don't know enough about the details of python's character encoding behaviors to debug this further; most of my work is in those Curly Bracket languages :)
If any python experts are having any "OMG don't do that, here's why..." reactions, please let me know!
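
A possible explanation, offered as a guess rather than a diagnosis: the "'ascii' codec can't decode" message is what Python raises when raw bytes containing non-ASCII values get decoded with the ascii codec, which Python 2 does implicitly whenever byte strings and unicode strings are mixed. The usual remedy is to decode the fetched bytes explicitly, once, with the detected encoding. The sketch below is written for Python 3, and decode_page and fails_as_ascii are hypothetical helper names.

```python
def decode_page(raw_bytes, encoding="utf-8"):
    """Decode fetched bytes to text, replacing any undecodable sequences."""
    return raw_bytes.decode(encoding, "replace")


def fails_as_ascii(raw_bytes):
    """Return True if the bytes cannot be decoded with the ascii codec."""
    try:
        raw_bytes.decode("ascii")
    except UnicodeDecodeError:
        return True
    return False


# Hypothetical page content with one non-ASCII character.
raw = "caf\u00e9".encode("utf-8")
print(fails_as_ascii(raw))   # the ascii codec rejects the two-byte sequence
print(decode_page(raw))      # decoding as utf-8 recovers the text
```

Feeding the parser text that has already been decoded avoids relying on any implicit conversion along the way.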

           

( Apr 16 2007, 11:28:31 AM PDT ) Permalink