What's That Noise?! [Ian Kallen's Weblog]

20050412 Tuesday April 12, 2005

Hello Berkeley DB

This morning, I wanted to get familiar with the Berkeley DB "Java Edition" API (that's a mouthful, can't I just call it "sleepycat"?). I was in a carpool and I don't think the dude driving realized I was hacking-in-traffic. While I've happily used DB_File in Perl for years-n-years, I haven't had time/opportunity to mess with the Java stuff from Sleepycat. However, at work we're cooking up a durable message buffer with an embedded servlet container, fun! As with poking into any new API, I like to start with Hello World.

Here's my Sleepycat Hello World

import com.sleepycat.je.Cursor;
import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseEntry;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.je.LockMode;
import com.sleepycat.je.OperationStatus;
import java.io.File;

public class HelloBdb {

    public static void main(String[] args) throws Exception {
        String key = args[0];
        String value = args[1];

        File dir = new File("db");
        dir.mkdirs();

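        // allowCreate defaults to false for both the environment and the
        // database, so turn it on or the first run against an empty
        // directory fails
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setAllowCreate(true);
        Environment env = new Environment(dir, envConfig);

        DatabaseConfig dbConfig = new DatabaseConfig();
        dbConfig.setAllowCreate(true);
        Database database = env.openDatabase(null, "foobar", dbConfig);

        // store the key/value pair given on the command line
        database.put(null, 
            new DatabaseEntry(key.getBytes()), new DatabaseEntry(value.getBytes()));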

        DatabaseEntry foundKey = new DatabaseEntry();
        DatabaseEntry foundData = new DatabaseEntry();

        Cursor cursor = database.openCursor(null, null);
        while (cursor.getNext(foundKey, foundData, LockMode.DEFAULT) == 
                OperationStatus.SUCCESS) {
            String keyString = new String(foundKey.getData());
            String dataString = new String(foundData.getData());
            System.out.println("Key | Data : " + keyString + " | " + 
                dataString + "");
        }
        cursor.close();
        database.close();
        env.close();
    }
}

Of course, the real fun will be running this in a multi-threaded environment and the concurrency issues therein. With Hello World done, it's time to move on to see what else needs to be added to the cookbook.

( Apr 12 2005, 08:57:09 PM PDT ) Permalink


20050410 Sunday April 10, 2005

del.icio.us investment details

Joshua Schachter just announced the specifics of the del.icio.us investment. It's an intriguing cast of characters.

Here's the roster:

On the mound: Union Square Ventures
specializes in infomediation
Catcher: AMZN
Amazon is certainly interested in how you tag the products you're interested in
1st base: Marc Andreessen
what's he up to now?
2nd base: BV Capital
Bertelsmann Ventures are playahs
short stop: Esther Dyson
another leading light
3rd base: Seth Goldstein
another
Right field: Tim O'Reilly
book mogul
Center field: Bob Young
founder of RedHat
Left field: Josh Kopelman
The half.com guy
DH: Howard Morgan
Not sure, IdeaLab?
I've been a del.icio.us fan for a long time; this seemed inevitable. Looks like a lot of good folks, congrats to del.icio.us!

( Apr 10 2005, 06:38:37 PM PDT ) Permalink


20050405 Tuesday April 05, 2005

Talk To The Blog Cause The Hand Isn't Typing

If you'd rather dictate a post than type one, check out Audioblogger. It works with a Blogger account by posting audio that you call in to a number that Audioblogger provides.

With all of the phones these days supporting crude digital video features, it seems like just a matter of time before vblogging comes of age. Er, hold the phone, Google is archiving video clips now. In the meantime, I'm imagining James T. Kirk rambling philosophically to his blog about a recently visited corner of a previously uncharted galaxy and posting video mashups of interstellar foibles.

( Apr 05 2005, 07:55:31 AM PDT ) Permalink


20050404 Monday April 04, 2005

How Not to Learn Japanese

I've been collaborating with a team in Tokyo a lot over the past few months. They've been very gracious about speaking English with me but I've been unable to reciprocate. Anyway, I've had my eye on the W3C conference in Chiba. I should probably learn to speak a little Japanese.
This is what I'm not going to be doing to learn Japanese:
Kanji Quiz Toilet Paper
This is an extremely extraordinary item, toilet paper with the power to teach! For those who prefer to "sit in the library" this great paper can provide... hours (?) of fun and prepare them for that pop quiz in Japanese class the next morning. Three different methods of learning are provided "multiple choice", "reading", and "philosophical fill-in-the-blank". Written in a soft blue color on white. A superb item for anyone interested in studying Japanese, it's really cool on many different levels.
- eBay blurb where this was offered for $4.80
Maybe I'll just take the phrase book or something.

( Apr 04 2005, 01:43:29 AM PDT ) Permalink


20050331 Thursday March 31, 2005

Core Competencies

Has Google jumped the shark and been eclipsed by Yahoo!?

Clearly Yahoo!360 demonstrates Yahoo!'s competency: building a relationship with its users. All of the web mail, 'My Yahoo!' and 'Yahoo! Groups' user features are basically good personalized outward looking experiences with (more or less) single sign-on. Contrast this with Google's user features: the disconnected islands of GMail, Orkut and Blogger each have separate identities and user populations.

360 adds social networking and blogging to Y!'s feature set and integrates seamlessly with (some) other Y! features. I anticipate seeing Flickr and other innovative features making their way into this soon enough. So it's transforming from a personalized outward looking platform to one for personalized networking. Google's core competency seems to be stuck at "sooner or later, we're indexing everything" -- not that that's a bad thing. Unlocking the world's accumulated knowledge is cool.

So while I don't know if I agree that Yahoo! has overtaken Google, they're both clearly magnifying what they're good at lately. And I'm sure the Googlers aren't sitting still; they've got some smarty pants people there. In the meantime, Yahoo! clearly knows more about providing a platform for people.

( Mar 31 2005, 12:36:39 PM PST ) Permalink


20050330 Wednesday March 30, 2005

Rob Zombie Becomes More Human

How fitting that in the same week I found myself prowling around the grinding heavy metal equivalent of a NASCAR-for-nerds rally, I learned of Rob Zombie's blog. What's next? Lars Ulrich releasing music under a Creative Commons license?

I am the ripper
Man a locomotion
Mind love american
Style
More human than human
...stranger, sicker things have happened

( Mar 30 2005, 10:51:43 PM PST ) Permalink


20050329 Tuesday March 29, 2005

EVDB

The Events and Venues Database just went beta.

I'm looking forward to subscribing to follow specific types of events and seeing what EVDB does to identify important events.

( Mar 29 2005, 09:04:27 PM PST ) Permalink


20050328 Monday March 28, 2005

RoboGames 2005

There was almost a Homebrew Computer Club feel (not that I was there but I can imagine the ambient excitement and ferment) to the mix of people who turned out for RoboGames 2005. This is the event formerly known as and it was heaps 'o fun!

In a future time
Children will work together
To build a giant cyborg

Robot Parade
Robot Parade
Wave the flags that the robots made
Robot Parade
Robot Parade
Robots obey what the children say

There's electric cars
There's electric trains
Here comes a robot with electric brains

Robot Parade
Robot Parade
Wave the flags that the robots made
Robot Parade
Robot Parade
Robots obey what the children say
 
Boy versus Robot
The boy has a better CPU but the robot has a blue samurai sword that lights up!
I took a few pictures. This is the kind of endeavor I could imagine sucking up all of my time. For now, I'm content to take an aloof armchair interest in robotics.

( Mar 28 2005, 01:12:05 AM PST ) Permalink


20050327 Sunday March 27, 2005

Tagbacks

Just wanted to post a quick clarification from some of the discussion last month about .

The comment about Technorati's robots.txt is correct. There are no rules there for the tag pages, but the tag pages themselves instruct the googlebot with nofollow; this is in the markup:

<meta name="robots" content="index,nofollow" />
This won't stop tag spam that's just trying to get accidental clickthroughs (i.e. we know we have some more work to do) but it will deny google juice rewards (which is what they really want).
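
To make the effect of that tag concrete, here's a rough Java sketch of how a well-behaved crawler might read the content attribute. The class and method names (RobotsMeta, parse) are made up for illustration and this is certainly not how the googlebot is implemented; it only follows the conventional meaning of the noindex, nofollow and none directives, with "allowed" as the default.

```java
import java.util.Locale;

// Hypothetical helper: interprets the content attribute of a robots meta tag.
final class RobotsMeta {
    final boolean index;   // may the page itself be indexed?
    final boolean follow;  // may the links on the page be followed?

    private RobotsMeta(boolean index, boolean follow) {
        this.index = index;
        this.follow = follow;
    }

    static RobotsMeta parse(String content) {
        // Assumption: anything not explicitly forbidden is allowed
        boolean index = true;
        boolean follow = true;
        for (String token : content.split(",")) {
            String directive = token.trim().toLowerCase(Locale.ROOT);
            if (directive.equals("noindex")) {
                index = false;
            } else if (directive.equals("nofollow")) {
                follow = false;
            } else if (directive.equals("none")) {
                index = false;
                follow = false;
            }
            // "index", "follow" and unknown tokens leave the defaults alone
        }
        return new RobotsMeta(index, follow);
    }

    public static void main(String[] args) {
        RobotsMeta meta = RobotsMeta.parse("index,nofollow");
        // prints "index=true follow=false"
        System.out.println("index=" + meta.index + " follow=" + meta.follow);
    }
}
```

So with index,nofollow the tag pages can still show up in search results, but the links on them don't earn anything.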

( Mar 27 2005, 06:17:23 PM PST ) Permalink


20050316 Wednesday March 16, 2005

The Web Spammed By Jesus

There is no shortage of funny, sad and scary things fizzling out of the magma of blogs.

The recent interest Technorati has taken in web spam has perhaps inspired the divine. Note the recent appearance of http://jesus-the-lord.blogspot.com/ but linking to it without nofollow is most certainly a sin.


( Mar 16 2005, 01:59:38 PM PST ) Permalink


20050215 Tuesday February 15, 2005

A Java i18n Checklist

I've worked on Java projects with the human language aspects abstracted out but now that I've had to get down and dirty with real localization problems, I'm starting to take mental notes in the "if I had known that earlier it would've saved me a lot of grief" folder.

It seemed pretty straightforward going into the project that I'm working on:

  1. Text is stored as UTF-8 character data
  2. Use ResourceBundle property files to manage display strings
  3. Maintain the set of property keys in the properties file
  4. Let the browser's Accept-Language request headers drive what property file to prefer
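
Items 2 and 4 of that list can be sketched roughly like this. It's only an illustration under some loud assumptions: LocalePicker and the greeting key are made up, the in-memory strings stand in for real Messages.properties and Messages_ja.properties files, and Locale.LanguageRange needs Java 8 or later. In a servlet you'd more likely just ask the request for its locale.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical sketch: choose display strings from an Accept-Language value.
final class LocalePicker {
    private final Map<Locale, ResourceBundle> bundles = new LinkedHashMap<>();
    private final Locale fallback = Locale.ENGLISH;

    LocalePicker() {
        // Stand-ins for property files; the Japanese value is "konnichiwa"
        bundles.put(Locale.ENGLISH, load("greeting=Hello"));
        bundles.put(Locale.JAPANESE, load("greeting=\u3053\u3093\u306b\u3061\u306f"));
    }

    private static ResourceBundle load(String properties) {
        try {
            return new PropertyResourceBundle(new StringReader(properties));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the best supported locale for the header, or the fallback
    Locale choose(String acceptLanguage) {
        if (acceptLanguage == null || acceptLanguage.trim().isEmpty()) {
            return fallback;
        }
        List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguage);
        Locale match = Locale.lookup(ranges, bundles.keySet());
        return match != null ? match : fallback;
    }

    String greeting(String acceptLanguage) {
        return bundles.get(choose(acceptLanguage)).getString("greeting");
    }

    public static void main(String[] args) {
        LocalePicker picker = new LocalePicker();
        System.out.println(picker.choose("ja-JP,ja;q=0.9,en;q=0.5"));
        System.out.println(picker.choose("fr-FR,fr;q=0.8"));
    }
}
```

A header that asks only for languages you don't have (the French one here) drops through to the fallback locale rather than failing.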
See, it's easy! Well, for simple proofs of concept with Western characters, it's just about that easy. When dealing with multibyte strings for Asian languages, there's a whole lot more to consider.
  1. Make sure the servlet container is handling UTF-8 appropriately
    For instance, if Tomcat is serving on the HTTP tier, edit server.xml and make sure the URIEncoding attribute (absent by default) is set for the connector.
        <Connector port="8080"
                   maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
                   enableLookups="false" redirectPort="8443" acceptCount="100"
                   debug="0" connectionTimeout="20000"
                   disableUploadTimeout="true"
                   URIEncoding="UTF-8"
        />
    
    The same holds true for letting Apache do the HTTP dirty work and connecting with mod_jk
        <Connector port="8009"
                   enableLookups="false" redirectPort="8443" debug="0"
                   URIEncoding="UTF-8"
                   protocol="AJP/1.3" 
        />
    
    And, by the way, if you have static content served by an Apache server, you probably want this as well
    AddDefaultCharset utf-8
    
  2. Wire up the native2ascii ant task into the build system early in the project.
    Manually dealing with the ASCII escaping is a nuisance. If the conversion can't be transparent, at least automate it.
  3. Make sure the database connection drivers are being gentle with their data handling.
    In the case of MySQL, changing the JDBC URLs from this
    jdbc:mysql://localhost/fubar
    
    to this
    jdbc:mysql://localhost/fubar?useUnicode=true&characterEncoding=UTF-8
    
    made a world of difference.
  4. Check the HTTP response headers to assure that the Content-type header value is appropriate
    If the charset isn't set to UTF-8 when it really is, you could be confusing the client. This can be set in a servlet, in a JSP and IIRC the struts-config.xml allows you to set it declaratively. You want to set the Content-type before writing to the response object's PrintWriter. Apparently if you have multibyte characters in your JSP page components, you need to set the pageEncoding, i.e. in the JSP file itself, something like this:
    <%@ page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8" %>
    
    Though my whole motivation for using Java on this project was to have page components have only markup and display code; all of the language is abstracted. Anyway, I'm preferring Velocity over JSP these days.
  5. Be prepared to convert request parameter values.
    In my experience, doing this
    request.setCharacterEncoding("UTF-8");
    
    before getting the parameter values is not reliable (could be Tomcat bugs though). However, this appears to be a fairly standard idiom
    String formValue = new String(request.getParameter("formParam").getBytes("ISO8859_1") /* bytes */, "UTF-8");
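
    To see why that idiom works, here's a small standalone sketch with no servlet container involved. ParamRecoder and latin1ToUtf8 are hypothetical names; the main method just simulates a container that decoded UTF-8 bytes as ISO-8859-1, which is the situation the idiom assumes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper demonstrating the re-decoding idiom in isolation.
final class ParamRecoder {
    // Recovers the original bytes of a string that was wrongly decoded as
    // ISO-8859-1 and decodes them again as UTF-8.
    static String latin1ToUtf8(String misdecoded) {
        byte[] rawBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u65e5\u672c\u8a9e"; // the word for "Japanese language"
        byte[] onTheWire = original.getBytes(StandardCharsets.UTF_8);

        // Simulate a container that assumed ISO-8859-1 for the parameter
        String asSeenByServlet = new String(onTheWire, StandardCharsets.ISO_8859_1);

        String repaired = latin1ToUtf8(asSeenByServlet);
        System.out.println(original.equals(repaired)); // prints "true"
    }
}
```

    It works because ISO-8859-1 maps every byte to exactly one character, so the round trip gets the original bytes back. Only do this when the container really did decode as ISO-8859-1; running it over a string that was already decoded correctly will mangle it.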
    
There are other i18n traps to beware of; seems like every place data is passed from one subsystem to another there's an opportunity for the encoding to get mangled.
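
Going back to the native2ascii item in the checklist, this is roughly what the escaping amounts to. It's a sketch of the idea, not the tool's actual source: AsciiEscaper is a made-up name, and it simply rewrites every character outside 7-bit ASCII as a \uXXXX escape, the form that property files expect.

```java
// Hypothetical sketch of the escaping that native2ascii performs.
final class AsciiEscaper {
    // Rewrites every char outside 7-bit ASCII as a backslash-u escape
    // with four hex digits; ASCII chars pass through untouched.
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A property line whose value is "konnichiwa" in hiragana
        String line = "greeting=\u3053\u3093\u306b\u3061\u306f";
        System.out.println(escape(line));
    }
}
```

Doing this by hand for every translated string is exactly the nuisance the ant task saves you from.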

( Feb 15 2005, 10:17:29 PM PST ) Permalink


20050214 Monday February 14, 2005

Time Slips Away

Another day of end-to-end frenzy and the day is still not done. Short meetings turn long. Interruptions and blocking issues. Who has time for anything? Who has time to blog?

I just took my trusty old Swiss Army watch out of my pocket; I shoved it in there on the way out the door this morning and have had nary a chance to adorn my wrist with it all day. And now it's too late. Where goes the time?

( Feb 14 2005, 09:24:56 PM PST ) Permalink


20050127 Thursday January 27, 2005

Speed Isn't Everything and It's Crap Without Automated Testing

The software project management triangle has these three things at the corners:
  1. scope
  2. speed
  3. cost

The third item refers to allocating more resources (typically, people) to projects. The idea is that if you're resource starved (short handed) you need to reduce scope and/or extend the schedule (sacrifice speed) to compensate. If you increase scope, either the schedule slips or costs increase (or both). Of course, throwing more resources at the problem is often counter-productive.

There's another triangle associated with software development:

  1. rigor
  2. automated testing
  3. quality
Most software projects involving technology innovation can ill-afford to lean too close to the rigor corner; problems that haven't been identifiably solved before are intrinsically riskier. Rigorous development practices such as old-school waterfall processes are usually a bad fit where the emphasis has to be on speed of execution and technical creativity. Culturally, automated testing can't just be a function for quality assurance staff; unit testing, integration testing and functional testing have to be built into the DNA of a software project's build system or else the engineering costs for writing and running tests grow too high to sustain. So the third option is winging it in an ad-hoc fashion and living with the associated reduction in quality. Obviously for things like flight control and medical monitoring where lives are at stake, you want rigor and automated testing, otherwise bad things happen. Spacecraft blowing up and/or becoming astro-debris, missile guidance systems failing or heart transplant patients dying due to software defects are tragic. But most software projects don't have those kinds of consequences attached. Thus the incentive is high for making test-driven development part of the software engineering culture wherever speed of execution and breadth of innovation are among the primary motivations.

So if low quality is crap, perhaps there's a mathematical expression here

  SPEED - AUTOMATED TESTING = CRAP
...and perhaps it's even transitive: high quality + automated testing = speed.

Speaking of which, it must be time for more coffee.

( Jan 27 2005, 10:11:18 AM PST ) Permalink


20050123 Sunday January 23, 2005

Technorati + Firefox

Search tools in browsers are usually pre-populated with all of the usual suspects: Google, Yahoo, Amazon and so on. That's all well and good but what about when you want to know about what's being said on the real-time web?

There's now a Firefox plugin that adds a "Technorati Engine" to the pulldown list. Sweet!

( Jan 23 2005, 09:45:08 AM PST ) Permalink


20050122 Saturday January 22, 2005

Start Up Pains

I've posted here about Technorati's misadventures in recent months: power outages, disk upgrades, data center moves and so on. In the life of a startup, these kinds of foibles seem to be just a part of growing up.

So I've empathized with and enjoyed recent readings of others' mishaps. Not in celebration, but perhaps in feeling the bonds of shared trauma.

FlickrBlog's Growing Pains
  • unbalanced load/capacity distributions
  • bottlenecked queues
  • non-specific system instabilities
Word to your mutha, I know that.
LiveJournal's Power-loss post-mortem
  • hardware configuration/system problems that exhibited themselves as hosts that didn't come back up correctly
  • MySQL databases getting hosed
  • Databases tuned for speed over safety
Been there, done that.
So is it the destiny of all web service start-ups to have fabulous disasters? Probably. A lot of times, these kinds of things are predicated on having innovative architectures, on doing some things that aren't widely known to have been done that particular way. Can you really guard against finding yourself with arrows in your back when you're out on the frontier? I don't think so. Count on the topsy turvy. It's not always fun when you're in the thick of it but if you adapt, you'll be better for it.

Of course, you could just laugh about it. Or post to your blog about it. Or both. So far, from what I've stumbled across, this is the funniest of the bunch.

10:11 pm: So far so good. Things are checking out, but we're wearing tinfoil hats. A few annoying LJ users, but nothing that's not fixable. We're going to be buying a bunch of weed on Monday so that, if this happens again, we'll just be too baked to care.
This weekend's disastrophe for me is relatively mild: sore throat and congestion. So I'm drinking tea. And laughing about it. And posting to my blog.

( Jan 22 2005, 08:42:19 PM PST ) Permalink