... and a happy new year!

There's a lot to do here, so just some quick notes about the past two days:
  • I really walked into the wrong direction on the day I wrote about in my last post. But it isn't dangerous and I like going there for the pizza.
  • There is no point in describing Manhattan to you, you must experience it yourself. The words "awesome" and "fuck yeah" must suffice.
  • If you like shopping, the Fifth Avenue in Manhattan will give you an aneurysm out of pure, unfiltered joy. We dedicated a whole day to buying some clothes and in the end I got more clothes than I probably ever did, and I'm still below the free amount of what I am allowed to bring back home without paying for customs.
  • New Yorkers are very proud of their city, rightly so. They rub this into your face by constantly being friendly, helpful and entirely non-condescending. (This also made the shopping experience so much better)
  • No ball dropping for us: we didn't spend New Year's Eve on Times Square because we would have had to secure our spot in the crowd of 1 million people for hours, which would have been uncomfortable in the rain. We got really cool fireworks in Central Park instead.
  • Americans do not brew beer.*

* This is not up for debate.

New Year, New York

So, originally we were to arrive in New York City at December 27th, but because of the aggressively tightened security measures after the recent incident, we arrived at Toronto Pearson International Airport only to get told that our flight to NYC had been canceled. Because of this we had to spend 2 nights in two different Toronto Airport hotels at the expense of Air Canada. We haven't seen a lot of Toronto (except for the mall), though.

But now, this morning, we finally arrived in New York and it's already great. We took a taxi from the Newark airport to our hotel in Queens and got a good impression of that big, big city out of it. Its buildings, streets, and constructions; it's basically everything I expected it to be and of course a whole lot more.

My friends decided to go to Manhattan already, pretty soon after we arrived at our hotel, but after having to get up at 2:15am in Toronto last night (to get our Newark-bound flight at 6:40am) and without any real sleep anyway I decided to doze off a little bit more in my third hotel bed this week instead.

So later I woke up, got a shower, dressed myself nicely for a quick first glance in the western Queens part of New York and at about 2:00pm I was out on the streets.

I'm not exactly sure if that's something I'm really supposed to do, being all alone on those streets near here, because it certainly looks different than anything else I've ever seen. It's raw, weathered and full of trash on the sidewalks and there are some people out there who are clearly on something. But it's a bright sunny day (still very cold) and I figured that I wouldn't get immediately mugged as long as I minded my own business and watched my back a little. This isn't 80s New York anymore after all and no matter what it looks like to my Munich-pampered senses, nobody said that this was a particularly dangerous corner.

So I got to keep my liver and went into a little pizza parlor somewhere on 21st where some mexican people sat around and a mostly spanish speaking guy sold a slice of tasty looking pepperoni pizza to me, which I ate at a table there. My first New York pizza slice, and now I can attest to what people say: they really are the best.

After eating it I had to check out a nearby subway station. It looked really rotten but so fantastically ... well... authentic, like the rest of what I saw, and when my look at that combined ATM/ticket vending machine got too confused (because I wasn't sure which metro pass to choose) I got immediate help from a guy who was passing by, who, possibly through some mysterious New Yorker divination, just pressed the buttons for a week-long unlimited pass and left.

errno.h

They squeezed Text file busy into ETXTBSY and no such process into ESRCH (very obvious). They even left out an S in EACCES and when your Argument list is too long, that condition is mnemonically referred to as E2BIG.

But File name too long is, of course, called ENAMETOOLONG.

KDE 4.1...

... seems at least usable. 'think I'm gonna keep it.

Google Treasure Hunt finished

Better late than never. After finishing the first three questions quite shortly after they came out, I neglected to even check when the next (and apparently final) question of Google Treasure Hunt 2008 would be available.

Today, however, was a slow day, so I just decided to finish that last one, too and now I'm done.

Here's a quick summary of the questions and a basic outline of how I solved each of them.

1. The Robot

A robot is in the left upper corner of a rectangular grid. It has to reach the goal on the right lower corner of that grid. With every move, the robot can go either down or right to reach the next square of the grid. No other directions are allowed. The question is: how many different paths can the robot take to reach the goal?

How I solved it: The number of further possible paths for each grid follows a pretty simple recursion: for each grid that is not adjoint to the goal, the number of possible paths is 2 (since it can go either down or right) plus the number of possible paths when standing on the right neighbour (which would be the next square if you chose to go right), plus the number of possible paths when standing on the lower neighbour. The base cases are for the squares directly above or left from the goal, there is just one possible path there.

Writing down that recursive function is pretty trivial. But it's not enough: the width and height of Google's rectangular grid lies somewhere between 30 and 60, which drives the naive approach infeasible.

The solution is to apply a straightforward approach of dynamic programming: start by computing the number of paths for every square in the rightmost column (which is 1, because you can't go right when being in the last column), then proceed to compute the left neighbouring column, starting with the lowermost square. During each computation, you can immediately use the memorized path numbers of the square's lower and right neighbours without any recursion. This is enough to get the correct answer in a few seconds.

A better approach: I didn't notice it, but the problem can easily be reduced to the binomial coefficient or choose function: since the robot can't ever go back and it always has to reach the opposite corner, every path has the same amount of down moves (D) and right moves (R) and it's sufficient to calculate the number of possible combinations of width-1 R-moves and height-1 D-moves.

Some people apparently chose to do the whole thing in Excel or OpenOffice: every grid square is represented by a sheet cell and each cell contains a formula that calculates the number of paths based on the neighbouring cells, just like I explained above. That's actually a pretty funny way to do some limited functional programming with automatic memoization!


2. The Zip File

You have to download a ZIP archive, which contains several text files in several subdirectories. For each given path patterns (simple ones, the paths must contain a substring and the filenames have to end with a specific extension), find the matching files and take the sum of the numbers in a specific line number. Multiply all sums together, enter the result. Example:

Sum of line 1 for all files with path or name containing BCD and ending in .rtf
Sum of line 5 for all files with path or name containing EFG and ending in .js
Hint: If the requested line does not exist, do not increment the sum.

Multiply all the above sums together and enter the product below.



How I solved it: This one was really easy and I solved it with a quick shell one-liner, using mostly find and sed. There may have been some awk in there, too, but there's really a million ways to do, and they're all pretty easy.

3. The Network


This one was even easier than the last one. You get a number of hosts (named A,B,C...) and a routing table which contains four entries for every host (including a default gateway). You are given a starting host name and a destination IP and your goal is to give the sequence of hosts that a packet would travel.

I don't know what they're aiming at. You need to know next to nothing to solve it, and you don't even need to use your computer in any way. In fact, people who never cared about informatics or mathematics at all could easily solve it after a 5 minutes introduction of how a routing table works.

How I solved it: I did it manually, by following the routes and writing down which hops the packet went through. The only part of the computer that I used was the screen. By looking at it.

It would probably be a good idea to fire people who try to write a program to solve this problem.

4. The Primes

This got more interesting again. You have to find the smallest number which can be expressed as the sum of several given amounts of consecutive prime numbers and which is also a prime number itself. Here's an example:

Find the smallest number that can be expressed as
the sum of 17 consecutive prime numbers,
the sum of 43 consecutive prime numbers,
the sum of 57 consecutive prime numbers,
the sum of 569 consecutive prime numbers,
and is itself a prime number.
How I solved it: Because the constraints are about consecutive prime numbers, all you need is contiguous windows over a (theoretically infinite) list of prime numbers. Except for the last constraint that states that the matching number should be a prime number, which is equivalent to the number being member of the list.

With a lazy language, where infinite lists are no problem, you could write a more or less complex function which generates that list of prime numbers, but don't: there are plenty of precomputed lists of prime numbers on the internet. Just read one in and use it, it's much less hassle.

The first thing to do is to find a prime number candidate. I did this by sliding the largest window (whose size is the amount of summed numbers in the last constraint) until a prime number is found. The check for a prime number itself is a simple search over the prime number list.

Once that candidate is found, it's time to check the other constraints, in descending order of their sum size. To do that, you slide a new, smaller window over your prime list, its size again being the number of summands in the current constraint, until its sum matches your candidate. However, you don't have to start the windows at the beginning of the list: quick reasoning tells you that any sum of numbers within or ahead of a bigger window whose number of summands is less than the previous window will be smaller than the sum of that window and thus can't be your candidate. It is sufficient to start at i-n+1, where i is the first index position after the bigger window and n is the size of the current window. Since you are checking your constraints in descending order of the window size, you can always use the upper bound of the previous constraint's window, subtract the current size and add 1.

All constraints have to match, so once the sum of any window gets larger than your candidate, you have to start over and look for a new candidate.

The sliding itself can be optimized a bit: instead of just increasing the indexes and recalculate the full sum, it's enough to simply subtract the first number in the window and add the next number outside of the windows upper bound. The algorithm will be fast enough in any way.

As you can see, I didn't make any assumptions about prime numbers at all. In fact, the whole algorithm works with any arbitrary list of numbers, as long as it's ordered.


This was fun (except that the two not so challenging questions in the middle were over pretty fast), let's do it again next year. Next time, it might be a good idea to start solving as soon as the questions come out, maybe I'll have the chance to get one of the treasures :)

Firefox: instantly delete saved form and username/password data with the Delete key

How many times have you entered your username and password into a websites login form and then clicked Remember to allow Firefox to automatically fill out your account information the next time, only to discover that you mistyped something?

It seems like the only way to resolve this situation is to access the list of saved passwords in Preferences and manually search and delete the wrong entry. Which is quite a tedious process, and often you skip this step and hence have to choose your account from a drop-down list of entered account data, at least one of them being garbage.

Another situation: Sometimes you want to delete a single form data entries which have been saved in Firefox, because they are mistyped or otherwise unnecessary but still always show up in a frequently used form, or because it contains private information.

This time, it seems like the only way to do so is to clear all your saved form data.

Not so! A colleague showed me a much better way today:

While the auto-completion drop down list is presented to you, simply press your keyboard's Delete key to instantly remove the entry your mouse is hovering above. No matter whether it's saved form data or a saved username/password combination.

Works for Internet Explorer, too, I've been told.

spreading a single request over multiple access logs impossible with lighttpd?

It seems that lighttpd can't spread one HTTP request over multiple, possibly differently formatted access logs. At least I couldn't find anything in mod_accesslog's documentation or with some quick Google searches.

This is an example of how to specify the format and destination for access logging:

accesslog.filename = "/var/log/lighttpd/access.log"
accesslog.format = "%h %V %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""

You may notice that there are just the two variables accesslog.filename and accesslog.format, only referencing "the" access log instead of several individual ones, and that the assigned values are strings, not lists (I tried specifying lists, lighttpd doesn't accept them).

Now, spreading different, individual HTTP requests over multiple log files is still possible. For example, you might want to log request for different virtual hosts into different log files, an intention which is quite common. To do this, simply use lighttpd's conditional configuration syntax, something which I like very much about lighttpd, and reassign different values to accesslog.filename (and/or accesslog.format):

$HTTP["host"] == "example.com" {
accesslog.filename = "/var/log/lighttpd/example.com-access.log"
}


But it seems that you can't log the same HTTP requests to multiple destinations. Perhaps you want to have one big access log that contains every request for every virtual host and several specific access logs that each contain requests to an individual virtual host. Or you want to have two access logs which contain exactly the same requests, but with different formats.

I did the latter one under apache, in order to log more information for debugging purposes, without disrupting the scripts that gather statistics from the common access logs. That's how I discovered lighttpd's apparent lack of ability to do so.

One workaround would be to pipe into a script instead of writing directly into a file, which lighttpd supports out of the box:

accesslog.filename = "|/path/to/some/log-script"


But that's adding an additional layer of complexity and you can't easily use lighttpd's conditional configuration to decide on the destination anymore. Additionally, in the above debugging scenario where different log files contain the same requests but with different formats, you'd have to specify the set union of all format options in your lighttpd configuration and parse and reformat inside of your script.

Is there anything I have overlooked or are there any plans for future lighttpd versions?

Internet Explorer 7 caches incomplete downloads sometimes

Internet Explorer 7 shows a strange issue with downloads that haven't finished for various reasons. After one incomplete download attempt, users are unable to properly restart the download in order to fetch the complete file. Subsequent tries seem to succeed immediately, without the truncated file actually being completed or replaced.

Thus, once a download was aborted, the user will have difficulties to download the file at all.

This problem occurs when some isolated issue caused a premature end of the HTTP connection, even though the server would normally be able to serve the file completely. I am not yet sure about the exact conditions, but it happened often enough for users to complain. The access logs are pretty clear about it when it happens: one 200 response followed by several 304 responses for the same IP.

Internet Explorer 7's misbehavior occurs because it sometimes doesn't remove incomplete downloads from its cache (which it usually seems to do). In all cases, IE7 should be able to make a distinction between complete and incomplete downloads, because the server did send a correct Content-Length header, but the amount of transferred and saved data was less than the announced value.

So, what IE7 should do on subsequent retries is either
  1. try to do a partial GET to acquire the rest of the file (this would be the preferred behaviour) or
  2. redownload the file with a normal, unconditional GET.
But what IE7 actually does is this: Because the file whose download attempt is being made is, allegedly, already in the browser's cache, it just sends a conditional GET to the web server. It does this by including an If-Modified-Since header in its request, containing the last modification time of the file.

However, since the file on the server really didn't change (it's just bigger, but file size is not checked), the web server correctly responds with a 304 Not Modified status code and no data. IE7 then proceeds to copy the file out of its cache, even though it's a truncated version from a previously aborted connection.

This happens for all subsequent requests and thus effectively blocks the user from downloading the corresponding file at all. Worse even: the user, which does not see any error messages, is instead just told his retry finished without any apparent issue. This leads him to think that the problem is a truncated file on the server, hence not a correctly retrievable file at all.

To resolve this problem, users have to manually clear their cache. However, it's hard to communicate this to possibly unknown users who (quite reasonably) assume that it's a persistent server problem.

Another workaround would be to tell the server that it shouldn't follow conditional GETs with If-Modified-Since header if MSIE7 is detected. But even if it's even possible at all to configure your web server to do this, you should only do this for the larger "download" content of your website (i.e. podcasts and other large static files), as you would be effectively disabling a part of the browser's cache.

So, if you are suffering some dropped connections for whatever reason, IE7 might make matters worse by turning temporary glitches into semipermanent problems. Let's hope this bug gets fixed in future browser versions. In the meantime, tell users which complain about truncated files to clear their cache.

converting videos for iPod video using ffmpeg

This should work with all platforms where ffmpeg is available, including Linux, Windows, Mac OS X and several BSD flavors.

After experimenting a bit, I finally got the right command lines working for ffmpeg which convert every video (at least every video that ffmpeg supports as input) to iPod video friendly format. Everything seems to work, including forward/backward seeking.

First of all, you need to compile xvid and support into ffmpeg. The corresponding arguments to ffmpeg's configure script are --enable-xvid and --enable-faac. You may need to compile and install faac and xvid first. You also need to specify --enable-gpl, since ffmpeg's configure script won't accept those options otherwise.

If you like to be able to encode videos into h264 format instead of MPEG4/xvid, you might also install x264 libraries and specify --enable-x264 to ffmpeg's configure.

Now that you have everything up and running, you may try to convert a video. For a 4:3 video, use:


ffmpeg -i foo.avi -f mov -b 1800 -maxrate 2500 \
-vcodec xvid -qmin 3 -qmax 5 -s 320x240 \
-acodec aac -ab 128 \
foo.ipod.mov


For some reason, I can't get my iPod to honor the aspect ratio settings and to squish a 16:9 video into letterbox format (or zoom into fullscreen, for that matter), although iTunes and Quicktime seem to have no problem doing so. To work this around, just tell ffmpeg to add the black bars on top and bottom itself and substract the height of the black bars from the actual video size:


ffmpeg -i foo.avi -f mov -b 1800 -maxrate 2500 \
-vcodec xvid -qmin 3 -qmax 5 \
-s 320x180 -padtop 30 -padbottom 30 \
-acodec aac -ab 128 \
foo.ipod.mov


It may be worth playing with bitrate and quality settings or also trying to use 2-pass-encoding, remember however that at least according to Apple's specification, the iPod with video only supports up to 2500kbps for MPEG4 and up to 768kbps for h264. Another thing that you might want to play with, especially if you plan on outputting your videos on a TV screen (using the optional A/V cable) instead of just on the iPod are the size and padding options.

If you know of any means to make the encoding process faster or to induce better quality, please drop me a comment. And now, enjoy your movies and clips on the go!

Update: If forward/backward skipping doesn't work, update your version of gtkpod.

This is it!

It is RAW.