Posts from January 2010

Localizing PHP with gettext

Today, we’re going to talk about the basics of localizing (creating versions in multiple languages) a PHP site using the Gettext extension.

In theory, the steps are pretty simple.

  1. Wrap each block of text that’s going to be translated in a gettext() call, e.g. gettext(“This is my string”). You can use _() as a shortcut for gettext().
  2. Create translation tables for each additional language. There’s a one-to-one correspondence in the translation tables so “This is my string” will be mapped to a different string in each language.
  3. Set which language your site is using at any given time.
  4. PHP handles the rest!
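
As a rough illustration of step 1, marking strings for translation might look like the sketch below. It assumes the gettext extension is loaded; $userName and the example strings are made up for illustration. With no translation table loaded, _() simply returns the string it was given.

```php
<?php
// Hypothetical value used only for this example.
$userName = 'Alice';

// _() is the built-in alias for gettext().
$title = _("This is my string");

// Keep variables out of the translatable text: translate the template,
// then substitute with sprintf() so translators see one whole sentence.
$greeting = sprintf(_("Welcome back, %s!"), $userName);

echo $title, "\n", $greeting, "\n";
```

Translating the whole template rather than concatenated fragments matters because word order differs between languages.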

In practice, it’s a little more confusing than that.

Step 1: Create your translation tables

There are two parts to the translation files:

  1. The PO (.po) file. This is a plain text file which contains some project & charset info, then a list of PHP files, the strings in those files, and the translations of those strings.
  2. The MO (.mo) file. This is a binary file that is essentially a “compiled” version of the PO file. This is the file that actually gets read by PHP to figure out which string to display.

It’s possible to create the PO and MO files using the xgettext and msgfmt command-line tools.

However, there’s a very nice GUI app called POEdit which runs on Windows, Linux, and OS X. It’s free and makes working with these files much easier, so that’s the method I’m going to cover.

The first thing you want to do is create an area for your locale files to be stored. I recommend something like the following, where folders are named according to their standard locale names. In fact, on Ubuntu, this didn’t work unless the name of the folder containing the PO and MO files exactly matched the locale name listed in /etc/locale.alias (minus the charset info). On Ubuntu, you may also have to install the language packs for the languages you’re going to be using.

I had to manually install the language packs for German, Spanish, and Chinese for this example:

sudo apt-get install language-pack-de
sudo apt-get install language-pack-es
sudo apt-get install language-pack-zh

Then let’s create the following directory layout. I’ve been putting the locale folder in the same path as my PHP project but it technically doesn’t matter.

locale
  |
   de_DE
     |
     LC_MESSAGES
  |
   es_US
     |
     LC_MESSAGES
  |
   zh_CN
     |
     LC_MESSAGES

Now fire up POEdit and create a new catalog.
New catalog menu

Enter some relevant information about your project. The most important things here are the language and country this translation is going to be for.
Project Settings

Set the Base path to the directory where your PHP files live, then add “.” to the list of paths to scan for files.
Project Path Settings

Then save the file as messages.po (you’ll see why the name is important down the road) to the LC_MESSAGES directory you created for this particular language.
Save .po file

Now click the “Update Catalog” button to tell POEdit to scan your PHP files for strings wrapped in gettext(), _(), or any other functions you may have added to the Keywords section of the project settings.
Scan PHP Files

This will display a list of strings that will be added to the PO file.
Update Summary

Once the strings have been loaded, it’s simply a matter of entering the translations and saving to generate the MO file.
Define translation tables

Step 2: Tell PHP how to load the translation information

There are several things happening here.

First we check to see if the locale is specified in the query string. If not, we default to English. Then we specify the path to the locale directory and set up the translation “domain” ($domain = ‘messages’; tells PHP to look for MO files named messages.mo).

//Make sure we specify a charset of utf-8 or lots of foreign characters (Chinese in particular) won't show up properly
header('Content-type: text/html; charset=utf-8');

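//Use isset() so a missing query parameter doesn't raise an "undefined index" notice
$locale = isset($_GET['locale']) ? $_GET['locale'] : 'en_US';

//DOCROOT is assumed to be a constant defined elsewhere that points at the project root
$localePath = DOCROOT . '/locale';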
$domain = 'messages';

//Set the language to whichever locale we're using
putenv("LC_ALL=$locale.utf8");
setlocale(LC_ALL, "$locale.utf8");

//Specify the location and charset of the translation tables
bindtextdomain($domain, $localePath) ;
bind_textdomain_codeset($domain, 'utf8');

//Select the translation domain
textdomain($domain);

echo _("Page Title");

As long as “Page Title” exists in the translation table, the translated string should be output instead of “Page Title”.
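
Since the locale comes straight from the query string and ends up in putenv() and setlocale(), it’s probably worth checking it against the locales you actually ship. Here’s a minimal sketch of that idea; pickLocale() and the $supported list are hypothetical names for this example, not part of the gettext extension.

```php
<?php
// Hypothetical helper: returns $requested only if it is a locale we
// ship translations for, otherwise falls back to the default.
function pickLocale($requested, array $supported, $default = 'en_US')
{
    // Strict comparison so null or other non-string input never matches.
    return in_array($requested, $supported, true) ? $requested : $default;
}

// Assumed list; it should mirror the folders under locale/.
$supported = array('en_US', 'de_DE', 'es_US', 'zh_CN');

$requested = isset($_GET['locale']) ? $_GET['locale'] : null;
$locale = pickLocale($requested, $supported);
```

The resulting $locale can then be passed to putenv() and setlocale() exactly as in the listing above.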

For simple cases where English is the default, I think it’s reasonable to just put the English text in the _(“”) call (like _(“Welcome to my page!”)). In the product where I’m going to be using this, there’s going to be a huge amount of content in big chunks, so I opted to treat each string as a named identifier (like _(“Page Title”)) and then create a translation table for English as well.

Caveats

Here’s a short list of things that threw me a bit, some of which were mentioned previously:

  • I’ve only tried this with PHP 5.2+ and Apache. However, the code worked perfectly on Ubuntu+Apache as well as Windows+Apache.
  • You need to restart Apache any time your PO and MO files change. This is because gettext caches the translation tables and won’t reload them without a webserver restart.
  • Ubuntu needed to have language packs installed for each language I implemented.
  • Ubuntu also needed the locale folder names to match the names in /etc/locale.alias.

Running a CherryPy app with Apache and mod_python

I’ve been working on getting a new server set up to run some CherryPy apps. I’d only ever run CherryPy scripts using the built-in webserver, so I figured I’d do a quick write-up of how I got it working with Apache and mod_python.

A Simple Python Template
The basic usage example is a good place to start. Just make sure that you don’t accidentally make def setup_server() part of the Root class. In the end, I went with the DeployTemplate example because I like the fact it can run inside Apache or using the built-in webserver without any code changes.

The Root class is used in the same way as in all the other CherryPy examples. The functions below (start(), serverless(), and server()) allow the script to operate either inside Apache or using the built-in webserver.

import os
import cherrypy

class Root(object):

    def index(self):
        return "Hello World!"
    index.exposed = True

root = Root()

def start():
    cherrypy.config.update({
        'log.error_file': os.path.join(os.path.dirname(__file__), 'site.log'),
        'environment': 'production',
        })
    cherrypy.tree.mount(root)
    cherrypy.engine.start()

def serverless():
    cherrypy.server.unsubscribe()
    start()

def server():
    cherrypy.config.update({'log.screen': True})
    start()

if __name__ == "__main__":
    server()

Configuring Apache

This is a normal Name-based VirtualHost entry. The important stuff is down in the Directory directive.

The first step is to add the path to the directory containing your application to the PythonPath.

The next step is to set up CherryPy as the PythonHandler.

The last required step tells CherryPy which function to call to start your app. In this case, my app is named myapp.py and it lives in the /data/websites/myapp.net/htdocs directory. We’re then telling CherryPy to call the serverless() function that’s defined inside myapp.py.

I’ve also left PythonDebug on which helps for debugging by printing more complete error information to the screen and logs. You’d want to turn that off in production.

<VirtualHost *:80>
        ServerName myapp.net
        DocumentRoot "/data/websites/myapp.net/htdocs"
        CustomLog "/data/websites/myapp.net/logs/access.log" combined
        ErrorLog "/data/websites/myapp.net/logs/errors.log"

        <Directory /data/websites/myapp.net/htdocs>
                AllowOverride All
                PythonPath "sys.path+['/data/websites/myapp.net/htdocs']"
                SetHandler python-program
                PythonHandler cherrypy._cpmodpy::handler
                PythonOption cherrypy.setup myapp::serverless
                PythonDebug On
        </Directory>
</VirtualHost>

Basics of the Alternative PHP Cache (APC)

Today we’re going to talk briefly about the Alternative PHP Cache (APC), which can be used to cache data at the application level. Keep in mind that this is very different from storing data in the SESSION object, which disappears when the session is over and is only accessible by that particular session. Data stored in APC is accessible across all sessions and, if you haven’t specified a TTL (time-to-live), stays there indefinitely (in practice, until the cache is cleared or the web server is restarted).

Using it is pretty simple. Plop data into the cache by calling apc_store(), get the data back out by calling apc_fetch(), and remove it from the cache by calling apc_delete().

You’ll also notice the function apc_add() which does the same thing as apc_store() except that apc_add() will not overwrite an existing value in the cache.
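
Here’s a small sketch of those four calls together. It assumes the APC extension is loaded (and apc.enable_cli=1 if you run it from the command line); the key name and values are made up.

```php
<?php
// Store a value for 300 seconds; this overwrites any existing entry.
apc_store('greeting', 'hello', 300);

// apc_add() only writes if the key is absent, so this returns false
// and leaves 'hello' in place.
$added = apc_add('greeting', 'goodbye', 300);

// The second argument is set to true or false, which lets you tell
// a cached false/null apart from a cache miss.
$value = apc_fetch('greeting', $found);

// Remove the entry; a later fetch reports a miss.
apc_delete('greeting');
$afterDelete = apc_fetch('greeting', $stillThere);
```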

In general, you can put just about any kind of object into the cache. The only exception is an array of objects, in which case you need to wrap the array with the ArrayObject wrapper.

So what’s it good for?

Technically, you can use it to pre-compile a PHP file into bytecode to speed up execution (although I haven’t used that functionality personally).

I mainly use it for caching the results from large database queries so I don’t have to keep hitting the database over and over.
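
A minimal sketch of that pattern might look like the following. It assumes APC is loaded; cachedQuery() and the loader callback are hypothetical names, and the closure (which needs PHP 5.3+) stands in for a real database query.

```php
<?php
// Hypothetical helper: return the cached rows if present, otherwise
// run $loader, cache its result for $ttl seconds, and return it.
function cachedQuery($key, $ttl, $loader)
{
    $rows = apc_fetch($key, $hit);
    if ($hit) {
        return $rows;                  // served from the cache
    }
    $rows = call_user_func($loader);   // cache miss: hit the database
    apc_store($key, $rows, $ttl);
    return $rows;
}

// Usage: count how often the "query" actually runs.
apc_delete('report:all');              // start from a clean slate for the demo
$calls = 0;
$loader = function () use (&$calls) {
    $calls++;
    return array(array('id' => 1, 'name' => 'example'));
};

$first  = cachedQuery('report:all', 600, $loader);
$second = cachedQuery('report:all', 600, $loader);
```

The second call should come back from the cache without running the loader again. Using the success flag from apc_fetch() rather than testing the return value means an empty result set still counts as a hit.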

For example, here’s the profiling information for a random test database query:

And here’s the profiling info for pulling that same information from the cache:

As you can see, that’s a decent amount of speed-up for not very much extra work.