Perlbal

Perlbal is our Perl-based reverse proxy load balancer and web server.

What is Perlbal?

Perlbal processes hundreds of millions of requests a day for LiveJournal, TypePad, and dozens of other "Web 2.0" applications.

Overview

Perlbal is a single-threaded event-based server supporting HTTP load balancing, web serving, and a mix of the two (see below).

A defining feature of Perlbal is that almost everything can be configured or reconfigured on the fly, without restarting the software. A basic configuration file that defines a management port lets you perform operations on a running instance of Perlbal.

Here is a basic list of the personalities and features that Perlbal implements at this time.

Role: Web Server
  • Listen on a port, share from a directory
  • Directory indexing
  • Byte range support (clients can resume downloads)
  • Can have directory index requests fall back to index file list
    • I.e., requests for /foo/ go to /foo/index.html instead
    • Multiple index files supported, tries one at a time until it finds one
  • Persistent client connections (configurable)
  • Almost all disk operations are done asynchronously so as not to stall the event loop. (Almost: when you enable PUT support, the close() operation is blocking. However, it's generally pretty fast, and we've had no problems. Directory indexing is also a synchronous operation.)
  • Configurable support for storing files (PUT, DELETE support)
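The index file fallback described above can be sketched roughly as follows. This is a minimal illustration in Python, not Perlbal's own code: the function name resolve_index, its parameters, and the default index file names are assumptions made for the example. It also uses ordinary blocking file checks and does no path sanitizing, whereas Perlbal does most disk work asynchronously.

```python
import os


def resolve_index(docroot, request_path, index_files=("index.html", "index.htm")):
    """Return the index file to serve for a directory request, or None.

    Each candidate name is tried in order; the first one that exists wins.
    All names here are hypothetical and chosen only for illustration.
    """
    directory = os.path.join(docroot, request_path.strip("/"))
    if not os.path.isdir(directory):
        return None
    for name in index_files:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    # No index file found: the caller could fall back to a directory listing.
    return None
```

With a document root that contains foo/index.htm but no foo/index.html, a request for /foo/ would resolve to the second candidate in the list.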

Role: Reverse Proxy
  • Maintains pool of connected backend connections to reduce turnover
  • Gets its list of nodes either from an asynchronously monitored node file or from in-server pool objects, to which you can add and remove nodes using the management interface.
  • Intelligent load balancing based on what backend connections are free for a new request. No unreliable "weighting" numbers required.
  • Can verify (using a quick OPTIONS request) that a backend connection is talking to a webserver, and not just the kernel's listen queue, before sending client requests to it. This lowers latency for the client.
  • Has a high priority queue for sending requests through to backends quickly
    • Uses cookies to determine if a request should go to fast queue (configurable)
    • Highpri (high priority) plugin supports making requests high priority by URI or Host
    • Can specify a relief level to let low priority requests through to prevent starvation
  • Can allow X-Forwarded-For (and similar) headers from client based on client IP
  • Configurable header management before sending request to backend
  • Internal redirection to file or URL(s)
    • Big one for us; a backend can instruct Perlbal to fetch the user's data from a completely separate server and port and URL, 100% transparent to the user
    • Can actually give Perlbal a list of URLs to try. Perlbal will find one that's alive. Again, the end user sees no redirects happening.
    • Can also redirect to a file, which Perlbal will serve non-blocking. See webserver mode above.
  • Persistent client connections (configurable)
  • Persistent backend connections (shared by multiple clients; no “backend waste”)
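To make the high priority queue and its relief level more concrete, here is a small Python sketch of the general idea. It is an illustration under stated assumptions, not Perlbal's implementation: the class RequestQueues and the parameters relief_size and relief_chance are hypothetical names chosen for this example. High priority requests normally go first, but when enough backends are free, a low priority request is occasionally let through so it does not starve.

```python
import random
from collections import deque


class RequestQueues:
    """Two queues with a relief rule (hypothetical names, illustration only)."""

    def __init__(self, relief_size=2, relief_chance=50, rng=random.random):
        self.high = deque()
        self.low = deque()
        self.relief_size = relief_size      # free backends needed before relief applies
        self.relief_chance = relief_chance  # percent chance of serving low priority first
        self.rng = rng                      # injectable so behaviour can be tested

    def enqueue(self, request, high_priority=False):
        (self.high if high_priority else self.low).append(request)

    def next_request(self, free_backends):
        """Pick the next request for a free backend, or None if both queues are empty."""
        relief = (
            bool(self.low)
            and free_backends >= self.relief_size
            and self.rng() * 100 < self.relief_chance
        )
        if relief:
            return self.low.popleft()
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()
        return None
```

The random draw is only one possible policy; a counter that lets every Nth request through would prevent starvation just as well.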

Performance

  • Event-based using epoll or kqueue to avoid the scalability problems of not-so-modern systems
  • HTTP Header processing (optionally) done in C with Perlbal::XS::HTTPHeaders for maximum performance
  • 100% asynchronous in all the recommended use cases
  • Lightweight
  • Great performance "out-of-the-box" (for both small and large sites)

Statistics and Monitoring

The management interface provides extremely detailed and powerful statistics in addition to runtime configuration. For example:

  • CPU usage (user, system)
  • Total requests served across all services
  • Requests serviced by individual backends
  • Perlbal uptime
  • All connected sockets (and tons of info about each)
  • Outstanding connections to backends
  • Backends that have recently failed verification
  • Pending backend connections by service
  • Total of all socket states by socket type
  • Size (in seconds and number of connections) of all queues
  • State of reproxy engine (queued requests, outstanding requests, backends)
  • Loaded plugins per service

(All statistics are in machine readable form, so it's easy to parse and write scripts that check on the status of Perlbal!)
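A monitoring script along these lines might look like the Python sketch below. It rests on several assumptions that should be checked against a real instance: that the management port speaks a line-based text protocol, that a response ends with a line containing only ".", and that statistics arrive as "key: value" lines. The function names and the sample values mentioned afterwards are hypothetical.

```python
import socket


def parse_stats(text):
    """Parse 'key: value' lines into a dict, skipping blank and '.' lines."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line == ".":
            continue
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


def query_management(host, port, command, timeout=5.0):
    """Send one command to the management port and return the response text.

    Assumes the response is terminated by a line containing only '.'.
    """
    lines = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii") + b"\r\n")
        with sock.makefile("rb") as reader:
            for raw in reader:
                line = raw.decode("ascii", "replace").rstrip("\r\n")
                if line == ".":
                    break
                lines.append(line)
    return "\n".join(lines)
```

For example, a made-up response with the lines "pid: 123" and "reqs: 10 (+2)" would parse into a dict with the keys pid and reqs, which a check script could then compare against its thresholds.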

Plugins (Extensibility)

Perlbal supports the concept of having per-service (and global) plugins that can override many parts of request handling and behavior. We have written custom plugins that send new headers to the backends, promote requests to the fast queue, maintain more detailed statistics, do image header manipulation, and more...

For more information on how plugins work, see the plugins directory in the distribution. They're fairly self-explanatory.

Much more documentation needs to happen...

Get Perlbal

Downloads at: http://search.cpan.org/dist/Perlbal/. You'll also need Danga::Socket and Sys::Syscall.

The source code is in the Six Apart SVN Repository.

Feel free to ask us questions on the mailing list or read more about Perlbal on Wikipedia.

Application Type: Infrastructure
Discuss: Mailing List
Source Code: Trac and SVN
Latest Release: 1.70 (March 8, 2008)