Six Apart's Update Stream gives clients a simple way to subscribe to a persistent stream of entries posted across the Six Apart universe. It is an excellent alternative to writing a spider to crawl LiveJournal, TypePad, and Movable Type weblogs.
Who should subscribe to the stream?
The stream was initially launched with large service providers and search engines in mind, as a more manageable and scalable alternative to pinging the multitude of companies and entities wishing to be notified when new content is available on LiveJournal, TypePad and even Movable Type weblogs.
Connecting to the Stream
To connect to the stream, issue a simple HTTP GET request to the following endpoint:
Once a connection is established, the Atom Stream Server begins transmitting to the client any content that is injected into the stream. It also transmits a timestamp every second, both to keep the connection alive if it goes idle and to give you a marker of how far you have read, so that you can resume from that point in time if you restart your listener.
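As a rough illustration of the connection step, the Python sketch below opens the stream with a plain GET and yields raw chunks as they arrive. The function name `read_stream`, the `opener` parameter, and the chunk size are assumptions made for this example and are not part of the service; pass in whatever endpoint URL you were given.

```python
import urllib.request
from typing import Callable, Iterator


def read_stream(url: str,
                opener: Callable = urllib.request.urlopen,
                chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield raw chunks from a long-lived HTTP GET response.

    `opener` is a hypothetical hook so the reader can be exercised
    without a network connection; by default it issues a real GET.
    """
    with opener(url) as response:
        while True:
            # read1() returns as soon as some data is available instead
            # of waiting for a full chunk, which suits a slow stream.
            chunk = response.read1(chunk_size)
            if not chunk:
                break  # the server closed the connection
            yield chunk
```

Reading promptly matters here: a loop like this should hand chunks off quickly rather than doing slow work between reads.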
Keeping up with the stream
The Atom Stream Server will temporarily cease to stream updates to clients that fall behind in reading data from the stream. If a client cannot keep up with the volume of injections being made to the stream, the server tells the client how many injections it missed.
The Update Stream makes use of the Index Extension for Atom in order to indicate within a feed whether content should be indexed or not by those consuming it.
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:choice minOccurs="0" maxOccurs="unbounded">
<xs:element name="time" type="xs:unsignedLong" />
<xs:attribute name="youMissed" type="xs:positiveInteger" use="required" />
<xs:element name="feed" type="xs:any" />
<xs:any namespace="##other" />
/time ::= unix epoch timestamp:
This element is transmitted at a fixed interval in order to keep listening sockets alive. It also serves as a "save point" that you can resume from later, should you need to reconnect. You should keep track of the last timestamp you heard and when you reconnect later, connect with the URL parameter ?since=TIMESTAMP. You'll have to do any duplicate elimination on your own, and the server makes no promises about how far back it remembers, so you should reconnect as soon as possible, definitely within a minute.
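The reconnect step can be sketched as follows. This is a minimal example, assuming you have stored the last timestamp you heard; the helper name `resume_url` and the example host are hypothetical, and only the `since` parameter comes from the description above.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def resume_url(base_url: str, last_timestamp: Optional[int]) -> str:
    """Return the URL to reconnect to, adding ?since=TIMESTAMP if known."""
    if last_timestamp is None:
        return base_url  # first connection: nothing to resume from
    parts = urlsplit(base_url)
    # Drop any stale `since` value before adding the current one.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "since"]
    query.append(("since", str(last_timestamp)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical host, for illustration only.
print(resume_url("http://stream.example.invalid/atom-stream.xml", 1150000000))
```

Because the server may replay entries you have already seen after a resume, duplicate elimination still has to happen on the client side.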
/feed ::= a single atom feed:
This element is transmitted whenever data is injected into the stream, and SHOULD contain a single Atom feed in its entirety.
/sorryTooSlow@youMissed ::= integer representing the number of feeds missed:
This element is transmitted when a client falls behind in reading from the stream, and indicates the number of injections the client was unable to read from the stream.
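One possible way to consume the three elements described above is an incremental XML parser that reports each direct child of the stream's root element as soon as it is complete. The sketch below uses Python's standard `xml.etree.ElementTree.XMLPullParser`; the names `stream_events` and `local_name` and the event labels are assumptions made for this example, and the code does not depend on the name of the root element.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Iterator, Tuple


def local_name(tag: str) -> str:
    """Strip a '{namespace}' prefix, if present, from an element tag."""
    return tag.rsplit("}", 1)[-1]


def stream_events(chunks: Iterable[bytes]) -> Iterator[Tuple[str, object]]:
    """Yield ('time', int), ('missed', int) or ('feed', Element) items."""
    parser = ET.XMLPullParser(events=("start", "end"))
    root = None
    depth = 0
    for chunk in chunks:
        parser.feed(chunk)
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
                if depth == 1:
                    root = elem  # the never-ending wrapper element
                continue
            depth -= 1
            if depth != 1:
                continue  # only direct children of the root matter here
            name = local_name(elem.tag)
            if name == "time":
                item = ("time", int(elem.text))
            elif name == "sorryTooSlow":
                item = ("missed", int(elem.get("youMissed")))
            elif name == "feed":
                item = ("feed", elem)
            else:
                item = ("other", elem)
            # Detach the finished child so the tree does not grow forever.
            root.remove(elem)
            yield item
```

A listener would typically store the value of each `time` item as its save point, log or alert on `missed`, and pass each `feed` element on to its indexing code.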
GET /atom-stream.xml HTTP/1.0
HTTP/1.0 200 OK
<sorryTooSlow youMissed="5" />