Tuesday, July 3, 2012

Nexus Q from Google I/O stolen by airport security

One of the giveaways at this year's Google I/O conference was the newly announced Nexus Q streaming media player. I was particularly looking forward to tinkering with this device, because it seems like it has a lot of potential to be unlocked (it runs Android ICS, and during the keynote they referred to the USB port as being for "general hackability").

Because I had so much else to carry (my camera equipment, laptop, Nexus 7 and Galaxy Nexus were in my carry-on backpack), I checked my suitcase, which had the Nexus Q and the Chromebox in it, as well as the Sphero I bought for my four-year-old son.

The travel experience was pretty bad. I had arrived in good time for my 3pm flight home via Chicago, but at the gate it was delayed to 4pm because they didn't have any flight crew available, meaning that I was going to miss my connection. United told me that I'd be stuck in Chicago until the next morning, and reluctantly admitted that they'd have to arrange a hotel room for the night. (Though not arrange it themselves: they said I'd have to find someone to do that when I got there.)

Sitting on the plane, we all watched in bemusement as standby passengers were repeatedly shuffled on and off, and out of the window I saw pets being dropped off (three times, as the mobile conveyor belt had disappeared), and then taken away again on a baggage cart. It all seemed like a bit of a circus. Eventually it was announced that the flight would be further delayed until 5pm, and then as 5pm passed they told us that there were dents in the plane, the depth of which they needed to measure, and that they'd have a maintenance decision by 6pm. I got off the plane to get a drink and stretch my legs, and was back in my seat just in time for them to tell us that they were pulling the plane out of service. On the way out they gave me a slip of paper with a number to call, which I did, and was offered a 10pm flight via Philadelphia on US Airways, getting me home at 9am the next morning. I took it, as it was my best option, but as I was at work the next day I ended up having to be up for 30 hours straight.

When I arrived at my home airport, I picked up my bag from the United flight it had come in on, but when I got it home and unpacked, I discovered with horror that my Nexus Q was missing, having been replaced by a Notice of Inspection from Covenant Aviation Security. CAS only seem to accept complaints by mail, fax or voicemail, so I'm starting with the people I can talk to on the phone first.

Also missing were: the Sphero charger, which means my son only got about ten minutes of playing with that before I had to tell him we'd need to buy a new charger; the chargers for my shaver and beard trimmer, which can't be replaced on their own, so I'll have to buy whole new ones; my iPad desk stand; and a bottle of heartburn pills. Thankfully, the Sphero itself and the Chromebox were not taken.

I almost managed to put in a claim with United, but at the last moment the agent told me that he couldn't submit it, because they won't take responsibility for my bag since I flew home on US Airways -- despite my bag flying with United! Are US Airways going to tell me they're not responsible because they didn't transport the bag?

This is a sad and frustrating ending to what had been a really great week.

Wednesday, May 4, 2011

Rock stars and session musicians

Something I've noticed since moving from the UK to the US is that potential employers are more likely to ask you for a link to your GitHub (or similar) account. In fact, some employers claim to value it more than a résumé. And who can blame them? A résumé says, "here are some things I would like you to believe I have done." A brimming GitHub account, or a well-maintained project site, says, "I get things done, and demonstrably so. I also like to share, and I'm probably more than a nine-to-five programmer."
But what does it say about a person if they don't? Not necessarily the opposite. In my case, programming is a passion. I've been doing it for fun since I was about five years old. I've been lucky that I could take something I enjoy and turn it into a career, but the day job and the programming I do at home are very different things. My day job is about deadlines, requirements, standardized platforms and change control. It's as much about the mechanics of delivering products as it is about the creativity of writing software. So it's nice to come home and spend some of my increasingly rare free time (I have a wife and a three-year-old) just experimenting and learning.
There's nothing really wrong with that, but there's always room for growth, and I see benefits to myself in 'putting myself out there'. I've recently embarked on a couple of longer-term personal projects. One of them is yielding a Werkzeug-based web app framework as an artifact, and I do intend to release that as open source eventually, even though for the moment it's easier for me to keep it in sync by developing it in the app's private repository.
All of this led me to conceive of the following analogy. Don't think about it too much, though, or it will fall apart.
Some programmers are like rock stars. They create a lot of content that they release with their own name attached to it, and it's a name people in the community know well. Their notability comes with exposure to direct criticism, and popular opinion of them can bias the reception of their work.
Other programmers are more like session musicians. You've probably never heard of them, but they've contributed professionally to many projects. You might even have unknowingly experienced their work as part of a larger product.

Monday, February 7, 2011

Understanding IP address exhaustion

On February 3rd, 2011, IANA announced that they had allocated the last of the unallocated IPv4 address space to the five regional internet registries. Since then, blogs and news sites have been reporting this and speculating about its implications. Unfortunately, there seems to be a lot of confusion and misunderstanding about how IP address allocation works and the terms that are used.

The management of this IP address space is delegated across a number of different organisations. At the top level is IANA (the Internet Assigned Numbers Authority), which is part of ICANN (the Internet Corporation for Assigned Names and Numbers). IANA's role is to oversee global IP address allocation, and in the early days of the Internet, IANA would directly provide IP addresses to the organisations that would use them. Between 1993 and 2005, five Regional Internet Registries (RIRs) became responsible for allocations within continental-scale regions. These regions are:

* African Network Information Centre (AfriNIC) for Africa
* American Registry for Internet Numbers (ARIN) for the United States, Canada, and several parts of the Caribbean region
* Asia-Pacific Network Information Centre (APNIC) for Asia, Australia, New Zealand, and neighboring countries
* Latin America and Caribbean Network Information Centre (LACNIC) for Latin America and parts of the Caribbean region
* RIPE NCC for Europe, the Middle East, and Central Asia

In the APNIC and LACNIC regions, allocation is further delegated to National Internet Registries (NIRs), who in turn delegate to Local Internet Registries (LIRs): ISPs or other large organisations that need control over their own routing. In the other regions, the RIRs delegate directly to LIRs. (My experience was with operating a LIR in the RIPE region.) At each level, the allocations are smaller. IANA allocates /8 blocks to RIRs (about 16 million addresses), whereas LIRs receive a default initial /19 allocation (8,192 addresses).
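These sizes follow directly from the prefix length: a block with an /n prefix contains 2^(32-n) IPv4 addresses. A quick sketch in Python:

```python
# Number of addresses in an IPv4 block with the given prefix length.
def prefix_size(prefix_len):
    return 2 ** (32 - prefix_len)

print(prefix_size(8))   # 16777216 -- a /8, as allocated by IANA to an RIR
print(prefix_size(19))  # 8192 -- a /19, the default initial LIR allocation
```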

Even after all these levels of allocation, the IP addresses are still not considered to be in use. Although an LIR can announce all of its *allocated* ranges, it is expected to formally *assign* parts of those ranges to its customers. The upshot of this is that despite Thursday's announcement, end-users will still be receiving assignments... but only for a few more months.

What happens then? Ultimately, we're going to have to move to IPv6, which, as well as having a 128-bit address space, provides a number of other benefits, such as auto-configuration and improved address mobility. Although IPv6 seems new, the first deployments were in 1999, and it has had extensive testing. For a typical end-user, the transition shouldn't be difficult, as all the major operating systems have good IPv6 support. More advanced users can start using IPv6 now, if they want. Even if your provider doesn't support it, you can use a free tunnel provider such as Hurricane Electric or SixXS. There are also transition mechanisms such as Teredo (which tunnels IPv6 in UDP in IPv4, so can be used through NAT gateways) and 6to4 (which tunnels IPv6 in IPv4, so requires a public IPv4 address). The problem lies with any ISPs who don't have a clear IPv6 deployment strategy. While your desktop computer got its IPv6 support through an OS upgrade, the high-speed routers that ISP networks run on need hardware support. Some ISPs are already offering IPv6 to their customers, and others are beginning trials, but some haven't announced any timeline. The one thing that is clear is that the coming months will see a mix of organisations that sail through the transition and those that find themselves in an 11th-hour panic.
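As an aside, the 6to4 scheme is mechanical enough to compute by hand: per RFC 3056, the public IPv4 address is embedded directly into a prefix under 2002::/16. Here's a small Python sketch (the address used is from the documentation range, purely as an example):

```python
import socket
import struct

def sixto4_prefix(ipv4):
    """Derive the 6to4 prefix (RFC 3056) embedding the given IPv4 address."""
    # The four IPv4 octets become the second and third groups of the
    # IPv6 prefix, after the fixed 2002::/16 part.
    hi, lo = struct.unpack("!HH", socket.inet_aton(ipv4))
    return "2002:%x:%x::/48" % (hi, lo)

print(sixto4_prefix("192.0.2.1"))  # 2002:c000:201::/48
```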

Friday, November 27, 2009

Injecting attributes into Python modules

One of the things I don't like about frameworks like Django and Pyramid is the amount of boilerplate imports you end up having at the beginning of modules, especially those modules that are more like configuration files. It spoils the DSL-like nature of them, and I was interested in finding a way to be able to import a module with certain attributes already defined. The __import__ function doesn't allow you to do this, because the locals argument is ignored, so I looked for another way. Before I describe the method I came up with, here's how you might use it:
import elixir
from inject import module_inject
module_inject('myapp.models', elixir)
import myapp.models
Easy!
PEP 302 describes the import hooks that have been available since Python 2.3, and defines an import protocol. By adding an object with find_module and load_module methods to sys.meta_path, you can get hooked into the import process. find_module is called with the module name to see if an object knows how to load it. load_module is then called to do the actual loading. The class below implements both of those methods.
import imp
import sys

class InjectionLoader(object):
    def __init__(self, name, dicts):
        self.name = name
        self.dicts = dicts
    
    def find_module(self, fullname, path=None):
        if fullname == self.name:
            return self
    
    def load_module(self, fullname):
        # Get the leaf module name and the directory it should be found in
        if '.' in fullname:
            package, leaf = fullname.rsplit('.', 1)
            path = sys.modules[package].__path__   
        else:
            leaf = fullname
            path = None

        # Open the module file
        file, filename, description = imp.find_module(leaf, path)

        # Get the existing module or create a new one (for reload to work)
        module = sys.modules.setdefault(fullname, imp.new_module(fullname))
        module.__file__ = filename
        module.__loader__ = self  
        
        # Read and compile the module source, then close the file
        source = file.read()
        file.close()
        code = compile(source, filename, 'exec')
        
        # Populate the module namespace with the injected attributes
        for d in self.dicts:
            module.__dict__.update(d)
            
        # Finally execute the module with its injected attributes
        eval(code, module.__dict__)
        return module
It's instantiated with the module name it's injecting into, and the dicts it is injecting. To make it easier to use, I wrote a helper function, module_inject. It takes a module name, and one or more dicts or modules. Dicts are injected as-is. Modules have their __dict__s injected, but only the attributes listed in the module's __all__ attribute are used; if __all__ isn't present, only the attributes that don't begin with a double underscore are used. This is like doing a from module import * at the beginning of the imported module. Here is its implementation:
import sys
import types

def module_inject(name, *args):
    """Set a hook so that when module 'name' is imported, it is executed with
    the attributes in 'args' already in module scope. The arguments can be
    dictionaries or modules (see 'normalize_dict')."""
    args = map(normalize_dict, args)
    sys.meta_path.append(InjectionLoader(name, args))

def normalize_dict(d):
    """If the argument is a module, return the module's dictionary filtered
    by the module's __all__ attribute, otherwise return the argument as-is.
    If the module doesn't have an __all__ attribute, use all the attributes
    that don't begin with a double underscore."""
    if isinstance(d, types.ModuleType):
        keys = getattr(
            d,
            '__all__',
            filter(lambda k: not k.startswith('__'), d.__dict__.keys())
        )
        d = dict([(key, d.__dict__[key]) for key in keys])
    return d
It's something to be used with caution, though. In general, the Python mantra of *explicit is better than implicit* is a good guideline to follow.
Update: somebody asked me about the use of file as a local variable. I'm actually torn on the issue. Yes, it does shadow the built-in file type, but on the other hand it's concise, and it's the same name used in the Python documentation.

Saturday, October 3, 2009

Initializing attributes from __init__ arguments

Every once in a while, I get fed up of having to do lots of self.foo = foo in Python __init__ methods, and wonder if it couldn't be done automatically. I came up with the following function to do just that, but I doubt I'll ever use it myself, because it goes against the *explicit is better than implicit* philosophy of Python.

#!/usr/bin/env python
import inspect

def init_from_args():
    frame = inspect.stack()[1][0]
    code = frame.f_code
    var_names = code.co_varnames # __init__'s parameters and locals
    init_locals = frame.f_locals # __init__'s dict of locals
    num_args = code.co_argcount # Number of arguments
    arg_names = var_names[1:num_args] # Positional argument names

    # If there's a **kwargs parameter, get its name. In co_flags,
    # CO_VARARGS (0x04) indicates *args and CO_VARKEYWORDS (0x08)
    # indicates **kwargs; their names follow the positional arguments.
    kw_name = None
    if code.co_flags & 12 == 12:    # both *args and **kwargs
        kw_name = var_names[num_args + 1]
    elif code.co_flags & 8:         # **kwargs only
        kw_name = var_names[num_args]

    # Copy the positional arguments
    for name in arg_names:
        setattr(init_locals[var_names[0]], name, init_locals[name])

    # If there was a **kwargs parameter, copy the keyword arguments.
    if kw_name:
        for name, value in init_locals[kw_name].items():
            setattr(init_locals[var_names[0]], name, value)

class Foo:
    def __init__(self, a, b, *args, **kwargs):
        init_from_args()
        bar = 123
        baz = "hello"
        quux = "foo"

if __name__ == "__main__":
    foo = Foo(1, 2, 3, something="something else")
    print foo.__dict__

Tuesday, January 20, 2009

Towards talker standards

A few years ago I wrote an article about the desire to bring talkers out of their strictly console-based world. I'm reproducing it here so it has a permanent home.

Despite the rapidly rising popularity of instant messaging on the Internet, talkers have maintained a loyal following due to the unrivaled sense of presence and community they offer. However, their implementations have remained largely unchanged since their inception, and they have failed to take advantage of the past decade of developments in Internet technologies. This article presents a case for the collaborative development of standard talker protocols.

The state of talker development

Browsing the source code of any current popular talker implementation will reveal signs of a long heritage of modification upon modification. The most popular talkers have long departed from the stock implementations they began with, each adding a rich diversity of new features. More recent talkers such as Amnuts and PG+ are derived from talker code written in 1992. In software development terms, this is a long time. To put it into historical perspective, when Talkserv and Elsewhere were first released, Microsoft had just released Windows 3.1. These talkers were conceptually based on MUDs implemented as far back as 1978.

Talkers are based upon a simple client/server architecture. While it is often claimed that they are TELNET servers, in fact most do not adhere to the TELNET protocol, and TELNET clients are just used in their capacity as terminal emulators. Both talker clients and talker servers have limited terminal functionality, and because of this they are limited to line-based input processing. This results in a non-intuitive and often off-putting user interface. For example, most talkers offer the facility to send messages similar to e-mails between users, but there is very little message-editing functionality. While it would be possible to provide a curses-based interface for such operations, the added complexity of doing so has prevented its adoption.

Furthermore, the look and feel of the user interface is defined in the code of a talker server. When establishing a new talker, a sysop first chooses a talker base code to start from. This is usually based on personal preference for style, with the biggest decision being EWToo-style versus NUTS-style. This decision will usually have a large impact on which users the talker attracts. The sysop must then customise the talker to give it unique characteristics. Some of these customisations simply involve changing text files supplied in the talker distribution package, but most involve changing or adding to the talker's source code. This means the sysop must be a programmer, or at least have available a programmer who is willing to donate his time. This is accepted practice, but it's easy to see how absurd it is by imagining having to recompile your web server in order to update your web site! The customisations come in two forms: modifications or additions to the behaviour of the talker, and modifications to the appearance of the talker. The latter, while relatively easy, is tedious and error-prone because each talker's output is intermingled with its control logic, meaning that many disparate functions must be modified in order to create a new unified visual appearance.

Because of the ad-hoc nature of the additions and the lack of separation between logic and presentation, changes are rarely returned to the stock implementation they derived from, and features are often reimplemented afresh in other talkers. This leads to problems later; when the original code base is updated with important fixes the author of the derivative must then decide whether to attempt to isolate and integrate those fixes, or to abandon his code and begin again with the new code base. As a large proportion of the modifications are customisations that provide the derived talker with its uniqueness, this can be a difficult decision to take. It also means that additional code is not reviewed, increasing the risk of introducing security problems.

The problems

The existing problems identified above can be summarised as follows:

  • Non-intuitive UI. Because the talker emulates a text terminal, the server—not the clients—defines the UI, and this UI is restrictive.
  • Output interleaved with logic. A developer must change many disparate functions to obtain a new unified visual appearance, even though those functions may contain logic which is unchanged. This is tedious, leads to errors, and prevents code merging.
  • EWToo versus NUTS. Choosing one style of talker over another restricts the userbase. Others, such as Nilex-style talkers, have even more limited appeal.
  • A sysop must be, or must have, a programmer. Very little of the talker can be changed without changing its source code and recompiling it.
  • Ad-hoc design. Talker code has evolved over many years, under different developers, with no common design goals.
  • Fork and forget. When a new talker is developed the base code is forked and rarely merged. Fixes in the base code are difficult to isolate and integrate.
  • No code review. Single developers develop most talker code. This provides little opportunity for them to receive feedback about code quality.
  • Features are reimplemented. Because no widely used talker supports the notion of plug-in components, it is difficult for developers to release packaged features.

User agents

When MUDs and talkers were first developed Internet access was uncommon, and mostly limited to academic users with Unix accounts. These users were quite used to text-based interfaces driven by abbreviated command names, but most of today's talker users are more familiar with GUIs, multimedia, the World Wide Web and instant messaging software. It's therefore unsurprising that there are a number of MUD and talker clients, such as Pueblo and Z-MUD, that offer enhancements over basic terminal emulation. Talkers can use Pueblo's protocol, while MUDs have even more extensions such as MUD Sound Protocol, MUD eXtension Protocol and MUD Client Protocol. These protocols differ in their design, but they all have a common goal: to allow the client to provide a richer user experience while maintaining compatibility with non-enhanced clients.

To help illustrate the kind of experience an enhanced user agent might provide, the following scenarios suggest some likely interactions.


  • Alice is idly chatting on a talker. She sees a message telling her that Bob is requesting a game of Connect 4 with her. She clicks on the message and a small window containing a playing board opens up. She hears the familiar sound as Bob places his first piece, then clicks on another column to make her own move. She then returns to chatting while Bob ponders his next move.
  • Charlie logs into a talker for the first time in a couple of weeks. He clicks the who's online? button on the toolbar and looks at the list of users. He doesn't recognise Dan, so he clicks on the Dan's profile icon. He sees that Dan is a new user who joined today, so he clicks on his name to open a private chat with him, and welcomes him to the talker.
  • Emma is chatting in the main room, but is getting annoyed with Fred. She clicks on his name, and chooses ignore from the menu. She no longer sees anything Fred says.
  • As Gini logs into her usual talker, a message pops up telling her there are new news items. She chooses to read them now, so a message reading window opens with two messages in it. She reads the messages and deletes them. While she has the message reading window open, she looks back at a couple of old talker mail messages and decides to reply to one of them, before closing the window and returning to chatting.
  • Hayley is chatting in the main room, but also having a private conversation with Ian. It's busy in the main room, and she keeps missing messages from Ian, so she opens up a conversation with Ian window. Her messages from Ian now appear in there instead of in the main window, so she can keep track of both conversations.


These are the sort of interactions that users are familiar with from other GUI applications. UI design is complex and, to some extent, subjective, so no restrictions on how such a client should behave are given here. Instead, it is anticipated that several clients would be developed independently, catering for people with differing tastes.

A client/server protocol

If, instead of the current terminal emulation approach, talkers and their clients communicated using a domain-specific protocol, a number of possibilities would open up. Most importantly, it would allow for a radically different kind of user agent that would be able to present information in a much clearer way.

It would also allow other software to communicate with the talker. Bots are a common example of software agents that need to do this. Most bots currently parse the human-readable output from a talker and respond with the same commands a user would use. This is not a foolproof strategy, and depends on the talker's output for a given event not changing as the talker is developed.

Another recent trend is that of embedding other services into the talker. Examples include the HTTP and SMTP servers in Anthony Biacco's Ncohafmuta talker code. These are only partial implementations of the respective protocols, and are likely to introduce bugs that, if exploited, may crash the entire talker process. A talker protocol would allow for software agents acting as gateways between the talker and web servers or mail servers respectively.

Finally, a standardised protocol would allow for talker-to-talker links between talkers using different implementations, so long as each talker adhered to a common standard.

The diagram here shows the structure of the client, gateway and server tiers. Because of the vastly increased flexibility such a protocol would bring, it would be the cornerstone of new talker developments, and careful design would be vital. Any protocol to be considered would need to satisfy certain requirements:

  • The session layer must support the transfer of arbitrary data, including binary types. This does not preclude the use of textual data such as XML documents at the presentation layer.
  • It must provide an inline mechanism for negotiating encrypted connections so that such connections would not require a separate port.
  • It must support authentication methods including, but not limited to, plain passwords and asymmetric keys.
  • It must provide a facility for bi-directional delivery of asynchronous events, and for bi-directional request/response pairs. (It would be acceptable for the latter to be implemented in terms of the former.)

In addition, the application layer would need to not only support the features found in a large subset of current talkers, but also be extensible enough to support future features.
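To make the session-layer requirement concrete, here is a minimal, purely hypothetical framing sketch in Python: length-delimited frames carrying JSON events. Nothing here is a proposed wire format; it only illustrates that length-prefixed framing, unlike line-based input, can carry arbitrary payloads (a different frame type could carry raw binary).

```python
import json
import struct

def encode_frame(event):
    """Encode an event dict as a 4-byte big-endian length plus UTF-8 JSON."""
    payload = json.dumps(event).encode("utf-8")
    return struct.pack("!I", len(payload)) + payload

def decode_frame(data):
    """Decode one frame; return the event and any trailing bytes."""
    (length,) = struct.unpack("!I", data[:4])
    payload = data[4:4 + length]
    return json.loads(payload.decode("utf-8")), data[4 + length:]

frame = encode_frame({"type": "say", "from": "alice", "text": "hello"})
event, rest = decode_frame(frame)
```

Because each frame declares its own length, asynchronous events and request/response pairs can be interleaved freely on one connection, which is exactly the bi-directional delivery requirement above.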

Cryptography

A small number of talker developers have expressed a desire to enable end-to-end encryption between their talker and its clients. This relatively straightforward application of cryptography could be implemented without too much difficulty on the server side, using a free Transport Layer Security implementation, and similarly on the client side if clients such as those described above were used.

Authentication

However, once cryptography has been introduced, it opens up a number of interesting possibilities. The first of these is asymmetric key authentication. Asymmetric key algorithms use a pair of keys: one public, and one private. The two are mathematically related, but to derive one from the other is considered computationally infeasible. Such algorithms are now widespread, and used extensively in protocols such as PGP, SSL/TLS and SSH. This authentication scheme has significant security advantages, because the server need only ever know a user's public key. This public key can be used on every talker the user connects to, and as long as the corresponding private key is never revealed no security is compromised. Typically, private keys themselves are encrypted using a passphrase. It is this passphrase that a user would type when connecting to a talker, and the passphrase never leaves the user's computer. In fact, a client could be designed so that once the user enters their passphrase to connect to one talker, they don't need to enter it again until they restart the client, no matter how many talkers they connect to. This is functionality similar to that provided by SSH agents such as ssh-agent and Pageant.
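The shape of such an authentication exchange can be sketched as a challenge-response round trip. To be clear, the example below uses an HMAC as a symmetric stand-in, purely to stay within Python's standard library; the scheme described above would use asymmetric signatures instead. The flow is the same, though: the secret never crosses the wire, only a proof derived from a server-chosen nonce.

```python
import hashlib
import hmac
import os

secret = os.urandom(32)      # the client's long-term secret

# Server side: issue a random, single-use challenge.
challenge = os.urandom(16)

# Client side: prove knowledge of the secret without revealing it.
proof = hmac.new(secret, challenge, hashlib.sha256).digest()

# Server side: recompute the expected proof and compare in constant time.
expected = hmac.new(secret, challenge, hashlib.sha256).digest()
ok = hmac.compare_digest(proof, expected)
```

With asymmetric keys the server-side step would verify a signature against the stored public key instead of recomputing a shared digest, which is what lets the same public key be reused safely across many talkers.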

Trust networks

A problem that talker sysops face regularly is that of user identity. If a sysop wishes to ban a malevolent user, there are no real ways to ensure he stays gone. The user may reconnect at any time using a different name and from a different address, or from an address the sysop knows is used by many users (such as that of a shell server). Because of the high value the Internet places on users' right to anonymity there is unlikely to be a complete solution to this problem, but it can be approached from a different angle. Instead of trying to track users as they change their identity, we can persuade them to use only a single identity.

Suppose that Alice is a user who wants to use a particular talker called Foo Hills. She uses several other talkers, and she knows Bob, who is already a user of Foo Hills. Bob has used the talker for several months, and the talker's sysop has indicated his trust in Bob by using the talker's private key to sign Bob's public key. Bob knows Alice is also trustworthy, so he similarly indicates this by using his own private key to sign her public key. Now Alice becomes a user of the talker and a chain of trust exists from the sysop, to Bob, to Alice.

Now suppose that Carl wanted to join the talker. He has a public key that has been signed by the private key of a small, little-known talker called The Bar. However, no trust relationship exists between Foo Hills and The Bar, so Carl is considered an untrusted user. Depending on the policy chosen by the Foo Hills sysop, he may be denied access, or be allowed to connect as an untrusted user. This concept of trusted and untrusted users could form the basis of what many talkers refer to as citizenship.
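These trust chains amount to reachability in a signature graph. A toy Python model (the names and data structure are hypothetical; a real implementation would verify each signature cryptographically rather than trust a plain mapping):

```python
# Who has signed whose public key, from the scenarios above.
signatures = {
    "sysop": {"bob"},    # the Foo Hills sysop signed Bob's key
    "bob": {"alice"},    # Bob signed Alice's key
    "thebar": {"carl"},  # The Bar signed Carl's key, but Foo Hills
                         # has no trust relationship with The Bar
}

def is_trusted(user, root="sysop"):
    """A user is trusted if a chain of signatures reaches them from root."""
    seen, frontier = set(), {root}
    while frontier:
        signer = frontier.pop()
        if signer == user:
            return True
        if signer in seen:
            continue
        seen.add(signer)
        frontier |= signatures.get(signer, set())
    return False
```

Under this model Alice is trusted via sysop, Bob; Carl is not, because no chain from the Foo Hills sysop reaches him.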

Portable objects

Trust networks require that the talker have its own private key, which can be used to sign users' public keys. One interesting possibility that arises from this is the ability to export signed data from the talker, such that anything else with access to that talker's public key can assert two things about that data: that it was indeed exported from that talker, and that it hasn't been modified since it was exported from that talker. Objects (items that users carry, wear, use etc) could be exported in this fashion, and then used in another talker, providing that the importing talker understood the nature of the objects, and had a trust relationship with the exporting talker. The same is true of the 'currency' used on talkers, which suggests that ideas regarding simple economics could be explored. Curiously, there have been several instances of items from MUDs being auctioned off on eBay (for real money). This does suggest that such a feature might have some appeal.

Directories

The information stored about a user on a talker can be divided into three categories: transient state, local profile, and user information.

Transient state is implementation-specific data regarding the user's session. This data is discarded when the user disconnects. Local profile includes the user's description, how much currency they have, and which room they connect in. This information is saved when the user disconnects.

The third category, user information, is shared across talkers. Most users use more than one talker; some use many, often using the same identity on each. There are many pieces of information associated with users that they must enter manually into each talker. These include name, e-mail address, sex, age or date of birth, IM handles and homepage URL. It would make sense for this information to be kept in one location. An LDAP directory would be one possible solution, though it leaves a number of details to be considered, such as who would keep the directory online, and what would happen in the event of a failure.

Internationalisation

Talkers don't currently attempt to deal with internationalisation (i18n) issues. This is understandable; it's a complex issue. For example, should the sequence of bytes EF BB BF E4 BD A0 E5 A5 BD look like "ï»¿ä½ å¥½" or "你好"? It's clear to us which one is correct, but not to either the talker or the clients, because the answer depends on the character set being used. Talkers make little attempt to interpret input characters other than the few they directly act upon, instead passing them straight to the clients. If the two conversing parties are using non-ASCII characters (i.e. those with code points above 127) but are using the same character set, then this isn't a problem. However, if they're using incompatible character sets then the non-ASCII characters will be displayed wrongly. The TELNET protocol allows the discovery of a client's character set using a sub-option, but this isn't currently used. Even if it were, the talker would have to perform complex conversions between character sets. A new talker architecture could overcome this problem by storing and transferring all text in Unicode, a character coding system that assigns a single unique number to every character used by modern languages today (and then some). When the talker and all its clients know that they're exchanging Unicode text, agreeing on which characters are being exchanged is no longer a problem. Other issues, such as directionality and normalisation, must still be addressed by client software, but the Unicode Consortium gives clear guidelines on this.
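That byte sequence can be checked directly in (modern) Python: the same bytes decode to entirely different strings depending on the character set assumed.

```python
raw = b"\xef\xbb\xbf\xe4\xbd\xa0\xe5\xa5\xbd"

# As UTF-8, the bytes are a byte-order mark followed by two Chinese
# characters; as Latin-1, every byte becomes a separate (wrong) character.
as_utf8 = raw.decode("utf-8")      # '\ufeff' + '你好'
as_latin1 = raw.decode("latin-1")  # starts with 'ï»¿'

print(as_utf8)
print(as_latin1)
```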

Addressing

Names

Most long-term talkers are hosted on servers that also host several other talkers. The server's DNS name might be server.example.com, but, as each talker requires a unique rendezvous point, a TCP port number must also be specified. Many talkers can be partially referenced by their own DNS name, such as mytalker.com, but as this still resolves to the same IP address it must still be disambiguated with a port number. We use names for addresses because they're easier to remember than numbers, but while we don't have to remember an IP address, we still have to remember a port number.

A similar problem existed in web hosting. Before HTTP 1.1, the solution was to give the network interface of a server multiple IP addresses, and allow one web server to bind to each of these addresses. The extremely rapid expansion of the Web and the limited supply of IP addresses meant a better solution was needed, so the current HTTP protocol requires that the DNS name used to access the web site be specified in each request to the web server. This was a great improvement, but because it required a change to the protocol it was impossible to retrospectively apply the same principle to other services.

DNS SRV resource records offer an even more flexible solution. These records are similar to A records in that they provide an IP address for a particular DNS label, but they also provide a port number, and weighting and priority indicators. The practical result of this is that specifying mytalker.com would be sufficient to direct a next-generation talker client to the desired server. The weight field is intended for use in load balancing situations, and is unlikely to be used by talkers. The priority field, which has the same purpose as the priority field in MX records, could be used to allow clients to automatically failover to a backup server.
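To sketch what a client might do with such records, here's a small Python example. The record values, host names and port number are invented for illustration; a real client would obtain the records from an actual DNS resolver rather than a hard-coded list:

```python
from collections import namedtuple
import random

# A DNS SRV record: lower priority values are tried first; weight
# biases random selection among records sharing the same priority.
SrvRecord = namedtuple("SrvRecord", "priority weight port target")

def pick_server(records):
    """Pick one (host, port) pair, preferring the lowest priority."""
    lowest = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == lowest]
    # A weight of 0 is legal in SRV records; substitute 1 so the
    # weighted random choice always has a non-zero total.
    weights = [r.weight or 1 for r in candidates]
    chosen = random.choices(candidates, weights=weights, k=1)[0]
    return chosen.target, chosen.port

# Hypothetical records for mytalker.com: a primary and a backup.
records = [
    SrvRecord(priority=10, weight=1, port=4000, target="server.example.com"),
    SrvRecord(priority=20, weight=1, port=4000, target="backup.example.com"),
]
print(pick_server(records))  # ('server.example.com', 4000)
```

The user types only mytalker.com; the priority field means the backup is contacted only if the primary's records are exhausted.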

URIs

A possible extension to addressing talkers solely by name is to address entities (or resources) within the talker. URIs provide a natural facility for doing this. For example, a talker might have a room called entrance, which could be identified by the URI talker://mytalker.com/rooms/entrance. (Note that this is an example only, and talker is not being suggested as a URI scheme.) An 'advanced' option in a talker client might allow such a URI as an indication of which room the user wanted to be in after connecting. URIs might also identify users, groups of users, objects, message boards, messages and administrative controls.
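One attraction of reusing URI syntax is that standard parsers already split such an identifier into useful parts. This Python sketch uses the hypothetical talker scheme from the example above:

```python
from urllib.parse import urlparse

# Parse the example URI; "talker" is a hypothetical scheme,
# not a registered one.
uri = urlparse("talker://mytalker.com/rooms/entrance")

assert uri.scheme == "talker"          # which protocol to speak
assert uri.netloc == "mytalker.com"    # the talker's name (SRV lookup key)
assert uri.path == "/rooms/entrance"   # the resource within the talker
```

A client could resolve the netloc via SRV records as described earlier, then interpret the path once connected.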

Towards a solution

If a co-ordinated effort is made to develop the next generation of talkers, the problems mentioned can be overcome, and the new features introduced. However, doing so is a delicate task; it is essential to ensure that even if talkers move forwards technologically they still retain the distinct character that separates them from the instant messengers, MUDs and—perhaps most importantly—IRC that they're competing with.

I think the primary goal is to effect a paradigm shift where the talker stops being a program that people interact with directly, and becomes a service that people use. The Web is a good model of this, having user agents (web browsers such as IE and Mozilla), servers (such as Apache and IIS) and resources (primarily HTML pages, but also graphical and interactive content). The server delivers the resources to the user via the user agent. In the case of talkers, the user agent would be the client software and the resource would be something that defines the unique characteristics of a talker.

Providing a way in which a talker is defined separately from the code which delivers it yields a number of strong advantages:

  • the talker is no longer cluttered with boilerplate code for networking, logging, authentication, loading and saving resources, error handling, etc.
  • the server can be replaced independently of the talker definition
  • while it would be important to create a reference implementation of the server, independent implementations could be created by others, giving a choice to those creating a talker definition
The last point also applies to user agents, where the freedom of a user to choose an implementation that suits him or her is even more important.

This is only possible with standardisation. Just as HTML describes web pages, a method of defining talkers would have to be devised—a mixture of static and scripted content. There would be many issues to address here. For example, would a single scripting language be chosen to aid interoperability? Popular contenders would no doubt be Python, Ruby, Lua and ECMAScript (also known as JavaScript), but each has its champions and critics.

The protocol used for communication between clients and servers would also need to be defined. The previous section placed some requirements on this, but it still leaves much open for discussion. While I think a purely XML-based protocol such as XMPP (developed for use by Jabber) is inappropriate, there are other possible starting points, such as BEEP.

Once these were defined, with reference implementations, all that would remain is the hurdle of persuading people to adopt the new technology. Hopefully, those talkers run by the people involved in creating the new specifications would be compelling demonstrations of the way forward.

Tuesday, November 4, 2008

Fuzzy date matching in PostgreSQL

As part of one of my side-projects, I wanted a way for users of a web site to search for events by date, but with some flexibility. I also wanted users who are creating events to be able to express uncertainty about the dates on which they happened.
For example, if I record an event that happened sometime in September 1998, I can say it happened on September 15th, 1998 +/- 15 days. That should then be included in the results of a query for events that happened on October 1st, 1998 +/- 1 month, or June 1st, 1998 +/- 6 months, or even September 10th, 1998 +/- 1 day. An exact date would be represented as +/- 0 days.
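The matching rule here is plain interval overlap: two fuzzy dates match if their ranges share at least one day. A minimal Python sketch, using the September 1998 example with the +/- 1 day query (the function and variable names are mine, not part of the SQL that follows):

```python
from datetime import date, timedelta

def fuzzy_range(midpoint, fuzz_days):
    """A fuzzy date as an inclusive (start, end) pair of dates."""
    fuzz = timedelta(days=fuzz_days)
    return midpoint - fuzz, midpoint + fuzz

def overlaps(a, b):
    """Two date ranges match if they share at least one day."""
    return a[0] <= b[1] and b[0] <= a[1]

event = fuzzy_range(date(1998, 9, 15), 15)   # sometime in September 1998
query = fuzzy_range(date(1998, 9, 10), 1)    # September 10th, +/- 1 day
assert overlaps(event, query)

exact = fuzzy_range(date(1998, 11, 1), 0)    # an exact date, +/- 0 days
assert not overlaps(event, exact)
```

The rest of the post is about getting PostgreSQL to evaluate exactly this test, but with an index instead of a per-row comparison.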
One way to visualise these is as line segments on a literal date line, stretching from the past into the future, with a mark per day. PostgreSQL's spatial types can be used to represent this in an indexable way. First of all, we'll create a composite type to represent fuzzy dates:
CREATE TYPE fuzzydate AS (midpoint date, fuzziness interval);
The midpoint and fuzziness fields should be self-explanatory. Next, we create a function that will take a fuzzydate and convert it into a box. (PostgreSQL doesn't have an 'overlaps' operator for line segments, so I chose to use zero-height boxes.) The function calculates the start and end dates of the date range, and uses extract to convert them into epoch times (the number of seconds since the beginning of 1970). These are then divided by 86400 to get the number of days since 1970. These values form the x coordinates of the returned box.
CREATE FUNCTION GeometricDate(fd fuzzydate) RETURNS box AS $$
  DECLARE
    start_day integer;
    end_day integer;
  BEGIN
    start_day := extract(epoch from fd.midpoint - fd.fuzziness)::integer / 86400;
    end_day := extract(epoch from fd.midpoint + fd.fuzziness)::integer / 86400;
    RETURN box(point(start_day, 0), point(end_day, 0));
  END;
$$ LANGUAGE 'plpgsql' IMMUTABLE;
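The same day-number arithmetic is easy to check outside the database. This Python fragment mirrors what GeometricDate computes (the names here are illustrative, not part of the SQL above):

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)

def day_number(d):
    """Days since 1970-01-01, matching extract(epoch from ...) / 86400."""
    return (d - EPOCH).days

def geometric_range(midpoint, fuzz_days):
    """The x coordinates of the box GeometricDate would return."""
    fuzz = timedelta(days=fuzz_days)
    return day_number(midpoint - fuzz), day_number(midpoint + fuzz)

# November 4th, 2008 is day 14187; +/- 2 days spans 14185 to 14189.
assert day_number(date(2008, 11, 4)) == 14187
assert geometric_range(date(2008, 11, 4), 2) == (14185, 14189)
```

These are the same x coordinates you'll see in the box returned by the query below.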
Now we can perform a query like this (note the cast, which tells PostgreSQL that the row literal is a fuzzydate):
SELECT GeometricDate(('2008-11-04', '2 days')::fuzzydate);
to get a result like (14189,0),(14185,0), so let's create a table that we can perform our date range queries on:
CREATE TABLE events (
  id serial PRIMARY KEY,
  name varchar(100),
  fdate fuzzydate
);
Note that the fdate column is only storing the original midpoint and fuzziness. So where is the geometric representation? Well, PostgreSQL allows us to index not just a column, but the results of applying a function to a column:
CREATE INDEX events_fdate_index ON events USING gist (GeometricDate(fdate));
Now we can use the geometric date in a query like this one, which says "give me all the events that were within 10 days before or after November 3rd, 2008":
SELECT * FROM events WHERE GeometricDate(fdate) && GeometricDate(('2008-11-03', '10 days')::fuzzydate);
The really important thing to note here is that GeometricDate is not being called for every row in the table. The result of GeometricDate is taken directly from the index we created, and it's the IMMUTABLE flag on the function that tells PostgreSQL that it's okay to do this.