There is an alternative to circuit-switched voice calls that has become very popular recently, especially in the US. It is called push-to-talk (PTT).
Push-to-talk over Cellular (PoC) is a half-duplex, two-way communication service. It basically works like a walkie-talkie, where only one participant at a time is allowed to speak. PTT is not meant to be a replacement for normal circuit-switched voice calls, but was designed for a quick exchange of information between users.
It is very easy to use a PTT application. The user selects a contact from the contacts list (which typically also displays the presence status of the contact) and holds down the dedicated PTT button while speaking. The voice is immediately transferred to the other call participant(s) in real time.
Until now, only proprietary protocols were used to implement PTT functionality. However, in order to guarantee interoperability, the Open Mobile Alliance (OMA) is in the process of specifying an open standard for PTT. OMA members include wireless vendors like Nokia or Motorola, information technology companies, mobile network operators as well as application and content providers. At the time of writing, the latest version of this standard is candidate version 1.0 (6th of October, 2005) [1]. A number of OMA-compliant servers and clients are currently under development.
What are the main building blocks of a PTT system? Apart from a PoC client, there is obviously a need for a PoC server as well. The PoC server's responsibility is to allow clients to register and to manage PTT sessions. As soon as a PoC session is established, the client can transmit audio data.
The user's contact lists are stored on the shared XML document management (XDM) server. These lists are stored in an application-independent format and therefore can be shared with other applications. PTT-specific data structures like PoC groups are stored on the PoC XDM server. Clients do not contact the XDM servers directly. Instead, they send their requests to an aggregation proxy, an HTTP proxy that forwards each incoming request to the appropriate XDM server.
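The forwarding decision made by the aggregation proxy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the host names, the XCAP root path and the function name route_request are made up, and the two application usage identifiers are examples rather than a complete list.

```python
from urllib.parse import urlsplit

# Assumed mapping from application usage (first path segment below the
# XCAP root) to a backend XDM server. All host names are hypothetical.
XDM_BACKENDS = {
    "resource-lists": "http://shared-xdms.example.net",
    "org.openmobilealliance.poc-groups": "http://poc-xdms.example.net",
}

XCAP_ROOT = "/services"  # assumed root path on the proxy


def route_request(url: str) -> str:
    """Return the backend URL that an incoming client request is forwarded to."""
    path = urlsplit(url).path
    if not path.startswith(XCAP_ROOT + "/"):
        raise ValueError(f"not below the XCAP root: {path}")
    remainder = path[len(XCAP_ROOT) + 1:]
    app_usage = remainder.split("/", 1)[0]
    try:
        backend = XDM_BACKENDS[app_usage]
    except KeyError:
        raise LookupError(f"no XDM server for {app_usage!r}") from None
    return f"{backend}/{remainder}"
```

A request for a contact list would be sent to the shared XDM server, while a request for a PoC group document would end up on the PoC XDM server. The client only ever needs to know the address of the proxy.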
As mentioned above, presence is a feature which is supported by most PTT clients. It allows the user to view the current presence states of the contacts in the user's contacts list. Possible presence states are for example “online”, “busy” or “do not disturb”. A presence server stores the presence information and notifies the client about presence changes. The user can also set his or her own presence. Presence functionality is optional for OMA PoC.
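The store-and-notify behaviour of a presence server can be illustrated with a small Python sketch. It models only the idea described above. The class and method names are hypothetical, and the callbacks stand in for the notifications that a real server would send over the network.

```python
from collections import defaultdict
from typing import Callable, Optional

# A watcher is called with (user, new_state) whenever a presence changes.
Watcher = Callable[[str, str], None]


class PresenceServer:
    """Toy presence server: stores states and notifies subscribed watchers."""

    def __init__(self) -> None:
        self._states = {}
        self._watchers = defaultdict(list)

    def subscribe(self, contact: str, watcher: Watcher) -> None:
        """Register a watcher for presence changes of one contact."""
        self._watchers[contact].append(watcher)

    def publish(self, user: str, state: str) -> None:
        """Set a user's own presence and notify watchers if it changed."""
        if self._states.get(user) == state:
            return  # nothing changed, so no notification is sent
        self._states[user] = state
        for watcher in self._watchers[user]:
            watcher(user, state)

    def state_of(self, user: str) -> Optional[str]:
        """Return the stored state, or None if the user never published one."""
        return self._states.get(user)
```

A client would subscribe once for every entry in its contacts list and update the displayed status whenever its callback is invoked.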
Another optional building block of a PTT system is the device provisioning and management (DM) server. It is used to initialise and update all configuration parameters necessary for the PoC client.
In addition to “normal” one-to-one PTT calls, the OMA specification describes calls which involve a group of subscribers. Pre-arranged groups contain a number of contacts and can be called by selecting the group in the PTT application and pressing the PTT button. Ad-hoc groups only exist during the lifetime of the call. The user has to select the group participants before calling this group. When a pre-arranged or ad-hoc group is initiated, all members of the group are invited to the call. This is not the case with chat groups. A chat group is a group where users can join and leave whenever they like. In order to share groups, group advertisement is defined.
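The difference between the three group types comes down to who is invited when a session starts. The following Python sketch captures just that rule. The type and function names are hypothetical, and excluding the initiator from the invitation list is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class GroupType(Enum):
    PRE_ARRANGED = "pre-arranged"  # stored group with a fixed member list
    AD_HOC = "ad-hoc"              # members picked just before the call
    CHAT = "chat"                  # users join and leave on their own


@dataclass
class Group:
    kind: GroupType
    members: set = field(default_factory=set)


def users_to_invite(group: Group, initiator: str) -> set:
    """Return the users who receive an invitation when a session starts."""
    if group.kind is GroupType.CHAT:
        # Nobody is invited to a chat group; participants join by themselves.
        return set()
    # Pre-arranged and ad-hoc groups: every member is invited.
    # Assumption: the initiator does not need an invitation.
    return group.members - {initiator}
```

An ad-hoc group would simply be a Group object that is created for one call and discarded afterwards, whereas a pre-arranged group would be loaded from the PoC XDM server.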
So how does PTT really work? Under the hood of a PoC client, a number of important protocols exchange messages with a server.
The first is the session initiation protocol (SIP). It is a general-purpose protocol which can be used to establish any kind of session. A PoC client uses it to register with the server and to send the invite messages which start a session. SIP uses the session description protocol (SDP) to agree on media parameters for a PoC session.
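To give an impression of what such an invite message looks like on the wire, the following Python sketch assembles a generic SIP INVITE with an SDP body. It is a simplified illustration, not a PoC-compliant request: the function name build_invite is hypothetical, the branch, tag and Call-ID values are fixed placeholders that a real client would generate freshly, and the AMR codec on payload type 97 is an assumed choice.

```python
def build_invite(caller: str, callee: str, local_ip: str, rtp_port: int) -> str:
    """Build a minimal SIP INVITE whose SDP body offers one audio stream."""
    # SDP body: describes where and in which format the caller wants audio.
    sdp = "\r\n".join([
        "v=0",
        f"o=- 0 0 IN IP4 {local_ip}",
        "s=-",
        f"c=IN IP4 {local_ip}",
        "t=0 0",
        f"m=audio {rtp_port} RTP/AVP 97",
        "a=rtpmap:97 AMR/8000",  # assumed codec and payload type
    ]) + "\r\n"

    # SIP request line and headers; placeholder values are marked as such.
    lines = [
        f"INVITE {callee} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:5060;branch=z9hG4bK-placeholder",
        "Max-Forwards: 70",
        f"From: <{caller}>;tag=placeholder",
        f"To: <{callee}>",
        f"Call-ID: placeholder@{local_ip}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{local_ip}:5060>",
        "Content-Type: application/sdp",
        f"Content-Length: {len(sdp.encode('utf-8'))}",
    ]
    # An empty line separates the headers from the body.
    return "\r\n".join(lines) + "\r\n\r\n" + sdp
```

Calling build_invite("sip:alice@example.net", "sip:bob@example.net", "192.0.2.10", 49170) returns the complete message as a string. The receiving side answers with its own SDP, and both ends then know where to send the audio data.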
For the transmission of audio data between clients, the real-time transport protocol (RTP) as well as the RTP control protocol (RTCP) are used. The latter carries reception feedback, which is used to monitor the quality of the audio channel.
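Every RTP packet starts with a fixed 12-byte header that carries, among other things, a sequence number and a timestamp. The Python sketch below packs and unpacks this header for the simple case without padding, header extension or contributing sources. The function names are hypothetical, and the payload type is whatever was agreed on in the SDP exchange.

```python
import struct

RTP_VERSION = 2
RTP_HEADER_FORMAT = "!BBHII"  # network byte order, 12 bytes in total


def pack_rtp_header(payload_type: int, seq: int, timestamp: int,
                    ssrc: int, marker: bool = False) -> bytes:
    """Pack a fixed RTP header (no padding, no extension, no CSRCs)."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(RTP_HEADER_FORMAT, byte0, byte1,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF,
                       ssrc & 0xFFFFFFFF)


def unpack_rtp_header(data: bytes) -> dict:
    """Decode the first 12 bytes of an RTP packet into its header fields."""
    if len(data) < 12:
        raise ValueError("an RTP header needs at least 12 bytes")
    byte0, byte1, seq, timestamp, ssrc = struct.unpack(
        RTP_HEADER_FORMAT, data[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }
```

The receiver uses the sequence number to detect lost or reordered packets and the timestamp to play the audio back at the right pace. The encoded audio frame simply follows the header.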
Before the user can speak, the client has to request the floor from the server. This is necessary to guarantee that only one participant at a time is allowed to transmit audio data. When the server receives such a request, it can allow or reject it. Additionally, the server can revoke the right to speak at any time. The protocol that is used for floor control is called talk burst control protocol (TBCP) and uses RTCP messages for the exchange of information.
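The arbitration logic on the server side can be sketched as a very small state machine. The Python example below models only the decision of who holds the floor. It does not encode any TBCP messages, and the class name, method names and the return values "granted" and "denied" are hypothetical.

```python
from typing import Optional


class FloorControl:
    """Toy floor arbiter: at most one participant may talk at a time."""

    def __init__(self) -> None:
        self.holder: Optional[str] = None  # None means the floor is idle

    def request(self, user: str) -> str:
        """Handle a floor request and return 'granted' or 'denied'."""
        if self.holder is None or self.holder == user:
            self.holder = user
            return "granted"
        return "denied"  # somebody else is currently talking

    def release(self, user: str) -> bool:
        """The talker lets go of the PTT button; the floor becomes idle."""
        if self.holder != user:
            return False  # only the current holder can release the floor
        self.holder = None
        return True

    def revoke(self) -> Optional[str]:
        """The server takes the floor away, e.g. after a maximum talk time."""
        revoked, self.holder = self.holder, None
        return revoked
```

A server that uses a queue instead of rejecting requests outright would be a straightforward extension of this sketch. The maximum talk time mentioned in the comment is one plausible reason for revoking the floor and is an assumption of the example.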
For the implementation of a fully OMA-compliant PoC solution, several hundred pages of specifications have to be read, understood, and correctly implemented. It is no surprise that most implementations would not interoperate smoothly without testing. In order to ensure interoperability between PTT solutions, the OMA regularly hosts so-called TestFestivals or TestFests. At these events, OMA members can test cross-vendor combinations and make sure they interoperate correctly.
To summarise, PTT is a solution which has its place alongside circuit-switched and voice-over-IP calls. Its main advantages are instant access, simplicity, and the efficient use of network resources. But the functionality described above is only the beginning of a whole series of so-called push-to-X services. Like PTT, the point of these services is instant access. Examples are push-to-view, which allows users to share images during a PTT call, and push-to-find, where location information is transferred.