TCPCam Protocol Description

---[ Revision history ]

YYYY-MM-DD
----------
2006-06-28: Initial version of the document

---[ General description ]

The TCPCam protocol is a point-to-point video+audio conference protocol
designed to be simple to implement and deploy. It works by transmitting
audio, video and control frames over a single TCP connection. At least
one of the two hosts involved in the conference must have a TCP port
open to the outside for the connection between the hosts to be possible.

The protocol uses the JPEG image compression algorithm to compress
video and the SPEEX encoder to compress audio.

---[ Transport layer ]

In a TCPCam session two hosts are involved at the same time. The first
has TCP port 7766 open in LISTEN mode, accepting connections (Server
mode); the second connects to that port (Client mode). A TCPCam
implementation should work in both Server and Client mode.

The current implementation sets the TCP socket send buffer of both ends
to 8192 bytes, so that not too much delay is introduced when the TCP
link between the two ends is not fast enough.

---[ Frames format ]

The TCP connection is used to transport frames of different types
containing audio, video and control data. This is the format of the
frames.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Frame Type           |         Total Length          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
/                             DATA                              /
/                                                               /
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Both the Frame Type and Total Length fields are unsigned 16 bit
integers encoded in network byte order (big endian).

---[ Frames types ]

WELCOME FRAME (type: 0x00)

It is sent by the Server side once the Client connects, to tell the
Client that the connection was accepted and can continue.

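
As an illustration of the frame layout, here is a minimal Python sketch
that builds and parses frames such as the WELCOME frame just described.
The names HEADER, encode_frame and decode_frame are hypothetical, not
part of any TCPCam implementation. It also assumes that Total Length
counts the 4 header bytes plus the data bytes; the field name suggests
this, but the document does not state it explicitly.

```python
import struct

# Two unsigned 16 bit integers in network byte order (big endian).
HEADER = struct.Struct("!HH")
WELCOME = 0x00  # WELCOME frame type, as given in this document


def encode_frame(frame_type, data=b""):
    """Build one frame: header followed by the data bytes."""
    # ASSUMPTION: Total Length = header size + data size.
    total_length = HEADER.size + len(data)
    if total_length > 0xFFFF:
        raise ValueError("frame too large for a 16 bit length field")
    return HEADER.pack(frame_type, total_length) + data


def decode_frame(buf):
    """Return (frame_type, data, rest), or None if buf holds no full frame."""
    if len(buf) < HEADER.size:
        return None  # header not fully received yet
    frame_type, total_length = HEADER.unpack_from(buf)
    if total_length < HEADER.size:
        raise ValueError("Total Length smaller than the header")
    if len(buf) < total_length:
        return None  # data not fully received yet
    return frame_type, buf[HEADER.size:total_length], buf[total_length:]
```

Because TCP is a byte stream, a receiver would typically append incoming
bytes to a buffer and call decode_frame in a loop until it returns None.
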
BUSY FRAME (type: 0x01)

It is sent by the Server side once the Client connects, to tell the
Client that the Server is already involved in a conference and is not
able to handle a second connection.

AUDIO FRAME (type: 0x02)

Contains an audio frame in narrow or wide band, encoded using the Speex
encoder. A TCPCam audio frame can contain only a single Speex frame; it
is not allowed to put multiple Speex frames in a single TCPCam audio
frame.

IMGDATA FRAME (type: 0x03)

Contains part of a JPEG image. TCPCam sends images by compressing them
in JPEG format (the same format used to store actual JPEG files on
disk, including the full header), then splitting the entire image into
frames no bigger than 512 bytes (including the frame header).

IMGEND FRAME (type: 0x04)

This frame contains no data. It is only used to tell the other end that
the last IMGDATA frame sent was the last part of the last image
transmitted. When this frame is received, the receiver knows that the
image input buffer (composed of the data of one or more IMGDATA frames)
contains a full image that is ready to be decoded and shown to the
user.

---[ Control Flow ]

An implementation should always check whether the kernel is ready to
send more data via the TCP socket, and use the following rules to send
frames:

- If the socket is ready to send more data and there are audio frames
  in the queue, send audio data. Image data is not sent even if it is
  in the queue.
- If the socket is ready to send more data and there are NO audio
  frames in the queue, send image data.
- If the socket is NOT ready to send more data and the output audio
  frame queue is longer than half a second of audio, discard the queue.

These simple rules make sure that audio priority is higher than video
priority on the TCP channel, since it is much better to have slow video
than hard to understand audio.

Image data should always be as recent as possible.
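
The image splitting and the three send rules above can be sketched in
Python as follows. This is only an illustration: split_image, step, the
send callback and the seconds_per_audio_frame parameter are hypothetical
names, and the duration of one audio frame is an assumption supplied by
the caller, not a value given in this document.

```python
from collections import deque

IMGDATA, IMGEND = 0x03, 0x04
MAX_FRAME = 512                 # frame size limit, header included
HEADER_SIZE = 4                 # two 16 bit fields
MAX_AUDIO_BACKLOG_SECONDS = 0.5


def split_image(jpeg):
    """Return (frame_type, payload) pairs for one JPEG image."""
    chunk = MAX_FRAME - HEADER_SIZE
    parts = [(IMGDATA, jpeg[i:i + chunk]) for i in range(0, len(jpeg), chunk)]
    parts.append((IMGEND, b""))  # marks the end of the image
    return parts


def step(writable, audio_q, image_q, seconds_per_audio_frame, send):
    """Apply the send rules once. The queues are deques of ready frames."""
    if writable:
        if audio_q:
            send(audio_q.popleft())   # audio always goes first
        elif image_q:
            send(image_q.popleft())   # image data only when no audio waits
    elif len(audio_q) * seconds_per_audio_frame > MAX_AUDIO_BACKLOG_SECONDS:
        audio_q.clear()               # backlog too long: discard queued audio
```

In a real program the writable flag could come from a readiness check
such as select.select([], [sock], [], 0) on the TCP socket.
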
If an image is already present in the send buffer, but no part of it
has been sent yet, and a new image is available, it is a good rule to
discard the old image and populate the image output buffer with the
new one.

---[ Notes ]

JPEG: http://www.jpeg.org
SPEEX: http://www.speex.org

---[ Author ]

This document was written by Salvatore Sanfilippo (antirez at gmail dot
com) and is released under the GPL license.