SIP Architecture

SIP is the IETF-specified protocol for initiating two-way communication sessions. It is not specific to VoIP and can be used in any session-driven application. Although its specification is now the largest RFC in IETF history, many regard SIP as simpler than H.323. SIP is text-based, so it avoids the ASN.1 parsing issues that exist with the H.323 protocol suite. It is also a pure application-level protocol, decoupled from the layer that transports it: it can be carried by TCP, UDP, or even Stream Control Transmission Protocol (SCTP). UDP may be used to decrease overhead and increase speed and efficiency, whereas TCP may be preferred if Transport Layer Security (TLS) encryption is incorporated for security reasons. SCTP is a more recent transport protocol developed specifically to carry signaling information. It offers increased resistance to DoS attacks through a more robust four-way handshake, the ability to multihome, and optional bundling of multiple user messages into a single SCTP packet. It also supports additional security services (TLS over SCTP and SCTP over IPsec).
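To illustrate what "text-based" means in practice, the following minimal Python sketch splits a SIP request into its start line, headers, and body using nothing but string operations. The message, host names, tags, and the helper name `parse_sip_message` are hypothetical and chosen only for illustration. The sketch is deliberately simplified: it ignores header line folding and keeps only the last value when a header appears more than once.

```python
def parse_sip_message(raw):
    """Split a SIP message into (start_line, headers, body).

    Simplified: no header folding, and repeated headers overwrite each other.
    """
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; values may contain colons (URIs, ports).
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# Hypothetical request; addresses, tags, and identifiers are made up.
EXAMPLE = (
    "OPTIONS sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP client.example.org:5060;branch=z9hG4bK776asdhds\r\n"
    "Max-Forwards: 70\r\n"
    "To: <sip:bob@example.com>\r\n"
    "From: <sip:alice@example.org>;tag=1928301774\r\n"
    "Call-ID: a84b4c76e66710\r\n"
    "CSeq: 1 OPTIONS\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

start_line, headers, body = parse_sip_message(EXAMPLE)
method, request_uri, version = start_line.split(" ")
print(method, request_uri, version)
print(headers["cseq"])
```

A production parser would need to handle far more of the grammar, but the point stands: the message can be read and debugged by eye, with no binary decoder involved.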

The architecture of a SIP network (see Figure 7-3) is different from the H.323 structure. It includes a proxy and/or a redirect server, a location server, and a registrar. Its endpoints are usually called User Agents (UAs). Unlike H.323 (with the notable exception of direct routed calls without a gatekeeper), SIP uses only one port for signaling. Its default value is 5060.

Diagram labels: SIP UA (traditional handset), SIP UA (softphone). Numbered flows:

1. UA registration
2. Storage of location information
3. Signaling traffic
4. Signaling traffic to/from the redirect server
5. Query to the location server and response
6. Media stream and signaling traffic

Figure 7-3 SIP architecture

As with H.323, SIP users are not bound to a specific host. They first report their location to the registrar, which may be integrated into a proxy or redirect server. The registrar stores this information in the location server, which provides address resolution. Messages from endpoints or other services must be routed through either a proxy or a redirect server. A proxy server intercepts each message, inspects it to obtain the destination username, asks the location server to resolve that username into a valid address, and then forwards the message to the appropriate endpoint or service. A redirect server performs the same resolution but leaves the actual transmission to the endpoints: it obtains the destination address from the location server and returns it to the original sender, which is then responsible for sending its message directly to the resolved address. This is similar to what happens with H.323 direct routed calls with a gatekeeper.
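The difference between the two server roles can be sketched in a few lines of Python. This is a toy in-memory model rather than a SIP implementation: the class and function names (`LocationService`, `Registrar`, `proxy_handle`, `redirect_handle`), the dictionary-based message, and the addresses are all assumptions made for illustration. The status strings correspond to the SIP response codes 302 and 404.

```python
class LocationService:
    """In-memory stand-in for the location server (username -> contact)."""

    def __init__(self):
        self._bindings = {}

    def store(self, user, contact):
        self._bindings[user] = contact

    def lookup(self, user):
        return self._bindings.get(user)


class Registrar:
    """Accepts registrations and records them in the location service."""

    def __init__(self, location):
        self.location = location

    def register(self, user, contact):
        self.location.store(user, contact)


def proxy_handle(location, message, send):
    """Proxy: resolve the destination and forward the message itself."""
    contact = location.lookup(message["to"])
    if contact is None:
        return "404 Not Found"
    send(contact, message)
    return "forwarded"


def redirect_handle(location, message):
    """Redirect: resolve the destination and hand the address back."""
    contact = location.lookup(message["to"])
    if contact is None:
        return ("404 Not Found", None)
    return ("302 Moved Temporarily", contact)


location = LocationService()
Registrar(location).register("bob", "192.0.2.20:5060")

delivered = []  # records what the proxy forwards, in place of a network send
message = {"method": "INVITE", "to": "bob", "from": "alice"}

proxy_result = proxy_handle(
    location, message, lambda addr, msg: delivered.append((addr, msg))
)
redirect_result = redirect_handle(location, message)
print(proxy_result, delivered[0][0])
print(redirect_result)
```

In the proxy case the server performs the send; in the redirect case the caller receives the contact address and must resend the request itself.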

To better explain the data flow during the call setup process, consider a typical scenario where a proxy server is used to mediate between endpoints. The process is similar with a redirect server, but has the extra step of returning the resolved address to the source endpoint.

SIP call setup uses a three-way handshake (INVITE, OK, ACK) similar to the one implemented in TCP. During a regular call setup, the endpoints negotiate communication details using the Session Description Protocol (SDP), which contains fields for the codec used, the caller's name, and so on. To place a call, the caller's UA sends an INVITE request containing SDP information for the session to its proxy server, which forwards it to the called party's client (possibly via the called party's proxy server). Assuming the called party wants to take the call, its client sends back an OK message containing its call preferences in SDP format, and the original caller responds with an ACK. After the ACK is received, the conversation can begin on the RTP/RTCP ports previously agreed upon. It continues until the call session is torn down by a BYE request issued by one of the endpoints.
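The message sequence above can be modeled as a small state machine. The sketch below is a simplification under stated assumptions: the state names and the `run_call` and `audio_port` helpers are hypothetical, provisional responses such as 180 Ringing are omitted, and the SDP offer uses made-up addresses and ports. It also shows how the agreed RTP port can be read from the `m=` line of an SDP body.

```python
# Simplified call states; provisional responses (e.g. 180 Ringing) are omitted.
TRANSITIONS = {
    ("idle", "INVITE"): "inviting",
    ("inviting", "200 OK"): "answered",
    ("answered", "ACK"): "established",
    ("established", "BYE"): "terminated",
}


def run_call(messages):
    """Walk the message sequence and return the final call state."""
    state = "idle"
    for message in messages:
        key = (state, message)
        if key not in TRANSITIONS:
            raise ValueError(f"unexpected {message!r} in state {state!r}")
        state = TRANSITIONS[key]
    return state


# Hypothetical SDP offer carried in the INVITE body; values are made up.
SDP_OFFER = "\r\n".join([
    "v=0",
    "o=alice 2890844526 2890844526 IN IP4 client.example.org",
    "s=-",
    "c=IN IP4 192.0.2.101",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0",
    "a=rtpmap:0 PCMU/8000",
]) + "\r\n"


def audio_port(sdp):
    """Return the RTP port from the first audio media line, or None."""
    for line in sdp.splitlines():
        if line.startswith("m=audio "):
            return int(line.split()[1])
    return None


print(run_call(["INVITE", "200 OK", "ACK", "BYE"]))
print(audio_port(SDP_OFFER))
```

Media flows only while the model is in the `established` state, that is, between the ACK and the BYE.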

Although all SIP signaling traffic is transported through one port in text format, without any of the complicated channel/port switching associated with H.323, SIP still presents several challenges for firewalls and NAT. These challenges are discussed in detail at the end of this chapter.
