|
First release of CPL for YXA
This document documents my experience of using Erlang/OTP to implement
CPL (Call Processing Language - RFC 3880) for YXA.
CPL is a script language which allows for the declaration of complex
call forwarding behaviors, in IP telephony systems. While the current
implementation is done for YXA, it should be easy to adapt it to work
with other SIP (RFC 3261 - SIP: Session Initiation Protocol) based
servers as well.
Those familiar with CPL, may want to skip the next chapter.
A quick CPL introduction
SIP servers send incoming (or outgoing) SIP requests to a CPL
interpreter if a CPL script is associated with the receiver of a
incoming message (or sender of a outgoing message). After processing
the script, with the SIP request as input data, the CPL interpreter
determines where the call should be sent and tells this to the SIP
server.
At most one script, which can contain code for both incoming and
outgoing handling, can be associated with a single user.
The CPL language has a number of properties:
- All CPL commands can be represented as graph nodes, in a finite state machine.
- Conditional commands are nodes that have 1+ edges while non-conditional commands
are nodes with a single destination edge, leading to the next command.
- Various CPL constraints ensure that loops can never occur. This is done to ensure
that script execution can be done in finite time.
- The language is also designed so that a server can easily confirm the validity of
a script when the server receives it, rather than discovering problems while a call
is being processed.
- The syntax is XML based.
A typical script may look like the one below:
<?xml version="1.0" encoding="UTF-8"?>
<cpl xmlns="urn:ietf:params:xml:ns:cpl"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn:ietf:params:xml:ns:cpl cpl.xsd ">
<incoming>
<time-switch>
<time dtstart="20000703T090000" duration="PT8H"
freq="weekly" byday="MO,TU,WE,TH,FR">
<lookup source="registration">
<success>
<proxy/>
</success>
</lookup>
</time>
<otherwise>
<location url="sip:voicemail@example.com">
<proxy/>
</location>
</otherwise>
</time-switch>
</incoming>
</cpl>
- <incoming> Action </incoming> - process "Action" if SIP request is a
incoming one i.e. the user is receiving a call.
- The "time-switch" command tag has two possible conditions "time" and "otherwise",
which are used to determine if either "lookup" or "location" should be executed.
- "time" is a fairly complex iCalendar (RFC 2445) based condition that is used to
specify that monday - friday between 9:00 - 17:00 are work hours.
- "lookup" is used to find all the SIP phones associated with the user and "proxy"
is then used to ring on all these phones.
- "location" is similar to "lookup" but is used to supply a single hardcoded SIP url,
in this script it's used to send the caller to a voicemail service when calling during
non-work hours.
Note: a location table (accumulator) is associated with the script
while it's being executed. This table is used to hold the currently
accumulated location entries (SIP urls), as the CPL interpreter
processes each command - different CPL commands can add or remove
locations in the table, thereby affecting the outcome of later
executing commands.
Note: several SIP phones (user agents) can be associated with the same
user (retrieved with the lookup command). A lookup command is usually
followed by a proxy command which rings on the phones - ringing can be
done both parallel or sequential depending on proxy (xml tag)
attributes specified in the script.
CPL command overview
|
incoming |
actions to do on incoming SIP request |
outgoing |
actions to do on outgoing SIP request |
subaction |
define a reusable subtree |
sub |
call subaction |
address-switch |
branch based on address information in SIP request |
language-switch |
branch based on "accept-language" in SIP request |
priority-switch |
branch based on priority of SIP request |
string-switch |
branch based on textual match on SIP request fields |
time-switch |
match time condition/s against the current time (when CPL script is run) |
location |
add SIP url to location table |
lookup |
get SIP urls of script owner |
remove-location |
remove SIP url/s from location table |
log |
log information to non-volatile storage |
mail |
send mail to specific mailto url |
proxy |
ring and remove locations in location table, may branch if all calls fail |
redirect |
direct calls to destinations in location table |
reject |
reject call |
The implementation
The following RFCs have been implemented:
- RFC 3880 - Call Processing Language (CPL): A Language for User
Control of Internet Telephony Services - a complete implementation has
been done.
- RFC 2445 - Internet Calendaring and Scheduling Core Object
Specification (iCalendar) - a partial implementation, all features
required by the CPL "time-switch" command are implemented, among which
the RRULE is the most noteworthy.
- RFC 3066 - Tags for the Identification of Languages - fully
implemented, used by the "language-switch" command.
The goals of the current implementation:
- correctness - a properly working implementation.
- completeness - a complete implementation of CPL as specified in RFC3880.
Correctness has primarily been ensured by using auto test cases using
autotest.erl, see autotest.erl in the source code. autotest.erl is a
very simple test server, that can run test modules independently of
each other. Extensive real life usage and integration testing (with CPL
script generating tools) has yet to be done, as none oft he GUI tools
that user are expected to use, to generate CPL scripts, have yet been
developed.
Dialyzer
as well as xref (see xref_test.erl in the source) have also
been used to get additional hints about possible bugs.
The implementation is divided into three main parts (see cpl/README in
source for more details):
- The parser (~ 1/4 th of the code) - converts the XML code to a
suitable internal graph representation, does extensive script
verification as well as some "time-switch" related preprocessing.
- scripts are only registered (stored in a mnesia database) if they
pass all verification tests done by the parser.
- The interpreter (~ 1/4 th of the code) - retrieves a registered
script (actually the parser generated graph) and executes the script as
specified in RFC 3880.
- "time-switch" handling (~ 2/4 th of the code) - code that deals with
determining, if the current time is part of a specified reoccurrence,
like does it occur every monday, all workdays between 8:00 - 16:40, the
third thursday after 2004-04-24, ... etc. A custom "time-switch"
resolution algorithm, functionally identical to the one supplied in RFC
3880 (appendix A, page 51) is used.
The implementation of the parser and interpreter is fairly straight
forward while the "time-switch" handling code is far more complex and
tedious - implementing iCalendar involves dealing with a lot of tricky
special cases; disappearing and duplicating hours during DST (Daylight
Saving Time), variable length time units like months with leap day,
getting the first 5 occurrences of the date when month day = 31 (which
obviously doesn't occur every month), ... etc.
Conclusions
Implementation time: 4.5 man-months.
Code size (with comments, test cases and documentation): ~18000 lines,
not including code discarded during various rewrites.
Goals: reached, the implementation is complete and throughly tested
with a large set of auto test cases. The current implementation is
known to be inefficient in a number of places (e.g. calculating the
difference between two dates in days, the code is currently O(N) but
could be made O(1)) and speed optimizations should be able to improve
performance noticeably.
Dialyzer and xref: I have found only a small number of bugs with these
tools - bugs which I have also encountered while developing/debugging
my auto test cases. But the Dialyzer has been able catch one obscure,
hard to test error, which I would otherwise have missed, which would
have resulted in odd unpredictable errors.
Fredrik Thulin - the main maintainer/developer of YXA seems to have
more success with getting useful bug reports from Dialyzer, I can only
speculate that this may be due to design differences between CPL and
YXA (SIP) code, my practice to only use type checking guards when
really needed (reducing the amount of type info) or perhaps because my
code is less type error prone. [Fredrik: I think it is because Håkan
is a careful code writer, while I'm not. Håkan has taught me a thing or
two about actually testing code you write before trying to use it ;) ]
autotest.erl: initially a quick hack (about an hours work), to allow me
to do regression testing when working on YXA code (not CPL related). I
have considered using the OTP test server to set up more complex tests
than the basic module tests done by autotest.erl, but have not taken
the time to reacquaint myself with the test server to be able to
determine to which extent the OTP test server may be useful, for this
purpose.
Parser and interpreter design:
Some Java implementations I have looked at combine parsing and
execution in a piece meal fashion - i.e. traverser XML tree, parse
command, execute command, repeat. This solution has several advantages
and disadvantages:
+ |
No need to store a graph representation. |
+ |
Scripts can be added without any parsing or verification cost - a
external parser-verifier is still needed in CPL editor tools, to
minimize script failure in runtime.
|
+ |
Only a small part of the script may need to be parsed. |
- |
Doing XML parsing each time a script is run, adds extra CPU cost and
may affect response time
|
- |
Certain time-switch conditions can only be evaluated efficiently if
they are preprocessed, this applies for example to the "count"
attribute (specifies a fixed number of occurrences).
|
- |
Puts all script processing load, on the same SIP request processing server. |
The chosen design instead separates parsing (validation and
registration) and SIP request processing, which makes the system
scalable - registration and validation can be run on a different server
(or servers) than SIP request processing.
Pros and cons of CPL (and problems during implementation):
+ |
Simple language. |
+ |
RFC 3880 is easy to understand and has lots of example scripts that
are useful as the basis for various test cases - a proper parser should
for example, be able to parse all example scripts.
|
- |
"language-switch" specification is broken (see cpl/README for a more
in depth explanation).
|
- |
The "time-switch" command is based on iCalendar (RFC 2445) which is a
very feature full and complex standard (148 pages, twice the size of
RFC 3880), which describes the syntax and interpretation of calendar
information and time based reoccurrence rules. The use of iCalendar
allows for easier integration with other calendaring tools, which may
be used to create CPL scripts, but it also means that the CPL
implementation gets considerably more complex.
|
- |
XML is used as both the script language and for the script grammar
specification - "It [CPL] is based on the Extensible Markup Language
(XML), so parsing it is easy and many parsers for it are publicly
available." - RFC 3880 page 2. I find this motivation rather lacking -
creating a parser using a parser generator like yecc (or yacc) is
trivial, the real work in either cases (XML /BNF grammer) is script
validation and transforming the input into something suitable to the
interpreter.
|
The custom time-switch resolver was done for a number of reasons:
- The (non-normative) reference algorithm and implementation (written
in Java) is incomplete (no "count" attribute support) and requires that
a number of preconditions are fulfilled to work properly - I initially
perceived these precondition checks as more complex than they really
were, which gave me the false notion that implementing the time-switch
resolver algorithm part was a relatively minor issue.
- I initially had trouble understanding the working of the reference
algorithm, it makes a number of implicit assumptions which are not
obvious from reading either the time-switch specification (RFC 3880) or
iCalendar (RFC 2445).
- I considered using the java reference implementation in a "black box
manner" but chose not to; as it was incomplete, required various other
preconditions to also be implemented, would be tedious to interface
with Erlang and appeared hard to extend - why else was the reference
implementation incomplete? The RFC (3880) says the following about the
reference algorithm:
"This algorithm is believed to be correct, but this section is
non-normative. Section 4.4, and RFC 2445 [8], are the definitive
definitions of recurrences." - RFC 3880 appendix A page 51. This
implies that the reference algorithm lacks a formal proof of
correctness or real life usage testing, and may at worst even faulty.
I therefore chose to implement my own time-switch resolver based purely
on the time-switch specification (RFC 3880) and iCalendar (RFC 2445).
The time-switch resolver was designed in stages - there are a total of
7 resolution paths (stages) depending on the complexity of the
time-switch condition. As I gradually implemented them, I started to
understand the reference algorithm and it's choice of implicit
assumption - which I chose to follow thereby ending up with a
functionally equivalent solution with similar constant time execution
properties.
It's hard to say if the way I chose to implement the time-switch
resolution algorithm was good or bad, doing it in Erlang using my
custom algorithm or the reference algorithm would result in
approximately the same amount of "reinventing the wheel" work.
Using the java reference implementation instead, would probably have
reduce the amount of work needed to be done, to half of the above
solution - but would require that half the code (preconditions and
count handling) needed to be done in java, which most likely would take
at least twice as much time.
Pros and cons of Erlang/OTP:
+ |
A good exception handling model (the new "try ... catch ... end" was
used) useful in for example the parser.
|
+ |
I appreciate the existence of a Erlang XML library, although I found
the documentation slightly lacking, probably because I had no previous
experience dealing with XML.
|
+ |
High level language, less verbose than java - no messing with type
and class declarations, more powerful semantics, good for prototyping -
easy to test and run non-finished code.
|
- |
Sufficient but somewhat primitive calendar support (see calendar
module in OTP) - there is for example no time-zone support, which would
have been nice.
|
- |
No unicode support, this currently adds some limitations in the
"address-switch" and "string-switch" implementation. This lack of
support _really_ does need to be addressed, unicode strings could
easily be represented the same way as regular strings (as a list of
integers) and processed by the same string operations. But due to
backwards compatibility issues with the current text encoding used
(latin-1), it may be wiser and simpler to just supply a unicode string
library; with string operations, UTF encoding and decoding functions as
well as string normalization functions needed when comparing unicode
strings.
|
About Me
Name: |
Håkan Stenholm |
Email: |
hakan.stenholm@mbox304.swipnet.se |
Address: |
Stockholm - Sweden |
- Got a Master of science degree in Computer Science, 1998, at Uppsala
university (Sweden).
- well versed in C, Java and Erlang.
- most noteworthy software development jobs:
- Four years (09/1998 - 11/2002) working at Ericsson developing Erlang
code for the AXD301.
- The the last seven months (09/2004 - 04/2005) working on YXA and CPL
at Stockholm university (Sektionen för IT och Media / Division of IT
and media services).
$Id: cpl_implementation.html,v 1.3 2005/04/04 12:43:03 ft Exp $
|
|