This document gives some basic indication on how the asterisk code is structured. The first part covers the structure and style of individual files. The second part (TO BE COMPLETED) covers the overall code structure and the build architecture.
Please read it to the end to understand in detail how the asterisk code is organized, and to know how to extend asterisk or contribute new code.
We are looking forward to your contributions to Asterisk - the Open Source PBX! As Asterisk is a large and in some parts very time-sensitive application, the code base needs to conform to a common set of coding rules so that many developers can enhance and maintain the code. Code also needs to be reviewed and tested so that it works and follows the general architecture and guide-lines, and is well documented.
Asterisk is published under a dual-licensing scheme by Digium. To be accepted into the codebase, all non-trivial changes must be licensed to Digium. For more information, see the electronic license agreement on https://issues.asterisk.org/.
Patches should be in the form of a unified (-u) diff, made from a checkout from subversion.
/usr/src/asterisk$ svn diff > mypatch
If you would like to only include changes to certain files in the patch, you can list them in the "svn diff" command:
/usr/src/asterisk$ svn diff somefile.c someotherfile.c > mypatch
- Indent code using tabs, not spaces.
- All code, filenames, function names and comments must be in ENGLISH.
- Don't annotate your changes with comments like "/* JMG 4/20/04 */"; Comments should explain what the code does, not when something was changed or who changed it. If you have done a larger contribution, make sure that you are added to the CREDITS file.
- Don't make unnecessary whitespace changes throughout the code. If you make changes, submit them to the tracker as separate patches that only include whitespace and formatting changes.
- Don't use C++ type (//) comments.
- Try to match the existing formatting of the file you are working on.
- Use spaces instead of tabs when aligning in-line comments or #defines (this makes your comments aligned even if the code is viewed with another tabsize)
Every C source file should start with a proper copyright and a brief description of the content of the file. Following that, you should immediately put the following lines:
"asterisk.h" resolves OS and compiler dependencies for the basic set of unix functions (data types, system calls, basic I/O libraries) and the basic Asterisk APIs. ASTERISK_FILE_VERSION() stores in the executable information about the file.
Next, you should #include extra headers according to the functionality that your file uses or implements. For each group of functions that you use there is a common header, which covers OS header dependencies and defines the 'external' API of those functions (the equivalent of 'public' members of a class). As an example:
**if you are implementing a module, this should be included in one of the files that are linked with the module.
- access to extra file I/O functions (stat, fstat, playing with directories etc)
- basic network I/O - all of the socket library, select/poll, and asterisk-specific (usually either thread-safe or reentrant or both) functions to play with socket addresses.
- parsing of application arguments
- struct ast_channel and functions to manipulate it
For more information look at the headers in include/asterisk/. These files are usually self-sufficient, i.e. they recursively #include all the extra headers they need.
The equivalent of 'private' members of a class are either directly in the C source file, or in files named asterisk/mod_*.h to make it clear that they are not for inclusion by generic code.
Keep the number of header files small by not including them unnecessarily. Don't cut&paste list of header files from other sources, but only include those you really need. Apart from obvious cases (e.g. module.h which is almost always necessary) write a short comment next to each #include to explain why you need it.
- Do not declare variables mid-block (e.g. like recent GNU compilers support) since it is harder to read and not portable to GCC 2.95 and others.
- Functions and variables that are not intended to be used outside the module must be declared static. If you are compiling on a Linux platform that has the 'dwarves' package available, you can use the 'pglobal' tool from that package to check for unintended global variables or functions being exposed in your object files. Usage is very simple:
$ pglobal -vf <path to .o file>
- When reading integer numeric input with scanf (or variants), do NOT use '%i' unless you specifically want to allow non-base-10 input; '%d' is always a better choice, since it will not silently turn numbers with leading zeros into base-8.
- Strings that are coming from input should not be used as the format argument to any printf-style function.
On many platforms, structure fields (in structures that are not marked 'packed') will be laid out by the compiler with gaps (padding) between them, in order to satisfy alignment requirements. As a simple example:
On nearly every 64-bit platform, this will result in 4 bytes of dead space between 'bar' and 'xyz', because pointers on 64-bit platforms must be aligned on 8-byte boundaries. Once you have your code written and tested, it may be worthwhile to review your structure definitions to look for problems of this nature. If you are on a Linux platform with the 'dwarves' package available, the 'pahole' tool from that package can be used to both check for padding issues of this type and also propose reorganized structure definitions to eliminate it. Usage is quite simple; for a structure named 'foo', the command would look something like this:
$ pahole --reorganize --show_reorg_steps -C foo <path to module>
The 'pahole' tool has many other modes available, including some that will list all the structures declared in the module and the amount of padding in each one that could possibly be recovered.
Make sure you are aware of the string and data handling functions that exist within Asterisk to enhance portability and in some cases to produce more secure and thread-safe code. Check utils.c/utils.h for these.
If you need to create a detached thread, use the ast_pthread_create_detached() normally or ast_pthread_create_detached_background() for a thread with a smaller stack size. This reduces the replication of the code to handle the pthread_attr_t structure.
Roughly, Asterisk code formatting guidelines are generally equivalent to the following:
# indent -i4 -ts4 -br -brs -cdw -lp -ce -nbfda -npcs -nprs -npsl -nbbo -saf -sai -saw -cs -l90 foo.c
this means in verbose:
Function calls and arguments should be spaced in a consistent way across the codebase.
Don't treat keywords (if, while, do, return) as if they were functions; leave space between the keyword and the expression used (if any). For 'return', don't even put parentheses around the expression, since they are not required.
There is no shortage of whitespace characters Use them when they make the code easier to read. For example:
is harder to read than
Following are examples of how code should be formatted.
Always use braces around the statements following an if/for/while construct, even if not strictly necessary, as it reduces future possible problems.
Don't build code like this:
Instead, try to minimize the number of lines of code that need to be indented, by only indenting the shortest case of the 'if' statement, like so:
When this technique is used properly, it makes functions much easier to read and follow, especially those with more than one or two 'setup' operations that must succeed for the rest of the function to be able to execute.
Proper use of this technique may occasionally result in the need for a label/goto combination so that error/failure conditions can exit the
function while still performing proper cleanup. This is not a bad thing! Use of goto in this situation is encouraged, since it removes the need
for excess code indenting without requiring duplication of cleanup code.
Do not explicitly cast 'void *' into any other type, nor should you cast any other type into 'void *'. Implicit casts to/from 'void *' are explicitly allowed by the C specification. This means the results of malloc(), calloc(), alloca(), and similar functions do not ever need to be cast to a specific type, and when you are passing a pointer to (for example) a callback function that accepts a 'void *' you do not need to cast into that type.
All public functions (those not marked 'static'), must be named "ast_<something>" and have a descriptive name.
As an example, suppose you wanted to take a local function "find_feature", defined as static in a file, and used only in that file, and make it public, and use it in other files. You will have to remove the "static" declaration and define a prototype in an appropriate header file (usually in include/asterisk). A more specific name should be given, such as "ast_find_call_feature".
Functions with a variable amount of arguments need a 'sentinel' when called. Newer GNU C compilers are fine if you use NULL for this. Older versions (pre 4) don't like this. You should use the constant SENTINEL. This one is defined in include/asterisk/compiler.h
Name global variables (or local variables when you have a lot of them or are in a long function) something that will make sense to aliens who
find your code in 100 years. All variable names should be in lower case, except when following external APIs or specifications that normally
use upper- or mixed-case variable names; in that situation, it is preferable to follow the external API/specification for ease of understanding.
Make some indication in the name of global variables which represent options that they are in fact intended to be global.
static char global_something
Don't use 'typedef' just to shorten the amount of typing; there is no substantial benefit in this:
; typedef struct foo foo_t;}}
In fact, don't use 'variable type' suffixes at all; it's much preferable to just type 'struct foo' rather than 'foo_s'.
Use enums rather than long lists of #define-d numeric constants when possible; this allows structure members, local variables and function arguments to be declared as using the enum's type. For example:
Note: The compiler will not force you to pass an entry from the enum as an argument to this function; this recommendation serves only to make
the code clearer and somewhat self-documenting. In addition, when using switch/case blocks that switch on enum values, the compiler will warn you if you forget to handle one or more of the enum values, which can be handy.
Don't use strncpy for copying whole strings; it does not guarantee that the output buffer will be null-terminated. Use ast_copy_string instead, which is also slightly more efficient (and allows passing the actual buffer size, which makes the code clearer).
Don't use ast_copy_string (or any length-limited copy function) for copying fixed (known at compile time) strings into buffers, if the buffer is something that has been allocated in the function doing the copying. In that case, you know at the time you are writing the code whether the buffer is large enough for the fixed string or not, and if it's not, your code won't work anyway! Use strcpy() for this operation, or directly set the first two characters of the buffer if you are just trying to store a one character string in the buffer. If you are trying to 'empty' the buffer, just store a single NULL character ('\0') in the first byte of the buffer; nothing else is needed, and any other method is wasteful.
In addition, if the previous operations in the function have already determined that the buffer in use is adequately sized to hold the string
you wish to put into it (even if you did not allocate the buffer yourself), use a direct strcpy(), as it can be inlined and optimized to simple
processor operations, unlike ast_copy_string().
When converting from strings to integers or floats, use the sscanf function in preference to the atoi and atof family of functions, as sscanf detects errors. Always check the return value of sscanf to verify that your numeric variables successfully scanned before using them. Also, to avoid a potential libc bug, always specify a maximum width for each conversion specifier, including integers and floats. A good length for both integers and floats is 30, as this is more than generous, even if you're using doubles or long integers.
For the sake of uclibc, do not use index, bcopy or bzero; use strchr(), memset(), and memmove() instead. uclibc can be configured to supply these functions, but we can save these users time and consternation if we abstain from using these functions.
When making applications, always ast_strdupa(data) to a local pointer if you intend to parse the incoming data string.
Use the argument parsing macros to declare arguments and parse them, i.e.:
Make sure you are not duplicating any functionality already found in an API call somewhere. If you are duplicating functionality found in
another static function, consider the value of creating a new API call which can be shared.
Use const on pointer arguments which your function will not be modifying, as this allows the compiler to make certain optimizations. In general, use 'const' on any argument that you have no direct intention of modifying, as it can catch logic/typing errors in your code when you use the argument variable in a way that you did not intend.
As a common example of this point, make an effort to use the lockable linked-list macros found in include/asterisk/linkedlists.h. They are
efficient, easy to use and provide every operation that should be necessary for managing a singly-linked list (if something is missing,
let us know!). Just because you see other open-coded list implementations in the source tree is no reason to continue making new copies of
that code... There are also a number of common string manipulation and timeval manipulation functions in asterisk/strings.h and asterisk/time.h;
use them when possible.
When allocating/zeroing memory for a structure, use code like this:
Avoid the combination of ast_malloc() and memset(). Instead, always use ast_calloc(). This will allocate and zero the memory in a single operation. In the case that uninitialized memory is acceptable, there should be a comment in the code that states why this is the case.
Using sizeof(*tmp) instead of sizeof(struct foo) eliminates duplication of the 'struct foo' identifier, which makes the code easier to read and also ensures that if it is copy-and-pasted it won't require as much editing.
The ast_* family of functions for memory allocation are functionally the same. They just add an Asterisk log error message in the case that the allocation fails for some reason. This eliminates the need to generate custom messages throughout the code to log that this has occurred.
The functions strdup and strndup can not accept a NULL argument. This results in having code like this:
However, the ast_strdup and ast_strdupa functions will happily accept a NULL
argument without generating an error. The same code can be written as:
Furthermore, it is unnecessary to have code that malloc/calloc's for the length of a string (+1 for the terminating '\0') and then using strncpy to copy the copy the string into the resulting buffer. This is the exact same thing as using ast_strdup.
New CLI commands should be named using the module's name, followed by a verb and then any parameters that the command needs. For example:
There are two methods of adding functionality to the Asterisk dialplan: applications and functions. Applications (found generally in
the apps/ directory) should be collections of code that interact with a channel and/or user in some significant way. Functions (which can be
provided by any type of module) are used when the provided functionality is simple... getting/retrieving a value, for example. Functions should also be used when the operation is in no way related to a channel (a computation or string operation, for example).
Applications are registered and invoked using the ast_register_application function; see the apps/app_skel.c file for an example.
Functions are registered using 'struct ast_custom_function' structures and the ast_custom_function_register function.
When writing Asterisk API documentation the following format should be followed. Do not use the javadoc style.
Notice the use of the \param, \brief, and \return constructs. These should be used to describe the corresponding pieces of the function being documented. Also notice the blank line after the last \param directive. All doxygen comments must be in one /*! */ block. If the function or struct does not need an extended description it can be left out.
Please make sure to review the doxygen manual and make liberal use of the \a, \code, \c, \b, \note, \li and \e modifiers as appropriate.
When documenting a 'static' function or an internal structure in a module, use the \internal modifier to ensure that the resulting documentation explicitly says 'for internal use only'.
When adding new API you should also attach a \since note because this will indicate to developers that this API did not exist before this version. It also has the benefit of making the resulting HTML documentation to group the changes for a single version.
Structures should be documented as follows.
Note that /! */ blocks document the construct immediately following them unless they are written, /!< */, in which case they document the construct preceding them.
It is very much preferred that documentation is not done inline, as done in the previous example for member2. The first reason for this is that it tends to encourage extremely brief, and often pointless, documentation since people try to keep the comment from making the line extremely long. However, if you insist on using inline comments, please indent the documentation with spaces! That way, all of the comments are properly aligned, regardless of what tab size is being used for viewing the code.
Before submitting a patch, read the actual patch file to be sure that all the changes you expect to be there are, and that there are no
surprising changes you did not expect. During your development, that part of Asterisk may have changed, so make sure you compare with the latest SVN.
If you are asked to make changes to your patch, there is a good chance the changes will introduce bugs, check it even more at this stage.
Also remember that the bug marshal or co-developer that adds comments is only human, they may be in error
If you create new AMI events, please read manager.txt. Do not re-use existing headers for new purposes, but please re-use existing headers
for the same type of data.
Manager events that signal a status are required to have one event name, with a status header that shows the status. The old style, with one event named "ThisEventOn" and another named "ThisEventOff", is no longer approved.
Check manager.txt for more information on manager and existing headers. Please update this file if you add new headers.
Asterisk is a heavily multithreaded application. It makes extensive use of locking to ensure safe access to shared resources between different threads.
When more that one lock is involved in a given code path, there is the potential for deadlocks. A deadlock occurs when a thread is stuck waiting for a resource that it will never acquire. Here is a classic example of a deadlock:
In this case, there is a deadlock between threads 1 and 2. This deadlock would have been avoided if both threads had agreed that one must acquire Lock A before Lock B.
In general, the fundamental rule for dealing with multiple locks is: an order must be established to acquire locks, and then all threads must respect that order when acquiring locks.
Because any ordering for acquiring locks is ok, one could establish the rule arbitrarily, e.g. ordering by address, or by some other criterion.
The main issue, though, is defining an order that
- is easy to check at runtime;
- reflects the order in which the code executes.
As an example, if a data structure B is only accessible through a data structure A, and both require locking, then the natural order is locking first A and then B. As another example, if we have some unrelated data structures to be locked in pairs, then a possible order can be based on the address of the data structures themselves.
The #1 cause of deadlocks in Asterisk is by not properly following the locking rules that exist at the boundary between Channel Drivers and
the Asterisk core. The Asterisk core allocates an ast_channel, and Channel Drivers allocate "technology specific private data" (PVT) that is
associated with an ast_channel. Typically, both the ast_channel and PVT have their own lock. There are many code paths that require both objects to be locked.
The locking order in this situation is the following:
Channel Drivers implement the ast_channel_tech interface to provide a channel implementation for Asterisk. Most of the channel_tech
interface callbacks are called with the associated ast_channel locked. When accessing technology specific data, the PVT can be locked directly because the locking order is respected.
There are some code paths which make it extremely difficult to respect the locking order.
Consider for example the following situation:
- A message comes in over the "network"
- The Channel Driver (CD) monitor thread receives the message
- The CD associates the message with a PVT and locks the PVT
- While processing the message, the CD must do something that requires locking the ast_channel associated to the PVT
This is the point that must be handled carefully.
The following psuedo-code
is not correct for two reasons:
- first and foremost, unlocking the PVT means that other threads can acquire the lock and believe it is safe to modify the associated data. When reacquiring the lock, the original thread might find unexpected changes in the protected data structures. This essentially means that the original thread must behave as if the lock on the pvt was not held, in which case it could have released it itself altogether;
- Asterisk uses the so called "recursive" locks, which allow a thread to issue a lock() call multiple times on the same lock. Recursive locks count the number of calls, and they require an equivalent number of unlock() to be actually released.
For this reason, just calling unlock() once does not guarantee that the lock is actually released – it all depends on how many times lock() was called before.
An alternative, but still incorrect, construct is widely used in the asterisk code to try and improve the situation:
Here the trylock() is non blocking, so we do not deadlock if the ast_channel is already locked by someone else: in this case, we try to unlock the PVT (which happens only if the PVT lock counter is 1), yield the CPU to give other threads a chance to run, and then acquire the lock again.
This code is not correct for two reasons:
- same as in the previous example, it releases the lock when the thread probably did not expect it;
- if the PVT lock counter is greater than 1 we will not really release the lock on the PVT. We might be lucky and have the other contender actually release the lock itself, and so we will "win" the race, but if both contenders have their lock counts > 1 then they will loop forever (basically replacing deadlock with livelock).
Another variant of this code is the following:
which has the same issues as the while(trylock...) code, but just deadlocks instead of looping forever in case of lock counts > 1.
The deadlock/livelock could be in principle spared if one had an unlock_all() function that calls unlock as many times as needed to actually release the lock, and reports the count. Then we could do:
The issue with unexpected unlocks remains, though.
The next situation to consider is what to do when you need a lock on multiple ast_channels (or multiple unrelated data structures).
If we are sure that we do not hold any of these locks, then the following construct is sufficient:
That type of code would follow an established locking order of always locking the channel that has a lower address first. Also keep in mind
that to use this construct for channel locking, one would have to go through the entire codebase to ensure that when two channels are locked,
this locking order is used.
However, if we enter the above section of code with some lock held (which would be incorrect using non-recursive locks, but is completely legal using recursive mutexes) then the locking order is not guaranteed anymore because it depends on which locks we already hold. So we have to go through the same tricks used for the channel+PVT case.
As you can see from the above discussion, getting locking right is all but easy. So please follow these recommendations when using locks:
Please try to use locks only when strictly necessary, and only for the minimum amount of time required to run critical sections of code. A common use of locks in the current code is to protect a data structure from being released while you use it. With the use of reference-counted objects (astobj2) this should not be necessary anymore.
If possible, do not run any blocking code while holding a lock,
because you will also block other threads trying to access the same
lock. In many cases, you can hold a reference to the object to avoid
that it is deleted while you sleep, perhaps set a flag in the object
itself to report other threads that you have some pending work to
complete, then release and acquire the lock around the blocking path,
checking the status of the object after you acquire the lock to make
sure that you can still perform the operation you wanted to.
Recursive locks are very convenient when coding, as you don't have to
worry, when entering a section of code, whether or not you already
hold the lock – you can just protect the section with a lock/unlock
pair and let the lock counter track things for you.
But as you have seen, exploiting the features of recursive locks
make it a lot harder to implement proper deadlock avoidance strategies.
So please try to analyse your code and determine statically whether you
already hold a lock when entering a section of code.
If you need to call some function foo() with and without a lock held,
you could define two function as below:
and call them according to the needs.
Please document the locking order rules are documented for every lock introduced into Asterisk. This is done almost nowhere in the existing code. However, it will be expected to be there for newly introduced code. Over time, this information should be added for all of the existing lock usage.
The asterisk build architecture relies on autoconf to detect the system configuration, and on a locally developed tool (menuselect) to select build options and modules list, and on gmake to do the build.
The first step, usually to be done soon after a checkout, is running "./configure", which will store its findings in two files:
- contains C macros, normally #define HAVE_FOO or HAVE_FOO_H , for all functions and headers that have been detected at build time. These are meant to be used by C or C++ source files.
- contains variables that can be used by Makefiles. In addition to the usual CC, LD, ... variables pointing to the various build tools, and prefix, includedir ... which are useful for generic compiler flags, there are variables for each package detected. These are normally of the form FOO_INCLUDE=... FOO_LIB=... FOO_DIR=... indicating, for each package, the useful libraries and header files.
The next step is to run "make menuselect", to extract the dependencies existing between files and modules, and to store build options.
menuselect produces two files, both to be read by the Makefile:
- Contains for each subdirectory a list of modules that must be excluded from the build, plus some additional informatiom.
- Contains, for each module, a list of packages it depends on. For each of these packages, we can collect the relevant INCLUDE and LIB files from makeopts. This file is based on information in the .c source code files for each module.
The top level Makefile is in charge of setting up the build environment, creating header files with build options, and recursively invoking the
subdir Makefiles to produce modules and the main executable.
The sources are split in multiple directories, more or less divided by module type (apps/ channels/ funcs/ res/ ...) or by function, for the main binary (main/ pbx/).
TO BE COMPLETED