History
Media formats were originally represented using a bit field. This was fast but had limitations: the number of formats was capped by the width of the field, and a format could carry no attribute information. It was strictly a "this is ulaw". To address this, ast_format was created, a structure that contains additional information. In addition, ast_format_cap was created to act as a container, and a mechanism was added to allow format negotiation logic to be registered. The codebase was changed throughout to use this strategy, but unfortunately this came at a cost.
Performance analysis and profiling have shown that we spend an inordinate amount of time comparing, copying, and generally manipulating formats and their related structures. Basic prototyping has shown that there is a reasonably large performance improvement to be made in this area. This project overhauls the media format architecture and its usage in Asterisk to improve performance.
Use Cases
The following, for the most part, assumes that the channels use RTP for media and SIP for signalling. Most use cases, however, will translate to any VoIP channel driver. DAHDI, as always, is its own thing.
General Rules
- For an inbound channel with a set of format capabilities, Asterisk should respond to that set of formats with the intersection of the offered capabilities and what is configured for the endpoint for that channel. The format preference order should be based on the configuration of the endpoint.
- If the system should accept a different set of codecs, a dialplan function and/or channel variable can be used to set which formats (and their preference order) are accepted on the channel at run-time. This would have to occur before the inbound channel is answered (via the MASTER_CHANNEL function and the U/M options in the dialplan).
- If the system would like to restrict the device to a single format, a dialplan function and/or channel variable and/or configuration option can be set so that Asterisk will only ever respond with the preferred codec.
- For an outbound channel, Asterisk should offer the formats that have been configured for that endpoint, with the format preference order based on the configuration of that channel's endpoint.
- If the system would like to restrict the outbound channel such that it only has a single format, a dialplan function/channel variable/configuration option can be used such that Asterisk only offers a single format.
- Prior to entering a bridge, a dialplan function can be used to set whether or not that channel will attempt to make itself compatible with whatever is in the bridge with it. If a channel enters a bridge that has another channel in it with a format it supports, it will attempt to switch the channel to the bridged channel's format to facilitate native bridging. Note that this has no bearing in multi-party bridges, where everyone is transcoded.
- At any point in time, a dialplan function can be used to set the allowed set of formats on the channel, with whatever ordering. These formats should be a subset of the allowed formats configured on that channel's endpoint. This will cause the channel to re-negotiate to the set of formats specified by the function.
The difference with this approach is that Asterisk will no longer always attempt to avoid transcoding. Instead, it will default to the rules configured in the .conf files, overriding as it can via the dialplan. Transcoding may be more likely in poorly configured systems, but it will also allow for much greater flexibility in the behaviour of Asterisk.
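The first general rule (answer with the intersection of the offer and the endpoint configuration, ordered by the endpoint's preference) can be sketched as follows. This is a minimal standalone illustration, not Asterisk code: codecs are plain strings, and the names codec_in_list and negotiate_inbound are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `name` appears in `list`, else 0. */
int codec_in_list(const char *name, const char *const *list, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (strcmp(list[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}

/*
 * Build the answer for an inbound offer: walk the endpoint's configured
 * codecs in preference order and keep those that were also offered.
 * Returns the number of codecs written to `answer`. A result of 0 means
 * nothing is acceptable (the 488 case in the SIP use cases).
 */
size_t negotiate_inbound(const char *const *configured, size_t n_configured,
                         const char *const *offered, size_t n_offered,
                         const char **answer, size_t answer_cap)
{
    size_t count = 0;

    assert(configured != NULL && offered != NULL && answer != NULL);
    for (size_t i = 0; i < n_configured && count < answer_cap; i++) {
        if (codec_in_list(configured[i], offered, n_offered)) {
            answer[count++] = configured[i];
        }
    }
    return count;
}
```

With an endpoint configured for g722,ulaw,alaw and an offer of ulaw,g729,g722, this sketch yields g722,ulaw: g729 is dropped because it is not configured, and the order follows the endpoint rather than the offer.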
Single Channel
Nominal Offer/Answer (Single Media Stream)
Offer Negotiation - Nominal
- Alice's phone offers some set of codecs in an INVITE request (example: ulaw,g729,ilbc), where all codecs are supported by Alice's endpoint
- Asterisk responds with an answer containing the codecs in the order specified by the offer
Offer Negotiation - Subset (Alice)
- Alice's phone offers a set of codecs in an INVITE request, where a subset of the codecs is supported by Alice's endpoint and some subset is not
- Asterisk modifies the origin line in the SDP, and responds with the set of codecs that are allowed based on the intersection of the offered codecs and the configured codecs for the endpoint
Offer Negotiation - Subset (Asterisk)
- Alice's phone offers a set of codecs in an INVITE request, where the codecs offered are a subset of the codecs supported by Alice's endpoint
- Asterisk responds with an answer containing the codecs in the order specified by the offer
Offer Negotiation - Preferred Codec Only (Alice's preference)
- Alice's phone offers a set of codecs in an INVITE request, where all codecs are supported by Alice's endpoint
- Asterisk modifies the origin line in the SDP, and responds with only the preferred codec in the offer
Offer Negotiation - Preferred Codec Only (Asterisk's preference)
- Alice's phone offers a set of codecs in an INVITE request, where all codecs are supported by Alice's endpoint
- Asterisk modifies the origin line in the SDP, and responds with only the preferred codec configured via the dialplan/configuration file
Offer Negotiation - Preferred Codec List
- Alice's phone offers a set of codecs in an INVITE request, where all codecs are supported by Alice's endpoint
- Asterisk modifies the origin line in the SDP, and responds with a subset of the codecs in the offer re-ordered per the preference order defined via the dialplan/configuration file
Offer Negotiation - packetization
- Alice's phone offers a set of codecs in an INVITE request, where the preferred codec has a ptime attribute indicating a different packetization
- Asterisk responds with the codecs in the offer, and sends RTP to the endpoint with the appropriate packetization
Nominal Offer/Answer (Multiple Media Streams)
All use cases covered in Nominal Offer/Answer (Single Media Stream) apply here as well, save that there should be multiple streams of different types. Asterisk should treat the preferred codec offer in the same fashion for each stream independently; that is, if the preferred codec list is ulaw,g722,h261,h264, then the preferred audio codec is ulaw and the preferred video codec is h261.
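The per-stream selection described above can be sketched as picking the first entry of each media type from a single preference list. The types and the function sketch_preferred_index are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_media_type { SKETCH_MEDIA_AUDIO, SKETCH_MEDIA_VIDEO };

/* One entry of a mixed audio/video preference list (hypothetical type). */
struct sketch_codec_pref {
    const char *name;
    enum sketch_media_type type;
};

/*
 * Return the index of the first codec of the requested media type in
 * preference order, or -1 if the list has no codec of that type.
 */
int sketch_preferred_index(const struct sketch_codec_pref *prefs, size_t len,
                           enum sketch_media_type type)
{
    assert(prefs != NULL);
    for (size_t i = 0; i < len; i++) {
        if (prefs[i].type == type) {
            return (int)i;
        }
    }
    return -1;
}
```

For the list ulaw,g722,h261,h264 this returns index 0 (ulaw) for audio and index 2 (h261) for video, matching the example in the text.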
Restricted Offer/Answer (Single Stream)
No codecs
- Alice's phone offers no codecs in an INVITE request with an SDP.
- Asterisk responds with an equivalent answer.
Restricted flow
- Alice's phone offers a set of codecs in an INVITE request, where the media is set to sendonly (phone => Asterisk)
- Asterisk responds with the codecs in the offer, where the media is set to recvonly.
Restricted Offer/Answer (Multiple Streams)
Each scenario in Restricted Offer/Answer applies, only with both an audio stream and a video stream. Either or both streams can be indicated to be sendonly, or can be sent with no codecs.
Invalid Format Offer/Answer (Single Stream)
Offer Invalid Codec (one)
- Alice's phone offers a set of codecs in an INVITE request, where at least one codec is not supported by Alice's endpoint
- Asterisk responds with the subset of the codecs that were offered that it does support, using the preference order of the offer
Offer Invalid Codec (all)
- Alice's phone offers a set of codecs in an INVITE request, where none of the codecs are supported by Alice's endpoint
- Asterisk responds with a 488.
Offer Invalid Attribute
- Alice's phone offers a set of codecs, where additional attributes are provided that are invalid:
- An invalid rtpmap attribute for an unknown media format
- An invalid attribute (or unknown attribute) for a known media format
- An improperly formatted media description line
- Asterisk responds with a 488.
Invalid Format Offer/Answer (Multiple Stream)
All of the scenarios in Invalid Format Offer/Answer apply, only with a single audio and a single video stream. Streams can be declined, or the entire offer can be declined with a 488 as appropriate.
Multiple Channels
Nominal Offer/Answer
- Alice sends an INVITE request with an offer containing a set of codecs. The offer is a complete match with the set of codecs configured for Alice's endpoint.
- Asterisk dials Bob. Bob's endpoint is configured with the same set of codecs, in the same order.
- Bob's response to the INVITE request contains the same set of codecs as the offer. Asterisk responds to Alice with her set of configured codecs.
- Alice and Bob are bridged using the same formats, with the same priority order.
Preferred codec only
Preferred codec only in outbound answer
- Alice sends an INVITE request with an offer containing a set of codecs. The offer is a complete match with the set of codecs configured for Alice's endpoint.
- Asterisk dials Bob with his endpoint's codecs.
- Bob's response contains only a single codec. Asterisk uses that format for Bob's channel.
- Alice's reply contains her codecs in the order specified by her endpoint.
Preferred codec only in inbound answer
- Alice sends an INVITE request with an offer containing a set of codecs configured for Alice's endpoint.
- The dialplan restricts Asterisk to responding only with Alice's preferred codec.
- Asterisk dials Bob with his endpoint's codecs.
- Bob responds with an acceptable set of codecs.
- Asterisk sends an answer to Alice's offer with only her endpoint's preferred codec.
Transcoding
Acceptable translation path
- Alice sends an INVITE request with an offer containing a set of codecs configured for Alice's endpoint.
- Asterisk dials Bob with his endpoint's codecs, where the codecs for Bob's endpoint are not the same as Alice's but have an acceptable translation path.
- Bob answers with his endpoint's codecs.
- Asterisk sends an answer to Alice's offer with the codecs for her endpoint.
- Alice and Bob enter a bridge together and their media is transcoded based on the current formats sent by their endpoints.
Failed translation path (no path exists)
- Alice sends an INVITE request with an offer containing a set of codecs configured for Alice's endpoint.
- Asterisk determines that there is no translation path between the codecs configured for Alice and the codecs configured for Bob
- Alice's offer is rejected; Bob is not dialled.
Failed translation paths
- Alice sends an INVITE request with an offer containing a set of codecs configured for Alice's endpoint
- Asterisk dials Bob with his endpoint's codecs, where the codecs for Bob's endpoint are not the same as Alice's. For each codec that does not have a translation path to Alice's codecs, those codecs are not offered.
- Bob responds with one of these acceptable codecs. Asterisk responds to Alice with her codecs.
- Alice and Bob enter a bridge together and their media is transcoded based on the current formats sent by their endpoints.
Re-Invite to Native Bridge
Nominal
- Alice sends an INVITE request with a different ordered set of codecs than Bob.
- Alice's channel is set to re-INVITE back to native bridging if possible.
- Asterisk dials Bob with his set of codecs in his endpoint's priority order.
- Bob responds back with a set of codecs. The set of codecs should have at least one format in common.
- Asterisk answers Alice with her preferred codecs.
- Alice and Bob enter a bridge together. Asterisk sends a re-INVITE to Alice and to Bob with the formats that are in common.
- Alice and Bob respond to the re-INVITE with a 200 OK
- Asterisk switches to a native bridge
Failed re-INVITE
- Alice sends an INVITE request with a different ordered set of codecs than Bob.
- Alice's channel is set to re-INVITE back to native bridging if possible.
- Asterisk dials Bob with his set of codecs in his endpoint's priority order.
- Bob responds back with a set of codecs. The set of codecs should have at least one format in common.
- Asterisk answers Alice with her preferred codecs.
- Alice and Bob enter a bridge together. Asterisk sends a re-INVITE to Alice and to Bob with the formats that are in common.
- Alice responds to the re-INVITE with a failure response (488)
- Asterisk sends an UPDATE request (if Alice/Bob support it) with the previous SDP (see RFC 6337, section 3.4)
- Asterisk transcodes media between Alice and Bob
Modified outbound invite
- Alice sends an INVITE request with a set of codecs.
- Prior to dialling Bob, PJSIP_MEDIA_OFFER modifies which codecs will be offered. (Alternatively, the CHANNEL function in a pre-dial handler)
- Asterisk sends an INVITE request with the codecs specified, regardless of whether or not Bob's endpoint supports them.
Modified inbound response
- Alice sends over an INVITE request with a set of codecs.
- Prior to being answered, the CHANNEL function changes what media formats are accepted. Note that this must be a subset of what Alice's endpoint accepts.
- Asterisk responds with the formats the CHANNEL function specified
Modified codecs (chan_sip)
Design
The Present
struct ast_format
Media formats in Asterisk are now represented using a large (~320 byte) data structure. The structure is this large because it must reserve space inline for optional media format attributes: it is not a reference counted object, so there is no guarantee that separately attached information would be disposed of.
ast_format_copy
Copying formats now requires copying this large amount of memory. While one might expect this to be infrequent, in practice it can occur more than 5 times for a single frame passing through Asterisk.
ast_format_cmp
Comparing formats is no longer cheap either. Each comparison requires a container lookup to see if any additional logic is registered to augment the comparison operation. Because code within Asterisk needs to be aware of when formats change, this can occur 4 or more times for a single frame passing through Asterisk.
Container lookup using ast_format as key
Since comparing formats is no longer cheap, using an ast_format as a container key is extremely expensive.
Assumptions no longer true
There are assumptions throughout the tree that media format related operations are cheap when in reality they are anything but. An example is reusable frames: instead of being set up once at initialization with the information they require, they are set up again each time the frame is returned.
The Future
Codecs
While past media work has given us room to add codecs within the codebase, there is no way to do so dynamically. For efficient storage of media formats this will need to change. The ability to add codecs to the core will be made available, with the core adding the common codecs that Asterisk is already aware of. The RTP engine API will also be extended to allow SDP specific information to be added. This will provide a truly dynamic and flexible way of adding codecs, with the added benefit that the numerical values for codecs can be used as indexes into arrays.
Codec structures will be immutable once registered and created only once. If a user of the API wants to retrieve a codec they will use ast_codec_get with the provided information.
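The registration scheme described above might look roughly like the following standalone sketch. The text names ast_codec_get but does not give its signature, so the sketch_ prefixed names, the struct fields, and the lookup key (name plus sample rate) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_CODECS 64   /* hypothetical fixed limit for the sketch */

struct sketch_codec {
    unsigned int id;           /* assigned at registration; usable as an array index */
    const char *name;
    unsigned int sample_rate;
};

static struct sketch_codec sketch_registry[SKETCH_MAX_CODECS];
static size_t sketch_registry_len;

/* Look up a registered codec; returns NULL if it is unknown. */
const struct sketch_codec *sketch_codec_get(const char *name, unsigned int sample_rate)
{
    assert(name != NULL);
    for (size_t i = 0; i < sketch_registry_len; i++) {
        if (sketch_registry[i].sample_rate == sample_rate
            && strcmp(sketch_registry[i].name, name) == 0) {
            return &sketch_registry[i];
        }
    }
    return NULL;
}

/*
 * Register a codec once. Registering the same codec again returns the
 * existing entry, so each codec keeps a single id. Entries are handed out
 * as const pointers and never modified after registration.
 */
const struct sketch_codec *sketch_codec_register(const char *name, unsigned int sample_rate)
{
    const struct sketch_codec *existing = sketch_codec_get(name, sample_rate);
    struct sketch_codec *codec;

    if (existing != NULL) {
        return existing;
    }
    if (sketch_registry_len == SKETCH_MAX_CODECS) {
        return NULL;
    }
    codec = &sketch_registry[sketch_registry_len];
    codec->id = (unsigned int)sketch_registry_len;
    codec->name = name;
    codec->sample_rate = sample_rate;
    sketch_registry_len++;
    return codec;
}
```

Because ids are handed out densely from zero, other structures can use them directly as array indexes, which is the property the later sections rely on.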
struct ast_format
The ast_format structure will become an astobj2 allocated object as follows:
Because it is astobj2 allocated, additional information can be stored within it, such as a pointer to attribute information and a pointer to the attribute interface to use with it. This reduces the size of the structure considerably and removes the need for container lookups on comparison.
This structure will also be immutable once exposed outside the scope of what has allocated it by any means (such as being stored in an ast_format_cap and then returned).
Another bonus is that for some cases the format structure can be reused, such as when parsing and interpreting "allow" or "disallow" options. This reduces memory usage some more.
Attribute information storage will be left up to the attribute interface implementation.
struct ast_format_cap
The ast_format_cap structure currently internally uses an ao2 hash table to store formats. Leveraging the fact that codecs have a unique identifier we can turn this into a vector with the codec identifier as the index.
This presents an easy mechanism to see if a format is present in the structure.
The framing and preference order is now also made available directly in the cap structure itself, allowing this information to persist in additional places.
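A minimal sketch of the indexed storage idea, assuming a growable array whose slot number is the codec identifier. It leaves out framing, preference order, and reference counting, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for a format; only the codec id matters for this sketch. */
struct sketch_format {
    unsigned int codec_id;
};

struct sketch_cap {
    const struct sketch_format **by_codec_id;  /* slot i: format for codec id i, or NULL */
    size_t slots;
};

/* Store `format` in the slot for its codec id, growing the array if needed. */
int sketch_cap_add(struct sketch_cap *cap, const struct sketch_format *format)
{
    assert(cap != NULL && format != NULL);
    if (format->codec_id >= cap->slots) {
        size_t new_slots = (size_t)format->codec_id + 1;
        const struct sketch_format **grown =
            realloc(cap->by_codec_id, new_slots * sizeof(*grown));

        if (grown == NULL) {
            return -1;
        }
        for (size_t i = cap->slots; i < new_slots; i++) {
            grown[i] = NULL;
        }
        cap->by_codec_id = grown;
        cap->slots = new_slots;
    }
    cap->by_codec_id[format->codec_id] = format;
    return 0;
}

/* Presence check: a bounds test and one array read, no hashing or comparison. */
int sketch_cap_has(const struct sketch_cap *cap, unsigned int codec_id)
{
    assert(cap != NULL);
    return codec_id < cap->slots && cap->by_codec_id[codec_id] != NULL;
}
```

The trade-off in this sketch is memory for speed: a capabilities structure holding only a high-numbered codec still allocates every slot below it.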
struct ast_format_pref
The ast_format_pref structure currently uses a fixed-size array of formats (not pointers). This structure is no longer required since the framing and preference order have now been moved into the cap structure directly.
ast_format_copy
The ast_format_copy operation will simply be incrementing the reference count of the format and returning it.
ast_format_cmp
Comparing formats is similar to the previous implementation, except that instead of doing a container lookup, the pointer to the attribute interface is now directly on the structure.
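The copy and compare operations described in the last two sections can be sketched as below. This is not the astobj2 implementation: the reference count is a plain int (real code would need an atomic count), the struct layout is an assumption, and the example attribute interface compares a single integer attribute.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_format;

/* Hypothetical attribute interface: codec-specific comparison logic. */
struct sketch_format_interface {
    int (*attributes_equal)(const struct sketch_format *a, const struct sketch_format *b);
};

struct sketch_format {
    int refcount;
    unsigned int codec_id;
    void *attribute_data;                         /* owned by the interface */
    const struct sketch_format_interface *iface;  /* NULL if the codec has no attributes */
};

/* "Copying" is just taking another reference to the same object. */
struct sketch_format *sketch_format_copy(struct sketch_format *format)
{
    assert(format != NULL && format->refcount > 0);
    format->refcount++;
    return format;
}

/* Returns 1 if the formats are equal, 0 otherwise. No container lookup. */
int sketch_format_equal(const struct sketch_format *a, const struct sketch_format *b)
{
    assert(a != NULL && b != NULL);
    if (a == b) {
        return 1;
    }
    if (a->codec_id != b->codec_id) {
        return 0;
    }
    if (a->iface == NULL) {
        return 1;
    }
    return a->iface->attributes_equal(a, b);
}

/* Example interface callback: attribute_data points at a single int. */
int sketch_int_attr_equal(const struct sketch_format *a, const struct sketch_format *b)
{
    const int *x = a->attribute_data;
    const int *y = b->attribute_data;

    return x != NULL && y != NULL && *x == *y;
}
```

The pointer check at the top of sketch_format_equal is what makes shared, immutable formats cheap to compare: two references to the same object are equal without touching the attribute interface at all.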
AST_CONTROL_FORMAT_CHANGE
If media is received with a format that differs from previous frames, an AST_CONTROL_FORMAT_CHANGE control frame will be inserted ahead of the new frame at the ingress point. Any code which relies on the format will use this control frame to update itself to the new format. No format comparisons should be used to determine this; the control frame should be used instead.
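A rough sketch of the ingress behaviour, using hypothetical frame and state types. Formats are represented by pointer identity, and the first frame on a stream is assumed not to trigger a change notification, since there is no previous frame to differ from.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_frame_type {
    SKETCH_FRAME_VOICE,
    SKETCH_FRAME_CONTROL_FORMAT_CHANGE
};

struct sketch_frame {
    enum sketch_frame_type type;
    const void *format;          /* pointer identity stands in for the format */
};

struct sketch_ingress {
    const void *current_format;  /* format of the last media frame seen, or NULL */
};

/*
 * Process one incoming media frame. Writes one or two frames to `out`:
 * a format-change control frame first if the format differs from the
 * previous frame, then the media frame itself. Returns the frame count.
 */
size_t sketch_ingress_read(struct sketch_ingress *in,
                           const struct sketch_frame *media,
                           struct sketch_frame out[2])
{
    size_t n = 0;

    assert(in != NULL && media != NULL && out != NULL);
    if (in->current_format != NULL && in->current_format != media->format) {
        out[n].type = SKETCH_FRAME_CONTROL_FORMAT_CHANGE;
        out[n].format = media->format;
        n++;
    }
    in->current_format = media->format;
    out[n++] = *media;
    return n;
}
```

In this sketch the single comparison at ingress is the only place a format change is detected; consumers downstream react to the control frame and never compare formats themselves.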
RTP Engine API
The RTP engine API currently uses an AO2 container for storing payload mapping information. Since this mapping occurs frequently this comes at a cost.
For mapping from payload to format a fixed array with the payload as the index will be used.
For mapping from format to payload a vector with the format codec id as the index will be used.
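The two lookup tables can be sketched with plain arrays. RTP payload types are 7 bits wide, so 128 slots cover every payload. The codec side is shown as a fixed array with a hypothetical upper bound for brevity, where the text describes a growable vector. All names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_RTP_MAX_PT   128   /* RTP payload types are 0..127 */
#define SKETCH_MAX_CODEC_ID 64    /* hypothetical bound for this sketch */
#define SKETCH_UNMAPPED     (-1)

struct sketch_rtp_map {
    int codec_for_payload[SKETCH_RTP_MAX_PT];
    int payload_for_codec[SKETCH_MAX_CODEC_ID];
};

/* Mark every payload and every codec id as unmapped. */
void sketch_rtp_map_init(struct sketch_rtp_map *map)
{
    assert(map != NULL);
    for (size_t i = 0; i < SKETCH_RTP_MAX_PT; i++) {
        map->codec_for_payload[i] = SKETCH_UNMAPPED;
    }
    for (size_t i = 0; i < SKETCH_MAX_CODEC_ID; i++) {
        map->payload_for_codec[i] = SKETCH_UNMAPPED;
    }
}

/* Record the mapping in both directions. Returns 0 on success, -1 if out of range. */
int sketch_rtp_map_set(struct sketch_rtp_map *map, int payload, int codec_id)
{
    assert(map != NULL);
    if (payload < 0 || payload >= SKETCH_RTP_MAX_PT
        || codec_id < 0 || codec_id >= SKETCH_MAX_CODEC_ID) {
        return -1;
    }
    map->codec_for_payload[payload] = codec_id;
    map->payload_for_codec[codec_id] = payload;
    return 0;
}

/* Payload to codec id: one bounds check and one array read. */
int sketch_rtp_codec_for_payload(const struct sketch_rtp_map *map, int payload)
{
    if (payload < 0 || payload >= SKETCH_RTP_MAX_PT) {
        return SKETCH_UNMAPPED;
    }
    return map->codec_for_payload[payload];
}

/* Codec id to payload: one bounds check and one array read. */
int sketch_rtp_payload_for_codec(const struct sketch_rtp_map *map, int codec_id)
{
    if (codec_id < 0 || codec_id >= SKETCH_MAX_CODEC_ID) {
        return SKETCH_UNMAPPED;
    }
    return map->payload_for_codec[codec_id];
}
```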
Format Usage
Creating a format
For cases where a format has to be created, a new API call, ast_format_create, will be made available. It takes a codec as its argument.
Example:
Setting attributes
Attribute information can be set on a format by using the ast_format_attribute_set function. To keep things dynamic, it takes both the attribute name and the attribute value as strings.
Example:
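As a standalone illustration of the create-then-set-attributes flow, the following toy model mirrors the shape of the two calls named above without using them. The real signatures of ast_format_create and ast_format_attribute_set are not given in the text, so the sketch_ functions, the inline key/value storage, and the size limits are all assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_ATTRS 4
#define SKETCH_ATTR_LEN  16

struct sketch_codec {
    unsigned int id;
    const char *name;
};

struct sketch_format {
    const struct sketch_codec *codec;
    size_t n_attrs;
    char names[SKETCH_MAX_ATTRS][SKETCH_ATTR_LEN];
    char values[SKETCH_MAX_ATTRS][SKETCH_ATTR_LEN];
};

/* Allocate a format for `codec` with no attributes. Caller frees with free(). */
struct sketch_format *sketch_format_create(const struct sketch_codec *codec)
{
    struct sketch_format *format;

    assert(codec != NULL);
    format = calloc(1, sizeof(*format));
    if (format != NULL) {
        format->codec = codec;
    }
    return format;
}

/* Return the value stored for `name`, or NULL if it has not been set. */
const char *sketch_format_attribute_get(const struct sketch_format *format, const char *name)
{
    for (size_t i = 0; i < format->n_attrs; i++) {
        if (strcmp(format->names[i], name) == 0) {
            return format->values[i];
        }
    }
    return NULL;
}

/* Set or overwrite an attribute. Returns 0 on success, -1 if it does not fit. */
int sketch_format_attribute_set(struct sketch_format *format, const char *name, const char *value)
{
    size_t slot;

    if (strlen(name) >= SKETCH_ATTR_LEN || strlen(value) >= SKETCH_ATTR_LEN) {
        return -1;
    }
    for (slot = 0; slot < format->n_attrs; slot++) {
        if (strcmp(format->names[slot], name) == 0) {
            break;
        }
    }
    if (slot == format->n_attrs) {
        if (format->n_attrs == SKETCH_MAX_ATTRS) {
            return -1;
        }
        strcpy(format->names[slot], name);
        format->n_attrs++;
    }
    strcpy(format->values[slot], value);
    return 0;
}
```

A caller would create the format from a codec, set attributes such as a "mode" of "20" while the format is still private, and only then publish it, in keeping with the immutability rule stated earlier.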
Format Capabilities Usage
Creating and destroying format capabilities structure
The function to allocate a capabilities structure is unchanged but the format capabilities structure is now a reference counted object to reduce copying. As a result there is no explicit function to destroy a structure.
Example:
Adding a format to the capabilities structure
This is slightly changed from the existing API in that the format passed in is not const. The implementation also increments the reference count of the format instead of copying it.
Example:
Capabilities structure manipulation
Numerous functions manipulate the capabilities structure itself. These are used to copy formats between structures, duplicate them, etc. These will go unchanged except internally they will no longer duplicate the format. Instead they will increment the reference count.
Capabilities structure iteration
As the capabilities structure is now stored using an array, iteration will involve two functions, ast_format_cap_count and ast_format_cap_get, which return the number of formats in the structure and get a specific one by index, respectively.
Example:
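The count-and-index iteration pattern might be used along these lines. This is a toy stand-in rather than the real ast_format_cap API: the sketch_ functions, the fixed-size array, and the borrowed (not reference-bumped) return value of sketch_cap_get are assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_CAP_MAX 16

struct sketch_format {
    int refcount;
    const char *name;
};

struct sketch_cap {
    struct sketch_format *formats[SKETCH_CAP_MAX];
    size_t count;
};

/* Adding takes a reference to the format instead of copying it. */
int sketch_cap_add(struct sketch_cap *cap, struct sketch_format *format)
{
    assert(cap != NULL && format != NULL);
    if (cap->count == SKETCH_CAP_MAX) {
        return -1;
    }
    format->refcount++;
    cap->formats[cap->count++] = format;
    return 0;
}

/* Number of formats currently held. */
size_t sketch_cap_count(const struct sketch_cap *cap)
{
    return cap->count;
}

/* Returns a borrowed pointer in this sketch, or NULL if index is out of range. */
struct sketch_format *sketch_cap_get(const struct sketch_cap *cap, size_t index)
{
    return index < cap->count ? cap->formats[index] : NULL;
}

/* Iterate with count + get, collecting format names in stored order. */
size_t sketch_cap_collect_names(const struct sketch_cap *cap, const char **names, size_t max_names)
{
    size_t written = 0;

    for (size_t i = 0; i < sketch_cap_count(cap) && written < max_names; i++) {
        names[written++] = sketch_cap_get(cap, i)->name;
    }
    return written;
}
```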
Framing size
The framing size controls the length of media frames (in milliseconds). Previously this was stored in a separate structure but has now been rolled into ast_format_cap. To allow control over it, two API calls will be added.
Example:
Getting joint capabilities
Joint capabilities are the common compatible formats between two capabilities structures. These will be computed using the existing API functions, which will now take preference order into consideration by using the order of the first capabilities structure passed in.
Example:
2 Comments
Matt Jordan
Bike shedding!
original_id field in ast_codec? Having two identifiers feels annoying.
ast_format_cap objects don't get referenced in an invalid fashion? (This may not be an issue if ast_format_cap is an opaque object that an API protects against unknown format references)
(ast_variable) approach? In general, attributes are accessed during negotiation, which can take a small hit in the amount of time it takes to access the attribute.
For your joint capabilities API calls, I'd make the parameters explicit about what has preference:
Joshua C. Colp