When one thinks of measuring call quality, a Mean Opinion Score (MOS) is usually the first thing that comes to mind. However, calculating a true MOS inherently relies upon human perception and judgement. Thus, when it comes to determining call quality in Asterisk, only an approximation can be achieved, since it can only depend upon relevant statistics. For that reason we'll call it a "Media Experience Score" instead.
What are these relevant statistics? To keep things simple, we start with three obvious choices, all computed by Asterisk:
- Round Trip Time (RTT) - the time that elapses between Asterisk sending an RTCP report and Asterisk receiving one in response.
- Jitter - the variation in transit time between Asterisk sending a packet and an endpoint receiving it, or vice versa.
- Packets Lost - the number of packets lost since the last RTCP report was sent or received.
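For reference, RFC 3550 (the RTP specification) defines how round trip time and interarrival jitter are derived. The sketch below follows those definitions. The function names, the sample values, and the use of plain seconds as the unit are assumptions made for illustration; they are not taken from the Asterisk source.

```python
def round_trip_time(arrival, lsr, dlsr):
    """RTT per RFC 3550: the arrival time of a report, minus the time the
    last sender report was sent (LSR), minus the delay the remote side
    held it before replying (DLSR). All values are in seconds here."""
    return arrival - lsr - dlsr


def update_jitter(jitter, prev_transit, transit):
    """One step of the RFC 3550 running jitter estimate.

    A transit value is (arrival time - RTP timestamp) for one packet. The
    estimate moves 1/16 of the way toward the latest transit difference."""
    d = abs(transit - prev_transit)
    return jitter + (d - jitter) / 16.0


# Hypothetical sample values, chosen only to exercise the functions.
rtt = round_trip_time(arrival=10.5, lsr=10.0, dlsr=0.25)
jitter = update_jitter(jitter=0.0, prev_transit=0.040, transit=0.056)
```

The 1/16 gain smooths the estimate so that a single late packet nudges the jitter value rather than replacing it.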
As one may note, these statistics all relate to network conditions and connectivity. They do not give any indication as to the state of the actual media within a packet. That level of inspection and analysis is currently outside our scope. However, media sent via RTP can be greatly affected by the network state. Things like packet loss or a delay in delivery translate to degraded media quality for the end user. In fact, by combining the mentioned statistics into an appropriate calculation, one can form a reasonably confident opinion about the quality of an end user’s media experience.
Forming an Opinion
Luckily, a lot of study and analysis has already been done on this topic, and some of it is even standardized (ITU-T). While a large portion of that work concerns calculating an actual MOS, it gives us a starting point. Others have also taken such ideas and applied them to assessing the media experience based on gathered RTP statistics. Since there is no reason to reinvent the wheel here, we’ll base our algorithm for calculating a user’s media experience on the aforementioned evaluations by ITU-T (G.107, G.109, G.113) and on various other internet sources.
We’ll be reducing a given transmission rating factor, or ‘R’ value for short, based on the gathered statistical data. That final ‘R’ value will then be fed into a standardized formula for deriving an opinion score. An ‘R’ value has a final range of 0 (worst) to 100 (best). However, for our purposes we’ll be using the default parameters, which give us a starting ‘R’ value of 93.2 (see ITU-T G.107 for more details).
Let’s start with latency. Depending on the situation, the media itself may or may not be affected by late-arriving packets, but the overall user experience may suffer. For example, if you’re on a VoIP call and the audio is delayed by a second, it becomes harder to have a conversation. Likewise, if packets arrive out of order and no jitter buffer is enabled, that too will make for an unpleasant experience. To that end we can compute an “effective latency” (in milliseconds) from the round trip time and jitter, and reduce the ‘R’ value accordingly.
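The exact latency formula is not reproduced in this text. As an illustration only, the sketch below uses a heuristic that appears in various online MOS estimators. Every constant in it is an assumption rather than something stated here: the doubling of jitter, the 10 ms allowance, the 160 ms threshold, and the divisors 40 and 10. The function names are also hypothetical.

```python
DEFAULT_R = 93.2  # starting 'R' value from the ITU-T G.107 defaults


def effective_latency_ms(rtt_ms, jitter_ms):
    """Assumed heuristic: weight jitter twice and add a fixed 10 ms
    allowance for processing delay."""
    return rtt_ms + 2.0 * jitter_ms + 10.0


def r_after_latency(latency_ms, r=DEFAULT_R):
    """Assumed heuristic: a gentle penalty below 160 ms of effective
    latency and a steeper one above it."""
    if latency_ms < 160.0:
        return r - latency_ms / 40.0
    return r - (latency_ms - 120.0) / 10.0


# Hypothetical call: 100 ms round trip time, 20 ms jitter.
latency = effective_latency_ms(rtt_ms=100.0, jitter_ms=20.0)
r_value = r_after_latency(latency)
```

The two branches meet at 160 ms (both subtract 4 from ‘R’ there), so the penalty stays continuous while growing faster once delay becomes noticeable in conversation.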
The final ‘R’ value is then converted into an opinion score using the standardized formula:
If R < 0:
    Opinion = 1
Else if R > 100:
    Opinion = 4.5
Else:
    Opinion = 1 + (0.035 * R) + (R * (R - 60) * (100 - R) * 0.000007)
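The conversion is small enough to sketch directly. The version below uses the coefficient 7 × 10^-6 as published in ITU-T G.107; the function name is an assumption made for illustration.

```python
def opinion_from_r(r):
    """Map a transmission rating factor 'R' to an opinion score using the
    ITU-T G.107 conversion, clamped outside the 0..100 range."""
    if r < 0:
        return 1.0
    if r > 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


# The default starting value of 93.2 lands at roughly 4.41, which is the
# best score this approach can report when nothing is subtracted from 'R'.
best_case = opinion_from_r(93.2)
```

Because the starting ‘R’ value is 93.2 rather than 100, a perfect network yields about 4.41 rather than 4.5.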
An RTP instance in Asterisk keeps a running average for the round trip time, the average jitter between an endpoint (sender) and Asterisk (receiver), the standard deviation for jitter, and packet loss. For data pertaining to the link from Asterisk (sender) to the endpoint (receiver) the instance also tracks the reported (from RTCP) jitter, its standard deviation, and the reported packet loss.
Given that an RTP instance calculates and/or collects the required data for both incoming and outgoing packets, we should be able to arrive at a media experience score for each direction. That also means that for an actual call between Alice and Bob, up to 4 scores can be derived. That is somewhat granular, so depending on the use case we may want to average those scores together to give a single overall media experience score.
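A minimal sketch of that averaging step follows. It assumes a plain arithmetic mean and assumes that a direction with no data yet is represented as None; neither choice is specified here, and the function name is hypothetical.

```python
from statistics import mean


def overall_score(scores):
    """Average whichever per-direction scores are available.

    Returns None when no direction has produced a score yet."""
    available = [s for s in scores if s is not None]
    return mean(available) if available else None


# Hypothetical call between Alice and Bob, with one score per direction on
# each leg: Alice -> Asterisk, Asterisk -> Alice, Bob -> Asterisk, and
# Asterisk -> Bob. One direction has not reported anything yet.
call_scores = [4.5, 4.0, None, 3.5]
overall = overall_score(call_scores)
```

A plain mean can hide one bad direction behind three good ones, so reporting the minimum alongside the average may be worth considering.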
We start with an ‘R’ value of 93.2 based on the default recommendations found in ITU-T G.107 (Section 7.7) because we are restricted in our dataset(s). However, it’s possible that some ‘R’ value parameters could be adjusted for specific codecs, or with the help of deep packet inspection of the actual media. Of course, such inspection could be quite intrusive and inefficient.
The ‘R’ value’s equipment impairment factor could also be examined further. If Asterisk could determine the type of packet loss (burst vs. intermittent), the final ‘R’ value could become more accurate. One could also factor in the codec currently in use and its ability to handle packet loss. For instance, the opus codec is probably better at handling packet loss than the alaw codec.
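ITU-T G.107 already has a place for both ideas. Its effective equipment impairment factor (Ie-eff) grows with the packet loss percentage (Ppl), is scaled by a burst ratio (BurstR), and is damped by a codec-specific packet loss robustness factor (Bpl). The sketch below implements that formula. The function name and the codec values passed in are hypothetical placeholders; real Ie and Bpl values would need to come from ITU-T G.113.

```python
def effective_equipment_impairment(ie, bpl, ppl, burst_r=1.0):
    """Ie-eff from ITU-T G.107.

    ie      -- codec impairment at zero packet loss
    bpl     -- codec packet loss robustness (higher tolerates loss better)
    ppl     -- packet loss in percent
    burst_r -- 1.0 for random loss, above 1.0 for bursty loss
    """
    return ie + (95.0 - ie) * ppl / (ppl / burst_r + bpl)


# Placeholder codec parameters, NOT real G.113 values.
fragile = effective_equipment_impairment(ie=0.0, bpl=4.0, ppl=2.0)
robust = effective_equipment_impairment(ie=0.0, bpl=20.0, ppl=2.0)
bursty = effective_equipment_impairment(ie=0.0, bpl=4.0, ppl=2.0, burst_r=2.0)
```

The result would be subtracted from ‘R’. At the same loss rate, a more robust codec loses fewer points and bursty loss costs more than random loss.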
Lastly, the advantage factor ‘A’ could be increased based on knowledge of the endpoint. For instance, if the endpoint is known to be a mobile phone, the parameter could be bumped to between 5 and 10.
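In the G.107 model the advantage factor is simply added to ‘R’ after the impairments have been subtracted. A minimal sketch, assuming the result is clamped to the 0 to 100 range and using a hypothetical function name:

```python
def apply_advantage(r, advantage):
    """Add the advantage factor 'A' to an 'R' value.

    Clamping to 0..100 is an assumption made so the result stays inside
    the range the opinion score conversion expects."""
    return max(0.0, min(100.0, r + advantage))


# Hypothetical mobile endpoint: the user accepts somewhat lower quality in
# exchange for mobility, so an 'R' of 80 is treated as 85.
adjusted = apply_advantage(80.0, 5.0)
```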