
The next concept we'll cover is called pattern matching. Pattern matching allows us to create extension patterns in our dialplan that match more than one possible dialed number. Pattern matching saves us from having to create an extension in the dialplan for every possible number that might be dialed.

When Alice dials a number on her phone, Asterisk first looks for an extension (in the context specified by the channel driver configuration) that matches exactly what Alice dialed. If there's no exact match, Asterisk then looks for a pattern that matches. After we show the syntax and some basic examples of pattern matching, we'll explain how Asterisk finds the best match if there are two or more patterns which match the dialed number.

Special Characters Used in Pattern Matching

Patterns always begin with an underscore (_). This is how Asterisk recognizes that the extension is a pattern and not just an extension with a funny name. Within the pattern, we use various letters and characters to represent sets or ranges of numbers. Here are the most common letters:

X

The letter X or x represents a single digit from 0 to 9.

Z

The letter Z or z represents a single digit from 1 to 9.

N

The letter N or n represents a single digit from 2 to 9.

Now let's look at a sample pattern. If you wanted to match all four-digit numbers that had the first two digits as six and four, you would create an extension that looks like:

exten => _64XX,1,SayDigits(${EXTEN})

In this example, each X represents a single digit, with any value from zero to nine. We're essentially saying "The first digit must be a six, the second digit must be a four, the third digit can be anything from zero to nine, and the fourth digit can be anything from zero to nine".
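
The Z and N letters work the same way. As a small sketch (these extension ranges are made up for illustration and are not part of the original dialplan), the following patterns narrow the allowed digits position by position:

```ini
; Hypothetical examples, not from the original dialplan.
; _1ZX matches 110 through 199: a 1, then 1-9, then 0-9.
exten => _1ZX,1,SayDigits(${EXTEN})
; _NXX matches 200 through 999: 2-9, then two digits 0-9.
exten => _NXX,1,SayDigits(${EXTEN})
```

Note that 100 through 109 match neither pattern, because Z excludes zero and N excludes both zero and one.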

Character Sets

If we want to be more specific about a range of numbers, we can put those numbers or number ranges in square brackets to define a character set. For example, what if we wanted the second digit to be either a three or a four? One way would be to create two patterns (_64XX and _63XX), but a more compact method would be to do _6[34]XX. This specifies that the first digit must be a six, the second digit can be either a three or a four, and that the last two digits can be anything from zero to nine.

You can also use ranges within square brackets. For example, [1-468] would match a single digit from one through four or six or eight. It does not match any number from one to four hundred sixty-eight!
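
As a brief illustration of ranges (the 6xxx numbering simply extends the earlier example and is an assumption), a set can mix a range with individual digits:

```ini
; Hypothetical: the second digit may be 1, 2, 3, 4, 6, or 8.
exten => _6[1-468]XX,1,SayDigits(${EXTEN})
```

Dialing 6812 would match this pattern, while 6512 would not, since 5 is not in the set.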


The X, N, and Z convenience notations mentioned earlier have no special meaning within a set.

The only characters with special meaning within a set are the '-' character, which defines a range between two characters, the '\' character, which escapes a special character within a set, and the ']' character, which closes the set. Escaping with '\' is handled somewhat inconsistently in pattern matching and may not reliably remove a character's special meaning.
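
To illustrate the note above, here is a sketch of a pattern meant to accept a number with an optional leading plus sign. Both patterns are hypothetical, and the first is shown only as a mistake to avoid:

```ini
; Probably not what was intended: inside a set, X is just the letter X,
; so the first character must be '+' or a literal X.
exten => _[+X]X.,1,NoOp(${EXTEN})
; Spell the digits out as a range instead: the first character is '+' or 0-9.
exten => _[+0-9]X.,1,NoOp(${EXTEN})
```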

Other Special Characters

Within Asterisk patterns, we can also use a couple of wildcard characters that match the rest of the dialed number. The period character (.) at the end of a pattern matches one or more remaining characters. You put it at the end of a pattern when you want to match extensions of an indeterminate length. As an example, the pattern _9876. would match any number that began with 9876 and had at least one more character or digit.

The exclamation mark (!) character is similar to the period and matches zero or more remaining characters. It is used in overlap dialing to dial through Asterisk. For example, _9876! would match any number that began with 9876 including 9876, and would respond that the number was complete as soon as there was an unambiguous match.
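
A short sketch of the difference between the two wildcards, reusing the 9876 prefix from the text (the NoOp() steps are placeholders, not part of the original dialplan):

```ini
; Matches 98761, 9876123, and so on, but not 9876 on its own.
exten => _9876.,1,NoOp(At least one character after 9876)
; Matches 9876 on its own as well as anything longer.
exten => _9876!,1,NoOp(Zero or more characters after 9876)
```

With both lines in the same context, dialing exactly 9876 can only match the second pattern.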


Asterisk treats a period or exclamation mark as the end of a pattern. If you want a period or exclamation mark in your pattern as a plain character you should put it into a character set: [.] or [!].

Be Careful With Wildcards in Pattern Matches


Please be extremely cautious when using the period and exclamation mark characters in your pattern matches. They match any character, not just digits. If you're not careful to filter the input from your callers, a malicious caller might try to use these wildcards to bypass security boundaries on your system.

For a more complete explanation of this topic and how you can protect yourself, please refer to the README-SERIOUSLY.bestpractices.txt file in the Asterisk source code.
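
One common precaution is to strip unexpected characters from ${EXTEN} before using it. The sketch below uses the FILTER() dialplan function for this. The PJSIP channel technology, the SAFE_EXTEN variable name, and the idea that endpoints are named after the dialed digits are all assumptions made for illustration:

```ini
; Keep only the digits 0-9 from whatever the caller sent.
exten => _X.,1,Set(SAFE_EXTEN=${FILTER(0-9,${EXTEN})})
; Dial using the filtered value rather than the raw ${EXTEN}.
exten => _X.,n,Dial(PJSIP/${SAFE_EXTEN})
exten => _X.,n,Hangup()
```

Where the set of valid numbers is known in advance, a fixed-length pattern such as _XXXX avoids the wildcard altogether.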

Order of Pattern Matching

Now let's show what happens when there is more than one pattern that matches the dialed number. How does Asterisk know which pattern to choose as the best match?

Asterisk uses a simple set of rules to sort the extensions and patterns so that the best match is found first. The best match is simply the most specific pattern. The sorting rules are:

  1. The dash (-) character is ignored in extensions and patterns except when it is used in a pattern to specify a range in a character set. It has no effect in matching or sorting extensions.
  2. Non-pattern extensions are sorted in ASCII sort order before patterns.
  3. Patterns are sorted by the most constrained character set per digit first. By most constrained, we mean the pattern that has the fewest possible matches for a digit. As an example, the N character has eight possible matches (two through nine), while X has ten possible matches (zero through nine) so N sorts first.
  4. Character sets that have the same number of characters are sorted in ASCII sort order as if the sets were strings of the set characters. As an example, X is 0123456789 and [a-j] is abcdefghij so X sorts first. This sort ordering is important if the character sets overlap as with [0-4] and [4-8].
  5. The period (.) wildcard sorts after character sets.
  6. The exclamation mark (!) wildcard sorts after the period wildcard.
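
Rule 4 can be seen with two overlapping sets of equal size. In this sketch (the two-digit extensions are invented for illustration), dialing 14 matches both patterns:

```ini
; Both sets contain five characters, so rule 3 does not separate them.
exten => _1[4-8],1,SayAlpha(B)
exten => _1[0-4],1,SayAlpha(A)
```

Going by rule 4, [0-4] (01234) sorts before [4-8] (45678), so _1[0-4] should be the pattern that runs for 14, regardless of the order in which the lines are written.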

Let's look at an example to better understand how this works. Let's assume Alice dials extension 6421, and she has the following patterns in her dialplan:

exten => _6XX1,1,SayAlpha(A)
exten => _64XX,1,SayAlpha(B)
exten => _640X,1,SayAlpha(C)
exten => _6.,1,SayAlpha(D)
exten => _64NX,1,SayAlpha(E)
exten => _6[45]NX,1,SayAlpha(F)
exten => _6[34]NX,1,SayAlpha(G)

Can you tell (without reading ahead) which one would match?

Using the sorting rules explained above, the extensions sort as follows (positions are counted from the leading underscore, which is position 1):
_640X sorts before _64NX because of rule 3 at position 4. (0 before N)
_64NX sorts before _64XX because of rule 3 at position 4. (N before X)
_64XX sorts before _6[34]NX because of rule 3 at position 3. (4 before [34])
_6[34]NX sorts before _6[45]NX because of rule 4 at position 3. ([34] before [45])
_6[45]NX sorts before _6XX1 because of rule 3 at position 3. ([45] before X)
_6XX1 sorts before _6. because of rule 5 at position 3. (X before .)

Sorted extensions
exten => _640X,1,SayAlpha(C)
exten => _64NX,1,SayAlpha(E)
exten => _64XX,1,SayAlpha(B)
exten => _6[34]NX,1,SayAlpha(G)
exten => _6[45]NX,1,SayAlpha(F)
exten => _6XX1,1,SayAlpha(A)
exten => _6.,1,SayAlpha(D)

When Alice dials 6421, Asterisk searches through its list of sorted extensions and uses the first matching extension. The first entry, _640X, does not match because the third dialed digit is 2 rather than 0, so _64NX is the first match found.

To verify that Asterisk actually does sort the extensions in the manner that we've shown, add the following extensions to the [users] context of your own dialplan.

exten => _6XX1,1,SayAlpha(A)
exten => _64XX,1,SayAlpha(B)
exten => _640X,1,SayAlpha(C)
exten => _6.,1,SayAlpha(D)
exten => _64NX,1,SayAlpha(E)
exten => _6[45]NX,1,SayAlpha(F)
exten => _6[34]NX,1,SayAlpha(G)

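Reload the dialplan, and then type dialplan show 6421@users at the Asterisk CLI. Asterisk will show you all extensions in the [users] context that match 6421, in sorted order. The first extension listed is the one that will execute if you dial 6421 in the [users] context. To see the sort order of every extension in the context, including those that do not match 6421, type dialplan show users.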

server*CLI> dialplan show 6421@users
[ Context 'users' created by 'pbx_config' ]
  '_64NX' =>        1. SayAlpha(E)                                [pbx_config]
  '_64XX' =>        1. SayAlpha(B)                                [pbx_config]
  '_6[34]NX' =>     1. SayAlpha(G)                                [pbx_config]
  '_6[45]NX' =>     1. SayAlpha(F)                                [pbx_config]
  '_6XX1' =>        1. SayAlpha(A)                                [pbx_config]
  '_6.' =>          1. SayAlpha(D)                                [pbx_config]

-= 6 extensions (6 priorities) in 1 context. =-
server*CLI> dialplan show users
[ Context 'users' created by 'pbx_config' ]
  '_640X' =>        1. SayAlpha(C)                                [pbx_config]
  '_64NX' =>        1. SayAlpha(E)                                [pbx_config]
  '_64XX' =>        1. SayAlpha(B)                                [pbx_config]
  '_6[34]NX' =>     1. SayAlpha(G)                                [pbx_config]
  '_6[45]NX' =>     1. SayAlpha(F)                                [pbx_config]
  '_6XX1' =>        1. SayAlpha(A)                                [pbx_config]
  '_6.' =>          1. SayAlpha(D)                                [pbx_config]

-= 7 extensions (7 priorities) in 1 context. =-

You can dial extension 6421 to try it out on your own.

Be Careful with Pattern Matching


Please be aware that because of the way auto-fallthrough works, if Asterisk can't find the next priority number for the current extension or pattern match, it will also look for that same priority in a less specific pattern match. Consider the following example:

exten => 6410,1,SayDigits(987)
exten => _641X,1,SayDigits(12345)
exten => _641X,n,SayDigits(54321)

If you were to dial extension 6410, you'd hear "nine eight seven five four three two one".

We strongly recommend you make the Hangup() application be the last priority of any extension to avoid this problem, unless you purposely want to fall through to a less specific match.
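
Applied to the example above, that recommendation might look like the following sketch, where each extension ends explicitly:

```ini
exten => 6410,1,SayDigits(987)
exten => 6410,n,Hangup()        ; stops here, no fall-through to _641X
exten => _641X,1,SayDigits(12345)
exten => _641X,n,SayDigits(54321)
exten => _641X,n,Hangup()
```

With this change, dialing 6410 should play only "nine eight seven".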


2 Comments

  1. I'm missing an explanation of the (non-)availability of combinations of special letters (X, N, Z) and ranges or other characters within brackets, e.g.

    _[+X]X. will not match what _[+0-9]X. would.

    As this behaviour differs from usual regex engines (assuming X, N, and Z represent character classes), where character classes may be combined inside [], this should be mentioned here to avoid misunderstandings.

    1. Thanks for the note Olaf. I've modified the text to describe the expected behavior in a note.