The purpose of this page is to describe my proposal for improving the module loader by using manifest files.
- Example Manifest
- MODULEINFO changes
- Module Loading/Unloading
- Other Options
This example manifest was generated by 'make all'.
|name||The name of the module (excludes .so). Provided by Makefile.|
|checksum||Used to verify the manifest matches the loaded binary. Provided by Makefile.|
|support_level||Extracted from MODULEINFO.|
Optional, unsure this is still needed. Extracted from MODULEINFO.
Optional / default no. Extracted from MODULEINFO.
No config for this module, so not shown. See section on extconfig.conf. This value is read from config to memory but is not yet used for anything.
An example usage would be 'config=extensions.conf' for pbx_config. A module may have multiple configs. We should add ACO style config documentation for all configs. This would give us a place to extract this information from.
This section lists things that are required by the module, and is extracted from MODULEINFO <use> elements. In the case of func_periodic_hook, it requires a few modules. The loader will be able to understand when a module uses "application=Gosub", but menuselect would need to be updated to understand this.
This section could be used to register things that the module automatically uses if loaded, but does not autoload. It might be helpful to accept callbacks through the AST_MODULE_INFO macro to inform a module when something on its "canuse" list gets loaded. This is not yet implemented; I haven't fully thought out how it would work.
This section lists things provided by the module. It is implied that a module provides itself: "module=func_periodic_hook" for the example manifest. My current GAWK script extracts applications and functions from the DOCUMENTATION block. It would be useful to list other providers (rtp=asterisk, codec=ulaw, realtime=sqlite3, etc).
Can priorities be expressed here? For example res_timing_timerfd provides timing=timerfd, is there a way for the manifest to say that timerfd is priority 200 timing provider, and dahdi is priority 100? The goal is for autoload to know it needs only the best timing provider. Maybe with something such as "service=timer:200", that way modules can list "service=timer" under
The gawk script I wrote to extract manifest files does not use an XML parser, so the following rules apply:
- In the <use> element a type attribute is mandatory.
- Only a single element may be used per line.
- XML encoded characters like &amp; in extracted fields are not decoded.
These rules apply to all lines of MODULEINFO and DOCUMENTATION that contain manifest data.
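These rules follow from matching one element per line with a pattern instead of parsing XML. The following is a minimal Python sketch of that idea, not the actual gawk script: the regular expression, the function name `extract_uses` and the sample input are all assumptions made for illustration.

```python
import re

# Matches one <use type="...">name</use> element on a line.
# Hypothetical pattern for illustration; the real gawk script may differ.
USE_RE = re.compile(r'<use\s+type="([^"]+)">([^<]+)</use>')

def extract_uses(moduleinfo_lines):
    """Return 'type=name' entries, at most one per input line."""
    entries = []
    for line in moduleinfo_lines:
        match = USE_RE.search(line)  # only the first element on a line is seen
        if match is None:
            continue  # no <use> element here, or the type attribute is missing
        use_type, name = match.groups()
        if use_type == "external":
            continue  # optional libraries are filtered out of the manifest
        # No XML entity decoding happens: "&amp;" would stay "&amp;".
        entries.append(f"{use_type}={name.strip()}")
    return entries

# Made-up sample input, not taken from a real module.
sample = [
    '\t<use type="module">mod_example</use>',
    '\t<use type="application">Gosub</use>',
    '\t<use>missing_type</use>',
    '\t<use type="external">somelib</use>',
]
print(extract_uses(sample))
```

A line with two elements would yield only the first, which is why the one-element-per-line rule matters for a matcher of this kind.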
The <depend> element is currently used for build-time dependencies, and is unknown to the module loader. This means modules that are required at runtime cannot use <depend>. It is appropriate to list required libraries and compiler options with <depend> elements.
Element: use type="external"
This is currently used for optional libraries (listed as "can use" in menuselect). I believe it should be changed to <depend optional="true">, so that <use type="..."> always describes a run-time dependency which the loader needs to understand. For now I just filter this out of the manifest files.
Items listed with this element are required dependencies at build-time and run-time. menuselect is currently only aware of modules; this would need to be expanded if we want to support <use type="application">Gosub</use> in MODULEINFO. The type attribute is mandatory.
Optional self-terminating element with no attributes. If included, the module will export globals.
Optional CDATA element, default value is "default". Values are similar to current enum ast_module_load_priority symbols - lower case and without the "AST_MODPRI_" prefix. This really only applies to 'autoload=yes'. Autoloader does not yet respect load_priority.
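To make the naming rule for load priority values concrete, here is a small Python sketch of the conversion from an enum symbol to the manifest value. The function name `manifest_priority` is hypothetical; only the rule itself (drop the "AST_MODPRI_" prefix, lower case the rest) comes from the description above.

```python
def manifest_priority(symbol):
    """Convert an enum symbol such as 'AST_MODPRI_CHANNEL_DRIVER' to its manifest value."""
    prefix = "AST_MODPRI_"
    if not symbol.startswith(prefix):
        raise ValueError(f"not a load priority symbol: {symbol}")
    # Strip the prefix and lower case what remains.
    return symbol[len(prefix):].lower()

print(manifest_priority("AST_MODPRI_CHANNEL_DRIVER"))  # channel_driver
print(manifest_priority("AST_MODPRI_DEFAULT"))         # default
```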
The manifest format I've described can provide information needed to autoload dependencies for most modules. Let's consider the following example of an updated modules.conf:
In this example chan_sip is my channel driver, and pbx_config is my dialplan. The [uses] section of modules.conf declares that chan_sip requires a few additional modules, and pbx_config requires a couple of dialplan application modules and functions. In the current modules.conf we could simply have a bunch of 'load=' statements in the correct order, but this does not help with shutdown / unload order. By declaring that chan_sip requires pbx_config, the admin is saying that pbx_config cannot be unloaded before chan_sip.
One thing worth noting: none of my example configurations include ".so" after module names. Everything in the module loader operates on module names, not module filenames. Input from users (config, CLI, manager, etc.) should continue to accept ".so". CLI completion for modules does not include extensions.
This could tell pbx.c that the extension test,s,1 is a user of app_stack, so app_stack should be loaded/held open by this dialplan. Understanding of functions would be more difficult. Some application uses can't be automatically known, such as through ExecIf. For these it would be recommended to put entries in the [uses] section of modules.conf, such as the examples with GLOBAL and CHANNEL. Autoload from pbx.c is not yet implemented. I think it's better as a follow-up feature, maybe after a general cleanup of pbx.c.
With the addition of realtime=sqlite3 in [provides] of res_config_sqlite3.manifest, extconfig.conf could add res_config_sqlite3 to pbx_config's dependency list. Does this apply to sorcery too? My plan is for this to replace preload functionality in modules.conf.
The current module loader supports adding channels as users of a module, allowing them to be hung up during unload. This has been made generic to allow adding any object as a user of the module, with a callback that allows that user to be cancelled. The dispose callback for channels would run ast_softhangup, for example. Module dependencies will automatically cause one module to be the user of another. Module user "cancel" callbacks will attempt to unload the module that is using another (by telling that module to cancel all its users). The ability to just grab a reference to the module instance or library is also available.
Module lifetime is based on reference counts to struct ast_module_lib and struct ast_module_instance. For a module to stay running, the instance needs to be referenced by something: another module, dialplan or an admin setting. For a module to stay loaded, the lib needs to be referenced by something (a registration object, another module, its own instance). Enabling autoload will cause an admin user to reference each module. CLI/Manager requests for module load/unload are also treated like admin users. All ast_module will become users of modules they directly depend on: libs use libs, instances use instances. It's possible for a module to be stopped, but not immediately unloaded.
From CLI 'module unload chan_sip' would release the admin setting user of chan_sip, unloading the module if no other users exist. 'module unload -f chan_sip' would request that all users of chan_sip dispose so the module can be unloaded now (or soon). Cascading unloads will work in both directions. Firm unload of a module will unload all modules that depend on it. Once a module unloads, anything that module depends on could have just lost its last user, causing it to unload as well.
Going back to the example modules.conf given above, "load=chan_sip" would make the admin a user of chan_sip. chan_sip has dependencies (both compiled and admin defined), so we load those with chan_sip as a user. pbx_config creates dialplan which uses applications that must be loaded. If you issue CLI 'module unload chan_sip' while a call is active, it will immediately release the admin use of that module. After the call ends chan_sip will reach 0 users, and automatically unload, likely causing all other modules to unload.
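The walk-through above can be modelled with a few lines of Python. This is a sketch of the user-counting idea only, not the loader's data structures: the `Module` class, `add_user`, `release_user` and the user labels ("admin", "channel-1") are hypothetical names, and it assumes chan_sip has been declared to use pbx_config.

```python
class Module:
    def __init__(self, name, depends=()):
        self.name = name
        self.depends = list(depends)  # Module objects this module uses
        self.users = set()            # labels of whatever holds this module
        self.loaded = False

def add_user(module, user, log):
    """Record a user; on first use, load dependencies and then the module."""
    if not module.loaded:
        for dep in module.depends:
            add_user(dep, module.name, log)  # the module becomes a user of its deps
        module.loaded = True
        log.append(f"load {module.name}")
    module.users.add(user)

def release_user(module, user, log):
    """Drop a user; unload when the last one goes, then cascade to dependencies."""
    module.users.discard(user)
    if module.loaded and not module.users:
        module.loaded = False
        log.append(f"unload {module.name}")
        for dep in module.depends:
            release_user(dep, module.name, log)

log = []
pbx_config = Module("pbx_config")
chan_sip = Module("chan_sip", depends=[pbx_config])

add_user(chan_sip, "admin", log)          # load=chan_sip
add_user(chan_sip, "channel-1", log)      # an active call
release_user(chan_sip, "admin", log)      # 'module unload chan_sip'
still_loaded_during_call = chan_sip.loaded
release_user(chan_sip, "channel-1", log)  # the call ends
print(log)
# ['load pbx_config', 'load chan_sip', 'unload chan_sip', 'unload pbx_config']
```

Releasing the admin user does nothing visible while the call is active; the unload and the cascade to pbx_config happen only when the last user goes away.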
The unload_module function no longer allows an error to be returned. Instead you must run ast_module_block_unload, which will prevent dlclose from being run. The loader is designed so modules are not unloaded when requested; they are unloaded when no more references to the instance exist. If the module fails to unload under those conditions it is a serious issue.
Cleanup of many standard items is automatically done before the now optional unload_module is run. This change has resulted in removal of unload_module from just over half of all modules. The remaining unload_module functions are less cluttered. Many of these registrations are handled by a new source file, main/api_registry.c. This file manages the registration list (a vector), deals with proper references to the interface's holder and module, and provides functions to find or use an interface by name. This should help to provide a more consistent API and behaviour between different sources which accept registration. Other registrations are handled in a more ad-hoc way (during a first attempt), but they are not completely safe.
See the commit message at https://gerrit.asterisk.org/557 for the current list of which registrations use api_registry or ad-hoc automatic cleanup.
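As a rough illustration of why a shared registry makes automatic cleanup possible, here is a Python sketch of a registration list that remembers the owning module of each entry. It does not reflect the actual api_registry interface; the class and method names are hypothetical, and reference counting of holders is left out.

```python
class Registry:
    """A list of named registrations that remembers which module owns each."""

    def __init__(self, label):
        self.label = label
        self.entries = []  # (name, interface, owning module name)

    def register(self, name, interface, module):
        self.entries.append((name, interface, module))

    def find(self, name):
        """Return the interface registered under name, or None."""
        for entry_name, interface, _module in self.entries:
            if entry_name == name:
                return interface
        return None

    def cleanup_module(self, module):
        """Drop every registration owned by module; return how many were removed."""
        before = len(self.entries)
        self.entries = [entry for entry in self.entries if entry[2] != module]
        return before - len(self.entries)

apps = Registry("application")
apps.register("Gosub", "gosub-handler", "app_stack")
apps.register("Return", "return-handler", "app_stack")
removed = apps.cleanup_module("app_stack")  # what the loader could do before unload
print(removed, apps.find("Gosub"))
```

Because the registry knows the owner of every entry, the loader can remove them without the module listing each unregister call in unload_module.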
Installing a module while Asterisk is running
The current proposal is to load all available manifest files at system startup and keep them in memory. This is not required for any basic dependencies (one module requires another), but it's the only way to know what module provides a certain application or realtime driver. For that reason I don't think lazy manifest loading would work. Maybe it's best to always reload manifests when loading a module. We could check for and load new manifests during CLI completion for module load. It might also be best to just label this as a "known limitation" and see what we can do about it later when the dust settles.
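The need to hold every manifest in memory becomes clearer when the lookup is written out: answering "which module provides application=Gosub?" requires an index built from all manifests. The Python sketch below assumes an INI-like manifest layout (top-level `name=` key, `[provides]` section, `;` comments); that layout, the function names and the sample manifest text are assumptions for illustration.

```python
def parse_manifest(text):
    """Parse an assumed INI-like manifest into {section: [(key, value), ...]}."""
    manifest = {"": []}  # "" holds keys that appear before any [section]
    section = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank line or (assumed) comment
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            manifest.setdefault(section, [])
        elif "=" in line:
            key, value = line.split("=", 1)
            manifest[section].append((key.strip(), value.strip()))
    return manifest

def build_provider_index(manifests):
    """Map 'type=name' to the name of the module that provides it."""
    index = {}
    for manifest in manifests:
        name = dict(manifest[""])["name"]
        index[f"module={name}"] = name  # a module implicitly provides itself
        for kind, value in manifest.get("provides", []):
            index[f"{kind}={value}"] = name
    return index

# Hypothetical manifest text for illustration only.
APP_STACK = """
name=app_stack
[provides]
application=Gosub
application=Return
"""

index = build_provider_index([parse_manifest(APP_STACK)])
print(index["application=Gosub"])  # app_stack
```

A manifest installed after startup would be missing from `index` until it is rebuilt, which is the limitation discussed above.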
Currently, if autoload=no there can be no automatic loading of modules. How does changing this affect security? For starters I've proposed a new setting 'neverload' to allow the admin to declare certain modules as 'dangerous'. I'm not so worried about extconfig.conf integration, but the dialplan autoloader could be a problem.
Embedded modules cannot be found by the testsuite.
This can be fixed by this proposal; the testsuite would just have to search for *.manifest. It would still search for *.so to support previous versions of Asterisk.
Embedded modules are selected by category.
Embedding needs to be set for all modules at once, or per module. menuselect doesn't currently require that dependencies of embedded modules also be embedded, so enabling per module would require more difficult improvements to menuselect.
Circular dependencies would have to be broken by use of "canuse" in at least one link of the loop, on a module that implements an optional API. In theory we could add a "willuse": a tag that would cause another module to be loaded immediately after the module that lists it as "willuse". This is a difficult thing to deal with; the better option would be to fix the modules to not have circular dependencies.
To my knowledge there are not currently any circular dependencies in MODULEINFO blocks. The module loader will need to protect itself against that (infinite dependency load recursion). My immediate reaction is that circular dependencies need to prevent those modules from loading. This may also be a good reason for a 'make check' target. This would not run the testsuite or even unit tests, but would be the right place to do things like validate XMLDOC against schema, ensure module manifests have no errors, etc.
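A check for circular dependencies, whether in the loader or in a 'make check' target, could be a depth-first search over the "uses" graph. The Python sketch below shows one way to do it; the function name `find_cycle`, the dict-of-lists input and the module names are assumptions, not part of the proposal.

```python
def find_cycle(uses):
    """Return one dependency cycle as a list of module names, or None.

    `uses` maps a module name to the list of module names it requires.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {}
    path = []

    def visit(module):
        state[module] = GREY
        path.append(module)
        for dep in uses.get(module, ()):
            dep_state = state.get(dep, WHITE)
            if dep_state == GREY:
                # dep is already on the current path, so this edge closes a loop.
                return path[path.index(dep):] + [dep]
            if dep_state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[module] = BLACK
        return None

    for module in uses:
        if state.get(module, WHITE) == WHITE:
            cycle = visit(module)
            if cycle:
                return cycle
    return None

# Hypothetical module names.
print(find_cycle({"mod_a": ["mod_b"], "mod_b": ["mod_c"], "mod_c": ["mod_a"]}))
# ['mod_a', 'mod_b', 'mod_c', 'mod_a']
print(find_cycle({"mod_a": ["mod_b"], "mod_b": []}))
# None
```

Reporting the full loop rather than a yes/no answer makes it easier to decide which link should become a "canuse" entry.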
Module description strings
In my branch module descriptions are still provided to ast_module_register by the AST_MODULE_INFO macro. This means modules that are not loaded have no description available. Modules that are not loaded are currently not visible except by 'module load' CLI completion, but it might be preferable to move the description from the macro to the MODULEINFO comment block. One issue with this is limiting the characters that can be used in the description. If we move description to MODULEINFO we will need to figure out a way to decode any XML characters. Not sure if this is something that should be accomplished by the GAWK script or the manifest loader.
Optional API is somewhat of a nightmare. It does not create any references to the provider module, and the load/unload process is not thread safe unless the provider loads before all users and unloads after all users. This is not a new issue, but the new module loader is more sensitive to a lack of module references. I'm not sure of the best way to handle this. In my testing I've been adding uses statements to make certain modules dependent on others where the module does not have an explicit dependency. For now I've also removed optional API from res_http_websockets. The only user that was actually optional was chan_sip; I'm thinking this can be fixed with a setting in sip.conf to avoid listening for websocket connections.
The channel core does not currently manage any references related to channel technologies. Really ast_channel_register should be converted to use api_registry, or at least similar references. This is difficult since 'struct ast_channel_tech' is passed around without using any references. This needs further work.
Thread local storage
Thread local storage in modules is vulnerable to crashes after dlclose. Threadstorage creates static variables and sometimes callbacks from the module in question. If a module is unloaded while a thread that used threadstorage is still running, it can cause a crash when that thread exits.
I suggested this in a comment to David Lee's page, but later decided against this approach:
- XMLDOC is for "localized" strings, not for linker hints.
- This would be very disruptive to xmldoc.c, would require changes across the board to XPath includes.
- I hate XML and remembered that I use it when I must, never by choice.
Inject Dependencies into AST_MODULE_INFO block.
See David Lee's proposal. I believe injecting the dependencies in this way would be difficult, especially to provide the amount of information I'm proposing here. It would still put us in a situation where we repeatedly run dlopen/dlclose to collect module info and then load the module.