I am a very infrequent Elmer user, mostly for private projects and out of curiosity. So my opinions need to be taken with a grain of salt.
The following options and thoughts vary in complexity and required effort. Personally, I favor the options that leave less room for error (missing keywords and missing documentation), but they are the more complicated ones. The numbering follows Peter's post.
4.1) Extraction using the keyword miss option:
- Pros: It is already there and can help identify missing keywords.
- Cons: Requires re-compilation (I think this could be changed to a command-line option rather easily). It requires running test cases that actually use the keywords; otherwise the miss is not detected, AFAIK. Consequently, complete results require up-to-date test cases. If many test cases must be executed, the runtime can be quite long for merely checking the keyword list.
4.2) Code analysis:
- Pros: A rudimentary parser already exists, and getting significant coverage does not seem too difficult (but the last few percent ...)
- Cons: Requires quite some effort to make it work reliably and produce good, concise output. It can miss new keywords, or keywords it could previously detect, if the code changes in an unanticipated way. This may be easier if a more complete Fortran parser is used, maybe LLVM or another existing parser. I lack experience in this field and can only guesstimate, but I expect some coding effort would still be required. Another drawback is that the parser must be maintained in addition to the actual Elmer code. It could also lead to headaches in corner cases, for example if heuristics relying on coding conventions must be used.
The effort to output ratio of this option seems to be good, but I don't consider it a clean solution for the reasons given.
Some other options I can think of:
4.3) Make the SOLVER.KEYWORDS entries mandatory.
- Pros: May be easy to implement. It may only require turning the existing warnings into errors.
- Cons: Will provoke error messages with existing cases until all keywords have been added. Works by enforcing self-discipline of the developers (can be good or bad). Consequently, it should not be easy to disable. User-defined variable names may be more difficult to cover: this may require extending the SOLVER.KEYWORDS syntax.
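To illustrate the "errors instead of warnings" idea, here is a minimal Python sketch (not Elmer code). The function name `check_keyword`, the `strict` flag and the tiny keyword set are made up for this example; a real check would live in the Fortran list-handling code and read SOLVER.KEYWORDS.

```python
class UnknownKeywordError(Exception):
    """Raised when a keyword is not listed in the keyword database."""


def check_keyword(known, section, name, strict=True):
    """Return True if (section, name) is listed; otherwise warn or raise."""
    key = (section.lower(), name.lower())
    if key in known:
        return True
    message = f"Unlisted keyword in section '{section}': '{name}'"
    if strict:
        # Option 4.3: what used to be a warning becomes a hard error.
        raise UnknownKeywordError(message)
    print("WARNING:", message)
    return False


# Hypothetical keyword list; a real one would be read from SOLVER.KEYWORDS.
KNOWN = {("boundary condition", "temperature"), ("body force", "heat source")}

check_keyword(KNOWN, "Boundary Condition", "Temperature")        # listed
check_keyword(KNOWN, "Body Force", "Heat Sorce", strict=False)   # typo: warning only
```

With `strict=True` the misspelled keyword would stop the run, which is exactly the behavior that forces the keyword list to stay complete.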
4.4) Require registering keywords:
Similar to 4.3 (thus many of the following points also apply to 4.3), but the other way round: Make the Elmer framework require the registration of keywords before they can be used. Keyword registration should be performed in a function/subroutine that can be called independently of regular solver functions. The SOLVER.KEYWORDS file could then be generated by calling all the registration functions of all linked and/or loaded solvers. Since no actual test cases must be executed, this would be really quick in comparison to the currently existing compile time option for missing keyword checking. Due to the mandatory registration prior to using the keyword, one can also be rather sure of completeness, because it would be the first thing a developer must implement before he/she can use the keyword. Generation of one keyword file per solver would be easy; just output to the corresponding file.
User-defined variable names could be handled by registering keywords with an as yet unspecified name: for documentation purposes it is sufficient to know that, e.g., HeatSolver has a BC of type Real, nodal loads of type Real, and so on. This may be enough information for a user to know how to use it, but maybe not if, e.g., the GUI wants to use the keyword file.
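The registration idea could look roughly like the following Python sketch. It is only a model of the concept: `KeywordRegistry`, `register_heat_solver` and the output line format are hypothetical, and an actual implementation would be Fortran routines inside Elmer. A name of `None` stands for a keyword whose name is chosen by the user.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Keyword:
    solver: str
    section: str
    dtype: str
    name: Optional[str]  # None marks a user-defined name
    description: str = ""


class KeywordRegistry:
    def __init__(self):
        self._keywords = {}

    def register(self, solver, section, dtype, name=None, description=""):
        kw = Keyword(solver, section, dtype, name, description)
        self._keywords[(solver, section, name)] = kw
        return kw

    def lookup(self, solver, section, name):
        """Return the registered keyword, falling back to the unnamed slot."""
        for key in ((solver, section, name), (solver, section, None)):
            if key in self._keywords:
                return self._keywords[key]
        raise KeyError(f"{solver}: keyword '{name}' used before registration")

    def dump(self):
        """Produce keyword-file lines (format assumed for this sketch)."""
        lines = []
        for kw in self._keywords.values():
            shown = kw.name if kw.name is not None else "<user-defined>"
            lines.append(f"{kw.solver}:{kw.section}:{kw.dtype}: '{shown}'")
        return lines


def register_heat_solver(registry):
    """Hypothetical registration routine, callable without running the solver."""
    registry.register("HeatSolve", "Body Force", "Real", "Heat Source")
    registry.register("HeatSolve", "Boundary Condition", "Real")  # user-chosen name


registry = KeywordRegistry()
register_heat_solver(registry)
print("\n".join(registry.dump()))
```

Because `dump()` only needs the registration routines, the keyword file can be produced without executing a single test case, and `lookup()` fails for anything that was never registered.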
The SIF syntax could also be changed to make it less ambiguous (but this breaks backward compatibility ...). For example, BC settings could require the solver name as an additional word or option:
Instead of
Boundary Condition 1
TempA = Real 0
End
it could be
Boundary Condition 1
HeatSolve TempA = Real 0
End
This is somewhat similar to the conventions you mentioned, but enforced. I find it also easier to read if the SIF file was written by someone else who selected funky names.
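As a rough illustration of how such a prefixed line could be taken apart, here is a small Python sketch. This is a proposed syntax, not existing SIF; the regular expression, the function name `parse_prefixed_line` and the list of type names are assumptions for this example, and it presumes the solver prefix is a single word and always present.

```python
import re

# Proposed form: <SolverName> <keyword name> = <Type> <value>
ASSIGNMENT = re.compile(
    r"^\s*(?P<solver>\w+)\s+(?P<name>.+?)\s*=\s*"
    r"(?P<dtype>Real|Integer|Logical|String)\s+(?P<value>.+?)\s*$"
)


def parse_prefixed_line(line):
    """Split a solver-prefixed assignment into its four parts."""
    match = ASSIGNMENT.match(line)
    if match is None:
        raise ValueError(f"Not a solver-prefixed assignment: {line!r}")
    return match.groupdict()


print(parse_prefixed_line("HeatSolve TempA = Real 0"))
```

Note that a multi-word keyword without prefix (e.g. `Heat Flux = Real 1`) would be misread as solver `Heat`, keyword `Flux`, which is one reason the prefix would have to be mandatory rather than optional.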
Cons: Lots of coding effort.
4.5) Comment "markup"
Keyword information could also be provided in the source code via special comments, similar to Doxygen. Actually, I considered this option before writing the parser but decided against it for multiple reasons: The source code must be checked for keywords anyway. It is not clear where to place the comments. Completeness (and up-to-dateness) is not enforced, so it is no better than manually editing the keywords file, apart from bringing the documentation closer to the source.
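For comparison, extracting such comment markup would be simple. The sketch below assumes a made-up tag `!> @keyword Section : Type : Name`; no such convention exists in Elmer as far as I know, and the Fortran lines in the sample are only illustrative.

```python
import re

# Hypothetical markup: !> @keyword <section> : <type> : <name>
MARKUP = re.compile(
    r"^\s*!>\s*@keyword\s+(?P<section>[^:]+?)\s*:\s*(?P<dtype>\w+)\s*:\s*(?P<name>.+?)\s*$"
)


def extract_keywords(source):
    """Collect (section, type, name) tuples from marked-up comment lines."""
    found = []
    for line in source.splitlines():
        match = MARKUP.match(line)
        if match:
            found.append((match["section"], match["dtype"], match["name"]))
    return found


SAMPLE = """\
!> @keyword Boundary Condition : Real : Heat Flux
  flux = GetReal(BC, 'Heat Flux', Found)
  ! ordinary comment, ignored
  coeff = GetReal(BC, 'Heat Transfer Coefficient', Found)
"""

print(extract_keywords(SAMPLE))
```

The sample also shows the weakness: the second keyword is used in the code but has no markup, and nothing notices.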
4.6) Something else?
Documentation aspect:
From my perspective, the documentation aspect of the keywords file is more important than saving the typing of the datatype specification in the SIF. Consequently, I think it should contain more information: The keywords sections in the ElmerModelsManual.pdf should be generated from the keywords file, i.e. the explanation/description of the keyword should be in the keywords file. Making the description an option of the registration function in 4.4 above would be easy. Doing it the other way round for 4.3 is also easy. This would also facilitate generating context help for editors, IDEs, etc. (This is related to the discussion in the GitHub issue linked in the first post of this thread.)
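Once descriptions are stored with the keywords, generating a manual section is mostly formatting. A minimal Python sketch, assuming each entry is a dictionary with `section`, `dtype`, `name` and `description` fields (a layout invented for this example, as are the description texts):

```python
def render_keyword_docs(solver, entries):
    """Render a plain-text keyword reference, grouped by SIF section."""
    lines = [f"Keywords for {solver}", ""]
    for section in sorted({e["section"] for e in entries}):
        lines.append(section)
        for e in entries:
            if e["section"] == section:
                lines.append(f"  {e['name']} ({e['dtype']})")
                lines.append(f"    {e['description']}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical entries; the descriptions are placeholders, not manual text.
ENTRIES = [
    {"section": "Body Force", "dtype": "Real", "name": "Heat Source",
     "description": "Placeholder description of the heat source."},
    {"section": "Boundary Condition", "dtype": "Real", "name": "Heat Flux",
     "description": "Placeholder description of the boundary heat flux."},
]

print(render_keyword_docs("HeatSolve", ENTRIES))
```

The same entries could feed a LaTeX template for the manual or a tooltip database for an editor.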
I don't know the requirements and contents of the XML files for ElmerGUI, because I rarely use the GUI. However, it sounds like a good idea to have one information database from which the others are derived. In other words: if possible, include the additional information required for the XML file in the keywords file (or a per-solver metadata file) and auto-generate the XML. If the XML is already the more complete database, maybe ditch the keywords file and use the XML only. (Or use something else as the basis for both, but avoid the need to make modifications in more than one place.)
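Deriving XML from a single keyword database could be sketched as follows with Python's standard `xml.etree.ElementTree`. Since I don't know the ElmerGUI schema, the element and attribute names (`solver`, `keyword`, `name`, `description`) are invented and would have to be replaced by whatever the GUI actually expects.

```python
import xml.etree.ElementTree as ET


def keywords_to_xml(solver, entries):
    """Serialize keyword entries to XML (element names are hypothetical)."""
    root = ET.Element("solver", name=solver)
    for e in entries:
        node = ET.SubElement(root, "keyword", section=e["section"], type=e["dtype"])
        ET.SubElement(node, "name").text = e["name"]
        ET.SubElement(node, "description").text = e["description"]
    return ET.tostring(root, encoding="unicode")


# Hypothetical entry with a placeholder description.
GUI_ENTRIES = [
    {"section": "Boundary Condition", "dtype": "Real", "name": "Heat Flux",
     "description": "Placeholder description."},
]

print(keywords_to_xml("HeatSolve", GUI_ENTRIES))
```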
The keywords file/database doesn't seem to be the correct place for the theoretical information and derivations also given in the ElmerModelsManual (also because it is not as tightly related to the code as the keywords). However, if all meta-information regarding a solver is stored in a per-solver file or directory, this location might be.
Regarding 3): Yes, it would be nice to see which keywords belong to which Solver. A format according to the deliberations of the above paragraphs would provide this information anyway. Whether this information is contained in one file of appropriate structure or multiple files does not matter that much. Conversion back and forth via scripts would be possible once the information is available in a parsable format.
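Converting between one combined file and per-solver files is then a few lines of scripting. The sketch below assumes each combined line starts with the solver name followed by a colon (a format made up for this example, not the current SOLVER.KEYWORDS layout).

```python
from collections import defaultdict


def split_by_solver(lines):
    """Group combined 'Solver:rest' lines into one list per solver."""
    per_solver = defaultdict(list)
    for line in lines:
        solver, rest = line.split(":", 1)
        per_solver[solver].append(rest)
    return dict(per_solver)


def merge_solvers(per_solver):
    """Inverse operation: rebuild the combined list from per-solver lists."""
    return [f"{solver}:{rest}" for solver, rests in per_solver.items() for rest in rests]


# Hypothetical combined listing; the second solver name is illustrative only.
COMBINED = [
    "HeatSolve:Body Force:Real: 'Heat Source'",
    "HeatSolve:Boundary Condition:Real: 'Heat Flux'",
    "FlowSolve:Material:Real: 'Viscosity'",
]

parts = split_by_solver(COMBINED)
print(sorted(parts))
```

The round trip preserves all entries; only the ordering between solvers may change if the combined file was not grouped by solver to begin with.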
wiesi