I am compiling OpenIFS with the GNU compiler flag -ffpe-trap=zero,underflow,overflow in order to detect divide-by-zero, underflow and overflow errors. Compilation succeeds, and floating-point exceptions do occur; they are caught by DrHook, which terminates the program. However, I can't find any information about which type of floating-point exception was raised. The only way I can get this information is to compile with one exception check at a time (e.g. zero only) and run the model, which takes a long time.
Is there a way to see detailed floating point exception information?
6 Comments
Unknown User (nagc)
Hi Sam,
I had a look at some of my tracebacks and the signal that causes the exception seems to be reported:
In this case the original signal was 6. Do you see this?
If the signals are already being caught correctly by the compiler with a traceback, you can, if you prefer, turn off signal handling by DrHook in OpenIFS completely by setting this environment variable in the job:
or use the same environment variable to give a list of signals to ignore (and let the runtime system handle them).
There is also the equivalent DR_HOOK_CATCH_SIGNALS, which takes a comma-separated list of additional signals to be caught as well as the default ones.
It's also possible to turn off DrHook tracing & signal handling completely with:
I've attached a PDF describing DrHook which, although a little old now, should still be mostly accurate.
Cheers, Glenn
Unknown User (nagc)
Hi Sam,
I checked about the capabilities of DrHook. In the version provided with OpenIFS 40r1, there is no provision for capturing the floating point exception.
However, DrHook has been updated in the very latest IFS development cycle, and that version provides more information on the floating point exception that caused the SIGFPE. I might look at this to see if we can port it back to an older cycle. But if you'd like to try it, let me know and I'll pass on the latest DrHook code (no promise it will work though).
Glenn
Sam Hatfield
Hi Glenn,
As an example of what I'm looking for, I get the following output when I compile OpenIFS (38r1) with -ffpe-trap=underflow:
So an underflow occurs on line 29 of surrtab.F90, but I only know that because underflow is the only exception that I've enabled.
I thought I once found a way to display the kind of floating-point exception, but I haven't been able to reproduce this with another model (which doesn't use DrHook). For example, I compiled this model with -ffpe-trap=overflow. I then triggered a runtime overflow error by multiplying huge(variable) by a system-clock-dependent number. This still just gives me SIGFPE with "erroneous arithmetic operation", however. There's no mention of an overflow. So I thought what I was asking for wasn't possible, but you mentioned that the latest DrHook somehow retrieves this information?
Unknown User (nagc)
Hi Sam,
Have you looked at using 'gdb' to run OpenIFS? I am not an expert on gdb, but if you run the executable under gdb ( gdb ./master.exe ...), it will halt on the exception (you will probably need to turn off DrHook), and from the source line you should be able to determine the error.
If you are comfortable with C code, then the changes to DrHook are straightforward. With the changes, no recompilation would be needed: DrHook can be told to selectively trap each signal in turn.
Locate drhook.c in ifsaux/support. Look for the trapfpe() function:
Now modify this to set the exception individually based on environment variables:
Now define the new variables near the top of drhook.c (around line 80):
Lastly, in the process_options() function, add the new environment variables:
I've not tested these changes but I hope you can follow ok. If there are other traps you need to implement just follow the pattern of changes above.
With these, you should then be able to compile once, set each DRHOOK_TRAPFPE_* in turn and run.
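As an illustration of the pattern described above, here is a minimal C sketch of choosing the traps from environment variables. It is not the actual drhook.c code. The variable names DRHOOK_TRAPFPE_DIVBYZERO, DRHOOK_TRAPFPE_OVERFLOW and DRHOOK_TRAPFPE_UNDERFLOW are hypothetical (only the DRHOOK_TRAPFPE_ prefix is given here), and the helpers flag_is_on() and fpe_mask_from_env() are invented for this example.

```c
#include <fenv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: a flag counts as "on" when it is set,
   non-empty and not equal to "0". */
int flag_is_on(const char *value)
{
    return value != NULL && value[0] != '\0' && strcmp(value, "0") != 0;
}

/* Hypothetical helper: build a mask of <fenv.h> exception flags from
   environment variables. The variable names are assumptions made for
   this sketch, not names taken from DrHook. */
int fpe_mask_from_env(void)
{
    int mask = 0;
    if (flag_is_on(getenv("DRHOOK_TRAPFPE_DIVBYZERO")))
        mask |= FE_DIVBYZERO;
    if (flag_is_on(getenv("DRHOOK_TRAPFPE_OVERFLOW")))
        mask |= FE_OVERFLOW;
    if (flag_is_on(getenv("DRHOOK_TRAPFPE_UNDERFLOW")))
        mask |= FE_UNDERFLOW;
    return mask;
}
```

On glibc systems the resulting mask could then be passed to feenableexcept(), a GNU extension that requires _GNU_SOURCE and linking with -lm, so that one executable can be rerun with a different trap enabled each time.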
Let me know how you get on. I can add this to future versions of OpenIFS as these changes are not available in IFS yet.
Cheers, Glenn
Sam Hatfield
Sorry Glenn, I never replied. I did try your modifications to DrHook and they worked quite well. However, I couldn't find a way to run the program with all FPE traps enabled, and have it report which one crashed the program - perhaps this isn't possible. For now I've found a way forward, but I might come back to this in the future.
Thanks,
Sam
Unknown User (nagc)
Hi Sam,
Thanks for the update and letting me know the DrHook changes worked. If you figure this out, I'd be interested.
Glenn
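On the open question of running with all traps enabled and still learning which one fired: on POSIX systems, a SIGFPE handler installed with SA_SIGINFO receives a siginfo_t whose si_code field distinguishes divide-by-zero, overflow and underflow. The C sketch below shows that general mechanism only. It is not DrHook code, and whether the newer DrHook works this way is an assumption. The function names fpe_code_name(), report_fpe() and install_fpe_reporter() are invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Map the si_code delivered with SIGFPE to a readable description. */
const char *fpe_code_name(int code)
{
    switch (code) {
    case FPE_INTDIV: return "integer divide by zero";
    case FPE_INTOVF: return "integer overflow";
    case FPE_FLTDIV: return "floating-point divide by zero";
    case FPE_FLTOVF: return "floating-point overflow";
    case FPE_FLTUND: return "floating-point underflow";
    case FPE_FLTRES: return "floating-point inexact result";
    case FPE_FLTINV: return "invalid floating-point operation";
    default:         return "unknown floating-point exception";
    }
}

/* Write a string to stderr without using stdio. */
static void put(const char *s)
{
    ssize_t n = write(STDERR_FILENO, s, strlen(s));
    (void)n;
}

/* SIGFPE handler: report the exception kind, then terminate. */
static void report_fpe(int sig, siginfo_t *info, void *context)
{
    (void)sig;
    (void)context;
    put("SIGFPE: ");
    put(fpe_code_name(info->si_code));
    put("\n");
    _exit(1);
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_fpe_reporter(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = report_fpe;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGFPE, &sa, NULL);
}
```

The handler only reports the exception; the traps themselves would still be enabled elsewhere, for example by compiling with -ffpe-trap=zero,underflow,overflow. In a model that uses DrHook, a handler like this would have to be installed after DrHook's own handlers or merged into them, which has not been tried here.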