Heap errors behavior is dependent on target machine. [message #45539] |
Sat, 28 November 2015 14:03 |
jfranks
Messages: 36 Registered: September 2014 Location: Houston, Texas
Brief Description: Memory heap errors don't happen for development
machine 'A'. However, they do occur for the same
executable image on the embedded target machine
'B'.
Full Description of the issue:
Description of development machine -- i.e., machine 'A'
The development machine is a Dell desktop computer that has an
Intel CORE i5, and a host operating system Windows 7. A virtual
machine was installed using Oracle VM Virtual Box to run a guest
operating system Linux Mint 17.2. Our application development was
done on this virtual machine as a Linux based application using U++
nightly snapshot upp-x11-src-9200.
Description of the embedded target machine -- i.e., machine 'B'
This is proprietary custom hardware that has a touchscreen, custom
keypad entry device, a commercial power supply, commercial single
board computer, and a hard drive. The operating system is the same
as that used on the development machine. The CPU is able to run
the executable image produced on the development machine.
Description of the problem we are having with U++ memory diagnostic:
1. We developed and debugged our graphical U++ application on
machine 'A'. All memory heap errors were located and
corrected. The executable image developed on this machine is
compatible for running on machine 'B'.
2. We installed the debug version of our executable image on machine
'B'. Everything runs great except when we exit our application.
The behavior is different on machine 'B' in that there are memory
heap errors, followed by a segfault on X11.
Heap leaks detected!
Segmentation fault
3. We enabled machine 'B' to have development capability and installed
U++ IDE based on upp-x11-src-9200. We did a code checkout into this
machine from our SVN server. We compiled the code. Then we ran the
debug executable built on this machine. The result was the same
as item 2 above.
The debug executable image built on machine 'B' was copied to
machine 'A' and it exhibited a different behavior -- it worked
correctly on exit from the application, i.e., there were no
memory heap errors, nor segfault. That is odd.
Next, we decided to start debugging on machine 'B' in earnest.
We modified our application code and inserted MemoryIgnoreLeaksBegin();
and MemoryIgnoreLeaksEnd(); so as to exclude all of our application
code from leak detection. The result was the same as in item 2 above.
We more aggressively applied the U++ memory ignore function by reworking
the GUI_APP_MAIN macro and explicitly replaced it with the following.
//GUI_APP_MAIN {
void GuiMainFn_();

int main(int argc, const char **argv, const char **envptr)
{
    MemoryIgnoreLeaksBegin();
    UPP::AppInit__(argc, argv, envptr);
    UPP::Ctrl::InitX11(NULL);
    UPP::AppExecute__(GuiMainFn_);
    UPP::Ctrl::ExitX11();
    UPP::AppExit__();
    MemoryIgnoreLeaksEnd();
    return UPP::GetExitCode();
}

void GuiMainFn_()
{
    // ... our application code starts here ...
}
The results on machine 'B' did not change -- still a memory heap
issue on exit and a segfault. A large log file was produced with
many memory breakpoints.
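One possible reason that bracketing main() changes nothing is that some allocations happen before main() starts, for example in constructors of global objects. The sketch below is a toy model only, not the U++ debug heap: ToyLeakTracker and all of its members are hypothetical names, and it assumes that the "ignore" state is recorded at the moment a block is allocated.

```cpp
#include <cstddef>
#include <vector>

// Toy model only: this is NOT how the U++ debug heap is implemented.
class ToyLeakTracker {
public:
    void IgnoreBegin() { ++ignore_depth_; }
    void IgnoreEnd()   { if(ignore_depth_ > 0) --ignore_depth_; }

    // Records an allocation and returns its 1-based serial number.
    std::size_t Allocate() {
        blocks_.push_back(Block{true, ignore_depth_ > 0});
        return blocks_.size();
    }

    // Marks a previously recorded allocation as released.
    void Free(std::size_t serial) { blocks_[serial - 1].live = false; }

    // Counts blocks that are still live and were not allocated
    // inside an ignore region.
    std::size_t ReportedLeaks() const {
        std::size_t n = 0;
        for(const Block& b : blocks_)
            if(b.live && !b.ignored)
                ++n;
        return n;
    }

private:
    struct Block { bool live; bool ignored; };
    std::vector<Block> blocks_;
    int ignore_depth_ = 0;
};

// Simulates a program whose global object allocates before main().
std::size_t LeaksWhenGlobalAllocatesBeforeMain()
{
    ToyLeakTracker heap;
    heap.Allocate();      // stands for a global constructed before main()
    heap.IgnoreBegin();   // first statement of main()
    heap.Allocate();      // application allocation, never freed
    heap.IgnoreEnd();     // last statement of main()
    return heap.ReportedLeaks();
}
```

In this model the application block is ignored, but the block allocated before IgnoreBegin() is still reported, so LeaksWhenGlobalAllocatesBeforeMain() returns 1.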
Next, we compiled a release version of the application on
machine 'B' without any debug flags. Everything works great
because the U++ memory diagnostics are disabled. As we run the
release version, there is nothing that indicates a problem at any
time, even when we exit.
The log file generated from running the debug was over 2400 items.
I've attached a snapshot of the call-stack while in the debugger
for the lowest numbered memory break-point #1.
We are having a difficult time sorting this out and are asking
for help or ideas of where we go from here.
-- Jeff
Re: Heap errors behavior is dependent on target machine. [message #45541 is a reply to message #45540] |
Sat, 28 November 2015 22:37 |
jfranks
Thank you for helping us.
Q1. Does the problem occur with e.g. examples/UWords too?
A1. No, the problem does NOT occur with examples/UWord. I compiled
and ran this on machine 'B' and everything worked correctly.
I was able to enter some text, save it to a qtf file, exit the
program without any issues.
Q2. Have you tried memory breakpoint?
http://www.ultimatepp.org/srcdoc$Core$Leaks$en-us.html
A2. Yes we have done that. Memory breakpoint #1 was used to generate
the snapshot of the call-stack (uploaded previously), while in
the debugger.
It seems strange that memory breakpoint #1 was not hit immediately
when the application was run. Instead, breakpoint #1 did not
trigger in the debugger until we tried to exit the application. I
expected it to be the other way around.
Q3. Can you post a couple of lines of log with leak with smallest
breakpoint number?
A3. Yes, I have done that in this response. Also, I was in error when
I said that there were more than 2200 items in the log file.
Actually, there are only 355 items each time we run the
application and then exit. I ran wc on the log-file erroneously
thinking that each line was a memory leak (too many long hours).
174 items + <size 828>
174 items + <size 812>
7 items <various sizes>
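Counting entries rather than lines avoids the wc mistake described above. A minimal sketch, assuming that each leak entry in the log has exactly one header line containing a fixed marker; the default marker "--memory-breakpoint__" is my assumption about the log format and should be checked against the real file. The function names are hypothetical.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Counts the lines of `log` that contain `marker`. Other lines
// (hex dumps, comments) are skipped, so they do not inflate the count.
std::size_t CountLeakEntries(std::istream& log,
                             const std::string& marker = "--memory-breakpoint__")
{
    std::size_t entries = 0;
    std::string line;
    while(std::getline(log, line))
        if(line.find(marker) != std::string::npos)
            ++entries;
    return entries;
}

// Convenience wrapper for log text that is already in memory.
std::size_t CountLeakEntriesInText(const std::string& text,
                                   const std::string& marker = "--memory-breakpoint__")
{
    std::istringstream in(text);
    return CountLeakEntries(in, marker);
}
```

For a real log file, an std::ifstream opened on the file can be passed to CountLeakEntries().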
I've included the log-file with comments that show where patterns
repeat until the final 7 items are reported. I have not been able
to figure out anything relating to the repeating patterns; however,
the last 7 items have to do with a shared library that manages
the serial ports. Each one of the 7 items is caused by a stdc++
string that is part of that library.
As an experiment, I modified that shared library to use const char*
instead of stdc++ strings. For example:
#if 0
const std::string ERR_MSG_PORT_NOT_OPEN = "Serial port not open." ;
const std::string ERR_MSG_PORT_ALREADY_OPEN = "Serial port already open." ;
const std::string ERR_MSG_UNSUPPORTED_BAUD = "Unsupported baud rate." ;
const std::string ERR_MSG_UNKNOWN_BAUD = "Unknown baud rate." ;
const std::string ERR_MSG_INVALID_PARITY = "Invalid parity setting." ;
const std::string ERR_MSG_INVALID_STOP_BITS = "Invalid number of stop bits." ;
const std::string ERR_MSG_INVALID_FLOW_CONTROL = "Invalid flow control." ;
#else
const char* ERR_MSG_PORT_NOT_OPEN = "Serial port not open." ;
const char* ERR_MSG_PORT_ALREADY_OPEN = "Serial port already open." ;
const char* ERR_MSG_UNSUPPORTED_BAUD = "Unsupported baud rate." ;
const char* ERR_MSG_UNKNOWN_BAUD = "Unknown baud rate." ;
const char* ERR_MSG_INVALID_PARITY = "Invalid parity setting." ;
const char* ERR_MSG_INVALID_STOP_BITS = "Invalid number of stop bits." ;
const char* ERR_MSG_INVALID_FLOW_CONTROL = "Invalid flow control." ;
#endif
I compiled and installed the modified serial port shared library
and then re-tested the application. Those last 7 items
disappeared! Also, the other items in the log file that repeated
174 times now repeat only 124 times. I don't know why that changed.
There must be a clue here.
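A plausible reason the 7 items went away: a std::string global normally owns heap memory until its destructor runs, while a const char* only points at a literal and never touches the heap. The sketch below counts calls to a replaced global operator new to show the difference. The counter and function names are hypothetical, and the message is deliberately longer than the inline (small-string) buffers of the common standard libraries I am aware of, since a short string might not allocate at all.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Incremented by every call to the replaced global operator new.
static int g_alloc_count = 0;

void* operator new(std::size_t size)
{
    ++g_alloc_count;
    if(void* p = std::malloc(size ? size : 1))
        return p;
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Heap allocations observed while building the message as a std::string.
int AllocationsForStdString()
{
    int before = g_alloc_count;
    std::string msg = "Invalid number of stop bits.";
    (void)msg;
    return g_alloc_count - before;
}

// Heap allocations observed while pointing a const char* at the literal.
int AllocationsForCharPointer()
{
    int before = g_alloc_count;
    const char* msg = "Invalid number of stop bits.";
    (void)msg;
    return g_alloc_count - before;
}
```

In an unoptimized build I would expect AllocationsForStdString() to report at least one allocation and AllocationsForCharPointer() to report none. An optimizing compiler may remove the unused string, so this is a demonstration rather than a measurement.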
-- Jeff
Re: Heap errors behavior is dependent on target machine. [message #45553 is a reply to message #45552] |
Mon, 30 November 2015 10:03 |
mirek
Messages: 14162 Registered: November 2005
Running out of options.
The main hypothesis here is that we are detecting leaks too early.
Still, we can check this:
In the file with those std::string globals, put something like
#include <cstdio>

struct MyInitChecker {
    MyInitChecker()  { printf("Module initialized\n"); }
    ~MyInitChecker() { printf("Module deinitialized\n"); }
};
static const MyInitChecker myinitchecker;
then at the end of Core/heapdbg.cpp change destructor:
MemDiagCls::~MemDiagCls()
{
    if(--sMemDiagInitCount == 0) {
        printf("Now checking for leaks\n");
        UPP::MemoryDumpLeaks();
    }
}
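The "too early" hypothesis can be illustrated without U++. C++ destroys objects in the reverse order of their construction, so a checker that is constructed after another object is destroyed before it and still sees that object's memory as live. The sketch below uses local objects so that the order can be observed from a function; static objects follow the same reverse-order rule. All names are hypothetical stand-ins: LeakChecker for MemDiagCls, LibraryGlobal for a std::string global in the library.

```cpp
#include <string>
#include <vector>

// Event log and live-block counter used only for this illustration.
static std::vector<std::string> g_events;
static int g_live_blocks = 0;

// Stand-in for the leak checker: it inspects the heap in its destructor.
struct LeakChecker {
    ~LeakChecker() {
        g_events.push_back(g_live_blocks == 0 ? "check: clean"
                                              : "check: leaks reported");
    }
};

// Stand-in for a library global that owns heap memory for its lifetime.
struct LibraryGlobal {
    LibraryGlobal()  { ++g_live_blocks; g_events.push_back("library global constructed"); }
    ~LibraryGlobal() { --g_live_blocks; g_events.push_back("library global destroyed"); }
};

// Runs one scenario and returns the recorded events in order.
std::vector<std::string> RunScenario(bool checker_first)
{
    g_events.clear();
    g_live_blocks = 0;
    if(checker_first) {
        LeakChecker checker;    // constructed first, so destroyed last
        LibraryGlobal lib;
    } else {
        LibraryGlobal lib;      // constructed first, so destroyed last
        LeakChecker checker;    // destroyed first, while lib's block is live
    }
    return g_events;
}
```

RunScenario(false) records the leak report before "library global destroyed"; RunScenario(true) records a clean check after it.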
Also, there are some details not yet provided:
- what is that "compatible" CPU?
- is the system updated to the current version and is it exactly the same?
- are there any peripherals using serial communication that are not on Dell?
- what is that shared library?
Last but not least, it is entirely possible that the library leaks by design. In that case, it can be just bad luck and not really fixable. Well, in reality, leaving some global leaks is still considered "normal" in mainstream C++.
[Updated on: Mon, 30 November 2015 10:57]
Re: Heap errors behavior is dependent on target machine. [message #45556 is a reply to message #45555] |
Tue, 01 December 2015 05:58 |
jfranks
Great work!! Thank you so much for your help.
1. Regarding your instructions -- Mon, 30 November 2015 10:03 . . .
The results of that test are as follows:
$ ./p101-dbg
Module initialized
Now checking for leaks
Heap leaks detected!
Segmentation fault
The destructor for the shared serial library was not yet called
when heap leaks were being evaluated.
----
Regarding details that you requested . . .
Q1. What is that "compatible" CPU?
A1. Celeron J1900 on an IMB-151 single board computer.
Reference: http://www.asrock.com/ipc/overview.asp?Model=IMB-151
Also, on machine 'B', "uname -a" provides the following:
Linux administrator-desktop 3.19.0-28-generic #30~14.04.1-Ubuntu \
SMP Tue Sep 1 09:32:55 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Q2. Is the system updated to current version and is it exactly the same?
A2. Machines 'A' and 'B' are both using the Linux Mint 17.2 distro.
There is one serial port shared library (32-bit) that is not part
of the distro; it was installed on both machines.
Q3. Are there any peripherals using serial communications that are
not on Dell?
A3. Yes. The installation of our application on the embedded
target machine 'B' uses serial port #1 for logging, and serial
port #2 for Modbus communications. So, two serial ports exist
in hardware for machine 'B'. However, the application can run
without these serial ports, since logging and Modbus are
optional items.
The development Machine 'A' is just a common desktop computer.
Serial ports have not been made available to the guest virtual
machine, which is where we are developing and testing. When
the app runs and discovers no serial logging port and/or no
Modbus port, these functions are just turned off.
Also, the hardware for the embedded system machine 'B' has
additional devices and Linux drivers to handle a custom keypad
HID device, and a touchscreen.
Q4. What is that shared library?
A4. libserial-0.5.2 ... attached to this response. We have
compiled this as a 32-bit library so that it matches our
32-bit application for both machine 'A' and 'B'.
libserial-0.5.2/src/SerialPort.cpp is where the stdc++
strings that the U++ heap diagnostic was flagging as memory
leaks are located.
-----
2. Regarding your instructions on Mon, 30 November 2015 10:05 . . .
The results of testing with heap dump disabled:
$ ./p101-dbg
Module initialized
Now checking for leaks
Module deinitialized
This confirms that the destructor for the serial port shared
library is called after heap leaks have been evaluated.
static const MemDiagCls sMemDiagHelper __attribute__ ((init_priority (0)));
caused a compilation error (out-of-range).
I changed the 0 to a 1, and the compiler gave a warning.
The results of testing with init_priority (1) was the same as previous:
$ ./p101-dbg
Module initialized
Now checking for leaks
Module deinitialized
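As far as I know, the out-of-range error is expected: GCC documents init_priority values as bounded between 101 and 65535, with lower numbers constructed earlier. The sketch below shows that ordering for two globals in one translation unit. It relies on a GCC/Clang extension, the names are hypothetical, and it says nothing about ordering relative to objects in a separately loaded shared library.

```cpp
// Construction order is recorded in plain zero-initialized storage,
// which is ready before any constructor runs.
static int g_order[2];
static int g_count = 0;

template <int Id>
struct Tracer {
    Tracer() { if(g_count < 2) g_order[g_count++] = Id; }
};

// Declared first, but given a later priority (larger number).
Tracer<1> declared_first  __attribute__((init_priority(2000)));

// Declared second, but given the lowest priority in the documented range.
Tracer<2> declared_second __attribute__((init_priority(101)));

// Id of the object that was constructed first / second.
int FirstConstructed()  { return g_order[0]; }
int SecondConstructed() { return g_order[1]; }
```

With GCC on Linux I would expect declared_second to be constructed first, and therefore destroyed last, which is the property a leak checker would want for itself.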
-----
3. Regarding your instructions on Mon, 30 November 2015 10:45 . . .
The modified Core/heapdbg.cpp file was downloaded and replaced
the previous one in my copy of nightly build 9246.
This file was obtained from
https://github.com/ultimatepp/mirror/blob/1ce7608b2fb7571902917401d4215fb76f03eafd/uppsrc/Core/heapdbg.cpp
Results: There is an improvement !! Heap errors related to stdc++
strings disappeared from the log file. However, we are
still experiencing heap errors related to something else.
The pattern of heap leaks with sizes of 812 and 828 is
still there. The number of these varies.
Conclusion:
- memory heap leaks related to stdc++ strings
are resolved with the modifications that you made
to Core/heapdbg.cpp
- memory heap leaks are still being detected
from some other source.
BTW: The above result was double checked.
- Core/heapdbg.cpp was reverted to the original. Retesting showed
the last 7 items in the log file related to stdc++ strings
reappeared. This is what is expected for this test scenario.
- Core/heapdbg.cpp was again replaced with the new modified
version. Retesting showed for this case that heap errors
previously related to stdc++ strings disappeared.
This is good and is an improvement.
----------
Conclusion:
The memory heap diagnostic is improved so that stdc++ strings
in shared library are ignored (that is good).
There is another source of heap errors. Let's say that these
are all from a similar source because they always come in pairs and
the counts for the two sizes (812 and 828) always match.
Is there a strategy that can be applied to be able to identify
something about where these originate?
----------
Summary:
One down, one to go.
----------
-- Jeff