Discussion:
Error when loading libraries at run time in Linux
Leif Asbrink
2014-08-03 21:55:11 UTC
Hi All,

The libraries are now loaded at run time in Linrad.
As it turns out this does not work in all Linux
distributions.

For example, I load libusb-1.0.so and librtlsdr.so with
dlopen and then get the functions with dlsym. This often
works well, but under some Linux distributions a call to
libusb_init from librtlsdr does not point to the proper
address. The error occurs in Fedora 15 to Fedora 20, but
not in Fedora 13 and not in Debian or Ubuntu.

The 32 bit version of openSUSE 12.1 is OK, but the 64 bit
version of openSUSE fails. I have made a video that demonstrates
the problem:


To me this looks like a gcc/ld bug - or am I doing something
wrong?

Regards

Leif
Sylvain Munaut
2014-08-03 22:36:07 UTC
Hi,
Post by Leif Asbrink
For example, I load libusb-1.0.so and librtlsdr.so with
dlopen and then get the functions with dlsym. This often
works well, but under some Linux distributions a call to
libusb_init from librtlsdr does not point to the proper
address.
Well, in itself a different function pointer is not necessarily a problem.

It could just be that the pointer value inside the lib points to a
stub that does lazy loading. With modern distrib and 64 bits and all
the security stuff with ASLR (address space layout randomization), PIE
and all that stuff, that seems to be a probable explanation.

What would be interesting to see is _what_ exactly is at that address
and see why it fails.
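
One way to look at what sits at a given address is to ask the dynamic linker which loaded object contains it. The following is only a minimal sketch, assuming glibc, where dladdr() is available as a GNU extension. The helper name object_containing is made up for illustration and is not part of Linrad, librtlsdr or libusb.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* dladdr() and Dl_info are GNU extensions in glibc */
#endif
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: report which loaded object (executable or shared
 * library) contains addr. Returns the object's file name, or NULL if the
 * address does not belong to any loaded object. */
const char *object_containing(void *addr)
{
    Dl_info info;

    if (dladdr(addr, &info) == 0 || info.dli_fname == NULL)
        return NULL;

    printf("%p lies in %s", addr, info.dli_fname);
    if (info.dli_sname != NULL)
        printf(" (nearest symbol: %s)", info.dli_sname);
    printf("\n");
    return info.dli_fname;
}
```

Calling it on the pointer returned by dlsym() in the main program, and again on the value seen from inside the library, would show whether the two addresses belong to the same file, to two different copies of a library, or to a stub in the caller.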


Cheers,


Sylvain
Peter Stuge
2014-08-03 22:50:10 UTC
Hej!
Post by Leif Asbrink
The libraries are now loaded at run time in Linrad.
..
Post by Leif Asbrink
For example, I load libusb-1.0.so and librtlsdr.so with
dlopen and then get the functions with dlsym.
When doing this, always dlopen() only the base filename including
the ABI version number, so in this case "libusb-1.0.so.0.1.0" and
"librtlsdr.so.0.0.0", without paths.

If you leave out the ABI version then dlopen() is likely to try to
open a linker script which distributions use to "redirect" the linker.
But they are only intended to be used at build time, not at run time.

Using the full filename including ABI version ensures that dlopen()
gets the correct file, and it is also prudent for your application,
since next year there may be a new version of librtlsdr.so which is
no longer ABI compatible with what you designed for today. If you
specify the filename with ABI version then your program still works,
or fails because the old ABI version is not found. If you specify the
filename without ABI version then best case the symbols you need
simply can't be found by dlsym() once the library has been opened.

One minor detail in your rtl2832.c is that dlsym() returns void *
rather than long int, so I suggest using %p in the format string
and not casting the function parameters.
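
A minimal sketch of both suggestions, assuming a glibc system: the library is opened by its versioned file name without a directory, and the address from dlsym() is printed with %p. The helper names open_versioned and print_symbol_address are made up for illustration and do not come from rtl2832.c.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: open a library by versioned file name only (no
 * directory), so the dynamic linker's normal search path is used. */
void *open_versioned(const char *versioned_name)
{
    void *handle = dlopen(versioned_name, RTLD_NOW);

    if (handle == NULL)
        fprintf(stderr, "dlopen(%s) failed: %s\n", versioned_name, dlerror());
    return handle;
}

/* Hypothetical helper: look up a symbol and print its address.
 * dlsym() returns void *, which is exactly what %p expects, so no
 * cast to an integer type is needed. */
void *print_symbol_address(void *handle, const char *symbol)
{
    void *addr = dlsym(handle, symbol);

    printf("%s is at %p\n", symbol, addr);
    return addr;
}
```

For the case discussed here the calls would be something like open_versioned("libusb-1.0.so.0.1.0") followed by print_symbol_address(handle, "libusb_init"). The exact version suffix depends on what is installed.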

Other than that, can you show us the contents of your
load_usb1_library() and load_rtlsdr_library() functions?


//Peter
Leif Asbrink
2014-08-04 14:56:55 UTC
Hello Peter and All,
Post by Peter Stuge
Post by Leif Asbrink
For example, I load libusb-1.0.so and librtlsdr.so with
dlopen and then get the functions with dlsym.
When doing this, always dlopen() only the base filename including
the ABI version number, so in this case "libusb-1.0.so.0.1.0" and
"librtlsdr.so.0.0.0", without paths.
If you leave out the ABI version then dlopen() is likely to try to
open a linker script which distributions use to "redirect" the linker.
But they are only intended to be used at build time, not at run time.
Using the full filename including ABI version ensures that dlopen()
gets the correct file, and it is also prudent for your application,
since next year there may be a new version of librtlsdr.so which is
no longer ABI compatible with what you designed for today. If you
specify the filename with ABI version then your program still works,
or fails because the old ABI version is not found. If you specify the
filename without ABI version then best case the symbols you need
simply can't be found by dlsym() once the library has been opened.
This is not what I want. Libraries are linked into Linrad to support
various hardware. New versions are usually backwards compatible, but
they may support new hardware.

I have tried to open "libusb-1.0.so.0.1.0" but that makes no difference.
The call to libusb_init from librtlsdr still fails.

I also tried "librtlsdr.so.0.0.5" but that makes dlopen return zero
even though "/usr/local/lib/librtlsdr.so.0.0.5" does return a
non-zero handle. The call to libusb_init from librtlsdr still fails.

I have looked at this page:
http://tldp.org/HOWTO/Program-Library-HOWTO/dl-libraries.html
It tells me there is nothing wrong in specifying files with
the full path as I do and as I want to do.
Like this: "/usr/local/lib/librtlsdr.so"

I have been using RTLD_LAZY, but now I tried both RTLD_LAZY and
RTLD_NOW, both of them with or without RTLD_GLOBAL and RTLD_LOCAL.
I have also tried to load libusb-1.0 both before and after
librtlsdr. Nothing helps under Fedora 20.

Now, I have re-installed Fedora 20 using the same live CD I
used originally. Then updated packages to the latest state today.
That solved the problem!!! Linrad does work now so something
must have been corrupted on my previous installation.

This is highly unsatisfactory. I now have several corrupt
Linux installations. I have no idea what might be wrong, but
presumably there is a way to restore. I have tried to
reinstall gcc and binutils on my corrupt 32 bit Fedora 20
but that does not help.

The load functions look like this:
void load_usb1_library(int msg_flag)
{
  if(libusb1_library_flag)return;
  libusb1_libhandle=dlopen(LIBUSB1_LIBNAME, RTLD_LAZY);
  if(!libusb1_libhandle)goto libusb1_load_error;
  libusb_init=(p_libusb_init)dlsym(libusb1_libhandle, "libusb_init");
  if(dlerror() != 0)goto libusb1_sym_error;
  libusb_control_transfer=(p_libusb_control_transfer)dlsym(libusb1_libhandle, "li
  if(dlerror() != 0)goto libusb1_sym_error;
.
.
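
The dlsym()/dlerror() pairs in the excerpt above follow the pattern described in the dlsym man page, which also suggests calling dlerror() once before each lookup to clear any stale error. A minimal sketch of that pattern as a reusable function is shown here. The name load_symbol is made up for illustration and is not how Linrad is actually written.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: look up one symbol in an already opened library.
 * Returns the symbol address, or NULL after printing the reason. */
void *load_symbol(void *handle, const char *name)
{
    void *sym;
    const char *err;

    dlerror();                  /* clear any error left by an earlier call */
    sym = dlsym(handle, name);
    err = dlerror();            /* non-NULL only if this dlsym() failed */
    if (err != NULL) {
        fprintf(stderr, "dlsym(%s) failed: %s\n", name, err);
        return NULL;
    }
    return sym;
}
```

A caller would cast the result to the matching function pointer type, for example libusb_init=(p_libusb_init)load_symbol(libusb1_libhandle, "libusb_init"), and jump to its error path when NULL comes back.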
I tried the RTLD_DEEPBIND flag on librtlsdr. That causes the
address printed from inside the rtlsdr library to be the
same as the address from the main program. Then it seems
rtlsdr can call libusb functions, but when libusb_get_string_descriptor_ascii
is called by librtlsdr.c I get a segmentation fault.
The address of manufact is the same in librtlsdr.c as in Linrad;
presumably it is translated somehow before being delivered to libusb.

The whole thing is very confusing.

Regards

Leif
Alex Badea
2014-08-04 20:32:09 UTC
Hi,

I don't have specific ideas, but you might try setting $LD_DEBUG [1] to
shed some light on the linking logic. Possibly compare the dlopen() case
with the compile-time link case.

Cheers,
Alex

[1] http://man7.org/linux/man-pages/man8/ld.so.8.html
Post by Leif Asbrink
The whole thing is very confusing.
Leif Asbrink
2014-08-04 23:34:09 UTC
Hi All,

Problem solved:-)

There is a package soft66 which depends on libusb-1.0.
On systems where I had installed this package my load at
run-time fails because soft66.so loads libusb and libusb-1.0.

I do not have this hardware and on systems where I did not install
this package everything is fine. Now I have removed this package
from the linker list and load it at run-time if needed (hopefully.)

Now loading librtlsdr at runtime works without problems:-)

Sometimes there is a trivial solution to difficult problems;-)

73

Leif
