Discussion:
patch for libusb errors
Peter A. Bigot
2014-07-13 10:18:49 UTC
Permalink
Please consider the attached patch, also available at
https://github.com/pabigot/rtl-sdr/tree/fix-libusb-segv. This reworks
the buffer management in rtlsdr_read_async to ensure that libusb has no
outstanding transactions related to transfer buffers prior to those
buffers being freed.

Peter
Peter Stuge
2014-07-13 10:59:29 UTC
Permalink
Peter A. Bigot wrote:
Peter A. Bigot
2014-07-13 11:22:56 UTC
Permalink
Peter A. Bigot
2014-07-14 10:21:31 UTC
Permalink
Sylvain Munaut
2014-07-14 11:02:51 UTC
Permalink
Hi,
Is there interest from the rtl-sdr maintainers in my pursuing this?
I think there is definitive interest.

But it also touches a part of the code that's been a source of many
issues before, and so will require a *bunch* of testing to make sure
it doesn't break anything either. Including testing on other platforms
like Windows and OSX. So don't expect it to be merged on a whim.

I'd also encourage readers on those platforms (win/osx) to test this
patch and report, and even when people on linux. We don't have access
to every platform out there and we rely on people actually testing
this and reporting back their results.


Cheers,

Sylvain
Peter A. Bigot
2014-07-14 11:31:24 UTC
Permalink
Post by Sylvain Munaut
Hi,
Is there interest from the rtl-sdr maintainers in my pursuing this?
I think there is definitive interest.
But it also touches a part of the code that's been a source of many
issues before, and so will require a *bunch* of testing to make sure
it doesn't break anything either. Including testing on other platforms
like Windows and OSX. So don't expect it to be merged on a whim.
I'd also encourage readers on those platforms (win/osx) to test this
patch and report, and even when people on linux. We don't have access
to every platform out there and we rely on people actually testing
this and reporting back their results.
Excellent. Yes, please do test: my only available systems are
Ubuntu-based (mostly 12.04-LTS with different hardware enablement
stacks), and validating it on Windows and other versions of libusb is
clearly important. I'd like to see some feedback.

I'm continuing to add minor patches for edge cases to the
fix-libusb-segv-r2 branch. At the moment I sometimes see a shutdown
from interrupt taking a few seconds; I'm continuing to evaluate that.
You can see the cumulative differences at any time at:
https://github.com/pabigot/rtl-sdr/compare/pabigot:master...fix-libusb-segv-r2

Once there's feedback from others confirming that it works and the
maintainers are ready for it, I'll roll an -r3 so the upstream history
is clean. Personally I think what's there now is very pretty (but then
I would, wouldn't I).

Peter
Peter Stuge
2014-07-14 13:28:12 UTC
Permalink
I keep getting close, but haven't conquered the edge cases.
In particular, when things go wrong I can still get hangs or segfaults,
because libusb_submit_transfer() and libusb_cancel_transfer() do not give
reliable information about whether the callback will be subsequently be
The callback will always be called.
even with error returns, sometimes it is and sometimes it isn't.
If the callback is not called that is a (severe) bug in the libusb-1.0
implementation you are using.

The original author of libusb-1.0, Daniel Drake, made me maintainer
when he moved on from libusb.org to work on other things. Meanwhile,
the libusb project on SF and github has been taken over by the
libusbx developers, who created their hostile fork of libusb because
they were frustrated by my development style.

My development style is to be correct before anything else.
Some of the libusbx developers write really poor code really quickly.

Distributions all switched to package the libusbx developers' code,
when one of the libusbx developers who works for Red Hat announced
that he had switched the Fedora package.

I follow the developments and I continue efforts at libusb.org,
although no code has entered the git in a long time.

I so far haven't seen any issue with a libusb-1.0 implementation
causing lost callbacks, and I'd be interested in understanding that
problem better, and certainly fixing it if libusb.org code does it.
There may be a way to determine from the specific error value, but that
behavior isn't documented. Without well-defined behavior, I can't tell
when it's safe to release the transfer buffer.
It's safe to free the buffer after a sync API transfer returns or at
any time after the callback is called for an async API transfer.

libusb is done with the transfer when the callback is called, to
allow the transfer to be resubmitted by the callback if it wants to.
This is by design and I believe documented, but maybe not.
the specific libusb versions I have to deal with.
Which are those?

I'd appreciate if you give the code on git.libusb.org a go, that's
the only code I can recommend as a baseline. There are plenty of
things that it doesn't do, which libusbx developers have implemented
some way or other, and if it's a good way then they will be included
by me too at some point.


//Peter
Peter A. Bigot
2014-07-14 14:31:48 UTC
Permalink
[libusb details; thank you]
I had noticed the version confusion and the fork, and won't be getting
involved in that fight. Ubuntu 12.04-LTS seems to use 1.0.9 from
libusb.org (yours) while Ubuntu 14.04-LTS seems to use 1.0.16 from
libusbx (theirs).

At the moment I'm using pre-built 1.0.9. I will also verify against
pre-built 1.0.16 and self-built head of libusbx (currently 1.0.19)
before declaring the rtl-sdr patch ready to merge.

The libusb_submit_transfer() inconsistency I saw was ultimately my
error, but appeared otherwise due to an apparent bug in libusb which
I'll send you off-list as it's not relevant to rtl-sdr.

I did reliably reproduce a problem with libusb_cancel_transfer() when 15
transfers were submitted, and after the first callback they were all
cancelled: in that situation, the callbacks for successful cancels were
not reliably invoked.
https://github.com/pabigot/rtl-sdr/commit/bf458c873f80347e3ecc2bd2253a0e69e39cc796
fixed that in my branch.

If I do encounter other problems related specifically to libusb behavior
(any version) I'll move the discussion to the libusb-devel mailing list.

At this time I have no confirmed issues present in the patches at
https://github.com/pabigot/rtl-sdr/compare/pabigot:master...fix-libusb-segv-r2
although sometimes control-C does not take effect as quickly as I
expect. I'm continuing to monitor that issue, but look forward to
feedback from people trying the changes on other platforms.

Peter

Peter A. Bigot
2014-07-13 12:31:41 UTC
Permalink
Revised commits to rework libusb buffer management are available at:
https://github.com/pabigot/rtl-sdr/tree/fix-libusb-segv-r2

This is a refactored/rebased version of the content of
https://github.com/pabigot/rtl-sdr/tree/fix-libusb-segv, which has been
extended since first proposed. It reverts the 2014-02-17 workaround for
the problem along with more robust cancellation, and extends previous
proposed patch to handle cancellation because of libusb errors as well
as to external signals or application calls to rtlsdr_cancel_async.

I have tested this with rtl_sdr on Ubuntu-12.04 to ensure clean shutdown
due to keyboard interrupt and to removal of dongle, and using another
application which cancels the read within the program itself. The
patches eliminate segfaults that were observed under these conditions
with the unpatched sources.

Please evaluate fix-libusb-segv-r2 for a merge to master.

Peter
Post by Peter A. Bigot
Please consider the attached patch, also available at
https://github.com/pabigot/rtl-sdr/tree/fix-libusb-segv. This reworks
the buffer management in rtlsdr_read_async to ensure that libusb has
no outstanding transactions related to transfer buffers prior to those
buffers being freed.
Peter
Continue reading on narkive:
Loading...