scottwilson's Recent Posts

I'm wondering if there is any documentation for the serial protocol. I'm talking with someone about a potential used purchase, as I've been considering adding support for the soundplane to my eurorack module... (the nw2s::b)

I've recently added support for a couple of USB devices including the monome and a gamepad. They went well, and now that I've got the USB host code basically worked out, I'd really like to include this guy for some easy computer-less soundplane-modular goodness.

I assume most folks work at the OSC or MIDI level, so the only people who needed the raw USB protocol were the original programmers.

I'm also curious if there are any version differences between the different runs. The monome has some significant hardware capability changes from version to version. This seems a bit more consistent, but I thought I'd ask.

Thanks!

s

Interesting. Good catch. I am guessing that this area of libusb gets nearly no exercise at all since, aside from audio devices, isochronous devices are few and far between... and those rarely have a use for userspace USB drivers.

As an aside, I got an email stating that the X15s are going through FCC certification, so they'll be shipping soon. The BeagleBoard site has a place you can sign up to get notified when they ship.

I don't have an X15... but the X15 has an AIC3104, which seems on the order of something you'd get in an Android device or iPhone. Should be serviceable.

The SOM I'm using has a WM8731 which is used on a lot of euro modules, but the driver isn't ready for it yet, so I'm using a USB audio device for my testing... that's the good thing about having a full ALSA stack.

I'm still debating what the module's final DACs will be. The 'b uses latching DACs, which are better for CV but basically unusable for audio. I'll switch to some audio DACs if I can get them to perform with enough CV accuracy to run pitch CV. Otherwise, I'll have to have a few of each.

Yeah, that's irritating. There are virtual root "hubs" so the hub it's referring to is probably not your hub, but the driver's root hub. Definitely a low level issue that software won't likely fix.

Is there any specific case where the USB is disconnecting? What does it take to get it connected again - plug/unplug or just restarting the app? Any syslog messages? I haven't experienced that yet on the AM57x board... and I've left my soundplane plugged in for multiple weeks as I write code and deploy it remotely onto the board via rsync.

The sockets I'm using are AF_LOCAL/AF_UNIX, so they are stream-based but not IP-based. Everything stays in the kernel, so there's about as little overhead as possible for IPC.

OSC would require serializing to strings and deserializing which isn't necessary. Any OSC/MIDI/etc control type protocols will be implemented at a slightly higher level in my architecture since it's a bit more general purpose and is meant to be adaptable to a number of different control surfaces.
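The AF_UNIX stream mechanism above can be sketched in a few lines (Python here, purely as an illustration of the mechanism - not the actual nw2s server code, and the "touch" record format is made up):

```python
import socket

# socketpair() returns two connected AF_UNIX stream endpoints; bytes
# written on one side come out the other without ever touching the IP
# stack -- everything stays in the kernel.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

a.sendall(b"touch 1 0.50 0.25 0.80\n")  # hypothetical touch record
received = b.recv(64)
print(received.decode(), end="")
```

A real server would bind a named socket path instead of using socketpair(), but the transport semantics are the same.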

I'm moving on to getting the DSP cores running, and may leverage the soundplane to be some sort of granular/spectral manipulation control surface once I get that code migrated over from SHARC to C66x.

s

Done with the 'python' binding. I attempted to do a native binding where the soundplanelib code was running within the context of python via cython, but there's just a little too much multithreading going on for it to make sense. cython/boost-python, et al, are generally meant to wrap math libraries and such - not servers.

So instead, I built a socket server that dumps touch data to a unix socket. The python code connects to the socket and tracks the touches from there.
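A minimal client along those lines might look like this sketch (the socket path and the line-oriented "id x y pressure" record format are my assumptions for illustration - see SoundplaneServer.cpp and touchtrackertest.py for the real details):

```python
import socket

SOCKET_PATH = "/tmp/soundplane.sock"  # hypothetical path

def parse_touch(line):
    """Parse one hypothetical text touch record: 'id x y pressure'."""
    fields = line.split()
    return {"id": int(fields[0]),
            "x": float(fields[1]),
            "y": float(fields[2]),
            "pressure": float(fields[3])}

def read_touches(path=SOCKET_PATH):
    """Connect to the touch server and yield parsed touch records."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        buf = b""
        while True:
            data = sock.recv(4096)
            if not data:                      # server closed the socket
                return
            buf += data
            while b"\n" in buf:               # one record per line
                line, buf = buf.split(b"\n", 1)
                yield parse_touch(line.decode())
```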

For your reference:

https://github.com/nw2s/libsoundplane/blob/master/src/SoundplaneServer.cpp
https://github.com/nw2s/libsoundplane/blob/master/test/touchtrackertest.py

-s

Yeah, Soundplane model has a lot in it. For my purposes, I am really only interested in touches and I have a centralized "server" that is made to translate raw events from different types of devices to different outputs - effects, CV, USB MIDI, OSC, csound, puredata, etc.

Even zones would be covered by what is in effect a service bus - then zones are not much different from splits on a keyboard from my code's perspective.
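As a sketch of that zones-as-splits idea (the zone table and names here are hypothetical, just to illustrate the mapping):

```python
# Hypothetical zone table: each zone is an x-range routed to its own
# output channel, exactly like a split point on a keyboard.
ZONES = [
    {"name": "left",  "x_max": 0.5, "channel": 1},
    {"name": "right", "x_max": 1.0, "channel": 2},
]

def route_touch(x):
    """Map a touch at normalized x position (0..1) to a channel."""
    for zone in ZONES:
        if x <= zone["x_max"]:
            return zone["channel"]
    return ZONES[-1]["channel"]  # clamp anything out of range
```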

So what you've done serves my purposes perfectly.

For you, it would probably make more sense to pare it down to what you need rather than introduce overhead in a hub-spoke type of architecture like I'm working on.

...but I should have my python bindings done for soundplanelite in a few more days if you want to use it as an easier way to tack on things like OSC support through python libraries.

s

Thanks for the code. Got it building on my dev board. For the record, it oscillates between 5% and 15% CPU, so similar behavior to what you were seeing. Really do wonder what that is... But still - great performance - plenty of headroom for the rest of the code.

I'll pass along more optimizations as I get them running - first I'm going to work on some python bindings so I can incorporate it into the base framework of the module. Woot! Beautiful code, by the way - easy to read and find my way around. I tried your new tracker, but yes, it's not quite there, so I'm going to stick with the one that's running now. This will get me way down the road, and that's all I can ask.

-s

...specifically for clamping:

float32x4_t vminq_f32 (float32x4_t, float32x4_t)
float32x4_t vmaxq_f32 (float32x4_t, float32x4_t)

Thanks. I saw a similar thing with the clamping. It's an obvious point for optimization... Two things will make it faster - vectorization and getting rid of conditionals.

The neon vmin/vmax will do the operations without conditionals, but not every call of clamp() is something that can be vectorized... not that that's an issue, the one-off calls are more about making sure the input parameters are in a valid range.

The big one is likely MLSignal::sigClamp - and vectorizing that will be simple and will probably reduce its time by a couple of orders of magnitude.
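The branchless pattern is just min/max composition. A scalar sketch (Python purely for illustration - the NEON version applies the same vminq_f32/vmaxq_f32 idea to four float lanes per instruction, with no branches to mispredict):

```python
def clamp(x, lo, hi):
    # min/max instead of if/else -- the same shape as
    # vmaxq_f32(lo, vminq_f32(x, hi)) on NEON.
    return max(lo, min(x, hi))

def sig_clamp(signal, lo, hi):
    # Whole-buffer clamp: on NEON this loop would process four
    # samples per iteration instead of one.
    return [clamp(s, lo, hi) for s in signal]
```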

The convolution may take a bit more consideration to get everything lined up properly to accelerate with vectorizable operations. Not sure there's an easy way to optimize the fact that it needs to be done 3 times - That's the equivalent of running the same signal through three identical filters... To emulate that with a single filter would simply require the filter to be three times as big - defeating the purpose.
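That filter-composition point can be checked numerically: three passes with the same kernel equal one pass with the kernel convolved with itself twice, and that combined kernel has 3·L − 2 taps, nearly three times as long. A small self-contained sketch (not the actual MLSignal code):

```python
def conv(signal, kernel):
    """Plain full 1-D convolution, no external dependencies."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

k = [0.25, 0.5, 0.25]        # a 3-tap smoothing kernel
s = [0.0, 1.0, 0.0, 0.0]     # an impulse-ish test signal

three_passes = conv(conv(conv(s, k), k), k)
one_big = conv(s, conv(conv(k, k), k))
# Same result either way, but the combined kernel has
# 3 * len(k) - 2 = 7 taps instead of 3 -- defeating the purpose.
```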

But considering there's no optimization whatsoever, once it's optimized, the fact that it's happening three times will be less of an issue - and will just give you three times the benefit for each optimization you make.

Yeah, the x15 will be great for general use and will be a great deal for ~$300 when you can readily get them. Like you say, though, the BBB will probably do fine assuming you're not asking it to also run X... It's available, cheap, and can be run on battery. The x15 is a bit more of a power hog.

For the same reason you mention, I've been weighing how best to build the b2/dsp in a standalone form factor. It could probably be useful in a form factor somewhere on the continuum of aleph/anushri/elektron. We'll see... I'm on a shoestring budget, but have a few smaller modules I need to get out to help fund this crazy thing.

I did find a beagleboard x15 in stock earlier this year, but don't see it now. Mouser's not getting any more until May!

The x15 didn't fit my form factor or IO requirements, so I'm using a dev board for the SOM I will be using in the final hardware. It's the Compulab CL-SOM-AM57x - a nice compact module with a better IO complement and fewer of the extraneous connectors that would make the eurorack module a bit awkward.

Would love to see your code - especially the command line test. I'm making my way through the new tracker, but I see I still have a ways to go even once I get this stuff compiling. There's a lot there that isn't used if all you really want is the touch tracker output.

Anyway, bedtime!

The version of SSE2NEON I've found (https://github.com/jratcliff63367/sse2neon/blob/master/SSE2NEON.h) doesn't have _mm_div_ps in it at all...

It does have _mm_mul_ps, which it maps to vmulq_f32. The division version of that intrinsic is vdivq_f32, which is listed as only being supported on 64-bit ARMv8.

http://infocenter.arm.com/help/topic/com.arm.doc.ihi0073a/IHI0073A_arm_neon_intrinsics_ref.pdf

...but now that you mention it, that document is 2 years old and when I swap in the div operation, it compiles just fine. So maybe I was being too cautious there.

-s

...well, that's exciting. Just optimizing the few SSE calls to NEON changes the CPU utilization for pressure decoding from 2.5% to 0% as measured with pidstat.

I was looking through MLSignal and MLDSP and see a lot of vector operations that aren't optimized yet. Makes migration a lot easier for sure, but also some room for improvement as I get a little further. Back to other commitments for a while, but I'll be picking back up looking through the tracking code. Thanks for the help so far!

s

Thanks! I'll reach out via PM and we'll coordinate. I'm still wrapping my head around the code, but I certainly got a lot further than I thought I would in just a couple of days' hacking.

Before I get there, I'm going to POC moving some of the sse code to neon by hand rather than through a wrapper. The pressure decoding is hitting about 2.5% CPU with no optimizations, so I should be able to get that down a little and it will help me confirm how I want to start writing the optimized C code. (Probably #ifdef ARM_NEON)

Looking good! Got this running on the AM5728 with no optimizations at all. I'll probably spend some time cleaning up a bit and seeing if I can optimize some of the vector code with NEON at all before tackling the tracking work.

|           .  :                                                 |
|            .;x:  .  .   .   .                   . ::           |
|        ...=X@@X...                             ;=X@@$+.        |
|        :+x@@@@@x;::. .                  .. ....+$@@@@x:        |
|    ...::+=@@@@@=;.::.                     ..  .+$@@@@$+...     |
|        .;:;XXX+. .                             .::++::         |
|           .::=;                                 .:;;..         |
|              :.                                                |
signal @ 0xbec86b38 64x8x1 [512 frames] : sum 61.9141

(the ampersands didn't translate - converted to @)

Now I'm in business. I was missing the udev rule. Getting frames now! (And yes, they have actual values in them, so I don't have the RPi problem)

SUBSYSTEM=="usb", ATTRS{idVendor}=="0451", ATTRS{idProduct}=="5100", MODE="666"

Moving on to peak detection!

Nice - is it available on a repo anywhere? I'd love to give it a shot and see if I can help at all.

Interestingly the syslog does show the correct manufacturer and product even though it's not in lsusb, so I feel a little better.

libusb is still reporting status 0 in hellosoundplane, but... I never knew I had serial number 6. Is that right?

Mar 15 11:44:02 nw2s-0001 kernel: [463066.642188] usb 3-1.1.2: Product: Soundplane Model A
Mar 15 11:44:02 nw2s-0001 kernel: [463066.647854] usb 3-1.1.2: Manufacturer: Madrona Labs
Mar 15 11:44:02 nw2s-0001 kernel: [463066.653476] usb 3-1.1.2: SerialNumber: 0006

Hello! I've been away for quite a while, haven't I? Looks like I'm not the only one hacking away.

In case you guys are interested: after I began looking into support for the soundplane, I was also adding support to the 'b for a few other USB devices like USB MIDI, monome, etc. The Arduino-based code was getting cumbersome and I was stretching the little SAM3X to its limits. I was also spending too much time writing USB drivers from scratch.

So I started looking for Unix-based systems that supported DSP, as I was very interested in audio as well. The new SHARC+ chips really interested me - dual SHARC cores and an ARM core running buildroot. I wrote quite a bit of DSP code for that core - some spectral manipulation and an impulse response reverb.

Perhaps luckily, I was unable to get access to the early developer program, so I could not get their linux kernel. So I started looking around, and around then the Beagleboard x15 was announced. It's got a TI AM5728 - dual 1.5GHz ARM Cortex-A15 cores, dual C66x DSP cores, lots of RAM, etc.

The x15 doesn't quite fit the eurorack format, but the processor is perfect and there are other modules available that are a better fit. I've been working with that for a couple of months now and love it.

I've since gotten USB MIDI support, monome support, and gamepad support running and now it is time to tackle the soundplane! I've spent only a few days so far porting some code, so I don't have much progress to report.

I can say that the device is listed in lsusb, but when running my simplified hellosoundplane, the LED does not come on and no frames are received.

the new b2/dsp will support HDMI video for hackers, but everything I'm building is headless, so after I get past getting some USB frames (hopefully with more luck than the RPi), then I'll be looking at migrating only the core madronalib to NEON - and perhaps with a little C66x help?

I'll let you know how it goes.

Just as a sanity check - Can I assume that this entry in lsusb indicates everything is fine?

Bus 003 Device 057: ID 0451:5100 Texas Instruments, Inc. 
Couldn't open device, some information will be missing
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass          255 Vendor Specific Class
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  idVendor           0x0451 Texas Instruments, Inc.
  idProduct          0x5100 
  bcdDevice            0.95
  iManufacturer           1 
  iProduct                2 
  iSerial                 3 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           41
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          4 
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              230mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           0
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass      0 
      bInterfaceProtocol      0 
      iInterface              4 
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       1
      bNumEndpoints           2
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass      1 
      bInterfaceProtocol      0 
      iInterface              5 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes           37
          Transfer Type            Isochronous
          Synch Type               Asynchronous
          Usage Type               Implicit feedback Data
        wMaxPacketSize     0x0184  1x 388 bytes
        bInterval               1
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes           37
          Transfer Type            Isochronous
          Synch Type               Asynchronous
          Usage Type               Implicit feedback Data
        wMaxPacketSize     0x0184  1x 388 bytes
        bInterval               1

Thanks for that overview. The callback is the glue I was missing.

The initial goal would just be to establish the communications link, receive the data, and unpack the FFT data without processing anything. Without any OS, this is usually most of the work. The soundplane is clearly an exception to that.

Looking through some of the processing code, I feel like the initial goal would be to simply track pressure and position of a single point, and I'm guessing I could take a few shortcuts doing that.

That will get me a lot more familiar with the data format so that I could move it to an STM32F4, which would triple the processing power, add floating point, and add a more robust instruction set - still not a MacBook Pro, but it would be dedicated hardware, so we'll see... one step at a time.

Thanks,

Scott

Okay, last one before bed. (Work tomorrow?!?!)

The USB driver unpacks raw FFT data, which is then translated by code running on the host into positioning data?

Can you point me to the class that is doing this base work?

thx.

s

Looking at SoundplaneDriver.cpp on GitHub, it looks like most everything is there.

Am I right in reading that basically all data is transferred as isochronous packets? This will be my first isochronous device to build a driver for. Looks like I'll also be unpacking the floats into 32-bit ints as best I can to help the Due along a bit.
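For the float-to-int unpacking, fixed point is the usual approach on FPU-less chips like the Due's SAM3X. A hypothetical Q16.16 sketch - the format choice is mine for illustration, not anything from the Soundplane code:

```python
Q = 16  # hypothetical Q16.16 format: 16 integer bits, 16 fractional bits

def float_to_q16(x):
    """Convert a float to a signed 32-bit Q16.16 integer."""
    return int(round(x * (1 << Q)))

def q16_to_float(n):
    """Convert a Q16.16 integer back to a float."""
    return n / (1 << Q)

def q16_mul(a, b):
    # The raw product of two Q16.16 values is Q32.32, so shift
    # right by Q to renormalize back to Q16.16.
    return (a * b) >> Q
```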

I'll let you know as soon as I have some more questions!

s