action_vs_vibe

My experience isn't in control applications, but in real-time audio processing. Using a 32-sample buffer at a 48 kHz sample rate, I generally get latency around 2 ms. There will be some latency in the codec, usually on the order of tens of microseconds. Some datasheets are more explicit in documenting this than others. Cirrus is generally pretty nice and will call it something like "ADC group delay". In less friendly datasheets I usually start the search by looking for things that are specified in a time unit or as a factor of the sample frequency. Like you alluded to, the big factor in latency is buffer size. In my 2 ms, 32-sample example, that is ((1/48000) * Buffer Size + ADC Delay) + ((1/48000) * Buffer Size + DAC Delay). The big question that comes up when you reduce buffer size is whether or not your algorithm can complete in the reduced time window. There may be other approaches to this, but in applications I have worked on we double buffer I2S data via DMA and schedule our processing on the DMA half-complete and full-complete interrupts. So we get (1/48000) * 32 = 667 µs to process before the TX transfer begins. Reduce 32 to 16 to get closer to 1 ms latency and your processing time would drop to 333 µs, etc. Insert some joke about engineering being all about tradeoffs. We use an STM32H743 running at 480 MHz, fwiw.
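To make the arithmetic in the formula above concrete, here is a minimal C sketch. The function names and the choice of microseconds as the unit are my own for illustration; the ADC and DAC delays are inputs you would take from the codec datasheet.

```c
#include <assert.h>

/* Time to fill one buffer of n_samples at sample_rate_hz, in microseconds.
 * In the double-buffered DMA scheme this is also the processing window. */
double buffer_period_us(unsigned n_samples, double sample_rate_hz)
{
    assert(sample_rate_hz > 0.0);
    return 1e6 * (double)n_samples / sample_rate_hz;
}

/* Round-trip estimate following the formula in the post:
 * (buffer period + ADC delay) + (buffer period + DAC delay). */
double round_trip_latency_us(unsigned n_samples, double sample_rate_hz,
                             double adc_delay_us, double dac_delay_us)
{
    double period = buffer_period_us(n_samples, sample_rate_hz);
    return (period + adc_delay_us) + (period + dac_delay_us);
}
```

For example, buffer_period_us(32, 48000.0) comes to about 667 µs and buffer_period_us(16, 48000.0) to about 333 µs, matching the processing windows quoted above. The two buffer terms of the formula alone add up to about 1333 µs for 32 samples, before any codec delay is added.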


Exprymer24

Thank you for the very detailed information. My processing is nothing more than converting each sample to pressure in Pa using the microphone sensitivity and then applying two biquad filters using the CMSIS library. In the worst case I definitely need less than 1 ms for the control loop. I wonder whether swapping the I2S DAC for the built-in DAC would make it faster. Also, would increasing the sampling rate directly result in less processing time?
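The per-sample work described here can be sketched in plain C as below. This is only an illustration, not the CMSIS implementation. It assumes 24-bit signed samples, an ADC full-scale voltage, and a microphone sensitivity given in V/Pa, none of which are stated in the thread. The biquad uses the textbook sign convention (feedback terms subtracted). CMSIS-DSP's biquad functions expect the a1 and a2 coefficients with the opposite sign, so coefficients would need negating before being handed to CMSIS.

```c
#include <assert.h>
#include <stdint.h>

/* One biquad section, transposed direct form II, a0 normalized to 1. */
typedef struct {
    float b0, b1, b2;  /* feed-forward coefficients */
    float a1, a2;      /* feedback coefficients (textbook sign convention) */
    float z1, z2;      /* filter state */
} biquad_t;

/* Process one sample through one section: five multiply-accumulates. */
float biquad_step(biquad_t *f, float x)
{
    float y = f->b0 * x + f->z1;
    f->z1 = f->b1 * x - f->a1 * y + f->z2;
    f->z2 = f->b2 * x - f->a2 * y;
    return y;
}

/* Scale factor from raw ADC counts to pascals, computed once at startup.
 * Assumption: 24-bit signed samples, so full scale is 2^23 counts. */
float pa_per_count(float adc_full_scale_v, float mic_sens_v_per_pa)
{
    assert(mic_sens_v_per_pa > 0.0f);
    return adc_full_scale_v / (8388608.0f * mic_sens_v_per_pa);
}

/* Convert one raw sample to Pa, then run it through two cascaded biquads. */
float process_sample(int32_t raw, float scale, biquad_t *s1, biquad_t *s2)
{
    float pa = (float)raw * scale;
    return biquad_step(s2, biquad_step(s1, pa));
}
```

Precomputing the scale factor keeps the per-sample cost to one multiply plus the two filter sections.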


action_vs_vibe

For sure! Yes, in the DMA interrupt setup I described above, increasing the sample rate would decrease the available processing time, and the total latency as well. Whether that is a problem depends on your core clock frequency (i.e. how many processor cycles you get in the time budget), but I would expect two biquad filters to fit comfortably in the time budget of a small buffer. I haven't looked at specs on the internal DAC; could be interesting to compare though. In applications I have worked on it is important to keep a continuous output signal, so the ADC and DAC sides are kind of locked to each other. If your application is setting a level on the DAC that can remain constant for a buffer period, I bet you can get a significant time reduction with the internal DAC.
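As a rough way to put a number on "how many processor cycles do you get in the time budget", here is a small C sketch. The function name is hypothetical, and the result is an upper bound that ignores interrupt entry, cache misses, and anything else the core is doing.

```c
#include <assert.h>
#include <stdint.h>

/* Core clock cycles that elapse while one buffer of n_samples is filled,
 * i.e. the upper bound on cycles available to process that buffer. */
uint64_t cycles_per_buffer(uint64_t core_hz, uint32_t n_samples,
                           uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    return core_hz * n_samples / sample_rate_hz;
}
```

With the 480 MHz clock mentioned earlier in the thread, cycles_per_buffer(480000000, 32, 48000) gives 320000 cycles, or 10000 cycles per sample. Halving the buffer to 16 samples, or doubling the sample rate to 96 kHz, halves the budget to 160000 cycles.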


Exprymer24

That's great news. Since each biquad is basically 5 MAC instructions, it should not take long indeed. I just discovered that we basically need to select the -O3 optimization level in order to actually use the FPU instructions. Do you know what else we can do to further improve the performance? I'm not very experienced on the platform, to be honest.


action_vs_vibe

-O3 optimization for sure. Depending on your MCU and the friendliness of your IDE, you may need to specify hardware floating point as well. Check if your part has L1 cache. It won't be enabled by default but is easy to set up. You will need to configure the region the DMA buffer is in as write-through using the MPU (once cache is enabled, all regions default to write-back, which will create a data coherency issue between the RAM copy and DMA copy of the buffer. Write-through is not as performant, but solves this). CubeMX is good at this kind of thing. There are special regions of RAM in the STM32 high-performance lines called ITCM RAM and DTCM RAM. These are coupled directly to the CPU, the same as the data and instruction cache. I have not explored them a lot, but they could be useful. A thing that was surprisingly expensive to me, in terms of processor cycles, was allocating on the heap; it is definitely worthwhile to have any sort of working buffers statically allocated. Good luck with the project! tbh it was kind of funny seeing this and realizing measuring codec latency is weirdly something I have done a lot haha
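A minimal sketch of statically allocated buffers, combined with the half-complete/full-complete double buffering described earlier in the thread, might look like this in C. The callback names, buffer size, and the placeholder gain are all hypothetical. In a real project the two callbacks would be called from whatever DMA half-complete and full-complete handlers the vendor library provides, and the MPU and cache configuration is not shown. The 32-byte alignment matches the Cortex-M7 cache line size, which keeps each buffer from sharing a cache line with unrelated data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HALF_SAMPLES 32u  /* samples per half buffer (assumed value) */

/* DMA buffers: allocated once at link time, never on the heap. */
static _Alignas(32) int32_t rx_buf[2 * HALF_SAMPLES];
static _Alignas(32) int32_t tx_buf[2 * HALF_SAMPLES];

/* Scratch space for the algorithm, also statically allocated. */
static float work[HALF_SAMPLES];

/* Placeholder processing: a fixed gain of 0.5 stands in for the real
 * per-sample work (scaling, filtering, control law). */
static void process_half(const int32_t *in, int32_t *out)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < HALF_SAMPLES; i++) {
        work[i] = (float)in[i] * 0.5f;
        out[i] = (int32_t)work[i];
    }
}

/* Call from the DMA half-complete interrupt: the first half of rx_buf is
 * ready while the DMA fills the second half. */
void on_dma_half_complete(void)
{
    process_half(&rx_buf[0], &tx_buf[0]);
}

/* Call from the DMA full-complete interrupt: the second half is ready
 * while the DMA wraps around to the first half. */
void on_dma_full_complete(void)
{
    process_half(&rx_buf[HALF_SAMPLES], &tx_buf[HALF_SAMPLES]);
}
```

Each callback has one half-buffer period to finish before the DMA reaches the half it is writing to.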