In my last post I was happy to report that I managed to get the USB interface to work. This interface has since proved to be extremely valuable in software development and testing. While the device is taking measurements you can look at the results (or intermediate results) at your PC in real time. You can even log large amounts of data to a .csv file and inspect the results in Excel.

But let’s look at the developements in a bit more detail. Last time the device was sending pulses and capturing some of the resulting zero-crossings but not much more than that. And in a not very structured way.

**Measuring amplitude**

Now it’s not only capturing the times when the zero-crossings occur but also measures the amplitude of the half-waves in between. In order to do that an interrupt is generated at every zero-crossing capture. Inside that interrupt service routine time of the zero-crossing is saved (as previously) and the ADC is started. In order to measure the amplitude at its maximum (or minimum in case of a negative half-wave), the ADC first samples the input for a 6.25 microseconds (which corresponds to a quarter-wave at 40kHz). I’ve chosen the sampling period slightly shorter to compensate for the time from when the zero-crossing occured unil the ADC is started. After the sampling period the internal sample-and-hold amplifier switches from sample to hold and the actual analog-to-digital conversion takes place.

The PIC’s datasheet advertises a maximum sampling rate of 1 million samples per second. Beware, though, that this requires separate Vref+ and Vref- connections and doesn’t work in differential mode anyway. In my configuration without separate analog references and differential measurements the maximum sampling rate drops to 400k samples per second. And the reported conversion time doesn’t include the sampling time yet. I wasn’t aware of that but luckily the PIC’s ADC is still more than fast enough for what I’m doing here.

**Finding the absolute phase**

Capturing the zero-crossings is relatively easy and the results are very precise. But there’s a challenge: there are hundreds of zero crossings for each burst of pulses sent. How do you know which zero-crossings to use?

The Arduino-based anemometer used an analog envelope detector to solve this problem. After averaging plenty of samples this works most of the time. But even after many attempts to tweek the analog circuit the envelope detector approach was never as reliable as I wanted it to be.

My standalone anemometer has an entirely different approach. Being able to digitally measure the amplitude of each half-wave we can tackle the challenge in software. We have a series of measurements and we have an idea of that the signal we are looking for looks like. Looking the scope screenshot below you could easily and reliably determine where the maximum amplitude occurs. That’s what we need to teach the software to do.

One could, of course, just loop through the all the amplitudes and find the maximum. But there are a lot of similar amplitudes and so any noise would make the result unreliable.

I did a bit of reading on digital signal processing (DSP) and realized that this is a classic DSP problem. The solution is a matched filter. You take the signal you are looking for (also known as *kernel* or* template*) and multiply it sample by sample with the captured signal and sum up the results. Do that for every possible (or reasonable) overlap of signal and kernel and find the position where the result reaches its maximum.

If you’re new to the subject (like I am), Steven W. Smith’s Digital Signal Processing is a great place to start. It’s even available online for free at dspguide.com. The method is also referred as convolution or correlation depending on who you ask. This wikipedia article gives a nice introduction to the subject.

**Implementing the matched filter**

I’ve parameterized the software to capture 34 zero crossings as well as the n=33 amplitudes in between them. The kernel consists of k=17 data points with a peak in the middle.

There are n-k+1=17 possibilities to entirely overlap the kernel with the signal. We can skip all the odd numbers where a negative kernel value would be multiplied with a positive sample value and vice versa. This leaves us with 9 possible positions to chose from.

For each position we need to do 17 multiplications and 16 additions. And we need to do that for every single measurement, i.e. 512 times per second. That sounds like a lot of work and it probably is. Luckily, since this is a very common DSP task, microcontrollers like the PIC32 have a special instruction for this. It’s called MAC – Multiply Accumulate and executes in a single clock cycle. The result is just amazing. The corresponding ISR takes less than 30 microseconds, including some housekeeping tasks as can be seen in the screenshot below.

Once the maximum amplitude is found, the 16 zero-crossings around it (i.e. the 8 before and the 8 after) are summed up and the sum is saved as the result of that measurement.

**Averaging measurements**

The goal is to report wind speed and temperature four times per second. Since there are 512 measurements per second we can use up to 128 individual measurements (32 in each direction) to do our calculations.

So we have plenty of data to identify and exclude outliers and do some averaging of the remaining samples. Robust statistics is the key word here, click here for the corresponding wikipedia article.

For now I’m doing something reasonably simple: First the 32 results per direction are sorted. The 12 lowest and 12 highest results are ignored and only the 8 results in the middle are summed up.

This might seem wasteful but it makes the average very robust against outliers and still results in 16 x 8 = 128 zero-crossings being averaged. The resolution of each zero-crossing is equal to 1 / 48MHz or about 20.83 nanoseconds. Summing up 128 of them gives us a resolution of far below a nanosecond. Beware however that resolution doesn not imply accuracy.

As a last step, the resulting sum is multiplied with 125 and then divided by 768 to make the unit 1 nanosecond. So every 250 milliseconds we finally have 4 time-of-flight measurements with a resolution of 1 nanosecond. That’s what we will use to calculate the winds speed as well as wind direction and temperature.

This sorting and averaging step is a bit more costly in terms of computation time. It takes around 600 microseconds to complete (see below). But unlike the convolution, it only has to be performed 4 times per second so this is more than fine.

**Calculating wind speed**

We are finally in a position to calculate the actual wind speed. There are various ways of doing this. For now I’ve just done the simple approach of taking the difference in time-of-flight, assuming room temperature and solving for wind speed. This means solving a quadratic equation for which I have resorted to floating point math and using <math.h> for the square root function.

I don’t have my wind tunnel any more so doing any testing was difficult. But one thing was soon obvious: at zero wind speed, the time-of-flight measurements match extremely well and are extrordinary stable from measurement to measurement. Also, looking at the intermediate results (i.e. the 128 single measurements before averaging) shows that there are basically no outliers present. I could randomly pick a measurement and still get a very reasonable value.

Something seems to be systematically wrong with the first of the 128 measurements but I haven’t had time to look into this. Otherwise, the results are impressively stable. And I’m only using a relatively primitive kernel for the matched filter.

**USB data logging**

As I’ve mentioned at the beginning, the USB interface lets me do some serious data logging even at this early stage of development. Here’s a list of what I can do by sending a character to the device from a USB terminal.

- ‘Z’: Get all the 34 zero-crossings of a single measurement
- ‘A’: Get all 33 amplitudes of a single measurement
- ‘F’: Get a full single measurement, i.e. 34 zero-crossings plus 33 amplitudes
- ‘T’: Get all the 4 x 32 = 128 time-of-flights for a measurement series
- ‘R’: Get the 4 averaged time-of-flights as well as wind speed and temperature

Some versions of these commands also let you specify the direction you’re interested in as well as how many sets of data you want to receive. This makes it easy to log large amounts of data that you can use to test possible algorithms on your PC before you implement them on the PIC.

**Next steps**

The next step is to get the I2C digipot working so I can control the amplification in software. My idea is to aim for a maximum amplitude of around 3 volts independent of wind speed and so on.

There’s also plenty of work to do to improve the algorithm that calculates wind speed and temperature. And then I also have to implement the I2C and SPI interfaces that let the anemometer communicate to other embedded devices like an Arduino or Raspberry Pi.

Having used floating point math and <math.h> I’m also running out of flash memory. I’m currently using 93% of flash (32kB total) and 52% of ram (8kB total). There will be a slightly revised board (Rev B) that uses a PIC32MX250 which is identical to the MX220 used here but has four times as much flash and ram.

That’s it for today. On the overview page you find the software at the stage of developement just described as well as some examples of logged data (all at zero wind speed) as .csv files.

The next post on this project is here: Part 24.