Kalman Filtering

As a gift this last Christmas, I received a Canon PowerShot SX230HS. This camera has the ability shoot high-speed video at 120 and 240 frames per second. I knew immediately that I wanted to use this feature to do some physics, tracking, and signal processing experiments. There is nothing that incorporates these three things more than a Kalman filter. This post will detail some of the basic Kalman filtering experiments this camera has allowed me to explore.

The first experiment tracks an object in free fall and uses a Kalman filter to estimate the position, velocity, and acceleration due to gravity. The second experiment looks at a harmonic oscillator consisting of a weight suspended on an elastic band and uses an extended Kalman filter to estimate the damping ratio, the natural frequency, and the band’s spring constant. I’ll post all the data and Matlab M-Files for the filtering for those that are interested at the end of the post.

Camera Calibration

Before diving into the experiments, I thought it would be worthwhile to double check the manufacturer’s claim on the frame rate of the high-speed video modes. The manual states that the camera can operate at both 120 FPS and 240 FPS, however they give no tolerance values on these rates. To check the true high-speed frame rate, I created a simple system that used the FPGA from the previous post, to turn on a series of LEDs for 1 millisecond once every second. I then filmed these lights in the high speed mode and used the lights as a 1 second time-stamp.

In the picture below you can see the images of the FPGA immediately before and during the LED illumination. To identify the frame which contained the illuminated LEDs, I integrated the total energy in the white box, shown below, for each frame.

The graphic below shows the integrated energy from frame to frame while shooting at 240 FPS. The LEDs are able to provide a very high SNR time mark that allows for easy identification. It’s interesting to note that every fifth second, there appears to be a slight dip in the total energy. I believe that in this case, the 1 ms illumination is actually straddling two frames, so the energy is spread compared to the other cases.

To help visually track the actual frame rate, I reshaped the 1D plot above, into a 2D image with columns of length 240. If the frame rate were truly 240 FPS, the 2D image should have a single bar horizontally across the image, indicating that 240 frames represents 1 second exactly. As you can see below, the line is not horizontal, and instead indicates that the frame rate is slightly higher than 240 FPS. Also, this image makes it clear that the dips in energy seen above are due to the illumination spreading across two frames.

To get a better idea of the exact frame rate, I plotted the deviation from an ideal 240 FPS below. There is a clear linear trend, with a slope of approximately 0.386 Frame/Sec. This indicates that the true frame rate of the camera, shooting at the supposed 240 FPS, is 240.386 FPS. I performed a similar check at the 120 FPS and found the rate to be about 120.12 FPS.

Armed with an accurate estimate of the true time between frames, I was then able to proceed with the data collection and tracking.

Falling Weight

The setup for the first experiment consisted of the camera mounted on a tripod filming a weight falling in front of a wall. The wall contained a measuring stick to allow me to make accurate measurements of the distance traveled. An image from the movie is shown below, with the weight in free fall.

The tracking was done in Matlab and consisted of very simple blob detection. The goal of this experiment was not to test out computer vision algorithms, it was to use the high-speed camera to get tracking data, so I used a very simple algorithm to detect the weight. The displacement data could have just as easily come from me manually identifying the location of the weight from frame to frame. One thing to note, I waited a few frames after the weight left my hand to begin tracking. This ensured that there were no unmodeled forces on the weight, but also introduced a slight initial velocity to the object.

The plot below shows the measured displacement along with a quadratic fit to this line. As you can see, the data is quite clean and the fitted line is very close to what I would expect for this setup. The thing to remember about this fit, is that I had to wait until all the data was collected and then perform the least squares fitting, the Kalman filter allows this calculation to be done recursively with each new sample.

Kalman Filter Tracking

The first step in using the Kalman filter is identifying the model to represent the system. I chose to model the falling weight as a mass moving with constant acceleration. This model ignores the effects of air resistance, but for a free fall of this sort distance in still air, this model seems reasonable. The continuous state-space model is shown below with w(t) representing a random acceleration, i.e. the process noise.

The representation shown above is not the only way to model an object in free fall. Since the acceleration due to gravity on Earth is well known and constant, I could incorporate this into the model and reduce the state-vector to 2 dimensions. Instead, I chose to augment the state vector and allow the filter to converge to the acceleration. This convergence will provide a check on the overall performance of the filter.

For this model to be of use in the Kalman filter, it must be moved from the continuous representation into a discrete form. This requires finding the state-transition matrix that will propagate the current state to the state at the next sampling interval. This calculation is shown below.

In addition to the state-propagation, we must also compute the process noise covariance matrix. This requires a matrix integral, but for this 3×3 state-transition matrix, the math is not too difficult. The graphic below shows this calculation.

With these matrices in hand, we can essentially plug and chug with the Kalman filter equations to process the data. The actual code to track the data with the Kalman filter is very straightforward and consists of just 9 lines of m-code (including code to save the variables). I’ve included the entire m-file at the end of the post, but I thought I would also post the core Kalman code below.

Kalman Filter Matlab CodeIn the code above, A represents the state-transition matrix, D represents the measured displacement, H is the measurement matrix ([1 0 0]), and R is the measurement noise (estimated from the data). What’s not shown above is the initialization of the X and P vectors. The initial diagonal elements of the covariance matrix P give an indication in the confidence of the initial state vector X. A small value in P represents higher confidence in the initial value, while a large value represents less certainty. For this experiment I set the initial values of P quite large (compared to the initial X values). If the chosen model is a good fit for the process, the covariance matrix P should converge indicating the system is incorporating the measurements with the model to reduce the uncertainty in the state estimate.

Results

The Kalman filter code above will produce and store three pairs of variables, the state estimates along with their covariances, as well as the Kalman gain at each step. The state variables and associated variances will allow us to inspect both the current estimate along with the uncertainty present in that estimate. The Kalman gain will allow us to look at how the filter is incorporating new measurements. Initially, we should expect that the Kalman gain is relatively high, (i.e. the filter needs to integrate new information) but as time progresses, if the filter is accurately modeling the system, the Kalman gain should fall, indicating the a-priori measurements are accurately predicting the next state.

The plot of the estimated displacement with time is shown below. It’s difficult to make out, but there is a blue line representing the displacement and a dashed black line above and below the displacement representing the std. deviation of the position error. The minimal error present in this state variable throughout the entire filter duration is due to the high SNR measurement used as input to the filter. The filter is balancing the error from the measurement with the error from the model, but it will never produce an estimate with more error than the measurement.

The next plot shows the estimated velocity with time and begins to demonstrate the ability of the Kalman filter to reduce the error in the state estimate. I chose an initial velocity estimate of 0 m/s with a std. deviation of 5 m/s. As the filter took in more data, the estimated error quickly converged, this can be visualized with how tightly the dashed line borders the estimate.

The final state variable, the acceleration, allows some independent confirmation that the filter is accurately modeling the system dynamics. Where as the velocity plot above seems good given the convergence of the error, it is difficult to know if the estimated velocity represents the true velocity of the weight. The acceleration, on the other hand, should converge to the well known 9.8 m/s^2. As you can see in the plot below, this is exactly what happens.

Kalman Filter EstimateThe acceleration was initialized at 2 m/s^2 with a std. deviation of 10 m/s^2. The final estimate of the acceleration was approximately 9.82 m/s^2 with a std. deviation of approximately 0.2 m/s^2. So this is inline with what we would expect with the acceleration estimate.

The final aspect of the filter operation that should be inspected is how the Kalman gain varies with time. The plot below shows the Kalman gains for the individual states vs. time.

Kalman Filter Gain

Initially the Kalman gains are relatively high, but as time progresses and the state estimate and model are able to better predict the system and the gains begin to fall.

I think it is worthwhile to look at how the Kalman velocity and acceleration estimates compare with numerical estimates of the first and second derivatives of position. I’m going to use a finite difference approximation to the derivative instead of something like a Savitzky-Golay filter since this will offer a derivative estimate with minimal delay. The first plot below shows the numerical derivative approximation of the velocity.

This velocity estimate is clearly more noisy than the Kalman estimate, but the overall trend is still clear and correct. It’s interesting to look at what happens between 300 and 350 ms, there is a clear disruption in the data which causes the velocity estimate to jumpy wildly. This disruption has minimal impact on the Kalman velocity estimate, which is to be expected since by that point the filter is relying more on the a-priori estimates as evidenced by the low estimate error and Kalman gain.

The plot below shows the numerical derivative estimate of the acceleration. This estimate is wildly inaccurate and serves as a nice comparison to the much more accurate Kalman estimate.

I’ve created a small video incorporating the Kalman filtering results above and also showing the high-speed video used to perform the tracking. In doing all the data processing to get these results, I sort of lost track of how quickly everything happens. It is one thing to see a plot with a duration of 400 ms, but it’s another to watch the convergence in real-time or even slowed down 6x as in the video.


 

The video shows the weight falling three times. The first time shows the raw video slowed down 6 times. The second repetition overlays a green dot on the tracked position of the weight. The final repetition shows the online convergence of the acceleration state parameter.

Extended Kalman Filter

After fooling around with the Kalman filter, I wanted to explore the extended Kalman filter (EKF) a little. An EKF is an extension of the Kalman filter that can handle non-linear system models. A common use of the EKF is for model parameter identification. For any model of relative complexity there are usually a variety of constants that must be known (e.g. mass of an object, coefficient of friction, temperature of surroundings, etc…). When these are known constants they can be used to populate the state-transition matrix resulting in a linear update equation. If, however, these parameters are not known, the state estimate vector can be augmented with the unknown parameters. This allows the filter to attempt to estimate the parameters, but it introduces non-linearity into the system since now the state-transition matrix is dependent on the state. It’s this idea of using the EKF to identify model parameters that I wanted to explore.

The specific model I decided to use was a harmonic oscillator, specifically a weight bouncing on an elastic string. The rationale behind this was two fold. One, there aren’t many systems that have such a nice physical model and can be implemented as easily as a harmonic oscillator. Two, one of my lab mates, Dalong Liu, published a great paper on using an EKF to estimate tissue viscoelastic parameters by modeling a tissue construct as a forced harmonic oscillator excited by an acoustic radiation force pulse. This work, and also more recent work done by Mehul Soman, has given me some familiarity with using EKFs for parameter identification.

The general setup of this experiment is very similar to the previous setup. A camera was mounted to film a weight bouncing on an elastic string in front of the same wall as the previous experiment. To begin the oscillations, I pulled the weight down from its natural resting location and released it. I attempted to pull the weight straight down with as little side-to-side movement as possible. The tracking used the same blob detection as before, but only measured the vertical displacement. I made no attempt to record or model the side-to-side movement of the weight throughout its oscillations, instead, these deviations are incorporated into the process noise of the system. The graphic below shows a single frame from the high-speed movie.

From this video, I was able to obtain the following displacement data. Here again, the data is of relatively high quality and the motion, at least qualitatively, appears to be representative of a harmonic oscillator. It’s clear that the motion is under-damped with a period of just over a second.

The differential equation representing a harmonic oscillator is well known and shown below. Here the process noise is represented by w(t). Given the initial conditions of the oscillator, and knowledge of the damping ratio (xi) and the natural frequency (omega), this formula will describe the subsequent motion.

To help make this equation more tractable, we can reformulate it in its state space representation. If we ignore the process noise for the moment, the state space representation of this systems is shown below. The first element of the state vector represents the position of the object while the second element represents the velocity.

Just like the Kalman filter, the state-space equation here represents the continuous time model. To be of use in the EKF, we must be able to convert this formulation to a discrete model. If we know the natural frequency, the damping ratio, and the sampling interval then we can quickly identify the state-transition matrix as we did above. What I didn’t talk about previously, is that the identification of this state-transition matrix is a computationally intensive process. For the Kalman filter this was not much of an issue since we only needed to compute the matrix once and it could be done offline, with the EKF, things are not that simple. Once we augment the state-vector with the damping ratio and the natural frequency, the state matrix is no long time-invariant and will depend on the current state. This implies, that if we are to use the techniques from above to estimate the state-transition matrix, we will need to raise e to the matrix power at each prediction update. Given how expensive this operation is, it is worthwhile to look at a few alternatives.

Specifically, I looked at two, one, using a first-order Taylor series approximation of the state-transition matrix, and two, using an Runge-Kutta 4th order to propagate the state. To test the performance of these two schemes at propagating the state-vector, I tested them on an ideal harmonic oscillator with a natural frequency of 2*pi*3, and a damping ratio of 0.1. I looked at the ability of these two methods to accurately propagate the state across a 100 ms interval with various time steps. The phase plane for this system is shown below, with the integration results from the Taylor series approximation shown in red and the Runge-Kutta results shown in black. The red dots represent the starting and ending points on the true phase plane trajectory. Theses results here are from breaking the 100 ms interval into 4, 25 ms intervals to perform the integration over.

Harmonic Oscillator Phase Plane

The RK4 method is clearly superior than the first order Taylor series approximation, but it also required more operations. Perhaps, if I decreased the integration step size I could achieve similar performance with the Taylor series approximation, lightening the computational burden? To look at this, I measured the norm of the integration error for both methods with various integration step sizes. The log compressed results are shown below.

The error in the RK4 method very quickly drops to what is functionally equivalent to no error. The Taylor series approximation does not drop off nearly as quickly. In fact, to reach the same level of error present in the RK4 method with only 3 integration steps across the 100 ms, the Taylor series approximation requires just over 600 integration steps. These results justify the additional computation required with the RK4 method.

The other computation required during the prediction step is the covariance update step. Unlike the Kalman filter, where the state-transition matrix was used to update the covariance. In the EKF, we must use the Jacobian of the derivative function f. This is relatively straightforward to calculate and is shown below.

With the state update method chosen and the matrices needed for the covariance update identified, we now have everything in place to process the data. The only remaining step is to choose the initial state vector and the initial covariance estimates. I chose the initial position estimate based on the first analyzed video frame, and the velocity I really just guessed a reasonable value (60 cm/s). For both the position and velocity, I chose very large initial covariance values, 10,000 and 1,000,000 respectively. I also tried to choose reasonable values to initialize the damping ratio and the natural frequency, 0.1 and 3 Rad/S. I also chose large covariance estimates for these values as well, 100 and 1,000.

 The plot below shows the estimated displacement overlaid on the measured displacement. As was the case in the Kalman filter above, this estimate was very much in line with the measured displacement.

Extended Kalman FilterThe high-speed video had a relatively low image size, introducing quantization effects into the measured data. The plot below shows a close up of the measured data and the estimated data. You can clearly see the quantization present in the measured data, but the estimated data does not suffer from these effects.

I’ll skip the velocity state variable and move directly to the damping ratio and the natural frequency. Both of these variables showed nice, fast convergence. The plot of the damping ratio below shows that within half a second, the covariance associated with this state has already converged to a relatively low value. The steady state value of the damping ratio also makes intuitive sense. The displacement plots clearly shows an under damped oscillator, and a damping ratio of ~0.07 is very much under damped.

The next plot shows the estimate of the natural frequency. For reference, the initial value of the natural frequency was 3 Rad/S with a std. dev. of just over 30 Rad/S. Within 1 second the std. dev. of this estimate has decreased to around 1 Rad/S, and continues to decrease throughout the observation period.

It’s actually possible to independently check the validity of the estimated natural frequency. The natural frequency of a harmonic oscillator consisting of a weight on a spring is given by the square root of the spring constant over the mass of the weight. I measured the mass of the weight used in these experiments as approximately 138 grams. Measuring the spring constant took a little improvisation, I used quarters as a standard weight to measure the displacement due to two different forces. I first measured the displacement with no quarters attached to the weight, I then taped 7 quarters to the weight to increase the mass by 39.7 and measured the displacement a second time. Under this load, the measured displacement was  15.2 cm. This gives a spring constant of about 2.5 N/M, giving a natural frequency of about 4.3 Rad/S. This estimate is very close to the estimated frequency of 4.37 Rad/S.

The video below shows the parameter convergence along with the high-speed video at 6 times slower than real-time.

Moving forward, I’d like to use this same data to test out some more advanced Bayesian filters such as the unscented Kalman filter or particle filters. It would be interesting to compare various aspects of these various filters (e.g. overall convergence time, stability, sensitivity to initial guess, etc…). I’m also looking to collect additional data some well behaved dynamic motion. Right now I’m thinking I’ll try to collect some data of a free spinning bicycle tire or a some sort of a projectile (maybe a nerf bullet or a marble launched by a rubber band catapult). I’d like to use these filters to help predict future states as well. For instance, with the bicycle tire, could I predict at what point in the future the tire will stop spinning, or for the Nerf bullet, could I predict where the bullet would land? One really interesting idea would be to use a two camera setup to observe a dart in motion and attempt to predict where on the board the dart would land.

For those that are interested, I’ve uploaded the Matlab code for the Kalman filter and the Extended Kalman filter along with the measured data, you can find the files here.

3D Magnetic Localization

A few years ago, for my senior design project, I worked with a group of guys to look at ways of magnetically tracking a baseball. The idea was that you could embed a small magnet in the ball and with suitably placed sensors around home plate , you could automatically detect if a pitch was a strike or a ball. The intent was not to use this in professional sports (I doubt MLB would take kindly to implanting a magnet in the ball) but rather as an inexpensive training tool for little leaguers. The project went well and we were able to create a prototype that was capable of calling a ball or strike with good fidelity.

After this project, I ended up with the magnetic sensors and I’ve always wanted to revisit this idea (tracking a permanent magnet in 3D space, not the baseball application) with the benefit of a few more years of experience as an engineer. I thought I would use this post to detail the development of a basic research platform for testing out different tracking techniques. I’ll also post some of the sensor data in case others are interested in applying their own analysis techniques.

System Overview

The goal I had in designing this platform was to create a system that would allow me to easily collect data as well as test out different algorithms for determining the magnet’s position. The system used an FPGA to coordinate data collection from the sensors with data offload to the computer. Once on the computer, the raw sensor data would be displayed on screen, as well as made available to a variety of non-linear optimization routines for determining the magnet’s position relative to the sensors. The final solved-for position of the magnet would then be displayed on the screen.

Sensors

The picture below shows one of the three MicroMag sensors used to measure the magnetic field. The sensor (green board) was acquired through Sparkfun, and comes with two, seven pin hundred mil headers for relatively easy prototyping. I had a separate PCB board made (yellowish board) to provide mount points as well as an RJ-45 connection for power and communication lines.

Each MicroMag sensor has the ability to detect the magnetic field in three perpendicular axis. The device sensitivity is inversely related to the time required to take a measurement. At the fastest update rate of 2 KHz/Axis, the resolution of the device is ~2 uT, at its slowest rate of ~17 Hz/Axis the resolution is ~16 nT. I typically run the device at its most sensitive setting to provide the highest SNR measurement available.

The device communicates through a slightly non-conventional SPI interface. In addition to the CE, MOSI, MISO, and SCLK typically used in an SPI port, there is also a DRDY line and a Reset line. The Reset line is pulsed high to prepare the device to perform a measurement, while the DRDY line allows the device to indicate to the host that a measurement is complete. To get a complete 3-axis reading, the host must interrogate each axis separately, each time pulsing the Reset line, transmitting an 8 bit command byte indicating which axis to poll and at what sensitivity level, wait for the device to indicate the measurement is finished, and finally clock out the 16 bit field measurement.

FPGA

In this system I used an FPGA to serve as the intermediary between the computer and the sensors. The nice thing about using an FPGA vs a micro-controller is that I am free to synthesize as many SPI ports as my IO pins will allow. So where as with a micro-controller (which typically only have 1 SPI port) access to the sensor would have to be time-multiplexed, with an FPGA I can create 3 independent SPI ports and sensor communications can take place simultaneously. This feature greatly speeds up the acquisition time and was the main motivation for using an FPGA over a micro-controller.

The FPGA I used was a Spartan-3 1000K Xilinx chip mounted on a development board from Digilent Inc. The development board contains a number of nice on-board accessories, but most importantly, it contains 3, 40-pin expansion ports for connecting external hardware and logic. The board, along with two external PCB boards, is shown below.

Spartan FPGA FTDI USBThe board attached to the expansion port along the right interfaces the FPGA development board with the individual MicroMag sensors. This board provides a separate power supply as well as the connections needed to route the RJ-45 signals to the appropriate IO pins on the header.

The board attached along the top provides a connection to an FTDI development board for their FT2232H chip. I’ve really grown to love this FTDI development board and I’ve ended up using it in a bunch of projects. It provides a really simple interface on both sides of the connection. On the FPGA or micro-controller side, you can choose from a variety of interface protocols from low speed to high speed. I really like their high speed connection, which is capable of providing a sustained throughput of well over 10 Mbytes/Sec. FTDI provides a set of drivers and an API for communicating on the host side, so there is no problem there either. My only complaint about this device, is that in the high speed mode there is only one, 8 bit data bus with no reset signal. This means you need to create your own communication protocol, and the lack of any reset signal means it can be difficult to recover from an error condition. For my communications I use a simple packet structure which contains a 16 bit address field, a 16 bit packet length field, the data payload, and a final 16 bit checksum field. In addition, on the FPGA, I am constantly monitoring the bytes transmitted independent of any protocol, when I detect a specific 128 bit pattern, I reset the USB state. This allows me to always place the USB controller in a known state by simply transmitting 8 bytes.

The logic on the FPGA consists of four main parts: a USB interface module, an SPI interface module, a FIFO for data buffering, and a finite-state machine to tie all the parts together. The FSM uses parameters set by the USB interface module to continually pull data from the sensors and construct the data packets for offload through the USB.

Magnet

The final key system component is the magnet. I decided to use a rare earth magnet since I felt this would give me the strongest field in the smallest package. I briefly considered using a current carrying coil to create the magnetic field. Using a coil, I could create a very uniform, calibrated magnetic field. The big downside is that I would need cables to provide the current. I ultimately decided that I would prefer the maneuverability of a permanent magnet. In the future I may explore using small current carrying coils to provide the magnetic field. It would also be interesting to explore replacing the MicroMag sensors with small coils to sense the magnetic field. In this case I would need to replace the permanent magnet with a coil carrying AC current, since a static field would be undetectable.

The magnet I used is shown in the photo below. It’s actually two small cylindrical magnets stacked together.Rare Earth Magnet TrackingThe biggest issue with the magnet is accurately modeling its magnetic field. I chose a magnetic dipole model to enable fast calculation of the field at arbitrary points in space. In addition, I used the simplifications suggested by this site to allow the calculation of the field with only two parameters, angle and radial distance. Using these assumptions, the field components are calculated as shown below.

Bar Magnet FieldI think the dipole assumption and the associated equations are probably the weakest points of the system. I am trying to measure the magnetic field relatively close to the magnet, and some of the assumptions required for the equations above may not be valid.

The only term, in the equations above, which is dependent on the magnet used is the U term. This is actually the combinations of a couple constants, as shown below.

In this equation, Mu Naught is 4*pi*10^(-7) Newtons per square Ampere, and M is the magnetic dipole moment in Ampere-square Meters. To calculate the magnetic dipole moment for this magnet, I measured the magnetic field strength at various distances along the dipole axis. I calculated the strength of the dipole moment as approximately 1.41 Ampere-square Meters. After a little bit of looking online, this seems like a reasonable value for this sort of magnet.

Sensor Registration

In order accurately locate the magnet, not only must we know the field strength, we must also know where the field was measured. To ensure the sensors were placed precisely, I created a simple plexiglass baseboard that mounted with the PCB boards pictured above. This baseboard ensured that the sensors did not slide around from measurement to measurement, and day to day.

The baseboard was nice to ensure there was no sensor movement, but I still needed to know where each sensor was located in 3D space. There were a few ways I could go about deducing this information. Since I had the plexiglass baseboard and PCB designed, I know all the dimensions used, so I could use these design files to estimate the final position of the sensors. The problem with this method, is that I gave the drill holes in the PCB and plexiglass board a wide margin to ensure I could use a variety of non-magnetic screws. This means the final position of the sensor boards could vary by hundreds of mils. Instead, what I chose to do was to assemble the sensors on the baseboard, and use photographic images of the final assembly to register the location of the sensors.

This method required registering three different images, the photograph of the final setup, the design schematic of the MicroMag Sensor, and the CAD drawing of the plexiglass baseboard. The MicroMag Schematic and the CAD drawing are nice targets for registration since they have zero rotational misalignment. The picture of the final setup is not as friendly, so the first step in the image registration process was to ensure that the image of the final setup was rotationally aligned with the CAD drawing and the MicroMag schematic. This was equivalent to ensuring the bottom edge of the PCB boards were all perfectly horizontal. The picture below shows the image of the final setup. The amount of rotation required was easily found by estimating the angle between the bottom edge of the PCB with the horizontal axis.

With all three images rotationally aligned, the only remaining step is to scale and shift each image. I chose to use the CAD drawing as the image with which to reference the other images. The registration was accomplished by identifying common points in each image (centers of mount holes, corners, edges, etc…) and then finding the global scaling and X,Y shifting that would align these common points as much as possible in the least squares sense. The nice thing about using the least squares criterion for aligning the points is that the linear algebra is relatively straightforward. The graphic below gives a quick overview of how I set up the matrices to solve the problem.

Least Squares Image RegistrationI’m able to create the Y vector, and the system matrix A from the points on the image. It’s clear that the forward problem is an over-determined problem, so in all likelihood, there will be no single set of parameters that will correctly align all points. The one thing that is a little misleading about the graphic above is the X and Y Offset labels. This registration scales the images and then shifts, so the X and Y offsets shown above may be different depending on the scaling. I’ll include the Matlab code, along with the actual images I used to perform this registration for those who are interested. Below you can see the registered sensor pod with the CAD drawing and MicroMag schematic.

MicroMag Magnetic SensorFrom this registration, I was able to identify the locations of all 9 sensors to the following locations.

For each of the sensors, MS1 measures the magnetic field in the Y-axis, MS2 measures the magnetic field in the -X-axis, and MS3 measures the magnetic field in the -Z-axis. Aligning the sensors in these axis ensures the use of a right hand coordinate system.

Sensor Calibration

The only remaining step before I could begin collecting and processing meaningful data was to calibrate the sensors. The sensors needed to be calibrated in two manners. First, within each board, each of the three axis sensors needed to be calibrated to each other, I called this intra-sensor calibration. Second, each of the sensor boards needed to be calibrated to each other, I called this inter-sensor calibration. I relied on the Earth’s magnetic field for each of the calibrations. It’s interesting that the Earth’s magnetic field acts as a hinderance during normal operation, but for calibration purposes it serves wonderfully as a nice homogenous field.

To calibrate the individual axis sensors on MicroMag board I took three different sensor readings. The first reading was of the Earth’s field with the sensors oriented in their normal horizontal orientation. The second reading was after rotating the sensors to swap the Z-axis and the X-axis. The final reading was taken after a final rotation to swap the Z-axis and the Y-axis. These three readings allowed me to calculate the scalings needed to calibrate the X-axis and Y-axis to the Z-axis. As a test of the calibration, I rotated the entire setup in air while collecting data. With perfect calibration, no matter what the orientation, the overall sensed magnitude of the X,Y,Z components should not vary (i.e. the value sqrt( X^2 + Y^2 + Z^2) should be invariant under rotations). The first graphic below shows the field magnitude without any calibration.

It’s clear that with rotations, the magnitude of the sensed field varies quite a bit when uncalibrated. The graphic below shows the same data after calibration.

The field magnitude is much more uniform after calibration. I tried to keep the rotational speed relatively slow since the measurement time is relatively long, but even still, there is bound to be some error due to movement that can’t be avoided.

The final calibration step is to normalize the sensor readings between boards. The graphic above makes it obvious that field strength measured by the first sensor is significantly different than the other two sensors. I set the second sensor as the reference sensor and scaled the other two sensors to match the field measured by sensor 2. The graphic below shows the fully calibrated field.

The measured magnetic field corresponds well with the reported field strength at my location. According to the USGS, the Earth’s magnetic field should be between 55 uT and 60 uT for Minneapolis, MN.

Magnet Localization

With all the sensor locations established and the readings calibrated, the remaining step is to use the measured magnetic field to locate the magnet. The problem can be framed as finding the location and orientation of a magnet whose magnetic field most closely matches the observed field. Or to put it another way, we want to find the location (X,Y,Z) and the orientation (ThetaX, ThetaY, ThetaZ) such that the difference between the measured field and the calculated field is minimized according to some norm. What this boils down to, is a non-linear optimization problem to solve for the 6 parameters describing the position and orientation of the magnet. Non-linear optimization is outside of what I deal with on a day-to-day basis so I know next to nothing about it, luckily, many other people know about it and there are many non-linear optimization solvers available. What’s needed to take advantage of these solvers is a function which takes in the parameters to be optimized and returns an error. I initially began using the Matlab environment to solve these problems but I have recently moved into a C++/QT environment which uses a solver passed in through a run-time DLL to enable rapid evaluation of various algorithms.

Matlab Optimization

The first step in taking advantage of the Matlab optimizers is creating the error function. This requires creating a function which can take in the location and orientation of a magnet and return the magnetic field at the measured sensor locations. I’ve included all the M-Files needed to run the optimization routines in the link at the end of this post, but I’ll briefly go through the function which calculates the magnetic field at the sensor locations based on a given location and orientation. The following graphic shows that function.

The function takes in three parameters: Loc which contains the Cartesian coordinates of the magnet location as the first three elements of the vector, and the direction cosines as the last three elements, M which is the magnetic dipole moment, and Sensors which contains the Cartesian coordinate of the individual sensors as column vectors.

The first thing I do is separate out the components of Loc and ensure that the magnitude of the direction cosines is unity. Next, I find the radial vector (and radial unit vector) connecting the sensor location with the magnet location. Using the radial unit vector, and the dipole axis vector, I find the angle between these two vectors. I use the property that the dot product of two unit normal vectors is equal to the cosine of the angle between the vectors.

At this point, I have all the information I need to calculate the magnitude of the magnetic field in the radial direction as well as the theta direction, however I still need to calculate the direction of the theta vector in XYZ space. This is a little bit tricky. This vector is rotated 90 degrees off the radial vector and in the plane formed by the magnetic dipole axis and the radial vector. The surface normal vector of the plane formed by the radial vector and the dipole axis vector can be found by taking the cross product of these two vectors. This surface normal vector is stored in the variable vP in the code above. To find the theta vector, we must rotate the radial vector 90 degrees about the vector vP. Since the vector vP can be an arbitrary axis, I use the axis-angle rotation matrix with the angle set to 90 degrees. This matrix is stored in the variable R, and used to create the theta direction vector rHp. rHp should remain a unit vector under transformation by matrix F, however I normalize it again just to be sure. I now calculate both components and add the vectors to get the resulting field vector. The final step is to project this magnetic field vector onto the axis corresponding to the current sensor.

I use Matlab’s “lsqnonlin” solver to perform the optimization. This takes as one of its arguments the function described above and uses it to find the location and orientation which minimizes the least squared error between the simulated and measured sensor readings. This function is very sensitive to the initial starting position. I’ve used two different initial starting points, one is that I just start centered between the sensors at an elevation of 10 cm with the axis of the magnet pointed up. The other option I use is to start the search at the previous search’s ending point. The advantage of the second option, is that if the sensor hasn’t moved much, your initial guess is likely to be close to the actual location, the danger is that your guesses may begin to diverge if you get a bad estimation.

The following shows some basic position tracking. The magnet’s position is given in blue and the sensor locations are given in red. The first example below shows a square trajectory above the sensors.

The next example I’ll show is moving the magnet forward and backward along the Y-axis.

The final example shows the magnet moving up and down in the Z-Axis.

In the future, I would like to make a table I could put over the sensors that would have pre-cut slots to place the magnet in with a specified orientation so I could get some measure of the absolute positioning error. I’ve also been careful not to bring the magnet too close to the sensor to ensure the devices are not overwhelmed by the strong magnetic field.

Research Platform

The analysis through Matlab is nice for verification and visualization, but I wanted something independent of the Matlab engine and capable of real-time display, so I decided to create a C++/QT based research platform.

The idea was to create a basic framework that would interface with the FPGA to retrieve the sensor data before offloading the data to a variety of modules. There are two main offloads, a data storage module and a locator module.

One of the problems I always face when deciding how I want to save data is what format to use. I use Matlab a lot to process my data, so it’s always tempting save the data in a .mat file. But it can be difficult to use .mat files outside of Matlab and I’m not sure I will always have access to Matlab, so whenever possible I try to use a non-proprietary storage format. In this case, since the data rate is not too high, I was able to save the data in a text based, human readable format. The files themselves contain instructions on how to interpret the data. I’ve included some sample data files in the files contained in the link at the end of the post. There are two sets of data files, the calibration files which were used to calibrate the sensors and the sensor readings that were used to generate the plots above.

To allow for fast testing of various different non-linear optimizers, I chose not to hard-code the solving routines. Instead, the solver is loaded from a run-time DLL. This allows me to try out various solvers without needing to recompile the main program. I can try out a new algorithm by creating a DLL with a predefined function table. The inspiration for this was from the WinRad program I used for my SDR. This program allows anyone to take advantage of the WinRad framework by writing their own DLL interface with their hardware.

The screenshot below shows the program displaying the location of the sensed magnet with respect to the sensors. The sensor readings at the bottom of the screen represent the measured field at the sensor directly above the plot. The position in the Z-axis is represented by the size of the magnet graphic, the larger the graphic, the greater the distance in the Z-axis.

Magnetic Dipole FieldThe video below shows the program operation in real-time as I move the magnet in a variety of patterns above the sensors. When the program initially connects to the FPGA the boards are only sensing the Earth’s magnetic field. I initially zero out this field before moving the magnet above the sensors and beginning to solve for the location.

Future Work

I’m really happy with the overall tracking ability of the current system, but there are a few areas I would like to work on. As I mentioned above, the current magnetic dipole model is perhaps too simple. I would like to perform some tests to quantify the accuracy of this model with this particular magnet. Right now I am assuming the magnetic field is rotationally symmetric about the dipole axis as predicted by the dipole model, it seems possible that impurities in the magnet may invalidate this assumption. I’m not really sure what a more accurate model would look like, would it be a high order polynomial in R and Theta fitted to measured values? I dont’ know.

The other thing I would like to look at is using current carrying coils to generate the magnetic field. I talked about the downsides of this approach, but I think the upsides are significant. Not only could you have a very calibrated field, you could also have an adjustable strength field. One thing I didn’t talk about, but it is a real limitation, is the linear range of the sensors. As the sensed magnetic field increases above 500 uT, the MicroMag sensors begin to behave non-linearly. With a current carrying wire, you could dynamically adjust the current through the coil to ensure that the sensors were all placed in their linear region of operation. It would almost be like automatic gain control for radios.

Another interesting idea that’s a little different than the application here, would be to couple an array of 3-axis magnetic sensors with a metal detector to image metallic objects instead of just giving an audible tone when metal is present. Imagine if you had a metal detector capable of creating a very homogenous, well defined magnetic field, and around the edge of the detector you had a series of 3-axis sensor (could be these MicroMag sensors or perhaps something more sensitive). Then when someone with a metallic object walked through the detector, you could sense how the metallic object distorted the magnetic field, and solve the inverse problem to locate the metal causing the distortion. It would be similar to what’s done in brain mapping with MEG. I’m confident you could locate the general area of the metallic object, but how well you could discern various shapes, I’m not sure.

I’ve collected the files for the image registration and the magnet localization into a single zip file, available here. I hope the file names are self-explanatory. I’ve included M-Files to read the text based data files into the Matlab workspace. There are is also a function to scale the readings based on the calibrations as well as remove the Earth’s magnetic field.