In this post we will see how to eliminate that requirement and run the Gauss-Newton algorithm as a one-pass streaming algorithm whose space complexity is effectively constant, $O(1)$.

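**1. The heart of the matter**

Notice that we can rewrite the $00$ entry of $\mathbf{J}^T\mathbf{J}$ as follows

$$[\mathbf{J}^T\mathbf{J}]_{00} = \sum_i \frac{4(x_{i0}-\beta_0)^2}{\beta_3^4} = \frac{4}{\beta_3^4}\left(\sum_i x_{i0}^2 - 2\beta_0\sum_i x_{i0} + N\beta_0^2\right)$$

So if we know $\sum_i x_{i0}^2$, $\sum_i x_{i0}$, and $N$, then we can compute the matrix entry for arbitrary $\bar{\beta}$. We can scan the data once, compute these three numbers and then just use the formula above to compute the matrix entry while iterating through the Gauss-Newton algorithm. Once these numbers are computed, there is never a need to look at the data again.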

A similar situation arises for all of the matrix entries. Before proceeding, let’s introduce some new notation that will make the sequel easier to follow, easier to code, and easier to generalize.

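**2. Notation**

To think clearly about the calculations here, we need to stop thinking about 3-vectors and start thinking about $N$-vectors. We will build on the notation introduced in post 1 of this series and define three new $N$-vectors. For an axis $j$ define

$$\vec{\mathbf{x}}_j = (x_{ij})_i$$

This is the $N$-vector of all observations for that axis. Next define

$$\vec{\mathbf{x}}_j^2 = (x_{ij}^2)_i$$

This is the $N$-vector of the squares of the observations along this axis. Finally, it will be convenient to use a vector of all ones:

$$\vec{\mathbf{1}} = (1, 1, \ldots, 1)$$

For any pair of $N$-vectors $\vec{\mathbf{a}}$, $\vec{\mathbf{b}}$, we define the inner product as usual

$$\langle \vec{\mathbf{a}}, \vec{\mathbf{b}} \rangle = \sum_i a_i b_i$$

With this notation, we can rewrite the formula in the last section as

$$[\mathbf{J}^T\mathbf{J}]_{00} = \frac{4}{\beta_3^4}\left(\langle \vec{\mathbf{x}}_0^2, \vec{\mathbf{1}} \rangle - 2\beta_0 \langle \vec{\mathbf{x}}_0, \vec{\mathbf{1}} \rangle + \beta_0^2 \langle \vec{\mathbf{1}}, \vec{\mathbf{1}} \rangle\right)$$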

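**3. Rewriting the matrix entry formulas**

We will rewrite each of the formulas derived in post 2 in terms of these $N$-vectors. It is a worthwhile exercise to verify these.

For $j, k \in \{0, 1, 2\}$, the offset-offset entries are

$$[\mathbf{J}^T\mathbf{J}]_{jk} = \frac{4}{\beta_{3+j}^2\beta_{3+k}^2}\Big(\langle \vec{\mathbf{x}}_j, \vec{\mathbf{x}}_k \rangle - \beta_j \langle \vec{\mathbf{x}}_k, \vec{\mathbf{1}} \rangle - \beta_k \langle \vec{\mathbf{x}}_j, \vec{\mathbf{1}} \rangle + N\beta_j\beta_k\Big)$$

for the offset-sensitivity entries

$$[\mathbf{J}^T\mathbf{J}]_{j,3+k} = \frac{4}{\beta_{3+j}^2\beta_{3+k}^3}\Big(\langle \vec{\mathbf{x}}_k^2, \vec{\mathbf{x}}_j \rangle - 2\beta_k \langle \vec{\mathbf{x}}_j, \vec{\mathbf{x}}_k \rangle + \beta_k^2 \langle \vec{\mathbf{x}}_j, \vec{\mathbf{1}} \rangle - \beta_j \langle \vec{\mathbf{x}}_k^2, \vec{\mathbf{1}} \rangle + 2\beta_j\beta_k \langle \vec{\mathbf{x}}_k, \vec{\mathbf{1}} \rangle - N\beta_j\beta_k^2\Big)$$

and for the sensitivity-sensitivity entries

$$[\mathbf{J}^T\mathbf{J}]_{3+j,3+k} = \frac{4}{\beta_{3+j}^3\beta_{3+k}^3}\Big(\langle \vec{\mathbf{x}}_j^2, \vec{\mathbf{x}}_k^2 \rangle - 2\beta_k \langle \vec{\mathbf{x}}_j^2, \vec{\mathbf{x}}_k \rangle + \beta_k^2 \langle \vec{\mathbf{x}}_j^2, \vec{\mathbf{1}} \rangle - 2\beta_j \langle \vec{\mathbf{x}}_k^2, \vec{\mathbf{x}}_j \rangle + 4\beta_j\beta_k \langle \vec{\mathbf{x}}_j, \vec{\mathbf{x}}_k \rangle - 2\beta_j\beta_k^2 \langle \vec{\mathbf{x}}_j, \vec{\mathbf{1}} \rangle + \beta_j^2 \langle \vec{\mathbf{x}}_k^2, \vec{\mathbf{1}} \rangle - 2\beta_j^2\beta_k \langle \vec{\mathbf{x}}_k, \vec{\mathbf{1}} \rangle + N\beta_j^2\beta_k^2\Big)$$

To write similar expressions for the entries of $\mathbf{J}^T\vec{\mathbf{r}}$, note that

$$r_i = 1 - \sum_{k=0}^{2} \frac{(x_{ik}-\beta_k)^2}{\beta_{3+k}^2}$$

So for the offset entries

$$[\mathbf{J}^T\vec{\mathbf{r}}]_j = \frac{2}{\beta_{3+j}^2}\Big(\langle \vec{\mathbf{x}}_j, \vec{\mathbf{1}} \rangle - N\beta_j\Big) - \sum_{k=0}^{2} \frac{\beta_{3+k}}{2}\,[\mathbf{J}^T\mathbf{J}]_{j,3+k}$$

and for the sensitivity entries

$$[\mathbf{J}^T\vec{\mathbf{r}}]_{3+j} = \frac{2}{\beta_{3+j}^3}\Big(\langle \vec{\mathbf{x}}_j^2, \vec{\mathbf{1}} \rangle - 2\beta_j \langle \vec{\mathbf{x}}_j, \vec{\mathbf{1}} \rangle + N\beta_j^2\Big) - \sum_{k=0}^{2} \frac{\beta_{3+k}}{2}\,[\mathbf{J}^T\mathbf{J}]_{3+j,3+k}$$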

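These expressions may look tedious, but they are just straightforward algebra. If we scan the observation data and compute the following statistics for all $j, k \in \{0, 1, 2\}$ we will have all of the information we need to compute our matrix entries:

$$\langle \vec{\mathbf{x}}_j, \vec{\mathbf{1}} \rangle, \quad \langle \vec{\mathbf{x}}_j^2, \vec{\mathbf{1}} \rangle, \quad \langle \vec{\mathbf{x}}_j, \vec{\mathbf{x}}_k \rangle, \quad \langle \vec{\mathbf{x}}_j^2, \vec{\mathbf{x}}_k \rangle, \quad \langle \vec{\mathbf{x}}_j^2, \vec{\mathbf{x}}_k^2 \rangle$$

In fact, the situation is even better. Notice that

$$\langle \vec{\mathbf{x}}_j^2, \vec{\mathbf{1}} \rangle = \langle \vec{\mathbf{x}}_j, \vec{\mathbf{x}}_j \rangle$$

So we do not need to store $\langle \vec{\mathbf{x}}_j^2, \vec{\mathbf{1}} \rangle$. Also the arrays $\langle \vec{\mathbf{x}}_j, \vec{\mathbf{x}}_k \rangle$ and $\langle \vec{\mathbf{x}}_j^2, \vec{\mathbf{x}}_k^2 \rangle$ are symmetric, allowing us to store just 6 of the 9 entries. Altogether, we need to store $3 + 6 + 9 + 6 = 24$ numbers. Of course we also need to store $N$, the number of observations, for a total of 25.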
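To make the count of 25 concrete, here is a hedged sketch of a one-pass accumulator for these statistics. The struct layout and the names `SphereStats`, `packed`, and `update` are assumptions made for this example; the actual members in muCSense may be organized differently, and `double` is used here only for clarity.

```cpp
#include <cstddef>
#include <cstdint>

// The 25 running statistics; field names are illustrative only.
struct SphereStats {
    double n = 0;           // N, number of observations
    double s1[3] = {};      // <x_j, 1>
    double s11[6] = {};     // <x_j, x_k>, j <= k (symmetric)
    double s21[3][3] = {};  // <x_j^2, x_k>
    double s22[6] = {};     // <x_j^2, x_k^2>, j <= k (symmetric)
};

// Slot of the pair (j, k), j <= k < 3, in a packed 6-entry array.
inline std::size_t packed(std::size_t j, std::size_t k) {
    return j * (5 - j) / 2 + k;
}

// Fold one observation (three raw readings) into the statistics.
inline void update(SphereStats& s, const std::int16_t x[3]) {
    double v[3], v2[3];
    for (std::size_t j = 0; j < 3; ++j) {
        v[j] = static_cast<double>(x[j]);
        v2[j] = v[j] * v[j];
    }
    s.n += 1.0;
    for (std::size_t j = 0; j < 3; ++j) {
        s.s1[j] += v[j];
        for (std::size_t k = 0; k < 3; ++k) {
            s.s21[j][k] += v2[j] * v[k];
            if (j <= k) {
                s.s11[packed(j, k)] += v[j] * v[k];
                s.s22[packed(j, k)] += v2[j] * v2[k];
            }
        }
    }
}
```

The raw observation can be discarded as soon as `update` returns, which is what makes the storage independent of the number of samples.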

**4. Putting it into code**

Implementing this is straightforward even if it demands a bit of care. To see this implemented, look at the file BestSphereGaussNewtonCalibrator.cpp in muCSense. Here is a quick overview of what to look for.

- The arrays of statistics are private members of the `BestSphereGaussNewtonCalibrator` class and are declared in BestSphereGaussNewtonCalibrator.h.
- The statistics are all accumulated in the method `BestSphereGaussNewtonCalibrator::update` (BestSphereGaussNewtonCalibrator.cpp).
- The calibration matrices are computed in the method `BestSphereGaussNewtonCalibrator::computeGNMatrices` (BestSphereGaussNewtonCalibrator.cpp).

In muCSense this sphere fitting procedure is used to calibrate “spherical sensors” such as magnetometers and accelerometers. At present, the library specifically supports the inertial sensors on SparkFun’s 9DoF Sensor Stick: the ADXL 345 accelerometer, HMC 5843 magnetometer, and the ITG 3200 gyroscope. To use this code to calibrate a different accelerometer or magnetometer, you just need to define a new subclass of Sensor (see ADXL345.h/ADXL345.cpp for an example), then follow this example sketch.

**5. What’s next.**

The code in muCSense works, but needs to be optimized. As it is, a simple sketch using this calibration algorithm takes up 20K of the 32K available on an Arduino Uno. The sums and products have a clear structure that I have not attempted to exploit at all. Using this structure should allow a valuable reduction in compiled code size. I will work on this, but would love to hear from anyone else who has some ideas.

It is interesting to note that this algorithm has a clear generalization to any residual function which is a polynomial of the observation vector. It works for higher dimensional observations and for higher degree polynomials. In particular, it can be adapted to handle linear maps with shear, something I plan to implement soon. The process I walked through here should be abstracted to build a system that takes a polynomial residual function as input and determines the statistics and computations needed in the background, optimizing the tedious arithmetic and saving us from writing this sort of code.

What took up all of that memory? The matrix $\mathbf{J}$, the vector $\vec{\mathbf{r}}$, and the set of observations $\{\mathbf{x}_i\}$.

It turns out that we don’t really need to keep any of these around. In this post I’ll describe how to implement the algorithm without materializing $\mathbf{J}$ and $\vec{\mathbf{r}}$. This will reduce the memory footprint enough to make a practical algorithm that can handle ~100 observations. This is the algorithm used in the accelerometer calibration sketch I posted last year.

In the next post, I’ll show how to accumulate 25 statistics about the observations that have all of the information we need to run Gauss-Newton in constant space and time.

Let’s start by getting rid of the big arrays of statistics.

**1. Only materialize what you need**

When carrying out the Gauss-Newton algorithm as it was sketched in the last post, we never need direct access to the matrix entries of $\mathbf{J}$ or $\vec{\mathbf{r}}$. We just need to work with $\mathbf{J}^T\mathbf{J}$ and $\mathbf{J}^T\vec{\mathbf{r}}$, two arrays whose sizes do not depend on the number of samples collected.

We will materialize these matrices instead of the big matrices.

This is possible because each matrix entry in $\mathbf{J}^T\mathbf{J}$ and $\mathbf{J}^T\vec{\mathbf{r}}$ can be written as a sum of functions of individual observations. This means we can look at an observation, compute the function of the observation, add it to the matrix entry, then move on to the next observation.

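Let’s make this more concrete and look at the $00$ entry of $\mathbf{J}^T\mathbf{J}$. Recall Equation 4 of part 1:

$$J_{i0} = \frac{2(x_{i0}-\beta_0)}{\beta_3^2}$$

from which the definition of matrix multiplication gives

$$[\mathbf{J}^T\mathbf{J}]_{00} = \sum_i J_{i0}^2 = \sum_i \frac{4(x_{i0}-\beta_0)^2}{\beta_3^4}$$

So to compute $[\mathbf{J}^T\mathbf{J}]_{00}$ we can just accumulate this sum observation by observation. There is no need to compute the big matrices $\mathbf{J}$ and $\mathbf{J}^T$.

The general matrix entries are as follows. For $j, k \in \{0, 1, 2\}$

$$[\mathbf{J}^T\mathbf{J}]_{jk} = \sum_i \frac{4(x_{ij}-\beta_j)(x_{ik}-\beta_k)}{\beta_{3+j}^2\beta_{3+k}^2}$$

$$[\mathbf{J}^T\mathbf{J}]_{j,3+k} = \sum_i \frac{4(x_{ij}-\beta_j)(x_{ik}-\beta_k)^2}{\beta_{3+j}^2\beta_{3+k}^3}$$

$$[\mathbf{J}^T\mathbf{J}]_{3+j,3+k} = \sum_i \frac{4(x_{ij}-\beta_j)^2(x_{ik}-\beta_k)^2}{\beta_{3+j}^3\beta_{3+k}^3}$$

$\mathbf{J}^T\vec{\mathbf{r}}$ has a similar expression, since its $m$-th entry is $\sum_i J_{im} r_i$. Recall that

$$r_i = 1 - \sum_{k=0}^{2} \frac{(x_{ik}-\beta_k)^2}{\beta_{3+k}^2}$$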

So all of the matrix entries we need to compute can be accumulated one observation at a time. We can store the observations, then at each iteration step in the Gauss-Newton algorithm run through these observations and use the formulas above to accumulate the matrix entries. Once we have done that, it is simple linear algebra to solve the matrix equation, find $\bar{\delta}$, and adjust $\bar{\beta}$. (Review the algorithm sketch here if needed.) This is essentially what is done in the sketch I posted last October, and I’ll use a slightly modified version of that code in the rest of this post.

**2. Putting it into code**

Now let’s translate this into C++. We won’t try to engineer the code here, just outline a simple Arduino sketch that does the job.

To start, we will need to declare some arrays to hold $\mathbf{J}^T\mathbf{J}$, $\mathbf{J}^T\vec{\mathbf{r}}$, $\bar{\beta}$, and the observations $\mathbf{x}_i$. I’ll add an array for $\bar{\delta}$ too even though it isn’t strictly needed.

    //matrices for Gauss-Newton computations
    float JtJ[6][6];
    float JtR[6];
    float delta[6];
    float beta[6];

    //Array of observations
    int16_t* x; //malloc this to length 3*N in setup().

This amounts to $216 + 6N$ bytes, where $N$ is the number of samples. If we only stored the upper triangular part of $\mathbf{J}^T\mathbf{J}$, we could eliminate 15 floats, reducing the cost to $156 + 6N$ bytes at the expense of making the code below a bit tougher to read. I will leave the modification as an exercise for the interested reader (but I’ll perform a similar one in the next post). The cost of the naïve algorithm was $156 + 34N$ bytes (had we declared arrays for $\bar{\delta}$ and $\bar{\beta}$), so this method uses significantly less memory for any number of samples. Importantly, the dependency on $N$ is dramatically reduced and this algorithm can now be used on an Arduino on data sets with ~100 samples.
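For readers who want to try the upper-triangular variant, one possible way to index a packed symmetric $6 \times 6$ matrix is sketched below. The names `packedIndex` and `accumulatePacked` are hypothetical and the row-major packing order is an assumption of this example, not something taken from the original sketch.

```cpp
#include <cstddef>

constexpr std::size_t kDim = 6;
constexpr std::size_t kPacked = kDim * (kDim + 1) / 2;  // 21 entries

// Map (row, col) of a symmetric 6x6 matrix to its slot in a packed
// row-major upper triangle. Either index order gives the same slot.
constexpr std::size_t packedIndex(std::size_t r, std::size_t c) {
    return (r <= c) ? r * (2 * kDim - r - 1) / 2 + c
                    : c * (2 * kDim - c - 1) / 2 + r;
}

// Accumulate one observation's Jacobian row into the packed J^T J.
inline void accumulatePacked(float jtj[kPacked], const float jac[kDim]) {
    for (std::size_t r = 0; r < kDim; ++r)
        for (std::size_t c = r; c < kDim; ++c)
            jtj[packedIndex(r, c)] += jac[r] * jac[c];
}
```

This stores 21 floats instead of 36, which is where the saving of 15 floats comes from.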

With those declared, let’s look at the `calibrate()` routine. It starts by setting up some general parameters: how small a $\bar{\delta}$ is small enough to stop (`eps`), after how many iterations we should give up (`num_iterations`), and how big the last $\bar{\delta}$ was (`change`). It then enters the main loop. There is nothing subtle about this loop, just

- clear the matrices
- compute the calibration matrices $\mathbf{J}^T\mathbf{J}$ and $\mathbf{J}^T\vec{\mathbf{r}}$
- find $\bar{\delta}$ by solving the matrix equation
- measure the size of $\bar{\delta}$
- adjust $\bar{\beta}$

Here is what that looks like in code.

    void calibrate() {
      int i;
      float eps = 0.000000001;
      int num_iterations = 20;
      float change = 100.0;
      while (--num_iterations >= 0 && change > eps) {
        //set all of the matrix entries to zero.
        reset_calibration_matrices();

        //use the samples and the current values of beta to compute
        // the matrices JtJ and JtR
        compute_calibration_matrices();

        //solve the matrix equation JtJ*delta = JtR
        find_delta();

        //find the size of delta (look at log change in the coefficient parameters)
        change = delta[0]*delta[0] + delta[1]*delta[1] + delta[2]*delta[2]
               + delta[3]*delta[3]/(beta[3]*beta[3])
               + delta[4]*delta[4]/(beta[4]*beta[4])
               + delta[5]*delta[5]/(beta[5]*beta[5]);

        //adjust beta
        for(i=0;i<6;++i) {
          beta[i] -= delta[i];
        }
      }
    }

But up to this point, this is really just a template for an algorithm. As we’ve just been discussing, the real work of the algorithm all falls into the routine `compute_calibration_matrices()`. Everything else is simple. In this implementation, we compute the matrices by scanning the samples and updating the matrix entries as required by Equations 1–5. Here is an implementation.

    void compute_calibration_matrices() {
      int i;
      reset_calibration_matrices();
      for(i=0;i<N;i++) {
        update_calibration_matrices(x+3*i);
      }
    }

    //data points at one observation: three consecutive int16_t readings
    void update_calibration_matrices(const int16_t* data) {
      int j, k;
      float dx, b;
      float residual = 1.0;
      float jacobian[6];

      for(j=0;j<3;++j) {
        b = beta[3+j];
        dx = ((float)data[j]) - beta[j];
        residual -= pow(dx/b,2);
        jacobian[j] = 2.0*dx/pow(b,2);
        jacobian[3+j] = 2.0*dx*dx/pow(b,3);
      }

      for(j=0;j<6;++j) {
        JtR[j] += jacobian[j]*residual;
        for(k=0;k<6;++k) {
          JtJ[j][k] += jacobian[j]*jacobian[k];
        }
      }
    }
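The `find_delta()` routine called from `calibrate()` is not listed in this post. As a rough idea of what solving the $6 \times 6$ system can look like, here is a minimal sketch of Gaussian elimination with partial pivoting. The function name `solve6`, its signature, and the singularity threshold are assumptions made for this example, not the code from the original sketch or from muCSense.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>

constexpr std::size_t kParams = 6;

// Solve A * x = b for a 6x6 system by Gaussian elimination with
// partial pivoting. A and b are overwritten. Returns false if a pivot
// is (numerically) zero, i.e. the matrix looks singular.
inline bool solve6(float A[kParams][kParams], float b[kParams],
                   float x[kParams]) {
    for (std::size_t col = 0; col < kParams; ++col) {
        // Pick the row with the largest entry in this column.
        std::size_t pivot = col;
        for (std::size_t r = col + 1; r < kParams; ++r)
            if (std::fabs(A[r][col]) > std::fabs(A[pivot][col])) pivot = r;
        if (std::fabs(A[pivot][col]) < 1e-20f) return false;
        if (pivot != col) {
            for (std::size_t c = 0; c < kParams; ++c)
                std::swap(A[col][c], A[pivot][c]);
            std::swap(b[col], b[pivot]);
        }
        // Eliminate this column from the rows below.
        for (std::size_t r = col + 1; r < kParams; ++r) {
            const float f = A[r][col] / A[col][col];
            for (std::size_t c = col; c < kParams; ++c)
                A[r][c] -= f * A[col][c];
            b[r] -= f * b[col];
        }
    }
    // Back substitution.
    for (std::size_t i = kParams; i-- > 0;) {
        float sum = b[i];
        for (std::size_t c = i + 1; c < kParams; ++c) sum -= A[i][c] * x[c];
        x[i] = sum / A[i][i];
    }
    return true;
}
```

In the setting of this post, `A` would play the role of `JtJ`, `b` the role of `JtR`, and `x` the role of `delta`.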

**3. Wrap-up and what’s next**

By being careful about computing only the matrices we need, we were able to reduce the memory cost of the Gauss-Newton algorithm significantly and produce an algorithm that is viable on an inexpensive microcontroller. But we still need to keep the samples around and make a full pass through the sample set with each iterative pass, because as $\bar{\beta}$ changes so do the matrices. This costs space and time and limits the viability of the algorithm.

In the next post we will see that even here we are storing too much. Thanks to the nature of our residual function, we will see that there are 25 statistics that we can compute in one pass through the observation set that contain all of the information we need to compute $\mathbf{J}^T\mathbf{J}$ and $\mathbf{J}^T\vec{\mathbf{r}}$ for any given $\bar{\beta}$. Once we do this, we will have an algorithm with space costs that are independent of the number of observations. (Well, really logarithmic because with too many samples a float will not have enough precision to capture the information in the marginal sample, but I’m not worrying about that for now.)
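The precision caveat in the parenthetical can be seen directly: a 32-bit float carries a 24-bit significand, so once a running sum reaches $2^{24}$ a further increment of 1 is rounded away. The helper name `isAbsorbed` below is made up for this illustration.

```cpp
// Returns true when adding `increment` to `total` changes nothing,
// i.e. the marginal sample is lost to rounding.
inline bool isAbsorbed(float total, float increment) {
    volatile float sum = total + increment;  // force rounding to float
    return sum == total;
}
```

For example, `isAbsorbed(16777216.0f, 1.0f)` is true, while `isAbsorbed(1000.0f, 1.0f)` is false. Sums of squares and fourth powers of raw readings grow quickly, so this limit may be reached sooner than the sample count alone suggests.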

I’ll outline a wealth of options we have in answering these questions in later posts. This article will specify reasonable answers to these questions, use those answers to pose a clear problem, then describe an algorithm to solve the problem.

The naïve algorithm presented here is easy to implement but requires too much memory for many micro-controller applications. Some simple refinements give us a second algorithm that still requires space linear in the number of samples, but with a constant small enough to make it usable on an Arduino. Finally we observe that we can collect 25 summary statistics of our data and have all the information we need to carry out the Gauss-Newton algorithm, giving us a constant space, constant time streaming algorithm. These algorithms will be described in detail in later articles.

Although the focus here is on sphere fitting, the methods described, including the constant-space streaming algorithm, have straightforward (if tedious) extensions to very general polynomial mapping problems.

First, a bit about notation.

**1. Notation**

We denote real 3-vectors with boldface lower-case Latin letters and label the axes with the integers $0, 1, 2$. So $\mathbf{x} = (x_0, x_1, x_2)$. We assume that we are given a set of $N$ 3-vectors $\{\mathbf{x}_i\}$.

We work with a real vector space of parameters (6-dimensional in this application) and will denote these parameter vectors with Greek letters that have an overbar. For example $\bar{\beta}$ and $\bar{\delta}$ are parameter vectors.

We also work with $N$-vectors where $N$ is the number of samples. These are denoted by boldface lower-case Latin letters with an arrow above. The residual vector $\vec{\mathbf{r}}$ is one example we will encounter soon.

Matrices of any dimension will be denoted by boldface upper-case Latin letters and transpose is denoted by $^T$, so $\mathbf{J}$ is a matrix and $\mathbf{J}^T$ is its transpose.

The Latin letter $i$ will be used to index the samples and other sets of size $N$. The letters $j$ and $k$ will index the dimensions of space: $j, k \in \{0, 1, 2\}$.

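**2. The Problem**

To answer Question 1, we restrict our attention to a 6-parameter family of maps that allows each axis to be shifted and scaled independently.

Definition 1 (Axial Affine Transformation) Given a 3-vector $\mathbf{x}$ and parameters $\bar{\beta} \in \mathbb{R}^6$, define the *axial affine transformation* of $\mathbf{x}$ for parameters $\bar{\beta}$ to be

$$T_{\bar{\beta}}(\mathbf{x}) = \left(\frac{x_0-\beta_0}{\beta_3}, \frac{x_1-\beta_1}{\beta_4}, \frac{x_2-\beta_2}{\beta_5}\right)$$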

Three comments are in order here.

- This is a special case of a general affine transformation on $\mathbb{R}^3$; we have just ruled out shear. The techniques described below extend in a straightforward way to handle general affine transformations, but are more involved and more difficult to follow. Also when calibrating MEMS sensors in practice the axial parameters are far more important than the shear parameters. It is not clear that modeling shear is worthwhile, but this needs more investigation.
- At this point it would look nicer to name the parameters like we did the first time we looked at the Gauss-Newton method, with separate names for the offsets and for the sensitivities. In the calculations that follow, we study functions on the six-dimensional space of parameters and want to treat them uniformly. Special names for the parameters actually make those calculations more confusing (for me anyway, YMMV). We store them as one array in the code too, so it is best to get used to that early.
- You may wonder why we *divide* by $\beta_{3+j}$ instead of multiply. In theory, both choices should work equally well. In practice, samples coming from a magnetometer will generally give us offset values on the order of 10-1000 and sensitivities on the order of 100-1000. If we multiplied our parameter instead of dividing it, we’d have to work with the inverse of the sensitivity, which will be on the order of 0.01 – 0.001. So we may have parameters differing by 6 orders of magnitude. In the intermediate matrices we’ll see in the Gauss-Newton calculations, this can lead to matrix entries differing by 12 orders of magnitude, creating “nearly singular” matrices that can’t be handled with 32-bit entries. I know this from experience: I did this my first time through and saw data come along and break it. Choosing parameters that are about the same order of magnitude leads to stable, reliable performance.

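Now we address Question 2: how can we decide when one map is better than another? We are given a set of *observations* $\{\mathbf{x}_i\}$ and are trying to find an axial affine transformation that takes all of these onto the unit sphere. If the map does this perfectly, then for all $i$ we will have

$$\sum_{j=0}^{2} \frac{(x_{ij}-\beta_j)^2}{\beta_{3+j}^2} = 1$$

If $N > 6$, we cannot hope to make this always happen but if our model is reasonable we can hope to get close. If the map is good, then the difference between the two sides should be small. So we define

Definition 2 (Residual) For sample $\mathbf{x}_i$ and parameters $\bar{\beta}$, the *residual* of $\mathbf{x}_i$ for $\bar{\beta}$ is

$$r_i(\bar{\beta}) = 1 - \sum_{j=0}^{2} \frac{(x_{ij}-\beta_j)^2}{\beta_{3+j}^2}$$

The *residual vector* is the $N$-vector

$$\vec{\mathbf{r}}(\bar{\beta}) = (r_i(\bar{\beta}))_i$$

This measures the error for a single sample, but we need to combine all of the residuals to get one number representing the “total error”. We do this by taking the sum of square residuals.

$$E(\bar{\beta}) = \sum_i r_i(\bar{\beta})^2$$

We can now state our problem formally.

Problem 3 Given observations $\{\mathbf{x}_i\}$, find parameters $\bar{\beta}$ that minimize $E(\bar{\beta})$. That is, find

$$\bar{\beta}^* = \arg\min_{\bar{\beta}} \sum_i r_i(\bar{\beta})^2$$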
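The quantities in this section can be sketched in a few lines of code, assuming (as in the code elsewhere in this series) that the six parameters live in one array with the offsets in slots 0 to 2 and the sensitivities in slots 3 to 5. The function names are made up for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

// beta[0..2] are per-axis offsets, beta[3..5] per-axis sensitivities.
// Axial affine transformation: shift and scale each axis independently.
inline void transformAxial(const std::int16_t x[3], const float beta[6],
                           float out[3]) {
    for (std::size_t j = 0; j < 3; ++j)
        out[j] = (static_cast<float>(x[j]) - beta[j]) / beta[3 + j];
}

// Residual of one observation: 1 minus the squared length of its image.
inline float residual(const std::int16_t x[3], const float beta[6]) {
    float y[3];
    transformAxial(x, beta, y);
    return 1.0f - (y[0] * y[0] + y[1] * y[1] + y[2] * y[2]);
}

// Total error: sum of squared residuals over n observations stored
// as consecutive (x, y, z) triples.
inline float sumSquaredResiduals(const std::int16_t* obs, std::size_t n,
                                 const float beta[6]) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        const float r = residual(obs + 3 * i, beta);
        total += r * r;
    }
    return total;
}
```

An observation that lands exactly on the unit sphere after the transformation contributes a residual of zero, and the calibration problem is to choose `beta` so that `sumSquaredResiduals` is as small as possible.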

Alarms should be going off when you see this definition. Two apparently arbitrary decisions have been made that could have an important impact on our final results.

- Why did we use the *squared* magnitude of the transformed observation instead of the magnitude itself or something else for the residual?
- Why choose the sum of squares of residuals instead of absolute values or $p$-th powers or some other function to combine residuals?

These decisions were made for *computational convenience*. Using the square of the magnitude in the residual makes the residual a polynomial of $\mathbf{x}$, a critical fact when we develop a streaming Gauss-Newton algorithm for sphere fitting. Using the sum of squares of residuals is what allows us to use the Gauss-Newton algorithm in the first place. Making these decisions differently may lead to better (or worse) calibration results in practice (I don’t know, and I would like to study it), but they will require different algorithms.

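**3. A Quick Recap of the Gauss-Newton Algorithm**

Problem 3 can be solved using the Gauss-Newton algorithm. What we need to know about this method is

- The residual vector is a function $\vec{\mathbf{r}}: \mathbb{R}^6 \to \mathbb{R}^N$.

- The Jacobian matrix $\mathbf{J}$ of this map lets us locally approximate $\vec{\mathbf{r}}$ with a linear function:
$$\vec{\mathbf{r}}(\bar{\beta} + \bar{\delta}) \approx \vec{\mathbf{r}}(\bar{\beta}) + \mathbf{J}\bar{\delta}$$
This is just basic calculus: we’re approximating a smooth function with its first derivative.

- Writing $E(\bar{\beta}) = \sum_i r_i^2$ for the sum of square residuals, the first derivatives of $E$ are then
$$\nabla E = 2\mathbf{J}^T\vec{\mathbf{r}}$$
- The second derivative Hessian matrix of $E$ can be approximated by $2\mathbf{J}^T\mathbf{J}$. These two facts follow from the form of $E$ as a sum of square residuals.
- We can use $\nabla E$ and the approximate Hessian to approximate $E$ with its second-order Taylor expansion. This is a quadratic function and we can find its minimum by solving the six-dimensional matrix equation
$$\mathbf{J}^T\mathbf{J}\,\bar{\delta} = \mathbf{J}^T\vec{\mathbf{r}} \qquad (2)$$
Then $\bar{\beta} - \bar{\delta}$ minimizes the quadratic approximation.

So we can improve our parameters by replacing $\bar{\beta}$ with $\bar{\beta} - \bar{\delta}$.

- This gives us the Gauss-Newton algorithm: start from an initial guess for $\bar{\beta}$, then repeatedly compute $\mathbf{J}^T\mathbf{J}$ and $\mathbf{J}^T\vec{\mathbf{r}}$, solve Equation 2 for $\bar{\delta}$, and replace $\bar{\beta}$ with $\bar{\beta} - \bar{\delta}$, stopping when $\bar{\delta}$ is small enough or an iteration limit is reached.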
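To see the iteration in action on something small, here is a toy one-parameter version: a single scale $b$ is fitted so that $r_i(b) = 1 - (x_i/b)^2$ is small for every sample. This is a simplified stand-in for the six-parameter problem, and the function name `fitScale` and its default arguments are assumptions of this example.

```cpp
#include <cmath>
#include <cstddef>

// Toy one-parameter problem: find the scale b that makes
// r_i(b) = 1 - (x_i / b)^2 small for all samples.
// Each pass accumulates J^T J and J^T r (scalars here), solves
// (J^T J) * delta = J^T r, and replaces b with b - delta.
inline double fitScale(const double* x, std::size_t n, double b,
                       int max_iterations = 20, double eps = 1e-12) {
    for (int it = 0; it < max_iterations; ++it) {
        double jtj = 0.0, jtr = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            const double r = 1.0 - (x[i] * x[i]) / (b * b);
            const double j = 2.0 * x[i] * x[i] / (b * b * b);  // dr/db
            jtj += j * j;
            jtr += j * r;
        }
        if (jtj == 0.0) break;  // all samples are zero: nothing to fit
        const double delta = jtr / jtj;
        b -= delta;
        if (delta * delta < eps) break;
    }
    return b;
}
```

With samples at $\pm 3$ and a starting guess of 2, the iterates move toward 3 within a handful of passes.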

**4. A Naïve Implementation**

The iterative step of the Gauss-Newton algorithm consists of

- computing the matrix $\mathbf{J}^T\mathbf{J}$ and the vector $\mathbf{J}^T\vec{\mathbf{r}}$ for observations $\{\mathbf{x}_i\}$ and parameters $\bar{\beta}$,
- solving Equation 2, and
- updating $\bar{\beta}$ then deciding whether to continue.

Item 2 is a standard linear algebra problem — I do it using textbook Gaussian elimination in muCSense. Item 3 is even more straightforward. Everything interesting happens in item 1.

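The obvious thing to do is compute $\mathbf{J}$ and $\vec{\mathbf{r}}$. Looking at the definition of $r_i$ and differentiating, we find that for $j \in \{0, 1, 2\}$

$$J_{ij} = \frac{\partial r_i}{\partial \beta_j} = \frac{2(x_{ij}-\beta_j)}{\beta_{3+j}^2}$$

$$J_{i,3+j} = \frac{\partial r_i}{\partial \beta_{3+j}} = \frac{2(x_{ij}-\beta_j)^2}{\beta_{3+j}^3}$$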

In code, to do this at some point we would need to have arrays in memory that look something like this

    float r[N];       //residual vector
    float J[N][6];    //Jacobian matrix
    int16_t x[N][3];  //the observations
    float JtJ[21];    //The left-hand side of equation 2. It is symmetric so
                      //we only need to store 21 of the 36 matrix entries.
    float JtR[6];     //The right-hand side of equation 2.

I won’t write the code out for the implementation here because these declarations are enough to tell us that we are in trouble if we are implementing the algorithm on a micro-controller. These arrays require $34N + 108$ bytes of storage. If we take 10 samples we will be fine (448 bytes), but if we take 100 samples we need 3508 bytes. That will lead to some very undefined behavior on my Arduino Uno.
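The byte count can be checked mechanically from the declarations above. The helper name `naiveBytes` is made up for this check, and it assumes 4-byte floats and 2-byte `int16_t` values, as on the Arduino Uno.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes used by the naive layout: r[N], J[N][6], x[N][3], JtJ[21], JtR[6].
constexpr std::size_t naiveBytes(std::size_t n) {
    return n * sizeof(float)                 // r
         + n * 6 * sizeof(float)             // J
         + n * 3 * sizeof(std::int16_t)      // x
         + 21 * sizeof(float)                // JtJ (packed symmetric)
         + 6 * sizeof(float);                // JtR
}

static_assert(sizeof(float) == 4, "the arithmetic assumes 4-byte floats");
static_assert(naiveBytes(10) == 448, "10 samples");
static_assert(naiveBytes(100) == 3508, "100 samples");
```

Per sample the cost is $4 + 24 + 6 = 34$ bytes, on top of a fixed $84 + 24 = 108$ bytes for the two small arrays.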

This will work for small samples, and the implementation is straightforward: loop over the observations and use Equations 3 and 4 to fill in the matrix entries. Then perform the needed matrix multiplications, solve the linear equation, update $\bar{\beta}$, and decide whether to continue. I will leave this as a warm-up exercise for the interested reader.

In the next post we will start writing some real code.


In this post, I’ll walk through an example sketch that uses these new classes. I’ll dig into the internals in later posts.

This sketch assumes that you have connected a SparkFun Sensor Stick to an Arduino (I have an Arduino Uno), just as I described a while back. Here’s the diagram (after the jump):

As explained in the last post, to use the library just download the code from GitHub and put it in a subdirectory of your /libraries/ directory. You may have to create the libraries directory. I put mine in /libraries/muCSense.

This assumes you are using Arduino 1.0.1. I will try to keep the library working on future versions of Arduino.

You can fetch this full sketch from GitHub. I’ll walk through it section by section here.

First we need to include some things. We need Wire.h to get I2C communication going with the sensors, and the other headers point to files in muCSense. You don’t need to include all of these if you aren’t using them.

    #include <Wire.h>
    #include <ADXL345.h>
    #include <HMC5843.h>
    #include <I2C.h>
    #include <ITG3200.h>
    #include <Sensor.h>
    #include <SerialDataListener.h>
    #include <SimpleDataCollector.h>
    #include <BestSphereGaussNewtonCalibrator.h>
    #include <MinMaxSphereCalibrator.h>

Then we declare global objects. This includes

- an array of pointers to Sensors that we will use to access the Sensor Stick sensors
- a **DataCollector** that is responsible for collecting observations from the sensors and making them available to clients like Calibrators. This is a very simple DataCollector: specify a number of samples to take and a number of milliseconds to take them, and it takes that many samples in that many millis. More sophisticated DataCollectors may discard outliers, filter time series, etc.
- pointers to two **Calibrator**s that will listen for updates from the DataCollector and use the observations to estimate calibration parameters for the accelerometer and magnetometer.

    //Global objects. Pointers are to objects that are
    // created and initialized in setup().

    //Pointers to sensors on the SparkFun Sensor Stick
    Sensor* sensors[3];

    //A DataCollector object that will simply collect 100
    // raw readings over 15 seconds
    SimpleDataCollector sdc(100, 15000);

    //The accelerometer and magnetometer will each get a Calibrator
    Calibrator* pAccelCal;
    Calibrator* pMagCal;

The code in this section is all part of the setup() routine. This is where the heavy lifting happens.

The first thing we need to do is start the Wire and Serial objects. I do serial communication at 115200 because that is the default for the bluetooth radio I use.

    Serial.begin(115200); //115200 is default for BlueSMiRF
    Wire.begin(); //start I2C

Then we create and initialize Sensor objects, just like we did in the last post.

    //Create instances of the sensors we'll be using
    sensors[0] = ADXL345::instance();
    sensors[1] = HMC5843::instance();
    sensors[2] = ITG3200::instance();

    //Initialize the sensors, making sure they are ready
    // for continuous measurement.
    // Once initialized, add each sensor to the DataCollector.
    Serial.println("Initializing sensors");
    size_t i;
    for(i=0;i<3;++i) {
      sensors[i]->init();
      sdc.addSensor(sensors[i]);
    }
    Serial.println("Sensors ready");

Now we get into the meat of the calibration setup. The main players here are

- DataCollectors — orchestrate collection of data from the sensors. These may filter the data, discard outliers, or provide other data quality services. They may also support a data collection protocol, collecting data in phases, lighting up status lights (as Jens does with his RGB Tumbler ), responding to button presses, or more. The SimpleDataCollector used here just collects a fixed number of raw samples at a specified frequency.
- DataListeners — Responsible for doing something with the data. When a DataCollector has data ready, it notifies its listeners (this follows a textbook Observer design pattern). The listeners might then collect statistics, store the data, or simply write the data out to serial. In this framework, Calibrator objects are a type of DataListener.

    //Now we create some objects that will listen to the
    // data collector.

    //The simplest DataListener is a SerialDataListener.
    // When notified that a DataCollector
    // has data ready, it simply fetches the data and
    // writes it to Serial
    SerialDataListener* pSDL = new SerialDataListener;

    //Have the SerialDataListener listen to all of the sensors
    for(i=0;i<3;++i) {
      pSDL->addSensor(sensors[i]);
    }

Then we create two Calibrator objects: one for the ADXL345 and the other for the HMC5843. Here we have two kinds of calibrators. The accelerometer uses a heavyweight Gauss-Newton calibrator that finds the model parameters that are optimal (in a certain sense, discussed here).

For the magnetometer, we use a much more lightweight (and less accurate, but still decent) calibrator that uses the minimum and maximum observations in each dimension to estimate offset and sensitivity.
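The min/max idea can be sketched as follows, under the assumption that the offset is taken as the midpoint of each axis range and the sensitivity as half of the range. This is one natural reading of the description above, not a transcription of `MinMaxSphereCalibrator`, and the names `MinMax3`, `observe`, and `estimate` are made up.

```cpp
#include <cstddef>
#include <cstdint>

// Running per-axis extremes of the observations.
struct MinMax3 {
    std::int16_t lo[3] = {INT16_MAX, INT16_MAX, INT16_MAX};
    std::int16_t hi[3] = {INT16_MIN, INT16_MIN, INT16_MIN};
};

// Fold one observation into the running extremes.
inline void observe(MinMax3& m, const std::int16_t x[3]) {
    for (std::size_t j = 0; j < 3; ++j) {
        if (x[j] < m.lo[j]) m.lo[j] = x[j];
        if (x[j] > m.hi[j]) m.hi[j] = x[j];
    }
}

// beta[0..2]: offsets (midpoint of the range);
// beta[3..5]: sensitivities (half the range).
inline void estimate(const MinMax3& m, float beta[6]) {
    for (std::size_t j = 0; j < 3; ++j) {
        const float hi = static_cast<float>(m.hi[j]);
        const float lo = static_cast<float>(m.lo[j]);
        beta[j] = 0.5f * (hi + lo);
        beta[3 + j] = 0.5f * (hi - lo);
    }
}
```

Only six numbers are stored, which is why a calibrator of this kind can be so much lighter than the Gauss-Newton one. The estimate is only as good as the extremes it sees, so a single outlier on any axis shifts the result.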

Either one of these calibrators could be used on either a magnetometer or accelerometer. The point here is that you are free to mix and match, or write your own Calibrators, as you see fit. The choice of Calibrator has a major impact on the size of your compiled binary.

    //the ADXL345
    pAccelCal = new BestSphereGaussNewtonCalibrator(sensors[0]);

    //the HMC5843
    pMagCal = new MinMaxSphereCalibrator(sensors[1]);

Before going on, we need to let the DataCollector know who is listening for updates.

    //Now we need to register each of our DataListeners with
    // the DataCollector. This
    // lets the DataCollector know which listeners to
    // notify when new data is ready.
    sdc.addListener(pSDL);
    sdc.addListener(pMagCal);
    sdc.addListener(pAccelCal);

And finally we can collect the data and calibrate. As coded, this will collect 100 samples over 15 seconds. While it is running, it will output raw observations to serial (this is what the SerialDataListener is doing). During this time, smoothly rotate the sensor around so that each axis becomes oriented vertically both up and down at least once.

If you are aware of your local magnetic field, try to make sure you orient the axes along the magnetic field too. Where I live in Seattle, the Earth’s magnetic field points about 70 degrees straight down into the ground, so I don’t worry about it much. Near the equator it may matter.

One more note: this SimpleDataCollector is not ideal for accelerometer calibration. It helps to do something more careful, but I haven’t ported that something into this library yet.

    //Run the data collection strategy
    sdc.collect();

    //Perform calibration on the collected data.
    // We don't have a gyro calibrator here.
    pMagCal->calibrate();
    pAccelCal->calibrate();

At this point, setup is complete and the hard work is done.

Now we’ll run a simple loop to read, transform, and print sensor readings using our newly computed calibration parameters.

First we’ll name some convenience variables and read the sensors.

    //Pointer to a buffer that will contain the raw readings
    const int16_t* buf = 0;
    Sensor* pAccel = sensors[0];
    Sensor* pMag = sensors[1];

    //read the sensors directly
    for(size_t i=0;i<3;++i) {
      sensors[i]->read();
    }

Then we will print the raw readings, transform the raw readings into calibrated readings, and print the calibrated readings.

    //Read and print the accelerometer data
    buf = pAccel->rawReading();
    printInt16Array(buf,3);

    //Now transform the reading using the calibrator.
    // Accumulate the sum
    // of squares too, so we can check that it is close to 1.
    float calibrated[4];
    calibrated[3] = 0.0;
    pAccelCal->transform(buf, calibrated);
    for(int i = 0; i < 3; ++i) {
      calibrated[3] += calibrated[i]*calibrated[i];
      Serial.print(calibrated[i]);
      Serial.print("\t");
    }
    Serial.println(sqrt(calibrated[3]));

    //Read and print the magnetometer data
    buf = pMag->rawReading();
    printInt16Array(buf,3);

    //Now transform the reading using the calibrator.
    // Accumulate the sum
    // of squares too, so we can check that it is close to 1.
    calibrated[3] = 0.0;
    pMagCal->transform(buf, calibrated);
    for(int i = 0; i < 3; ++i) {
      calibrated[3] += calibrated[i]*calibrated[i];
      Serial.print(calibrated[i]);
      Serial.print("\t");
    }
    Serial.println(sqrt(calibrated[3]));

Then print a line to separate readings and wait one second.

    Serial.println("_________________________");
    delay(1000);

That’s it!

This is freshly written and there are plenty of problems to fix and features to add. The most fundamental problem I see is with using the Gauss-Newton calibration objects. While I’ve tried to implement them efficiently, the objects are huge for the 2K of RAM we’re working with — each instance is 150 bytes! — so you can’t allocate too many of them on the heap without overwriting the stack (this doesn’t happen gracefully on an Arduino BTW). I have some ideas to trim it down, but for now I only see how to reclaim 15 bytes.

What’s worse is that the code size is huge. Using the Gauss-Newton calibrator instead of the Min-Max calibrator bloats the binary sketch by about 10KB. Maybe this just needs to be used to collect good numbers that we store in EEPROM before reprogramming.

With that said, there is structure in the code that could be exploited to significantly reduce the size.

Beyond that, all of the issues I raised in my last post still stand.

So there is work to do. But if you are looking for code to calibrate your accelerometer or magnetometer, this should get you started.

*(UPDATE: I originally had named this library “SensorLib” — very creative, I know! — and found there were already a few SensorLibs on the internet. So I changed it to muCSense, for “micro-controller sensor”. It will take me just a little while to purge everything of the name SensorLib, but I’m working on it.)*

you’d have a big mess.

It would work, but it would be awful to read, maintain, or upgrade. And what if you wanted to put some real logic in it? Maybe something that does more than dump raw readings to Serial?

And what happens when we get other sensors? We will usually be doing the same things:

- Define device-specific constants, likely lifted from the datasheet.
- Initialize.
- Read data.
- Calibrate.
- Maybe tweak some device specific parameters, listen for interrupts, etc.

Everything but item 5 can be pretty well abstracted away, and we can encapsulate all of the device specific mess in classes that won’t clutter our code or namespaces. Doing this will leave us with code that is more robust, easier to use correctly, and easier to maintain.

That’s why I am writing muCSense.

The purpose of muCSense is to provide a framework that abstracts basic operations of initializing, accessing, and calibrating sensors while allowing enough flexibility to allow clients full control over the specific features of each sensor. Along with the framework, the library provides implementation classes for a variety of common sensors.

That’s the goal, anyway. I’m still working on the calibration part of the library, and I will write about that — and about some of the cool algorithms behind it — once I get the code online. As of today, I have checked in a basic sensor access framework along with implementation classes for all of the sensors on the 9DoF Sensor Stick.

Let me give you a quick taste of how this library changes your code. If you haven’t looked at the code I already posted for the 9DoF Sensor Stick, open a new tab and have a look now.

Now let’s look at a sketch that will produce the same output (modulo startup messages) using muCSense. Start with a few needed #includes.

    #include <Wire.h>
    #include <ADXL345.h>
    #include <HMC5843.h>
    #include <ITG3200.h>

Next we’ll declare a few pointers for implemented sensors that we’ll construct on the heap. We’ll also keep an array of Sensor objects and an array of names that helps us print things later.

    //Pointers to implemented sensors that we'll create on the heap.
    ADXL345* pAccel;
    HMC5843* pMag;
    ITG3200* pGyro;

    Sensor* sensors[3];
    //String literals are read-only, so the pointers are const char*.
    const char* names[] = {"ACCEL: ", "MAG: ", "GYRO: "};

The setup routine is simple: we use factory methods to fetch the single instances of each of our sensors. It is possible that we have more than one of these sensors connected. In that case, you just need to pass the I2C bus address of the sensor you want as an argument to the ::instance() method.

Once they are created, we put the Sensor objects into an array and loop through to initialize them.

    void setup() {
      Serial.begin(115200); //Default for BlueSMiRF
      Wire.begin();

      //Call factory methods to get single instances of each sensor.
      // To get a sensor at a specific I2C bus address, pass the
      // address in to the factory method.
      pAccel = ADXL345::instance();
      pMag = HMC5843::instance();
      pGyro = ITG3200::instance();

      //Put the sensors into the array. We could have
      //constructed them here from the start, but sometimes
      //it is nice to have access to the derived class pointers.
      sensors[0] = pAccel;
      sensors[1] = pMag;
      sensors[2] = pGyro;

      //All of our objects implement the Sensor interface,
      //so we can work with them uniformly. Here we initialize.
      Serial.println("Initializing sensors");
      for(size_t i=0; i < 3 ; ++i) {
        sensors[i]->init();
      }
      Serial.println("Sensors ready");
    }

While running, we can loop through the sensors and tell each one to fetch a new reading, then loop through them again to see what the reading was. Why separate the steps like that? There are many times I want to have many clients listening to one sensor, and I want them all to get the same data. In the calibration part of the library (not checked in yet!) I do this with an Observer pattern, where a DataCollector object tells the sensor to read(), then notifies all of the listeners to come get their data.

    void loop()
    {
      //Now we read the sensors. This call actually triggers
      // I2C communication (or whatever other communication is needed)
      // with the device and stores the readings internally.
      for(size_t i=0; i < 3 ; ++i)
      {
        sensors[i]->read();
      }

      //Now that the sensors have been read, we can ask the sensors what
      // the readings were. This DOES NOT trigger communication with the
      // sensor and the rawReading method is const. This allows multiple
      // clients to request the rawReading and they will all get the
      // same result -- a feature that can be very important.
      const int16_t* buf = 0;
      for(size_t i=0; i < 3 ; ++i)
      {
        buf = sensors[i]->rawReading();
        Serial.print(names[i]);
        printInt16Array(buf,sensors[i]->dim());
      }
      Serial.println();

      delay(100);
    }

Finally, here is a little helper function I used above (that really should be a template).

void printInt16Array(const int16_t* buf, size_t len) { size_t i; for(i=0;i
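
Since the helper "really should be a template", here is a minimal sketch of that idea. It is not the library's code: to keep it runnable off the board it builds a std::string instead of printing to Serial, and the name formatArray and the tab separator are my assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical stand-in for the print helper: formats an array of any
// streamable type as tab-separated text.
template <typename T>
std::string formatArray(const T* buf, std::size_t len) {
    std::ostringstream out;
    for (std::size_t i = 0; i < len; ++i) {
        if (i > 0) out << '\t';   // separator between elements only
        out << buf[i];
    }
    return out.str();
}
```

On an Arduino the same loop would call Serial.print on each element instead of streaming into a string. Note that 8-bit integer types stream as characters here, so they would need a cast to int first.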

Yes, it is still a bit long but notice that most of it is whitespace and comments. There are no device-specific #defines. You can create, initialize, and start reading data from a sensor in four lines of code. If you have a bunch of sensors — like we do on the Sensor Stick — then you can put them in an array and loop through them using a standard interface.

To make the sketch above work, you need to download and install muCSense. To do that:

1. Download and install Arduino 1.0.1. I don’t know if it works on earlier versions, but I’ll try to keep it working on later versions.
2. In your Arduino directory, create a directory called “libraries” if it isn’t already there.
3. Download the code as a zipfile from the GitHub repository and extract the files into the “libraries” directory you created in Step 2. It will extract into a folder called something like “rolfeschmidt-muCSense-9b13d30” in your libraries directory. Feel free to change the name.
4. Now you can either put a sketch together from the snippets above, or grab one I already put together here. You should be up and running in no time. Enjoy!

I’ve been using this for my own projects for a week or two, but it is far from complete and needs thorough testing. So caveat downloador.

Some particular issues that currently need to be addressed:

1. Need a unit testing framework and a set of tests.
2. Could use an implementation class for ADXL335 (connects through analog pins).
3. Could use implementation class for simple potentiometers — I particularly want to use flex sensors.
4. A “keywords” file would be nice.
5. There is no wiki or documentation.
6. There is nothing in there for calibration.
7. EEPROM storage doesn’t work.

Right now (August 2012), I’m working hard on item 6, making a calibration library, and hope to have something online within a week. If anyone wants to join in and help me get the other parts done, that would be great!

Accelerometers are fun, but I want something more. So I picked up a “sensor stick” from Sparkfun.

It has an ADXL345 accelerometer (datasheet), an ITG-3200 gyroscope (datasheet), and an HMC5843 magnetometer (datasheet). When still, the accelerometer can read the local gravitational field and the magnetometer can read the local magnetic field. As long as I’m not at a magnetic pole, this is enough to give me an external frame of reference and know exactly how my sensor is oriented in space. When the device is spinning, the gyro can help me separate translational acceleration from rotational acceleration as well as help me keep a running estimate of my sensor’s orientation even when I can’t trust the acceleration. A system that does this is called an Attitude and Heading Reference System (AHRS).

I’m just getting into the Mathematics of how to use all of this data, and it is going to be good fun. For today I’m going to talk about a much simpler problem: how do you connect to this board, initialize it, and read data from it.

Reading nine analog measurements directly would require nine analog pins, something we don’t have. Instead, we will access these sensors using I2C, a two-wire protocol that allows us to put up to 112 devices on one bus. This makes the wiring simple and the programming just a touch more tedious.

There are loads of great introductions to I2C on the web. I may add to them one day, but for now, use your favorite search engine and learn about it if you don’t already know.

First things first, I soldered wires to my sensor stick instead of the pluggable headers I usually work with. I figured it would be more flexible if I wanted to stick it on a ski or sew it into my clothing later. This is what my stick looks like:

Thanks to I2C, our circuit is dead simple. We just need the following connections to get this going on an Arduino Uno:

- Sensor stick VCC -> Arduino 3.3V
- Sensor stick GND -> Arduino GND
- Sensor stick SDA -> Arduino A4 (Arduino’s SDA, or data line)
- Sensor stick SCL -> Arduino A5 (Arduino’s SCL, or clock line)

Here’s a Fritzing diagram in case the words are too complicated…

Once the circuit is put together, you are ready to start programming.

In order to access the sensor stick, we will need to

1. know the 7-bit I2C addresses for each of our sensors
2. know the addresses of each register we need to access on each sensor
3. know how to initialize the sensors to make sure they are recording data, configured as we want them, and not sitting in power save mode.
4. start reading data

Steps 1-3 require spending some quality time with the datasheets (linked above) and the schematic of the sensor stick so you know which pins are being pulled to GND, etc. Or you can just copy my code below.

I’ll walk through code for steps 1-3 for each of the three sensors before putting it all together in a simple data collection loop.

All I2C access will be done using the Wire library, so put this line at the top of your sketch:

#include <Wire.h>

I also use these two utility functions to interact with the I2C bus throughout the code. This is really all there is to I2C on Arduino if your needs are simple. These methods are based on ones I found on Keith Neufeld’s blog.

    void i2c_write(int address, byte reg, byte data)
    {
      // Send output register address
      Wire.beginTransmission(address);
      Wire.write(reg);  // Wire.send() before Arduino 1.0
      // Connect to device and send byte
      Wire.write(data);
      Wire.endTransmission();
    }

    void i2c_read(int address, byte reg, int count, byte* data)
    {
      int i = 0;

      // Send input register address
      Wire.beginTransmission(address);
      Wire.write(reg);
      Wire.endTransmission();

      // Request bytes. requestFrom() performs the whole read transaction,
      // so it is not wrapped in beginTransmission()/endTransmission().
      Wire.requestFrom(address, count);
      while(Wire.available() && i < count) // slave may send less than requested
      {
        data[i] = Wire.read(); // Wire.receive() before Arduino 1.0
        i++;
      }
    }

Per the datasheet, the 8-bit address for the ADXL345 is 0x3A unless pin 12 is pulled to GND, then it is 0xA6. A quick look at the schematic shows that pin 12 is indeed pulled low, so our address is 0xA6. (Don’t ask how long I kept trying and failing to use 0x3A!)

The Wire library just wants a 7-bit address, so we right shift and add this line to our code:

#define ADXL345_ADDRESS (0xA6 >> 1)
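
The shift works because the 8-bit form of an I2C address is just the 7-bit address with the read/write bit appended as the lowest bit. A small sketch of that relationship (the function names are mine, not part of the Wire library):

```cpp
#include <cstdint>

// Convert the 8-bit address printed in a datasheet to the 7-bit
// address that the Wire library expects.
constexpr std::uint8_t toSevenBit(std::uint8_t eightBitAddress) {
    return static_cast<std::uint8_t>(eightBitAddress >> 1);
}

// Rebuild the two 8-bit forms from a 7-bit address.
constexpr std::uint8_t writeAddress(std::uint8_t sevenBit) {
    return static_cast<std::uint8_t>(sevenBit << 1);        // R/W bit = 0
}
constexpr std::uint8_t readAddress(std::uint8_t sevenBit) {
    return static_cast<std::uint8_t>((sevenBit << 1) | 1);  // R/W bit = 1
}
```

For the ADXL345 on this board, toSevenBit(0xA6) is 0x53, which is the value ADXL345_ADDRESS expands to.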

There are a couple of register addresses we need to access:

    //There are 6 data registers, they are sequential starting
    //with the LSB of X. We'll read all 6 in a burst and won't
    //address them individually
    #define ADXL345_REGISTER_XLSB (0x32)

    //Need to set power control bit to wake up the adxl345
    #define ADXL_REGISTER_PWRCTL (0x2D)
    #define ADXL_PWRCTL_MEASURE (1 << 3)

With these definitions in place, this is how we initialize the ADXL345 on startup:

    void init_adxl345()
    {
      byte data = 0;

      i2c_write(ADXL345_ADDRESS, ADXL_REGISTER_PWRCTL, ADXL_PWRCTL_MEASURE);

      //Check to see if it worked!
      i2c_read(ADXL345_ADDRESS, ADXL_REGISTER_PWRCTL, 1, &data);
      Serial.println((unsigned int)data);
    }

Once the accelerometer is initialized we can read the data. It comes to us in a chunk of 6 bytes: the least significant byte of X, the most significant byte of X, the least significant byte of Y, the most significant byte of Y, and so on. Here’s the code.

    int accelerometer_data[3];

    void read_adxl345()
    {
      byte bytes[6];
      memset(bytes,0,6);

      //read 6 bytes from the ADXL345
      i2c_read(ADXL345_ADDRESS, ADXL345_REGISTER_XLSB, 6, bytes);

      //now unpack the bytes
      for (int i=0;i<3;++i)
      {
        accelerometer_data[i] = (int)bytes[2*i] + (((int)bytes[2*i + 1]) << 8);
      }
    }

Perusing the datasheet and schematic like we did above, we find that the ITG-3200 has 8-bit address 0xD0. To get reasonable data from it, we must also initialize the scale and digital low-pass filter. *This is critical: the default value for the scale on startup is 0b00, a reserved value!* Here are the definitions we need:

    #define ITG3200_ADDRESS (0xD0 >> 1)
    //request burst of 6 bytes from this address
    #define ITG3200_REGISTER_XMSB (0x1D)
    #define ITG3200_REGISTER_DLPF_FS (0x16)
    #define ITG3200_FULLSCALE (0x03 << 3)
    #define ITG3200_42HZ (0x03)

And here is the initialization function

    void init_itg3200()
    {
      byte data = 0;

      //Set DLPF to 42 Hz (change it if you want) and
      //set the scale to "Full Scale"
      i2c_write(ITG3200_ADDRESS, ITG3200_REGISTER_DLPF_FS, ITG3200_FULLSCALE | ITG3200_42HZ);

      //Sanity check! Make sure the register value is correct.
      i2c_read(ITG3200_ADDRESS, ITG3200_REGISTER_DLPF_FS, 1, &data);
      Serial.println((unsigned int)data);
    }

Once initialized, we can use the following function to capture the data. Notice that the most-significant and least-significant bytes come in a different order than they did with the ADXL345.

    int gyro_data[3];

    void read_itg3200()
    {
      byte bytes[6];
      memset(bytes,0,6);

      //read 6 bytes from the ITG3200
      i2c_read(ITG3200_ADDRESS, ITG3200_REGISTER_XMSB, 6, bytes);

      //now unpack the bytes
      for (int i=0;i<3;++i)
      {
        gyro_data[i] = (int)bytes[2*i + 1] + (((int)bytes[2*i]) << 8);
      }
    }
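
The two unpacking loops differ only in which byte of each pair is the high one. Here is a minimal portable sketch of both orders (the helper names are mine). It uses a fixed 16-bit type on purpose: the sketches above rely on int being 16 bits, which holds on an ATMEGA, but on a board with 32-bit int a negative reading would come out as a large positive number.

```cpp
#include <cstdint>

// ADXL345 order: least-significant byte first.
inline std::int16_t unpackLittleEndian(std::uint8_t lsb, std::uint8_t msb) {
    const std::uint16_t raw = static_cast<std::uint16_t>(
        static_cast<std::uint16_t>(lsb) | (static_cast<std::uint16_t>(msb) << 8));
    return static_cast<std::int16_t>(raw);  // reinterpret as two's complement
}

// ITG-3200 order: most-significant byte first.
inline std::int16_t unpackBigEndian(std::uint8_t msb, std::uint8_t lsb) {
    return unpackLittleEndian(lsb, msb);
}
```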

Now let’s do the same work for our magnetometer, the HMC5843. The trick with this one is that by default it starts in a single measurement mode. If you don’t change that, you’ll just keep getting the same reading over and over.

    #define HMC5843_ADDRESS (0x3C >> 1)
    //First data address of 6 is XMSB. Also need to set a configuration register for
    //continuous measurement
    #define HMC5843_REGISTER_XMSB (0x03)
    #define HMC5843_REGISTER_MEASMODE (0x02)
    #define HMC5843_MEASMODE_CONT (0x00)

The initialization just sets the measurement mode.

    void init_hmc5843()
    {
      byte data = 0;

      //set up continuous measurement
      i2c_write(HMC5843_ADDRESS, HMC5843_REGISTER_MEASMODE, HMC5843_MEASMODE_CONT);

      //Sanity check, make sure the register value is correct.
      i2c_read(HMC5843_ADDRESS, HMC5843_REGISTER_MEASMODE, 1, &data);
      Serial.println((unsigned int)data);
    }

With this done, we can now read the data registers, almost exactly like we did for the gyro.

    int magnetometer_data[3];

    //Named read_hmc5843 to match the call in loop() below.
    void read_hmc5843()
    {
      byte bytes[6];
      memset(bytes,0,6);

      //read 6 bytes from the HMC5843
      i2c_read(HMC5843_ADDRESS, HMC5843_REGISTER_XMSB, 6, bytes);

      //now unpack the bytes
      for (int i=0;i<3;++i)
      {
        magnetometer_data[i] = (int)bytes[2*i + 1] + (((int)bytes[2*i]) << 8);
      }
    }

First let’s perform all of the initialization we need on setup:

    void setup()
    {
      Wire.begin();
      Serial.begin(9600);

      for(int i = 0; i < 3; ++i)
      {
        accelerometer_data[i] = magnetometer_data[i] = gyro_data[i] = 0;
      }

      init_adxl345();
      init_hmc5843();
      init_itg3200();
    }
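
Each init function prints the register value it reads back, so it helps to know what to expect on the serial monitor. Assuming every write took effect and each device returns the register unchanged, the three numbers should be 8, 0, and 27, in the order setup() calls the init functions. A small check of that arithmetic (the constant names are mine; the values come from the #defines in this post):

```cpp
#include <cstdint>

// Values written by the three init functions.
constexpr std::uint8_t kAdxl345PowerCtl = 1 << 3;              // measure bit
constexpr std::uint8_t kHmc5843Mode     = 0x00;                // continuous mode
constexpr std::uint8_t kItg3200DlpfFs   = (0x03 << 3) | 0x03;  // full scale | 42 Hz

static_assert(kAdxl345PowerCtl == 8,  "ADXL345 POWER_CTL should echo 8");
static_assert(kHmc5843Mode     == 0,  "HMC5843 mode register should echo 0");
static_assert(kItg3200DlpfFs   == 27, "ITG-3200 DLPF_FS should echo 27");
```

If one of the printed numbers differs, the usual suspects are a wrong I2C address or a wiring problem.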

Now that all of the hard work is done, a simple little loop can hum away and report all of your raw data for you:

    void loop()
    {
      read_adxl345();
      Serial.print("ACCEL: ");
      Serial.print(accelerometer_data[0]);
      Serial.print("\t");
      Serial.print(accelerometer_data[1]);
      Serial.print("\t");
      Serial.print(accelerometer_data[2]);
      Serial.print("\n");

      read_hmc5843();
      Serial.print("MAG: ");
      Serial.print(magnetometer_data[0]);
      Serial.print(",");
      Serial.print(magnetometer_data[1]);
      Serial.print(",");
      Serial.print(magnetometer_data[2]);
      Serial.print("\n");

      read_itg3200();
      Serial.print("GYRO: ");
      Serial.print(gyro_data[0]);
      Serial.print("\t");
      Serial.print(gyro_data[1]);
      Serial.print("\t");
      Serial.print(gyro_data[2]);
      Serial.print("\n");

      //Sample at 10Hz
      delay(100);
    }

Of course once you run this and play with it for a while, you’re going to ask “Now how do I calibrate these things?”

Well, the Gauss-Newton method I’ve been beating on in my earlier posts works great for the ADXL345 (of course) and the HMC5843. For the gyro it’s a different story and I’m working on it.

I’ll try to get a full sensor stick sketch with calibration up some time soon too. In the mean time, have fun checking your attitude.

*UPDATE:* If you want a quick and easy way to visualize your raw data to make sure it is reasonable and perhaps do a quick “hand calibration” (not that I’d ever do such a thing), then have a look at this post and try out the gnuplot scripts. Here’s what my magnetometer data looked like after spinning the stick around haphazardly for a while:

It is very close to an ellipsoid, and I was able to run this raw data through my Gauss-Newton octave scripts (described in the accelerometer calibration series) to calibrate the magnetometer beautifully.

1. *Introduction*
2. *Simple Methods*
3. *Least-Squares and Gauss Newton*
4. *Implementing Gauss-Newton on an ATMEGA (this post)*
5. *Error Analysis*
6. *?*

This post is short on explanations; I just wanted to get something a bit more useful out. We can implement the Gauss-Newton method directly on an Arduino using this sketch to drive a circuit with an ADXL335 and two pushbuttons like this:

Once the circuit is wired and the sketch is uploaded, open the serial monitor and move the thing around to 6 or more distinct positions. In each position push the button on pin 2, and hold it still for a second (or until you see the reading print out on the serial monitor). After you’ve collected at least 6 readings, push the button on pin 3; this will perform the optimization and print your parameters.

See how I do it in this very fuzzy video:

**Some details**

In the last post we saw how we could use the Gauss-Newton method with any set of sample readings from a still accelerometer to get an accurate calibration. The big problem was the tediousness. We took measurements on an Arduino, sent them over the serial connection to a computer, ran scripts on the computer to compute calibration parameters, then reprogrammed the Arduino with the new parameters. Sure, most of this could be automated, but it would still require a serial connection to a “real” computer with more than 1K of RAM.

To fix this, we need to implement Gauss-Newton directly on the Arduino, but the ATMEGA’s 1K to 2K memory limitation makes this challenging. We simply cannot store the big matrices used in the intermediate calculations of the Gauss-Newton method. To get around this, we do two things.

- To reduce the number of data samples we need to store without losing too much information about the distribution of samples, we let the accelerometer sit in each position long enough to collect a few samples (I picked 32), then we average them and compute the variance.
- We never materialize the large matrices, instead just compute the smaller matrices on the fly when they are needed.

There are a few other tricks I use to improve the data quality, and I’ll go over all of the details in the next post where I plan to walk through the code.

Even though it is handheld and held by a hand that had too much coffee, this process gives highly repeatable results. I performed this calibration 20 times and recorded the parameters produced. You can see the data in this spreadsheet. The summary is this: the zero-G parameters on each axis have standard deviations less than 0.03% and the sensitivity parameters all have standard deviations less than 0.2%. Accurate? Maybe. Repeatable? Definitely.

**On-Arduino Gauss-Newton Calibration Overall Grades**: Accuracy – **Great**, Easiness – **Great**

In the next post I’ll explain the details behind all of this, then move on to error analysis, saving parameters in EEPROM, and more.

1. *Introduction*
2. *Simple Methods*
3. *Least-Squares and Gauss Newton*
4. *Streaming Gauss-Newton on an ATMEGA*
5. *Error Analysis*
6. *?*

In the last post we looked at some simple calibration methods and they didn’t seem to get us results that were better than the ones we got by just trusting the datasheet. There were two obvious problems:

- Our calibration methods assumed we had placed our accelerometer in perfect axial alignment each time. We were, um, less than perfect.
- We ignored the fact that there is noise in the sensor readings. Eyeballing it, this alone would give us about 1% error.

We could address problem (1) by finding ways to place our device very carefully before taking measurements. We could address problem (2) by averaging our samples (and assuming that the average was a good estimate of the real values).

Instead, we will use some Mathematics to develop a calibration method that is robust to noise and is not dependent on careful sensor placement. This will give more accurate results even though it requires less care by the user.

Here is the idea. We’ve been assuming that there are just six numbers we need to find to get a good calibration: a zero-G offset and a sensitivity for each of the three axes. (See the last post for details and definitions. Also note that if there is any skew we’ll need a more complicated model.) No matter how we place our sensor before we take readings, we assume it is still and is detecting exactly 1G of acceleration in some direction.

So if we read value on the X-pin, on the Y-pin, and on the Z-pin we can look at the acceleration vector given by

If our measurements have no noise and our parameters are correct, the length of this vector should be exactly 1 no matter which way it is pointing. That is

Or

To calibrate our model we just need to find parameters so that this equation is always true. Unfortunately

- With only one observation we get one equation in 6 unknowns. There will be infinitely many solutions in general. How do we pick one?
- With exactly six distinct observations we have 6 equations in 6 unknowns and should get a unique solution. The observations should be as different as possible so the solution isn’t error prone. But we know there is noise in our readings, so this won’t be exact.
- With more than six observations we capture information about the noise in the data but we get more than 6 equations in 6 unknowns and cannot expect to find a solution.

It seems that the right thing to do is work with many observations. We won’t be able to solve our equations, but we can choose parameters that make the errors as small as possible.

Let’s presume we used the circuit we built last time, collected data for the 6-point calibration, and ended up with readings. Write and for the reading on the X-pin, Y-pin, and Z-pin in the -th sample. Then the error we get for sample is

Now we want to find parameters so that all of the are small. We expect some errors so we don’t want to over-penalize these, but large errors suggest significant problems and should be penalized heavily. It turns out that if we look at square errors, , they have just this sort of property and make the algorithms that follow simpler. So we will try to *pick parameters so that the sum of square errors is small*:

Why square errors? In statistics we always look at square errors first, even if it isn’t always right. In this case I believe using square errors is both reasonable and convenient, even if there isn’t a clear reason we shouldn’t use some other convex function of the error. And you’ll see we get pretty good results.

But I digress.

Now we can formally state the problem we are trying to solve.

**Least Squares Problem.** Given samples for , find parameters such that the penalty function

* *

*is as small as possible.*

This is a classic nonlinear least-squares problem and it can be solved numerically using the Gauss-Newton method. One day I may write more about these things — it is a beautiful algorithm and we’ll need to adjust it to work with the limited memory of an ATMEGA chip — but for today I will just offer you my code.

I use the following workflow to calibrate an ADXL335 using the Gauss-Newton method. I do things on Windows 7, but it is all implemented with Python and Octave, free tools that are available on many platforms. Small adjustments may be needed. (You can download Python here and download Octave here.)

- Pick a directory to store data and code. I’ll call this C:\your\calibration\directory\.
- Collect data using the pushbutton circuit described in the last post. Put the sensor in a variety of positions (the 6 positions you used for 6-point calibration would be a good start). Store the serial output in a text file. I’ll assume it is stored in C:\your\calibration\directory\calData.txt.
- Download the Python script octaveprep.py and place it in C:\your\calibration\directory\, then run it, passing the data file name as an argument. At the Windows command line (already in the right directory) I type
**C:\Python27\python.exe octaveprep.py calData.txt**
This will write a new file, an Octave script called calTest_octave.m. This script will be used to perform the real data analysis.

- Download the file gaussnewton.m and place it in C:\your\calibration\directory\. Open Octave and load the script by typing
**source "C:\\your\\calibration\\directory\\calData_octave.m"**
This will compute a 6-number array beta which holds all of our calibration parameters. The correspondence is as follows:

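X-axis zero-G offset = beta(1)

Y-axis zero-G offset = beta(2)

Z-axis zero-G offset = beta(3)

X-axis scale factor in G per count (the reciprocal of the sensitivity) = beta(4)

Y-axis scale factor in G per count = beta(5)

Z-axis scale factor in G per count = beta(6)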

When I run this workflow on the data I collected for the 6-point calibration in the last post I get the following:

    beta =
       5.1448e+002
       5.0231e+002
       5.1665e+002
       9.5485e-003
       9.4631e-003
       9.6936e-003

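which means that the zero-G offsets are about 514.5, 502.3, and 516.7 counts and the sensitivities (the reciprocals of the last three entries) are about 104.7, 105.7, and 103.2 counts per G. Sticking these parameters in the unit conversion loop I used to evaluate the 6-point calibration method in the last post, I get readings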

    X-up     0.9980144   0.0254738   0.0615826
    X-down  -0.9976190   0.0254738   0.0131148
    Y-up     0.0145204   1.0001717  -0.0062723
    Y-down  -0.0332220  -0.9965395   0.0131148
    Z-up    -0.0618675   0.0538631   0.9921641
    Z-down   0.0336174  -0.0123785  -1.0047088

We did not get a bunch of beautiful readings like “0 0 1”, “0 0 -1”, etc. because we still had sloppy placement going in. But if you compute the magnitudes of each of these vectors you will see that they range from 0.99555 to 1.00535. They are almost all within one half of one percent of the perfect 1G reading!
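
If you want to check those magnitudes yourself, here is a small sketch that does it. The array holds the six calibrated readings listed above; the function name is mine.

```cpp
#include <cmath>
#include <cstddef>

// Calibrated readings in Gs: X-up, X-down, Y-up, Y-down, Z-up, Z-down.
const double kCalibrated[6][3] = {
    { 0.9980144,  0.0254738,  0.0615826},
    {-0.9976190,  0.0254738,  0.0131148},
    { 0.0145204,  1.0001717, -0.0062723},
    {-0.0332220, -0.9965395,  0.0131148},
    {-0.0618675,  0.0538631,  0.9921641},
    { 0.0336174, -0.0123785, -1.0047088}};

// Largest distance of any vector's length from the ideal 1G.
inline double worstMagnitudeError(const double (*v)[3], std::size_t n) {
    double worst = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double mag = std::sqrt(v[i][0] * v[i][0] +
                                     v[i][1] * v[i][1] +
                                     v[i][2] * v[i][2]);
        const double err = std::fabs(mag - 1.0);
        if (err > worst) worst = err;
    }
    return worst;
}
```

The worst case is the Z-down reading, which lands just over half a percent away from 1G.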

If you have trouble getting these scripts to work for you please leave a comment and let me know.

Now we have a calibration that uses exactly the same measurements that our 6-point calibration system did but that has very accurate results. If we automated the workflow described above, we’d have an easy accurate calibration method.

This is starting to look pretty good, but we can still do better. For the last example I just used data collected from six basic positions: Z-up, Z-down, Y-up, Y-down, X-up, and X-down. We can get better results if we sample different directions, but which new directions should we add?

You can think of these positions we already sampled as the ones we would get if we stuck our accelerometer on each of the faces of a cube. We want to get new directions that are “as different as possible”. To do this, just chop off each of the corners of the cube to get 8 new faces and set the accelerometer on each of these new faces too.
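
The 14 directions are easy to write down: the six face normals of the cube plus the eight corner directions (±1, ±1, ±1) scaled to unit length. A minimal sketch that generates them, with names of my own choosing:

```cpp
#include <array>
#include <cmath>
#include <vector>

using Vec3 = std::array<double, 3>;

// Unit vectors pointing out of each face of a cube with its corners cut off.
inline std::vector<Vec3> truncatedCubeDirections() {
    std::vector<Vec3> dirs;
    // Six original faces: +/-X, +/-Y, +/-Z.
    for (int axis = 0; axis < 3; ++axis) {
        for (int sign = -1; sign <= 1; sign += 2) {
            Vec3 v = {0.0, 0.0, 0.0};
            v[axis] = static_cast<double>(sign);
            dirs.push_back(v);
        }
    }
    // Eight new faces from the cut corners: (+/-1, +/-1, +/-1) / sqrt(3).
    const double c = 1.0 / std::sqrt(3.0);
    for (int sx = -1; sx <= 1; sx += 2)
        for (int sy = -1; sy <= 1; sy += 2)
            for (int sz = -1; sz <= 1; sz += 2)
                dirs.push_back(Vec3{sx * c, sy * c, sz * c});
    return dirs;
}
```

These are target directions for "up" in the sensor's frame. Hitting them exactly does not matter, since the whole point of this method is that it tolerates sloppy placement.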

I don’t really use a truncated cube to take measurements. Actually I tape my accelerometer to a ruler and clamp it in a third hand from my soldering bench to position it securely.

After 30 minutes of fiddling, I got a stream of slightly noisy data from each of these 14 positions. I calibrate and get the following parameters:

    beta =
       5.1409e+002
       5.0231e+002
       5.1662e+002
       9.5632e-003
       9.4572e-003
       9.6777e-003

These are very close to the parameters we got from the 6-position data. In fact, on those 6 positions these parameters do not give better results (they can’t because we had chosen the best parameters for those 6 positions). But for the new positions and for positions scattered around the sphere we see significantly better accuracy. I’ll show you the details when we talk about error analysis later.

**Six-point Calibration Overall Grades**: Accuracy – **Great**, Easiness – **OK**

We could go through this process every time we get a new sensor, store the parameters in EEPROM, and carry on. But I want to be able to calibrate a device on the slopes or on the lift and I won’t be toting around my laptop and Octave installation. If you looked through some of that code you would see that we were dealing with some very large matrices, ones that would quickly overrun the memory on an Arduino. In the next post we’ll see how to overcome these problems and implement the Gauss-Newton method directly on the Arduino.

Now let’s look at some of the simplest methods we can use to calibrate an accelerometer. By “simple” here I mean simple all the way through: easy to wire the circuit, easy to code, easy to understand, easy to use. Later we’ll see that these don’t always come as a package: fairly sophisticated software can make the user’s job even simpler than anything we see here.

The simple methods I will describe here all follow the same flow: the user carefully places the accelerometer in position, flips a switch to tell it to start recording data, flips the switch back to tell it to stop, then moves the accelerometer into a different position to start again. The differences all come down to deciding how many measurements to take and what to do with the numbers we get. Because of that, all of these methods can be performed using the same simple circuit.

Let’s build that before we go on.

The data collection circuit is pretty simple: the ADXL335 is set up just like it was in the basic circuit example. The only difference is that I add a pushbutton switch. Holding it down tells the circuit to record data; releasing it tells it to stop. This lets us make sure we only record data when the accelerometer is properly positioned and still.

This is what the circuit looks like.

Of course we need some code to control the circuit. The sketch below will do the job:

    // these constants describe the pins. They won't change:
    const int xpin = A1; // x-axis of the accelerometer
    const int ypin = A2; // y-axis
    const int zpin = A3; // z-axis (only on 3-axis models)

    int switchPin = 2;
    int reportData = 0;
    int sampleDelay = 20; //number of milliseconds between readings
    //Button presses will break up data collection into phases.
    //This variable counts and identifies the phase.
    int collectionPhase = 0;

    void setup()
    {
      // initialize the serial communications:
      Serial.begin(9600);
      analogReference(EXTERNAL);
      pinMode(xpin, INPUT);
      pinMode(ypin, INPUT);
      pinMode(zpin, INPUT);
      pinMode(switchPin, INPUT);
    }

    void loop()
    {
      //Check if the button was pressed. Toggle data reporting if it was.
      int val;
      val = digitalRead(switchPin);

      //When the button is not pressed there is no connection to GND
      //so the value we read is HIGH
      if (val==HIGH)
      {
        if(reportData==1)
        {
          collectionPhase += 1;
          Serial.print("#Collection phase ");
          Serial.println(collectionPhase);
        }
        reportData = 0;
        delay(500);
      }
      else
      {
        //The value is LOW. This means the button is being pressed,
        // opening a connection to GND that pulls down the voltage on
        // the input pin.
        reportData = 1;
      }

      if(reportData>0)
      {
        //take sample and write values to the serial monitor
        Serial.print( analogRead(xpin));
        Serial.print("\t");
        delay(1);
        Serial.print( analogRead(ypin));
        Serial.print("\t");
        delay(1);
        Serial.print( analogRead(zpin));
        Serial.print("\n");
      }

      // delay before next reading:
      delay(sampleDelay);
    }

The code here is pretty simple. When the button is pressed, a connection is opened to GND that pulls the voltage on pin 2 down to LOW. If we see this happen, go into a “reporting data” state (reportData == 1). When we’re in the reporting state, we write the readings to the serial monitor. Keep doing this as long as the button is still pressed.

When the button isn’t pressed, pin 2 reads HIGH. This puts us in a “quiet state” (reportData==0). If this happens when we had been in a reporting state then we print a one-line comment telling us what data collection phase we just finished. These comments can be *very* helpful for understanding the data later.

Now you can take measurements. Build the circuit, upload the sketch, and open the serial monitor. At first you’ll see nothing, but push the button and hold it down a bit and watch the data flow. Move the accelerometer and try again.

Now we are ready to calibrate.

The simplest possible empirical calibration we can do will only use one measurement. In fact we’re only going to use one-third of a measurement. Here is what you do.

- Build the data collection circuit from above, upload the sketch, and open the serial monitor.
- Lay the accelerometer flat on a table.
- Push the button for a bit.

You should see a stream of consistent readings show up on the monitor. If they are bouncing around a lot, something is wrong and you won’t be able to calibrate. My readings look like this:

    511 521 619
    510 521 618
    511 521 618
    511 521 618
    511 521 618
    511 521 618
    511 521 619

I could average them, but I’ll just say x=511, y= 521, and z=618. Now I want to use this to calibrate. Recall that when we did the faith-based calibration based on datasheet specs, we needed two numbers:

- *zero-G.* This number tells us which voltage reading corresponds to zero G’s on an axis.
- *sensitivity.* This tells us how much the voltage changes per G on an axis.

Now if we only take one measurement, we can’t estimate two parameters. We’ll need to take one on faith and estimate the other.

Let’s assume that a reading of 512 corresponds to zero G and use this reading to estimate the sensitivity, or how much the voltage changes per G. The way we placed the device, all of the acceleration is on the z-axis, so we should be seeing 1 G of acceleration on the z pin. Thus the sensitivity is $618 - 512 = 106$ counts per G.

Recall that per the datasheet, we expected this sensitivity factor to be 102.3. This is significantly different.

Now when we get new readings we can convert them to G’s as follows. Let $a$ be the acceleration on an axis measured in Gs and let $r$ be the actual measurement we got off the pin. Then $a = (r - 512)/106$.

In code, we can now change our loop to output converted units:

    void loop()
    {
      int x = analogRead(xpin);

      //add a small delay between pin readings. I read that you should
      //do this but haven't tested the importance
      delay(1);

      int y = analogRead(ypin);

      //add a small delay between pin readings. I read that you should
      //do this but haven't tested the importance
      delay(1);

      int z = analogRead(zpin);

      //zero_G is the reading we expect from the sensor when it detects
      //no acceleration. Subtract this value from the sensor reading to
      //get a shifted sensor reading.
      float zero_G = 512.0;

      //scale is the number of units we expect the sensor reading to
      //change when the acceleration along an axis changes by 1G.
      //Divide the shifted sensor reading by scale to get acceleration in Gs.
      float scale = 106;

      Serial.print(((float)x - zero_G)/scale);
      Serial.print("\t");
      Serial.print(((float)y - zero_G)/scale);
      Serial.print("\t");
      Serial.print(((float)z - zero_G)/scale);
      Serial.print("\n");

      // delay before next reading:
      delay(sampleDelay);
    }

When I run this loop, I get readings like this:

    -0.04 0.08 1.00
    -0.03 0.09 1.00
    -0.04 0.08 1.00

It is better than what we had before on the z-axis, but not too great on the other axes. I’ll do a more thorough error analysis later, but for now it is clear that we should do something better.

**One-point Calibration Overall Grades**: Accuracy – **Bad**, Easiness – **OK**

It’s pretty clear why the one-point calibration didn’t work very well: We only took measurements in one direction on one axis. Eyeballing the data makes it clear that 0G doesn’t really give us a reading of 512 on our input pin, and it looks like different pins might have different centers.

So we are not trying to find just two numbers. We need to find the 0G mark and the sensitivity on each axis. Call them $m_x, m_y, m_z$ and $\delta_x, \delta_y, \delta_z$ (e.g. $m_z$ is the zero-G mark or the “middle” on the z-axis, $\delta_z$ is the sensitivity on the z-axis). That makes 6 numbers, so we should expect to need at least 6 measurements.

Let’s do the obvious thing: take right-side-up and upside-down measurements on each axis. Here I will start with the z-axis. Laying my circuit flat on the table, I got the reading

511 521 618

Flipping it upside down I get

518 501 413

The X and Y axis readings tell me I’m not laying this down perfectly flat, but I tried! We could futz back and forth until we get it just right, but in the next post we’ll see a better way to deal with positioning trouble. For now let’s accept it as “good enough” and see how good that is.

Our 1G reading for the Z axis was 618, and the -1G reading (upside down) was 413. Linear interpolation (aka common sense) tells us that 0G on the z-axis should be exactly in the middle:

$$m_z = \frac{618 + 413}{2} = 515.5$$

We can also estimate the sensitivity. The difference between our two measurements should be 2G, so just find the difference and divide by 2:

$$\delta_z = \frac{618 - 413}{2} = 102.5$$
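The midpoint and half-difference arithmetic is the same for every axis, so it can be wrapped in a small helper. This is only a sketch: the names `AxisCalibration`, `twoPointCalibrate`, and `toG` are made up for illustration and do not appear in the sketches in this post.

```cpp
// Two-point calibration for a single axis (illustrative helper).
struct AxisCalibration {
    float zeroG;  // reading expected at 0 G: midpoint of the two readings
    float scale;  // change in reading per 1 G: half the difference
};

// readingUp:   raw reading with the axis pointing up (+1 G)
// readingDown: raw reading with the axis pointing down (-1 G)
AxisCalibration twoPointCalibrate(int readingUp, int readingDown) {
    AxisCalibration cal;
    cal.zeroG = (readingUp + readingDown) / 2.0f;
    cal.scale = (readingUp - readingDown) / 2.0f;
    return cal;
}

// Convert a raw reading to Gs using a previously computed calibration.
float toG(int reading, const AxisCalibration &cal) {
    return (reading - cal.zeroG) / cal.scale;
}
```

With the z-axis readings above, `twoPointCalibrate(618, 413)` gives a zero-G mark of 515.5 and a scale of 102.5, and `toG(618, ...)` with that calibration returns 1.0.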

Continuing on the Y axis I get measurements

```
516 608 516 #Y-up
511 397 518 #Y-down
```

Giving us a zero-G mark

$$m_y = \frac{608 + 397}{2} = 502.5$$

and sensitivity

$$\delta_y = \frac{608 - 397}{2} = 105.5$$

Finally on the X axis we get measurements

```
619 505 523 #X-up
410 505 518 #X-down
```

Giving us a zero-G mark

$$m_x = \frac{619 + 410}{2} = 514.5$$

and sensitivity

$$\delta_x = \frac{619 - 410}{2} = 104.5$$

So the parameters are close, but still significantly different on each axis. To convert the measurements to Gs using these parameters, use a loop like this:

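```cpp
void loop() {
  int x = analogRead(xpin);
  delay(1);
  int y = analogRead(ypin);
  delay(1);
  int z = analogRead(zpin);

  // Zero-G marks: midpoint of the up and down readings on each axis.
  float zero_G_x = 514.5;  // (619 + 410) / 2
  float zero_G_y = 502.5;  // (608 + 397) / 2
  float zero_G_z = 515.5;  // (618 + 413) / 2

  // Sensitivities: half the difference between the up and down readings.
  float scale_x = 104.5;   // (619 - 410) / 2
  float scale_y = 105.5;   // (608 - 397) / 2
  float scale_z = 102.5;   // (618 - 413) / 2

  Serial.print(((float)x - zero_G_x) / scale_x);
  Serial.print("\t");
  Serial.print(((float)y - zero_G_y) / scale_y);
  Serial.print("\t");
  Serial.print(((float)z - zero_G_z) / scale_z);
  Serial.print("\n");

  // delay before next reading:
  delay(sampleDelay);
}
```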

Here are some sample readings I get from this calibration with the accelerometer in each of the six test positions:

```
X up:     1.02   0.02   0.00
X down:  -0.99   0.05  -0.02
Y up:     0.08   1.01  -0.09
Y down:  -0.02  -0.97   0.02
Z up:    -0.02   0.16   1.00
Z down:   0.08  -0.08  -0.99
```

This is OK, but not great. We get clear readings close to +/- 1G on the axis of interest and we get something “smaller” on the other axes. Obviously I was not laying the thing down straight, and this caused calibration errors. I also made no effort to account for noise in the input signals.
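One quick sanity check needs no extra equipment: a device at rest feels only gravity, so the length of the calibrated vector should be 1 G whatever the orientation, even if the board is not lying straight. A minimal sketch of that check follows; the helper name `magnitudeG` is made up for illustration.

```cpp
#include <cmath>

// Length of a calibrated acceleration vector, in Gs.
// For a device at rest this should be close to 1.0 in any orientation.
float magnitudeG(float ax, float ay, float az) {
    return std::sqrt(ax * ax + ay * ay + az * az);
}
```

For the “Z up” sample above, `magnitudeG(-0.02, 0.16, 1.00)` comes out at roughly 1.01, so that reading is about 1% too long.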

Both of these problems will be solved through software in the next post when we look at statistics, least-squares, and the Gauss-Newton method.

I’ll do a thorough error analysis of all of these methods later, but for now I’ll give six-point calibration these grades.

**Six-point Calibration Overall Grades**: Accuracy – **Bad**, Easiness – **OK**

- The data coming out were stable and repeatable.
- The conversion I used wasn’t right: the measured acceleration of gravity changed depending on the tilt of the accelerometer.

The problem is that I wasn’t really calibrating. I never took measurements from my device and compared them with known values. I just trusted the datasheet (which doesn’t promise anything precise, by the way). Even if the datasheet promised exact sensitivity, it doesn’t know how the sensor will be placed in the circuit. For example:

- Is the soldering dodgy on any of the pins?
- Is the sensor tilted in the device? (yep, it is off by a few degrees thanks to that dodgy soldering.)
- Is everything on the circuit exactly the same downstream from the sensor pins?

Because the sensor readings are stable and repeatable, real calibration gives much better results. It just takes a bit more work. In this series of posts I’ll survey a few calibration methods, starting with faith-based methods like reading datasheets, moving on to naïve but effective calibration, and finishing off with nonlinear least-squares approaches that require a bit more math.

Before jumping in to the specific techniques, I want to say a few words about how I evaluate them. I have two basic metrics:

- How **accurate** is the calibration?
- How **easy** is it to perform the calibration?

To evaluate accuracy, I produced a standard data set with the accelerometer held fixed in 14 different known positions (approximately corresponding to the faces of a truncated cube). I use methods I’ll describe in another post to determine what the “true” accelerometer reading should be at each of these positions then compare that to the reading I get from whatever calibration method I am testing.
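The comparison can be scored in several ways. As one illustrative possibility (an assumption here, not necessarily the metric used in the later error analysis), the root-mean-square length of the error vectors over all test positions gives a single number in Gs. The names `Vec3` and `rmsErrorG` are hypothetical.

```cpp
#include <cmath>
#include <cstddef>

// A calibrated reading or a reference value, in Gs.
struct Vec3 {
    float x, y, z;
};

// Root-mean-square length of (measured - reference) over `count` positions.
// Returns 0 when there are no positions to compare.
float rmsErrorG(const Vec3 *measured, const Vec3 *reference, std::size_t count) {
    if (count == 0) {
        return 0.0f;
    }
    float sumSq = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        float dx = measured[i].x - reference[i].x;
        float dy = measured[i].y - reference[i].y;
        float dz = measured[i].z - reference[i].z;
        sumSq += dx * dx + dy * dy + dz * dz;
    }
    return std::sqrt(sumSq / static_cast<float>(count));
}
```

Calling it with the 14 calibrated readings and the 14 reference vectors would return one error figure per calibration method, which makes the methods easy to rank against each other.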

If I have such a great calibration method and can get the “true” values out of my device, why do I bother with other techniques? Well, for one, I am curious about how good they are. But the real problem is that the best calibration method is not easy to perform: it involves 30 minutes of careful device placements, measurements, button pushing, and data recording. If it were possible to get an error on the order of 1% with less than 10 seconds of work, wouldn’t you consider making that trade-off?

That is why I also evaluate how easy each calibration method is to perform. When I talk about ease of calibration, *I am not talking about how easy it is to write the code*. I’m giving you all the code anyway. All I care about is this:

How easy is it to get a device calibrated after it has been programmed?

If calibration takes thirty minutes of futzing and special equipment, I’ll say the easiness is bad. If it just takes a button press and 10 seconds of moving the device around I’ll say it is pretty good. If it doesn’t require anything at all, then it is great.

Sure, you can automate a tedious and error-prone calibration process if the results are so much better it makes it worthwhile, but that is not much help for DIYers or prototyping. Also, for the skiing devices I’m building I want to have a way to recalibrate them in the field when I reconfigure them or even just because the temperature changed.

So every calibration method I describe will get two grades: one for accuracy and one for easiness. In the end, we will see that there are some methods that get good marks in both categories.