Backward compatibility considerations when migrating a data model

As you might have noticed, I am a fan of compatibility and interoperability.

Imagine you have a model for fill-in-the-blanks questions, where answers are computed by formulae specified by the question creator. Since the answer sometimes cannot be expressed as a finite decimal number (for example, the square root of 2), we allow a certain amount of error.

In the beginning, each answer spec came with a natural number, which specified how many digits after the decimal point (or comma, in some countries) should be preserved when comparing the standard answer with user input.

This model covers most scenarios for numerical computation, but it does not allow the question creator to make the answer exact. If the answer is always an integer, it is desirable to have the answer exact. Luckily, if the answer is not zero, we can set a canonical number, say 99, as the number of digits to preserve. Since the internal computation is done in double precision, a user who does not enter a string that parses to the same double will surely fail the question. (But the user can enter 1.0000000000000000000000000000000000000000000000000000001, which parses to 1.0. This problem is currently not under consideration.)
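The parsing behaviour mentioned in the parenthesis is easy to check. The snippet below is only an illustration in Python, whose float is a double; the variable names are mine and not part of the actual system.

```python
# The long input from the paragraph above: 1.0 padded with zeros and a trailing 1.
user_input = "1.0000000000000000000000000000000000000000000000000000001"

# float() rounds the decimal string to the nearest double. The trailing 1 is far
# below the precision of a double near 1.0, so the result is exactly 1.0.
parsed = float(user_input)
is_same_double = parsed == 1.0
```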

After some discussion, the team decided to change the model from ‘the number of digits to preserve’ to ‘the maximum absolute error allowed’. Say the maximum absolute error allowed is eps, the user inputs y, the answer is x, then the user passes if and only if fabs(x - y) <= eps.
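As a minimal sketch of that rule (the function name passes is hypothetical; x, y and eps are the symbols used above):

```python
def passes(x: float, y: float, eps: float) -> bool:
    """Return True if the user input y is within eps of the standard answer x."""
    return abs(x - y) <= eps


# eps = 0 asks for an exact match between the two doubles.
exact_ok = passes(3.0, 3.0, 0.0)
exact_fail = passes(3.0, 3.1, 0.0)
# A non-zero eps tolerates a truncated square root of 2.
approx_ok = passes(2 ** 0.5, 1.414, 0.001)
```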

Note that the new model is almost strictly more powerful than the old one. If the number of digits to preserve was d, the maximum absolute error allowed will be eps = 5 * pow(10, -d - 1). I know that there is a subtle difference when it comes to the inequality sign (< or <=), but it doesn’t really matter much. Moreover, the new model allows us to state explicitly that we want the answer to be exact: we now write 0 for the maximum absolute error allowed, which is super clear, at least clearer than the magic number 99 (digits to preserve).
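Here is a small sketch of that conversion with a worked example. The function name d_to_eps is hypothetical; only the formula comes from the text. With d = 2 the tolerance is about 0.005, half a unit in the second decimal place, so for the square root of 2 the input 1.41 is accepted and 1.42 is not.

```python
import math


def d_to_eps(d: int) -> float:
    """Convert 'digits to preserve' (d) into a maximum absolute error (eps)."""
    return 5 * 10.0 ** (-d - 1)


answer = math.sqrt(2)                      # 1.41421356...
eps = d_to_eps(2)                          # about 0.005
accepts_1_41 = abs(answer - 1.41) <= eps   # difference is about 0.0042
accepts_1_42 = abs(answer - 1.42) <= eps   # difference is about 0.0058
```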

Now how would you seamlessly upgrade your model? We have exposed Web APIs for storing, retrieving and modifying models in the database. Sadly, we already have some consumers (of the APIs) that use the old model, and one of them will not catch up with this change any time soon, perhaps not for half a year.

We put a compatibility workaround in place. When someone calls the API to save a question, we:

Check if the payload (form, JSON or XML, anything you like) has eps set; if so, the caller is current, and that parameter should be respected;
Otherwise, check if it has d set; if so, the caller is legacy, and we convert d to eps using the formula given above;
Otherwise, the caller is ignorant, and we ignore the request.
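The three checks above can be sketched as follows. This is not the real backend code: the payload is assumed to be already parsed into a dict, the keys "eps" and "d" are assumed field names, and resolve_eps_on_save is a hypothetical helper that returns None when the request should be ignored.

```python
from typing import Optional


def d_to_eps(d: int) -> float:
    """Formula from the text: eps = 5 * pow(10, -d - 1)."""
    return 5 * 10.0 ** (-d - 1)


def resolve_eps_on_save(payload: dict) -> Optional[float]:
    """Pick the tolerance to store for a save request, or None to ignore it."""
    if "eps" in payload:
        # Current caller: respect eps, even if a stray d is also present.
        return float(payload["eps"])
    if "d" in payload:
        # Legacy caller: translate digits to an absolute error.
        return d_to_eps(int(payload["d"]))
    # Neither field: the caller is ignorant, so the request is ignored.
    return None
```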

That sounds great, doesn’t it? Only, it failed in the first round of testing.

I wrote a utility that sends the new model, and it succeeded, making all subsequent calls to ‘getting a list of questions’ fail with Internal Server Error. Why is that?

Answer: The backend itself was not updated thoroughly. ‘Getting a list of questions’ doesn’t know the new model and fails.

Having identified the problem, I suddenly found several flaws in the workaround. But you might not, since it requires specific knowledge. The first thing to notice is that:

We forgot to return a legacy-formatted payload when ‘getting a list of questions’, which means the legacy client will go insane since it doesn’t see d.

It’s easy to resolve this issue: we only need to convert eps back to d. This is not always possible, but you can do your best with some rounding and special handling for 0.
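One possible shape for that reverse conversion is sketched below. Inverting eps = 5 * pow(10, -d - 1) gives d = -log10(2 * eps); the rounding to the nearest integer, the clamping to the range 0 to 99, and the mapping of 0 to the magic number 99 are my assumptions about what "do your best" could mean, and eps_to_d is a hypothetical name. The sketch assumes eps is never negative.

```python
import math

EXACT_DIGITS = 99  # the magic number the old model used for "exact"


def eps_to_d(eps: float) -> int:
    """Best-effort conversion from a maximum absolute error back to digits."""
    if eps == 0:
        # Special handling for 0: log10 is undefined, so use the magic number.
        return EXACT_DIGITS
    d = round(-math.log10(2 * eps))
    # d must be a natural number no larger than the magic number.
    return max(0, min(d, EXACT_DIGITS))
```

Any eps that is not of the form 5 * pow(10, -d - 1) loses information here; for instance 0.003 and 0.005 both map to d = 2.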

With the specific knowledge that the legacy client edits the returned payload and sends it back, you might find the second flaw.

That legacy client doesn’t strip the fields it doesn’t know from the payload, so it will never be able to edit the precision specification for an answer: the payload always contains eps, which overrides d even when d has been edited.

Solving this with the following logic might seem okay, but is it okay? Upon receiving an edit request,

If the payload contains d, convert it to eps, set eps and remove d;
If the payload does not contain eps, ignore the request;
Process the payload.
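The edit logic above can be sketched like this, again with assumed field names "eps" and "d" and a hypothetical helper normalize_edit_payload that returns None when the request should be ignored. Note that d now wins over eps, the opposite of the save logic.

```python
from typing import Optional


def d_to_eps(d: int) -> float:
    """Formula from the text: eps = 5 * pow(10, -d - 1)."""
    return 5 * 10.0 ** (-d - 1)


def normalize_edit_payload(payload: dict) -> Optional[dict]:
    """Return a copy of the payload that carries only eps, or None to ignore it."""
    result = dict(payload)  # do not mutate the caller's dict
    if "d" in result:
        # d takes priority: convert it, set eps and remove d.
        result["eps"] = d_to_eps(int(result.pop("d")))
    if "eps" not in result:
        return None
    return result
```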

Sadly, this is still wrong. If a question has an eps that cannot be converted to d exactly, editing it with the legacy client will corrupt it, because the rounded d is converted back into a different eps. But this is not that bad, since at least everyone is happy: nobody will get mad because it doesn’t recognise a question payload.
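To make the corruption concrete, here is a round trip under one assumed conversion back to d (round -log10(2 * eps) to the nearest natural number). The helper names and the sample value 0.003 are purely illustrative.

```python
import math


def d_to_eps(d: int) -> float:
    """Formula from the text: eps = 5 * pow(10, -d - 1)."""
    return 5 * 10.0 ** (-d - 1)


def eps_to_d(eps: float) -> int:
    """Assumed best-effort inverse; 0 maps to the magic number 99."""
    if eps == 0:
        return 99
    return max(0, min(round(-math.log10(2 * eps)), 99))


original_eps = 0.003                  # set by a current client
legacy_d = eps_to_d(original_eps)     # what the legacy client is shown
stored_eps = d_to_eps(legacy_d)       # what is stored after the legacy client saves
corrupted = not math.isclose(stored_eps, original_eps)
```

In this run the legacy client sees d = 2, and saving the question without touching the precision turns the tolerance 0.003 into roughly 0.005.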