winterkoninkje | Eng: On Sanity (Reply)

So here's another crazy idea I have for Eng. When writing code, particularly at a low level, there's frequently call for doing basic sanity checks. That is, you often see code like this if ($foo == nil) { omg die! } elsif ($foo->bar()) { blah blah; } scattered liberally around everywhere. And especially with method calls when you got your object from some potentially unreliable source you see if ($foo and $foo->bar()) all over. Wouldn't it be nice if we could abstract that away?

I've had a few different ideas in the general area of different semantics of "return values" — for example you have different senses of "return value" when you're talking about the evaluation of a function/method/operator, the error code for failed evaluation, and returning "this" object so you can chain method calls together — but I recently had an idea that might be easier to implement in a similar field: we could do sanity checking with a special postfix operator ?. You would use it like $foo?->bar() and it would have short-circuit semantics so that if $foo passes the sanity check it behaves like a no-op, but if $foo fails sanity checking then it will skip evaluating bar() because trying to evaluate it will probably explode, what with $foo not being sane and all. In case there are different types of unsanity it would also be helpful to have some special variable ( $? perhaps) that stores the type of unsanity, or have perhaps some other way to deal with unsanity handling.

The challenge is where exactly to short-circuit to. If we have a statement that's a chain of method or field calls like $foo->bar()->baz->bif()->zorch()->quux; then it makes sense, wherever the ? is, to skip everything from there to the end of the chain/statement. But in other circumstances things get more complicated.

For instance, what do we do if it's in a conditional statement? If it's just a basic conditional statement like if ($foo?) or if ($foo?->bar) then it would make sense to have the whole statement return false. But if it's in a compound conditional statement like if ($foo?->bar() or $zorch->bazzle()) then it would make sense to skip the call to bar(), fail the first subexpression, and so evaluate the second. We could say that in boolean contexts the expression returns false, but that is contingent on what we mean by "expression" here.

Another example of complexity is how it interacts with non-method calls, such as arithmetic expressions. If we have $foo?->numeric() + 5 and $foo isn't sane, then what should the expression return? Well maybe the whole greater expression should fail (i.e. be skipped), that sounds sensible. Now what happens in non-algebraic expressions, like assignment? Should $foo = $bar?->baz() skip the assignment too, or should it set $foo to a known-bad value like nil? In case that question seems too straight forward, compare it against foo($bar?->baz(), $bif); should we call foo() with a known-bad value, or should we short-circuit out of ever calling foo()? Also, since ? is intended to simplify code, we must expect that callers won't necessarily test $? unless they really care about why ? failed.

A brief digression into implementation details. When calling ? we obviously need to pass it the thing we're testing. But at an assembly level, in order to know where to short-circuit to we also need to pass a label to jump back to. If during the course of evaluating sanity we determine the object's not sane, we can't just do a normal return because that wouldn't give us the short-circuit semantics we desire. The easiest way to get our short-circuiting would be to overwrite the memory location that stores the return address with the label we wish to jump to, and then when we return we'll return to somewhere else. In the event that some architectures disallow overwriting the return address, we'll have to manually pop a stack frame and do an unconditional jump, or use an inline function instead of a real function call, or devise some other method. If we allow operator overloading, overloading ? will have to be treated specially since we're not using a normal return.

Back to semantics. So far, I can identify six contexts that to be specified: method chains, boolean expressions, arithmetic expressions, string expressions, function calls, and assignment. And two general approaches: returning nil/false/zero/lambda or skipping evaluation. Since the whole point of this sanity checking operator is to avoid Doing Bad Things(tm) I'm thinking that we should err on the side of skipping evaluation, but there are certain places where just jumping to the next statement is jumping further than we may like. Inside conditional expressions — particularly disjunctions — is the big one, but also boolean expressions used as shorthand for control flow (like Perl's famous open($fh, ">", $file) or die "omg!";). Perhaps when the ? is buried in a boolean context it will skip the rest of evaluation for that whole subexpression and return false to the boolean operator, but in all other situations it just skips to the next statement. That sounds like a sensible enough place to start for Doing The Right Thing.

Eng: On Sanity

Post a comment in response: