
So here's another crazy idea I have for Eng. When writing code, particularly at a low level, there's frequently call for doing basic sanity checks. That is, you often see code like if ($foo == nil) { omg die! } elsif ($foo->bar()) { blah blah; } scattered liberally around everywhere. And especially with method calls, when you got your object from some potentially unreliable source, you see if ($foo and $foo->bar()) all over. Wouldn't it be nice if we could abstract that away?

I've had a few different ideas in the general area of different semantics of "return values". For example, you have different senses of "return value" when you're talking about the evaluation of a function/method/operator, the error code for failed evaluation, and returning "this" object so you can chain method calls together. But I recently had an idea in a similar field that might be easier to implement: we could do sanity checking with a special postfix operator ?. You would use it like $foo?->bar() and it would have short-circuit semantics: if $foo passes the sanity check, the ? behaves like a no-op, but if $foo fails the sanity check then it will skip evaluating bar(), because trying to evaluate it will probably explode, what with $foo not being sane and all. In case there are different types of unsanity it would also be helpful to have some special variable ($? perhaps) that stores the type of unsanity, or perhaps some other way to handle unsanity.
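To make the intended behaviour concrete, here is a rough Python sketch of the operator as a wrapper object. Everything in it is an assumption made for illustration: the names sanity, Checked and call are hypothetical, "sane" is taken to mean "not None", and the reason attribute plays the role of $?. It also re-checks after every link of a chain, which is only one possible reading of where the ? sits.

```python
def sanity(value):
    """Assumed sanity check: return None when sane, else a reason string."""
    if value is None:
        return "nil"
    return None


class Checked:
    """Hypothetical model of `$foo?`: a value plus the reason it failed, if any."""

    def __init__(self, value):
        self.value = value
        self.reason = sanity(value)  # plays the role of the post's `$?`

    def call(self, method, *args):
        """Model `?->method(args)`: skip the call when the receiver is unsane."""
        if self.reason is not None:
            return self  # short-circuit: the method is never looked up or run
        return Checked(getattr(self.value, method)(*args))


class Foo:
    def bar(self):
        return "bar result"


ok = Checked(Foo()).call("bar")
bad = Checked(None).call("bar").call("baz")  # neither bar nor baz runs

print(ok.value, ok.reason)    # bar result None
print(bad.value, bad.reason)  # None nil
```

Because an unsane Checked returns itself from call, the first failure's reason survives to the end of the chain, which is roughly what a caller inspecting $? would want.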

The challenge is where exactly to short-circuit to. If we have a statement that's a chain of method or field calls like $foo->bar()->baz->bif()->zorch()->quux; then it makes sense, wherever the ? is, to skip everything from there to the end of the chain/statement. But in other circumstances things get more complicated.

For instance, what do we do if it's in a conditional statement? If it's just a basic conditional statement like if ($foo?) or if ($foo?->bar) then it would make sense to have the whole statement return false. But if it's in a compound conditional statement like if ($foo?->bar() or $zorch->bazzle()) then it would make sense to skip the call to bar(), fail the first subexpression, and so evaluate the second. We could say that in boolean contexts the expression returns false, but that is contingent on what we mean by "expression" here.
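The compound-conditional case can be sketched in Python as follows. This is only an illustration under assumptions: checked_call is a hypothetical helper standing in for $foo?->bar() in a boolean context, "sane" means "not None", and Zorch is a made-up class that counts how often bazzle() runs.

```python
def checked_call(obj, method, *args):
    """Hypothetical `$obj?->method(args)` in boolean context."""
    if obj is None:  # assumed sanity check
        return False  # the failed subexpression counts as false
    return getattr(obj, method)(*args)


class Zorch:
    def __init__(self):
        self.calls = 0

    def bazzle(self):
        self.calls += 1
        return True


foo = None
zorch = Zorch()

# Models: if ($foo?->bar() or $zorch->bazzle())
if checked_call(foo, "bar") or zorch.bazzle():
    outcome = "took the branch"
else:
    outcome = "skipped the branch"

print(outcome, zorch.calls)  # took the branch 1
```

Here bar() is never called, the first operand is false, and the disjunction goes on to evaluate the second operand as usual.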

Another example of complexity is how it interacts with non-method calls, such as arithmetic expressions. If we have $foo?->numeric() + 5 and $foo isn't sane, then what should the expression return? Well, maybe the whole greater expression should fail (i.e. be skipped); that sounds sensible. Now what happens in non-algebraic expressions, like assignment? Should $foo = $bar?->baz() skip the assignment too, or should it set $foo to a known-bad value like nil? In case that question seems too straightforward, compare it against foo($bar?->baz(), $bif); should we call foo() with a known-bad value, or should we short-circuit out of ever calling foo()? Also, since ? is intended to simplify code, we must expect that callers won't necessarily test $? unless they really care about why ? failed.
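The two candidate behaviours for foo($bar?->baz(), $bif) can be put side by side in a Python sketch. The names Unsane, check and baz_or_none are hypothetical, "sane" is assumed to mean "not None", and an exception is used here only as a convenient way to imitate "skip the rest of the statement".

```python
class Unsane(Exception):
    """Raised by the hypothetical `?` check; the message is the reason."""


def check(value):
    """Stand-in for postfix `?` under the skip policy."""
    if value is None:  # assumed sanity check
        raise Unsane("nil")
    return value


def baz_or_none(obj):
    """Stand-in for `$obj?->baz()` under the known-bad-value policy."""
    return obj.baz() if obj is not None else None


calls = []


def foo(a, b):
    calls.append((a, b))
    return "called"


bar, bif = None, 2

# Policy 1: substitute a known-bad value, so foo() still runs.
result_known_bad = foo(baz_or_none(bar), bif)

# Policy 2: skip the whole statement, so foo() never runs and
# the assignment target keeps its old value.
result_skip = "unchanged"
try:
    result_skip = foo(check(bar).baz(), bif)
except Unsane:
    pass

print(calls)        # [(None, 2)]
print(result_skip)  # unchanged
```

Under the first policy foo() has to cope with None itself; under the second the caller silently loses both the call and the assignment.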

A brief digression into implementation details. When calling ? we obviously need to pass it the thing we're testing. But at an assembly level, in order to know where to short-circuit to we also need to pass a label to jump back to. If during the course of evaluating sanity we determine the object's not sane, we can't just do a normal return because that wouldn't give us the short-circuit semantics we desire. The easiest way to get our short-circuiting would be to overwrite the memory location that stores the return address with the label we wish to jump to, and then when we return we'll return to somewhere else. In the event that some architectures disallow overwriting the return address, we'll have to manually pop a stack frame and do an unconditional jump, or use an inline function instead of a real function call, or devise some other method. If we allow operator overloading, overloading ? will have to be treated specially since we're not using a normal return.

Back to semantics. So far, I can identify six contexts that need to be specified: method chains, boolean expressions, arithmetic expressions, string expressions, function calls, and assignment. And two general approaches: returning nil/false/zero/lambda or skipping evaluation. Since the whole point of this sanity checking operator is to avoid Doing Bad Things(tm) I'm thinking that we should err on the side of skipping evaluation, but there are certain places where just jumping to the next statement is jumping further than we may like. Inside conditional expressions (particularly disjunctions) is the big one, but also boolean expressions used as shorthand for control flow (like Perl's famous open($fh, ">", $file) or die "omg!";). Perhaps when the ? is buried in a boolean context it will skip the rest of evaluation for that whole subexpression and return false to the boolean operator, but in all other situations it just skips to the next statement. That sounds like a sensible enough place to start for Doing The Right Thing.
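As a minimal sketch of that starting rule, the Python below treats a failed check as false inside a boolean context and as "abandon this statement" everywhere else. The helpers q, in_bool and statement are hypothetical names, the lambdas stand in for expressions the compiler would delimit itself, and "sane" is again assumed to mean "not None".

```python
class Unsane(Exception):
    """Raised by the hypothetical `?` check; the message is the reason."""


def q(value):
    """Stand-in for postfix `?`: pass sane values through, bail out otherwise."""
    if value is None:  # assumed sanity check
        raise Unsane("nil")
    return value


def in_bool(thunk):
    """Boolean context: an unsane subexpression evaluates to False."""
    try:
        return bool(thunk())
    except Unsane:
        return False


def statement(thunk):
    """Any other context: an unsane check skips the whole statement."""
    try:
        thunk()
    except Unsane:
        pass


log = []
foo = None

# Arithmetic inside a statement: skipped entirely, nothing is logged.
statement(lambda: log.append(q(foo).numeric() + 5))

# Boolean shorthand for control flow, like `... or die "omg!"`.
in_bool(lambda: q(foo).bar()) or log.append("omg!")

# A sane value behaves as if the `?` were not there.
statement(lambda: log.append(q(3) + 5))

print(log)  # ['omg!', 8]
```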

Date: 2006-07-17 03:53 am (UTC) From: konomaigo.livejournal.com
Isn't this, to some degree, what the try/catch syntax is for in C? Or at least what this is trying to function as?

(disclaimer: I know very little about the syntax, having not used it. I understand only the basics of it, but from what I know it's slightly more sophisticated than if/then. It may be too sophisticated for what you want, or it may have other problems I'm not aware of.)

Date: 2006-07-17 08:40 pm (UTC) From: winterkoninkje.livejournal.com
Sort of. (That's C++ and Java, btw. C lacks it.) The way try-catch works is similar to interrupt handling (don't know if you've dealt with that either); basically a function can opt to "throw an exception" (i.e. error) instead of returning normally, and then if the call to that function is wrapped in a try block the exception will be handed off to an exception handler. Sic:


sub foo($x) throws omg { if ($x) { return true; } else { throw new omg(); } }
sub bar($y) { try { foo($y) ; } catch (omg) { die "omg!"; } }


If bar() didn't have the try-catch, then an exception thrown by the call to foo() would propagate out of bar() as though bar() itself had thrown it, and so you'd keep unwinding the stack until you found some enclosing try-catch environment. IIRC, it's done more efficiently at the assembly level, but that's what the semantics are.
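For a runnable version of the same idea, here is a Python rendering of foo() and bar(), plus an unguarded variant to show the unwinding. The names bar_unguarded and outer are additions for illustration, and the handlers return a string instead of dying so that the result can be inspected.

```python
class Omg(Exception):
    pass


def foo(x):
    if x:
        return True
    raise Omg()


def bar(y):
    try:
        return foo(y)
    except Omg:
        return "omg!"  # the original dies here; returning keeps the demo alive


def bar_unguarded(y):
    return foo(y)  # no try block: Omg passes straight through this frame


def outer(y):
    try:
        return bar_unguarded(y)
    except Omg:
        return "caught in outer"


print(bar(1))    # True
print(bar(0))    # omg!
print(outer(0))  # caught in outer
```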

You can have multiple catch blocks per try block for catching different types of exceptions, and in Java you need to declare what sorts of exceptions can be thrown. In Perl there's no throw-try-catch because the eval() function, paired with die(), serves the same general need in a somewhat different fashion, though `man perlfunc` discusses that more.

Exceptional flow of control can be useful, but that's not quite what I'm going for here. Exceptions are expensive (though not as expensive as interrupts since there's no trapping into the kernel) and so should only be used when they're needed (though Java ignores this and uses them everywhere, e.g. in lieu of returning null pointers).

The sanity checking is more for convenience of cleaning up code and making it legible, though I suppose one could look at it as an exception on the expression level (instead of the function level) which is automatically caught (and generally discarded). In some ways it's similar to Perl's `||` operator when used for things like: $x ||= $default; as a shorthand for $x = $x ? $x : $default;, insofar as it reduces redundancy from doing things like $x != null && $x->foo();
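The two shorthands have close Python analogues, sketched below with hypothetical helper names (with_default, guarded_foo, Thing). One thing worth noticing is that the ||= style tests truthiness rather than nil-ness, so values like 0 also get replaced by the default.

```python
def with_default(x, default):
    """Analogue of `$x ||= $default`: falls back on any false value."""
    return x or default


def guarded_foo(x):
    """Analogue of `$x != null && $x->foo()`."""
    return x is not None and x.foo()


class Thing:
    def foo(self):
        return "foo ran"


print(with_default(None, 5))  # 5
print(with_default(0, 5))     # 5
print(with_default(7, 5))     # 7
print(guarded_foo(None))      # False
print(guarded_foo(Thing()))   # foo ran
```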
