It's a well-known fact that many algorithms in NLP are the same algorithm, just parameterized by a different semiring. For instance, the Viterbi algorithm for getting the probability of the best path in an HMM uses the (max,*) semiring (or, over log-probabilities, the tropical (max,+) semiring). The forward part of the forward-backward algorithm is the same algorithm, just using the (+,*) semiring. (And sometimes you want the backward part of Viterbi too.)
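To make the parameterization concrete, here's a minimal Haskell sketch (the Semiring class and the step function are illustrative, not from any particular library): the same chart update gives you the forward algorithm with the (+,*) instance and Viterbi with the (max,*) instance.

```haskell
import Data.List (foldl')

-- A tiny semiring class, just enough for the point being made.
class Semiring r where
  zero, one    :: r
  (<+>), (<.>) :: r -> r -> r   -- semiring "addition" and "multiplication"

-- Forward: ordinary (+,*) on probabilities.
newtype Prob = Prob Double deriving Show
instance Semiring Prob where
  zero = Prob 0
  one  = Prob 1
  Prob a <+> Prob b = Prob (a + b)
  Prob a <.> Prob b = Prob (a * b)

-- Viterbi: (max,*) on probabilities (or (max,+) on log-probabilities).
newtype Best = Best Double deriving Show
instance Semiring Best where
  zero = Best 0
  one  = Best 1
  Best a <+> Best b = Best (max a b)
  Best a <.> Best b = Best (a * b)

-- One generic chart update, shared by both algorithms: semiring-sum over
-- predecessor states of chart[prev] * trans(prev,cur) * emit(cur,obs).
step :: Semiring r
     => [s]            -- states
     -> (s -> s -> r)  -- transition weight
     -> (s -> o -> r)  -- emission weight
     -> o              -- current observation
     -> (s -> r)       -- previous chart column
     -> s -> r         -- new chart column
step states trans emit obs prev cur =
  foldl' (<+>) zero
    [ prev p <.> trans p cur <.> emit cur obs | p <- states ]
```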
Today's moment of duh: The usual implementation of Viterbi, with a sparse map, is just the Max semiring coalesced to the MaxPriority semiring, i.e. where we map Max minBound
to Nothing
in order to remove the extra bottom in the lattice. And the Viterbi algorithm with backpointers so that you can extract the best path is just the Arg Max
semiring (or Args Max
if you want to keep ties).
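Here's a sketch of both constructions in Haskell, using Data.Semigroup's Max and Arg (the type names Score, Column, Scored and the helpers are illustrative, not anyone's library): a sparse Map column where an absent key is exactly the Nothing that replaces Max minBound, and an Arg payload so that taking the max also carries the backpointer.

```haskell
import qualified Data.Map as M
import Data.Semigroup (Arg(..), Max(..))

-- "MaxPriority" as Maybe (Max a): Nothing plays the role of the semiring
-- zero that Max minBound would otherwise have to fake.  The stock
-- Semigroup instance for Maybe already treats Nothing as the identity.
type Score = Maybe (Max Double)

-- A sparse chart column: a state that is absent simply has no path yet.
type Column s = M.Map s Score

-- Collapse candidate scores per state; duplicate states combine by max.
column :: Ord s => [(s, Score)] -> Column s
column = M.fromListWith (<>)

-- Backpointers: Arg's Ord instance compares only the score, so taking the
-- max also keeps the predecessor state.  (Keeping *all* tied predecessors,
-- the "Args Max" variant, would store a list or set in the second field.)
type Scored s = Maybe (Max (Arg Double s))

candidate :: Double -> s -> Scored s
candidate score prev = Just (Max (Arg score prev))
```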