Short summary: instead of relying on initialization to a known value, ignore the initial values of the array and rely instead on a consistency constraint (data may be random, but won't be random and satisfy the constraint) that you can set and check. A neat trick you can probably use in other places as well.
Anyone know the name of this algorithm? There is a similarly solved classic problem in complexity theory: Given an O(m) list of O(n)-bounded integers, determine if there are duplicates in O(m) time and O(n) space.
I don't know if it has a name (it seems to simply be referred to as "an efficient sparse set representation"), but I think I initially remember seeing it introduced in "Compilers: Principles, Techniques, and Tools" in an exercise at some point in the past.
This seems trivially incorrect. If both arrays are randomized at the outset, there is nothing to prevent them from containing valid data indicating a state having objects marked. In fact, if you free one set of arrays used by this structure and immediately allocate another, there's a good chance that your "random" initial state will, in fact, be the exact one you just freed.
I'm often left confused by people who downvote me without leaving a correction or counter-argument, so I guess I ought to explain why I've downvoted you here.
Who is the "you" in your sentence? The author of the comment to which you reply? They never mention the counter. The author of the original article? You're not replying to them.
And the counter isn't mentioned in the original prose, but it's clearly mentioned in the description of the algorithm. Surely that's reasonable. It's a single variable, hence constant to allocate in both space and time.
So your comment seems to make an unreasonable criticism that doesn't add value, and that's why I've downvoted it.
I was referring to jaspervdj, who is both the author of the comment to which I reply and the author of the original article (well, I presume).
The article states: "an element in array can, by chance, point to an element in marks, but this won’t matter since the element in marks won’t point back, and so we can determine it’s fake." This sentence is missing the part where the counter is made use of, and as such is erroneous. The mentioned element can point back, and its index must be compared with the counter to ascertain whether it is valid.
Although that is a small omission, it is in no way trivial, and probably the source of all the confused comments the posting received.
OK, I missed the coincidence (by which I mean they are identical, not that it's by chance) or the names of the authors. That clears up my confusion on that point.
Regarding the lack of mention of the counter in the section of the description you mention, I thought it was obvious. In the section before he says: we keep the number of marks set. That means, to me, that if the element in array accidentally/incidentally points to an element in marks that accidentally/incidentally points back, it's not counted as pointing back because it's known (by using the counter) not to be valid.
It is subtle, I agree, but for me the explanation is clear enough already. Adding more detail could simply overwhelm the reader with words, without letting the reader see the wood for the trees.
I agree, now that you've expanded it, that your point is valid, but I, personally, prefer the version as it is.
In this I expect we should simply agree to disagree. One person's "necessary detail" is another's "pointless nit-picking."
just to chime in, jaspervdj is the commenter, the submitter, the website is calles jaspervdj.be so it is reasonable to guess he is the author (he is btw).