## Friday, 22 May 2015

### Object.equals, primitive '==', and Arrays.equals ain't equal

It is a fact well known to those who know it well that "==" != "equals()" the example usually going something like:
String a = "Tom";
String b = new String(a);
-> a != b but a.equals(b)

It also seems reasonable therefore that:
String[] arr1 = {a};
String[] arr2 = {b};
-> arr1 != arr2 but Arrays.equals(arr1, arr2)

So far, so happy... That's examples for you...
For primitives we don't have the equals method, but we can try boxing:
int i = 0;
int j = 0;
-> i == j and also ((Integer)i).equals((Integer)j), and Arrays.equals({i}, {j})

### Floats ruin everything

But... some primitives are more equal then others. Consider for instance the following example:
float f1 = Float.Nan;
float f2 = Float.Nan;
-> f1 != f2, and also f1 != f1, Nan's are nasty

This is what floats are like children, they be treacherous little hobbitses and no mistake. The fun starts when you box them fuckers:
-> ((Float)f1).equals((Float)f2), because Java likes to fix shit up
-> Arrays.equals({((Float)f1)},{((Float)f2)}), hmm consistent
-> but also: Arrays.equals({f1},{f2})...

This is counter to how one would normally think arrays are compared. You would think that for primitives (skipping arrays null and length checks):
boolean feq = true;
for(int i=0;i<farr1.length;i++){
if(farr1[i] != farr2[i]) {
feq = false; break;
}
}
Is the same as:
boolean feq = Arrays.equals(farr1,farr2);
But for double[] and float[] the contract has been changed to accommodate Nan. This is perhaps a good decision on the JDK authors side, but it is somewhat surprising.
Conclusion: 2 arrays can be equal, even if some elements are not equal.

### Objects are weird

Let's consider objects for a second. We started with a != b does not predicate that !a.equals(b), but what about a == b?
The Object.equals() javadoc offers the following wisdom:
The equals method implements an equivalence relation on non-null object references:

• It is reflexive: for any non-null reference value x, x.equals(x) should return true.
• It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
• It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
• It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
• For any non-null reference value x, x.equals(null) should return false.
You know what the most important word in the above lines is? SHOULD
In the wonderful land of Should, all classes behave and if they bother implementing an equals method they follow the above rules (and also override the hashcode method while they are there).
But what happens when an object is implement perversely, like this:
public class Null {
@Override
public boolean equals(Object obj) {
return obj == null;
}
}

This has some nice implications:
Null n = new Null();
-> n != null, but n.equals(null)
Object[] a = {n};
Object[] b = {null};
Object[] c = {n};
-> Arrays.equals(a, b) is true, because n.equals(null)
-> Arrays.equals(b, a) is false, because null != n
-> Arrays.equals(a, a) is true, because a == a
-> Arrays.equals(a, c) is false, because !n.equals(n)

Or to quote The Dude: "Fuck it man, let's go bowling"

## Tuesday, 19 May 2015

### Degrees Of (Lock/Wait) Freedom

 Yo, Check the diagonal, three brothers gone...
I've been throwing around the terms lock-free and wait-free in the context of the queues I've been writing, perhaps too casually. The definition I was using was the one from D. Vyukov's website (direct quote below):
• Wait-freedom:  Wait-freedom means that each thread moves forward regardless of external factors like contention from other threads, other thread blocking. Each operations is executed in a bounded number of steps.
• Lock-freedom: Lock-freedom means that a system as a whole moves forward regardless of anything. Forward progress for each individual thread is not guaranteed (that is, individual threads can starve). It's a weaker guarantee than wait-freedom. [...] A blocked/interrupted/terminated thread can not prevent forward progress of other threads. Consequently, the system as a whole undoubtedly makes forward progress.
• Obstruction-freedom: Obstruction-freedom guarantee means that a thread makes forward progress only if it does not encounter contention from other threads. That is, two threads can prevent each other's progress and lead to a livelock.
• Blocking Algorithms: It's the weakest guarantee - basically all bets are off, the system as a whole may not make any forward progress. A blocked/interrupted/terminated thread may prevent system-wide forward progress infinitely.
The above definitions refer to "forward progress/moving forward" in the context of "system" and "thread". In particular they translate "X-freedom" to "a guarantee that T/S makes forward progress".
Now "thread" is a well defined term, but what counts as forward progress of a given thread?
To my mind a thread which is free to return from a method is free to 'make progress'. For example: if the next element in a queue is not visible to a consumer thread I can return control to the consumer thread which is then free to make progress on anything it feels like and try getting an element out of the queue later. Similarly a producer thread which is unable to add elements to a queue is 'free' to 'make progress' even if the queue is full. In short, I interpreted 'freedom' as regaining control of execution.

### Freedom? yeah, right...

My interpretation however is limited to the scope of the thread and assumes it has other things to do (so placing the thread within the context of a given 'system'). So the freedom to make progress is really a freedom to make progress on 'other things'. The definitions above when applied to a given data structure have no knowledge of the 'system' and it is therefore fair to assume nothing. And so if the 'system' is viewed as being concerned only with using the data structure, it seems my view of 'regained control freedom' is not inline with the 'progress making freedom'.
Let's consider for example the linked Multi-Producer-Single-Consumer queue discussed in the previous post (this is the not the j.u.Queue compliant version):

Now let us assume a producer has stalled in that unpleasant gap between setting the head and pointing prev to the new head(at line 34), this is what I've come to think of as a 'bubble' in the queue. The next producer will see the new head, but the consumer will not see it (or any nodes after it) until prev.next is set.
Others producers can 'make progress' in the sense that elements will get added to the queue. Those elements visibility to the consumer however is blocked by the 'bubble'. What is the correct degree of freedom for these producers? what about the consumer?
When describing the queue Vyukov makes the following statements:
• Wait-free and fast producers. One XCHG is maximum what one can get with multi-producer non-distributed queue.
• Push [offer in the Java version] function is blocking wrt consumer. I.e. if producer blocked in (* [line 34]), then consumer is blocked too. Fortunately 'window of inconsistency' is extremely small - producer must be blocked exactly in (* [line 34]).
So producers are wait-free, but consumer is blocking. The consumer is truly blocked by the 'bubble' and can make no progress in the terms of the data structure. Despite other producers adding nodes the consumer cannot see those nodes and is prevented from consuming the data. The fact that control is returned to the caller is considered irrelevant. And if we are being precise we should call this a blocking queue.

### Life imitates Art

The "Art of Multiprocessor Programming" offers a different definition:
"A concurrent object implementation is wait free if each method call completes in a finite number of steps. A method is lock-free if it guarantees that infinitely often some method call finishes in a finite number of steps" - page 99
This sentiment is similarly echoed in "Wait-Free Queues With Multiple Enqueuers and Dequeuers" a paper on lock free queues:

• [...] to ensure that a process (or a thread) completes its operations in a bounded number of steps, regardless of what other processes (or threads) are doing. This property is known in the literature as (bounded) wait-freedom.
• lock-freedom ensures that among all processes accessing a queue, at least one will succeed to finish its operation.

Hmmm... no system, no progress. The boundaries are now an object and a method, which are well defined. I think this definition matches my understanding of 'regained control freedom' (open to feedback). If a method returns, progress or no progress, the condition is fulfilled. Under this definition the queue above is wait-free.
The wiki definition is again closer to the one outlined by Vyukov:
• An algorithm is called non-blocking if failure or suspension of any thread cannot cause failure or suspension of another thread for some operations. A non-blocking algorithm is lock-free if there is guaranteed system-wide progress, and wait-free if there is also guaranteed per-thread progress.
• Wait-freedom is the strongest non-blocking guarantee of progress, combining guaranteed system-wide throughput with starvation-freedom. An algorithm is wait-free if every operation has a bound on the number of steps the algorithm will take before the operation completes.
• Lock-freedom allows individual threads to starve but guarantees system-wide throughput. An algorithm is lock-free if it satisfies that when the program threads are run sufficiently long at least one of the threads makes progress (for some sensible definition of progress). All wait-free algorithms are lock-free.
We are back to a fuzzy definition of system, progress, but the underlined sentence above is interesting in highlighting suspension. In particular I'm not sure how this is to be interpreted in the the context of the lock-free definition where suspension is not an issue if some threads can keep going.
Note that once we have fixed poll to behave in a j.u.Queue compliant way: The poll method is no longer wait-free by any definition.

### What about the JCTools queues?

JCTools covers the full range of MPMC/MPSC/SPMC/SPSC range of queues, and aims for high performance rather than strict adherence to any of the definitions above. To be more precise in the definitions I would say:
• SpscArrayQueue/SpscLinkedQueue are Wait Free (on both control and progress senses)
• MpscArrayQueue is lock free on the producer side and blocking on the consumer side. I'm planning to add a weakPoll method which will be wait free in the control sense.
• MpscLinkedQueue is wait free on the producer side and blocking on the consumer side. I'm planning to add a weakPoll method which will be wait free in the control sense.
• SpmcArrayQueue is lock free on the consumer side and blocking on the producer side. I'm planning to add a weakOffer method which will be wait free in the control sense.
• MpmcArrayQueue is blocking on the both producer and consumer side. I'm planning to add a weakOffer/Poll method which will be lock free in the control sense.
Does it matter that the queues do not meet the exact definitions set out below? As always it depends on your needs...

### Non-Blocking Algorithms On The JVM?

{Full disclosure: please note when reading the next section that I work with Azul Systems on the Zing JVM, I'm not trying to sell anything, but I am naturally more familiar with it and consider it awesome :-)}
What happens when we include the JVM in our definition of a system? Can you build a non-blocking algorithm on the JVM? All the JVMs I know of are blocking in some particular cases, important to the discussion above are:
1. Allocation: The common case for allocation is fast and involves no locking, but once the young generation is exhausted a collection is required. Young generation collections are stop the world events for Oracle/OpenJDK. Zing has a concurrent young generation collector, but under extreme conditions or bad configuration it may degrade to blocking the allocator. The bottom line is that allocation on the JVM is blocking and that means you cannot consider an algorithm which allocates non-blocking. To be non-blocking a system would have to provably remove the risk of a blocking collection event on allocation.
2. Deoptimization: Imagine the JIT compiler in it's eager and aggressive compilation strategy has decided your offer method ends up getting compiled a particular way which ends up not working out (assumptions on inheritance, class loading, passed in values play a part). When the assumption breaks a deoptimization may take place, and that is a blocking event. It is hard to prove any piece of code is deoptimization risk free, and therefore it is hard to prove any Java code is non-blocking. Deoptimization is in many cases a warmup issue. Zing is actively battling this issue with ReadyNow which reloads previously recorded compilation profiles for the JVM, greatly reducing the risk of deoptimization. Oracle/OpenJDK users can do a warmup run of their application before actually using it to reduce the risk of deoptimization. My colleague Doug Hawkins gave a great talk on this topic which will give you far more detail then I want to go into here :-).
3. Safepoints: If your algorithm includes a safepoint poll it can be brought to a halt by the JVM. The JVM may bring all threads to a safepoint for any number of reasons. Your method, unless inlined, already has a safepoint on method exit(OpenJDK) or entry(Zing). Any non-counted loop will have a safepoint-poll, and this includes your typical CAS loop. Can you prevent safepoints from ever happening in your system? It might be possible for a JVM to provide users with compiler directives which will prevent safepoint polls from being inserted in certain code paths, but I don't know of any JVM offering this feature at the moment.
In short, if you are on the JVM you are probably already blocking. Any discussion of non-blocking algorithms on the JVM is probably ignoring the JVM as part of the 'system'. This needs to be considered if you are going to go all religious about wait-free vs. lock-free vs. blocking. If a system MUST be non-blocking, it should probably not be on the JVM.

### Summary

Any term is only as good as our ability to communicate with others using it. In that sense, splitting hairs about semantics is not very useful in my opinion. It is important to realize people may mean any number of things when talking about lock-free/wait-free algorithms and it is probably a good idea to check if you are all on the same page.
I personally find the distinction between control/progress freedom useful in thinking about algorithms, and I find most people mean lock/wait-free excluding the effects of the JVM...
Thanks Martin & Doug for the review, feedback and discussion, any remaining errors are my own (but their fault ;-) )