Sequential consistency in challenge 4 (grow-only counter)

maelstrom isn’t happy with my solution for g_counter, and I’m not quite sure why. Here’s the simplest history.txt I managed to get from it (with 2 nodes):

0	:invoke	:add	2
:nemesis	:info	:start-partition	:one
:nemesis	:info	:start-partition	[:isolated {"n1" #{"n0"}, "n0" #{"n1"}}]
0	:ok	:add	2
0	:invoke	:read	nil
0	:ok	:read	2
0	:invoke	:read	nil
0	:ok	:read	2
:nemesis	:info	:stop-partition	nil
:nemesis	:info	:stop-partition	:network-healed
1	:invoke	:read	nil
0	:invoke	:read	nil
0	:ok	:read	2
1	:ok	:read	0

I assume it doesn’t like the n1 read returning 0, but this seems compatible with sequential consistency to me (this is the only operation involving n1, so it seems legitimate for it not to have observed n0’s add yet). It’s relying on seq-kv, and inheriting this behavior from there (n1 isn’t observing a write issued by n0). What are the specific requirements here that this needs to meet?

The g-counter workload requires that the final read is consistent and correct but I missed that in the challenge description. I’m sorry about that. Thanks for catching it.

I pushed up a change to the Specification section so it requires that. Unfortunately, there’s not currently a way to determine from Maelstrom if a read is the last one before the test ends.

Thanks for the update. Is there a way to require the same behavior from seq-kv, if this service is meant to be implemented on top of seq-kv?

Under the assumption that adds are more common than reads, I had each node write to its own key to avoid any write contention, and reads read (potentially-stale versions of) other nodes’ keys. If this isn’t permitted for the final read, which is indistinguishable from other reads, it sounds like the requirements are just stronger than sequential consistency (it sounds bit like a partial synchrony assumption, but it seems odd to require it from my service and not from the underlying one it’s supposed to use). Is the specification hinting that it wants all the nodes to CAS a single key or something, to achieve this on top of seq-kv?

Also, looking at the implementation of the workload, it looks like it does try to account for multiple possible states in the final read. So I might still be missing something. This talks about adds that “definitely happened”, but in a sequentially-consistent non-linearizable system, one node is permitted to observe an add as having happened while another one may not see it yet, so I’m not sure what its model is.

Yes, that’s the easiest option. You don’t even need to CAS another key—you can CAS on the count value key.

Sequential consistency has a total order so writes ensure that a client sees the latest version. Reads, on the other hand, simply cannot go backwards. I’m not sure if that helps clarify things?

Thanks for the response! I figured that CAS approach was being hinted at.

Sequential consistency has a total order so writes ensure that a client sees the latest version. Reads, on the other hand, simply cannot go backwards. I’m not sure if that helps clarify things?

Hmm, are you saying that if you send a write to node A, and then you send a read to node A, and it shows the write, and then you send a read to node B, and it doesn’t show the write, that violates sequential consistency? To me that sounds like some sort of linearizability requirement, not sequential consistency. Sequential consistency just requires the results to be compatible with some total order on operations which is compatible with the order seen by each node. So for

A: add 1
A: read
B: read

I would think that it’s valid to have B’s read ordered before A’s add. B’s read never went backwards from showing 1 to showing 0 or anything like that, it just didn’t see A’s add yet (because seq-kv didn’t propagate it to B yet). Am I missing something?

Ah, sorry I wasn’t clearer. I meant that if A performs a write and then B performs a write then those are ordered. You’re correct that B’s reads could be reordered before A’s read or write.

Spoiler, if you’re interested: The trick in this challenge is that you can perform an identity CAS to synchronize after the KV read and ensure that it is returning the latest value.

:wave: Attempting this challenge too and perhaps there is something I don’t understand about sequential consistency. Is it true that if I CAS a value and it succeeds then at some later time I shouldn’t be able to CAS a lesser value (assuming values are only growing)? For example, in the attached screenshot I find 121 in the kv store by CAS, then after I find 119 by CAS.

Thanks!

Yes, that’s how I understand it as well. I haven’t seen that before. I filed an issue on the Maelstrom project.

Thank you for the extra eyes!

Hmm, I’m a little confused by the way everyone is talking about sequential consistency! My understanding is: A particular execution is valid as long as all operations and results are compatible with having executed in some total order, such that every node’s individual operation order is a valid suborder. (I.e. each node’s operations are executed in the given order, with an arbitrary interleaving.)

In particular, I don’t see why a read followed by a nop CAS would necessarily give you anything that a read doesn’t – the CAS can always come right after the read in the global order, and you don’t get any real-time guarantees from that, if you communicate with other nodes over a side channel.

Edit: on second thought, you probably didnt do writes but CAS to set the previous values which I think changes things. A paste of your full history might help.

I imagine your history looks like:

  • n0: write key=121
  • n1: write key=119
  • n2: write key=117
  • n1: read key 119
  • n2: read key 117
  • n0 read key 121
  • n2 cas key 117,117
  • n0 cas key 121,121
  • n1 cas key 119,119

Which can be reordered to:

  • n0 write key=121
  • n0 read key=121
  • n0 cas 121,121
  • n1 write,read,cas 119
  • n2 write,read,cas 121

I didnt manage to see how an identity CAS on reads would be helpful. In this history of KV ops:

  • n0 : read 0
  • n0: cas 0,0 # ident cas
  • n0 cas 0,3 # add 3
  • n1 read 3 # only and final read for n1
  • n1 cas 3,3 # ident cas
  • n0 read 3
  • n0 cas 3,3
  • n0 cas 3,4 # add 1
  • n0 read 4 # n0 final read
  • n0 cas 4,4

We can still have final reads with different values even using identity CAS after reads.

I did come up with 2 related solutions using only the seqKV store (will try to leave some things out):

  • Since we dont know when the final read may come, all reads need to have the correct value and thus reads require something stronger than seq consistency.
  • On reads you can either wait until the table syncs (may cause timeouts, increases msgs-per-op)
  • or just detect the stale read and return an error (jepsen retries, we dont guarantee successful reads just that we dont return stale reads). I went with the latter.

Thinking about it, maybe what @benbjohnson was getting at can be accomplished with non-identity CAS. If your key holds both the counter value and a globally unique token, and each time you either write or read, you CAS to change the token (while maintaining the value, in read cases), then there’s no valid history where both CASes succeed, since tokens are never reused. E.g.

A: read  (0@init)
A: cas 0@init -> 5@A1 (ok)

B: read  (0@init)
B: cas 0@init -> 0@B1

If A acknowledges the add, meaning its CAS succeeded, then B’s CAS can’t possibly succeed as well, since the value will never go back to 0@init again (this isn’t true with an identity CAS, because that can happen before A’s CAS and still have them both succeed).

This seems like a very sad counter, with maximal contention on both reads and writes. It also still seems like a stronger requirement than sequential consistency, so I’m still confused about the specification. But it does seem like it could work to force synchronization.

1 Like

Hi, I was able to solve the question and make the test cases pass, I have a few questions though, and I was hoping someone could answer them.

Question 1

@benbjohnson1 I am not sure if this is correct, B’s write could also be reordered before A’s read or write. This is told in the formal definition here. Please let me know if I am wrong or right because I am very confused about this.

Question 2
CAS (CompareAndSwap) is a specialized operation for the purpose of this challenge. The next few lines could be a spoiler.

Did we have to figure out ourselves that a special kind of CAS can help us force the seq-kv to evaluate the pending actions?

Does the CAS operation exist in real-world scenarios? Any examples?

NOTES

  • If you need a few hints, go through this and this.

The comments in this thread by @shachaf were very helpful. He/She does not assume anything other than what Sequential Consistency means, and I also had a lot of the same doubts as him.

I don’t think so. I don’t see why you came up with this conclusion. After successful CAS any other good CAS is possible with the correct expected value.

The funny thing to me with this problem is the amount of how much it is underspecified.

With CAS ops they are useless IMHO as if you will be CAS-ing a global register you don’t have a guarantee that your CAS has exactly-once semantics. Basically, you don’t know if your CAS was successful or not with even a timeout. Especially with partitioning most of CAS responses can be dropped so you can’t be sure and can’t rely on that. For CASing only your own (node-specific key) its useless as you can just use a Store() call.

Then for the SeqCon model with no constraints on the consistent reads how can you guarantee in the end that you will ever return anything consistent if you are not guaranteed that your process will ever read any other update from the store beside the first one?

Otherwise, it’s a beautiful problem.