Sequential consistency in challenge 4 (grow-only counter)

shachaf · February 23, 2023, 9:38pm

maelstrom isn’t happy with my solution for g_counter, and I’m not quite sure why. Here’s the simplest history.txt I managed to get from it (with 2 nodes):

0	:invoke	:add	2
:nemesis	:info	:start-partition	:one
:nemesis	:info	:start-partition	[:isolated {"n1" #{"n0"}, "n0" #{"n1"}}]
0	:ok	:add	2
0	:invoke	:read	nil
0	:ok	:read	2
0	:invoke	:read	nil
0	:ok	:read	2
:nemesis	:info	:stop-partition	nil
:nemesis	:info	:stop-partition	:network-healed
1	:invoke	:read	nil
0	:invoke	:read	nil
0	:ok	:read	2
1	:ok	:read	0

I assume it doesn’t like the n1 read returning 0, but this seems compatible with sequential consistency to me (this is the only operation involving n1, so it seems legitimate for it not to have observed n0’s add yet). It’s relying on seq-kv, and inheriting this behavior from there (n1 isn’t observing a write issued by n0). What are the specific requirements here that this needs to meet?

benbjohnson · February 24, 2023, 12:57am

The g-counter workload requires that the final read is consistent and correct but I missed that in the challenge description. I’m sorry about that. Thanks for catching it.

I pushed up a change to the Specification section so it requires that. Unfortunately, there’s not currently a way to determine from Maelstrom if a read is the last one before the test ends.

shachaf · February 24, 2023, 4:01am

Thanks for the update. Is there a way to require the same behavior from seq-kv, if this service is meant to be implemented on top of seq-kv?

Under the assumption that adds are more common than reads, I had each node write to its own key to avoid any write contention, and reads read (potentially-stale versions of) other nodes’ keys. If this isn’t permitted for the final read, which is indistinguishable from other reads, it sounds like the requirements are just stronger than sequential consistency (it sounds bit like a partial synchrony assumption, but it seems odd to require it from my service and not from the underlying one it’s supposed to use). Is the specification hinting that it wants all the nodes to CAS a single key or something, to achieve this on top of seq-kv?

Also, looking at the implementation of the workload, it looks like it does try to account for multiple possible states in the final read. So I might still be missing something. This talks about adds that “definitely happened”, but in a sequentially-consistent non-linearizable system, one node is permitted to observe an add as having happened while another one may not see it yet, so I’m not sure what its model is.

benbjohnson · February 24, 2023, 7:02pm

Yes, that’s the easiest option. You don’t even need to CAS another key—you can CAS on the count value key.

Sequential consistency has a total order so writes ensure that a client sees the latest version. Reads, on the other hand, simply cannot go backwards. I’m not sure if that helps clarify things?

shachaf · February 24, 2023, 8:38pm

Thanks for the response! I figured that CAS approach was being hinted at.

Sequential consistency has a total order so writes ensure that a client sees the latest version. Reads, on the other hand, simply cannot go backwards. I’m not sure if that helps clarify things?

Hmm, are you saying that if you send a write to node A, and then you send a read to node A, and it shows the write, and then you send a read to node B, and it doesn’t show the write, that violates sequential consistency? To me that sounds like some sort of linearizability requirement, not sequential consistency. Sequential consistency just requires the results to be compatible with some total order on operations which is compatible with the order seen by each node. So for

A: add 1
A: read
B: read

I would think that it’s valid to have B’s read ordered before A’s add. B’s read never went backwards from showing 1 to showing 0 or anything like that, it just didn’t see A’s add yet (because seq-kv didn’t propagate it to B yet). Am I missing something?

benbjohnson1 · February 24, 2023, 9:16pm

Ah, sorry I wasn’t clearer. I meant that if A performs a write and then B performs a write then those are ordered. You’re correct that B’s reads could be reordered before A’s read or write.

Spoiler, if you’re interested: The trick in this challenge is that you can perform an identity CAS to synchronize after the KV read and ensure that it is returning the latest value.

craigpastro · February 24, 2023, 10:17pm

Attempting this challenge too and perhaps there is something I don’t understand about sequential consistency. Is it true that if I CAS a value and it succeeds then at some later time I shouldn’t be able to CAS a lesser value (assuming values are only growing)? For example, in the attached screenshot I find 121 in the kv store by CAS, then after I find 119 by CAS.

Thanks!

benbjohnson · February 24, 2023, 11:40pm

Yes, that’s how I understand it as well. I haven’t seen that before. I filed an issue on the Maelstrom project.

craigpastro · February 24, 2023, 11:41pm

Thank you for the extra eyes!

shachaf · February 25, 2023, 12:52am

Hmm, I’m a little confused by the way everyone is talking about sequential consistency! My understanding is: A particular execution is valid as long as all operations and results are compatible with having executed in some total order, such that every node’s individual operation order is a valid suborder. (I.e. each node’s operations are executed in the given order, with an arbitrary interleaving.)

In particular, I don’t see why a read followed by a nop CAS would necessarily give you anything that a read doesn’t – the CAS can always come right after the read in the global order, and you don’t get any real-time guarantees from that, if you communicate with other nodes over a side channel.

sdab · February 25, 2023, 8:07pm

Edit: on second thought, you probably didnt do writes but CAS to set the previous values which I think changes things. A paste of your full history might help.

I imagine your history looks like:

n0: write key=121
n1: write key=119
n2: write key=117
n1: read key 119
n2: read key 117
n0 read key 121
n2 cas key 117,117
n0 cas key 121,121
n1 cas key 119,119

Which can be reordered to:

n0 write key=121
n0 read key=121
n0 cas 121,121
n1 write,read,cas 119
n2 write,read,cas 121

sdab · February 25, 2023, 9:42pm

I didnt manage to see how an identity CAS on reads would be helpful. In this history of KV ops:

n0 : read 0
n0: cas 0,0 # ident cas
n0 cas 0,3 # add 3
n1 read 3 # only and final read for n1
n1 cas 3,3 # ident cas
n0 read 3
n0 cas 3,3
n0 cas 3,4 # add 1
n0 read 4 # n0 final read
n0 cas 4,4

We can still have final reads with different values even using identity CAS after reads.

I did come up with 2 related solutions using only the seqKV store (will try to leave some things out):

Since we dont know when the final read may come, all reads need to have the correct value and thus reads require something stronger than seq consistency.
On reads you can either wait until the table syncs (may cause timeouts, increases msgs-per-op)
or just detect the stale read and return an error (jepsen retries, we dont guarantee successful reads just that we dont return stale reads). I went with the latter.

shachaf · February 26, 2023, 1:27am

Thinking about it, maybe what @benbjohnson was getting at can be accomplished with non-identity CAS. If your key holds both the counter value and a globally unique token, and each time you either write or read, you CAS to change the token (while maintaining the value, in read cases), then there’s no valid history where both CASes succeed, since tokens are never reused. E.g.

A: read  (0@init)
A: cas 0@init -> 5@A1 (ok)

B: read  (0@init)
B: cas 0@init -> 0@B1

If A acknowledges the add, meaning its CAS succeeded, then B’s CAS can’t possibly succeed as well, since the value will never go back to 0@init again (this isn’t true with an identity CAS, because that can happen before A’s CAS and still have them both succeed).

This seems like a very sad counter, with maximal contention on both reads and writes. It also still seems like a stronger requirement than sequential consistency, so I’m still confused about the specification. But it does seem like it could work to force synchronization.

AnubhavUjjawal · March 3, 2023, 6:30pm

Hi, I was able to solve the question and make the test cases pass, I have a few questions though, and I was hoping someone could answer them.

Question 1

@benbjohnson1 I am not sure if this is correct, B’s write could also be reordered before A’s read or write. This is told in the formal definition here. Please let me know if I am wrong or right because I am very confused about this.

Question 2
CAS (CompareAndSwap) is a specialized operation for the purpose of this challenge. The next few lines could be a spoiler.

Did we have to figure out ourselves that a special kind of CAS can help us force the seq-kv to evaluate the pending actions?

Does the CAS operation exist in real-world scenarios? Any examples?

NOTES

If you need a few hints, go through this and this.

The comments in this thread by @shachaf were very helpful. He/She does not assume anything other than what Sequential Consistency means, and I also had a lot of the same doubts as him.

johnkoepi · March 14, 2023, 5:29pm

I don’t think so. I don’t see why you came up with this conclusion. After successful CAS any other good CAS is possible with the correct expected value.

johnkoepi · March 14, 2023, 5:35pm

The funny thing to me with this problem is the amount of how much it is underspecified.

With CAS ops they are useless IMHO as if you will be CAS-ing a global register you don’t have a guarantee that your CAS has exactly-once semantics. Basically, you don’t know if your CAS was successful or not with even a timeout. Especially with partitioning most of CAS responses can be dropped so you can’t be sure and can’t rely on that. For CASing only your own (node-specific key) its useless as you can just use a Store() call.

Then for the SeqCon model with no constraints on the consistent reads how can you guarantee in the end that you will ever return anything consistent if you are not guaranteed that your process will ever read any other update from the store beside the first one?

Otherwise, it’s a beautiful problem.

Topic		Replies	Views
Gossip Gloomers: Sequential Consistency semantics of SeqKV store	2	490	March 9, 2023
Challenge #4: Clarifications dist-sys-challenge	6	717	April 10, 2023
Gossip glomers - Challenge 6 dist-sys-challenge	14	892	April 13, 2023
Challenge 6 question dist-sys-challenge	1	520	February 27, 2023
Gossip Glommers Challenge 4 how to set :src in request to seq-kv? Questions / Help	2	441	July 26, 2023

Sequential consistency in challenge 4 (grow-only counter)

Related topics