Challenge 6: maelstrom does not detect G0 even when it can + how to abort txs for G1a?

Hi, I want to show in my solution every anomaly mentioned. G1b and G1c are there. G1a (aborted reads) requires aborts. How to do proper aborts in terms of protocol?

Then the maelstrom does not detect G0 (dirty write cycle) or it does not generate them.

The naive implementation of the transaction processor that uses (Warning. Spoiler to 6a/b):

Summary

a single database mutex per the whole database that is ONLY acquired per single PUT/GET operation when the processor wants to do an op to the underlying storage

is immediately suspect to the G0 dirty writes anomaly, but Maelstrom does not detect them. It does not detect G0 even with 1 node, concurrency 10 and rate 10000, tx len 20, keys set 10 - which is nonsense. I of course added sleep to threads to make txn to last longer to improve conflict probability.

Did anyone have any luck seeing G0 in Maelstrom workload anomalies report with read-uncommitted anomaly?

Actually, it seems it just does not work. Here is the history I do have generated that has G0 and the Maelstrom does not see it

$ go build && ~/.../maelstrom test -w txn-rw-register --bin ./maelstrom-txn --node-count 1 --time-limit 20 --rate 10 --concurrency 10n --consistency-models read-uncommitted --max-txn-length 20 --max-writes-per-key 10000000 --key-count 1

2023/03/29 12:11:11 Received {c3 n0 {"txn":[["w",0,1],["r",0,null],["r",0,null],["w",0,2],["w",0,3],["r",0,null],["w",0,4],["r",0,null],["r",0,null],["r",0,null],["w",0,5],["w",0,6],["r",0,null],["w",0,7],["w",0,8],["r",0,null],["w",0,9]],"type":"txn","msg_id":1}}
910670418 + 0 = 1 [#2]  - write key 0, value 1, tx 2
991888336 - 0 = 0 [#2]
2023/03/29 12:11:11 Received {c4 n0 {"txn":[["r",0,null],["w",0,10],["r",0,null],["w",0,11]],"type":"txn","msg_id":1}}
997560527 - 0 = 0 [#3]
8713188 + 0 = 10 [#3] - write key 0, value 10, tx 3 (ww-edge)
17116025 - 0 = 0 [#2]
60262526 + 0 = 2 [#2] - write key 0, value 2, tx 2 (ww-edge - cycle with 3)
77438615 + 0 = 3 [#2] - write key 0, value 3, tx 2 (ww-edge - cycle with 3)
90639108 - 0 = 0 [#2] - read key 0, value 3, tx 2 (wr-edge - cycle with 3)
...
2023/03/29 12:11:12 Sent {"src":"n0","dest":"c4","body":{"in_reply_to":1,"txn":[["r",0,1],["w",0,10],["r",0,3],["w",0,11]],"type":"txn_ok"}}
2023/03/29 12:11:12 Sent {"src":"n0","dest":"c3","body":{"in_reply_to":1,"txn":[["w",0,1],["r",0,1],["r",0,10],["w",0,2],["w",0,3],["r",0,3],["w",0,4],["r",0,11],["r",0,17],["r",0,19],["w",0,5],["w",0,6],["r",0,14],["w",0,7],["w",0,8],["r",0,34],["w",0,9]],"type":"txn_ok"}}

So c3 observes the read value of 3 at operation index 5 (0-indexed), while it just previously wrote 2 and 3 on top of ww-conflict (10). It’s a G0 and maelstrom ignored it.

If it were working correctly, it had to abort at

60262526 + 0 = 2 [#2] - write key 0, value 2, tx 2 (ww-edge - cycle with 3)

according to the definition from Challenge #6b: Totally-Available, Read Uncommitted Transactions · Fly Docs

G0 (dirty write): a cycle of transactions linked by write-write dependencies. For instance, transaction T1 appends 1 to key x, transaction T2 appends 2 to x, and T1 appends 3 to x again, producing the value [1, 2, 3].

Here we have many cycles of tx 2 and tx3: w2(1), w3(10), w2(2), w2(3). It can easily be proved that the MS could alone detect based only on the received responses going from that (1) every written value is unique (2) all writes in those 2 txs covered with reads:

t3:           ["r",0,1],["w",0,10],                                          ["r",0,3],["w",0,11]
t2: ["w",0,1],["r",0,1],           ["r",0,10],["w",0,2],["w",0,3],["r",0,3]

Thus, if I am correct. It’s worth changing this line

For a hard challenge, try getting Maelstrom to fail because of a write-write cycle.

at the Challenge #6b: Totally-Available, Read Uncommitted Transactions · Fly Docs to something different as the Maelstrom does not detect ww cycles.

Just for the info, the report

jepsen.core {:perf {:latency-graph {:valid? true},
        :rate-graph {:valid? true},
        :valid? true},
 :timeline {:valid? true},
 :exceptions {:valid? true},
 :stats {:valid? true,
         :count 208,
         :ok-count 208,
         :fail-count 0,
         :info-count 0,
         :by-f {:txn {:valid? true,
                      :count 208,
                      :ok-count 208,
                      :fail-count 0,
                      :info-count 0}}},
 :availability {:valid? true, :ok-fraction 1.0},
 :net {:all {:send-count 418,
             :recv-count 418,
             :msg-count 418,
             :msgs-per-op 2.0096154},
       :clients {:send-count 418, :recv-count 418, :msg-count 418},
       :servers {:send-count 0,
                 :recv-count 0,
                 :msg-count 0,
                 :msgs-per-op 0.0},
       :valid? true},
 :workload {:valid? true, :anomaly-types (), :anomalies nil},
 :valid? true}

@benbjohnson did you have any luck with showing G0 in Maelstrom anomalies?

@benbjohnson so, I have got an answer from Aphyr (txn-rw-register workload does not detect g0 write cycles under read-uncommitted consistency · Issue #56 · jepsen-io/maelstrom · GitHub)

Right, so I think this is a shortcoming in Elle’s rw-register checker, which does pretty much all its inference through external reads and writes–IIRC, my rationale was that internal anomalies would be glaringly obvious, and it offers a pretty major speedup–a lot of problems gain an extra level of iteration if you consider internal reads/writes. I took a stab at changing that, and after seven hours I have to admit–this is a bigger problem than I was hoping. Elle is… fantastically complex, and has a lot of moving parts. Gotta shelve this for now.

So I guess I am the first one who actually tried to do the homework :ahahahahahahahahahahahahaha:. Anyway, we can both agree I have showed the G0, but Maelstrom at the moment would not fail on those. So pls correct the problem specification at the fly.io.

@johnkoepi Thanks for digging into that. I didn’t have luck with the G0 although Kyle said he was close but we had limited time left to dig into it further. I’ll get the spec fixed up.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.