Challenge #3c: Fault Tolerant Broadcast

Hi folks, I am stuck on 3c. I know that after a network partition, some nodes will be unreachable. What I can’t figure out is whether RPC() or SyncRPC() calls to unreachable nodes will be stuck forever until the partition heals (so that I need to force a timeout), or whether the simulation will return an RPC error with a timeout. As far as I can tell, none of my RPC calls return an error; I checked by adding log lines to each error path. My code can be seen here (3c.go · GitHub).

There aren’t many (if any) guarantees about the partitioning. You should assume that messages can get delayed or dropped, and there’s nothing in Maelstrom to guarantee a timeout error will be returned.


100% to what @benbjohnson said.

It’s just that the API function names are confusing. Those RPC functions are not actually Remote Procedure Calls; they just send messages. If a message is dropped or lost somewhere in the middle, sorry, don’t wait for anything in return. Think of them as sending and receiving UDP packets. In theory, messages can even be duplicated.

hi! are we allowed to change the message structure for broadcast in 3c? i’m not sure how to guarantee all nodes receive the message without tracking who has already received it and sent it to someone else

sorry to piggyback here but i figure it’s much better than creating another topic since the challenges have been out for a while now and i assume most people have already finished(?)

@arhyth if the Maelstrom workload passes (valid: true), then it’s fine
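One way to avoid tracking who has received what is to have each node remember only which values it has already seen, and forward a value only the first time it arrives. A minimal sketch, assuming integer message values and string node IDs; `seenSet` and `forwardTargets` are hypothetical names, not Maelstrom APIs:

```go
package main

import "sync"

// seenSet records which broadcast values this node has already stored.
type seenSet struct {
	mu   sync.Mutex
	seen map[int]struct{}
}

func newSeenSet() *seenSet {
	return &seenSet{seen: make(map[int]struct{})}
}

// add stores msg and reports whether it was new. A node would forward a
// value only when add returns true, which stops endless re-gossip and
// also absorbs duplicated messages.
func (s *seenSet) add(msg int) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.seen[msg]; ok {
		return false
	}
	s.seen[msg] = struct{}{}
	return true
}

// forwardTargets returns the neighbors to gossip to, skipping the node
// the value just came from.
func forwardTargets(neighbors []string, sender string) []string {
	out := make([]string, 0, len(neighbors))
	for _, n := range neighbors {
		if n != sender {
			out = append(out, n)
		}
	}
	return out
}
```

With this approach the broadcast message body can stay unchanged; the only extra state is local to each node.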


I’m struggling with this a bit, I feel like this is tricky without more hints, and didn’t find anything else in the forums. I tried adding the failed messages to a list and resending failed messages on each broadcast.

edit: managed to do it by resending all the failed messages at a regular interval (or just resending all the failed messages on a read, which works but is not recommended if there are a lot of reads).

Can you provide some info on where you’re getting stuck? Or can you post some code? The challenges are open-ended so there can be multiple approaches that can work.

I managed to come up with a solution by storing the failed messages in a map. I think it was just quite a steep curve from the previous step to this one. What made it difficult was that I didn’t see the log messages initially. Also, the API functions are a bit confusing, especially when using the async sender. In the end, I just used the sync sender, and once I found where to view the logs on each node, it was easier to debug.
