Challenge #5c: Efficient Kafka-Style Log

Want to get an idea of how much more I can optimize my 5b solution.

Could anyone else share their results.edn to get an idea of what’s possible?

Sharing mine:

results.edn
{:perf {:latency-graph {:valid? true},
        :rate-graph {:valid? true},
        :valid? true},
 :timeline {:valid? true},
 :exceptions {:valid? true},
 :stats {:valid? true,
         :count 16461,
         :ok-count 16455,
         :fail-count 0,
         :info-count 6,
         :by-f {:assign {:valid? true,
                         :count 2097,
                         :ok-count 2097,
                         :fail-count 0,
                         :info-count 0},
                :crash {:valid? false,
                        :count 6,
                        :ok-count 0,
                        :fail-count 0,
                        :info-count 6},
                :poll {:valid? true,
                       :count 7549,
                       :ok-count 7549,
                       :fail-count 0,
                       :info-count 0},
                :send {:valid? true,
                       :count 6809,
                       :ok-count 6809,
                       :fail-count 0,
                       :info-count 0}}},
 :availability {:valid? true, :ok-fraction 0.9996355},
 :net {:all {:send-count 125706,
             :recv-count 125706,
             :msg-count 125706,
             :msgs-per-op 7.6365957},
       :clients {:send-count 40764,
                 :recv-count 40764,
                 :msg-count 40764},
       :servers {:send-count 84942,
                 :recv-count 84942,
                 :msg-count 84942,
                 :msgs-per-op 5.160197},
       :valid? true},
 :workload {:valid? true,
            :worst-realtime-lag {:time 0.047601458,
                                 :process 4,
                                 :key "8",
                                 :lag 0.0},
            :bad-error-types (),
            :error-types (),
            :info-txn-causes ()},
 :valid? true}

In terms of absolute values, your msgs-per-op looks good. Though rather than just looking at that number, it's worth thinking about it like other algorithmic-complexity questions: how does your performance grow with respect to the number of nodes, or the message rate? For example, if there were 20 nodes, or the message rate were 10, how would your msgs-per-op change?

I also notice the challenges don't really care about message sizes, but if you are syncing the whole state all the time you can see why that would be bad (not that I know what you are doing).
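To make the node-count question concrete: if every send is gossiped to all peers, server traffic grows linearly with cluster size, whereas routing each key to a single owner keeps the forwarding cost per op roughly constant. Here is a minimal Go sketch of that routing idea; the ownerOf helper and the node list are hypothetical, not taken from anyone's solution above.

package main

import (
	"fmt"
	"hash/fnv"
)

// ownerOf maps a log key to exactly one node, so a send is forwarded
// to O(1) peers regardless of cluster size, instead of being gossiped
// to all n of them.
func ownerOf(key string, nodeIDs []string) string {
	h := fnv.New32a()
	h.Write([]byte(key))
	return nodeIDs[int(h.Sum32()%uint32(len(nodeIDs)))]
}

func main() {
	nodes := []string{"n0", "n1", "n2", "n3", "n4"} // hypothetical cluster
	// Every node computes the same owner, so only one forward is needed per send.
	fmt.Println(ownerOf("8", nodes))
}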

Thanks for sharing results to compare against. This is my results file for #5c.

results.edn
{:perf {:latency-graph {:valid? true},
        :rate-graph {:valid? true},
        :valid? true},
 :timeline {:valid? true},
 :exceptions {:valid? true},
 :stats {:valid? true,
         :count 16810,
         :ok-count 16799,
         :fail-count 0,
         :info-count 11,
         :by-f {:assign {:valid? true,
                         :count 2110,
                         :ok-count 2110,
                         :fail-count 0,
                         :info-count 0},
                :crash {:valid? false,
                        :count 7,
                        :ok-count 0,
                        :fail-count 0,
                        :info-count 7},
                :poll {:valid? true,
                       :count 7648,
                       :ok-count 7644,
                       :fail-count 0,
                       :info-count 4},
                :send {:valid? true,
                       :count 7045,
                       :ok-count 7045,
                       :fail-count 0,
                       :info-count 0}}},
 :availability {:valid? true, :ok-fraction 0.9993456},
 :net {:all {:send-count 74488,
             :recv-count 74488,
             :msg-count 74488,
             :msgs-per-op 4.431172},
       :clients {:send-count 41166,
                 :recv-count 41166,
                 :msg-count 41166},
       :servers {:send-count 33322,
                 :recv-count 33322,
                 :msg-count 33322,
                 :msgs-per-op 1.9822725},
       :valid? true},
 :workload {:valid? true,
            :worst-realtime-lag {:time 0.02322775,
                                 :process 4,
                                 :key "8",
                                 :lag 0.0},
            :bad-error-types (),
            :error-types (),
            :info-txn-causes ([:crash "context deadline exceeded"])},
 :valid? true}

With imaginary leader leases and lazy topic index pushes I got down to roughly 3.1 msgs-per-op on average. But all of that is only possible because the runs are on a fairly reliable network; proper fallback and recovery for those schemes would be tricky. Thankfully we are not asked to balance latency against network failures. They do mention availability, but thankfully it stops there.
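For anyone wondering what "lazy topic index pushes" could look like, here is a rough Go sketch, not the actual solution: the key's leader appends to its local log and acks immediately, and only publishes committed offsets on a timer, so replication cost is amortized over many sends. The topicLeader type, its push callback, and the timings are all hypothetical.

package main

import (
	"fmt"
	"sync"
	"time"
)

// topicLeader acks sends from its local log and pushes the committed
// offsets to followers lazily, so a send never waits on replication.
type topicLeader struct {
	mu      sync.Mutex
	log     map[string][]int             // key -> messages held by this leader
	offsets map[string]int               // key -> highest committed offset
	push    func(offsets map[string]int) // hypothetical follower/index push
}

func (l *topicLeader) send(key string, msg int) int {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.log[key] = append(l.log[key], msg)
	off := len(l.log[key]) - 1
	l.offsets[key] = off
	return off // acked now; followers learn about it on the next push
}

func (l *topicLeader) runLazyPushes(every time.Duration) {
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	for range ticker.C {
		l.mu.Lock()
		snapshot := make(map[string]int, len(l.offsets))
		for k, v := range l.offsets {
			snapshot[k] = v
		}
		l.mu.Unlock()
		l.push(snapshot) // one batched index update instead of one per send
	}
}

func main() {
	l := &topicLeader{
		log:     map[string][]int{},
		offsets: map[string]int{},
		push:    func(offsets map[string]int) { fmt.Println("pushed index:", offsets) },
	}
	go l.runLazyPushes(50 * time.Millisecond)
	fmt.Println("offset:", l.send("8", 42))
	time.Sleep(120 * time.Millisecond) // let a couple of pushes happen
}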

It could be better if the storage supported batching. But then you could drop lin-* entirely and implement the storage yourself, which turns into a tradeoff between reliability and latency. For example, if 2 replicas are enough, you can improve over 3 in the stable case; if you can afford to lose the last 10 ms of messages, you can improve even more.
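As a rough illustration of the batching idea (lin-kv has no batch API, so this assumes the self-managed store described above): buffer sends for a short window and flush them in one write, accepting that a crash loses at most one window's worth of messages. The batcher type and flushFn here are hypothetical stand-ins.

package main

import (
	"fmt"
	"sync"
	"time"
)

// batcher buffers sends and flushes them to storage once per window,
// turning one write per send into one write per window.
type batcher struct {
	mu      sync.Mutex
	pending []int
	flushFn func(batch []int) error // stand-in for a batch write to the store
}

func (b *batcher) add(msg int) {
	b.mu.Lock()
	b.pending = append(b.pending, msg)
	b.mu.Unlock()
}

func (b *batcher) run(window time.Duration) {
	ticker := time.NewTicker(window)
	defer ticker.Stop()
	for range ticker.C {
		b.mu.Lock()
		batch := b.pending
		b.pending = nil
		b.mu.Unlock()
		if len(batch) > 0 {
			_ = b.flushFn(batch) // a crash here loses at most this one batch
		}
	}
}

func main() {
	b := &batcher{flushFn: func(batch []int) error {
		fmt.Println("flushing", len(batch), "messages in one write")
		return nil
	}}
	go b.run(10 * time.Millisecond)
	for i := 0; i < 100; i++ {
		b.add(i)
	}
	time.Sleep(30 * time.Millisecond)
}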

I really appreciate that, in the end, they don't ask us to rework everything to manage membership changes. :lollipop:
