Question: Cassandra read/write bandwidth

Answers 1
Added at 2017-01-02 17:01
Question

I'm using the latest Cassandra Docker image. I'm running multiple containers, each running a standalone Cassandra instance. Each instance has a local NVMe SSD that stores both its data and its commit log. I'm using YCSB to test performance: workload A (50% reads, 50% inserts) with 100M records. With 2 containers on a single host I'm getting ~23K TPS.

What I don't understand is the NVMe SSD behavior: I see a steady ~2 GB/s of read bandwidth on each SSD and only ~20 MB/s of writes. The writes happen only in short bursts: most of the time there are no writes to the disks, and once in a while I see a peak of 300 MB/s.

Is this expected behavior for Cassandra? Is the ratio of disk reads to disk writes really that large?

(The host has 65 GB of memory.)

Regards,

David

Answers
Answer #1, added: 2017-01-02 19:01

Yes, that sounds right to me. Reads are more expensive, and writes are cheap. Since you can't do joins, the idea is to design the data model so that each query reads from only one partition. You accomplish this by denormalizing: writing the same data many times instead of just once.
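To put the figures from the question in perspective, here is a rough back-of-the-envelope sketch. It assumes things the question does not state: that the ~23K TPS is the total for both containers, that load is split evenly between them, and that YCSB is running with its default record size (10 fields of 100 bytes each). If any of those is wrong, the ratios change accordingly.

```python
# Back-of-the-envelope estimate from the figures in the question.
# Assumptions (NOT stated in the question): ~23K TPS is the total across both
# containers, load is split evenly, and YCSB uses its default record size.

TOTAL_OPS_PER_SEC = 23_000
CONTAINERS = 2
READ_FRACTION = 0.5               # workload A: half reads, half writes
RECORD_BYTES = 10 * 100           # YCSB default: 10 fields x 100 bytes

READ_BW_PER_SSD = 2e9             # ~2 GB/s observed per SSD
WRITE_BW_PER_SSD = 20e6           # ~20 MB/s observed per SSD

ops_per_container = TOTAL_OPS_PER_SEC / CONTAINERS
reads_per_sec = ops_per_container * READ_FRACTION
writes_per_sec = ops_per_container * (1 - READ_FRACTION)

# Average bytes moved on disk per logical operation.
bytes_read_per_read = READ_BW_PER_SSD / reads_per_sec
bytes_written_per_write = WRITE_BW_PER_SSD / writes_per_sec

if __name__ == "__main__":
    print(f"reads/s per container:   {reads_per_sec:.0f}")
    print(f"disk bytes per read:     {bytes_read_per_read:.0f}")
    print(f"disk bytes per write:    {bytes_written_per_write:.0f}")
    print(f"read amplification:      {bytes_read_per_read / RECORD_BYTES:.0f}x")
    print(f"write amplification:     {bytes_written_per_write / RECORD_BYTES:.1f}x")
```

Under those assumptions each read pulls on the order of 350 KB from disk for a ~1 KB record, while each write costs only a few KB on disk, which fits the picture of cheap writes and expensive reads.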

When the memtables flush, they generate a lot of write traffic, which is likely what causes those large spikes. Reads will hit disk a lot and, depending on the compaction strategy, may require a lot of I/O. There will also be steadier (although still bursty, every 10 seconds) writes to the commit log. You may want to check out the documentation on the write path, or the introductions to the read and write paths. There are plenty of other online references on this too if you search for them.
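The write pattern described above can be illustrated with a toy model. This is only a sketch, not Cassandra's actual code: the flush threshold and ingest rate are made-up numbers, and the commit log is modeled as reaching disk only at each 10-second sync, which is a simplification.

```python
# Toy model of bursty disk writes: NOT Cassandra's implementation.
# All sizes and rates below are illustrative assumptions.

MIB = 1024 * 1024
MEMTABLE_FLUSH_BYTES = 256 * MIB   # hypothetical memtable flush threshold
COMMITLOG_SYNC_SECONDS = 10        # periodic commit log sync interval
INGEST_BYTES_PER_SEC = 6 * MIB     # hypothetical steady client write load


def simulate(seconds):
    """Return a list of (second, commitlog_bytes, flush_bytes) disk writes."""
    memtable = 0       # bytes buffered in memory, not yet written as an SSTable
    unsynced_log = 0   # commit log bytes not yet synced to disk
    timeline = []
    for t in range(1, seconds + 1):
        # Every incoming write goes to both the commit log and the memtable.
        memtable += INGEST_BYTES_PER_SEC
        unsynced_log += INGEST_BYTES_PER_SEC

        # Commit log: small, regular bursts at each sync interval.
        log_written = 0
        if t % COMMITLOG_SYNC_SECONDS == 0:
            log_written, unsynced_log = unsynced_log, 0

        # Memtable: one large burst whenever the threshold is reached.
        flushed = 0
        if memtable >= MEMTABLE_FLUSH_BYTES:
            flushed, memtable = memtable, 0

        timeline.append((t, log_written, flushed))
    return timeline


if __name__ == "__main__":
    for t, log_written, flushed in simulate(120):
        if log_written or flushed:
            print(f"t={t:3d}s  commitlog={log_written // MIB:4d} MiB  "
                  f"flush={flushed // MIB:4d} MiB")
```

Most seconds in the timeline show no disk writes at all, with small commit log bursts at regular intervals and an occasional much larger flush, which is roughly the shape reported in the question. Compaction, which adds further reads and writes, is left out of the model.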
