ZOS poor performance on PCIE 4 NVME SSD

Hi,

This is @naturecrypto here; I changed my nickname to match the one on GitHub :slight_smile:

I have a little time right now, so I am continuing my tests from the start of the year. I built a machine for ZOS with a 5950X (16 cores, 32 threads), 128 GB RAM and a 2 TB Sabrent Rocket 4+ PCIe 4 SSD.

In January I ran some benchmarks with fio inside a phoronix-test-suite container to check SSD performance on ZOS with the 5.4 kernel, and on Ubuntu 20.04 with different kernels (bare metal, with phoronix-test-suite installed on it).

I got poor results on ZOS and Ubuntu with the 5.4.10 or 5.4.108 kernels:

Random read, 4k blocks: 12.4 MB/s, 4142 IOPS
Random write, 4k blocks: 13.3 MB/s, 4489 IOPS
Sequential read, 2MB blocks: 1316 MB/s, 864 IOPS
Sequential write, 2MB blocks: 2326 MB/s, 1528 IOPS

When I switched Ubuntu to the 5.8.0-48 kernel (or, for today’s tests, 5.10.59), the results were much better:

Random read, 4k blocks: 1855 MB/s, 488 000 IOPS
Random write, 4k blocks: 563 MB/s, 144 000 IOPS
Sequential read, 2MB blocks: 6728 MB/s, 3360 IOPS
Sequential write, 2MB blocks: 6271 MB/s, 3132 IOPS
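
To put the two sets of numbers side by side, here is a quick Python sketch that computes how much faster each test ran on the newer kernel. The figures are copied from the results above; the dictionary and function names are only illustrative.

```python
# (throughput, IOPS) pairs copied from the two result sets above.
old_kernel = {  # ZOS and Ubuntu with a 5.4.x kernel
    "randread_4k": (12.4, 4142),
    "randwrite_4k": (13.3, 4489),
    "seqread_2m": (1316, 864),
    "seqwrite_2m": (2326, 1528),
}
new_kernel = {  # Ubuntu with 5.8.0-48 / 5.10.59
    "randread_4k": (1855, 488_000),
    "randwrite_4k": (563, 144_000),
    "seqread_2m": (6728, 3360),
    "seqwrite_2m": (6271, 3132),
}


def speedups(old, new):
    """Return {test: (throughput_ratio, iops_ratio)} for new vs. old."""
    return {
        name: (new[name][0] / old[name][0], new[name][1] / old[name][1])
        for name in old
    }


for name, (bw_ratio, iops_ratio) in speedups(old_kernel, new_kernel).items():
    print(f"{name:13s} throughput x{bw_ratio:6.1f}   IOPS x{iops_ratio:6.1f}")
```

With these numbers, random reads come out more than 100 times faster on the newer kernel, while sequential writes improve by less than a factor of 3, so the gap is mostly in small random I/O.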

Would it be possible to test a ZOS version with a more recent kernel to verify whether it is the root cause of the problem? Switching to the 5.10 LTS kernel could be a good idea for better storage performance, which is critical for hosting workloads.


I edited the previous post for better clarity on the troubleshooting process

Hi @naturecrypto, we have created a GitHub issue from your post. We will post updates on that issue here as and when they happen.

There is an update on the github issue that we had created with your post:

"It’s not kernel related. If you run fio on the root filesystem of your container, you hit 0-fs, which is not made to be fast, especially for random read/write.

If you want good performance, you have to mount a volume inside your container and test against it, which won’t be on top of an flist but on direct disk access."
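
One way to check which filesystem actually backs a given path inside the container is to look it up in `/proc/mounts`. The sketch below is a minimal illustration, not an official tool: `find_mount` is a hypothetical helper, and it ignores the `\040` escaping that `/proc/mounts` uses for spaces in paths.

```python
import os


def find_mount(path, mounts_text):
    """Return (device, mount_point, fs_type) for the mount that contains `path`.

    `mounts_text` is the content of /proc/mounts. The longest matching mount
    point wins; on a tie, the entry listed last is kept.
    """
    path = os.path.abspath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        device, mount_point, fs_type = fields[:3]
        # Compare with a trailing slash so "/data" does not match "/database".
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            if best is None or len(mount_point) >= len(best[1]):
                best = (device, mount_point, fs_type)
    return best


def backing_filesystem(path):
    """Look up `path` in the live /proc/mounts (Linux only)."""
    with open("/proc/mounts") as f:
        return find_mount(path, f.read())
```

Calling `backing_filesystem(p)` for the root filesystem and for the directory where the volume is mounted should return different devices and filesystem types if the volume really bypasses 0-fs.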

I deployed a new alpine container with extra storage mounted as /data.

Results are the same if the test is done on / or /data
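
For anyone who wants to reproduce the comparison without phoronix-test-suite, here is a rough Python sketch that runs fio directly against a directory and pulls IOPS and bandwidth out of its JSON output. The job parameters (`libaio`, `--direct=1`, queue depth 32, 1G file, 30 s runtime) are my own assumptions, not the phoronix profile, and the helper names are only illustrative.

```python
import json
import subprocess


def build_fio_cmd(directory, rw="randread", bs="4k", size="1G", runtime=30):
    """Build an fio command line that tests a file created under `directory`."""
    return [
        "fio",
        "--name=bench",
        f"--directory={directory}",
        f"--rw={rw}",            # randread, randwrite, read or write
        f"--bs={bs}",
        f"--size={size}",
        "--ioengine=libaio",     # assumption: fio was built with libaio support
        "--direct=1",            # bypass the page cache
        "--iodepth=32",
        f"--runtime={runtime}",
        "--time_based",
        "--output-format=json",
    ]


def summarize(fio_json, rw):
    """Extract IOPS and bandwidth (KiB/s) of the first job from fio's JSON output."""
    side = "read" if "read" in rw else "write"
    job = json.loads(fio_json)["jobs"][0][side]
    return {"iops": job["iops"], "bw_kib_s": job["bw"]}


def run(directory, rw="randread", bs="4k"):
    """Run fio against `directory` and return the summary (requires fio installed)."""
    result = subprocess.run(
        build_fio_cmd(directory, rw, bs),
        check=True, capture_output=True, text=True,
    )
    return summarize(result.stdout, rw)
```

Running `run("/")` and `run("/data")` back to back, for both `randread` and `randwrite`, gives numbers that can be compared directly between the root filesystem and the mounted volume.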

Could I please have the GitHub issue link so I can follow up on this?

Hello @archit3kt,

Sure, I will have the team look into it.

Thank you @Amanda!

If you have the github issue link please provide it :slight_smile:

I am sorry @archit3kt but we don’t share Github links.

I’m very surprised by this answer, since all your development is supposed to be open source and hosted on GitHub, and issues are clearly widely used there.

Could you please elaborate on why you don’t share github links on the forum, despite TF’s wish for transparency ?

Hello @archit3kt, we don’t share links to the private support repo, but you can of course take a look at www.github.com/threefoldtech for our open code software.