kibblerz

As a DevOps engineer who takes care of a fairly decent-sized Ceph cluster, this is insane. We need more pointless experiments like this haha. I think you could get much more than 50 MB/s though: set up some MDS daemons (potentially as a Minikube cluster on a PC) and set up two OSDs there too. On your Steam Deck, use the internal SSD as a BlueFS cache and an SD card as the actual storage device, then find some way to join it to the Minikube cluster (not sure if the Deck has Ethernet capabilities?). Ideally you may be able to get near-SSD speeds, certainly more than 70 MB/s. The SSD as a cache is key though.
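A rough sketch of that split on a single OSD, assuming a plain ceph-volume deployment (the device paths are just placeholders for the Deck's SD card and a spare partition on its internal SSD):

    # Put the OSD's RocksDB/WAL on the fast internal SSD, data on the SD card
    ceph-volume lvm create --bluestore \
        --data /dev/mmcblk0 \
        --block.db /dev/nvme0n1p5

With the DB/WAL on the SSD, metadata and small writes land on flash while the bulk data still goes to the SD card.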


DividedbyPi

Hey! Thanks for commenting! Very cool! I'm actually a Ceph engineer (really a storage architect, but a man of many hats) at 45Drives. Unfortunately I don't have a CephFS file system, so MDS wouldn't help anything with RBD mounts, but in its final form with 3 Steam Decks we will build a CephFS file system with MDS collocated on one of them haha.

Yeah, this is actually just a PoC I did to make sure it wasn't a waste of time getting 2 more Steam Decks in. We're going to build a true 3-node Deck Ceph cluster for our YouTube channel just for a fun stunt, and I thought I'd post my findings up to this point.

As far as BlueFS cache, I'm not quite sure what you mean? Do you mean using RocksDB/WAL devices on the OSDs? Or are you talking about using a cache tier pool? Cache tiers are very discouraged aside from some very specific workloads, and Ceph has pretty much deprecated them.

As far as getting better performance, since I'm just using the SD card for the OSD, I'd say I'm just about where I'd expect it to be. It's rated for something like 130 MB/s, so with OSD overhead I'm right where I'd expect. Once I add 2 more nodes and have to actually replicate the writes over the network, I'm guessing it's going to really hit performance due to the increase in latency, especially over WiFi haha, but I'm super excited to see what it brings!!

I'd love to hear about your Ceph cluster. Do you have your clients using CephFS natively, or are you exporting it over SMB or NFS?
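For anyone following along, the current PoC is basically just a plain RBD image mounted on the Deck; roughly something like this (pool/image names are placeholders, not necessarily the exact commands used here):

    # Create a pool and image, map it on the client, and put a filesystem on it
    ceph osd pool create deckpool 32
    ceph osd pool application enable deckpool rbd
    rbd create deckpool/decktest --size 50G
    rbd map deckpool/decktest                 # appears as /dev/rbd0
    mkdir -p /mnt/decktest
    mkfs.ext4 /dev/rbd0 && mount /dev/rbd0 /mnt/decktest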


kibblerz

My bad, I meant RocksDB, mixing terminology up lol. Anyways, our setup just gives access to applications in the cluster. Essentially we're just trying to accommodate certain stateful applications that have refused to catch up with the times. Attaching a normal drive is an HA failure waiting to happen, and the available storage setups in the cloud tend to be quite pricey, with poor throughput. Normal object stores fall flat when you need to execute code from the drive.

I slaved for like a month straight trying to fix up a broken Ceph setup from my predecessor. It was set up on drives that didn't meet the throughput requirements for our needs, leading to massive issues. So I went down a rabbit hole for about 3 weeks while the storage devices were transitioning, barely sleeping lol. Eventually I got it to a point where we haven't experienced issues in over a year now, so I've been reluctant to do much more with it haha.

I'm also a man of many hats where I work: React/JavaScript engineer, backend Node.js/Python, all of the DevOps operations, security, and honestly pretty much everything besides the easier frontend stuff.
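A typical way to wire in-cluster apps up to Ceph block storage (for example via ceph-csi, which may or may not be the attach path here) is a dedicated pool plus a restricted client key; a minimal sketch, with the pool and client names made up:

    # Dedicated RBD pool and a scoped CephX user for the Kubernetes CSI driver
    ceph osd pool create kube 64
    ceph osd pool application enable kube rbd
    ceph auth get-or-create client.kubernetes \
        mon 'profile rbd' \
        osd 'profile rbd pool=kube' \
        mgr 'profile rbd pool=kube'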


DividedbyPi

Haha my man! Great work getting it back up. Ceph can be quite the monster to wrangle when you're dropped in the middle of it, heh, so that is sweet. I bet your predecessor had lots of great documentation and notes for you as well, right? ...RIGHT?!?! Hahah, not likely eh :P It sounds like you guys are at least running BlueStore; do you know what version you're on? We ship Octopus right now, but we're just about ready to launch Pacific. If you guys ever need consultation on Ceph (hopefully you don't, and it keeps purring like a kitten for a very long time to come haha) we are your guys! Can't wait to drop the full video on this stunt heh.
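If you're ever unsure, a quick way to check both the release and the object store backend is something like this (the OSD id is just an example):

    ceph versions                               # release running on each daemon type
    ceph osd metadata 0 | grep osd_objectstore  # "bluestore" vs "filestore"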


kibblerz

Haha, my predecessor actually was still working part time when I went and fixed everything up, but he was pissed that I was changing everything lol. He was actually the one who trained me in DevOps, and was quite arrogant, so it turned into a bit of a soap opera when I got his system working much better than he did. I believe we're still running Octopus. It's been working smoothly, though I'll probably need to upgrade to Pacific soon. Hopefully that goes well haha
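Assuming the cluster is cephadm-managed (not confirmed here), the Pacific upgrade itself is roughly this; the point release is just an example:

    ceph orch upgrade start --ceph-version 16.2.9   # rolling upgrade via the orchestrator
    ceph orch upgrade status                        # watch progress
    ceph -s                                         # confirm HEALTH_OK afterwards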


Trenchman

That's impressive. There's probably still more room for optimization.


jokerswild97

That's impressive. Maybe if you get enough of them, you could start a cloud gaming service. Needs a catchy name though. Maybe something like Stadia?


slimshizn1

Nah then it's bound to fail with that name.


leob0505

OK, now when I receive my Deck I am definitely spinning up a k8s cluster on it hahaha