11/01/2022 SWGEmu and Service Provider Change



Lolindir
11-01-2022, 02:48 PM
SWGEmu and Service Provider Change

11/1/2022
The SWGEmu Team


Many of you know we moved our servers in 2017 to packet.net's bare metal servers. Through a gracious arrangement with packet.net, we were able to secure multi-year discounted prices that gave us the considerable resources we needed to run our services. In March 2020, Equinix acquired packet.net and continued to honor our contract. However, as they look to optimize their data center footprint, they have decided to shut down packet.net's old data center that hosts our services. They have given their customers until 30 November 2022 to shut down all servers in the old data centers.

We are looking into solutions where we can get bare metal hosts or one of the top three public cloud providers near the same geolocation as our current servers (EWR - New Jersey). The requirements for our servers are quite high compared to basic shared hosting, while our latency and reliability requirements go beyond many “cheap server” providers. Our current systems have 256G of RAM, Intel CPUs with 48 threads, and 960G of mirrored SSD/NVMe storage, combined with 10Gbps access to the internet and 4TB of outbound transfer. Our previous deal with packet.net provided all this for about $1,200.00 USD/month.

The publicly listed cost for equivalent servers in the new Equinix data centers is about 2.7 times more expensive than our current monthly costs. Unfortunately, our current donations will not support those costs, so we need to optimize our services. To this end, we will be running an event to bring Basilisk to a final conclusion, and we will be testing other service providers to find a price/quality match that will meet our needs going forward.
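
For scale, here is a quick back-of-envelope sketch of what that multiplier means (simple arithmetic on the figures above; the totals are estimates, not quoted prices):

current_monthly = 1200.00        # USD/month under the old packet.net deal
equinix_multiplier = 2.7         # publicly listed equivalent pricing

new_monthly = current_monthly * equinix_multiplier
annual_increase = (new_monthly - current_monthly) * 12

print(f"Estimated new cost:  ${new_monthly:,.2f}/month")   # ~$3,240.00/month
print(f"Added cost per year: ${annual_increase:,.2f}")     # ~$24,480.00/year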

This move is complicated; most people don't know how complex our environment is. We have 35 services running, including registration, forums, archives, the support portal, game servers (Login, Nova, TC-Prime, Finalizer, Basilisk), and DevOps tooling (Jenkins, Gerrit, and build tools). These are all managed in a Kubernetes cluster spread across the worker nodes.
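
For the curious, here is a minimal sketch of how a cluster like this can be inventoried with the standard Kubernetes Python client (illustrative only, not our actual tooling, and it assumes admin kubeconfig access):

from collections import Counter
from kubernetes import client, config

config.load_kube_config()              # authenticate via local kubeconfig
v1 = client.CoreV1Api()

# Count running pods per worker node across all namespaces.
pods = v1.list_pod_for_all_namespaces(watch=False)
per_node = Counter(p.spec.node_name for p in pods.items if p.spec.node_name)

print(f"{len(pods.items)} pods across {len(per_node)} nodes")
for node, count in per_node.most_common():
    print(f"  {node}: {count} pods")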

While we will do our best to make the move painless for the community, it's important to remember we are a 100% volunteer org, and our team members have day jobs, family, friends, and other commitments outside the project.

In the coming days, the EC team will announce an event to end Basilisk so we can avoid moving more than 650GB of disk usage, 128GB of RAM usage, and many terabytes of backups. This is the first of several steps we will take to simplify our environment to optimize for cost and reduce the time needed to move our services to new hosting.

When we are ready to cut over Finalizer, we will post a notice with the actual date of the move, both on the forums and in-game. We will move other services incrementally as we progress to the new environment, hopefully with minimal disruption.

Thank You,
~SWGEmu Staff

SLiFeR
11-01-2022, 06:37 PM
Good update, thank you.

Livy2K
11-01-2022, 06:47 PM
It's tough when economic realities impact the virtual world like this.

I'm sure many people will miss Basilisk and its ability to offer the most accurate pre-CU experience since the CU was patched in.

Tomahawk
11-02-2022, 07:51 AM
All this extra work for you :(

Thank you for all your amazing work! And goodbye, Bas!

neopixie
11-02-2022, 11:47 AM
3 questions:

1) Why does the server 'need' to be in New Jersey?
2) Why does it 'need' a 10Gbit pipe? Although not game related, we're running a financial server, an order processing system, a webhost, B2B and global manufacturing servers from our in-house datacenter on a 1Gbit pipe, with roughly 15-20k x .5-2MB worth of data every hour, and we never hit above 250Mbit, let alone 1Gbit... so your need for 10Gbit seems a little overkill.
3) How on earth are you paying $1,200 a month for that? Giving the benefit of the doubt on the 'need' for 10Gbit, having the server outside of New Jersey, and looking at other providers, I was able to find a 2x AMD 7551 (32/64x2), 256GB DDR4, 2x 1TB NVMe, 2x 8TB HDD for €575 ($568.96) a month.

Not a dig at the project, as you're doing great work so far. Just honest questions that now seem like a good time to ask: why you're paying what you're paying, and why you need what you have.

Lolindir
11-02-2022, 03:20 PM
3 questions:
1) Why does the server 'need' to be in New Jersey?
Because latency should be about the same for all players after the move as it is now, somewhere close latency-wise to this geolocation will work best. It doesn't have to be in NJ; that's just where it is now. When we moved from Dallas to NJ, there was no end of complaints from players who noticed changes in latency.

2) Why does it 'need' a 10Gbit pipe? Although not game related, we're running a financial server, an order processing system, a webhost, B2B and global manufacturing servers from our in-house datacenter on a 1Gbit pipe, with roughly 15-20k x .5-2MB worth of data every hour, and we never hit above 250Mbit, let alone 1Gbit... so your need for 10Gbit seems a little overkill.
When we have events and 1,500 people are on at once, we easily exceed 1Gbit peaks. Also, our offsite backups take several hours even with 10Gbit, and we don't want to stop doing regular full backups.
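
To put those numbers in perspective, here is a rough transfer-time sketch (sizes are ones mentioned in this thread; link speeds are nominal and ignore protocol overhead, so real backups take longer):

def transfer_hours(size_gb: float, link_gbps: float) -> float:
    """Hours to move size_gb gigabytes over a link_gbps link."""
    return (size_gb * 8) / link_gbps / 3600

# 650GB ~= Basilisk's online disk; 6000GB ~= the 6+ TB of off-site backups.
for size_gb in (650, 6000):
    for link_gbps in (1, 10):
        hours = transfer_hours(size_gb, link_gbps)
        print(f"{size_gb:>5} GB @ {link_gbps:>2} Gbit/s: ~{hours:.1f} h")
# 650GB: ~1.4h at 1Gbit vs ~0.1h at 10Gbit; 6TB: ~13.3h vs ~1.3h.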

3) How on earth are you paying $1,200 a month for that? Giving the benefit of the doubt on the 'need' for 10Gbit, having the server outside of New Jersey, and looking at other providers, I was able to find a 2x AMD 7551 (32/64x2), 256GB DDR4, 2x 1TB NVMe, 2x 8TB HDD for €575 ($568.96) a month.
We have 3 servers, not just 1. The primary runs Finalizer, the Login server, and other latency-sensitive workloads. The secondary runs TC-Prime and the build environment (i.e. compiling and running tests, and pushing the Docker images), plus other services like a local Docker registry, the archived forums, the live forums, CSR tooling, Jenkins, Gerrit, TC-Nova, and a host of other things we can "pile on" in one box without latency concerns. The third server runs all our management tools and several VMs with things like DBs or isolated Docker builders. We think we can optimize that one out, but it won't be simple.

Not a dig at the project, as you're doing great work so far. Just honest questions that now seem like a good time to ask: why you're paying what you're paying, and why you need what you have.
This project not only runs a server for players; it runs many services to support the community, CSRs, devs, and the testing servers. Plus we do hourly snapshots and daily snapshots, and we ship backups off-site daily and weekly.
Also, in the past our ISP had numerous networking issues from being highly oversubscribed, and when we needed support it took them literally a week to reply to tickets. With packet.net we had much better support and almost zero network issues, and the one hardware issue we had, they helped us with right away rather than pointing fingers at us.
All this goes to say we don't want to move to some fly-by-night provider, or a cheap low-quality one, nor one who packs their servers so tight their buildings catch on fire.

Just to show off some of the loads we have. Remember, despite the 'Mb' label, these charts are in megabytes, not megabits.
https://i.imgur.com/AS2xADS.png
https://i.imgur.com/iN328l3.png
https://i.imgur.com/bRYOAZ8.png

neopixie
11-02-2022, 03:59 PM
Great to see an actual answer to questions many have wondered about over the years; respect for that.

It is a shame, especially price-wise. With Lolindir also being in the EU/Nordic area, I hope he can agree: dedicated servers and data centers are cheap as chips here, especially with the likes of Ionos, 123 and Hetzner.

Has the idea of just buying hardware and colocating ever come into consideration?

neopixie
11-02-2022, 04:17 PM
Also, those pictures are megabits (Mb), not megabytes (MB)...

robegan99
11-02-2022, 11:36 PM
If you're considering the big three cloud providers, let me know if you can use any assistance in calculating and comparing costs. I work for a company with enormous footprints in all three, and have quite a bit of experience with these sorts of migrations.

neopixie
11-09-2022, 04:21 PM
With the shutdown of Bas now pretty much confirmed...

Can we get some insight into what the donations would now actually be funding?

Especially with the data that has been shown, the server is not even coming close to 1Gbit of bandwidth, let alone 10Gbit.

lordkator
11-10-2022, 11:10 AM
With the shutdown of Bas now pretty much confirmed...

Can we get some insight into what the donations would now actually be funding?

Especially with the data that has been shown, the server is not even coming close to 1Gbit of bandwidth, let alone 10Gbit.

I've been away with, well, you know, Real Life(tm): had to move, big changes at work, need to pay my personal bills, etc.

The pictures shared by Lolindir were from the ISP's network report, which is in MB, i.e. megabytes (we pay by megabytes per month), not megabits.
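
Since the units keep coming up: the whole disagreement is a factor of 8. A minimal illustration (the 250 is an arbitrary example value, not a real reading):

reading = 250  # example value read off a bandwidth chart

print(f"{reading} MB/s   = {reading * 8} Mbit/s")   # 250 MB/s   = 2000 Mbit/s
print(f"{reading} Mbit/s = {reading / 8} MB/s")     # 250 Mbit/s = 31.25 MB/s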

Also, internet speed is only one of the many network criteria; latency, loss, jitter, and availability are important.

Basilisk consumes 600GB+ of online disk and 64GB of RAM, but now Finalizer actually outstrips it in RAM and CPU and is starting to build up quite an online disk footprint.

This is part of why we're optimizing Basilisk out of the environment: it's consuming resources while supporting only a handful of players.

Also, it's yet another server we have to support: keeping track of it, handling crashes, and dealing with support questions, all by 100% volunteer staff.

As stated by Lolindir, the environment runs many services; Basilisk is just one of them. You can see the current setup here (https://www.swgemu.com/forums/content.php?r=359).

I think I can remove one of the three servers by pushing all its services onto the secondary server. It'll add some lag to the forums, but that's not super important. It will also slow down builds, which is annoying, but that's life.

And donations not only pay for the servers but also our infra costs: the GitHub org, off-site backups (90 days = 6+ terabytes), domain name registrations, and so on. And we had a great deal for the servers and network from packet.net.
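
As a rough sanity check on what that retention implies (a sketch using only the figures above):

retention_days = 90     # off-site retention window
total_tb = 6.0          # "6+ terabytes" over that window

avg_gb_per_day = total_tb * 1000 / retention_days
print(f"~{avg_gb_per_day:.0f} GB/day shipped off-site on average")  # ~67 GB/day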

However, those days are past and we have to move. And no, we're not moving to a data center provider whose data centers burn up (https://www.reuters.com/article/us-france-ovh-fire/millions-of-websites-offline-after-fire-at-french-cloud-services-firm-idUSKBN2B20NU) because they pack their servers too tightly, nor are we moving to an EU-based location.

The project is a US-based legal entity and does not have the resources to deal with other countries' laws and regulations, or other challenges such as currency exchange or filing taxes abroad.

The live environment is not just a single simple server. Remember, we run a CI/CD pipeline (Jenkins, Gerrit, Docker), the login server, Finalizer, multiple test servers (Nova, TC-Prime), global registration, bug reporting, forums, archives, CSR/support tooling, a support ticketing system, Eye of Sauron for anti-cheat analytics and alerting, primary and replica databases to support all of these, and monitoring and alerting. In total, there are 38 production containers and 3 VMs running on the three servers.

Finalizer has seen peaks of 2,500 logins a day (1,725 online at a time), and at those times we're not only burning network bandwidth, we're also burning lots of RAM, and CPU is often right at the edge even with 24 cores and 48 hyperthreads.

The reality is we've gotten away with a 55%+ discount on our servers for years now, and the donations barely cover that these days.

We will do our best to optimize the services we support to keep the project going, but over time, if donations go down, we will have to shut down more services and combine others onto a smaller footprint, and people will just have to deal with the latency and availability issues.

Oh, and don't forget all of us are 100% volunteers, we run all this in our spare time, and the last thing we need is more randomness injected by running this all in a closet somewhere.

PS: This move is already costing me 20 hours a week in prep; when it's done, it'll be 160+ hours of my life gone. Please be careful about waving hands and saying things are easy; they are not, and I know firsthand.

Praxi34
11-10-2022, 12:57 PM
I've been away with, well, you know, Real Life(tm): had to move, big changes at work, need to pay my personal bills, etc. [...]

Brilliant work, LK. A great explanation to add to your already big workload... We appreciate you and the team massively! Hopefully the donations stay up and we don't have to downsize! Especially with what's going on in the world financially, streamlining and getting rid of the 'non-essentials' is good project management!

neopixie
11-10-2022, 01:00 PM
PS: This move is already costing me 20 hours a week in prep; when it's done, it'll be 160+ hours of my life gone. Please be careful about waving hands and saying things are easy; they are not, and I know firsthand.

No one here said it was easy?

I, among others, pretty much do this daily as a Real Life(tm) job and understand it can be a royal pain in the breasticles. It's more a curiosity question: how are you hitting such high server usage and bandwidth (although I'd maybe get your ISP to display MB and not Mb) when a certain other 'place in a galaxy far, far away' is able to run 'double' the pop at a fraction of the cost?

pugguh
11-10-2022, 03:12 PM
I've been away with, well, you know, Real Life(tm): had to move, big changes at work, need to pay my personal bills, etc. [...]

Thanks for the update LK.

seahawk99
11-10-2022, 10:01 PM
Thanks for the info and all the time and effort you put into this project.

meidae
11-18-2022, 09:32 AM
The 30th is getting closer; have you managed to find a new solution yet? Also, is there any interview or similar planned with Mobyus that we can look forward to? I would love to hear what we can expect in the near future, and some reflections on the past year.



neopixie
11-21-2022, 04:55 PM
The 30th is getting closer; have you managed to find a new solution yet?

Given that they haven't answered the questions, and the radio silence... I guess... no.

lordkator
12-03-2022, 02:33 PM
All services moved as of 11/29/2022 09:30a Eastern Time.

Thank you for your patience.

cmurphy50
12-03-2022, 05:47 PM
All services moved as of 11/29/2022 09:30a Eastern Time.

Thank you for your patience.
Pop pop! Great job!

pugguh
12-03-2022, 07:53 PM
All services moved as of 11/29/2022 09:30a Eastern Time.

Thank you for your patience.

Thanks for all you do Sir...

neopixie
12-04-2022, 04:27 PM
All services moved as of 11/29/2022 09:30a Eastern Time.

Thank you for your patience.

What, where, and how much is this new platform?