Cloud Engineering weeknotes, 19 November 2021

More documentation this week, including a draft “Team ways of working” document that has really made me think. While writing it, I looked back to our first show & tell a year ago, when we set out our principles and values, and I really do believe we have held true to them. The fact that we follow them without consciously thinking about it is a good sign that they genuinely describe who we are as a team.

We are tantalisingly close to finishing some huge pieces of work. Our new firewalls are in, Panorama is being deployed for central management, and we have a host of improvements to GlobalProtect lined up. One significant change coming in the near future will be new, separate URLs depending on where an application is hosted: the majority of applications will be on gp-apps.hackney.gov.uk, and any applications hosted in our own AWS, like Qlik, will be on gp-vpn.hackney.gov.uk.

Unfortunately, there’s been no real progress on account migrations. We are ready to go on the Advanced e5 account, with a new VPN, but delays at Advanced mean that this will now not happen before next Tuesday. We are also still dealing with competing priorities in MTFH, but are meeting with the lead developers later today to unblock that. Until the Housing accounts are moved, we cannot move the API accounts. 

However, we can clean up the API accounts in the meantime. The last significant group of apps to be moved to a new account is the GIS apps, such as Earthlight and LLPG. We have five EC2 instances and an RDS database to move, and the infrastructure to do so is just about ready.

This week, we’ve noticed some issues in the platform, and have taken steps, or will take steps, to fix them. For example, we noticed that our Backups module wasn’t operating as expected: none of the backups older than 30 days had been deleted. We identified a missing line in the code, which has been fixed, and all the old snapshots have been purged. Retaining those old backups carried a cost, so our S3 bill should fall a bit next month.
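For anyone curious what that retention logic amounts to: our real Backups module lives in Terraform, but here is a rough, illustrative boto3 sketch of the 30-day purge it is meant to enforce. The bucket name is a placeholder, not one of ours.

```python
"""Illustrative sketch only: purge backup objects older than 30 days.

The actual Backups module is Terraform; this is just a boto3 equivalent
of the retention rule that was missing. Bucket name is a placeholder.
"""
from datetime import datetime, timedelta, timezone

import boto3

RETENTION_DAYS = 30
BUCKET = "example-backups-bucket"  # placeholder, not a real bucket

s3 = boto3.client("s3")
cutoff = datetime.now(timezone.utc) - timedelta(days=RETENTION_DAYS)

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET):
    for obj in page.get("Contents", []):
        # Delete anything last modified before the 30-day cutoff.
        if obj["LastModified"] < cutoff:
            s3.delete_object(Bucket=BUCKET, Key=obj["Key"])
            print(f"Deleted {obj['Key']} ({obj['LastModified']:%Y-%m-%d})")
```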

Costs have been on my agenda this week, so we ran a report to identify over- and under-provisioned EC2 instances, and the recommendations have been shared. Some of the recommended changes don’t save a lot individually, but if we accepted all of them, we could save in the region of $10,000 per month (based on 24/7 usage). And that’s before Savings Plans, which we’re talking to AWS about.
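This isn’t how the report above was produced, but if you want to pull similar rightsizing data yourself, something along these lines works against AWS Compute Optimizer via boto3, assuming Compute Optimizer is enabled on the account and your default region/profile are set.

```python
"""Sketch: list EC2 rightsizing findings from AWS Compute Optimizer.

Assumes Compute Optimizer is enabled for the account you run this in,
and uses your default AWS region/profile.
"""
import boto3

client = boto3.client("compute-optimizer")

token = None
while True:
    kwargs = {"nextToken": token} if token else {}
    resp = client.get_ec2_instance_recommendations(**kwargs)

    for rec in resp["instanceRecommendations"]:
        current = rec["currentInstanceType"]
        # recommendationOptions is ranked; the first entry is the top suggestion.
        suggested = rec["recommendationOptions"][0]["instanceType"]
        print(f"{rec['instanceArn']}: {rec['finding']} ({current} -> {suggested})")

    token = resp.get("nextToken")
    if not token:
        break
```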

EC2 is now our single most expensive line item. Please make sure your non-production EC2 instances are powered down overnight by enabling the scheduler tags in Terraform.
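Conceptually, the scheduler just stops running instances that carry the opt-in tag. The sketch below shows the shape of that in boto3; the tag key and value here are placeholders, so check the Terraform module for the real ones rather than copying these.

```python
"""Sketch of what the overnight scheduler does conceptually: stop any
running instance carrying an opt-in tag. The tag key/value below are
placeholders; the real ones come from the Terraform module.
"""
import boto3

ec2 = boto3.client("ec2")

# Find running instances that have opted in to overnight shutdown.
reservations = ec2.describe_instances(
    Filters=[
        {"Name": "tag:scheduler", "Values": ["overnight-shutdown"]},  # placeholder tag
        {"Name": "instance-state-name", "Values": ["running"]},
    ]
)["Reservations"]

instance_ids = [
    inst["InstanceId"]
    for res in reservations
    for inst in res["Instances"]
]

if instance_ids:
    ec2.stop_instances(InstanceIds=instance_ids)
    print(f"Stopping {len(instance_ids)} tagged non-prod instances")
```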

Finally, we are ripping up our roadmap next week, for the second time since we started. We now have a much better understanding of where we are and what is needed, and some of the things we originally envisaged are either no longer necessary or no longer possible. We would welcome input into this, and feedback once it’s drafted, so if there’s anything you think should be included, please let us know.
