Data Platform Project weeknotes 29/10/2021

What is the Data Platform project?

Hackney has long viewed its data as a strategic asset that has the potential to deliver insights to help us make better decisions and improve the lives of our residents. Behind the scenes of a statistic in a report or a dashboard are the tools, processes and infrastructure needed to get access to our data, move it, store it and transform it so that it can be used for analysis. That’s where a data platform comes in.

A data platform is an integrated technology that allows data located in data sources to be governed, accessed and delivered to users, data applications, or other technologies. We’re using the recovery from the cyber attack as an opportunity to ‘build back better’ for the future and deliver a secure, scalable, reusable cloud-based data infrastructure that brings together the council’s key data assets. We want our data platform to help us democratise access to data (where appropriate), use technology to enable deeper insight, and derive greater value from our data to improve the lives of residents.

In practice, our data platform will be composed of a number of different elements:

  • Data Lake – a centralised repository to store data all in one place, and a set of loosely coupled processes to ingest, process and publish that data (see diagram below).
  • Playbook – documentation of the platform’s tools and processes, along with best practices for interacting with them.
  • Data catalogue – documentation and metadata about specific datasets, columns etc.

Feedback on outputs from the Data Platform

The Data Platform team previously worked to bring together disparate repairs data into a single, cleaned repairs dataset that could be joined with other data on resident vulnerability. Our goal was to identify households that hadn’t had a recent repair and were potentially more vulnerable, so that the council could make proactive contact to check in on both the needs of the resident and the condition of the property. Through this work, we were able to give the new Link Work team (who provide targeted, holistic and proactive support to residents) a list of residents aged 70+ who were living alone and hadn’t had a repair in two years or more. Link Workers have started to make calls to these residents, and we have been receiving some excellent feedback on how well targeted these interventions have been.
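As an illustration only, the selection described above (aged 70+, living alone, no repair in two years or more) could be sketched in Python as below. The Resident fields, the function names and the choice to treat "no repair on record" as "no recent repair" are assumptions made for this example; they are not the schema or logic the team actually used.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Resident:
    # Hypothetical fields; the real joined dataset will look different.
    resident_id: str
    age: int
    lives_alone: bool
    last_repair: Optional[date]  # None means no repair on record


def years_before(day: date, years: int) -> date:
    """Return the same calendar day `years` earlier (29 Feb falls back to 28 Feb)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


def needs_proactive_contact(resident: Resident, today: date,
                            min_age: int = 70,
                            years_without_repair: int = 2) -> bool:
    """True if the resident meets all three criteria from the paragraph above."""
    cutoff = years_before(today, years_without_repair)
    # Assumption: a resident with no repair on record counts as "no recent repair".
    no_recent_repair = resident.last_repair is None or resident.last_repair <= cutoff
    return resident.age >= min_age and resident.lives_alone and no_recent_repair


def select_for_contact(residents: List[Resident], today: date) -> List[str]:
    """Return the IDs of residents to pass to the Link Work team."""
    return [r.resident_id for r in residents if needs_proactive_contact(r, today)]
```

Keeping the thresholds as parameters would let analysts try different criteria (for example a different age band) without changing the logic.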

The Link Workers recently dealt with the needs of a resident in her mid-70s who lives alone and has health conditions that limit her mobility. She had outstanding repairs issues and her property was very cluttered, but she had worried about raising things with her TMO because she was afraid of how people would perceive her. She had also struggled with navigating the benefits system to claim Attendance Allowance.

The Link Work team proactively reached out to her because of the insights they were able to surface from the data platform. She said it was ‘a blessing’ to have someone check in on her and felt a weight had been lifted. She’s now receiving food support, has had a visit from a therapeutic decluttering service, and is getting financial advice.

Our challenges

Our main goal was to start ingesting Council Tax data from Academy. In addition, we continued our work on making sure analysts are able to use planning data in the platform with limited support from the team.

Meeting these sprint goals has been challenging. We as a team have experienced a lot of frustration due to our inability to get access to the data we need. On numerous occasions the means of accessing the data have been insufficient for bulk reporting.

Getting Council Tax data from Academy

Academy is the Capita-owned system that contains records relating to Council Tax, Housing Benefit and Business Rates. It uses an INGRES database, which is notoriously difficult to extract data from but has massive potential for analytics.

After carrying out a research ‘spike’ to decide on the best approach to ingesting Council Tax data from Academy, we found that there is an API, but it doesn’t support the bulk downloads we’d need to use it effectively. We also knew that it wouldn’t be good practice to connect straight to the database and run complex queries, as this could slow down the Academy software.

We investigated restoring the Academy app from a disk backup but discovered that the encryption keys don’t allow sharing across AWS accounts. We also investigated creating a ‘read’ replica. Ideally, this would mean a copy of the database sitting in the Academy account that we could connect to and query. However, this is currently blocked by a lack of access and requires some negotiation with the vendor.

Getting Planning Data from Tascomi

Tascomi is the system used by the planning team. We have been provided with an API by the vendor to access planning data.

The first version of the Tascomi data workflow has been deployed. This means that we are now able to get a change-only update each day of new or amended records and add this to our previous snapshots to create a full dataset. We are currently onboarding a planning data analyst on the use of Qlik and Athena. Another data analyst (Adam from the Data and Insight team) is also helping us with further refinement.
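To make the "change-only update plus previous snapshot" idea concrete, here is a minimal sketch in plain Python. It is not the team's actual workflow, which this weeknote does not describe; the field names id and last_updated are assumptions for illustration and are not taken from the Tascomi API.

```python
from typing import Dict, Iterable, List

Record = Dict[str, str]


def apply_increment(snapshot: Iterable[Record],
                    increment: Iterable[Record],
                    key: str = "id",
                    updated: str = "last_updated") -> List[Record]:
    """Merge a daily change-only update into the previous full snapshot.

    For each key, keep the record with the latest `updated` value.
    Assumes `updated` holds ISO 8601 timestamps in a single consistent
    format, so that string comparison matches chronological order.
    """
    latest: Dict[str, Record] = {}
    # Snapshot records are processed first, so on a timestamp tie the
    # increment's version of a record wins.
    for record in list(snapshot) + list(increment):
        current = latest.get(record[key])
        if current is None or record[updated] >= current[updated]:
            latest[record[key]] = record
    # Sort by key so the new snapshot has a stable, repeatable order.
    return sorted(latest.values(), key=lambda r: r[key])
```

Running this once a day, with yesterday's output as the snapshot and today's API response as the increment, would produce a new full dataset each day. New records are added, amended records replace their older versions, and unchanged records carry over.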

However, we continue to have some challenges with the vendor, Tascomi, turning the API access on and off without prior warning. 

Next Show & Tell – Friday 12th November
Our next Show & Tell is on the 12th of November at 12-12.30pm. Come along to find out more about what we are up to and invite others who may be interested (the calendar invite is open). Email if you need any help. For anyone who can’t make it, don’t worry: we will record the session and post it on Currents/Slack afterwards.