Data Platform Project Weeknotes 14: 08.12.2021

Data Platform Project

For more information about the HackIT Data Platform project please have a look at this weeknote on the HackIT blog.

Enabling MMH to push data to the platform

We have been working on setting up a template for the Manage My Home (MMH) team to use to set up and create their project. The template will minimise the repetitive set-up tasks expected of developers collaborating with the data platform team (and with Kafka specifically). It will also speed up the process of sending data from the MMH app to Kafka so that it can be ingested into the data platform. Kafka is open-source software that provides a framework for storing, reading and analysing streaming data.
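To illustrate the kind of boilerplate the template removes, here is a minimal sketch of an app publishing a JSON event to Kafka. The envelope fields, topic name and broker address are illustrative assumptions, not the real MMH schema:

```python
import json
from datetime import datetime, timezone


def build_event(entity_id, event_type, payload):
    """Wrap a record in a minimal JSON event envelope for Kafka.

    The field names here are illustrative only, not the real MMH schema.
    """
    return json.dumps({
        "entityId": entity_id,
        "eventType": event_type,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "data": payload,
    }).encode("utf-8")


# Publishing with the kafka-python library (requires a running broker;
# topic and broker names below are hypothetical):
#
# from kafka import KafkaProducer
# producer = KafkaProducer(bootstrap_servers="broker:9092")
# producer.send("mmh-tenure-events", build_event("123", "TenureUpdated", {"field": "value"}))
# producer.flush()
```

Centralising details like serialisation and broker configuration in a template means each collaborating team only has to supply its own payloads and topic names.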

Moving Qlik to the production environment

We are in the process of moving the business insight tool Qlik away from the Productions API account and onto an AWS production Data Platform account. This will allow us to stop accessing Qlik through AppStream, and we hope it will also make the use of Qlik a lot more straightforward for its users. The process involves replicating existing connections to data sources. So far we have connected to the Social Care tool, which is in the Mosaic Productions AWS account, and we are currently looking at connecting up the Repairs Hub data.

Refining Our Product Roadmap

We continue to refine our product roadmap. We have now established around 25 user needs across four distinct user groups.

We then worked to find common user needs across these groups in order to establish the large-scale tasks we need to prioritise. In the image below, we mapped the user needs to gauge the complexity and difficulty of each task. We now have a clearer understanding of what we have achieved so far and what remains to be done. All of the user needs highlighted in yellow already have some work in progress.

There is still a lot of work to be done, but we are very proud of what has been achieved so far. We will be going into more detail on the development of the roadmap, as well as answering any questions you may have.

Establishing a process for supporting users

We came together to look at the best way to support our collaborators with requests for help. Keeping track of user support requests, and of the work that needs to be prioritised or is already in progress, has become somewhat unmanageable. We have come up with a solution for tracking requests and will be sharing the proposal with some of our users this week.

Next Show & Tell – Friday 10th of December

Our next Show & Tell is on the 10th of December, from 12 to 12.30pm. Come along to find out more about what we are up to, and invite others who may be interested (the calendar invite is open). Email if you need any help. For anyone who can’t make it, don’t worry: we will record the session and post it on Currents/Slack afterwards.

Data Platform Project weeknotes 29/10/2021

What is the Data Platform project?

Hackney has long viewed its data as a strategic asset that has the potential to deliver insights to help us make better decisions and improve the lives of our residents. Behind the scenes of a statistic in a report or a dashboard are the tools, processes and infrastructure needed to get access to our data, move it, store it and transform it so that it can be used for analysis. That’s where a data platform comes in.

A data platform is an integrated technology that allows data located in data sources to be governed, accessed and delivered to users, data applications, or other technologies. We’re using the recovery from the cyber attack as an opportunity to ‘build back better’ for the future and deliver a secure, scalable, reusable cloud-based data infrastructure that brings together the council’s key data assets. We want our data platform to help us democratise access to data (where appropriate), use technology to enable deeper insight, and derive greater value from our data to improve the lives of residents.

In practice, our data platform will be composed of a number of different elements:

  • Data Lake – a centralised repository to store data all in one place, and a set of loosely coupled processes to ingest, process and publish that data (see diagram below)
  • Playbook – documentation of the platform’s tools and processes, along with best practices for interacting with them
  • Data catalogue – documentation and metadata about specific datasets, columns etc.

Feedback on outputs from the Data Platform

The Data Platform team previously worked to bring together disparate repairs data into a single, cleaned repairs dataset that could be joined with other data on resident vulnerability. Our goal was to identify households that hadn’t had a recent repair and were potentially more vulnerable, so that the council could make proactive contact to check in on both the needs of the resident and the condition of the property. Through this work, we were able to give the new Link Work team (who provide targeted, holistic and proactive support to residents) a list of residents aged 70+, living alone, who hadn’t had a repair in two years or more. Link Workers have started to make calls to these residents, and we have been receiving some excellent feedback on how well targeted these interventions have been.
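The join described above can be sketched in a few lines of pandas. The column names and data below are illustrative assumptions, not the real datasets:

```python
import pandas as pd

# Illustrative data: column names and values are assumptions, not the real schema.
residents = pd.DataFrame({
    "person_id": [1, 2, 3],
    "age": [74, 68, 81],
    "lives_alone": [True, True, True],
})
repairs = pd.DataFrame({
    "person_id": [1, 2, 3],
    "last_repair_date": pd.to_datetime(["2018-01-10", "2021-06-01", "2019-03-05"]),
})

# Residents aged 70+, living alone, with no repair in the last two years.
cutoff = pd.Timestamp("2021-10-29") - pd.DateOffset(years=2)
merged = residents.merge(repairs, on="person_id", how="left")
target = merged[
    (merged["age"] >= 70)
    & merged["lives_alone"]
    & (merged["last_repair_date"] < cutoff)
]
print(target["person_id"].tolist())  # residents to contact proactively
```

The real work also involved cleaning and deduplicating several source repairs datasets before a join like this was possible.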

The Link Workers recently dealt with the needs of a resident in her mid-70s who lives alone and has health conditions that limit her mobility. She had outstanding repairs issues and her property was very cluttered, but she had worried about raising these with her TMO because she was afraid of how people would perceive her. She had also struggled to navigate the benefits system to claim Attendance Allowance.

The Link Work team proactively reached out to her because of the insights they were able to surface from the data platform. She said it was ‘a blessing’ to have someone check in on her and felt a weight had been lifted. She is now receiving food support, has had a visit from a therapeutic decluttering service, and is getting financial advice.

Our challenges

Our main goal was to start ingesting Council Tax data from Academy. In addition, we continued our work on making sure analysts are able to use planning data in the platform with limited support from the team.

Meeting these sprint goals has been challenging. As a team, we have experienced a lot of frustration at being unable to get access to the data we need; on numerous occasions, the available means of accessing the data have been insufficient for bulk reporting.

Getting Council Tax data from Academy

Academy is the Capita-owned system that contains records relating to Council Tax, Housing Benefit and Business Rates. It uses an INGRES database, which is notoriously difficult to extract data from but has massive potential for analytics.

After a research ‘spike’ to decide on the best approach to ingesting Council Tax data from Academy, we discovered that there is an API, but it doesn’t support the bulk downloads we would need to use it effectively. We also knew that it wouldn’t be good practice to connect straight to the database and run complex queries, as this could slow down the Academy software.

We investigated restoring the Academy app from a disk backup, but discovered that the encryption keys can’t be shared across AWS accounts. We also investigated creating a ‘read’ replica: ideally, a copy of the database would sit in the Academy account, which we could connect to and query. However, this is currently blocked by a lack of access and requires some negotiation with the vendor.

Getting Planning Data from Tascomi

Tascomi is the system used by the planning team. We have been provided with an API by the vendor to access planning data.

The first version of the Tascomi data workflow has been deployed. This means that we are now able to get a change-only update each day of new or amended records and add it to our previous snapshots to create a full dataset. We are currently onboarding a planning data analyst on the use of Qlik and Athena, and we are also getting help from another data analyst (Adam from the Data and Insight team) with further refinement.
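The change-only pattern described above can be sketched as a simple upsert: concatenate yesterday’s full snapshot with today’s changes, then keep the newest version of each record. The column names and values are illustrative assumptions, not the real Tascomi schema:

```python
import pandas as pd

# Hypothetical sketch: yesterday's full snapshot plus today's change-only file.
snapshot = pd.DataFrame({
    "application_id": ["A1", "A2", "A3"],
    "status": ["received", "in_review", "decided"],
    "import_date": ["2021-10-28"] * 3,
})
changes = pd.DataFrame({
    "application_id": ["A2", "A4"],        # A2 amended, A4 newly created
    "status": ["decided", "received"],
    "import_date": ["2021-10-29"] * 2,
})

# Keep the most recent version of each record to rebuild the full dataset.
combined = pd.concat([snapshot, changes])
latest = (combined
          .sort_values("import_date")
          .drop_duplicates("application_id", keep="last")
          .sort_values("application_id"))
print(latest[["application_id", "status"]].to_string(index=False))
```

Running this daily keeps the full dataset current while only transferring the records that changed, which keeps load on the vendor’s API low.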

However, we continue to have some challenges with the vendor, Tascomi, turning API access on and off without prior warning.

Next Show & Tell – Friday 12th November

Our next Show & Tell is on the 12th of November, from 12 to 12.30pm. Come along to find out more about what we are up to, and invite others who may be interested (the calendar invite is open). Email if you need any help. For anyone who can’t make it, don’t worry: we will record the session and post it on Currents/Slack afterwards.