Verification Hub: Weeknotes 04/06/19

Verification Hub has been suffering slightly in recent weeks from competing priorities and leave commitments across our part-time project team. We’re approaching the point where we can start live testing the VH in front of Citizen Index in the parking verification process, so we’re redoubling efforts to get there by the end of June.

Our focus this week has been on establishing the key steps to get to live testing over our next 2 sprints. Farthest Gate are working at the moment to connect with our new API and Tom is working with InfoShare to get ClearCore 5 deployed in our live environment. Steve continues to work on the feed of Housing Benefit data and will be prioritising the daily feeds of both HB and Council Tax data over the next sprint. After that, we’re reliant on Matt to make sure that our new API is connected to ClearCore version 5 and deploy it into the production AWS environment. Then it’s all about watching how VH performs in the live environment so we can start tweaking and testing our data inputs and thresholds…

Verification Hub: Weeknotes 15/05/19

We’ve been working this week on the spec for the development required from Farthest Gate to bring the Verification Hub into the parking permit verification process in Liberator. In the first instance, it’ll be put in front of the existing verification steps (after testing in the test environment), including the current Citizen Index. This means we can watch how the VH performs and tweak our matching rules without creating any unnecessary extra effort for either residents or parking staff. We’re still not sure when Farthest Gate will be able to prioritise the work so we may have to consider pausing the project until we get the work scheduled in.

Matthew Keyworth and Tom Clark have managed to successfully connect and authenticate the API to the new ClearCore 5 web service, but are still working through issues with the search, possibly because some field names have changed between version 4 and 5.
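The post doesn’t say which field names changed between ClearCore 4 and 5, so the names below are invented purely for illustration. A minimal sketch of one way to absorb a rename like this is to translate search parameters through a mapping layer before they reach the new web service:

```python
# Hypothetical v4 -> v5 field renames; ClearCore's real schema is not
# documented in this post, so these names are illustrative only.
V4_TO_V5_FIELDS = {
    "surname": "family_name",
    "forename": "given_name",
    "postcode": "postal_code",
}

def translate_search_params(v4_params: dict) -> dict:
    """Rename v4-era search parameters to their assumed v5 equivalents,
    passing any unrecognised fields through unchanged."""
    return {V4_TO_V5_FIELDS.get(key, key): value
            for key, value in v4_params.items()}
```

Keeping the mapping in one dictionary means a future rename is a one-line change rather than a hunt through every call site.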

Steve Farr continues to work to automate and clean up Housing Benefits and housing tenant data to bring it into the VH. Our testing has shown we’re already hitting around 57% verification based on Council Tax alone and we’re confident that the inclusion of these data sets will push that up considerably.

Building and scaling our data science skills

We’re really excited that next month, one of our Data & Insight Analysts, Anna Gibson, will be joining the next cohort on GDS’ Data Accelerator Programme. We’ve heard great things about the programme from previous participants – from Bharath’s blog and from Tom Foster at Warwickshire County Council, who shared his experience, which was a great help in reminding us to narrow down our topic of interest for our application.

From early April, Anna will be spending one day a week at GDS’ offices in Whitechapel working with an experienced data scientist who will mentor her as she works on a project to provide a unified view of the property market in Hackney. The Council has many repositories of data about housing, and a further wealth of data about rented accommodation in the borough is available online. However, we don’t have access to a single, comprehensive view of private renting in the borough.

Anna’s work, both at GDS and back in Hackney day to day, will be focussed on working to create a profile of the rental market in the borough, combining some of our internal Council resources with external data sources – we want to try to use web scraping, image processing and text processing to try and learn more from private rental advertising sites (such as SpareRoom) and from Google Street View, for example.
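One small building block of the text processing described above might be pulling an advertised rent out of free-text listing copy. This is a hedged sketch with made-up ad text, not anything taken from SpareRoom; real listings vary far more than the couple of patterns handled here:

```python
import re

# Match amounts like "£750 pcm" or "£1,200 per calendar month".
# Illustrative only: real listing copy needs many more patterns.
RENT_PATTERN = re.compile(
    r"£\s?(\d[\d,]*)\s*(?:pcm|per\s+(?:calendar\s+)?month|/month)",
    re.IGNORECASE,
)

def extract_monthly_rent(ad_text: str):
    """Return the first advertised monthly rent in pounds, or None."""
    match = RENT_PATTERN.search(ad_text)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))
```

Structured fields extracted this way could then be joined with internal Council data to start profiling the rental market.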

As well as building Anna’s data science skills directly (and scaling this to the wider team as she shares this work with her Data Analytics team colleagues), we think this work will be a great resource to help us identify Council properties which are being illegally sub-let or to detect properties which should be subject to a House of Multiple Occupation licence and also help us better understand social inclusion in Hackney as we get a better view of transience of residents in certain areas.

The programme runs for 3 months until July 2019 and Anna will be sharing her experience for others to benefit from, as the work develops.

Embedding an ethical approach to underpin our data science projects

We’re lucky in Hackney – in 2018, our Data & Insight function grew in both number and scope, and we’re one of only a few local authorities to employ a permanent data science resource. Our data scientist works closely with the rest of the team, whose overall focus is on joining and analysing the huge range of data we hold, to help services better meet the needs of our residents. The talent and skills of the team, combined with the vision of our ICT leadership, which challenges us to look at the same problems in radically different ways, offers no small opportunity.

The private sector has led the way in practically employing data science techniques, harnessing vast swathes of information on customers and their spending habits to maximise revenues. For example, we’ve seen companies like Amazon use machine learning to persuade shoppers to part with extra cash at the online checkout by showcasing related products. The public sector has lagged behind, in part because of a lack of investment in the necessary skills but also due to the longstanding central government focus on data being used primarily for retrospective reporting. This has limited our ability to use our knowledge – about our residents and how they interact with services – more creatively. Shifting the focus to predictive analysis could help us change the way we support people in future, to help us deliver better services at lower cost.

We want to replicate the success of the private sector in leveraging the vast volumes of data we hold as an asset to improve our service provision. The opportunities include preventing homelessness; better targeting of resources to assess the social care cases that require the most urgent attention; improved customer service, by channelling users to other services they may be interested in as they transact with us; and tackling fraud, to name a few.

While opportunity abounds, we face a unique challenge in meeting the expectations of our residents who hold us to a much higher standard than private companies, when it comes to handling their data. Many local government data teams are starting to work on predictive models but we know system bias is a concern to the public. How can we trust the results of predictive algorithms that have been built on data which may be limited, or only reflect how a long established service has traditionally engaged with specific sections of our community?

If we are to successfully implement forward-looking predictive analytics models, we have to build trust with our citizens: to ensure that they understand our motivations, and can transparently assess how we work with data.

The approach we’re taking:

From the outset, we’ve been alert to the need to test and build an ethical approach to our data science work, which is still in its infancy.

Building on examples we’ve seen elsewhere, we’ve developed a Data Ethics Framework which nestles alongside our Privacy Impact Assessment (PIA) process in Hackney, to make sure that for every project we undertake, we’re able to clearly articulate that we’re using a proportionate amount of data to enable us to draw robust conclusions.

At each stage of our 5-step project cycle, we stop to consider:

– Is our data usage proportionate and sufficient to meet user need?

– Are we using data legally, securely and anonymously?

– Are we using data in a robust, transparent and accountable way?

– Are we embedding and communicating responsible data usage?

One of the most important checks and balances on our work will come from an increasing focus on how we embed responsible use of our findings. Applying data science methods to our large volumes of data offers huge opportunities to provide a positive impact for residents, but we know there are risks if outputs are misinterpreted. We’re trying to mitigate this by developing our team’s skills to communicate our findings in the simplest way possible, so that local government officers don’t need to become expert data analysts to responsibly make use of this work. Democratising access to data and providing everyone with the skills or tools they need to make sense of information has to be the right approach.

We’re taking small steps every day to improve our skills and maximise the benefit of the opportunity we have in Hackney. We’re learning from others – notably the Data Ethics Workbook, which inspired our approach – and trying to embed it in a simple and proportionate way. The key for us is balance; we’ve tried to streamline this into a simple process with sufficient rigour to build confidence in the ethics of our work without unnecessarily slowing down our ability to experiment. We’re keen to open out the conversation and hear from other public sector organisations who are beginning to unpick this sticky issue.

We also recognise that to truly build trust with our citizens on how and when we use their data, we need to openly engage with people. We’re thinking about how best to start a conversation with residents so we can hear their concerns, discuss the risks and opportunities and agree a way forward, together.