How EPA Ireland is using new technologies in its regulatory processes: Case study on the Environmental Protection Agency (EPA)

The Environmental Protection Agency (EPA) is at the front line of environmental protection and policing in Ireland. The EPA is an independent public body established under the Environmental Protection Agency Act, 1992. The EPA has a wide range of knowledge, regulatory and advocacy functions including responsibility for environmental licensing and enforcement of industrial, waste and wastewater activities.

Data management technologies and practices have undergone dramatic transformation since 1992. In 1992, the EPA’s regulatory information and all interactions between the EPA and its licensees were recorded in paper files, which gradually gave way to electronic document storage. From 2005 to 2015, the EPA invested significant ICT time and effort in developing database management systems (DBMS). Important features of the EPA data management architecture are the Environmental Data Exchange Network (EDEN), a Microsoft CRM application to manage all authorisations data, and a centralised database of environmental geographic (GIS) data. For more information please see Annex Figure 2.A.1.

In 2017, the EPA adopted an enterprise architecture approach to ICT portfolio management and examined the ICT organisational structure. The EPA had achieved a great deal in establishing robust data management systems: the next opportunity was to unlock the potential of this structured data to gather more knowledge and insights about Ireland’s environment. In 2018, the EPA established a small Analytics team of four staff and two contractors to pilot the use of data science, spatial analysis, earth observation and data visualisation techniques. The purpose of the team is to analyse data to generate knowledge and insights on the environment by working collaboratively with EPA subject matter experts. Once formed, the team held a series of stakeholder meetings across the EPA to identify opportunities for applying data science and analysis. The idea for the urban waste water project described in this case study came from one of those meetings. The project was prioritised because:

  • the data was already available, which made the project feasible to deliver quickly; and

  • urban waste water (UWW) treatment is important to the protection of human health and water quality, so the project was expected to have a high environmental impact.

The EPA regulates more than 1 000 urban waste water treatment plants, ranging from small to large. The plants are operated by Ireland’s water utility company, Irish Water. A small team of eleven EPA inspectors checks that these plants are operating safely within their emission limit values (ELVs) and meeting all the conditions of their environmental licences. All the administrative data about the plants is stored in a Microsoft CRM database. Monitoring results from the environment upstream and downstream of the plants are stored in a Laboratory Information System: seven different parameters are monitored. The EPA has access to thousands of data points about Ireland’s waste water treatment plants. This is a rich resource, but the data needed to be organised and presented in a way that allows the inspectors to identify issues at the plants quickly, without repeatedly examining hundreds or thousands of data points.

In the simplest terms, the job of a waste water treatment plant is to take in dirty water and remove pollutants from it before discharging that water back into the environment. The effectiveness of a plant can be measured by comparing the quality of the water entering the plant (the influent) with the quality of the water leaving it (the effluent). If a plant is operating effectively then the effluent water should be clean: the monitoring results for each of the seven parameters should be below the emission limit value. Irish Water monitor the influent and effluent water and submit their results to the EPA via the web-based Environmental Data Exchange Network (EDEN) portal.

The Analytics team identified an opportunity to assess the urban waste water treatment (UWWT) data using statistical methods, integrating multiple results over a number of cycles to produce period trends. For each parameter, the monitoring results are grouped as “improving”, “staying the same” or “getting worse”, and the groupings are presented as an Urban Waste Water Scorecard. This allows an Inspector to examine all their plants and quickly focus on the specific plants and parameters that are a problem or are “trending” towards failure.

The data analysts isolated the monitoring data for each parameter at each emission point for each waste water treatment plant. The data is then processed to assess how efficiently the plant removes each parameter. The available influent and effluent data are aggregated as monthly averages. A monthly efficiency value is determined by expressing the monthly effluent figure as a percentage of the monthly influent figure. A metric is then applied to the set of monthly efficiency values to measure whether the plant’s performance for that parameter has improved or deteriorated over the last three years.
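The monthly efficiency calculation can be illustrated with a minimal Python sketch. It is not the EPA’s own code (the case study mentions R programming), and the function names, the sample readings and the use of a least-squares slope as the trend metric are all assumptions made for the example: the case study does not say which metric was applied. The sketch also reports efficiency as the share of the influent load removed, that is, 100 minus the effluent expressed as a percentage of the influent, so that a falling value indicates declining performance.

```python
from collections import defaultdict
from statistics import mean


def monthly_means(samples):
    """Average (ISO date string, value) samples by 'YYYY-MM' month."""
    buckets = defaultdict(list)
    for day, value in samples:
        buckets[day[:7]].append(value)
    return {month: mean(values) for month, values in buckets.items()}


def monthly_removal_efficiency(influent, effluent):
    """Percentage of the influent removed, for months present in both series."""
    inf, eff = monthly_means(influent), monthly_means(effluent)
    months = sorted(set(inf) & set(eff))
    # Assumption: efficiency = 100 * (1 - effluent / influent).
    return {m: 100.0 * (1.0 - eff[m] / inf[m]) for m in months if inf[m] > 0}


def slope(values):
    """Least-squares slope of values against their position in the list.

    Hypothetical trend metric: a negative slope suggests deterioration.
    Gaps between months are ignored in this simplified version.
    """
    n = len(values)
    if n < 2:
        return 0.0
    x_bar, y_bar = (n - 1) / 2, mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


# Invented readings for one parameter at one emission point.
influent = [("2019-01-05", 200.0), ("2019-01-20", 220.0),
            ("2019-02-10", 200.0), ("2019-03-10", 200.0)]
effluent = [("2019-01-06", 21.0), ("2019-02-11", 30.0), ("2019-03-11", 40.0)]

efficiency = monthly_removal_efficiency(influent, effluent)
trend = slope([efficiency[m] for m in sorted(efficiency)])
print(efficiency, trend)
```

In practice the list passed to `slope` would hold up to 36 monthly values, matching the three-year window described above.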

Next, the rate of ELV compliance for each parameter is analysed. The annual rate of compliance is calculated from the number of effluent results that breach the ELV threshold each year. A metric is then applied to the annual compliance rates to express whether the plant’s performance for each parameter has improved or worsened over the previous three years.
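The compliance calculation can be sketched in the same way. The following Python example is illustrative only: the function names, the sample results, the ELV of 25 and the tolerance of five percentage points are hypothetical, and treating a result equal to the ELV as compliant is an assumption. The labels are the three scorecard groupings used in the case study.

```python
from collections import defaultdict


def annual_compliance_rates(effluent, elv):
    """Percentage of effluent results at or below the ELV, per calendar year."""
    totals, passes = defaultdict(int), defaultdict(int)
    for day, value in effluent:
        year = int(day[:4])
        totals[year] += 1
        if value <= elv:  # assumption: a result equal to the ELV is compliant
            passes[year] += 1
    return {year: 100.0 * passes[year] / totals[year] for year in sorted(totals)}


def classify(rates, tolerance=5.0):
    """Label the change in compliance between the first and last year.

    The tolerance (in percentage points) is a hypothetical threshold
    below which the rate is treated as unchanged.
    """
    years = sorted(rates)
    change = rates[years[-1]] - rates[years[0]]
    if change > tolerance:
        return "improving"
    if change < -tolerance:
        return "getting worse"
    return "staying the same"


# Invented effluent results for one parameter over three years.
effluent = [
    ("2017-03-01", 10.0), ("2017-09-01", 12.0),
    ("2018-03-01", 14.0), ("2018-06-01", 30.0),
    ("2018-09-01", 15.0), ("2018-12-01", 16.0),
    ("2019-03-01", 28.0), ("2019-09-01", 20.0),
]
rates = annual_compliance_rates(effluent, elv=25.0)
label = classify(rates)
print(rates, label)
```

A first-to-last comparison is the simplest possible choice; a slope or a statistical trend test across the three annual rates would serve the same purpose.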

The outcome is a prototype dashboard that summarises the trends at each wastewater treatment plant using clear graphics (see Figure 2.3).

The Inspector can click through on a plant to see graphs of the monitoring results for any of the seven parameters. There are three graphs: influent results, effluent results and the monthly removal coefficient (see Figure 2.4).

The Inspector can see the trend for each parameter. If a plant’s performance is declining, then the monitoring results for the effluent will start to drift towards an ELV limit line: interventions should be taken well before an ELV is breached. The monthly removal coefficient will also start to trend downwards: the removal coefficient has a seasonal trend, but over time the whole profile starts to trend downwards if a plant’s performance is declining. Presenting the data in this way makes it easy for an Inspector to see whether a change in plant performance is an anomaly or indicative of a systemic failure at the plant.
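One simple way to separate a seasonal swing from a genuine decline is to compare each month with the same calendar month a year earlier. The Python sketch below illustrates that idea only: the case study does not describe how the dashboard handles seasonality, and the function name and the sample coefficients are hypothetical.

```python
from statistics import mean


def year_on_year_change(monthly):
    """Mean change versus the same calendar month one year earlier.

    `monthly` maps 'YYYY-MM' to a removal coefficient in percent.
    Returns None when no month has a counterpart in the previous year.
    """
    diffs = []
    for month, value in monthly.items():
        year, mm = month.split("-")
        previous = f"{int(year) - 1}-{mm}"
        if previous in monthly:
            diffs.append(value - monthly[previous])
    return mean(diffs) if diffs else None


# Invented coefficients: lower in winter, higher in summer,
# with the whole profile two points lower in the second year.
monthly = {"2018-01": 80.0, "2018-07": 92.0, "2019-01": 78.0, "2019-07": 90.0}
change = year_on_year_change(monthly)
print(change)
```

Comparing July 2018 directly with January 2019 would show a large drop that is mostly seasonal; the like-for-like comparison isolates the smaller underlying decline.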

The graphs can also be used to confirm the effectiveness of interventions and improvements. Figure 2.5 shows the dramatic improvement in the effluent monitoring after an upgrade to a treatment plant in the south east of Ireland. There is an opportunity to expand the methods to allow an Inspector to model the impact of new connections to the sewage network that would increase the load on the plant, or to model the expected impacts of new investments to improve the plant.

The prototype proved the concept that applying statistical techniques to monitoring data yields more insight into the patterns and trends of monitoring results. The overall “performance indicator” derived from assessment of many factors is innovative and unique. The prototype required six weeks of one data analyst’s time from the Analytics team to analyse the problem, develop the scripts and create the prototype dashboard.

This would not have been straightforward without a structured database of frequent monitoring results (i.e. the one version of the truth). The lesson learned is that investment in database management systems and good data management processes can be capitalised on quite easily with a small investment of time from a data analyst. Had the data not been structured and well managed to start with, the prototype would have taken a lot more effort and would be difficult to maintain.

The impacts of the urban waste water scorecard are:

  • Inspectors can more readily adopt a risk-based enforcement approach using the “getting worse” flags in the scorecard. Efficiencies can be achieved by presenting the data to Inspectors so that they can quickly home in on problem sites.

  • Inspectors can verify that the works on the plant have contributed to improving downstream water quality; this evidence is useful in creating river basin management plans.

  • A large volume of data from different sources was made accessible to UWWT Inspectors. This had two benefits. The first was extra insight: the calculated metrics showed the Inspectors what a large volume of monitoring data was telling them about UWWT plant performance. The second was accessibility: presenting the data in a single dashboard interface made it easy for the Inspectors to access. Traditionally this would be reported as a saving in staff time, expressed in full-time equivalents (FTE). Time is indeed saved, but this is only one of a range of benefits, and the others are harder to quantify with traditional measures such as FTE. Accessible data, with additional insights, makes it easier to make decisions, communicate them and have confidence in them. Anecdotally, the Inspectors report that having this kind of access to data and metrics makes it easier for them to assess the situation at a UWWT plant and to be confident in their interactions with the UWWT operators.

Analytics is an emerging area in the EPA, and the prototype dashboard is a good way to showcase its potential. As a working example of analytics techniques in action, it allows other EPA teams to understand how analytics could benefit their own work areas. The creation of this expert team made analytical use of the UWWT data possible. UWWT inspectors would have been able to collate and analyse the data, but they had neither the time to do so nor the technical skills in R programming to create the reproducible metrics and dashboard. Setting up a team of experts with data analysis and technical skills means the EPA has the capacity to build metrics and tools that teams can apply to their day-to-day work.

The team of experts is also working with other EPA teams, applying data analysis in ways that differ from the UWWT case study described above. In brief:

  • The team of experts worked with the EPA Laboratory teams on a prototype dashboard that is similar to the UWWT example. The laboratory teams collect monitoring data from the EPA’s authorised facilities. The prototype demonstrated more dynamic ways for the lab teams to share their data with the EPA enforcement teams.

  • The team of experts is working with the Greenhouse Gas emissions inventory and projections team on a feasibility study to see whether the EPA can use a land-use map as the data source for annual Land Use, Land Use Change & Forestry (LULUCF) regulation reporting. The study uses spatial analysis and remote sensing techniques.

  • The team of experts is working with the EPA radon team to assess the feasibility of updating the national radon risk map (a radon risk map is referenced in Ireland’s building regulations). The team of experts is using data verification and spatial analysis techniques here rather than prototyping.

  • The team of experts is at the early stages of developing a project with the EPA waste enforcement team to examine the potential of analytics for waste data analysis.

The process of raising awareness of the potential of analytics across the EPA is an ongoing one. It is made easier by having a bank of prototypes and completed projects as this gives EPA teams something to relate to so that they can imagine their own uses for analytics techniques. For more information please see Annex 2.B.

© OECD 2020
