Modeling Intermittent Supply using EPANET and ArcView

by STEFFEN MACKE

The EPANET hydraulic engine is very powerful. However, a 'normal' EPANET model is not suited for intermittent supply analysis. Several steps are necessary to overcome this problem. These steps heavily depend on data preparation with GIS software like ArcView.

The following section summarizes the concepts used to establish a hydraulic model capable of representing intermittent supply, using ArcView and EPANET software.

It also introduces a software module developed during this project that allows the integration of EPANET hydraulic models in GIS data.

The motivation to create such software, even though available software packages provide much of the necessary functionality, is clearly financial:

ArcView is a low-end GIS software package. The high-end ArcInfo software comes with network trace functionality and connectivity checks, but constrains the user to a proprietary data model. This proprietary data model, in combination with the high price (approximately tenfold the price of ArcView), was the reason not to use ArcInfo (section 6.1).

After careful consideration, the GNU Lesser General Public License (LGPL) was chosen for the software described hereafter[9]. DORSCH Consult decided to make the software available for free on the internet⁹.

The GNU Lesser General Public License is an open-source license that enables everybody to develop the software further, provided that the developer agrees in turn to license and publish the additions under the LGPL.

ArcView Extensions

ArcView GIS software provides a powerful way to extend its capabilities: so-called 'Extensions', written in ArcView's scripting language AVENUE, integrate seamlessly with ArcView's graphical user interface. ArcView Extensions are typically installed and removed very easily. The user decides which extensions to load during an ArcView session, depending on the problems to be analyzed. The software module developed is called 'DC Water Design Extension' and consists of approximately 6000 lines of AVENUE source code.

Appendix 11 contains the DC Water Design Extension Manual. Appendix 12 contains the description of the Data Model used by the DC Water Design Extension. In order to model more complex systems, reading the EPANET 2 Users Manual[16] is recommended.

Substitution of Demand Nodes with Small Tanks

Intermittent supply yields a fundamentally different demand pattern than continuous supply. In fact, there is no demand pattern: the storage tanks of the customers fill up whenever the system provides water, until they are full and the float valve closes[19].

The approach of substituting demand nodes with small tanks in intermittent supply models has been used once before, in a thesis in Palestine. However, it was not possible to obtain the respective paper.

The software described in [19] utilizes 4 components to model intermittent water supply:

Demand Model
Secondary Network Model
Network Charging Model
Modified Network Analysis Method

While this approach is far more sophisticated and may provide superior results, it requires more specialized software. The substitution of demands with generalized tanks makes it possible to model pressure-dependent demand transparently with a number of existing hydraulic network analysis software packages and is therefore very flexible.

Tank Sizes

The tank sizes depend on whether the network is available down to the individual house connection level or not. If the individual house connections are available in a GIS, each tank in the model should represent one real household storage tank.

If the house connections are not available in the GIS, generalized house connections should be generated. It is practical to create one tank per junction, located at the centroid of the customers that are assigned to it. The one-tank-one-junction relationship resembles the traditional nodal demand of junctions. Additionally, a 'house connection' pipeline has to be generated that connects the tank with the junction.
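The aggregation of customers into one generalized tank per junction can be sketched in a few lines of Python. This is an illustrative sketch only: the `Customer` record, the function name `generalized_tank`, the unweighted centroid and the assumed tank height of one metre are assumptions made for this example and are not taken from the DC Water Design Extension.

```python
from dataclasses import dataclass
from math import hypot, pi, sqrt


@dataclass
class Customer:
    x: float            # map coordinate in metres (assumed)
    y: float            # map coordinate in metres (assumed)
    tank_volume: float  # household storage volume in cubic metres


def generalized_tank(junction_xy, customers, tank_height=1.0):
    """Aggregate all customers of one junction into a single model tank.

    tank_height is an assumed water depth in metres; it is only needed to
    turn the summed volume into a diameter for a cylindrical tank.
    """
    if not customers:
        raise ValueError("a generalized tank needs at least one customer")
    n = len(customers)
    # Unweighted centroid of the assigned customers (assumption).
    cx = sum(c.x for c in customers) / n
    cy = sum(c.y for c in customers) / n
    volume = sum(c.tank_volume for c in customers)
    # Cylinder: V = pi / 4 * D^2 * h  ->  D = sqrt(4 V / (pi h))
    diameter = sqrt(4.0 * volume / (pi * tank_height))
    # Straight-line length of the generated 'house connection' pipeline.
    length = hypot(cx - junction_xy[0], cy - junction_xy[1])
    return {"x": cx, "y": cy, "volume": volume,
            "diameter": diameter, "connection_length": length}
```

The returned values correspond to what a tank entry and the connecting pipe entry of a hydraulic model need: a position, a size and a pipe length.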

Fill Rates

The individual tank fill rates depend mainly on the headloss over the pipeline that connects the tank to the network. For the case that uses generalized tanks it might be necessary to graduate the diameter of the connecting pipeline depending on the size of the tank.

Pressure Dependencies

The intermittent supply model relies on pressure dependencies in two respects:

The customer demands are pressure-dependent because of their storage tanks.
Leakage is pressure-dependent.

The substitution of demand nodes with tanks also enforces a pressure-dependency for the customer demands.

Leakage

Leakage is pressure-dependent. This is especially relevant if the water is supplied intermittently: at times when the system is empty, there is no leakage at all.

In traditional hydraulic models, leakage is distributed in proportion to the demand, generally as a factor that increases all demands (assuming that there is a lot of leakage where there is a lot of consumption). However, this assumption might not be true for all water networks. A more flexible implementation of leakage is therefore desirable.

Orifice Diameter

$\displaystyle \text{discharge rate in } \frac{l}{m} = 0.9 \cdot \left( \text{orifice diameter in mm} \right)^{2} \qquad (1)$

Equation 1 has been found through regression from data available in [14].
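As a minimal sketch, equation 1 and its inverse can be written as two small Python functions. The function names are chosen for this example only; the unit of the discharge rate is kept exactly as stated for equation 1.

```python
def orifice_discharge(diameter_mm):
    """Discharge rate according to equation 1 for an orifice diameter in mm."""
    return 0.9 * diameter_mm ** 2


def orifice_diameter(discharge):
    """Orifice diameter in mm that yields the given discharge rate."""
    if discharge < 0:
        raise ValueError("discharge rate must not be negative")
    return (discharge / 0.9) ** 0.5
```

The inverse form is convenient when an orifice diameter has to be derived from a known or estimated discharge rate.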

Pressure Dependency

The following equation describes the discharge through an orifice[13]:

$\displaystyle q_{i} = K_{i} \cdot \left( p_{i} - p_{o} \right)^{\beta} \qquad (2)$

where $p_{i}$ is the pressure upstream of the orifice, $p_{o}$ the pressure downstream of the orifice and $\beta$ an exponent with a value of 0.5 according to theory and laboratory experience. $K_{i}$ stands for the orifice coefficient, which is orifice-dependent. The following parameters also originate from [13]:

$p_{o} = 0$: Leakage is discharge into the atmosphere.
$\beta > 1.18$: Leakage defects differ from simple orifices. They are deformed by the network pressure.
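The effect of the exponent in equation 2 can be illustrated with a short Python sketch. The function name and the clamping of negative pressure differences to zero are assumptions made for this example; the exponent 1.18 is used only because it is the bound quoted above.

```python
def leak_discharge(k, p_up, p_down=0.0, beta=0.5):
    """Discharge through an orifice according to equation 2.

    k      -- orifice coefficient K_i
    p_up   -- pressure upstream of the orifice (p_i)
    p_down -- pressure downstream of the orifice (p_o), 0 for the atmosphere
    beta   -- exponent, 0.5 for a simple orifice
    """
    # Assumption: no discharge if the upstream pressure is not higher.
    dp = max(p_up - p_down, 0.0)
    return k * dp ** beta


# Ratio of discharges when the upstream pressure doubles:
ratio_orifice = leak_discharge(1.0, 2.0) / leak_discharge(1.0, 1.0)
ratio_leak = leak_discharge(1.0, 2.0, beta=1.18) / leak_discharge(1.0, 1.0, beta=1.18)
```

Doubling the pressure increases the discharge of a simple orifice by a factor of about 1.41, but by a factor of about 2.27 for an exponent of 1.18. This shows how strongly leakage reacts to pressure once the defects deform.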

Implementation

Pressure-dependent leakage can be implemented with tanks connected to the network by short pipelines of a very small diameter. Such a setup creates flows that are similar to the discharge into the atmosphere. Unlike the tanks used to model the household storage tanks, the tanks used in leakage modeling should never fill up completely - the leakage should only be limited by the headloss over the connecting pipe.

As for the tanks used in demand modeling, EPANET ensures pressure-dependency in this case. The parameter $\beta$ of equation 2 cannot be taken into account, but the results should still be better than those of the traditional network model.

Unfortunately, no literature could be found on the relationship between diameter and average leakage rates per metre of pipeline. The number of defects increases for lower diameters[13], but the number of defects is not necessarily related to the leakage rates.

Section 8.3.5 describes two approaches to overcome this problem.

Virtual Lines

In the EPANET hydraulic model, pumps and valves are represented as lines. From the hydraulic modeling point of view this makes sense, as the orientation of the valves and pumps is important information. In GIS data, pumps and valves are typically represented as points, as they are also symbolized with point symbols. Point data lacks the orientation information. Because of this pipe-node duality, pumps and valves will be referred to as virtual lines in this section.

The pipe-node duality complicates the creation of hydraulic models from the GIS data. It is possible to overcome the problem with one of the following solutions:

Storage of orientation information for each virtual line in the GIS
Adoption of the orientation of the connected pipes

The second possibility has some advantages, as it does not require additional data storage - it was applied in the described application. However, it imposes some constraints on the data. The concept to model virtual lines as points in the GIS can be summarized as follows:

Each virtual line needs to have exactly two pipes connected
Both connected pipes must be oriented in the same way: One pipe has to start at the virtual line and the other pipe has to end at the virtual line

**Figure 22:** Virtual Line Validity

Figure 22 shows examples of different pipe orientations at a virtual line. Case a shows a pump with two connected pipes that are oriented in the same way. This allows the creation of the hydraulic model and is therefore considered valid. Case b shows pumps whose connected pipes are not oriented in the same way. This is invalid, as it does not allow the creation of the hydraulic model. Case c is invalid because the pump has more than two pipes connected to it. Note that the orientation information of the pump symbol is not necessarily contained in the GIS data.
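The two validity rules can be expressed as a small check. The following Python sketch is not the AVENUE code of the extension; the function name and the representation of pipes as (start node, end node) pairs are assumptions made for this example.

```python
def check_virtual_line(point_id, pipes):
    """Check whether a pump or valve point can be converted to a virtual line.

    pipes -- iterable of (start_node, end_node) pairs
    Returns (is_valid, reason).
    """
    connected = [(s, e) for s, e in pipes if point_id in (s, e)]
    # Rule 1: exactly two pipes must be connected.
    if len(connected) != 2:
        return False, "exactly two pipes required, found %d" % len(connected)
    # Rule 2: one pipe ends at the point and the other one starts there.
    incoming = sum(1 for s, e in connected if e == point_id)
    outgoing = sum(1 for s, e in connected if s == point_id)
    if incoming != 1 or outgoing != 1:
        return False, "connected pipes are not oriented in the same way"
    return True, "valid"


case_a = check_virtual_line("PU", [("J1", "PU"), ("PU", "J2")])
case_b = check_virtual_line("PU", [("J1", "PU"), ("J2", "PU")])
case_c = check_virtual_line("PU", [("J1", "PU"), ("PU", "J2"), ("PU", "J3")])
```

The three calls correspond to the cases a, b and c discussed above: only the first one is reported as valid.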

**Figure 23:** Virtual Line Creation

Figure 23 depicts the conversion process of virtual lines:

Number and orientation of the pipes connected to the virtual line are checked for validity.
The node representing the virtual line is replaced with a junction (PJ1).
An additional junction is added (PJ2).
The pipe from the virtual line to the next node starts at the additional junction (PJ2 -> J2).
The pump or valve is created. It connects the two new junctions (PJ1 -> PJ2).

The DC Water Design Extension follows this conversion process when it is creating EPANET models. Additional considerations used in the process are:

The length of the virtual line is one metre.
If the pipe starting at the virtual line is shorter than one metre, the virtual line length is set to half of the pipe length.
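The conversion process, including the two length rules, might look as follows in Python. This is a sketch under stated assumptions, not the AVENUE implementation of the DC Water Design Extension: the dictionary layout of the pipes, the function name and the naming scheme for the two new junctions are hypothetical.

```python
def convert_virtual_line(point_id, pipes, kind):
    """Turn a pump or valve point into a link between two new junctions.

    pipes -- dict: pipe id -> {"start": node, "end": node, "length": metres}
    kind  -- "PUMP" or "VALVE" (label only)
    Returns (new_junction_ids, converted_pipes, link).
    """
    incoming = [pid for pid, p in pipes.items() if p["end"] == point_id]
    outgoing = [pid for pid, p in pipes.items() if p["start"] == point_id]
    # Step 1: validity check (exactly one pipe in, exactly one pipe out).
    if len(incoming) != 1 or len(outgoing) != 1:
        raise ValueError("invalid virtual line at %s" % point_id)
    # Steps 2 and 3: two junctions replace the point (hypothetical names).
    pj1, pj2 = point_id + "_1", point_id + "_2"
    converted = {pid: dict(p) for pid, p in pipes.items()}
    converted[incoming[0]]["end"] = pj1
    # Step 4: the outgoing pipe now starts at the additional junction.
    converted[outgoing[0]]["start"] = pj2
    # Length rule: one metre, or half the outgoing pipe if that is shorter.
    out_length = pipes[outgoing[0]]["length"]
    length = 1.0 if out_length >= 1.0 else out_length / 2.0
    # Step 5: the pump or valve connects the two new junctions.
    link = {"id": point_id, "type": kind,
            "start": pj1, "end": pj2, "length": length}
    return [pj1, pj2], converted, link


network = {"P1": {"start": "J1", "end": "PU", "length": 50.0},
           "P2": {"start": "PU", "end": "J2", "length": 0.6}}
junctions, converted, pump = convert_virtual_line("PU", network, "PUMP")
```

In this example the outgoing pipe P2 is shorter than one metre, so the pump receives a length of 0.3 m. The input dictionary is left unchanged, which makes it easy to compare the GIS data with the generated model.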

Bit codes

Bit codes make it possible to store fields of yes-no information in 'normal' numbers. Every bit of the number with the value 1 is considered set; every bit with the value 0 is not set. This makes it possible to bit-code several independent pieces of information in one ArcView number.

As ArcView uses 32-bit (integer) numbers, it should theoretically be possible to store up to 32 pieces of information. This requires that the databases where the data is stored use 32-bit numbers as well. Tests with up to 19 yes-no flags have been successful.

The following example shows how this concept allows storing the network information in one seamless data set and utilizing the same network features in different hydraulic models:

**Figure 24:** Bit-coding Supply Zones

In figure 24 the nodes of three hydraulic models are bit-coded for storage in the GIS. Each zone has its own bit in the bit code, indicating whether the node is used in the model or not.

The DC Water Design Extension Manual (appendix 11) contains more information on bit codes. The DC Water Design Extension provides some functions to work with such bit codes.
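The bit operations behind such a code are simple. The following Python sketch uses hypothetical function names and an assumed numbering of the zones (zone 0 uses the lowest bit); it does not reproduce the functions of the DC Water Design Extension.

```python
def set_bit(code, bit):
    """Return the code with the given bit set (node is used in that zone)."""
    return code | (1 << bit)


def clear_bit(code, bit):
    """Return the code with the given bit cleared."""
    return code & ~(1 << bit)


def is_set(code, bit):
    """True if the node is used in the model of the given zone."""
    return (code & (1 << bit)) != 0


def zones_of(code, max_bits=32):
    """List the numbers of all zones whose bit is set in the code."""
    return [bit for bit in range(max_bits) if is_set(code, bit)]


# A node that belongs to the models of zone 0 and zone 2:
node_code = set_bit(set_bit(0, 0), 2)
```

The node in this example is stored with the number 5 (binary 101). Selecting all features of one hydraulic model then amounts to testing a single bit of this number.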

Calibration

The calibration of the intermittent supply model for Judayta village was influenced by a multitude of different factors.

**Figure 25:** Calibration Influences

Figure 25 shows the influences that have been taken into account.

The following global parameters have been chosen to calibrate the intermittent supply model:

A factor to adjust the diameters of the leakage pipes
The pipe roughness
A factor to adjust the diameters of the house connections
A factor to adjust the sizes of the household storage tanks
Three thresholds to distribute the size of the house connections
A factor to adjust the pump power

This differs from the calibration of a traditional hydraulic model, where only one global factor, the roughness, is taken into account.

Though the Windows version of EPANET provides functionality to assess the quality of a calibration, this was not enough to evaluate the multitude of possible parameter combinations.

Genetic Algorithms

Genetic Algorithms are a method to optimize non-linear problems efficiently. For this reason they have become popular in hydraulic analysis and water supply network design software [17,19]. Genetic Algorithms are based on a process that is similar to natural evolution: individuals that are more 'fit' to solve a problem are more likely to reproduce than other, less fit individuals. The resulting selection process speeds up the problem solution. The approach itself is very generic and is suited for many applications.

As the whole concept stems from biology, the terms used to describe it are also biological:

Population
Gene
Allele
Reproduction
Mutation
Crossover

For the calibration, a gene is a set of parameters that describes one calibrated model. An allele is one of the parameters. A number of genes, called population, reproduces itself to form the next generation of genes.

As the parameters used in the calibration are not discrete, a concept is needed that allows genes with alleles that cover the whole space of possible solutions. For example it would be possible to create a population that is large enough to cover the necessary solutions. However, this would reduce the process to a simple trial-and-error one, as fitness evaluation and reproduction would only take place after the results for the large population have been calculated.

Another concept to increase the diversity of the population is mutation: mutations change the allele values of a gene. It has proved successful to allow complete mutations of an allele as well as only slight alterations of a value.

**Figure 26:** Allele Mutation

Figure 26 shows the allele mutation probabilities that have been used in the genetic program.
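A mutation step that distinguishes complete mutations from slight alterations can be sketched as follows. The probabilities and the size of a slight alteration used here are placeholders chosen for this example; they are not the values of figure 26, and the function is not the C++ code of the genetic program.

```python
import random


def mutate(gene, bounds, p_full=0.05, p_slight=0.20, slight=0.10, rng=random):
    """Return a mutated copy of a gene (list of allele values).

    bounds   -- one (low, high) pair per allele
    p_full   -- assumed probability of a complete mutation of an allele
    p_slight -- assumed probability of a slight alteration of an allele
    slight   -- assumed maximum relative change of a slight alteration
    """
    child = []
    for value, (low, high) in zip(gene, bounds):
        r = rng.random()
        if r < p_full:
            # Complete mutation: a new random value from the whole range.
            value = rng.uniform(low, high)
        elif r < p_full + p_slight:
            # Slight alteration: change the value by a few percent.
            value = value * (1.0 + rng.uniform(-slight, slight))
            value = min(max(value, low), high)
        child.append(value)
    return child
```

The original gene is not modified, so a parent can still take part in further reproduction steps.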

Diversity is also increased by the reproduction process itself: The two parents (genes) of a child mix their alleles in a crossover process.

**Figure 27:** Reproduction with Crossover

Figure 27 displays the reproduction process. The crossover point is chosen at random.
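A single-point crossover with a random crossover point can be sketched in Python as follows. The function name is chosen for this example, and producing exactly one child per pair of parents is an assumption.

```python
import random


def crossover(parent_a, parent_b, rng=random):
    """Create one child from two parent genes of equal length.

    The child takes the alleles before the crossover point from parent_a
    and the remaining alleles from parent_b.
    """
    if len(parent_a) != len(parent_b) or len(parent_a) < 2:
        raise ValueError("parents need the same length of at least two alleles")
    # The point lies between two alleles, so both parents contribute.
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]
```

Because the crossover point is never placed before the first or after the last allele, the child always inherits at least one allele from each parent.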

Evolution

After several generations of the population have been reproduced, the survival of the fittest takes effect: before each reproduction, a fitness value is calculated for each gene. Genes with a higher fitness, which represent a better solution, are more likely to reproduce than those with a low fitness.

The complete process can be described as follows:

1. Creation of a random population: a given number of genes with random alleles.
2. Calculation of fitness values for all the genes of the population.
3. Reproduction of the population. This includes crossover and mutations. A new population is created.
4. Continue with step 2.
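The complete loop can be sketched in Python as shown below. This is a simplified illustration and not the genetic program used for the calibration: fitness-proportional parent selection, the mutation probability, remembering the best gene seen so far and the requirement of positive fitness values are assumptions of this example. In the calibration itself, the fitness of a gene would be derived from an EPANET run with the corresponding parameters.

```python
import random


def evolve(fitness, bounds, population_size=20, generations=50,
           p_mutation=0.1, seed=None):
    """Return the fittest gene seen while evolving a random population.

    fitness -- function mapping a gene (list of floats) to a positive score
    bounds  -- one (low, high) pair per allele, at least two alleles
    """
    rng = random.Random(seed)
    # Step 1: random initial population.
    population = [[rng.uniform(low, high) for low, high in bounds]
                  for _ in range(population_size)]
    best, best_score = None, float("-inf")
    for _ in range(generations):
        # Step 2: fitness of every gene in the current population.
        scores = [fitness(gene) for gene in population]
        for gene, score in zip(population, scores):
            if score > best_score:
                best, best_score = gene, score
        # Step 3: fitter genes are more likely to be chosen as parents.
        children = []
        for _ in range(population_size):
            mother, father = rng.choices(population, weights=scores, k=2)
            point = rng.randint(1, len(bounds) - 1)  # random crossover point
            child = mother[:point] + father[point:]
            if rng.random() < p_mutation:
                # Complete mutation of one randomly chosen allele.
                i = rng.randrange(len(child))
                child[i] = rng.uniform(*bounds[i])
            children.append(child)
        # Step 4: continue with the new population.
        population = children
    return best


def example_fitness(gene, target=(2.0, 5.0, 7.0)):
    """Toy fitness: 1 for a perfect match with the target, lower otherwise."""
    return 1.0 / (1.0 + sum((g - t) ** 2 for g, t in zip(gene, target)))
```

A call such as `evolve(example_fitness, [(0.0, 10.0)] * 3, seed=1)` returns a list of three parameter values within the given bounds.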

Robust Software

Robustness, performance, development effort and usability are factors that have to be taken into account in any software development. For the calibration software, robustness was the ultimate design goal. It was necessary to run the calibration software unattended for many hours - unstable, non-robust software would not have been suited for this task.

The software developed has been split into the following modules:

Calibration.sh - a shell script that controls the whole calibration. This script calls all other programs (EPANET, genetic, epanet2mysql and others) and sends SQL commands to the mySQL¹⁰ database.
Genetic - a C++ program that takes care of the reproduction and offers a front-end to the calibration data stored in an XML file¹¹.
Epanet2mysql - a C program that converts the binary EPANET output into two text files that can be imported into the mySQL database.

This approach prevents errors in the genetic and epanet2mysql programs from breaking the whole calibration process; an error can only break one iteration. Note that these programs have also been quite stable during the process.

Calibration Process

**Figure 28:** Calibration Process

Figure 28 gives an overview of the automated calibration process.

The chosen process has the following advantages:

The robust EPANET input validation is utilized
The genetic program uses the error checking provided by the XML C library for GNOME¹².
The mySQL database system provides high performance.
All described software modules are free software.

The described calibration process was able to test thousands of possible parameter combinations - providing substantial aid in the engineering process.

Model Limitations

Like every other model, the discussed approach is not able to render reality completely. The following limitations should not hinder the model's functionality:

Actual consumption is not considered: The consumption of water from the customer's storage tanks is not modeled. This could cause the tanks to empty and refill again. As the tanks are sized to contain the consumption of one week and the supply period is usually shorter than 24 hours, this effect should be negligible.
Household storage tanks are aggregated.
No support for empty or partly filled pipes.
Demands are assigned to the nearest pipe; this might be different in reality.
The consideration of unequal spatial distribution of leakage requires additional processing.

In addition, the model limitations of EPANET apply; see [16] for a discussion of these limitations.

A Strategy to Reduce Technical Water Losses for Intermittent Water Supply