Predicting non-deposition sediment transport in sewer pipes using Random Forest.

Designing sediment-carrying sewer systems is a well-known field of research in hydraulic engineering

/var/www/pavco.com.co/public/site/pdftohtml/0a11d602f962b8945d803abdfbf03967/index-html.html
background image

Water

 Research 189 (2021) 116639 

Contents

 

lists

 

available

 

at

 

ScienceDirect

 

Water

 

Research

 

journal

 

homepage:

 

www.elsevier.com/locate/watres

 

Predicting

 

non-deposition

 

sediment

 

transport

 

in

 

sewer

 

pipes

 

using

 

Random

 

forest

 

Carlos

 

Montes

 

a

 

,

 

,

 

Zoran

 

Kapelan

 

b

 

,

 

Juan

 

Saldarriaga

 

a

 

a

 Department of Civil and Environmental Engineering, Universidad de los Andes, Bogotá, Colombia 

b

 Department of Water Management, Delft University of Technology, Delft, Netherlands 

a

 

r

 

t

 

i

 

c

 

l

 

e

 

i

 

n

 

f

 

o

 

Article

 history: 

Received

 15 July 2020 

Revised

 29 October 2020 

Accepted

 12 November 2020 

Available

 online 13 November 2020 

Keywords:

 

Non-deposition

 

Random

 forest 

Sediment

 transport 

Self-cleansing

 

Sewer

 systems 

a

 

b

 

s

 

t

 

r

 

a

 

c

 

t

 

Sediment

 

transport

 

in

 

sewers

 

has

 

been

 

extensively

 

studied

 

in

 

the

 

past.

 

This

 

paper

 

aims

 

to

 

propose

 

a

 

new

 

method

 

for

 

predicting

 

the

 

self-cleansing

 

velocity

 

required

 

to

 

avoid

 

permanent

 

deposition

 

of

 

material

 

in

 

sewer

 

pipes.

 

The

 

new

 

Random

 

Forest

 

(RF)

 

based

 

model

 

was

 

implemented

 

using

 

experimental

 

data

 

col-

 

lected

 

from

 

the

 

literature.

 

The

 

accuracy

 

of

 

the

 

developed

 

model

 

was

 

evaluated

 

and

 

compared

 

with

 

ten

 

promising

 

literature

 

models

 

using

 

multiple

 

observed

 

datasets.

 

The

 

results

 

obtained

 

demonstrate

 

that

 

the

 

RF

 

model

 

is

 

able

 

to

 

make

 

predictions

 

with

 

high

 

accuracy

 

for

 

the

 

whole

 

dataset

 

used.

 

These

 

predictions

 

clearly

 

outperform

 

predictions

 

made

 

by

 

other

 

models,

 

especially

 

for

 

the

 

case

 

of

 

non-deposition

 

with

 

de-

 

posited

 

bed

 

criterion

 

that

 

is

 

used

 

for

 

designing

 

large

 

sewer

 

pipes.

 

The

 

volumetric

 

sediment

 

concentration

 

was

 

identified

 

as

 

the

 

most

 

important

 

parameter

 

for

 

predicting

 

self-cleansing

 

velocity.

 

© 2020

 

Elsevier

 

Ltd.

 

All

 

rights

 

reserved.

 

1.

 

Introduction

 

Designing

 

sediment-carrying

 

sewer

 

systems

 

is

 

a

 

well-known

 

field

 

of

 

research

 

in

 

hydraulic

 

engineering.

 

This

 

interest

 

is

 

explained

 

by

 

the

 

problems

 

related

 

to

 

the

 

presence

 

of

 

material

 

in

 

the

 

systems.

 

Due

 

to

 

the

 

varying

 

environmental

 

conditions

 

(i.e.

 

loading

 

and

 

sedi-

 

ment

 

characteristics

 

and

 

intermittent

 

flow),

 

the

 

risk

 

of

 

building

 

up

 

a

 

permanent

 

sediment

 

deposit

 

increases

 

during

 

dry

 

weather

 

sea-

 

sons.

 

These

 

deposits

 

lead

 

to

 

problems

 

such

 

as

 

reduced

 

pipe

 

capac-

 

ity,

 

increased

 

roughness,

 

and

 

premature

 

overflows.

 

As

 

an

 

example,

 

Ackers

 

et

 

al.

 

(2001)

 

showed

 

that

 

the

 

presence

 

of

 

a

 

permanent

 

de-

 

posit

 

at

 

the

 

bottom

 

of

 

sewer

 

pipes

 

increases

 

hydraulic

 

roughness

 

and

 

reduces

 

discharge

 

capacity

 

by

 

about

 

20%.

 

The

 

most

 

common

 

criterion

 

to

 

avoid

 

permanent

 

deposit

 

of

 

ma-

 

terial

 

in

 

sewer

 

pipes

 

is

 

known

 

as

 

non-deposition.

 

Several

 

authors

 

(

 

Safari

 

et

 

al.,

 

2018

 

;

 

Vongvisessomjai

 

et

 

al.,

 

2010

 

)

 

have

 

classified

 

this

 

criterion

 

into

 

two

 

subgroups:

 

1)

 

Non-deposition

 

without

 

deposited

 

bed

 

and

 

2)

 

Non-deposition

 

with

 

deposited

 

bed.

 

Both

 

groups

 

are

 

based

 

on

 

the

 

presence

 

of

 

sediments

 

at

 

the

 

bottom

 

of

 

the

 

pipe.

 

In

 

the

 

first

 

case,

 

high

 

water

 

velocities

 

produce

 

an

 

individual

 

and

 

sep-

 

arate

 

movement

 

of

 

the

 

particles

 

by

 

slicing

 

or

 

rolling

 

over

 

the

 

pipe

 

invert,

 

i.e.

 

without

 

deposited

 

bed.

 

In

 

contrast,

 

the

 

second

 

case

 

is

 

seen

 

when

 

lower

 

water

 

velocities

 

are

 

presented

 

and

 

the

 

particles

 

are

 

grouped

 

and

 

move

 

as

 

a

 

transitional

 

deposited

 

bed.

 

Corresponding

 author at: Cra 1 Este No. 19A – 40 Bogota, Colombia. 

E-mail

 

addresses:

 

cd.montes1256@uniandes.edu.co

 

(C.

 

Montes),

 

Z.Kapelan@tudelft.nl

 

(Z.

 Kapelan), 

jsaldarr@uniandes.edu.co

 

(J.

 Saldarriaga). 

In

 

the

 

case

 

of

 

‘without

 

deposited

 

bed’,

 

traditional

 

criteria

 

of

 

minimum

 

velocities

 

and

 

shear

 

stress

 

values

 

are

 

commonly

 

found

 

in

 

water

 

utilities

 

standards

 

and

 

industry

 

design

 

codes.

 

Generally,

 

these

 

standards

 

and

 

codes

 

suggest

 

values

 

ranging

 

from

 

0.30

 

m

 

s

 

−1

 

to

 

1.0

 

m

 

s

 

−1

 

for

 

minimum

 

velocity

 

and

 

from

 

1.0

 

Pa

 

to

 

4.0

 

Pa

 

for

 

shear

 

stress

 

(

 

Montes

 

et

 

al.,

 

2019

 

;

 

Nalluri

 

and

 

Ab

 

Ghani,

 

1996

 

;

 

Vongvisessomjai

 

et

 

al.,

 

2010

 

).

 

Several

 

authors

 

(

 

Merritt

 

and

 

Enfin-

 

ger,

 

2019

 

;

 

Nalluri

 

and

 

Ab

 

Ghani,

 

1996

 

)

 

have

 

shown

 

how

 

tradi-

 

tional

 

threshold

 

values

 

lead

 

to

 

over-design

 

of

 

small

 

diameter

 

pipes

 

and

 

under-design

 

of

 

large

 

diameter

 

pipes

 

(as

 

a

 

rule-of-thumb,

 

pipes

 

with

 

diameter

 

greater

 

than

 

500

 

mm).

 

Consequently,

 

large

 

sewers

 

commonly

 

require

 

frequent

 

removal

 

of

 

sediment

 

deposits

 

(

 

Ackers

 

et

 

al.,

 

2001

 

)

 

because

 

of

 

the

 

minimum

 

self-cleansing

 

value

 

adopted

 

during

 

the

 

design

 

stage.

 

A

 

unique

 

design

 

value

 

is

 

inad-

 

equate;

 

hence

 

sediment

 

characteristics

 

and

 

hydraulic

 

conditions

 

must

 

be

 

included

 

in

 

the

 

definition

 

of

 

the

 

self-cleansing

 

design

 

cri-

 

terion.

 

According

 

to

 

Safari

 

and

 

Aksoy

 

(2020)

 

,

 

existing

 

traditional

 

self-

 

cleansing

 

criteria

 

can

 

be

 

up

 

to

 

20%

 

different

 

from

 

laboratory-

 

scale

 

measured

 

values.

 

The

 

channel

 

cross-section

 

is

 

relevant

 

in

 

the

 

choice

 

of

 

the

 

self-cleansing

 

criterion.

 

For

 

example,

 

rectangu-

 

lar

 

cross-sections

 

require

 

lower

 

velocities

 

compared

 

to

 

V-bottom

 

or

 

U-shape

 

channels.

 

Even

 

criteria

 

based

 

on

 

the

 

Shields

 

diagram,

 

such

 

as

 

the

 

Camp

 

criterion,

 

seem

 

to

 

be

 

inadequate

 

to

 

define

 

the

 

self-cleansing

 

value

 

due

 

to

 

the

 

non-inclusion

 

of

 

sediment

 

concen-

 

tration.

 

The

 

above

 

has

 

motivated

 

extensive

 

experimental

 

research

 

(

 

Ab

 

Ghani,

 

1993

 

;

 

El-Zaemey,

 

1991

 

;

 

May,

 

1993

 

;

 

May

 

et

 

al.,

 

1989

 

;

 

https://doi.org/10.1016/j.watres.2020.116639

 

0043-1354/© 2020 Elsevier Ltd. All rights reserved.

 

/var/www/pavco.com.co/public/site/pdftohtml/0a11d602f962b8945d803abdfbf03967/index-html.html
background image

C.

 Montes, Z. Kapelan and J. Saldarriaga 

Water

 Research 189 (2021) 116639 

Mayerle,

 

1988

 

;

 

Montes

 

et

 

al.,

 

2020a

 

,

 

2020b

 

;

 

Ota,

 

1999

 

;

 

Perrusquía,

 

1991

 

;

 

Vongvisessomjai

 

et

 

al.,

 

2010

 

)

 

aiming

 

to

 

collect

 

data

 

and

 

developing

 

models

 

for

 

predicting

 

the

 

self-cleansing

 

velocity

 

as

 

a

 

function

 

of

 

sediment

 

characteristics

 

and

 

sewer

 

hydraulics,

 

based

 

on

 

the

 

concept

 

of

 

non-deposition.

 

These

 

studies

 

have

 

been

 

car-

 

ried

 

out

 

at

 

laboratory

 

scale

 

under

 

well-controlled

 

and

 

steady

 

flow

 

conditions,

 

using

 

non-cohesive

 

sediments.

 

Different

 

authors

 

col-

 

lected

 

data

 

in

 

pipes

 

with

 

different

 

materials

 

(e.g.

 

concrete,

 

acrylic

 

or

 

PVC,

 

among

 

other

 

materials)

 

and

 

internal

 

diameters,

 

ranging

 

from

 

100

 

mm

 

to

 

595

 

mm.

 

In

 

the

 

end,

 

all

 

these

 

studies

 

proposed

 

a

 

model

 

for

 

predicting

 

the

 

self-cleansing

 

conditions

 

in

 

practice

 

that

 

was

 

either

 

developed

 

with

 

their

 

own

 

experimental

 

data

 

or

 

using

 

the

 

benchmark

 

data

 

reported

 

in

 

the

 

literature.

 

Most

 

models

 

devel-

 

oped

 

are

 

regression-based

 

and

 

include

 

the

 

group

 

of

 

input

 

param-

 

eters

 

that

 

most

 

affect

 

the

 

prediction

 

of

 

the

 

self-cleansing

 

veloc-

 

ity

 

(

 

Ackers

 

et

 

al.,

 

2001

 

;

 

Ebtehaj

 

and

 

Bonakdari,

 

2016a

 

;

 

May

 

et

 

al.,

 

1996

 

).

 

Most

 

of

 

these

 

models

 

are

 

in

 

the

 

form

 

of:

 

V

 

l

 

 

gd

 

(

 

S

 

s

 

− 1

 

)

 

=

 

aC

 

b

 

v

 

d

 

R

 

or

 

d

 

D

 

c

 

λ

e

 

D

 

f

 

gr

 

W

 

b

 

Y

 

or

 

y

 

s

 

Y

 

or

 

y

 

s

 

D

 

g

 

P

 

B

 

h

 

(1)

 

where

 

V

 

l

 

is

 

the

 

self-cleansing

 

velocity,

 

the

 

mean

 

particle

 

diam-

 

eter,

 

the

 

gravity

 

acceleration

 

coefficient,

 

S

 

s

 

the

 

specific

 

gravity

 

of

 

sediments,

 

C

 

v

 

the

 

volumetric

 

sediment

 

concentration,

 

R

 

the

 

hy-

 

draulic

 

radius,

 

D

 

the

 

pipe

 

diameter,

 

λ

the

 

channel

 

friction

 

fac-

 

tor,

 

D

 

gr

 

the dimensionless grain

 

size

(

 

=

 

(

 

(

 

S

 

s

 

−1

)

 

g

 

d

 

3

 

ν

2

 

)

 

1

 

3

 

)

 

,

 

ν

the wa-

 

ter

 

kinematic

 

viscosity,

 

W

 

b

 

the

 

sediment

 

deposited

 

width,

 

P

 

the

 

wetted

 

perimeter,

 

y

 

s

 

the sediment

 

deposited

 

thickness,

 

B

 

the

 

wa-

 

ter

 

surface

 

width,

 

Y

 

the

 

water

 

level

 

and

 

a

 

,

 

b,

 

c,

 

e

 

,

 

,

 

and

 

h

 

re-

 

gression

 

coefficients.

 

Other

 

parameters

 

as

 

V

 

t

 

the

 

threshold

 

veloc-

 

ity

 

required

 

to

 

initiate

 

movement

 

(

 

=

 

0

 

.

 

125

 

(

 

gd

(

 

S

 

s

 

− 1

 

)

 

)

 

0

 

.

 

5

 

(

 

Y

/d

 

)

 

0

 

.

 

47

 

)

 

and

 

S

 

o

 

the

 

pipe

 

slope

 

have

 

also

 

been

 

included

 

in

 

regression

 

models

 

(

 

May

 

et

 

al.,

 

1996

 

;

 

Montes

 

et

 

al.,

 

2020a

 

).

 

Most

 

of

 

above

 

studies

 

for

 

both

 

non-deposition

 

criteria,

 

have

 

de-

 

veloped

 

predictive

 

models

 

which

 

tend

 

to

 

be

 

overfitted

 

to

 

their

 

own

 

experimental

 

data.

 

This

 

problem

 

can

 

be

 

seen

 

especially

 

in

 

the

 

ear-

 

lier

 

works,

 

where

 

no

 

advanced

 

techniques

 

were

 

used

 

to

 

develop

 

regression

 

models.

 

For

 

example,

 

several

 

authors

 

(

 

Montes

 

et

 

al.,

 

2020b

 

;

 

Safari

 

et

 

al.,

 

2018

 

)

 

have

 

pointed

 

out

 

that

 

early

 

work

 

of

 

Mayerle’s

 

(1988)

 

has

 

developed

 

a

 

model

 

that

 

shows

 

high

 

accu-

 

racy

 

prediction

 

with

 

its

 

data

 

and

 

poor

 

prediction

 

when

 

other

 

datasets

 

are

 

used.

 

In

 

contrast,

 

recent

 

regression-models,

 

which

 

used

 

novel

 

techniques

 

such

 

as

 

Evolutionary

 

Polynomial

 

Regression

 

– Multi-Objective

 

Genetic

 

Algorithm

 

(EPR-MOGA)

 

and

 

Least

 

Abso-

 

lute

 

Shrinkage

 

and

 

Selection

 

Operator

 

(LASSO)

 

have

 

demonstrated

 

better

 

prediction

 

results

 

(

 

Montes

 

et

 

al.,

 

2020a

 

,

 

2020b

 

).

 

In

 

order

 

to

 

address

 

the

 

above

 

overfitting

 

issue

 

in

 

regres-

 

sion

 

models,

 

new

 

Machine

 

Learning

 

(ML)

 

and

 

Artificial

 

Intelli-

 

gence

 

(AI)

 

techniques

 

have

 

been

 

introduced

 

for

 

predicting

 

the

 

self-

 

cleansing

 

velocity

 

based

 

on

 

the

 

concept

 

of

 

non-deposition

 

sed-

 

iment

 

transport.

 

Examples

 

of

 

models

 

developed

 

for

 

the

 

‘with-

 

out

 

deposited

 

bed’

 

case

 

include

 

using

 

techniques

 

such

 

as

 

Artifi-

 

cial

 

Neural

 

Network

 

(ANN)

 

(

 

Ebtehaj

 

and

 

Bonakdari,

 

2013

 

),

 

Sup-

 

port

 

Vector

 

Regression

 

(SVR)

 

coupled

 

with

 

the

 

Firefly

 

Algorithm

 

(

 

Ebtehaj

 

and

 

Bonakdari,

 

2016b

 

),

 

the

 

Group

 

Method

 

of

 

Data

 

Han-

 

dling

 

(GMDH)

 

(

 

Ebtehaj

 

and

 

Bonakdari,

 

2016a

 

),

 

neuro-fuzzy

 

in-

 

ference

 

system

 

combined

 

with

 

the

 

Particle

 

Swarm

 

Optimisation

 

(ANFIS-PSO)

 

(

 

Ebtehaj

 

et

 

al.,

 

2019

 

),

 

Decision

 

Trees

 

(DT),

 

Generalised

 

Regression

 

Neural

 

Network

 

(GRNN),

 

Multivariate

 

Adaptive

 

Regres-

 

sion

 

Splines

 

(MARS)

 

(

 

Safari,

 

2019

 

)

 

and

 

Extreme

 

Learning

 

Machine

 

(ELM)

 

(

 

Ebtehaj

 

et

 

al.,

 

2020

 

).

 

For

 

the

 

other

 

case

 

of

 

‘non-deposition

 

with

 

deposited

 

bed’,

 

fewer

 

ML/AI

 

type

 

models

 

have

 

been

 

devel-

 

oped.

 

Examples

 

include

 

models

 

based

 

on

 

Particle

 

Swarm

 

Optimisa-

 

tion

 

(PSO)

 

algorithm

 

(

 

Safari

 

et

 

al.,

 

2017

 

),

 

Gene

 

Expression

 

Program-

 

ming

 

(GEP)

 

(

 

Roushangar

 

and

 

Ghasempour,

 

2017

 

)

 

and

 

Multigene

 

Genetic

 

Programming

 

(MGP)

 

(

 

Safari

 

and

 

Danandeh

 

Mehr,

 

2018

 

).

 

The

 

above

 

models,

 

developed

 

using

 

different

 

ML/AI

 

tech-

 

niques

 

(for

 

both

 

non-deposition

 

criteria),

 

have

 

improved

 

the

 

prediction

 

accuracy

 

of

 

self-cleansing

 

velocities

 

and

 

addressed

 

the

 

issues

 

of

 

model

 

overfitting

 

but

 

only

 

partially.

 

As

 

noted

 

by

 

Zendehboudi

 

et

 

al.

 

(2018)

 

,

 

these

 

models

 

still

 

tend

 

to

 

have

 

rather

 

limited

 

extrapolation

 

capabilities

 

meaning

 

that

 

once

 

they

 

are

 

ap-

 

plied

 

to

 

datasets

 

that

 

were

 

not

 

used

 

for

 

their

 

training

 

they

 

tend

 

to

 

underperform.

 

Also,

 

the

 

ML/AI

 

based

 

models

 

developed

 

so

 

far

 

are

 

largely

 

black-box

 

type

 

models

 

(e.g.

 

ANN)

 

meaning

 

that,

 

un-

 

like

 

white-box

 

type

 

regression

 

models,

 

they

 

suffer

 

from

 

low

 

inter-

 

pretability

 

of

 

physical

 

significance

 

of

 

model

 

inputs

 

(i.e.

 

explanatory

 

factors),

 

and

 

interactions

 

with

 

the

 

model

 

output.

 

The

 

aim

 

of

 

this

 

paper

 

is

 

to

 

overcome

 

above

 

deficiencies

 

us-

 

ing

 

the

 

Random

 

Forest

 

(RF)

 

technique

 

for

 

predicting

 

self-cleansing

 

sewer

 

velocities.

 

RF

 

(

 

Breiman,

 

2001

 

)

 

is

 

a

 

flexible

 

and

 

interpretable

 

supervised

 

ML

 

technique

 

that

 

combines

 

the

 

results

 

(outputs)

 

of

 

multiple

 

individual

 

decision

 

trees

 

to

 

make

 

a

 

prediction

 

of

 

interest.

 

Due

 

to

 

its

 

good

 

characteristics

 

and

 

easy

 

application,

 

it

 

has

 

been

 

a

 

widely

 

used

 

for

 

addressing

 

many

 

other

 

problems

 

in

 

water

 

en-

 

gineering.

 

Tyralis

 

et

 

al.

 

(2019)

 

showed

 

a

 

full

 

review

 

of

 

studies

 

in

 

which

 

RF

 

was

 

successfully

 

applied

 

to

 

water

 

resources

 

problems.

 

Using

 

the

 

RF

 

technique,

 

a

 

new

 

predictive

 

self-cleansing

 

model

 

is

 

developed

 

and

 

presented

 

here

 

for

 

both

 

non-deposition

 

criteria

 

(with

 

and

 

without

 

deposited

 

bed).

 

This

 

model

 

aims

 

to

 

increase

 

prediction

 

accuracy

 

whilst

 

avoiding

 

overfitting

 

issues

 

and

 

enabling

 

interpretability

 

of

 

results

 

obtained.

 

The

 

new

 

modelling

 

technique

 

is

 

compared

 

to

 

ten

 

literature

 

models

 

using

 

multiple

 

datasets.

 

2.

 

Data

 

2.1.

 

Non-deposition

 

without

 

deposited

 

bed

 

data

 

Several

 

experimental

 

data

 

were

 

collected

 

from

 

the

 

literature

 

to

 

implement

 

the

 

RF

 

method.

 

Mayerle

 

(1988)

 

studied

 

the

 

sediment

 

transport

 

in

 

a

 

152

 

mm

 

diameter

 

pipe

 

and

 

in

 

two

 

rectangular

 

chan-

 

nels

 

of

 

311.5

 

mm

 

and

 

462.3

 

mm

 

bottom

 

width

 

(

 

W

 

)

 

using

 

granular

 

sands

 

ranging

 

from

 

0.50

 

mm

 

to

 

8.74

 

mm.

 

Ab

 

Ghani

 

(1993)

 

col-

 

lected

 

221

 

data

 

in

 

154

 

mm,

 

305

 

mm

 

and

 

450

 

mm

 

diameter

 

pipes,

 

testing

 

sands

 

between

 

0.46

 

mm

 

and

 

8.40

 

mm.

 

Ota

 

(1999)

 

used

 

a

 

225

 

mm

 

concrete

 

pipe

 

with

 

a

 

constant

 

slope

 

of

 

0.002,

 

vary-

 

ing

 

the

 

volumetric

 

sediment

 

concentration

 

between

 

4.2

 

ppm

 

to

 

59.4

 

ppm.

 

Vongvisessomjai

 

et

 

al.

 

(2010)

 

used

 

two

 

circular

 

PVC

 

pipes

 

of

 

100

 

mm

 

and

 

150

 

mm

 

diameter

 

to

 

study

 

the

 

bedload

 

and

 

suspended

 

load

 

transport.

 

Montes

 

et

 

al.

 

(2020a)

 

collected

 

ex-

 

perimental

 

data

 

in

 

a

 

242

 

mm

 

acrylic

 

pipe

 

using

 

granular

 

mate-

 

rial

 

with

 

a

 

mean

 

particle

 

diameter

 

of

 

0.35

 

mm

 

and

 

1.51

 

mm.

 

Montes

 

et

 

al.

 

(2020b)

 

carried

 

out

 

107

 

experiments

 

in

 

a

 

595

 

mm

 

PVC

 

pipe,

 

using

 

sediments

 

ranging

 

from

 

0.35

 

mm

 

to

 

2.6

 

mm.

 

2.2.

 

Non-deposition

 

with

 

deposited

 

bed

 

data

 

For

 

the

 

non-deposition

 

with

 

deposited

 

bed,

 

El-Zaemey

 

(1991)

 

studied

 

the

 

sediment

 

transport

 

in

 

a

 

305

 

mm

 

diameter

 

pipe,

 

using

 

granular

 

particles

 

ranging

 

from

 

0.53

 

mm

 

to

 

8.40

 

mm.

 

Perrusquía

 

(1991)

 

carried

 

out

 

experiments

 

in

 

a

 

225

 

mm

 

diame-

 

ter

 

pipe,

 

varying

 

the

 

sediment

 

concentration

 

from

 

18.7

 

ppm

 

to

 

408.0

 

ppm.

 

Ab

 

Ghani

 

(1993)

 

collected

 

the

 

deposited

 

bed

 

data

 

only

 

in

 

the

 

450

 

mm

 

concrete

 

pipe

 

and

 

using

 

granular

 

sand

 

with

 

a

 

mean

 

particle

 

diameter

 

of

 

0.72

 

mm.

 

May

 

(1993)

 

extended

 

their

 

previous

 

study

 

(

 

May

 

et

 

al.,

 

1989

 

)

 

and

 

collected

 

experimental

 

data

 

with

 

sediment

 

thickness

 

varying

 

from

 

57.6

 

mm

 

to

 

129.6

 

mm.

 

Finally,

 

Montes

 

et

 

al.

 

(2020b)

 

carried

 

out

 

experiments

 

in

 

a

 

595

 

mm

 

PVC

 

pipe,

 

considering

 

a

 

relative

 

sediment

 

thickness

 

(

 

y

 

s

 

/D

 

)

 

between

 

0.13%

 

and

 

1.11%.

 

Table

 

1

 

outlines

 

the

 

characteristics

 

of

 

the

 

data

 

used

 

for

 

developing

 

the

 

RF

 

algorithm.

 

2

 

/var/www/pavco.com.co/public/site/pdftohtml/0a11d602f962b8945d803abdfbf03967/index-html.html
background image

C.

 Montes, Z. Kapelan and J. Saldarriaga 

Water

 Research 189 (2021) 116639 

Ta

b

le

 1

 

Dat

use

fo

im

plementing

 dat

mining

 and

 r

e

gr

ession

 models.

 

R

e

fer

e

nce

 

N

o

n-deposition

 

crit

erion

 

No

of

 runs

 

Pipe

 diame

te

or

 

bo

tt

om

 width

 (mm)

 

Flo

w

 Velocity

 

(m/s)

 

Pipe

 slope

 (%)

 

Se

diment

 

Concentr

a

tion

 (ppm)

 

Se

diment

 thic

kness

 

bed

 (mm)

 

Ma

y

e

rl

(1988)

 

cir

cular

 c

h

annel

 

Without

 deposit

e

bed

 

106

 

152

 

0.37

 -

 1.10

 

0.13

 -

 0.56

 

20.0

 -

 1275.0

 

Ma

y

e

rl

(1988)

 

re

ct

angular

 c

h

annel

 

Without

 deposit

e

bed

 

105

 

311.5

 and

 462.3

 

0.41

 -’

 1.04

 

0.09

 –0

.6

14.0

 –

1568.0

 

Ab

 Ghani

 (1993)

 

Without

 deposit

e

bed

 

221

 

154,

 305

 and

 405

 

0.24

 -

 1.22

 

0.04

 -

 2.56

 

0.8

 -

 1450.0

 

Ot

(1999)

 

Without

 deposit

e

bed

 

36

 

305

 

0.39

 -

 0.74

 

0.2

 

4.2

 -

 59.4

 

Vongvisessomjai

 et

 al.

 (2010)

 

Without

 deposit

e

bed

 

45

 

100

 and

 150

 

0.24

 -

 0.63

 

0.20

 -

 0.60

 

4.0

 -

 90.0

 

Mont

e

et

 al.

 (2020a)

 

Without

 deposit

e

bed

 

44

 

242

 

0.24

 -

 1.05

 

0.20

 -

 0.80

 

0.3

 -

 875.7

 

Mont

e

et

 al.

 (2020b)

 

Without

 deposit

e

bed

 

107

 

595

 

0.41

 -

 1.41

 

0.04

 -

 3.43

 

1.3

 -

 19,957.0

 

El-Zaeme

(1991)

 

With

 deposit

e

bed

 

290

 

305

 

0.39

 -

 0.96

 

0.05

 -

 0.44

 

7.0

 -

 917.0

 

47.0

 –

120.0

 

Pe

rr

u

sq

u

ía

 (1991)

 

With

 deposit

e

bed

 

38

 

225

 

0.29

 -

 0.67

 

0.20

 -

 0.60

 

18.7

 -

 408.0

 

45.0

 –9

0

.0

 

Ab

 Ghani

 (1993)

 

With

 deposit

e

bed

 

26

 

450

 

0.49

 -

 1.33

 

0.07

 -

 0.47

 

21.0

 -

 1259.0

 

52.0

 –1

0

8

.0

 

Ma

(1993)

 

With

 deposit

e

bed

 

46

 

450

 

0.39

 -

 1.14

 

0.07

 -

 0.97

 

3.5

 -

 823.0

 

57.6

 –

129.6

 

Mont

e

et

 al.

 (2020b)

 

With

 deposit

e

bed

 

54

 

595

 

0.73

 -

 1.53

 

0.46

 -

 5.42

 

389.0

 -

 10,275.0

 

0.8

 –6

.6

 

As

 

shown

 

in

 

Table

 

1

 

,

 

a

 

total

 

of

 

664

 

and

 

454

 

data

 

are

 

available

 

for

 

the

 

development

 

of

 

models

 

without

 

deposited

 

bed

 

and

 

with

 

deposited

 

bed,

 

respectively.

 

3.

 

Mehodology

 

3.1.

 

Random

 

forest

 

model

 

Random

 

Forest

 

model

 

developed

 

here

 

predicts

 

the

 

par-

 

ticle

 

Froude

 

number

 

(

 

F

 

r

 

)

 

as

 

a

 

function

 

of

 

several

 

well-

 

known

 

dimensionless

 

explanatory

 

factors

 

(

 

Kargar

 

et

 

al.,

 

2019

 

;

 

Vongvisessomjai

 

et

 

al.,

 

2010

 

):

 

F

 

r

 

=

 

V

 

l

 

 

gd

 

(

 

S

 

s

 

− 1

 

)

 

=

 

f

 

C

 

v

 

,

 

D

 

gr

 

,

 

d

 

R

 

,

 

λ

,

 

y

 

s

 

D

 

(2)

 

Random

 

forest

 

(RF)

 

is

 

a

 

bagging

 

algorithm

 

for

 

regression

 

and

 

classification

 

problems

 

proposed

 

by

 

Breiman

 

(2001)

 

.

 

This

 

is

 

a

 

low-

 

variance

 

method,

 

which

 

randomly

 

split

 

the

 

training

 

data

 

and

 

the

 

input

 

variables

 

predictors

 

to

 

build

 

a

 

set

 

of

 

b

 

decision

 

trees

 

(

 

B

 

t

 

).

 

The

 

results

 

of

 

all

 

decision

 

trees

 

generated

 

from

 

bootstrapped

 

train-

 

ing

 

samples

 

(

 

T

 

b

 

(

 

x

 

;

b

 

)

 

)

 

are

 

then

 

averaged,

 

i.e.

 

the

 

final

 

result

 

(

 

ˆ

 

y

 

(

x

 

)

 

)

 

is

 

the

 

average

 

of

 

the

 

output

 

of

 

all

 

decision

 

trees

 

(as

 

shown

 

in

 

Eq.

 

(3)

 

).

 

This

 

procedure

 

ensures

 

the

 

reduction

 

of

 

the

 

model

 

vari-

 

ance

 

and

 

consequently

 

the

 

reduction

 

of

 

the

 

risk

 

of

 

overfitting.

 

A

 

simplified

 

conceptual

 

diagram

 

of

 

the

 

RF

 

method

 

is

 

shown

 

in

 

Fig.

 

1

 

.

 

ˆ

 

y

 

(

 

x

 

)

 

=

 

1

 

B

 

t

 

B

 

t

 

 

b

=1

 

T

 

(

 

x

 

;

b

 

)

 

(3)

 

In

 

this

 

paper,

 

the

 

R

 

package

 

‘RandomForest’

 

(

 

Liaw

 

and

 

Wiener,

 

2002

 

)

 

was

 

used

 

for

 

constructing

 

both

 

non-deposition,

 

without

 

deposited

 

bed

 

and

 

deposited

 

bed,

 

self-cleansing

 

models.

 

The

 

number

 

of

 

predictors

 

considered

 

at

 

each

 

split

 

(

 

mtry

 

)

 

and

 

the

 

number

 

of

 

trees

 

in

 

the

 

forest

 

(

 

B

 

t

 

)

 

are

 

the

 

parameters

 

that

 

define

 

the

 

structure

 

of

 

the

 

RF

 

regression

 

model.

 

The

 

mtry

 

parameter

 

is

 

estimated

 

by

 

using

 

the

 

rfcv()

 

function,

 

which

 

shows

 

the

 

cross-

 

validation

 

performance

 

for

 

each

 

number

 

of

 

predictors.

 

In

 

addition,

 

the

 

optimal

 

number

 

of

 

trees

 

is

 

defined

 

as

 

the

 

value

 

that

 

minimises

 

the

 

Mean

 

Square

 

Error

 

(MSE)

 

value

 

of

 

the

 

training

 

data.

 

These

 

pa-

 

rameters

 

are

 

estimated

 

and

 

the

 

results

 

are

 

shown

 

in

 

Fig.

 

2

 

.

 

Accord-

 

ing

 

to

 

this

 

figure,

 

the

 

optimal

 

number

 

of

 

features

 

(i.e.

 

the

 

random

 

predictors

 

used

 

in

 

each

 

tree)

 

are

 

three

 

and

 

four

 

non-dimensional

 

parameters

 

for

 

the

 

cases

 

of

 

without

 

deposited

 

bed

 

and

 

with

 

de-

 

posited

 

bed,

 

respectively.

 

Similarly,

 

the

 

optimal

 

number

 

of

 

trees

 

is

 

471

 

for

 

without

 

deposited

 

bed

 

and

 

229

 

for

 

with

 

deposited

 

bed.

 

Cross-validation

 

is

 

carried

 

out

 

during

 

the

 

training

 

stage

 

using

 

out-of-bag

 

(OOB)

 

samples.

 

As

 

mentioned

 

above,

 

the

 

method

 

ran-

 

domly

 

bootstraps

 

the

 

training

 

sample,

 

that

 

is,

 

some

 

of

 

the

 

train-

 

ing

 

data

 

are

 

left

 

out

 

to

 

build

 

each

 

decision

 

tree.

 

Only

 

two

 

out

 

of

 

three

 

parts

 

of

 

the

 

total

 

training

 

data

 

are

 

used

 

to

 

build

 

the

 

tree

 

(

 

Breiman,

 

2001

 

).

 

Based

 

on

 

this,

 

data

 

not

 

included

 

in

 

the

 

boot-

 

strapped

 

sample

 

(OOB

 

data)

 

are

 

predicted,

 

and

 

the

 

prediction

 

error

 

is

 

averaged

 

over

 

the

 

trees

 

that

 

do

 

not

 

include

 

these

 

data

 

(OOB

 

Er-

 

ror).

 

3.1.1.

 

Splitting

 

of

 

training

 

and

 

testing

 

data

 

The

 

whole

 

benchmarking

 

data

 

collected

 

from

 

the

 

literature

 

are

 

used

 

for

 

both

 

training

 

and

 

testing

 

stages

 

of

 

the

 

RF

 

model.

 

Usually,

 

75%

 

of

 

the

 

data

 

is

 

used

 

during

 

the

 

training

 

stage

 

of

 

the

 

model

 

and

 

the

 

other

 

25%

 

to

 

validate

 

the

 

results.

 

According

 

to

 

Safari

 

(2020)

 

,

 

the

 

range

 

of

 

variation

 

in

 

the

 

training

 

data

 

has

 

direct

 

implications

 

for

 

model

 

performance

 

(i.e.

 

accuracy).

 

As

 

a

 

result,

 

the

 

model

 

can

 

show

 

overfitting

 

issues

 

and

 

poor

 

extrapolation

 

capabilities

 

when

 

narrow

 

datasets

 

are

 

used

 

in

 

the

 

training

 

stage

 

(i.e.

 

data

 

with

 

a

 

low

 

range

 

of

 

variation).

 

3

 

/var/www/pavco.com.co/public/site/pdftohtml/0a11d602f962b8945d803abdfbf03967/index-html.html
background image

C.

 Montes, Z. Kapelan and J. Saldarriaga 

Water

 Research 189 (2021) 116639 

Fig.

 1. Simplified conceptual diagram of the RF method. 

Fig.

 2. Selection of the optimal Random forest parameters. 

Fig.

 3. Variation of the training and testing error using different combination of percentages between the training and testing dataset. A) Training stage and B) Testing stage. 

Checking

 

the

 

non-overfitting

 

of

 

the

 

RF

 

model

 

is

 

carried

 

out

 

by

 

using

 

several

 

sizes

 

in

 

the

 

training

 

and

 

testing

 

data

 

(i.e.

 

changing

 

the

 

percentage

 

of

 

data

 

used

 

as

 

training

 

and

 

testing)

 

and

 

by

 

ver-

 

ifying

 

the

 

error,

 

defined

 

by

 

the

 

Coefficient

 

of

 

Determination

 

(

 

R

 

2

 

)

 

(as

 

shown

 

in

 

Eq.

 

(14)

 

).

 

For

 

this,

 

ten

 

different

 

combinations

 

of

 

per-

 

centages

 

are

 

defined

 

(i.e.

 

%

 

of

 

the

 

training

 

data

 

:

 

%

 

of

 

the

 

test-

 

ing

 

data

 

=

 

[5:95,

 

15:85,

 

25:75,

 

35:65,

 

45:55,

 

55:45,

 

65:35,

 

75:25,

 

85:15,

 

95:5]),

 

randomly

 

changing

 

the

 

ranges

 

of

 

the

 

training

 

and

 

testing

 

data,

 

and

 

developing

 

100

 

RF

 

models

 

for

 

each

 

combination.

 

As

 

a

 

result,

 

10

 

0

 

0

 

RF

 

models

 

are

 

trained

 

and

 

the

 

error

 

is

 

estimated

 

for

 

both

 

training

 

and

 

testing

 

stage.

 

Using

 

this

 

information,

 

several

 

boxplots

 

are

 

constructed

 

showing

 

the

 

R

 

2

 

variation

 

for

 

each

 

stage.

 

Fig.

 

3

 

shows

 

how

 

the

 

model

 

error

 

decreases

 

as

 

the

 

training

 

sam-

 

ple

 

size

 

increases.

 

For

 

example,

 

when

 

only

 

5%

 

of

 

the

 

whole

 

dataset

 

is

 

used

 

for

 

training

 

the

 

model

 

and

 

the

 

remaining

 

95%

 

for

 

testing

 

it,

 

the

 

error

 

varies

 

between

 

0.84

 

and

 

0.96,

 

for

 

the

 

training

 

stage,

 

and

 

between

 

0.39

 

and

 

0.73

 

for

 

the

 

testing

 

stage.

 

This

 

clearly

 

shows

 

that

 

the

 

model

 

is

 

under-trained;

 

however,

 

when

 

the

 

ratio

 

is

 

greater

 

than

 

50:50

 

the

 

error

 

tends

 

to

 

be

 

constant

 

and

 

slightly

 

variable

 

for

 

both

 

stages.

 

Ratios

 

greater

 

than

 

90:10

 

tend

 

to

 

generate

 

unsatis-

 

factory

 

results

 

for

 

the

 

testing

 

stage,

 

i.e.

 

the

 

model

 

is

 

over-trained

 

and

 

shows

 

high

 

variation

 

in

 

the

 

error,

 

i.e.

 

overfitting,

 

(as

 

shown

 

in

 

Fig.

 

3

 

b).

 

Based

 

on

 

this,

 

a

 

combination

 

of

 

75:25

 

is

 

taken

 

as

 

optimal

 

for

 

implementing

 

the

 

model.

 

The

 

variation

 

of

 

the

 

data

 

used

 

for

 

training

 

and

 

testing

 

dataset

 

is

 

presented

 

in

 

Table

 

2

 

.

 

Using

 

the

 

above

 

considerations,

 

the

 

RF

 

model

 

is

 

implemented

 

with

 

the

 

optimal

 

parameters

 

defined

 

in

 

Fig.

 

2

 

and

 

using

 

the

 

ranges

 

of

 

variation

 

of

 

the

 

training

 

data

 

outlined

 

in

 

Table

 

2

 

.

 

The

 

full

 

data

 

collected

 

from

 

the

 

literature

 

are

 

shown

 

in

 

the

 

Supplementary

 

ma-

 

terial.

 

Table

 

S1

 

and

 

Table

 

S2

 

show

 

the

 

data

 

for

 

non-deposition

 

without

 

and

 

with

 

deposited

 

bed,

 

respectively,

 

and

 

the

 

correspond-

 

4

 

/var/www/pavco.com.co/public/site/pdftohtml/0a11d602f962b8945d803abdfbf03967/index-html.html
background image

C.

 Montes, Z. Kapelan and J. Saldarriaga 

Water

 Research 189 (2021) 116639 

Table

 2 

Variation

 of the data for training and testing the RF model. 

Non-deposition

 

criterion

 

Stage

 

No.

 of runs 

Channel

 geometry 

(mm)

 

Flow

 Velocity (m/s) 

Pipe

 slope (%) 

Sediment

 

Concentration

 

(ppm)

 

Sediment

 

thickness

 bed 

(mm)

 

Without

 deposited bed 

Training

 

498

 

D

 = 100.0 – 595.0 

W

 = 311.5 – 462.3 

0.237

 - 1.41 

0.04

 – 3.43 

0.53

 – 19,957 

Testing

 

166

 

D

 = 100.0 – 595.0 

W

 = 311.5 – 462.3 

0.237

 – 1.24 

0.04

 – 2.74 

1.00

 – 13,840 

With

 deposited bed 

Training

 

340

 

D

 = 225 – 595 

0.294

 – 1.53 

0.05

 – 5.42 

3.50

 - 10,274 

0.78

 – 129.6 

Testing

 

114

 

D

 = 225 – 595 

0.319

 – 1.28 

0.05

 – 2.58 

17.00

 - 9101 

1.78

 – 120.0 

Fig.

 4. Random Forest code to calculate the particle Froude number in sewer pipes. 

.

 

ing

 

RF

 

particle

 

Froude

 

number

 

predictions.

 

The

 

implemented

 

code

 

for

 

the

 

RF

 

method

 

is

 

shown

 

in

 

Fig.

 

4

 

.

 

An

 

example

 

of

 

one

 

of

 

the

 

471

 

decision

 

trees

 

generated

 

by

 

the

 

RF

 

model,

 

for

 

the

 

non-deposition

 

without

 

deposited

 

bed,

 

is

 

shown

 

in

 

Figure

 

S1,

 

in

 

the

 

Supplemen-

 

tary

 

material.

 

3.1.2.

 

Measure

 

of

 

feature

 

importance

 

Note

 

that

 

in

 

this

 

paper,

 

a

 

decrease

 

in

 

model

 

accuracy

 

when

 

the

 

j

 

th

 

variable

 

is

 

permuted

 

(i.e.

 

the

 

percentage

 

of

 

the

 

increase

 

in

 

the

 

MSE,

 

%

 

IncMSE)

 

is

 

considered

 

as

 

a

 

measure

 

of

 

the

 

importance

 

of

 

a

 

model

 

input

 

variable.

 

This

 

index

 

shows

 

the

 

strength

 

of

 

each

 

ex-

 

planatory

 

variable

 

based

 

on

 

the

 

reduction

 

of

 

the

 

MSE.

 

The

 

step-

 

by-step

 

to

 

calculate

 

the

 

%

 

IncMSE is

 

shown

 

as

 

follows

 

(

 

Hastie

 

et

 

al.,

 

2009

 

):

 

(1)

 

Calculate

 

the

 

MSE

 

of

 

the

 

OOB-sample

 

data

 

in

 

each

 

tree

 

of

 

the

 

forest

 

(

 

MS

 

E

 

b

 

).

 

(2)

 

Randomly

 

permute

 

the

 

value

 

of

 

the

 

j

 

th

 

explanatory

 

variable

 

and

 

calculate

 

the

 

MSE

 

(

 

MS

 

E

 

j

 

).

 

(3)

 

Finally,

 

calculate

 

%

 

IncMSE for

 

each

 

explanatory

 

variable

 

as:

 

%

 

IncM

 

SE

 

=

 

100

 

·

M

 

S

 

E

 

j

 

− M

 

S

 

E

 

b

 

M

 

S

 

E

 

b

 

(4)

 

As

 

a

 

result,

 

the

 

more

 

the

 

%

 

IncMSE increases

 

for

 

a

 

variable,

 

the

 

more

 

important

 

it

 

is.

 

3.2.

 

Performance

 

assessment

 

3.2.1.

 

Models

 

used

 

for

 

comparing

 

the

 

RF

 

results

 

In

 

order

 

to

 

evaluate

 

the

 

RF

 

model

 

performance,

 

it

 

is

 

com-

 

pared

 

to

 

several

 

literature

 

models.

 

The

 

models

 

selected

 

for

 

com-

 

parison

 

are

 

the

 

replicable

 

white-box

 

models

 

with

 

high

 

predic-

 

tion

 

accuracy

 

reported

 

in

 

the

 

literature

 

and

 

two

 

black-box

 

mod-

 

els

 

where

 

the

 

implementing

 

code

 

is

 

provided

 

in

 

the

 

original

 

pa-

 

pers.

 

Other

 

black-box

 

models

 

cannot

 

be

 

evaluated

 

due

 

to

 

the

 

lim-

 

ited

 

replicability

 

shown

 

by

 

these

 

models

 

(e.g.

 

ANN).

 

Based

 

on

 

this,

 

in

 

the

 

case

 

of

 

non-deposition

 

without

 

deposited

 

bed,

 

seven

 

mod-

 

els

 

selected

 

are

 

the

 

EPR-MOGA

 

model

 

(

 

Montes

 

et

 

al.,

 

2020a

 

),

 

the

 

GEP

 

model

 

(

 

Kargar

 

et

 

al.,

 

2019

 

),

 

the

 

MARS

 

model

 

(

 

Safari,

 

2019

 

),

 

the

 

May

 

et

 

al.

 

(1996)

 

model,

 

the

 

Safari

 

and

 

Aksoy

 

(2020)

 

model,

 

the

 

ANFIS-PSO

 

model

 

(

 

Ebtehaj

 

et

 

al.,

 

2019

 

)

 

and

 

the

 

ELM

 

model

 

(

 

Ebtehaj

 

et

 

al.,

 

2020

 

).

 

In

 

the

 

case

 

of

 

non-deposition

 

with

 

de-

 

posited

 

bed,

 

three

 

models

 

used

 

for

 

comparison

 

are

 

the

 

PSO

 

model

 

(

 

Safari

 

and

 

Shirzad,

 

2019

 

),

 

the

 

LASSO

 

model

 

(

 

Montes

 

et

 

al.,

 

2020b

 

)

 

and

 

the

 

MGP

 

model

 

(

 

Safari

 

and

 

Danandeh

 

Mehr,

 

2018

 

).

 

The

 

EPR-

 

MOGA,

 

LASSO,

 

May

 

et

 

al.

 

(1996)

 

and

 

Safari

 

and

 

Aksoy

 

(2020)

 

are

 

the

 

regression

 

type

 

models

 

whilst

 

GEP,

 

MARS,

 

ANFIS-PSO,

 

ELM,

 

PSO

 

and

 

MGP

 

models

 

make

 

use

 

of

 

ML/AI

 

techniques.

 

The

 

equations

 

used

 

by

 

above

 

ten

 

models

 

are

 

as

 

follows:

 

EPR-MOGA:

 

V

 

l

 

 

gd

 

(

 

S

 

s

 

− 1

 

)

 

=

 

5

 

.

 

6

 

C

 

0

 

.

 

16

 

v

 

d

 

R

 

−0

 

.

 

58

 

S

 

0

 

.

 

14

 

o

 

D

 

0

 

.

 

02

 

gr

 

(5)

 

GEP:

 

V

 

l

 

 

gd

 

(

 

S

 

s

 

− 1

 

)

 

=

 

3

 

.

 

05

 

C

 

0

 

.

 

16

 

v

 

atan

 

atan

 

 

d

 

R

 

+

 

atan

 

(

 

3

 

.

 

41

 

− ln

 

(

 

D

 

gr

 

)

 

)

 

+

 

atan

 

 

 

tan

 

 

8

 

.

 

37

 

− 7

 

.

 

99

 

λ

+

 

d

 

R

 

λ

2

 

 

2

 

 

 

+

 

ln

 

 

 

 

d

 

R

 

3

 

 

2

 

λ

 

 

(6)

 

MARS:

 

V

 

l

 

 

gd

 

(

 

S

 

s

 

− 1

 

)

 

=

 

7

 

.

 

26

 

− 1

 

.

 

75

 

· max

 

(

 

0

 

,

 

d

/R

 

− 0

 

.

 

12

 

)

 

+

 

2

 

· max

 

(

 

0

 

,

 

0

 

.

 

12

 

− d/R

 

)

 

+15

 

.

 

89

 

· max

 

(

 

0

 

,

 

C

 

v

 

− 0

 

.

 

44

 

)

 

− 16

 

.

 

42

 

· max

 

(

 

0

 

,

 

0

 

.

 

44

 

− C

 

v

 

)

 

+0

 

.

 

47

 

· max

 

(

 

0

 

,

 

D

 

gr

 

− 0

 

.

 

29

 

)

 

− 7

 

.

 

25

 

· max

 

(

 

0

 

,

 

λ

− 0

 

.

 

3

 

)

 

−16

 

.

 

03

 

· max

 

(

 

0

 

,

 

C

 

v

 

− 0

 

.

 

01

 

)

 

+

 

3

 

.

 

7

 

· max

 

(

 

0

 

,

 

D

 

gr

 

− 0

 

.

 

12

 

)

 

−4

 

.

 

33

 

· max

 

(

 

0

 

,

 

D

 

gr

 

− 0

 

.

 

08

 

)

 

+

 

0

 

.

 

43

 

· max

 

(

 

0

 

,

 

λ

− 0

 

.

 

59

 

)

 

+6

 

.

 

75

 

· max

 

(

 

0

 

,

 

λ

− 0

 

.

 

28

 

)

 

+

 

1

 

.

 

67

 

· max

 

(

 

0

 

,

 

d

/R

 

− 0

 

.

 

07

 

)

 

(7)

 

5

 

/var/www/pavco.com.co/public/site/pdftohtml/0a11d602f962b8945d803abdfbf03967/index-html.html
background image

C.

 Montes, Z. Kapelan and J. Saldarriaga 

Water

 Research 189 (2021) 116639 

May

 

et

 

al.

 

(1996)

 

:

 

C

 

v

 

=

 

0

 

.

 

0303

 

D

 

2

 

A

 

d

 

D

 

0

 

.

 

6

 

1

 

V

 

t

 

V

 

l

 

4

 

V

 

l

 

2

 

gD

 

(

 

S

 

s

 

− 1

 

)

 

1

 

.

 

5

 

(8)

 

Safari

 

and

 

Aksoy

 

(2020)

 

:

 

V

 

l

 

 

gd

 

(

 

S

 

s

 

− 1

 

)

 

=

 

4

 

.

 

83

 

C

 

0

 

.

 

09

 

v

 

d

 

R

 

−0

 

.

 

32

 

D

 

−0

 

.

 

14

 

gr

 

P

 

B

 

0

 

.

 

20

 

(9)

 

ANFIS-PSO:

 

No

 

equation.

 

The

 

Matlab

 

code

 

can

 

be

 

found

 

in

 

Ebtehaj

 

et

 

al.

 

(2019)

 

.

 

ELM:

 

V

 

l

 

 

gd

 

(

 

S

 

s

 

− 1

 

)

 

=

 

1

 

(

 

1

 

+

 

exp

 

(

 

InW

 

· InV

 

+

 

BHI

 

)

 

)

 

T

 

· OutW

 

(10)

 

where

 

InW

 

and

 

OutW

 

are

 

the

 

input

 

and

 

output

 

weights,

 

BHI the

 

bias

 

of

 

the

 

hidden

 

neurons

 

and

 

InV

 

the

 

input

 

variables

 

(i.e.

 

C

 

v

 

,

 

d

/R

 

,

 

D

 

2

 

/A

 

,

 

R

/D

 

,

 

D

 

gr

 

,

 

d

/D

 

and

 

λ

).

 

Full

 

details

 

of

 

the

 

values

 

chosen

 

for

 

each

 

parameter

 

are

 

shown

 

in

 

Ebtehaj

 

et

 

al.

 

(2020)

 

.

 

PSO:

 

V

 

l

 

 

gd

 

(

 

S

 

s

 

− 1

 

)

 

=

 

3

 

.

 

66

 

C

 

0

 

.

 

16

 

v

 

d

 

R

 

−0

 

.

 

40

 

y

 

s

 

Y

 

−0

 

.

 

10

 

(11)

 

LASSO:

 

V

 

l

 

 

gd

 

(

 

S

 

s

 

− 1

 

)

 

=

 

5

 

.

 

83

 

C

 

0

 

.

 

144

 

v

 

d

 

R

 

−0

 

.

 

305

 

λ

−0

 

.

 

059

 

D

 

−0

 

.

 

169

 

gr

 

y

 

s

 

D

 

−0

 

.

 

104

 

(12)

 

MGP:

 

V

 

l

 

 

gd

 

(

 

S

 

s

 

− 1

 

)

 

=

 

1

 

.

 

96

 

− 0

 

.

 

61

 

λ

− 0

 

.

 

51

 

C

 

v

 

+

 

1

 

.

 

18

 

D

 

0

 

.

 

50

 

gr

 

λ

1

 

.

 

50

 

+

 

0

 

.

 

61

 

2

 

C

 

v

 

+

 

d

 

R

 

0

 

.

 

50

 

− 2

 

.

 

45

 

d

 

R

 

1

 

/

 

8

 

(13)

 

3.2.2.

 

Performance

 

indices

 

The

 

RF

 

model

 

performance

 

is

 

evaluated

 

and

 

compared

 

to

 

above

 

ten

 

models

 

using

 

three

 

performance

 

indicators.

 

These

 

are

 

the

 

Co-

 

efficient

 

of

 

Determination

 

(

 

R

 

2

 

),

 

the

 

Root

 

Mean

 

Square

 

Error

 

(

 

RMSE

 

)

 

and

 

the

 

Mean

 

Absolute

 

Percentage

 

Error

 

(

 

MAPE

 

),

 

defined

 

as

 

fol-

 

lows:

 

R

 

2

 

=

 

1

 

 

n

 

i

 

=1

 

F

 

r

 

OBS

 

− F

 

r

 

MOD

 

2

 

 

n

 

i

 

=1

 

F

 

r

 

OBS

 

− F

 

r

 

OBS

 

2

 

(14)

 

RMSE

 

=

 

 

1

 

n

 

n

 

 

i

 

=1

 

F

 

r

 

OBS

 

− F

 

r

 

MOD

 

2

 

(15)

 

MAP

 

E

 

=

 

100

 

n

 

n

 

 

i

 

=1

 

F

 

r

 

OBS

 

− F

 

r

 

MOD

 

F

 

r

 

OBS

 

(16)

 

where

 

F

 

r

 

OBS

 

is

 

the

 

particle

 

Froude

 

number

 

observed

 

data,

 

F

 

r

 

MOD

 

the

 

particle

 

Froude

 

number

 

estimated

 

by

 

RF

 

algorithm

 

(or

 

other

 

pre-

 

dictive

 

model),

 

n

 

the

 

number

 

of

 

data

 

and

 

F

 

r

 

OBS

 

the

 

mean

 

of

 

ob-

 

served

 

particle

 

Froude

 

number

 

data.

 

The

 

Coefficient

 

of

 

Determination

 

measures

 

the

 

percentage

 

of

 

the

 

model

 

variance

 

that

 

can

 

be

 

explained.

 

This

 

coefficient

 

varies

 

between

 

0

 

and

 

1,

 

with

 

a

 

value

 

of

 

1

 

denoting

 

a

 

perfect

 

match

 

be-

 

tween

 

observed

 

and

 

modelled

 

data.

 

The

 

Root

 

Mean

 

Square

 

Error

 

measures

 

the

 

standard

 

deviation

 

of

 

the

 

residuals.

 

Note

 

that

 

a

 

value

 

close

 

to

 

0

 

indicates

 

high

 

model

 

prediction

 

accuracy.

 

Finally,

 

the

 

Mean

 

Absolute

 

Percentage

 

Error

 

assesses

 

the

 

model

 

prediction

 

ac-

 

curacy

 

(i.e.

 

bias)

 

as

 

a

 

percentage

 

of

 

the

 

observed

 

value.

 

Value

 

of

 

0

 

indicates

 

the

 

perfect

 

model

 

where

 

there

 

are

 

no

 

differences

 

be-

 

tween

 

predictions

 

and

 

observations.

 

4.

 

Results

 

The

 

results

 

obtained

 

by

 

using

 

the

 

methodology

 

shown

 

in

 

the

 

previous

 

section

 

are

 

presented

 

in

 

Tables

 

3

 

and

 

4

 

,

 

for

 

without

 

de-

 

posited

 

bed

 

and

 

deposited

 

bed

 

criteria,

 

respectively.

 

Graphically,

 

these

 

results

 

are

 

shown

 

in

 

Figs.

 

5

 

and

 

6

 

.

 

As

 

shown

 

in

 

these

 

tables,

 

for

 

the

 

MARS,

 

ANFIS-PSO,

 

ELM

 

and

 

MGP

 

models,

 

the

 

outliers

 

of

 

the

 

particle

 

Froude

 

number

 

(i.e.

 

F

 

r

 

<

 

0.00

 

and

 

F

 

r

 

>

 

20.00)

 

were

 

re-

 

moved.

 

This

 

is

 

because

 

these

 

models

 

can

 

produce

 

extreme

 

values

 

(e.g.

 

F

 

r

 

=

 

−58.67

 

or

 

F

 

r

 

=

 

163.59,

 

among

 

others)

 

that

 

misrepresent

 

the

 

model

 

comparison

 

when

 

evaluating

 

the

 

performance

 

indices.

 

As

 

it

 

can

 

be

 

seen

 

from

 

Table

 

3

 

,

 

Random

 

Forest

 

model

 

shows

 

a

 

better

 

generalisation

 

capacity

 

than

 

other

 

models

 

shown,

 

as

 

demonstrated

 

in

 

high

 

prediction

 

accuracy

 

observed

 

for

 

all

 

avail-

 

able

 

datasets

 

(0.88

 

>

 

R

 

2

 

>

 

0.98,

 

0.24

 

>

 

RMSE

>

 

0.73

 

and

 

4.36%

 

>

 

MAP

 

E

>

 

11.09%).

 

The

 

following

 

observations

 

can

 

be

 

made

 

from

 

the

 

performance

 

of

 

the

 

other

 

models

 

evaluated:

 

EPR-MOGA,

 

similarly

 

to

 

RF,

 

shows

 

good

 

results

 

but

 

has

 

infe-

 

rior

 

accuracy

 

in

 

large

 

sewer

 

pipes

 

(

 

R

 

2

 

=

 

0.86,

 

RMSE

=

 

1.03

 

and

 

MAP

 

E

=

 

11.31%).

 

In

 

addition,

 

EPR-MOGA

 

model

 

shows

 

limita-

 

tions

 

for

 

predicting

 

the

 

particle

 

Froude

 

number

 

in

 

non-circular

 

sections

 

(as

 

shown

 

in

 

the

 

Mayerle

 

(1988)

 

rectangular

 

data).

 

This

 

equation

 

shows

 

good

 

extrapolation

 

capabilities

 

because

 

of

 

the

 

inclusion

 

of

 

the

 

pipe

 

slope

 

as

 

input

 

feature

 

for

 

the

 

self-

 

cleansing

 

prediction.

 

GEP

 

shows

 

acceptable

 

results

 

(0.79

 

>

 

R

 

2

 

>

 

0.87,

 

0.66

 

>

 

RMSE

>

 

0.89

 

and

 

11.45%

 

>

 

MAP

 

E

>

 

22.33%)

 

for

 

the

 

datasets

 

used

 

for

 

its

 

development

 

in

 

circular

 

channels

 

(

 

Ab

 

Ghani,

 

1993

 

;

 

Mayerle,

 

1988

 

;

 

Vongvisessomjai

 

et

 

al.,

 

2010

 

)

 

and

 

poor

 

perfor-

 

mance

 

for

 

other

 

datasets

 

(0.00

 

>

 

R

 

2

 

>

 

0.76,

 

1.00

 

>

 

RMSE

>

 

1.95

 

and

 

14.35%

 

>

 

MAP

 

E

>

 

37.92%).

 

This

 

model

 

presents

 

good

 

performance

 

for

 

large

 

sewer

 

pipes.

 

In

 

contrast,

 

for

 

non-circular

 

channels

 

the

 

model

 

quickly

 

loss

 

accuracy.

 

According

 

to

 

Safari

 

(2019)

 

,

 

MARS

 

model

 

was

 

developed

 

by

 

us-

 

ing

 

the

 

experimental

 

data

 

collected

 

by

 

Mayerle

 

(1988)

 

(in

 

both

 

circular

 

and

 

rectangular

 

channels),

 

May

 

(1993)

 

,

 

Ab

 

Ghani

 

(1993)

 

and

 

Vongvisessomjai

 

et

 

al.

 

(2010)

 

.

 

As

 

a

 

result,

 

this

 

model

 

shows

 

acceptable

 

performance

 

for

 

these

 

datasets

 

(0.49

 

>

 

R

 

2

 

>

 

0.87,

 

0.81

 

>

 

RMSE

>

 

1.15

 

and

 

13.63%

 

>

 

MAP

 

E

>

 

28.08%)

 

but

 

poor

 

performance

 

for

 

the

 

remaining

 

datasets

 

(

 

R

 

2

 

=

 

0.00,

 

1.48

 

>

 

RMSE

>

 

2.88

 

and

 

29.14%

 

>

 

MAP

 

E

>

 

51.28%).

 

Based

 

on

 

the

 

above,

 

and

 

compared

 

to

 

the

 

RF

 

model,

 

limited

 

extrapolation

 

capabilities

 

are

 

identified

 

for

 

the

 

MARS

 

model.

 

May

 

et

 

al.

 

(1996)

 

is

 

the

 

best

 

regression-based

 

equation

 

re-

 

ported

 

in

 

the

 

literature

 

(

 

Ackers

 

et

 

al.,

 

2001

 

;

 

Ebtehaj

 

et

 

al.,

 

2014

 

),

 

as

 

it

 

was

 

developed

 

using

 

several

 

experimental

 

datasets.

 

This

 

is

 

the

 

equation

 

proposed

 

by

 

the

 

Construction

 

Industry

 

Re-

 

search

 

and

 

Information

 

Association

 

(CIRIA)

 

for

 

designing

 

self-

 

cleansing

 

sewer

 

pipes

 

transporting

 

coarser

 

granular

 

material

 

as

 

bedload

 

(

 

Ackers

 

et

 

al.,

 

2001

 

).

 

This

 

model

 

shows

 

good

 

perfor-

 

mance

 

for

 

pipe

 

diameters

 

less

 

than

 

500

 

mm

 

(0.83

 

>

 

R

 

2

 

>

 

0.99,

 

0.13

 

>

 

RMSE

>

 

0.82

 

and

 

2.38%

 

>

 

MAP

 

E

>

 

11.61%).

 

In

 

con-

 

trast,

 

limited

 

extrapolation

 

for

 

large

 

sewer

 

pipes

 

is

 

identified

 

as

 

the

 

low

 

performance

 

indices

 

values

 

obtained

 

(

 

R

 

2

 

=

 

0.00,

 

RMSE

=

 

4.88

 

and

 

MAP

 

E

=

 

48.97%).

 

This

 

equation

 

shows

 

better

 

performance

 

than

 

the

 

RF

 

model

 

when

 

compared

 

to

 

data

 

from

 

Vongvisessomjai

 

et

 

al.

 

(2010)

 

,

 

but

 

lower

 

accuracy

 

when

 

applied

 

to

 

the

 

rest

 

of

 

the

 

datasets.

 

6

 

/var/www/pavco.com.co/public/site/pdftohtml/0a11d602f962b8945d803abdfbf03967/index-html.html
background image

C.

 Montes, Z. Kapelan and J. Saldarriaga 

Water

 Research 189 (2021) 116639 

Table

 3 

Accuracy

 of self-cleansing models for without deposited bed criterion using performance indices for training and testing dataset. Bolded values show best performance 

model.

 

Dataset

 

Performance

 Index 

Model

 

RF

 

EPR-MOGA

 

GEP

 

MARS

 

May

 et al. (1996) a  Safari and Aksoy (2020) 

ANFIS-PSO

 

ELM

 

Training

 

R

 

2

 

0.98

 

0.90

 

0.75

 

0.00

 

0.27

 

0.74

 

0.51

 

0.30

 

RMSE

 

0.33

 

0.76

 

1.22

 

2.55

 

2.17

 

1.25

 

1.69

 

1.95

 

MAPE

 (%) 

4.88

 

11.54

 

23.52

  34.16 

17.49

 

17.21

 

19.32

 

29.76

 

Testing

 

R

 

2

 

0.91

 

0.86

 

0.69

 

0.00

 

0.09

 

0.74

 

0.40

 

0.32

 

RMSE

 

0.73

 

0.88

 

1.33

 

2.55

 

2.27

 

1.21

 

1.84

 

1.92

 

MAPE

 (%) 

11.09

 

12.35

 

26.43

  36.57 

19.15

 

17.24

 

20.95

 

29.82

 

Mayerle

 (1988) 

circular

 

R

 

2

 

0.96

 

0.89

 

0.87

 

0.87

 

0.87

 

0.75

 

0.80

 

0.42

 

RMSE

 

0.45

 

0.75

 

0.81

 

0.81

 

0.82

 

1.12

 

1.00

 

1.71

 

MAPE

 (%) 

5.62

 

8.90

 

14.77

  14.03 

11.49

 

14.91

 

17.92

 

26.75

 

Mayerle

 (1988) 

rectangular

 

R

 

2

 

0.93

 

0.38

 

0.30

 

0.81

 

0.87

 

0.00

 

0.47

 

RMSE

 

0.49

 

1.44

 

1.54

 

0.81

 

0.66

 

2.74

 

1.33

 

MAPE

 (%) 

8.49

 

28.97

 

33.00

  15.51 

13.14

 

45.28

 

20.75

 

Ab

 Ghani (1993) 

R

 

2

 

0.97

 

0.96

 

0.83

 

0.72

 

0.90

 

0.81

 

0.88

 

0.38

 

RMSE

 

0.36

 

0.43

 

0.89

 

1.15

 

0.67

 

0.94

 

0.74

 

1.69

 

MAPE

 (%) 

5.94

 

9.35

 

22.33

  28.08 

10.32

 

15.60

 

10.34

 

23.96

 

Ota

 (1999) 

R

 

2

 

0.97

 

0.98

 

0.44

 

0.00

 

0.96

 

0.97

 

0.97

 

0.55

 

RMSE

 

0.24

 

0.20

 

1.00

 

1.48

 

0.27

 

0.25

 

0.22

 

0.90

 

MAPE

 (%) 

5.55

 

6.90

 

37.92

  51.28 

7.78

 

7.90

 

6.46

 

19.54

 

Vongvisessomjai

 et al. (2010) 

R

 

2

 

0.88

 

0.95

 

0.79

 

0.49

 

0.99

 

0.71

 

0.97

 

0.00

 

RMSE

 

0.49

 

0.33

 

0.66

 

1.03

 

0.13

 

0.78

 

0.24

 

1.59

 

MAPE

 (%) 

6.56

 

5.78

 

11.45

  13.63 

2.38

 

13.34

 

3.62

 

28.50

 

Montes

 et al. (2020a) 

R

 

2

 

0.96

 

0.98

 

0.00

 

0.00

 

0.83

 

0.67

 

0.77

 

0.00

 

RMSE

 

0.31

 

0.25

 

1.64

 

2.37

 

0.67

 

0.94

 

0.75

 

1.85

 

MAPE

 (%) 

4.36

 

4.94

 

28.15

  49.73 

11.61

 

15.39

 

12.39

 

33.96

 

Montes

 et al. (2020b) 

R

 

2

 

0.94

 

0.86

 

0.76

 

0.00

 

0.00

 

0.34

 

0.00

 

0.00

 

RMSE

 

0.70

 

1.03

 

1.37

 

2.88

 

4.88

 

2.26

 

3.01

 

3.10

 

MAPE

 (%) 

7.33

 

11.31

 

14.35

  29.14 

48.97

 

23.44

 

30.56

 

39.30

 

a

 Model not valid for non-circular channels. 

Outliers

 removed. 

Table

 4 

Accuracy

 of self-cleansing models for deposited bed criterion using performance in- 

dices

 for training and testing dataset. Bolded values show best performance model. 

Dataset

 

Performance

 

Index

 

Model

 

RF

 

PSO

 

LASSO

 

MGP

 

Training

 

R

 

2

 

0.98

 

0.75

 

0.82

 

0.51

 

RMSE

 

0.32

 

1.30

 

1.13

 

1.69

 

MAPE

 (%) 

4.70

 

14.36

 

13.07

 

28.78

 

Testing

 

R

 

2

 

0.91

 

0.70

 

0.83

 

0.29

 

RMSE

 

0.80

 

1.47

 

1.10

 

2.19

 

MAPE

 (%) 

12.10

 

15.94

 

12.59

 

31.36

 

El-Zaemey

 (1991) 

R

 

2

 

0.94

 

0.78

 

0.83

 

0.54

 

RMSE

 

0.38

 

0.76

 

0.66

 

1.08

 

MAPE

 (%) 

6.49

 

14.28

 

11.97

 

30.19

 

Perrusquía

 (1991) 

R

 

2

 

0.84

 

0.65

 

0.62

 

0.00

 

RMSE

 

0.33

 

0.49

 

0.50

 

1.29

 

MAPE

 (%) 

7.07

 

10.15

 

12.05

 

30.58

 

Ab

 Ghani (1993) 

R

 

2

 

0.91

 

0.56

 

0.74

 

0.51

 

RMSE

 

0.60

 

1.32

 

1.01

 

1.40

 

MAPE

 (%) 

6.13

 

16.26

 

11.19

 

13.07

 

May

 (1993) 

R

 

2

 

0.90

 

0.63

 

0.64

 

0.54

 

RMSE

 

0.62

 

1.18

 

1.16

 

1.31

 

MAPE

 (%) 

6.50

 

13.47

 

14.26

 

14.21

 

Montes

 et al. (2020a) 

R

 

2

 

0.93

 

0.00

 

0.73

 

0.00

 

RMSE

 

0.81

 

3.06

 

1.56

 

5.54

 

MAPE

 (%) 

6.84

 

21.05

 

10.36

 

58.79

 

Outliers

 removed. 

Safari

 

and

 

Aksoy

 

(2020)

 

model

 

is

 

a

 

competitive

 

equation

 

for

 

predicting

 

the

 

self-cleansing

 

velocity

 

in

 

both

 

circular

 

and

 

non-

 

circular

 

channels.

 

This

 

model

 

shows

 

similar

 

but

 

inferior

 

per-

 

formance

 

to

 

EPR-MOGA

 

model

 

in

 

small

 

sewer

 

pipes

 

(0.67

 

>

 

R

 

2

 

>

 

0.97,

 

0.25

 

>

 

RMSE

>

 

1.12

 

and

 

7.90%

 

>

 

MAP

 

E

>

 

15.60%),

 

but

 

in

 

large

 

sewers

 

the

 

accuracy

 

is

 

quickly

 

lost

 

(

 

R

 

2

 

=

 

0.34,

 

RMSE

=

 

2.26

 

and

 

MAP

 

E

=

 

23.46%).

 

In

 

contrast,

 

this

 

model

 

outperforms

 

the

 

results,

 

compared

 

to

 

other

 

regression

 

models

 

(EPR-MOGA,

 

GEP

 

and

 

MARS)

 

and

 

ML/AI

 

models

 

(ANFIS-PSO

 

and

 

ELM),

 

in

 

non-circular

 

channels

 

(

 

R

 

2

 

=

 

0.87,

 

RMSE

=

 

0.66

 

and

 

MAP

 

E

=

 

13.41%),

 

which

 

is

 

a

 

competitive

 

performance

 

compared

 

to

 

the

 

RF

 

model

 

(

 

R

 

2

 

=

 

0.89,

 

RMSE

=

 

0.61

 

and

 

MAP

 

E

=

 

10.05%).

 

This

 

is

 

because

 

of

 

the

 

inclusion

 

of

 

the

 

P

/B

 

relation

 

as

 

explana-

 

tory

 

variable

 

for

 

predicting

 

the

 

particle

 

Froude

 

number.

 

This

 

model

 

is

 

competitive

 

and

 

shows

 

good

 

generalisation

 

of

 

the

 

problem

 

for

 

designing

 

sewers

 

under

 

the

 

non-deposition

 

with-

 

out

 

deposited

 

bed

 

criterion.

 

According

 

to

 

Ebtehaj

 

et

 

al.

 

(2019)

 

,

 

ANFIS-PSO

 

model

 

was

 

developed

 

by

 

using

 

the

 

experimental

 

data

 

collected

 

by

 

Ab

 

Ghani

 

(1993)

 

,

 

Ota

 

(1999)

 

and

 

Vongvisessomjai

 

et

 

al.

 

(2010)

 

.

 

As

 

a

 

result,

 

this

 

model

 

shows

 

good

 

performance

 

for

 

these

 

datasets

 

(0.88

 

>

 

R

 

2

 

>

 

0.97,

 

0.22

 

>

 

RMSE

>

 

0.74

 

and

 

3.62%

 

>

 

MAP

 

E

>

 

10.34%).

 

In

 

large

 

sewers

 

and

 

non-circular

 

channels,

 

the

 

model

 

losses

 

accuracy

 

(

 

R

 

2

 

=

 

0.00,

 

2.74

 

>

 

RMSE

>

 

3.01

 

and

 

30.56%

 

>

 

MAP

 

E

>

 

45.28%).

 

This

 

model

 

produces

 

some

 

extreme

 

values

 

when

 

the

 

particle

 

Froude

 

number

 

is

 

calculated,

 

espe-

 

cially

 

in

 

the

 

Montes

 

et

 

al.

 

(2020b)

 

dataset.

 

The

 

RF

 

model

 

gen-

 

erates

 

better

 

results

 

compared

 

to

 

this

 

model.

 

ELM

 

was

 

trained

 

with

 

the

 

same

 

dataset

 

used

 

for

 

the

 

ANFIS-PSO

 

model.

 

Not

 

satisfactory

 

results

 

are

 

obtained

 

when

 

this

 

model

 

is

 

applied

 

on

 

the

 

dataset

 

considered

 

in

 

this

 

study

 

(0.00

 

>

 

R

 

2

 

>

 

0.55,

 

0.90

 

>

 

RMSE

>

 

3.1

 

and

 

19.54%

 

>

 

MAP

 

E

>

 

39.30%).

 

Same

 

comments,

 

as

 

mentioned

 

above

 

for

 

the

 

ANFIS-PSO

 

model,

 

can

 

be

 

shown

 

here.

 

According

 

to

 

the

 

results

 

shown

 

in

 

Table

 

4

 

(deposited

 

bed

 

crite-

 

rion),

 

RF

 

model

 

outperforms

 

the

 

other

 

models

 

for

 

the

 

entire

 

con-

 

sidered

 

dataset.

 

This

 

model

 

shows

 

good

 

accuracy

 

levels

 

(0.84

 

>

 

R

 

2

 

>

 

0.98,

 

0.32

 

>

 

RMSE

>

 

0.81

 

and

 

4.70%

 

>

 

MAP

 

E

>

 

12.10%)

 

for

 

all

 

the

 

range

 

of

 

variation

 

of

 

the

 

hydraulics

 

and

 

sediment

 

characteris-

 

tics.

 

Comments

 

related

 

to

 

the

 

other

 

models

 

studied

 

are

 

as

 

follows:

 

7

 

/var/www/pavco.com.co/public/site/pdftohtml/0a11d602f962b8945d803abdfbf03967/index-html.html
background image

C.

 Montes, Z. Kapelan and J. Saldarriaga 

Water

 Research 189 (2021) 116639 

Fig.

 5. Performance of the models applied in the non-deposition without deposited bed testing dataset. 

8

 

/var/www/pavco.com.co/public/site/pdftohtml/0a11d602f962b8945d803abdfbf03967/index-html.html
background image

C.

 Montes, Z. Kapelan and J. Saldarriaga 

Water

 Research 189 (2021) 116639 

Fig.

 6. Performance of the models applied in the non-deposition with deposited bed testing dataset. . 

PSO

 

model

 

was

 

developed

 

by

 

using

 

the

 

experimental

 

data

 

col-

 

lected

 

by

 

El-Zaemey

 

(1991)

 

,

 

Perrusquía

 

(1991)

 

,

 

May

 

(1993)

 

and

 

Ab

 

Ghani

 

(1993)

 

.

 

As

 

a

 

result,

 

this

 

model

 

shows

 

good

 

perfor-

 

mance

 

for

 

these

 

datasets

 

(0.56

 

>

 

R

 

2

 

>

 

0.78,

 

0.49

 

>

 

RMSE

>

 

1.32

 

and

 

10.15%

 

>

 

MAP

 

E

>

 

16.26%).

 

However,

 

when

 

the

 

model

 

is

 

compared

 

to

 

the

 

data

 

collected

 

in

 

the

 

large

 

sewer

 

pipe,

 

the

 

accuracy

 

quickly

 

decreases

 

(

 

R

 

2

 

=

 

0.00,

 

RMSE

=

 

3.06

 

and

 

MAP

 

E

=

 

21.05%).

 

LASSO

 

model

 

reports

 

good

 

accuracy

 

levels

 

for

 

all

 

the

 

datasets

 

considered

 

(0.62

 

>

 

R

 

2

 

>

 

0.83,

 

0.50

 

>

 

RMSE

>

 

1.56

 

and

 

10.36%

 

>

 

MAP

 

E

>

 

14.26%).

 

However,

 

the

 

accuracy

 

is

 

still

 

inferior

 

com-

 

pared

 

to

 

the

 

RF

 

model.

 

This

 

model

 

shows

 

good

 

extrapolation

 

capabilities

 

and

 

generalisation

 

of

 

the

 

problem.

 

MGP

 

was

 

developed

 

by

 

using

 

the

 

same

 

experimental

 

datasets

 

of

 

the

 

PSO

 

model.

 

This

 

model

 

shows

 

less

 

accuracy

 

compared

 

to

 

the

 

PSO

 

model

 

(0.00

 

>

 

R

 

2

 

>

 

0.54,

 

1.08

 

>

 

RMSE

>

 

5.54

 

and

 

13.07%

 

>

 

MAP

 

E

>

 

58.79%).

 

In

 

large

 

sewer

 

pipes,

 

the

 

model

 

shows

 

poor

 

performance.

 

In

 

contrast

 

to

 

other

 

models,

 

the

 

MGP

 

was

 

developed

 

by

 

using

 

normalised

 

values.

 

Based

 

on

 

this,

 

the

 

range

 

of

 

variation

 

used

 

for

 

training

 

the

 

model

 

can

 

potentially

 

affect

 

the

 

final

 

form/structure

 

of

 

the

 

final

 

expression

 

shown

 

by

 

the

 

MGP.

 

RF

 

accuracy

 

shown

 

in

 

the

 

Montes

 

et

 

al.

 

(2020b)

 

data

 

is

 

es-

 

pecially

 

important

 

due

 

to

 

the

 

relative

 

sediment

 

thickness

 

(

 

y

 

s

 

/D

 

)

 

used

 

at

 

laboratory

 

scale

 

in

 

that

 

study.

 

As

 

Table

 

1

 

shows,

 

the

 

sedi-

 

ment

 

thickness

 

used

 

at

 

laboratory

 

scale

 

ranging

 

from

 

0.8

 

mm

 

(for

 

Montes

 

et

 

al.

 

(2020b)

 

data)

 

to

 

129.6

 

mm

 

(for

 

May

 

(1993)

 

data),

 

i.e.

 

the

 

variation

 

of

 

y

 

s

 

/D

 

is

 

from

 

1.1%

 

to

 

20.0%

 

of

 

the

 

pipe

 

diame-

 

ter.

 

Values

 

of

 

y

 

s

 

/D

 

=

 

20%

 

is

 

an

 

unrealistic

 

consideration

 

since

 

the

 

optimal

 

sediment

 

thickness

 

design

 

has

 

been

 

defined

 

as

 

1%

 

of

 

the

 

pipe

 

diameter

 

(

 

May

 

et

 

al.,

 

1989

 

;

 

Safari

 

and

 

Shirzad,

 

2019

 

).

 

Data

 

collected

 

by

 

Montes

 

et

 

al.

 

(2020b)

 

seem

 

to

 

be

 

the

 

closer

 

repre-

 

sentation

 

of

 

the

 

real

 

conditions

 

found

 

in

 

sewer

 

systems.

 

Based

 

on

 

Fig.

 7. Variable importance estimated by RF model: A) without deposited bed; B) 

with

 deposited bed. 

this,

 

RF

 

is

 

the

 

model

 

that

 

best

 

predicts

 

the

 

self-cleansing

 

velocity

 

for

 

data

 

close

 

to

 

real

 

conditions.

 

4.1.

 

Variable

 

importance

 

RF

 

model

 

input

 

variable

 

importance

 

is

 

presented

 

in

 

Fig.

 

7

 

.

 

As

 

shown

 

in

 

this

 

figure,

 

for

 

both

 

non-deposition

 

criteria

 

the

 

most

 

important

 

variable

 

is

 

the

 

volumetric

 

sediment

 

concentration,

 

fol-

 

lowed

 

by

 

the

 

dimensionless

 

grain

 

size

 

and

 

the

 

relative

 

grain

 

size.

 

This

 

result

 

is

 

consistent

 

with

 

previous

 

findings

 

reported

 

in

 

the

 

lit-

 

erature

 

(

 

Ackers

 

et

 

al.,

 

2001

 

;

 

Ebtehaj

 

et

 

al.,

 

2020

 

).

 

Less

 

important

 

parameters

 

for

 

predicting

 

the

 

particle

 

Froude

 

number

 

and

 

thus

 

the

 

self-cleansing

 

velocity,

 

are

 

the

 

relative

 

sediment

 

thickness

 

and

 

the

 

channel

 

friction

 

factor,

 

for

 

the

 

deposited

 

bed

 

criterion.

 

Parameter

 

importance

 

shown

 

by

 

EPR-MOGA,

 

Safari

 

and

 

Ak-

 

soy

 

(2020)

 

,

 

PSO

 

and

 

LASSO

 

is

 

quite

 

different.

 

In

 

these

 

tech-

 

niques,

 

the

 

most

 

important

 

parameter

 

is

 

the

 

relative

 

grain

 

9

 

/var/www/pavco.com.co/public/site/pdftohtml/0a11d602f962b8945d803abdfbf03967/index-html.html
background image

C.

 Montes, Z. Kapelan and J. Saldarriaga 

Water

 Research 189 (2021) 116639 

size

 

due

 

to

 

the

 

highest

 

values

 

of

 

the

 

regression

 

coefficients

 

(

 

(

 

d

 

R

 

)

 

c

 

; 0

 

.

 

305

 

<

 

c

 

<

 

0

 

.

 

58

 

)

 

,

 

as

 

shown

 

in

 

Eqs.

 

(5)

 

,

 

(9)

 

,

 

(11)

 

and

 

(12)

 

.

 

The

 

parameter

 

importance

 

for

 

the

 

GEP,

 

MARS

 

and

 

MGP

 

model

 

is

 

less

 

intuitive

 

because

 

of

 

the

 

form

 

of

 

the

 

equations,

 

as

 

shown

 

in

 

Eqs.

 

(6)

 

,

 

(7)

 

and

 

(13)

 

,

 

which

 

include

 

logarithmic

 

and

 

inverse

 

tangent

 

functions

 

for

 

calculating

 

the

 

particle

 

Froude

 

number.

 

Less

 

comparable

 

are

 

the

 

results

 

shown

 

by

 

ANFIS-PSO

 

and

 

ELM

 

since

 

no

 

practical

 

equation

 

is

 

provided.

 

Based

 

on

 

the

 

above

 

results

 

shown

 

in

 

Fig.

 

7

 

,

 

a

 

good

 

estimate

 

of

 

the

 

volumetric

 

sediment

 

concentration

 

seems

 

to

 

be

 

essential

 

for

 

increasing

 

the

 

accuracy

 

of

 

the

 

calculation

 

of

 

the

 

particle

 

Froude

 

number

 

and

 

consequently

 

the

 

minimum

 

self-cleansing

 

velocity

 

for

 

both

 

non-deposition

 

criteria.

 

In

 

addition,

 

hydraulic

 

characteristics

 

of

 

the

 

pipe

 

(defined

 

by

 

the

 

hydraulic

 

radius)

 

and

 

the

 

sediment

 

characteristics

 

(i.e.

 

particle

 

diameter

 

and

 

specific

 

gravity)

 

are

 

pro-

 

portionally

 

important

 

for

 

model

 

performance.

 

5.

 

Discussion

 

The

 

prediction

 

of

 

self-cleansing

 

conditions

 

in

 

sewers

 

remains

 

a

 

challenge

 

despite

 

multiple

 

models

 

and

 

equations

 

developed

 

and

 

reported

 

in

 

the

 

literature.

 

Existing

 

regression-based

 

equations

 

and

 

AI/ML

 

models

 

show

 

limited

 

generalisation

 

capabilities

 

and

 

overfit-

 

ting

 

problems.

 

In

 

this

 

paper,

 

a

 

new

 

approach

 

for

 

addressing

 

these

 

issues

 

is

 

proposed

 

by

 

using

 

the

 

Random

 

Forest

 

method.

 

Due

 

to

 

the

 

nature

 

of

 

the

 

RF

 

method,

 

where

 

the

 

model

 

variance

 

is

 

reduced

 

by

 

averaging

 

the

 

results

 

from

 

an

 

ensemble

 

of

 

decision

 

trees,

 

the

 

risk

 

of

 

overfitting

 

is

 

low.

 

By

 

using

 

a

 

reduced

 

number

 

of

 

input

 

features

 

for

 

constructing

 

each

 

decision

 

tree

 

in

 

the

 

forest,

 

the

 

correlation

 

between

 

base

 

trees

 

is

 

avoided.

 

This

 

is

 

an

 

improvement

 

of

 

the

 

method

 

compared

 

to

 

a

 

single

 

decision

 

tree,

 

which

 

can

 

be

 

overtrained

 

(i.e.

 

the

 

tree

 

learns

 

the

 

noise

 

from

 

the

 

training

 

data)

 

and

 

thus

 

shows

 

poor

 

performance

 

in

 

the

 

testing

 

dataset.

 

RF

 

model

 

showed

 

good

 

generalisation

 

capabilities

 

when

 

the

 

whole

 

dataset

 

is

 

divided

 

into

 

75%

 

for

 

the

 

training

 

stage

 

and

 

25%

 

for

 

the

 

testing

 

stage.

 

For

 

this

 

percentage

 

of

 

split

 

data,

 

the

 

testing

 

error

 

presented

 

a

 

low

 

variance.

 

In

 

contrast,

 

by

 

increasing

 

the

 

num-

 

ber

 

of

 

data

 

used

 

in

 

the

 

training

 

stage

 

(e.g.

 

95%

 

of

 

the

 

whole

 

data)

 

the

 

testing

 

error

 

showed

 

high

 

variance,

 

which

 

is

 

an

 

indicator

 

of

 

an

 

over-trained

 

model

 

with

 

limited

 

extrapolation

 

capabilities

 

(as

 

shown

 

in

 

Fig.

 

3

 

b).

 

Therefore,

 

choosing

 

the

 

right

 

percentage

 

split

 

is

 

critical

 

to

 

avoid

 

model

 

overfitting.

 

Variable

 

importance

 

analysis

 

showed

 

that

 

the

 

volumetric

 

sedi-

 

ment

 

concentration

 

is

 

the

 

most

 

relevant

 

feature

 

for

 

predicting

 

the

 

self-cleansing

 

velocity

 

in

 

practice

 

for

 

both

 

non-deposition

 

criteria,

 

followed

 

by

 

the

 

dimensionless

 

grain

 

size.

 

The

 

self-cleansing

 

predic-

 

tion

 

is

 

no

 

conditioned

 

by

 

the

 

channel

 

material,

 

as

 

the

 

low

 

variable

 

importance

 

shown

 

by

 

the

 

channel

 

friction

 

factor.

 

RF

 

results

 

are

 

compared

 

to

 

existing

 

models

 

reported

 

in

 

the

 

lit-

 

erature

 

and

 

showed

 

better

 

performance

 

for

 

the

 

whole

 

dataset

 

for

 

both

 

non-deposition

 

without

 

and

 

with

 

deposited

 

bed

 

criteria.

 

This

 

is

 

explained

 

by

 

several

 

factors,

 

such

 

as:

 

RF

 

is

 

able

 

to

 

better

 

capture

 

the

 

non-linearity

 

in

 

the

 

data

 

com-

 

pared

 

to

 

linear

 

regression

 

models

 

(i.e.

 

regression-based

 

models

 

proposed

 

by

 

May

 

et

 

al.

 

(1996)

 

and

 

Safari

 

and

 

Aksory

 

(2020)

 

).

 

The

 

RF

 

model

 

also

 

better

 

captures

 

complex

 

interactions

 

be-

 

tween

 

features.

 

This

 

is

 

because

 

of

 

RF

 

model’s

 

ability

 

to

 

capture

 

effectively

 

non-linear

 

patterns

 

in

 

data.

 

RF

 

showed

 

a

 

good

 

bias-variance

 

trade-off (i.e.

 

low

 

bias

 

and

 

low

 

variance)

 

for

 

both

 

non-deposition

 

criteria.

 

In

 

contrast,

 

exist-

 

ing

 

non-regression

 

models

 

reported

 

in

 

the

 

literature

 

(i.e.

 

MARS,

 

ANFIS-PSO

 

and

 

ELM),

 

and

 

compared

 

to

 

the

 

RF

 

model

 

in

 

this

 

paper,

 

in

 

some

 

cases

 

presented

 

low

 

bias

 

and

 

high

 

variance

 

(i.e.

 

overfitting)

 

for

 

the

 

non-deposition

 

without

 

deposited

 

bed

 

crite-

 

rion,

 

as

 

shown

 

in

 

Fig.

 

5

 

.

 

For

 

the

 

non-deposition

 

with

 

deposited

 

bed

 

criterion,

 

the

 

existing

 

models

 

(i.e.

 

PSO,

 

LASSO

 

and

 

MGP)

 

showed

 

high

 

bias,

 

since

 

these

 

models

 

systematically

 

underes-

 

timate

 

the

 

particle

 

Froude

 

number

 

in

 

the

 

testing

 

dataset

 

(as

 

shown

 

in

 

Fig.

 

6

 

).

 

The

 

range

 

of

 

variation

 

used

 

for

 

training

 

and

 

testing

 

the

 

RF

 

model

 

is

 

much

 

larger

 

than

 

the

 

dataset

 

used

 

in

 

the

 

literature

 

for

 

developing

 

the

 

existing

 

predictive

 

models.

 

For

 

example,

 

the

 

ANFIS-PSO

 

and

 

ELM

 

were

 

trained

 

and

 

testing

 

with

 

the

 

Ab

 

Ghani

 

(1993)

 

,

 

Ota

 

(1999)

 

and

 

Vongvisessomjai

 

et

 

al.

 

(2010)

 

data

 

(i.e.

 

290

 

data

 

approx.).

 

Given

 

this,

 

the

 

RF

 

model

 

developed

 

here

 

is

 

able

 

to

 

predict

 

the

 

particle

 

Froude

 

number

 

for

 

a

 

larger

 

range

 

of

 

variation

 

of

 

the

 

input

 

conditions.

 

An

 

example

 

of

 

this

 

is

 

shown

 

in

 

Fig.

 

6

 

where

 

the

 

existing

 

models

 

reported

 

for

 

the

 

non-deposition

 

with

 

deposited

 

bed

 

criterion

 

underestimate

 

the

 

particle

 

Froude

 

number

 

for

 

values

 

above

 

9.0

 

(

 

F

 

r

 

>

 

9.0).

 

Despite

 

the

 

RF

 

presented

 

in

 

this

 

study

 

outperforms

 

the

 

existing

 

models

 

reported

 

in

 

the

 

literature,

 

further

 

tests

 

with

 

data

 

collected

 

in

 

real

 

sewers

 

should

 

be

 

conducted.

 

The

 

cohesive

 

effects

 

of

 

the

 

de-

 

posited

 

material

 

must

 

be

 

included

 

for

 

future

 

developments.

 

Finally,

 

further

 

evaluation

 

of

 

the

 

performance

 

of

 

the

 

model

 

in

 

trapezoidal,

 

ovoid,

 

or

 

U-shape

 

channels

 

should

 

be

 

carried

 

out

 

to

 

check

 

the

 

ap-

 

plicability

 

of

 

the

 

model

 

under

 

these

 

channel

 

characteristics.

 

6.

 

Conclusions

 

Random

 

Forest

 

based

 

model

 

was

 

developed

 

for

 

predicting

 

the

 

self-cleansing

 

velocity

 

under

 

the

 

concept

 

of

 

non-deposition.

 

This

 

model

 

was

 

implemented

 

using

 

the

 

experimental

 

benchmark

 

data

 

reported

 

in

 

the

 

literature.

 

The

 

RF

 

model

 

was

 

compared

 

to

 

the

 

fol-

 

lowing

 

ten

 

literature

 

models:

 

EPR-MOGA,

 

MARS,

 

MGP,

 

ANFIS-PSO,

 

ELM,

 

LASSO,

 

GEP

 

and

 

PSO,

 

and

 

two

 

regression-based

 

equations

 

proposed

 

by

 

May

 

et

 

al.

 

(1996)

 

and

 

Safari

 

and

 

Aksoy

 

(2020)

 

.

 

The

 

following

 

conclusions

 

are

 

made

 

based

 

on

 

the

 

results

 

ob-

 

tained:

 

(1)

 

Random

 

Forest

 

model

 

is

 

able

 

to

 

predict

 

the

 

particle

 

Froude

 

number

 

(i.e.

 

minimum

 

self-cleansing

 

velocity)

 

for

 

the

 

non-

 

deposition

 

self-cleansing

 

design

 

criteria

 

with

 

high

 

accuracy

 

on

 

validation

 

(i.e.

 

unseen)

 

data.

 

This

 

is

 

due

 

to

 

the

 

ability

 

of

 

RF

 

to

 

better

 

generalise

 

the

 

analysed

 

data,

 

i.e.

 

the

 

ability

 

to

 

avoid

 

model

 

overfitting.

 

(2)

 

RF

 

model

 

prediction

 

accuracy

 

is

 

consistently

 

superior

 

to

 

ten

 

other

 

literature

 

models

 

considered

 

here.

 

This

 

is

 

likely

 

due

 

to

 

the

 

reason

 

mentioned

 

above

 

but

 

also

 

the

 

capability

 

to

 

bet-

 

ter

 

capture

 

the

 

complex

 

interactions

 

between

 

input

 

variables

 

when

 

compared

 

to

 

other

 

models

 

considered

 

in

 

this

 

paper.

 

This

 

is

 

especially

 

relevant

 

for

 

the

 

non-deposition

 

with

 

de-

 

posited

 

bed

 

case

 

where

 

the

 

accuracy

 

of

 

RF

 

model

 

predictions

 

is

 

substantially

 

higher

 

than

 

in

 

other

 

models

 

(i.e.

 

LASSO,

 

MGP

 

and

 

PSO

 

models).

 

(3)

 

The

 

volumetric

 

sediment

 

concentration

 

is

 

the

 

most

 

impor-

 

tant

 

input

 

variable

 

for

 

predicting

 

the

 

self-cleansing

 

veloc-

 

ity

 

in

 

sewer

 

pipes.

 

A

 

good

 

characterisation

 

of

 

this

 

parame-

 

ter

 

seems

 

to

 

be

 

essential

 

for

 

improving

 

the

 

design

 

of

 

new

 

self-cleansing

 

sewers.

 

Based

 

on

 

the

 

above,

 

RF

 

can

 

be

 

used

 

for

 

predicting

 

self-cleansing

 

velocity

 

with

 

high

 

accuracy,

 

especially

 

for

 

large

 

sewer

 

pipes

 

with

 

the

 

presence

 

of

 

deposited

 

bed.

 

This

 

technique

 

can

 

be

 

used

 

for

 

de-

 

signing

 

self-cleansing

 

sewer

 

systems.

 

Further

 

testing

 

of

 

the

 

RF

 

and

 

other

 

self-cleansing

 

models

 

in

 

real

 

sewer

 

systems

 

is

 

required

 

to

 

further

 

validate

 

these

 

models

 

in

 

those

 

circumstances

 

and

 

ensure

 

their

 

applicability

 

in

 

engineering

 

prac-

 

tice.

 

10

 

/var/www/pavco.com.co/public/site/pdftohtml/0a11d602f962b8945d803abdfbf03967/index-html.html
background image

C.

 Montes, Z. Kapelan and J. Saldarriaga 

Water

 Research 189 (2021) 116639 

Declaration

 

of

 

Competing

 

Interest

 

The

 

authors

 

declare

 

that

 

they

 

have

 

no

 

known

 

competing

 

finan-

 

cial

 

interests

 

or

 

personal

 

relationships

 

that

 

could

 

have

 

appeared

 

to

 

influence

 

the

 

work

 

reported

 

in

 

this

 

paper.

 

Funding

 

This

 

research

 

did

 

not

 

receive

 

any

 

specific

 

grant

 

from

 

funding

 

agencies

 

in

 

the

 

public,

 

commercial,

 

or

 

not-for-profit

 

sectors.

 

Supplementary

 

materials

 

Supplementary

 

material

 

associated

 

with

 

this

 

article

 

can

 

be

 

found,

 

in

 

the

 

online

 

version,

 

at

 

doi:

 

10.1016/j.watres.2020.116639

 

.

 

References

 

Ab

 Ghani, A. , 1993. Sediment transport in sewers. PhD thesis. University of Newcas- 

tle

 Upon Tyne, Newcastle Upon Tyne, UK . 

Ackers,

 J., Butler, D., Leggett, D., May, R., 2001. Designing sewers to control sediment 

problems.

 In: Urban Drainage Modeling. ASCE, Orlando, FL, pp. 818–823. doi: 

10.

 

1061/40583(275)77

 

.

 

Breiman,

  L.,  2001.  Random  forests.  Mach.  Learn.  45,  5–32.  doi: 

10.1023/A:

 

1010933404324

 

.

 

Ebtehaj,

 I., Bonakdari, H., 2016a. Bed load sediment transport in sewers at limit of 

deposition.

 Sci. Iran. 23 (3), 907–917. doi: 

10.24200/sci.2016.2169

 

.

 

Ebtehaj,

 I., Bonakdari, H., 2016b. A support vector regression-firefly algorithm-based 

model

 for limiting velocity prediction in sewer pipes. Water Sci. Technol. 73 (9), 

2244–2250.

 doi: 

10.2166/wst.2016.064

 

.

 

Ebtehaj,

 I., Bonakdari, H., 2013. Evaluation of sediment transport in sewer us- 

ing

 artificial neural network. Eng. Appl. Comput. Fluid Mech. 7 (3), 382–392. 

doi:

 

10.1080/19942060.2013.11015479

 

.

 

Ebtehaj,

 I., Bonakdari, H., Es-Haghi, M., 2019. Design of a hybrid ANFIS–PSO model 

to

 estimate sediment transport in open channels. Iran. J. Sci. Technol. Trans. 44 

(4),

 851–857. doi: 

10.1007/s40996-

 018- 0218- 9 

.

 

Ebtehaj,

 I., Bonakdari, H., Safari, M., Gharabaghi, B., Zaji, A., Riahi Madavar, H., Sheikh 

Khozani,

 Z., Es-haghi, M., Shishegaran, A., Danandeh Mehr, A., 2020. Combina- 

tion

 of sensitivity and uncertainty analyses for sediment transport modeling in 

sewer

 pipes. Int. J. Sediment Res. 35 (2), 157–170. doi: 

10.1016/j.ijsrc.2019.08.005

 

.

 

Ebtehaj,

 I., Bonakdari, H., Sharifi, A., 2014. Design criteria for sediment transport in 

sewers

 based on self-cleansing concept. J. Zhejiang Univ. Sci. A 15 (11), 914–924. 

doi:

 

10.1631/jzus.a1300135

 

.

 

El-Zaemey,

 A. , 1991. Sediment transport over deposited beds in sewers. PhD thesis. 

University

 of Newcastle Upon Tyne, Newcastle Upon Tyne, UK . 

Hastie,

 T., Tibshirani, R., Friedman, J., 2009. The elements of statistical learning: 

data

 mining, inference, and prediction. Springer, New York, USA doi: 

10.1007/

 

978-

 0- 387- 84858- 7 

.

 

Kargar,

 K., Safari, M., Mohammadi, M., Samadianfard, S., 2019. Sediment transport 

modeling

 in open channels using neuro-fuzzy and gene expression program- 

ming

 techniques. Water Sci. Technol. 79 (12), 2318–2327. doi: 

10.2166/wst.2019.

 

229

 

.

 

Liaw,

 A. , Wiener, M. , 2002. Classification and regression by Random forest. R News 

2

 (3), 18–22 . 

May,

 R. , 1993. Sediment transport in pipes and sewers with deposited beds. HR 

Wallingford,

 Oxfordshire, UK Report SR 320 . 

May,

 R., Ackers, J., Butler, D., John, S., 1996. Development of design methodol- 

ogy

 for self-cleansing sewers. Water Sci. Technol. 33 (9), 195–205. doi: 

10.1016/

 

0273-

 1223(96)00387- 3 

.

 

May,

 R. , Brown, P. , Hare, G. , Jones, K. , 1989. Self-cleansing conditions for sewers 

carrying

 sediment. HR Wallingford, Oxfordshire, UK Report SR 221 . 

Mayerle,

 R. , 1988. Sediment transport in rigid boundary channels. PhD thesis. Uni- 

versity

 of Newcastle upon Tyne, Newcastle Upon Tyne, UK . 

Merritt,

 L. , Enfinger, K. , 2019. Tractive force: a key to solids transport in gravity flow 

drainage

 pipes. In: Pipelines 2019. ASCE, Nashville, TN, pp. 349–358 . 

Montes,

 C., Berardi, L., Kapelan, Z., Saldarriaga, J., 2020a. Predicting bedload sed- 

iment

 transport of non-cohesive material in sewer pipes using evolutionary 

polynomial

 regression – multi-objective genetic algorithm strategy. Urban Wa- 

ter

 J. 17 (2), 154–162. doi: 

10.1080/1573062X.2020.1748210

 

.

 

Montes,

 C., Kapelan, Z., Saldarriaga, J., 2019. Impact of self-cleansing criteria choice 

on

 the optimal design of sewer networks in South America. Water (Basel) 11, 

1148.

 doi: 

10.3390/w11061148

 

.

 

Montes,

 C., Vanegas, S., Kapelan, Z., Berardi, L., Saldarriaga, J., 2020b. Non-deposition 

self-cleansing

 models for large sewer pipes. Water Sci. Technol. 81 (3), 606–621. 

doi:

 

10.2166/wst.2020.154

 

.

 

Nalluri,

 C., Ab Ghani, A., 1996. Design options for self-cleansing storm sewers. Water 

Sci.

 Technol. 33 (9), 215–220. doi: 

10.1016/0273-

 1223(96)00389- 7 

.

 

Ota,

 J. , 1999. Effect of particle size and gradation on sediment transport in storm 
sewers.

 PhD thesis. University of Newcastle upon Tyne, Newcastle Upon Tyne, 

UK

 . 

Perrusquía,

 G. , 1991. Bedload Transport in Storm Sewers: Stream Traction in Pipe 

Channels

 PhD thesis. Chalmers University of Technology, Gothenburg, Sweden . 

Roushangar,

 K., Ghasempour, R., 2017. Estimation of bedload discharge in sewer 

pipes

 with different boundary conditions using an evolutionary algorithm. Int. 

J.

 Sediment Res. 32 (4), 564–574. doi: 

10.1016/j.ijsrc.2017.05.007

 

.

 

Safari,

 M., 2019. Decision tree (DT), generalized regression neural network (GR) and 

multivariate

 adaptive regression splines (MARS) models for sediment transport 

in

 sewer pipes. Water Sci. Technol. 79 (6), 1113–1122. doi: 

10.2166/wst.2019.106

 

.

 

Safari,

 M., Danandeh Mehr, A., 2018. Multigene genetic programming for sediment 

transport

 modeling in sewers for conditions of non-deposition with a bed de- 

posit.

 Int. J. Sediment Res. 33 (3), 262–270. doi: 

10.1016/j.ijsrc.2018.04.007

 

.

 

Safari,

  M.,  Mohammadi,  M.,  Ab  Ghani,  A.,  2018.  Experimental  studies  of  self- 

cleansing

 drainage system design: a review. J. Pipeline Syst. Eng. Pract. 9 (4), 

04018017.

 doi: 

10.1061/(ASCE)PS.1949-1204.0

 0 0 0335 

.

 

Safari,

 M., Shirzad, A., 2019. Self-cleansing design of sewers: definition of the opti- 

mum

 deposited bed thickness. Water Environ. Res. 91 (5), 407–416. doi: 

10.1002/

 

wer.1037

 

.

 

Safari,

 M., Shirzad, A., Mohammadi, M., 2017. Sediment transport modeling in de- 

posited

 bed sewers: unified form of May’s equations using the particle swarm 

optimization

 algorithm. Water Sci. Technol. 76 (4), 992–10 0 0. doi: 

10.2166/wst.

 

2017.267

 

.

 

Safari,

 M., 2020. Hybridization of multivariate adaptive regression splines and ran- 

dom

 forest models with an empirical equation for sediment deposition predic- 

tion

 in open channel flow. J. Hydrol. 590 (November 2020), 125392. doi: 

10.1016/

 

j.jhydrol.2020.125392

 

.

 

Safari,

 M., Aksoy, H., 2020. Experimental analysis for self-cleansing open channel 

design.

 J. Hydraul. Res. 1–12. doi: 

10.1080/00221686.2020.1780501

 

.

 

Tyralis,

 H., Papacharalampous, G., Langousis, A., 2019. A brief review of random 

forests

 for water scientists and practitioners and their recent history in water 

resources.

 Water (Basel) 11 (5), 910. doi: 

10.3390/w11050910

 

.

 

Vongvisessomjai,

 N., Tingsanchali, T., Babel, M., 2010. Non-deposition design cri- 

teria

 for sewers with part-full flow. Urban Water J. 7 (1), 61–77. doi: 

10.1080/

 

15730620903242824

 

.

 

Zendehboudi,

 S., Rezaei, N., Lohi, A., 2018. Applications of hybrid models in chem- 

ical,

 petroleum, and energy systems: a systematic review. Appl. Energy 228 

(2018),

 2539–2566. doi: 

10.1016/j.apenergy.2018.06.051

 

.

 

11

 

¿Quiere saber más? Contáctenos

Declaro haber leído y aceptado la Política de Privacidad
contactenos