Background
Soil contamination with total petroleum hydrocarbons (TPHs) originates from the entry of crude oil-derived chemical compounds into natural soil environments (1). TPHs have recognized carcinogenic effects and can cause other serious disorders in humans (2-4). The penetration of these contaminants into groundwater resources can impair water quality and necessitate specialized and expensive treatment technologies for remediation. One commonly used technique is soil washing with surfactants, which involves physical-chemical processes for the removal of soil contaminants (5). This process can be used as a portable technology at polluted sites and can meaningfully reduce the proportion of polluted soil (6). Non-ionic surfactants such as Tween 80 and Brij 35 are the most widely accepted surfactants for soil washing to remove PAHs, TPHs, PCBs and heavy metal compounds (Ahn et al. 2009; Villa et al. 2010). These surfactants have a low tendency to flocculate clay particles in soil, and they are cost-effective and biodegradable.
Developing a model for a treatment process may help to better understand the process, to better manage the system’s problems and, in many cases, to estimate and improve the system’s performance. Artificial neural network (ANN) methodologies have already been applied to the prediction of arsenic removal efficiency by modified Fenton (7), modeling of the adsorption of Cu (II) from industrial leachate by pumice (Turan et al. 2011), prediction of wastewater treatment plant (8) and waste stabilization pond (9) performance, and modeling and prediction of water treatment plant influent quality parameters (10). The results of these studies indicated that ANNs are easy-to-use and effective tools for environmental modeling and prediction.
Aims of the study:
Therefore, the main objective of this study was to develop a feed forward artificial neural network (FFANN) model to estimate the removal of TPH based on several input parameters (i.e., shaking speed, contact time, surfactant concentration and pH) in soil washing process with Tween 80.
One of the most widely used artificial neural network configurations in modeling environmental pollution removal is the FFANN structure, which was used in this work (11). On the basis of the experimental data collected in previous work (12), an FFANN was developed in the present study as an artificial intelligence-based approach for modeling the soil washing process. Neurons and links are the important components of a neural network structure: neurons are the processing elements, and links provide the connections between neurons. Optimization and identification of input and output variables are two important steps in developing an FFANN model (13). The schematic diagram of the FFANN used in this work is shown in Fig. 1. Each neuron in a layer receives weighted inputs from the previous layer and transmits its output to the neurons in the next layer. The summation of weighted input signals is calculated by Eq. (1), and this summation is transformed by a linear or nonlinear function called the activation function (14). The selection of the activation function can significantly affect the network performance. Commonly used activation functions are the hyperbolic tangent function and the sigmoid function. The type of activation function also depends on the type of neural network to be designed. The sigmoid logistic, sigmoid tangent and hyperbolic tangent transfer functions are widely used and are given in Eqs. (2), (3) and (4), respectively. One characteristic of the sigmoid logistic function is that it only produces values between 0 and 1. For the sigmoid tangent (tansig), which is an anti-symmetric activation function, the output of each neuron is permitted to assume both positive and negative values in the interval between -1 and 1, in which case its mean is likely to be zero. This function is mathematically equivalent to the hyperbolic tangent transfer function (tanh), but it differs in that it runs faster than the MATLAB implementation of tanh.
If the network connectivity is large, back-propagation learning with anti-symmetric activation functions can yield faster convergence than a similar process with non-symmetric activation functions such as the sigmoid function (15). A linear function was also used in the output layer. This arrangement of functions is common in modeling and can yield better results (16).
The network results are compared with the actual observations, and the network error is calculated with Eq. (5). The training process continues until this error reaches an acceptable value (17).
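$Y_{net} = \sum_{i} X_i W_i + W_0$ (1)
- Sigmoid transfer function
$f(Y_{net}) = \dfrac{1}{1 + e^{-Y_{net}}}$ (2)
- Sigmoid tangent transfer function
$f(Y_{net}) = \dfrac{2}{1 + e^{-2Y_{net}}} - 1$ (3)
- Hyperbolic tangent transfer function
$f(Y_{net}) = \dfrac{e^{Y_{net}} - e^{-Y_{net}}}{e^{Y_{net}} + e^{-Y_{net}}}$ (4)
$J_r = \dfrac{1}{2} \sum_{i} (Y_i - O_i)^2$ (5)
Where $Y_i$ is the response of neuron i, $f(Y_{net})$ is the nonlinear activation function, $Y_{net}$ is the summation of weighted inputs, $X_i$ is the neuron input, $W_i$ is the weight coefficient of each neuron input, $W_0$ is the bias, $J_r$ is the error function between the observed value and the network response, and $O_i$ is the observed value of neuron i (14).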
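The weighted summation of Eq. (1) and the transfer functions referred to in Eqs. (2)-(4) can be illustrated with a short Python sketch. It is not the MATLAB code used in the study; the function names follow the MATLAB naming (`logsig`, `tansig`) only for readability, and the input, weight and bias values are hypothetical.

```python
import math

def weighted_sum(inputs, weights, bias):
    """Summation of weighted inputs plus the bias W0, as in Eq. (1)."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

def logsig(y_net):
    """Sigmoid logistic function; the output lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-y_net))

def tansig(y_net):
    """Sigmoid tangent function; the output lies between -1 and 1."""
    return 2.0 / (1.0 + math.exp(-2.0 * y_net)) - 1.0

# Hypothetical inputs, weights and bias, chosen only for illustration.
x = [0.5, -1.0, 0.25, 0.0]
w = [0.4, 0.3, -0.8, 0.1]
w0 = 0.1

y_net = weighted_sum(x, w, w0)
print("Y_net  =", y_net)
print("logsig =", logsig(y_net))
print("tansig =", tansig(y_net))
print("tanh   =", math.tanh(y_net))  # agrees with tansig up to rounding
```

For moderate values of the weighted sum, the last two printed values coincide to within floating-point rounding, which reflects the mathematical equivalence of tansig and tanh noted above.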
Figure 1) The schematic diagram of FFNN used in this study; P is the number of the hidden layer neurons
In this study, to predict the efficiency of soil washing in removing TPH from contaminated soil, pH, shaking speed, surfactant concentration and contact time data were applied to the FFANN structure as regressors, with TPH removal as the target. Approximately 85% of the data set was used for training the ANN model and 15% for testing it (18). As seen in Table 3, a random selection method was used to divide the data into training and testing sets. All data, including input and output variables, were then scaled between -1 and 1. This preserves the interpretability of the weights and prevents numerical overflow (18). Eq. (6) shows the normalization approach used in the study.
$S = 2 \times \dfrac{X - X_{min}}{X_{max} - X_{min}} - 1$ (6)
Where $X_{min}$ and $X_{max}$ are the extreme values of variable X. Finally, in order to determine the errors in the training and testing steps, all outputs were returned to the original scale and compared with the experimental responses. MATLAB software was used for designing and training the feed forward neural network.
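The scaling of Eq. (6) and the return to the original scale can be sketched in Python as follows. This is an illustrative sketch rather than the study's MATLAB code: the helper names `scale` and `unscale` are hypothetical, and the three example speeds are chosen only to span the shaking-speed range of 100-250 rpm reported in Table 1.

```python
def scale(x, x_min, x_max):
    """Map x from [x_min, x_max] to [-1, 1], following Eq. (6)."""
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0

def unscale(s, x_min, x_max):
    """Inverse of scale(): return a value in [-1, 1] to the original units."""
    return (s + 1.0) * (x_max - x_min) / 2.0 + x_min

# Illustrative shaking speeds (rpm) spanning the 100-250 rpm range.
speeds = [100.0, 175.0, 250.0]
scaled = [scale(v, 100.0, 250.0) for v in speeds]
restored = [unscale(s, 100.0, 250.0) for s in scaled]

print(scaled)    # [-1.0, 0.0, 1.0]
print(restored)  # [100.0, 175.0, 250.0]
```

The lower and upper extremes map to -1 and 1, and applying the inverse recovers the original values, which is the step needed before comparing network outputs with the experimental responses.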
The experimental data were used to design the FFANN. A summary of these data is presented in Table 1. Further details of the network topology are summarized in Table 2.
Table 1) Summary of the experimental study results
Input parameter | Range | Removal range (%) (output parameter) | Best selected condition for each parameter
Shaking speed (rpm) | 100-250 | 43.5-70 | 250
Contact time (min) | 10-120 | 38.5-75 | 120
Surfactant concentration (mg/L) | 2-30 | 35-82 | 10
pH | 2.5-9 | 61-72 | 7
Table 2) Neural network’s properties
Property | Value
Training function | Trainlm
Performance function | MSE
Data division | Dividerand
Number of inputs | 4
Number of outputs | 1
Transfer functions | Tansig
Number of neurons in 1st layer | 4
Number of neurons in 2nd layer | 3
Number of neurons in 3rd layer | 1
Number of epochs | 1000
The statistical parameters, including the mean, standard deviation and skewness coefficient, for the training and testing data sets are given in Table 3. Table 4 shows the performance of the FFNN for TPH removal estimation.
Table 5 shows the linear regression coefficients, which are the slopes and intercepts of the fitted lines.
Table 3) Statistical parameters for neural network input and output data
Variables | Training: mean | Training: standard deviation | Training: skewness coefficient | Testing: mean | Testing: standard deviation | Testing: skewness coefficient
Shaking speed (rpm) | 233.03 | 41.41 | -2.26 | 240 | 22.3607 | -1.50
Contact time (min) | 64.82 | 31.92 | -0.43 | 81 | 32.86 | -0.56
Surfactant concentration (mg/L) | 11.14 | 13.09 | 2.66 | 7 | 2.73 | 0.40
pH | 6.82 | 1.13 | -1.78 | 5.90 | 1.94 | -1.33
TPH removal (%) | 57.31 | 14.33 | -0.24 | 63.42 | 6.84 | -0.31
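The three descriptive statistics reported in Table 3 can be computed as in the following Python sketch. The data are hypothetical (they are not the study's measurements), and the skewness definition used here, the third central moment divided by the 1.5 power of the second, is an assumption, since the study does not state which skewness formula was applied.

```python
import statistics

def skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5 (one common definition)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical shaking speeds (rpm); most runs sit at the upper end of the range.
speeds = [100, 150, 200, 250, 250, 250, 250, 250]

print("mean     =", statistics.fmean(speeds))
print("std dev  =", statistics.stdev(speeds))  # sample standard deviation
print("skewness =", skewness(speeds))
```

A data set concentrated near its maximum with a few low values, as in this example, gives a negative skewness coefficient.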
Table 4) Performance of FFNN for removal percentage estimation
Output | Training step: RMSE | Training step: R² | Testing step: RMSE | Testing step: R²
TPH removal (%) | 2.596 | 0.966 | 10.706 | 0.785
Table 5) Linear regression between neural network output and desired values
Output | Training data set: a | Training data set: b | Testing data set: a | Testing data set: b | All data set: a | All data set: b
Removal percentage | 1 | 7.89×10⁻⁴ | 0.862 | 9.741 | 0.936 | 2.905
a is the linear regression coefficient; b is the intercept
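The slope a and intercept b reported in Table 5 are the coefficients of a straight line fitted to the network outputs against the desired values. A minimal Python sketch of such an ordinary least-squares fit is given below; the observed and predicted values are hypothetical, and the helper name `fit_line` is introduced here only for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Hypothetical observed and predicted removal percentages.
observed = [40.0, 50.0, 60.0, 70.0, 80.0]
predicted = [42.0, 51.0, 59.0, 71.0, 78.0]

a, b = fit_line(observed, predicted)
print("slope a     =", a)
print("intercept b =", b)
```

A perfect model would give a slope of 1 and an intercept of 0, so the distance of a and b from these values indicates how far the predictions depart from the observations.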
Figs. 2a, 3a and 4a show the real values and model computed values of TPH removal in testing, training and all data sets for the structure of 4-3-1, respectively. In Figs. 2b, 3b and 4b, the TPH removal estimations by FFNN versus the observed values are shown.
Figure 2) Removal percentage calculated by the neural network versus the real values (a) and corresponding linear regression (b) for the testing data set
Figure 3) Removal percentage calculated by the neural network versus the real values (a) and corresponding linear regression (b) for the training data set
Figure 4) Removal percentage calculated by the neural network versus the real values (a) and corresponding linear regression (b) for the whole data set
An FFNN with a backpropagation learning algorithm (BPLA) was used, and the network parameters were computed by the iterative Levenberg-Marquardt (LM) algorithm (14). The MATLAB trainlm function was used for the optimization process. This algorithm offers a high learning speed and high performance in comparison with other optimization algorithms (14). Based on the selected network structure, the training process was conducted to achieve a performance target of 1×10⁻³ within a maximum of 1000 training epochs, with a learning rate of 0.01. This value was obtained after several trial-and-error runs (more than 100 iterations). It was found that this value ensures stable and fast learning.
The large standard deviations for shaking speed, contact time and removal percentage mean that the values in the data set lie, on average, far from the mean. In the training phase, it is important to have a data set with a considerably large standard deviation, because the normalized data are then spread more uniformly and the network inputs do not form a compact data set (14). The performance of the neural network model for both training and testing data was evaluated by the coefficient of determination (R²) and the root mean square error (RMSE) (9, 18).
$R^2 = 1 - \dfrac{\sum_{i=1}^{n} (O_i - Y_i)^2}{\sum_{i=1}^{n} (O_i - \bar{O})^2}$ (7)
$RMSE = \sqrt{\dfrac{1}{n} \sum_{i=1}^{n} (Y_i - O_i)^2}$ (8)
Where $Y_i$ is the estimated variable, $O_i$ is the desired variable, $\bar{O}$ is the average of the desired variable and n is the number of data points. The efficiency (E) evaluates model performance. Values of R² close to 1.0 indicate good model performance. RMSE evaluates the residual between the estimated and desired variables; the closer this criterion is to zero, the better the model fit.
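The two performance criteria can be sketched in Python as follows, assuming the common definitions: R² as one minus the ratio of the residual sum of squares to the total sum of squares about the mean of the desired values, and RMSE as the square root of the mean squared residual. The data are hypothetical and the function names are introduced here for illustration only.

```python
import math

def rmse(observed, estimated):
    """Root mean square error between desired and estimated values."""
    n = len(observed)
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(observed, estimated)) / n)

def r_squared(observed, estimated):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical desired (observed) and estimated removal percentages.
observed = [40.0, 50.0, 60.0, 70.0, 80.0]
estimated = [42.0, 51.0, 59.0, 71.0, 78.0]

print("RMSE =", rmse(observed, estimated))
print("R2   =", r_squared(observed, estimated))
```

Because RMSE is expressed in the units of the output, here percentage points of removal, it can be read directly against the removal ranges of the experiments, whereas R² is dimensionless.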
To determine the performance of the developed network in estimating TPH removal, the regression coefficients were calculated for the training and testing data sets.
The results showed that a three-layer structure, comprising an input layer with 4 neurons, a hidden layer with 3 neurons and an output layer with 1 neuron (4-3-1), provided the best performance. To arrive at this structure, different FFNN structures were evaluated: structures with one hidden layer containing 2, 3, 4, 5, 6, 8, 10, 12 and 14 neurons, and structures with two hidden layers containing 2:2, 4:2, 6:4, 4:6, 8:8, 10:8 and 8:10 neurons in the first and second hidden layers.
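One way to compare the candidate structures is to count their adjustable parameters (weights and biases). The following Python sketch does this for fully connected feed forward networks; the helper name `count_parameters` is hypothetical, and the study itself does not report this calculation.

```python
def count_parameters(layer_sizes):
    """Number of weights and biases in a fully connected feed forward network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Selected structure: 4 inputs, 3 hidden neurons, 1 output.
print(count_parameters([4, 3, 1]))  # 19

# Single-hidden-layer candidates evaluated in the study.
for hidden in (2, 3, 4, 5, 6, 8, 10, 12, 14):
    print(hidden, count_parameters([4, hidden, 1]))

# Largest two-hidden-layer candidate (8:10).
print(count_parameters([4, 8, 10, 1]))
```

The 4-3-1 structure has 19 adjustable parameters, while the larger candidates have several times more, which is worth keeping in mind when only a few dozen data points are available for training.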
As shown in Fig. 2a, the maximum errors are related to data points 1 and 2. This is mainly due to the combination of the selected backpropagation algorithm with gradient descent optimization in the network training step (14). The principal limitation of this algorithm is that it can converge to local minima (14). Hence, the weights derived from the network cannot minimize the error for some of the training data. As shown in Table 4, the performance of the FFNN for the estimation of removal percentage was analyzed by the RMSE and R² statistics. For the selected FFNN structure (4-3-1), the values of RMSE and R² were 2.596 and 0.966 in the training step and 10.706 and 0.785 in the testing step, respectively. This means that the residuals between the estimated and desired (observed) values for the training and testing steps are 2.596% and 10.706%. To better determine the performance of the network, the linear regression between the neural network outputs and the desired values was also drawn, and the regression coefficients were calculated according to the equation $Y = aO + b$, where a is the slope and b is the intercept. As can be seen in Table 5, the linear regression coefficient for the whole data set was 0.936, i.e., the fitted slope deviates from the ideal value of 1 by only 0.064. Thus, the present study showed that the FFANN is able to model and predict TPH removal based on the 4 input parameters and with a small data set. However, due to the limited data in the testing phase, it was not possible to improve the results of this step. It appears that training and testing an FFANN with a small data set, where only 28 training and 5 testing data points were available, makes it difficult to guarantee an excellent predictive model.
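The split into 28 and 5 data points is consistent with a random division of 33 samples at roughly 85% and 15%. The Python sketch below illustrates such a division; it is not MATLAB's dividerand algorithm, the total of 33 is inferred from the two subset sizes, and the function name and seed are hypothetical.

```python
import random

def split_indices(n_samples, train_fraction=0.85, seed=0):
    """Randomly divide sample indices into training and testing subsets."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = round(train_fraction * n_samples)
    return sorted(indices[:n_train]), sorted(indices[n_train:])

train_idx, test_idx = split_indices(33)
print(len(train_idx), len(test_idx))  # 28 5
```

With only 5 testing points, a single poorly predicted sample has a large effect on the testing RMSE and R², which is in line with the limitation noted above.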
In recent years, the use of computational tools for the assessment of pollutants has become increasingly valuable due to their capability to interpret integrated variable measurements. ANNs are considered inexpensive techniques for data interpretation and prediction. In a study conducted by Olawoyin et al. (19), an ANN was used to recognize spatial patterns in contaminated zones of the Niger Delta by integrating chemical, physical, ecotoxicological and toxicokinetic variables in the identification of pollution sources. In that study, the physico-chemical properties (pH, TPH, BTEX, PAH, COD, SO₄, PO₄, NO₃ and heavy metals) of the recipient environments were assessed in sediments, soils and water. The ANN was used as a powerful visualization tool to identify trends in the data set. However, some studies have highlighted drawbacks of feed forward neural models (FFANN); for example, when a large number of input/output variables are available, the learning process for an FFANN may be very time consuming. In such cases, the use of a recurrent neural network model (RNNM) for prediction has been proposed. As a case study, De la Toree-Sanchez et al. (20) used RNNM models to predict the biodegradation profiles of hydrocarbons contained in an aged polluted soil.
Although various types of neural network models have been developed, the FFANN model is still considered one of the most popular and widely used network paradigms (11). The FFANN in this study provided acceptable findings regarding its valuable application in soil washing process optimization.
It was shown that the parameters influencing the soil washing process can be used to predict the removal efficiency of TPH from polluted soils with an FFANN model. The following conclusions can be highlighted:
• A feed forward artificial neural network with shaking speed, contact time, surfactant concentration and pH as input regressors and a three-layer structure of 4-3-1 provided the best performance for the estimation of TPH removal efficiency (the only output variable).
• The performance of the model in the training and testing steps was found to be acceptable using linear regression, root mean square error (RMSE) and coefficient of determination (R²) statistics. The results of the testing step, as a critical step of model checking, reveal that the FFANN model could provide reasonable predictions of the TPH removal efficiency (%).
• It is estimated that about 80% of the TPH removal can be described by the assessed regressors in the developed model. Thus, according to the findings of this study, optimizing the soil washing process with respect to these four control variables (shaking speed, contact time, surfactant concentration and pH) can further improve the TPH removal efficiency from polluted soils.
• The results of this study could be the basis for applying artificial neural networks to the assessment of the soil washing process and the control of petroleum hydrocarbon emission into the environment.
Conflict of Interest:
The authors declared no conflict of interest.