Comparison of Artificial Neural Network and Multiple Regression Analysis for Prediction of Fat Tail Weight of Sheep

Document Type: Research Articles

Authors

1 Department of Animal Science, College of Abouraihan, University of Tehran, Tehran, Iran

2 Department of Mathematics, Faculty of Mathematical Science, Shahid Beheshti University, Tehran, Iran

Abstract

A comparative study of an artificial neural network (ANN) and multiple linear regression (MLR) was made to predict the fat tail weight of Balouchi sheep from birth, weaning and finishing weights. A multilayer feed-forward network with a back-propagation-of-error learning mechanism was used to predict fat tail weight. The data (69 records) were randomly divided into two subsets: a training set comprising 75 percent of the data (52 records), used to build the neural network model, and a test set comprising 25 percent (17 records), which was not used during training and served to evaluate the performance of the different models. The mean relative error was significantly (P<0.01) lower for the ANN than for the MLR model. The coefficient of determination (R2) was higher for the ANN model (0.93) than for the MLR model (0.81). The ANN model improved the mean squared error (MSE) of the MLR model by 59% and R2 by 15%, indicating that the ANN represents a valuable tool for predicting lamb fat tail weight from birth, weaning and finishing weights.

Keywords


INTRODUCTION

The sheep industry is the largest enterprise of animal agriculture in Iran. The total number of sheep in Iran is estimated at about 52 million, which accounts for nearly 42% of the total available animal units. Fat-tailed breeds are an important class of sheep and are commonly found across Asia, especially in the Middle East and Iran (Davidson, 2006). The fat tail is regarded as an adaptive response to a harsh environment and is a valuable reserve for the animal during migration and winter (Kashan et al. 2005). Until recently, it had additional value because it was used to preserve cooked meat for longer periods and as an energy reserve during times of drought and famine. Climatic variation, together with these human requirements, therefore led to artificial selection for higher fat tail weight across generations. Nowadays, in intensive and semi-intensive systems, most of the advantages of a large fat tail have lost their importance, and a decrease in fat tail size is often desirable for producers and consumers. Fat deposition requires more energy than the deposition of lean tissue (Moradi et al. 2012). In addition, edible ruminant fats are particularly rich in saturated fatty acids because of the extensive microbial hydrogenation of dietary polyunsaturated fatty acids (PUFA) in the rumen. In many countries, edible fat is an unpopular part of the meat because consumers consider it unhealthy, so it is desirable to select against large fat tails (Zamiri and Izadifard, 1997). This has been practiced by crossing fat-tailed breeds with lean-tailed breeds or by selecting for lower fat tail weight across generations (Kashan et al. 2005). In selection strategies, live weight, average daily gain and fat tail weight are important components influencing the profitability of sheep production.
Live weight and average daily gain can be measured on live sheep, but the animal must be slaughtered to measure fat tail weight. To overcome this problem, fat tail measurements (length, width and circumference) have been taken on live animals and used as a measure of tail weight in breeding programs (Vatankhah and Talebi, 2008). However, a few studies have reported low correlations between fat tail length, width and circumference and fat tail weight (Zamiri and Izadifard, 1997; Safdarian et al. 2008). Artificial neural networks (ANNs) are analytical tools modelled on neurological structures and processing functions in the brain. The main advantage of ANNs in prediction is that a priori assumptions about the relations between independent and dependent variables are not necessary. However, the relations learned by an ANN are hidden in its neural architecture and cannot be expressed in traditional mathematical terms. The comparative advantage of ANNs over more conventional statistical models, such as multiple linear regression (MLR), is that they can model complex, possibly non-linear relationships without any prior assumptions about the underlying data-generating process. They are able to learn and to generalize relations between input and output data from examples presented to the network. The strength of ANNs is pattern recognition and classification, but they can also be used for predictive purposes (Dayhoff, 1990). Because of these features, ANNs have increasingly been applied in the biological sciences (Marengo et al. 2006; Alp and Cigizoglu, 2007; Norouzian and Asadpour, 2012). In the present study, we compared the performance of the classic approach, multiple linear regression, with that of ANNs for estimating fat tail weight from empirical data obtained in a farm experiment.
This study was undertaken to obtain prediction models for estimating the fat tail weight of Balouchi lambs from birth, weaning and finishing weights, for use in breed characterization and selection for genetic improvement. An artificial neural network (ANN) was employed to investigate the relationship between fat tail weight and the live weights of lambs.

 

MATERIALS AND METHODS

Animal data

The current study was conducted on a sheep farm producing approximately 200 lambs per year in Mashhad, Iran (latitude 36°20', longitude 54°11', altitude 1830 m). The climate is semi-arid, with a mean annual precipitation of 236 mm and a mean annual temperature of 33.9 °C. A total of 69 Balouchi lambs were used in this study. At birth, the identification number, sex and birth type were recorded for each lamb. In addition to the ewe’s milk, the lambs were offered alfalfa hay ad libitum and a concentrate mixture of 40% corn, 20% soybean meal, 20% beet pulp and 20% wheat bran (300 g per lamb per day) until weaning. The lambs were weaned at 60 days of age and their body weights were recorded. After weaning, the lambs were maintained on a uniform fattening diet (Table 1) for 12 weeks and their performance was determined.

 

Table 1 Ingredients and chemical composition of fattening diet

 

1 Each kg of supplement contained: vitamin A: 50000 IU; vitamin D3: 10000 IU; vitamin E: 0.1 g; Calcium: 196 g; Phosphorus: 96 g; Sodium: 71 g; Magnesium: 19 g; Iron: 3 g; Copper: 0.3 g; Manganese: 2 g; Zinc: 3 g; Cobalt: 0.1 g; Iodine: 0.1 g and Selenium: 0.001 g.

 

At the end of the finishing period, the animals were slaughtered to determine fat tail weight. Lambs were fasted for 12 hours and weighed before slaughter. Dressed carcass weight and tail fat weight were recorded in kilograms.

 

Development of artificial neural network models

The data were randomly divided into two subsets. The first subset was the training set (n=52), which was used for building the model. The second subset was the testing set (n=17), which was not used during training and served to evaluate the performance of the different models. A 2-layer feed-forward network formed by one input layer, one output layer and a number of hidden units fully connected to both input and output neurons was adopted in this study. The most widely used learning procedure is based on the back-propagation algorithm, in which the network reads inputs and corresponding outputs from a suitable data set (the training set) and iteratively adjusts weights and biases to minimize the prediction error. In this study, the network was trained by back propagation with the Levenberg-Marquardt algorithm, and the performance function was the mean squared error (MSE), the average squared difference between the network outputs and the actual outputs.
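The random 75/25 partition described above can be sketched in a few lines. This is only an illustration, not the authors' MATLAB procedure: the function name `split_records`, the seed and the integer record identifiers are hypothetical stand-ins for the 69 lamb records.

```python
import random

def split_records(records, train_fraction=0.75, seed=0):
    """Shuffle a copy of the records and split it into training and test subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)       # reproducible random order
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical record identifiers standing in for the 69 lambs.
records = list(range(69))
train, test = split_records(records)
print(len(train), len(test))  # 52 17
```

Rounding 0.75 × 69 = 51.75 to the nearest integer reproduces the 52/17 split reported in the text; the test records are never touched while fitting.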

 

Development of multiple regression models

To compare the effectiveness of the ANN for the prediction of fat tail weight, an MLR model was developed using the three body weights (birth, weaning and finishing weight) as input variables to predict the fat tail weight.

The multiple regression procedure estimates the parameters $b_0, b_1, \dots, b_q$ of the linear equation:

$$y = b_0 + b_1 x_1 + \dots + b_q x_q$$

Where:

$b_0$: intercept.

$b_1, \dots, b_q$: independent contributions of the independent variables $x_1, \dots, x_q$ to the prediction of the dependent variable $y$.

 

The global statistical significance of the relationship between y and the independent variables was analyzed by means of an analysis of variance to ensure the validity of the model in a quantified manner. The same training data set was used to develop the regression equations, and the effectiveness of prediction from the MLR model was tested using the test data set. The Neural Network Toolbox of MATLAB 8.3 was employed to construct the ANN models. For comparison, MLR models were generated from the training and test data sets using the MATLAB 8.3 Statistics Toolbox.
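As an illustration of how the regression parameters can be estimated, the sketch below fits the linear equation by ordinary least squares using the normal equations. It is not the MATLAB Statistics Toolbox routine used in the study; the helper names `fit_mlr` and `predict_mlr`, the six input rows and the coefficients in `true_b` are all hypothetical and serve only to show that the fit recovers known parameters.

```python
def fit_mlr(X, y):
    """Ordinary least squares: return [b0, b1, ..., bq] for y = b0 + sum(bj * xj)."""
    rows = [[1.0] + list(r) for r in X]          # prepend the intercept column
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict_mlr(b, x):
    """Evaluate the fitted equation for one record x = (x1, ..., xq)."""
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))

# Hypothetical birth, weaning and finishing weights (kg); not data from the study.
X = [[3.5, 16.0, 36.0], [4.0, 22.0, 40.0], [4.5, 18.0, 48.0],
     [5.0, 24.0, 38.0], [3.8, 20.0, 46.0], [4.7, 15.0, 42.0]]
true_b = [0.5, 0.10, 0.02, 0.05]                 # hypothetical coefficients
y = [predict_mlr(true_b, x) for x in X]          # noise-free synthetic response
b_hat = fit_mlr(X, y)
```

In practice the coefficients would be estimated on the training set only and `predict_mlr` applied to the held-out test records.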

 

Models evaluation

The following statistics were calculated to evaluate the performance and predictive ability of the models: R2 (coefficient of determination between predicted and observed values) and MSE (mean squared error). The R2 and MSE values between predicted and observed data were calculated by the following equations:

$$R^2 = \frac{SS_{Reg}}{SS_T} = 1 - \frac{SS_E}{SS_T}$$

$$MSE = \frac{1}{n}\sum_{t=1}^{n} (y_t - \hat{y}_t)^2$$

Where:

yt: observed value.

ŷt: estimated value.

n: number of observations.

SSReg: sum of squares of the regression model.

SST: total sum of squares.

SSE: sum of squares of the residuals.

Student's t-test was used to compare the predicted values with the measured values.
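The two evaluation statistics can be computed directly from paired observed and predicted values. The sketch below assumes the usual definitions, MSE as the mean of the squared residuals and R2 as 1 − SSE/SST; the function names and the four example values are hypothetical.

```python
def mse(observed, predicted):
    """Mean squared error between observed (y_t) and estimated (y-hat_t) values."""
    n = len(observed)
    return sum((y - yh) ** 2 for y, yh in zip(observed, predicted)) / n

def r_squared(observed, predicted):
    """Coefficient of determination computed as 1 - SSE / SST."""
    mean_y = sum(observed) / len(observed)
    sst = sum((y - mean_y) ** 2 for y in observed)                  # total sum of squares
    sse = sum((y - yh) ** 2 for y, yh in zip(observed, predicted))  # residual sum of squares
    return 1.0 - sse / sst

# Hypothetical fat tail weights (kg); not data from the study.
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 2.5, 4.5, 4.5]
print(mse(observed, predicted))        # 0.25
print(r_squared(observed, predicted))  # approximately 0.8
```

The form 1 − SSE/SST coincides with SSReg/SST for a least-squares fit that includes an intercept; for other predictors, such as a neural network evaluated on test data, the two expressions can differ, so it is worth stating which one is reported.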

 

RESULTS AND DISCUSSION

Configuration of the artificial neural network model

The architecture, specification and statistical information of the neural network are listed in Table 2.

 

Table 2 Architecture, specification and statistical information of the neural network model

 

 

The choice of inputs and outputs, the number of layers and the number of neurons in each layer, including the number of hidden layer nodes, can significantly affect the performance of an ANN. A previous study (Cybenko, 1989) showed that a neural network with one hidden layer is enough to approximate any continuous function, provided that enough hidden nodes are present. The topology of the network, along with the neuron processing function, determines how accurately the developed model represents the system behavior. Therefore, the first aim was to determine the optimal number of hidden layer nodes. There are no rigorous theoretical principles for determining this. However, there are many empirical rules (Berry and Linoff, 1997). For example, the number of neurons in the hidden layer can be estimated by the formula:

$$m = \log_2(n) + \alpha$$

Where:

m: number of neurons in the hidden layer.

n: number of input variables.

α: integer between 0 and 10 (Berry and Linoff, 1997).
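A short sketch of how this empirical rule translates into candidate hidden layer sizes for the three input variables used here. Rounding to the nearest integer is an assumption made for illustration (the formula itself does not yield whole numbers), and the function name is hypothetical.

```python
import math

def hidden_node_candidates(n_inputs, alphas=range(0, 11)):
    """Candidate hidden layer sizes from m = log2(n) + alpha, rounded to integers."""
    base = math.log2(n_inputs)
    return [round(base + a) for a in alphas]

# Three inputs: birth, weaning and finishing weight.
candidates = hidden_node_candidates(3)
print(candidates)  # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```

The rule only suggests a starting range; the final size still has to be chosen empirically, for example by comparing the MSE of networks trained with different numbers of hidden nodes.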

A series of neural networks with different numbers of hidden layer nodes was trained. To assess generalization ability, the MSE on the testing set was calculated for each number of hidden layer nodes. The model which gave the lowest MSE was chosen as the final ANN model. The best number of hidden layer nodes was 25 for the prediction of fat tail weight (Table 2). Training of the ANN was stopped after 1000 epochs because the error began to increase. A linear transfer function was used for the output layer and a sigmoid transfer function for the input and hidden layers. The sigmoid transfer function gives an appropriate response in many applications compared with the linear transfer function.
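To make the roles of the two transfer functions concrete, the sketch below evaluates a single forward pass through a network with three inputs, 25 sigmoid hidden units and one linear output. The weights are random placeholders rather than trained values, the input vector is a hypothetical scaled record, and no training step (Levenberg-Marquardt or otherwise) is shown.

```python
import math
import random

def sigmoid(z):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, w2, b2):
    """One hidden layer of sigmoid units followed by a single linear output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

rng = random.Random(0)
n_inputs, n_hidden = 3, 25
# Random placeholder weights and biases; a trained network would replace these.
W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(n_hidden)]
b1 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
b2 = 0.0

x = [0.4, 0.5, 0.6]  # hypothetical scaled birth, weaning and finishing weights
y_hat = forward(x, W1, b1, w2, b2)
```

Because the output unit is linear, the prediction is not confined to the (0, 1) range of the sigmoid, which suits a continuous target such as a weight in kilograms.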

 

Comparison between ANN and MLR models

Descriptive statistics of the empirical values and of the values predicted by the ANN and MLR models, as well as the residuals (differences between predicted and observed values) and relative residuals, are listed in Table 3 and Figure 1. Scatter plots comparing observed and estimated fat tail weights, and scatter plots of estimated fat tail weight against residuals (observed minus estimated values), are shown for the MLR and ANN models in Figures 2 and 3, respectively. Comparison between actual and predicted fat tail weights using the ANN and MLR models revealed statistically significant differences (Table 3). Compared with the MLR model, the ANN model gave a better prediction. The ANN predictions gave higher R2 values with lower MSE than MLR (Figure 1). The R2 and MSE of the MLR model were 0.81 and 1.24, respectively. The ANN model for the same dataset produced much improved results, with R2 = 0.93 and MSE = 0.51. The ANN model thus improved the MSE of the MLR model by 59% and R2 by 15%. In the ANN model, the regression between observed and estimated fat tail weights showed a slope very close to one and low dispersion around the regression line (Figures 2 and 3). In addition, the plot of estimated fat tail weights versus residuals (observed values minus estimated values) showed a slope very close to zero and homogeneous deviations around this value.
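The reported improvements follow from the four statistics quoted above when each change is expressed relative to the MLR value. The sketch below reproduces that arithmetic; the function name is hypothetical.

```python
def relative_change(baseline, new):
    """Percentage change of `new` relative to `baseline`."""
    return (new - baseline) / baseline * 100.0

mse_reduction = -relative_change(1.24, 0.51)   # MLR MSE 1.24 -> ANN MSE 0.51
r2_gain = relative_change(0.81, 0.93)          # MLR R2 0.81 -> ANN R2 0.93
print(round(mse_reduction), round(r2_gain))    # 59 15
```

Unrounded, the reduction in MSE is about 58.9% and the gain in R2 about 14.8%.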

 

Table 3 Mean, maximum, minimum and standard deviation (SD) of empirical and predicted data, as well as residuals

* Relative error = ((predicted − observed) / observed) × 100

Means within the same row with at least one common letter do not differ significantly (P>0.01).

ANNs: artificial neural networks and MLR: multiple linear regression.

SD: standard deviation.

 

Figure 1 Performance comparison of ANN and MLR prediction models

 

 

Figure 2 Scatter plot comparing observed and estimated fat tail weight for the multiple regression model and scatter plot comparing estimated fat tail weight and residuals (observed minus estimated values)

 

Figure 3 Scatter plot comparing observed and estimated fat tail weight for the artificial neural network model and scatter plot comparing estimated fat tail weight and residuals (observed minus estimated values)

 

 

Overall, the MLR model gave poorer results than the ANN model. The mean relative error was 4.2 and 2.9 for MLR and ANN predictions, respectively, and was significantly (P<0.01) lower for the ANN than for MLR (Table 3). As far as we know, there is no literature on ANN modeling for fat tail weight prediction in sheep. However, some ANN models have been used in dairy science. Fernandez et al. (2007) used a three-layer feed-forward ANN to model and predict weekly milk yield in dairy goats. These authors demonstrated that an ANN is a suitable tool for predicting the next week's milk yield from goat factors recorded on the farm and the present milk yield.

 

CONCLUSION

Based on the highest R2 value and the smallest MSE, the results of the present study show that the ANN model gave a more accurate prediction of fat tail weight than the MLR model. This suggests that the ANN method may be a promising tool for the rapid estimation of fat tail weight from body measurements in the sheep industry. However, further work with larger data sets is required to better determine the feasibility of rapidly predicting fat tail weight using ANN methods.

REFERENCES

Alp M. and Cigizoglu H.K. (2007). Suspended sediment load simulation by two artificial neural network methods using hydrometeorological data. Environ. Model. Softw. 22, 2-13.

Berry M.J.A. and Linoff G. (1997). Data Mining Techniques. John Wiley and Sons, New York.

Cybenko G.C. (1989). Approximation by superpositions of a sigmoidal function. Math. Cont. Sig. Sys. 2(3), 303-314.

Davidson A. (2006). The Oxford Companion to Food. Oxford University Press, USA.

Dayhoff J.E. (1990). Neural network architectures. Van Nostrand Reinhold, New York.

Fernandez C., Soria S., Sanchez-Seiquer E.P., Gomez-Chova S., Magdalen E.R., Martin-Guerrero J.D., Navarro M.D. and Serrano A.J. (2007). Weekly milk prediction on dairy goats using neural networks. Neural Comput. Appl. 16, 373-381.

Kashan N.E.J., Manafi Azar G.H., Afzalzadeh A. and Salehi A. (2005). Growth performance and carcass quality of fattening lambs from fat-tailed and tailed sheep breeds. Small Rumin. Res. 60, 267-271.

Marengo E., Bobba M., Robotti E. and Liparota M.C. (2006). Modeling of the polluting emissions from a cement production plant by partial least squares, principal component regression, and artificial neural networks. Environ. Sci. Technol. 40, 272-280.

Moradi M.H., Nejati-Javaremi A., Moradi-Shahrbabak M., Dodds K.G. and McEwan J.C. (2012). Genomic scan of selective sweeps in thin and fat tail sheep breeds for identifying of candidate regions associated with fat deposition. BMC Genet. 13, 10-12. 

Norouzian M.A. and Asadpour S. (2012). Prediction of feed abrasive value by artificial neural networks and multiple linear regression. Neural Comput. Appl. 21, 905-909.

Safdarian M.M., Zamiri M.I., Hashemi M. and Noorolahi H. (2008). Relationships of fat-tail dimensions with fat-tail weight and carcass characteristics at different slaughter weights of Torki-Ghashghaii sheep. Meat Sci. 80, 686-689.

Vatankhah M. and Talebi M.A. (2008). Heritability estimates and correlations between production and reproductive traits in Lori-Bakhtiari sheep in Iran. South African J. Anim. Sci. 38(2), 110-118.

Zamiri M.J. and Izadifard J. (1997). Relationships of fat-tail weight with fat-tail measurements and carcass characteristics of Mehraban and Ghezel rams. Small Rumin. Res. 15, 261-266.