## Introduction

Volume models are used to estimate the average stem contents of standing trees of various sizes and species. The principal variables ordinarily associated with standing tree volume are diameter at breast height and tree height. Tree volume models based on diameter at breast height alone are commonly referred to as local or single-entry volume models, while models that require both diameter at breast height and tree height are referred to as standard or double-entry volume models. Local volume models need not be assumed inferior to standard volume models, particularly when the local equation is derived from a standard volume equation (Avery & Burkhart, 2002). On the other hand, previous research has shown that volume models with two predictors perform better than volume models with only one predictor (Lumbres & Lee, 2013; Seo et al., 2015). General volume models can take the form of a linear model, constant form factor, single variable, quadratic, logarithmic, general logarithmic, or transformed variable model (Clutter et al., 1983; Husch et al., 2003).

Stem volume estimates are very important for forest managers, particularly in forest inventories. Forest inventories have often been used as starting points for estimating biomass and carbon storage in the forests of South Korea, and they are essential for the timely monitoring and reporting of forest resources. South Korea has been routinely monitoring forest resources, especially the growing stock, since the establishment of the National Forest Inventory in 1972.

One of the most commonly used procedures for estimating growing stock is to develop equations based on the relationships between volume, diameter at breast height, and total height (Akindele & LeMay, 2006) through regression analysis (Avery & Burkhart, 2002) and allometric equation development (Teshome, 2005; Akindele & LeMay, 2006; Gonzalez-Benecke et al., 2014). Volume equations are also very important for estimating the aboveground biomass of forests, by converting volume to dry weight using density and biomass expansion factors to determine the biomass of the whole tree (Fang & Wang, 2001; Lehtonen et al., 2004; Tobin & Nieuwenhuis, 2007).

As *Larix kaempferi* is among the main coniferous species of South Korea (Kang et al., 2014; Lee et al., 2014; Lee et al., 2015), it is one of the major contributors to the national forest stock. Volume monitoring and reporting for this species is therefore important. Although the species has been the subject of previous research (Son et al., 2002; Kang et al., 2014; Lee et al., 2014; Lee et al., 2015), a stem volume equation based on a simple volume model has not yet been developed. This study aims to develop a stem volume equation for *L. kaempferi* in the Central Region of South Korea and to validate the performance of volume models of different forms.

## Materials and Methods

### 1. Study site

The study sites were located in *Larix kaempferi* stands in the Central Region of South Korea, specifically in Bouen, Buyeo, Cheongju, Danyang, and Yeongju. The central zone stretches to 40°N in the east, 39°N in the west, and 38.5°N in the interior areas (Fig. 1). The region has a temperate climate, with hot, humid summers and cold, dry winters.

### 2. Data collection

A total of 550 trees were felled for research purposes. The trees were selected to represent the range of diameter at breast height classes used in South Korea: 53 trees represented diameter class 1 (<6 cm), 146 trees diameter class 2 (6 to 16 cm), 227 trees diameter class 3 (18 to 28 cm), and 124 trees diameter class 4 (>30 cm).

The sampled trees had diameters at breast height (DBH, cm) ranging from 0.60 to 47.90 cm, with an average of 23.79 cm. Tree height (H, m) ranged from 2.0 to 33.00 m, with a mean of 21.23 m. DBH was measured at 1.20 m above the ground, following the recommendation of the Korea Forest Research Institute (KFRI, 2010), using a standard diameter tape. H was measured directly with a standard measuring tape after the tree had been cut at a stump height of 0.2 m. The diameter outside bark (d, cm) was also measured along the stem at heights of 0.2 m, 0.7 m, and 1.2 m, and then at 2 m intervals. A total of 6,415 paired measurements of d and the height at which it was taken (h, m) were recorded.

### 3. Data analysis

Stem section volumes were calculated using Smalian’s formula, except for the top section, for which the cone formula was used (Avery & Burkhart, 2002). The total stem volume of a single tree was determined by summing the stem section volumes. DBH was plotted against tree volume (Fig. 2) to visually examine each *L. kaempferi* sample tree, to detect possible anomalies in the data (Özçelik & Göçeri, 2015), and to increase efficiency (Bi, 2000).
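The section-wise volume calculation can be illustrated with a minimal sketch. It assumes diameters in cm and heights in m, applies Smalian's formula between consecutive measurement heights, and applies the cone formula above the last measurement. The function names are hypothetical, and the treatment of the stump below the first measurement (ignored here) is an assumption rather than something stated in the text.

```python
import math


def cross_section_area(d_cm):
    """Cross-sectional area (m^2) of a stem disc with diameter d_cm (cm)."""
    return math.pi * (d_cm / 100.0) ** 2 / 4.0


def stem_volume(heights_m, diameters_cm, total_height_m):
    """Stem volume (m^3) from paired height/diameter measurements.

    Assumption: the stump below the first measurement height is ignored.
    """
    volume = 0.0
    # Smalian's formula: mean of the two end areas times the section length.
    for i in range(len(heights_m) - 1):
        length = heights_m[i + 1] - heights_m[i]
        a_lower = cross_section_area(diameters_cm[i])
        a_upper = cross_section_area(diameters_cm[i + 1])
        volume += (a_lower + a_upper) / 2.0 * length
    # Cone formula for the top section: base area times length divided by 3.
    top_length = total_height_m - heights_m[-1]
    volume += cross_section_area(diameters_cm[-1]) * top_length / 3.0
    return volume


# Hypothetical tree measured at 0.2, 0.7, 1.2 m and then at 2 m intervals.
heights = [0.2, 0.7, 1.2, 3.2, 5.2, 7.2]
diameters = [14.0, 13.1, 12.4, 10.2, 7.5, 3.9]
print(round(stem_volume(heights, diameters, total_height_m=8.6), 4))
```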

The data were divided into two sets through random sampling: 80% of the data were used for initial model fitting, while the remaining 20% were used for model validation. The total dataset (100%) was used for the final model fitting. Descriptive statistics of the dataset are presented in Table 1.
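A random 80/20 split of this kind might look like the following sketch. It assumes that the split is made at the tree level and uses an arbitrary seed for reproducibility. The function name, the seed, and the tree identifiers are illustrative only.

```python
import random


def split_trees(tree_ids, fit_fraction=0.8, seed=42):
    """Randomly split tree identifiers into fitting and validation sets."""
    rng = random.Random(seed)  # fixed seed (assumption) so the split is repeatable
    shuffled = list(tree_ids)
    rng.shuffle(shuffled)
    n_fit = round(len(shuffled) * fit_fraction)
    return shuffled[:n_fit], shuffled[n_fit:]


# 550 felled trees, identified here by hypothetical ids 1..550.
fit_ids, val_ids = split_trees(range(1, 551))
print(len(fit_ids), len(val_ids))
```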

Seven commonly used model forms for estimating individual stem volumes were selected as candidate models for the initial model fitting (Table 2). The volume models were selected from the forestry literature (Clutter et al., 1983; van Laar & Akça, 1997; Avery & Burkhart, 2002; Husch et al., 2003) and have been used in several studies (Dela Cruz & Bruzon, 2004; FAO, 2005; Segura & Kanninen, 2005; Lumbres & Lee, 2013; Seo et al., 2015). The volume models are in the forms of a linear model, constant form factor, single variable, quadratic, logarithmic, general logarithmic, and transformed variable (Clutter et al., 1983; Husch et al., 2003).

The NLIN procedure of the Statistical Analysis System (SAS) (SAS Institute, 2004) was used to estimate the coefficients of each equation.
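As a simplified stand-in for the coefficient estimation, the sketch below fits a single-coefficient model of the form V = a·DBH^{2}·H, the form quoted later in this paper, by ordinary least squares through the origin. This closed-form estimate is not the iterative SAS NLIN procedure used in the study. The function names and the example data are hypothetical.

```python
def fit_constant_form_factor(dbh_cm, height_m, volume_m3):
    """Least-squares estimate of a in V = a * DBH^2 * H (no intercept)."""
    x = [d * d * h for d, h in zip(dbh_cm, height_m)]
    numerator = sum(xi * vi for xi, vi in zip(x, volume_m3))
    denominator = sum(xi * xi for xi in x)
    return numerator / denominator


def predict_volume(a, dbh_cm, height_m):
    """Predicted stem volume (m^3) for one tree."""
    return a * dbh_cm ** 2 * height_m


# Hypothetical sample trees: DBH (cm), H (m), observed volume (m^3).
dbh = [8.0, 15.5, 24.0, 31.2]
height = [7.5, 14.0, 21.0, 25.5]
volume = [0.019, 0.125, 0.440, 0.905]

a_hat = fit_constant_form_factor(dbh, height, volume)
print(a_hat, predict_volume(a_hat, 24.0, 21.0))
```

Models with several coefficients or nonlinear parameters would need an iterative least-squares routine instead of this closed form.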

The performance of each of the seven volume models was evaluated using several fit statistics: bias (Ē), absolute mean deviation (AMD), root mean square error (RMSE), coefficient of determination (R^{2}), and the Akaike information criterion (AIC) of Akaike (1974). The raw AIC values were converted to Akaike weights (AIC_{w}), which indicate how much statistical importance should be attached to a difference in AIC between the best model and the next best model (Wagenmakers & Farrell, 2004). For AMD, RMSE, and AIC, the model with the lowest value is considered the best, whereas for AIC_{w} and R^{2}, the model with the highest value is considered the best. The fit statistics for each equation are presented in Table 3.
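The fit statistics can be sketched as follows. Several definitions here are assumptions, because the exact formulas are not given in the text: bias is taken as the mean of observed minus predicted (so a negative value means over-prediction), RMSE divides by n, and AIC uses the least-squares form n·ln(SSE/n) + 2k. The Akaike weights follow the standard definition based on AIC differences from the best model. Function names are hypothetical.

```python
import math


def fit_statistics(observed, predicted, n_params):
    """Bias, AMD, RMSE, R^2 and least-squares AIC for one model."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    sse = sum(e * e for e in residuals)
    mean_obs = sum(observed) / n
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "bias": sum(residuals) / n,                       # mean(observed - predicted)
        "amd": sum(abs(e) for e in residuals) / n,        # absolute mean deviation
        "rmse": math.sqrt(sse / n),                       # assumption: divisor n
        "r2": 1.0 - sse / sst,
        "aic": n * math.log(sse / n) + 2 * n_params,      # assumption: least-squares AIC
    }


def akaike_weights(aic_values):
    """Akaike weights: relative likelihoods normalised to sum to one."""
    best = min(aic_values)
    relative = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


# Hypothetical observed and predicted volumes (m^3) for a two-parameter model.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
print(fit_statistics(obs, pred, n_params=2))
print(akaike_weights([10.0, 12.0]))
```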

The ranking method of Poudel & Cao (2013) was employed to determine the best model. Unlike traditional standard or ordinal ranks, this method shows not just the order of the models but also the magnitude of the differences between them. The five volume models with the lowest rank values were chosen for model validation and final model fitting. The rank analysis is in the form of:

$$R_{i} = 1 + \frac{(m - 1)(S_{i} - S_{\min})}{S_{\max} - S_{\min}}$$

where R_{i} is the relative rank of model *i* (*i* = 1, 2, …, *m*); S_{i} is the goodness-of-fit statistic produced by model *i*; S_{min} is the minimum value of the goodness-of-fit statistic; and S_{max} is the maximum value of the goodness-of-fit statistic.

Ranks of 1 and m were regarded as the best and worst, respectively, for each statistical criterion. Because this ranking system accounts for both the order of the models and the magnitude of the differences between them, relative ranks of 1.00, 1.05, 1.07, 1.11, 1.12, 4.49, and 7.00 for seven models, for example, indicate that the models fall into groups separated by large gaps.
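The relative rank can be sketched in a few lines. The sketch assumes the rank has the form R_{i} = 1 + (m − 1)(S_{i} − S_{min})/(S_{max} − S_{min}), which is consistent with the definitions above and with 1 and m being the best and worst ranks. The function name is hypothetical. The example uses the RMSE values reported for the five models in the final fitting.

```python
def relative_ranks(stats):
    """Relative ranks for a statistic where a lower value is better."""
    m = len(stats)
    s_min, s_max = min(stats), max(stats)
    if s_max == s_min:
        # All models tie on this statistic, so each gets the best rank.
        return [1.0] * m
    return [1.0 + (m - 1) * (s - s_min) / (s_max - s_min) for s in stats]


# RMSE (m^3) of models 4, 1, 5, 2 and 3 from the final model fitting.
rmse_final = [0.172, 0.185, 0.189, 0.190, 0.218]
ranks = relative_ranks(rmse_final)
print([round(r, 3) for r in ranks])
```

For statistics where a higher value is better (R^{2}, AIC_{w}) or where the target is zero (Ē), the values would have to be transformed before ranking, for example by negating them or taking absolute values. That transformation is an assumption, not a detail given in the text.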

Although there is no set of specific standards or tests that can be easily applied to determine the appropriateness of a model, a minimum validation procedure must be established to ensure the reliability and reasonable performance of a new model (Huang et al., 2003). The remaining 20% of the dataset was used for model validation. For the final model fitting with the validated volume models, the combined (100%) dataset was used, and rank analysis was performed to determine the best volume model for *Larix kaempferi* in the Central Region of South Korea.

In addition to the fit statistics used to evaluate the models, a simple linear regression (Zar, 1999) was used to compare the observed and predicted stem volumes of *Larix kaempferi* in the Central Region of South Korea. The observed and predicted stem volumes were related according to the following linear model: Predicted stem volume = *b*_{0} + *b*_{1} × Observed stem volume. If the volume model correctly estimated the stem volume of a tree, then the intercept (*b*_{0}) would not be significantly different from zero and the slope (*b*_{1}) would not be significantly different from one. A simultaneous *F*-test was also conducted to evaluate the hypotheses H_{0}: (*β*_{0}, *β*_{1}) = (0, 1) and H_{a}: (*β*_{0}, *β*_{1}) ≠ (0, 1). Simple linear regression and the simultaneous *F*-test have been used in previous research to evaluate model performance (e.g., Lee & Coble, 2002).
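The regression of predicted on observed volume and the simultaneous test of (*β*_{0}, *β*_{1}) = (0, 1) can be sketched as below. The F statistic is computed with the extra-sum-of-squares approach, comparing the residuals around the 1:1 line with those around the fitted line. The p-value uses the closed-form survival function of the F distribution with 2 numerator degrees of freedom. The function name and the data are hypothetical, and the sketch does not handle a perfect fit (zero residual variance).

```python
def simultaneous_f_test(observed, predicted):
    """Regress predicted on observed and test H0: (b0, b1) = (0, 1)."""
    n = len(observed)
    mean_x = sum(observed) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in observed)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed, predicted))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Residual sum of squares around the fitted regression line.
    sse_full = sum((y - b0 - b1 * x) ** 2 for x, y in zip(observed, predicted))
    # Residual sum of squares under H0, i.e. around the 1:1 line.
    sse_null = sum((y - x) ** 2 for x, y in zip(observed, predicted))
    df2 = n - 2
    f_stat = ((sse_null - sse_full) / 2.0) / (sse_full / df2)
    # For F(2, df2), P(F > f) = (1 + 2 f / df2) ** (-df2 / 2).
    p_value = (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)
    return b0, b1, f_stat, p_value


# Hypothetical observed and predicted stem volumes (m^3).
obs_v = [1.0, 2.0, 3.0, 4.0, 5.0]
pred_v = [1.1, 1.9, 3.2, 3.9, 5.1]
b0, b1, f_stat, p_value = simultaneous_f_test(obs_v, pred_v)
print(b0, b1, f_stat, p_value)
```

A large p-value means the joint hypothesis of zero intercept and unit slope is not rejected, which is how the test is interpreted in this study.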

## Results and Discussion

### 1. Initial model fitting and model validation

The initial model fitting was carried out with the seven volume models using 80% of the dataset. The models were evaluated using the fit statistics Ē, AMD, RMSE, R^{2}, AIC, and AIC_{w}. The results showed that the volume models with two variables, in the form of a linear model, constant form factor, quadratic, logarithmic, and Honer’s transformed variable, had better fit statistics than the volume models with a single variable (Table 4). The five best volume models, based on rank analysis, were chosen for model validation using the 20% dataset and for final model fitting using the 100% dataset. These five models were coded as models 1 to 5. Four of them use two variables, DBH and H. The model validation showed that model 1 had the smallest Ē, that model 3 (the single-variable model) and model 4 over-predicted the volume, and that models 2 and 5 under-predicted it. The AMD ranged from 0.023 (model 4) to 0.065 (model 3); the RMSE ranged from 0.035 (model 4) to 0.090 (model 3); the R^{2} ranged from 0.978 (model 3) to 0.997 (model 4); and the AIC ranged from -725.25 (model 4) to -522.59 (model 3). The AIC_{w} ranged from <0.001 (model 3) to 0.997 (model 4). The overall rank analysis of the model validation showed that model 4, which is in logarithmic form, ranked best, while the single-variable model (model 3) ranked poorest of the five. The relative ranks of the models for the validation dataset, based on the fit statistics, are presented in Fig. 3. The model with the smallest area inside the box represents the best model. Model 4 had the smallest area, followed by model 1, model 5, model 2, and model 3.

The Ē of the five models was plotted against the DBH classes of South Korea (Fig. 4) for further evaluation. Model 4 had an Ē of almost zero in the <6 cm and 6 to 16 cm DBH classes and underestimated volume in the 18 to 28 cm and >30 cm DBH classes. Model 1 overestimated volume in the <6 cm and 6 to 16 cm classes and underestimated it in the 18 to 28 cm and >30 cm classes. Model 2 underestimated volume in all diameter classes, while model 3 overestimated it in the <6 cm, 6 to 16 cm, and >30 cm classes and underestimated it in the 18 to 28 cm class. Model 5 underestimated volume in the <6 cm, 6 to 16 cm, and 18 to 28 cm classes but overestimated it in the >30 cm class.

### 2. Final model fitting

The five models were refitted to the combined (100%) dataset for the final model fitting. The resulting parameter estimates are provided in Table 5. The performance of the volume models was evaluated using the same fit statistics as before (Table 6). Model 4 remained the top-ranked of the five models, with an Ē of 0.002 m^{3}, AMD of 0.103 m^{3}, RMSE of 0.172 m^{3}, R^{2} of 0.924, and an AIC of -1906 with an AIC_{w} of 1.0. Model 4 had the Ē second nearest to zero, the second lowest AMD, the lowest RMSE, and the highest R^{2}. Model 1 had the Ē nearest to zero (0), and model 3 had the third nearest (-0.004 m^{3}), followed by models 2 (0.027 m^{3}) and 5 (0.33 m^{3}). Only model 3 over-predicted the volume. Model 2 had the lowest AMD (0.102 m^{3}); model 5 had the third lowest (0.104 m^{3}), followed by models 1 (0.110 m^{3}) and 3 (0.143 m^{3}). For the RMSE, model 4 was the best. Model 1 had a value of 0.185 m^{3}, followed by models 5, 2, and 3 with values of 0.189, 0.190, and 0.218 m^{3}, respectively. For R^{2}, where a higher value indicates better performance, model 4 again performed best (0.924). Model 5 (0.908) was second best, followed by model 2 (0.907), model 3 (0.877), and model 1 (0.798). The AIC (AIC_{w}) values of the remaining models, arranged from lowest to highest, were: model 1, -1824 (<0.001); model 5, -1806 (<0.001); model 2, -1802 (<0.001); and model 3, -1651 (<0.001).

The overall ranking showed that the best performance was given by model 4 (1.053), followed by model 2 (3.001), model 1 (3.014), model 5 (3.114), and model 3 (4.265). Although model 3, which uses only one predictor (DBH), had the poorest performance, this model is still very important in cases where only DBH is available. Having only DBH in a field inventory can be unavoidable considering the physical structure of the forests of South Korea. In addition, measuring height in the field is always accompanied by errors, as tree height measurement instruments are not 100% accurate. High accuracy is only guaranteed if a direct, destructive measurement is taken. DBH, in contrast, is easy to measure directly in the field, so high accuracy is possible.

In line with findings for other *Larix* species in South Korea, the logarithmic model (V=aDBH^{2}*H) has been suggested as the best model for predicting the stem volume of *L. leptolepis* stands in Jinan, Chonbuk, South Korea (Jeon et al., 2007).

The relationship between the observed and predicted stem volumes was plotted using a simple linear regression (Fig. 5). The stem volumes were found to be related according to the following linear model: Predicted stem volume = 0.0794 + 0.8631 × Observed stem volume. Model 4 underestimated volumes <0.45 m^{3} and overestimated volumes >0.55 m^{3}. The overestimation became greater as the volume increased. The results show that most of the observations are below 1.00 m^{3}, which agrees with the average volume of 0.8190 m^{3} for a single *Larix kaempferi* tree in South Korea (Kang & Son, 2016). Additionally, for model 4, a simultaneous *F*-test was conducted to evaluate the hypotheses H_{0}: (*β*_{0}, *β*_{1}) = (0, 1) and H_{a}: (*β*_{0}, *β*_{1}) ≠ (0, 1). The computed p-value of 0.5131 indicates that there is no significant difference between the observed and predicted volumes.

All four stem volume models with two variables, DBH and total height, showed better predictive capability than the single-variable volume model when estimating the stem volume of *L. kaempferi* in the Central Region of South Korea. Model 4 consistently performed best, from the initial model fitting through model validation to the final model fitting. On the other hand, model 3, the single-variable model, is still recommended in situations where total height is unavailable. Volume models with only a single variable should not be ignored, considering that forest inventories include directly measured tree DBH but tree heights that are only remotely measured. Tree height is difficult to measure remotely with accuracy, and this inaccuracy can result in biased estimates when tree height is included as an independent variable in volume and biomass models. Given these sources of error, it is necessary to consider volume models that use a single variable such as DBH, which can be measured accurately in the field.