ABSTRACT
Objective
To compare skeletal ages determined using three different regression methods from measurements made on cervical vertebrae from lateral cephalometric radiographs (LCRs) with the skeletal age determined from hand-wrist radiographs (HWRs).
Methods
LCRs and HWRs of 794 individuals (329 boys, 465 girls) aged 7-18 years were examined. The hand-wrist skeletal age of the participants was determined using the Greulich-Pyle (GP) atlas. Forty-four linear and nine angular morphometric measurements in the C2-C5 vertebrae were made in LCRs. Vertebral skeletal age (VSA) was determined in both sexes using Ridge, the least absolute shrinkage and selection operator (LASSO), and ElasticNet regression methods. The study results were evaluated using R2 (explainability power). Bland-Altman analysis was performed to determine the consistency of chronologic age (CA), GP age, and VSAs.
Results
LASSO regression showed the highest explainability power for VSA, with boys at 0.783 and girls at 0.741. In both sexes, the vertebral depth of concavities had high beta coefficients, and the posterior height of C3 vertebrae (TVup-TVlp) had the highest beta coefficient in boys in LASSO regression. The limits of agreement of GP age with both CA and VSA were wider in boys than in girls. The limits of agreement between CA and VSAs were wider in girls than in boys.
Conclusion
Although high R2 values were obtained, VSA showed no superiority over CA in the assessment of skeletal age, and no significant clinical advantage was observed. For the Turkish population, using GP age may be more accurate for determining skeletal age in orthodontic treatment planning.
Main Points
• The least absolute shrinkage and selection operator regression exhibited the highest predictive accuracy in estimating vertebral skeletal age.
• Vertebral depth of concavities emerged as a significant predictor of skeletal age in both sexes.
• Vertebral skeletal age estimation did not demonstrate a clinical advantage over chronological age.
• Vertebral skeletal age estimation showed greater variability in boys than in girls, indicating lower consistency with hand-wrist skeletal age assessment.
INTRODUCTION
Assessing growth potential during pre-adolescence and adolescence is crucial, and various indicators such as body height and weight, sexual maturation, chronologic age (CA), dental development, and skeletal development can be used to identify growth stages. The identification of the growth and development stage of an individual has a significant impact on the diagnosis, treatment planning, and treatment outcome of orthodontic treatment. Although CA is commonly used, it may not always be a reliable indicator for growth stages due to variations in the timing, velocity, and duration of growth among individuals.1 Skeletal age is commonly evaluated in orthodontics via hand-wrist radiographs (HWRs) or lateral cephalometric radiographs (LCRs).2
The Greulich-Pyle (GP) atlas is commonly used to determine patients’ skeletal age by evaluating the maturation of the hand and wrist bones. The main use of skeletal age in orthodontic treatment is the determination of the timing of orthopedic treatment or the confirmation of the end of growth.3 HWRs are considered the gold standard; the other most commonly used method for evaluating skeletal maturity in orthodontics is cervical vertebral maturation (CVM), which is based on assessing the maturation stage of the cervical vertebrae.4, 5 It is often suggested that HWRs in orthodontics should be limited to cases where the information obtained is considered essential for treatment planning and cannot be obtained by other means, given the importance of minimizing radiographic exposure.6
To objectify skeletal age assessment and make it more efficient, many artificial intelligence (AI) systems have been developed to increase diagnostic accuracy, mostly via HWRs.7 Due to the significant correlation between hand-wrist bone maturation and CVM, most AI studies have focused on classifying developmental phases and comparing AI-based classifications with human diagnoses. However, skeletal age estimation has not been thoroughly studied. The clinical applicability of these studies has been limited because they focused on evaluating success metrics rather than on automated systems.8, 9 To address this gap, this study aims to evaluate cervical vertebrae maturity using a quantitative method of morphologic changes.
Regression-based methods determine how independent factors affect a dependent variable by identifying a non-deterministic function representing the independent variables’ effect on the dependent variable’s mean. While regression procedures are straightforward, they require a suitable model for data fitting. In clinical application, predictions can be made by entering the measured parameters into the regression formula.10 Ridge, the least absolute shrinkage and selection operator (LASSO), and ElasticNet are regression models commonly used in multiple linear regression problems to prevent overfitting. Selecting the proper technique and fine-tuning the hyperparameters via cross-validation are essential for constructing a model that effectively manages bias and variance, thereby enhancing predictive accuracy.11
The explainability power (R2) provides valuable information regarding the degree to which the analyzed data can explain the dependent variable. The higher the R2 value, the higher the capacity of the obtained data to describe the dependent variable.12-14 The predominant methodology in the scholarly literature for estimating skeletal age through vertebral parameters has been stepwise regression analysis.15-18 To our knowledge, no previous study in the literature includes a quantitative approach with AI regression methods to determine skeletal age through LCRs.
Although correlation analysis has been used in studies to compare actual and regression-predicted skeletal age, it only evaluates the association between variables, not their differences.15, 17 The Bland-Altman analysis offers an alternative approach by quantifying the agreement between two quantitative measures by calculating the mean difference and agreement limits. However, Bland-Altman plots only depict the range of agreement without indicating whether it is acceptable. Acceptable limits must be determined based on predefined clinical requirements, biologic considerations, or other relevant goals.19 Also, there is limited research explicitly addressing R2 in skeletal age determination using vertebral measurements and assessing the compatibility and repeatability [vertebral skeletal age (VSA)-GP age] of this method through Bland-Altman analysis.20
The aim of this study was to develop a predictive model of VSA by using Ridge, LASSO, and ElasticNet regression models.
The null hypothesis of the study was that there would be no significant difference between the vertebral age prediction models developed using Ridge, LASSO, and ElasticNet regression.
METHODS
Study Design
The study received ethical approval from the Research Ethics Committee of Recep Tayyip Erdoğan University (date: 02.02.2023 and protocol number: 33) and involved a retrospective analysis of LCRs and HWRs from patients referred for orthodontic treatment at the Department of Orthodontics, Faculty of Dentistry at Recep Tayyip Erdoğan University. The study was conducted in accordance with the applicable ethical principles of the World Medical Association Declaration of Helsinki of 1964 and later versions.21 Informed written consent forms, which included the use of patient records in scientific studies, were obtained from all patients at the beginning of treatment. Patients who met specific criteria were included in the study, including individuals of Turkish ethnicity, between the ages of 7-18 years, with good quality LCRs and HWRs, normal growth and development, no systemic disease, no congenital deformities, no bone syndromes, no previous hand-wrist injury, and good nutrition without serious illness. LCRs and HWRs were taken on the same day and all LCRs included in the study were of sufficient quality, with a clear view of the cervical spine (C2-C5).
The LCRs and HWRs were acquired using a Planmeca Promax 2D S2 imaging unit (Planmeca Oy; Helsinki, Finland) with specific exposure parameters (66 kVp, 10 mA, and 10.5 s for LCRs; 60 kVp, 4 mA, and 10.5 s for HWRs). During LCR acquisition, ear rods and a nasal support were used to stabilize the head, and the Frankfort horizontal plane was set parallel to the floor. HWRs were obtained using a specific focus-to-film distance of 170 cm and 30° angulation of the thumb to allow for the depiction of the sesamoid bone.
The sample size calculation assumed 53 independent variables, the adjusted R2 of 0.686 reported by Varshosaz et al.15, 95% confidence (1-α), 95% test power (1-β), and an effect size of f2=2.185. Under these assumptions, the minimum required number of samples was determined as 69.
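The reported effect size is consistent with Cohen’s f2 computed from the adjusted R2 of Varshosaz et al.15 The text does not state which formula was used, so the conventional definition below is an assumption; it is shown only as a check on the arithmetic.

```latex
\[
f^2 = \frac{R^2}{1 - R^2} = \frac{0.686}{1 - 0.686} = \frac{0.686}{0.314} \approx 2.185
\]
```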
A total of 1257 individuals’ LCRs and HWRs were reviewed, and radiographs from 463 individuals who did not meet the inclusion criteria were excluded from the study. We analyzed 794 sets of radiographs (LCR and HWR) of untreated subjects (329 boys, 465 girls) and identified 27 cervical vertebral reference points (Figure 1) for the analysis and obtained 44 linear and nine angular morphometric measurements (Figures 2 and 3), which were located in the C2-C5 vertebrae. The GP age was determined using the HWR images.
All LCRs were calibrated using a 45-mm-long bar, and linear and angular measurements were performed by an orthodontist with 4 years of orthodontic clinical experience using AudaxCeph version 4.2.0.3101 software. To assess the intra-rater and inter-rater agreement, a random sample of 393 LCRs and HWRs was chosen. The measurements were repeated after 1 month by the same orthodontist with 4 years of clinical experience to determine intra-rater reproducibility. Another orthodontist with 10 years of clinical experience performed the measurements to evaluate inter-rater reliability. The intraclass correlation coefficient (ICC) was used to assess the measurement error.
Regression Methods
Ridge, LASSO, and ElasticNet are all regression models used in multiple linear regression problems to prevent overfitting. Choosing the appropriate method and tuning the hyperparameters through cross-validation are crucial for building a model that balances bias and variance, thus improving predictive performance.12, 14
Multicollinearity occurs when independent variables in a regression model are highly correlated, making it difficult to determine each variable’s effect. This issue can be detected using the variance inflation factor (VIF) and tolerance values. A VIF above 10 or a tolerance below 0.2 indicates multicollinearity. Regularization techniques such as Ridge, LASSO, and ElasticNet address multicollinearity by adding a penalty term to the regression model, which helps shrink the coefficients of correlated variables.
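The VIF and tolerance screening described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors’ EViews procedure; the function name `vif_and_tolerance` and the three simulated predictors are hypothetical. Each predictor is regressed on the remaining predictors, and VIF is taken as 1/(1 - R2), with tolerance as its reciprocal.

```python
import numpy as np

def vif_and_tolerance(X):
    """Return (vif, tolerance) arrays with one entry per column of X."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vif = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress predictor j on an intercept plus all other predictors.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        r2 = 1.0 - ss_res / ss_tot
        vif[j] = 1.0 / (1.0 - r2)
    return vif, 1.0 / vif

# Synthetic illustration (not study data): b is nearly a copy of a.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = a + 0.1 * rng.normal(size=200)
c = rng.normal(size=200)
vif, tol = vif_and_tolerance(np.column_stack([a, b, c]))
```

In this synthetic example the two nearly collinear columns exceed the VIF threshold of 10, whereas the independent column stays close to 1.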
Ridge regression incorporates an L2 penalty, the sum of squared coefficients, into the loss function. This technique is particularly effective when dealing with many small and approximately equal coefficients because it distributes the values evenly among correlated variables. By using the lambda (λ) parameter, Ridge regression controls the strength of the L2 regularization. This regularization term penalizes large coefficients, thereby reducing their variance without eliminating any variables, and mitigating multicollinearity in the model.12
LASSO regression applies an L1 penalty, which is the sum of the absolute values of coefficients. This approach can shrink some coefficients to zero, effectively performing variable selection by eliminating less important predictors. This makes it particularly useful when only a few predictors are expected to be significant. LASSO regression uses the lambda (λ) parameter to control the strength of the L1 regularization, which penalizes the absolute values of the coefficients and enables variable selection by shrinking some coefficients to zero.13
ElasticNet combines both L1 and L2 penalties, offering a balance between Ridge and LASSO regressions. This approach is advantageous when multiple correlated predictors are present, and some need to be eliminated. ElasticNet regression uses lambda (λ) and alpha (α) parameters. Lambda (λ) controls the overall strength of the regularization, and alpha (α) determines the mix between L1 and L2 regularization. When alpha is 0, ElasticNet behaves like Ridge regression; when alpha is 1, it behaves like LASSO regression. Values between 0 and 1 provide a balance between the two methods.
The optimal values of the regularization parameters in Ridge, LASSO, and ElasticNet regression are determined by minimizing the mean squared error.14
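The three penalties can be written in one generic form. The notation below is an assumption introduced for illustration: RSS denotes the residual sum of squares, p the number of predictors, and the intercept β0 is left unpenalized. Software packages differ in scaling conventions (for example, dividing the RSS by the sample size or halving the L2 term), and the exact form used by the software in this study is not stated here.

```latex
\[
\mathrm{RSS}(\beta_0, \beta) = \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^2
\]
\[
\begin{aligned}
\text{Ridge:} \quad & \min_{\beta_0,\beta}\; \mathrm{RSS}(\beta_0, \beta) + \lambda \sum_{j=1}^{p} \beta_j^2 \\
\text{LASSO:} \quad & \min_{\beta_0,\beta}\; \mathrm{RSS}(\beta_0, \beta) + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \\
\text{ElasticNet:} \quad & \min_{\beta_0,\beta}\; \mathrm{RSS}(\beta_0, \beta) + \lambda \Bigl[ \alpha \sum_{j=1}^{p} \lvert \beta_j \rvert + (1-\alpha) \sum_{j=1}^{p} \beta_j^2 \Bigr]
\end{aligned}
\]
```

Setting α=0 in the ElasticNet expression recovers the Ridge penalty, and α=1 recovers the LASSO penalty, which matches the behavior described above.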
The performance of these models is typically evaluated using metrics such as R2 and the Akaike information criterion (AIC), which measure the model’s fit and complexity.12, 14 Cross-validation is a method used to evaluate the performance of machine-learning models. Among various methods, k-fold cross-validation is the most widely used. The dataset is divided into k parts; each part is used in turn as the test dataset, and the remaining parts are used as the training dataset. This process is repeated k times, and the mean of the test errors obtained each time is used to estimate the model’s performance. The k-fold cross-validation method ensures that all the samples in the dataset are used both to train and to test the model. After the k iterations, the mean error is calculated for the training and test data and expresses how much the predicted values deviate from the actual values. A lower mean error means a better fit and more accurate predictions. Cross-validation, especially k-fold cross-validation, is often used to tune the hyperparameters (lambda and alpha), ensuring that the model generalizes well to new data.12, 14
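The k-fold procedure described above can be sketched for Ridge regression as follows. This is an illustrative example on synthetic data, not the authors’ implementation: the helper names `ridge_fit` and `kfold_mse`, the simulated predictors, and the candidate lambda grid are all hypothetical, and the closed-form Ridge solution with an unpenalized intercept is one common convention among several.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form Ridge fit on centered data; the intercept is not penalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    p = X.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)
    return beta, y_mean - x_mean @ beta

def kfold_mse(X, y, lam, k=10, seed=0):
    """Mean squared error on the held-out folds, averaged over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        beta, b0 = ridge_fit(X[train], y[train], lam)
        pred = X[test] @ beta + b0
        errors.append(np.mean((y[test] - pred) ** 2))
    return float(np.mean(errors))

# Synthetic data (not study data): 120 samples, 5 predictors.
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 5))
true_beta = np.array([1.0, 0.5, 0.0, 0.0, -0.3])
y = X @ true_beta + 10 + rng.normal(scale=0.5, size=120)

# The same seed gives the same folds for every lambda, so errors are comparable.
lambdas = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
cv_errors = {lam: kfold_mse(X, y, lam) for lam in lambdas}
best_lambda = min(cv_errors, key=cv_errors.get)
```

The lambda with the lowest mean test error is retained, which mirrors the minimum mean squared error criterion described in the text.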
Statistical Analysis
Statistical analysis was performed using the EViews v12 program (IHS Markit Ltd, London, UK). Descriptive statistics were calculated as mean, standard deviation, median, minimum/maximum (min./max.), kurtosis, and skewness. Vertebral morphometric measurements were included to generate a calculated VSA. The ENET-ElasticNet regularization method was used for estimating skeletal age. Estimation was made using the Ridge, LASSO, and ElasticNet regression models included in the method. The lambda hyperparameter was used in the Ridge and LASSO methods, and the optimal lambda value was determined according to the minimum mean square error within 50 periods, with a min./max. ratio of 0.0001. In ElasticNet regression, both the lambda and alpha regularization parameters were used, and the alpha value was automatically set to 0.5. Bland-Altman analysis was used to assess the agreement among different methods of age estimation, including the GP age, VSA (Ridge, LASSO, ElasticNet), and CA. Limits of agreement were identified.
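The limits of agreement in a Bland-Altman analysis can be computed as in the sketch below. It assumes the conventional definition (mean difference ± 1.96 standard deviations of the differences), which the text does not state explicitly; the function name `bland_altman` and the age values are hypothetical and are not study data.

```python
from statistics import mean, stdev

def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)   # mean difference between the two methods
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired ages in years (illustration only).
gp_age = [10.0, 12.5, 14.0, 9.5, 16.0, 13.0]
vsa = [10.5, 12.0, 13.0, 10.5, 15.5, 13.5]

bias, lla, ula = bland_altman(gp_age, vsa)
width = ula - lla  # width of the limits of agreement
```

A bias near zero with wide limits indicates that the two methods agree on average but can differ considerably for an individual; whether a given width is acceptable remains a clinical judgment.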
RESULTS
Measurement Error
The intra-rater and inter-rater agreements were estimated using the intra-class correlation coefficient (ICC) and were found to be excellent for all vertebral measurements (ICC ≥0.977, and ICC ≥0.960, respectively). Both intra-observer and inter-observer agreements of GP skeletal age were 0.997 (95% confidence interval: 0.996 to 0.997) with excellent reliability.
The First Phase of the Regression Methods
The descriptive statistics in the study are demonstrated for each sex (Table 1). Independent variables with VIF values greater than 10 are shown in bold (Table 2). Our study was conducted separately for both girls and boys.
Statistical analysis consisted of two parts. In the first part, all independent variables (vertebral measurements) were evaluated. The target variable was GP age. To obtain the VSA, three regression methods were used. In the second part, the analyses were repeated with the variables with the highest beta coefficients obtained from each regression model.
In the initial phase of the statistical analysis, the lambda values were chosen based on minimum mean square error values. The beta coefficients and lambda values for each regression model were determined separately for boys and girls, and the results are presented in Table 3. The R2 values obtained in the first stage of the statistical analysis were between 0.799 and 0.804.
In boys, all variables except one in the Ridge and ElasticNet regressions and all variables except 15 in the LASSO regression had non-zero beta coefficients. In girls, all variables in the Ridge and ElasticNet regressions and all variables except 11 in the LASSO regression had non-zero beta coefficients.
The Second Phase of the Regression Methods
Due to the high number of independent variables (n=53) statistically evaluated in our study, in the second part of the analysis, the beta coefficients were examined to determine which variables had the greatest impact on the regression models and to select the most important variables for clinical applicability. Separate analyses were conducted for boys and girls, and the eight variables with the highest coefficients in each regression model were chosen.
For both girls and boys, the eight measurements with the highest coefficients were selected in each regression model, giving a total of 24 selected measurements. To reduce these 24 measurements in boys, the three measurements with the highest beta coefficients (SVD, FiVD, FVD), which were common to all three regression models, were selected first, followed by PH3, TVD, TVup-TVlp, and Y3, which were also common to all three models. In addition, UW3, which was common to the ElasticNet and Ridge regressions, was selected. For boys, the selected measurements were SVD, FiVD, FVD, PH3, TVD, TVup-TVlp, Y3, and UW3 (Figure 4a, Table 4). To reduce the 24 measurements in girls, SVD, FVD, TVum-TVd, TVpm-TVam, UW4, and Y3, which were common to all three regression models and had high beta coefficients, were selected.
In addition, FiVD, which is common to Ridge and ElasticNet regression, and UW5, which is common to LASSO and Ridge regression, were selected. For girls, the selected measurements were SVD, FVD, TVum-TVd, TVpm-TVam, UW4, Y3, FiVD, and UW5 (Figure 4b, Table 4). The lambda values and beta coefficients were recalculated based on new datasets created separately for each sex.
In the second phase of the statistical analysis, the lambda values were chosen based on minimum mean square error values. The minimum mean square error for boys was obtained at lambda values 0.0762, 0.000148, and 0.002344 for the Ridge, LASSO, and ElasticNet, respectively. For girls, the minimum mean square error was obtained at lambda values of 0.04913915, 0.00003718, and 0.00000113 for the Ridge, LASSO, and ElasticNet regression, respectively.
The R2 values obtained in the second stage of the statistical analysis were between 0.740 and 0.783. The highest R2 in both boys and girls was obtained using LASSO regression (R2=0.783 and 0.741, respectively) (Table 5). The performance of each regression model was assessed using 10-fold cross-validation.
The means and errors for both the training and test datasets from the initial and second parts of the analyses are presented in Table 6.
Vertebral skeletal age formulas obtained in each regression model in boys:
Ridge regression: VSA=0.318*FVD + 0.561*FiVD + 0.307*PH3 + 0.487*SVD - 0.059*TVD + 0.33*TVup-TVlp + 0.025*UW3 + 0.252*Y3 + 0.889
LASSO regression: VSA=0.185*FVD + 0.534*FiVD + 0.019*PH3 + 0.448*SVD + 0*TVD + 0.647*TVup-TVlp + 0*UW3 + 0.259*Y3 + 0.868
ElasticNet regression: VSA=0.323*FVD + 0.564*FiVD + 0.306*PH3 + 0.483*SVD - 0.048*TVD + 0.326*TVup-TVlp + 0.031*UW3 + 0.249*Y3 + 0.906
Vertebral skeletal age formulas obtained in each regression model in girls:
Ridge regression: VSA=0.528*FiVD + 0.909*FVD + 0.638*SVD + 0.023*TVpm-TVam + 0.508*TVum-TVd - 0.138*UW4 - 0.064*UW5 + 0.456*Y3 + 1.988
LASSO regression: VSA=0.481*FiVD + 0.935*FVD + 0.614*SVD + 0*TVpm-TVam + 0.513*TVum-TVd - 0.149*UW4 - 0.065*UW5 + 0.494*Y3 + 1.892
ElasticNet regression: VSA=0.543*FiVD + 0.891*FVD + 0.642*SVD + 0.033*TVpm-TVam + 0.498*TVum-TVd - 0.126*UW4 - 0.055*UW5 + 0.435*Y3 + 1.985
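As an illustration of how such a formula could be applied, the sketch below evaluates the LASSO formula for boys. The coefficients and intercept are copied from the formula above; the function name `vertebral_skeletal_age` and the input values are hypothetical placeholders, and real inputs would need the same units and scaling as the original measurements, which this section does not specify. Names such as TVup-TVlp denote a single measurement between two landmarks, not a subtraction.

```python
# Coefficients copied from the LASSO formula for boys given above.
LASSO_BOYS = {
    "FVD": 0.185, "FiVD": 0.534, "PH3": 0.019, "SVD": 0.448,
    "TVD": 0.0, "TVup-TVlp": 0.647, "UW3": 0.0, "Y3": 0.259,
}
LASSO_BOYS_INTERCEPT = 0.868

def vertebral_skeletal_age(measurements, coefficients, intercept):
    """Evaluate a linear VSA formula for one patient's measurements."""
    missing = set(coefficients) - set(measurements)
    if missing:
        raise KeyError(f"missing measurements: {sorted(missing)}")
    return intercept + sum(
        coefficients[name] * measurements[name] for name in coefficients
    )

# Placeholder inputs (every measurement set to 1.0); not patient data.
example = {name: 1.0 for name in LASSO_BOYS}
vsa = vertebral_skeletal_age(example, LASSO_BOYS, LASSO_BOYS_INTERCEPT)
```

Measurements with a zero coefficient (TVD and UW3 in this model) do not change the result, which reflects the variable selection performed by LASSO.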
The highest power of explainability was obtained using LASSO regression for both girls and boys (Table 5).
Bland-Altman Analysis
Figures 5 and 6 display the Bland-Altman plots illustrating the consistency of inter-age measurements for boys and girls, respectively, including CA, GP age, Ridge regression age, LASSO regression age, and ElasticNet age. In each plot, the solid line indicates zero bias, the middle dashed line represents the bias, and the outer dashed lines define the limits of agreement.
DISCUSSION
This study identified key findings in skeletal age prediction using Ridge, LASSO, and ElasticNet regression models. Among these, LASSO regression demonstrated the highest R² values (0.783 in boys and 0.741 in girls). Additionally, in both sexes, the vertebral depth of concavities exhibited high beta coefficients, highlighting their significance in skeletal age estimation. The Bland-Altman analysis indicated that the limits of agreement for GP age with CA and VSA were wider in boys than in girls, whereas the limits of agreement between CA and VSA were wider in girls than in boys.
Furthermore, although LASSO exhibited the highest R², the observed differences in predictive accuracy among the Ridge, LASSO, and ElasticNet regression models suggest that the assumption of equal model performance does not hold. These differences led to the rejection of the null hypothesis (H₀), which stated that there would be no difference between VSA prediction models developed using Ridge, LASSO, and ElasticNet regression.
Morphologic changes in the cervical vertebrae are considered useful indicators of skeletal development, although the CVM method has some limitations, such as subjectivity and inadequate validity and reproducibility.22 We attempted to overcome these restrictions by assessing VSA using morphometric measurements. CVM and hand-wrist methods may be consistent,9, 23 making them reliable skeletal maturity indicators, especially when HWR images are unavailable.24
The sample sizes in the literature for skeletal age estimation from vertebral measurements varied from 66 to 958 individuals. Our study sample size was larger than in many studies in the literature, except for Roman et al.’s24 study.15-17,25
There are noticeable differences between boys and girls in the timing of the growth spurt (pre-peak, peak, and post-peak). Hägg and Taranger26 reported that pubertal growth spurts begin on average at the age of 10 years in girls and 12 years in boys. Fishman27 also reported that the pubertal growth spurt ended at the age of 14.77 years in girls and 16.4 years in boys. In the present study, VSA was determined separately in boys and girls because the difference in growth and development between the sexes is often considered important.24, 26, 27
Previous studies examined C2-C5,9, 28, 29 C2-C4,4, 8, 30 and C3-C415-17,25 vertebrae for estimating skeletal age and maturation from cervical vertebrae. In our study, we focused on evaluating C2-C5 vertebrae.
The age range of the sample of our study (7-18 years) was wider than in Caldas et al.’s25 study (7-15.9 years), Mito et al.’s17 study (7-14.9 years), and Alhadlaq and Al-Maflehi’s16 study (10-15 years).
Caldas et al.25 reported that the anterior (TVua-TVla), median (TVum-TVd), and posterior (TVup-TVlp) heights of the C3 vertebrae increased between 10 and 13 years, and the anterior (FVua-FVla), median (FVum-FVd), and posterior (FVup-FVlp) heights of the C4 vertebrae increased between 11 and 13 years in girls. In addition, they reported that the anterior (TVua-TVla), median (TVum-TVd), and posterior (TVup-TVlp) heights and median width (TVpm-TVam) of the C3 vertebrae increased between 12 and 15 years, but no significant changes were observed in the C4 vertebral measurements in boys.25 Mito et al.17 reported that the anterior, median, and posterior heights of the C3 and C4 vertebrae increased rapidly from age 10 to 13 years in girls.
Alhadlaq and Al-Maflehi16 reported an increase in the heights of the C3 and C4 vertebrae between 10-15 years, but the median width did not change in this period in boys. In the present study, the median height of the C3 vertebrae (TVum-TVd) in girls and the posterior height of the C3 vertebrae (TVup-TVlp) in boys had high beta coefficients, and the coefficients of C3 height measurements were high. However, the concavity depth of all vertebrae may have been more pronounced than C4 height measurements due to the wider age range compared to other studies,16, 17, 25 and the higher number of independent variables. Roman et al.24 found that the most influential variable in determining the vertebral maturation period was the vertebral depth of concavity.
Likewise, concavity depth at the lower border of C4 (FVd) and C3 (TVd) vertebrae in girls and concavity depth at the lower border of C5 (FiVD) and C2 (SVD) vertebrae in boys were found to be the most influential variables in skeletal age estimation.
Generally, stepwise regression has been used in studies to obtain VSA.15-18 Varshosaz et al.15 reported that the anterior length of the fourth vertebrae was the most important variable for determining skeletal age by performing a stepwise multivariable regression analysis. The focus of the present study was to introduce different regression models for detecting VSA. The power of explainability in their study was R2=0.686, whereas, in our study, it was R2=0.741 in girls and R2=0.783 in boys.15 Although both studies were conducted in similar age groups, our study provided separate evaluations for boys and girls. Difference in variables, sample size, ethnic differences, and the use of different regression models may have influenced the results.
Although many studies have reported that evaluating cervical vertebrae with morphologic and morphometric methods yields successful results in skeletal age estimation,16,17,23-25,29,31 Beit et al.20 reported that methods based on vertebral morphology were insufficient for estimating skeletal age. Their study used ratio measurements together with the SI angle, which was also included in our study. In the first part of our statistical analysis, the beta coefficient of the SI angle was low, as in the study of Beit et al.20 Thus, the SI angle was excluded from the second part of the statistical analysis. The explanatory power of our model (R2=0.783 for boys, R2=0.741 for girls) was higher than that reported by Beit et al.20 (R2=0.693 for boys and R2=0.671 for girls). Although our R2 values are higher than those in the studies by Varshosaz et al.15 and Beit et al.20, VSA did not offer a sufficient clinical advantage for predicting skeletal age.
It is important to evaluate the differences between the two methods to assess their compatibility and reproducibility. Bland-Altman analysis was used to examine the agreement between GP age, CA, and VSA.
Varshosaz et al.15 evaluated the relationship using the correlation method and stated that LCRs are useful for skeletal age estimation and might be an alternative to HWRs, with the advantage of radiation reduction. In the study of Beit et al.20, the limits of agreement between CA and GP skeletal age [upper limit of agreement (ULA) and lower limit of agreement (LLA); in boys ULA: 2.1, LLA: -1.7; in girls ULA: 2.2, LLA: -1.2] were narrower than in our study (in boys ULA: 2.17, LLA: -2.36; in girls ULA: 1.41, LLA: -2.64). They reported that the agreement between CA and GP age was higher than the agreement between GP age and VSA in both sexes.20 In our study, in the plots of GP age against both CA and VSA (Ridge, LASSO, ElasticNet), the limits of agreement were wider in boys than in girls (Figures 5a, e, f, g, 6a, e, f, g). The limits of agreement of CA-VSA (Ridge, LASSO, ElasticNet) were wider in girls than in boys (Figures 5b, c, d, 6b, c, d). Similar to our findings, by comparing GP age with VSA and CA, Beit et al.20 reported that VSA was not superior to CA. Therefore, differences in interpretation based on statistical analysis methods are important.
In studies performed to obtain VSA, ratio or angular measurements were generally used.15-17,20,25 In the present study, only linear and angular measurements were included. Although image magnification was mentioned as a disadvantage in the use of linear measurements,16 the power of explainability in our study was higher than that obtained with the ratio measurements used in other studies.15, 20
Circumpubertal growth differences are more closely related to skeletal age than CA. Variations in the maturation stage are closely associated with changes in when and how much growth happens. Comprehending the development of the oro-facial region is crucial for orthodontic therapy. Determining skeletal age is important in creating effective orthodontic treatment plans because patients grow at different times, durations, and velocities. Orthodontic treatment for growth modification requires proper patient selection, appliance prescription, and compliance. Clinical decisions involving extra-oral traction forces, functional appliances, extraction vs. non-extraction therapy, or orthognathic surgery are primarily based on growth considerations.32, 33
The methods mentioned in our study have provided useful but limited information on determining the timing of orthopedic treatment or confirming the end of growth. Clinicians should know the average differences between chronologic and skeletal ages for each sex and identify the ages at which concordance is good or falls within limits that are clinically acceptable for the treatment or purpose. Suri et al.32 reported that a 0.5-year difference between skeletal age and CA was acceptable in clinical practice. Despite the high R2 values, no significant clinical advantage was observed when VSA was compared with CA in the present study.
Study Limitations
Skeletal age is influenced by ethnic factors.34 To avoid ethnic influences on skeletal growth and development, only individuals of Turkish ethnicity were included in this study. Although GP atlas assessment has been reported to exhibit minimal inter-observer and intra-observer discrepancies, it should be noted that this evaluation is inherently subjective.35
Future studies should be conducted using a group-based approach, employing larger sample sizes and encompassing diverse age ranges within the groups. Variations in vertebral maturations may exhibit dissimilarities across distinct age cohorts. Evaluations can be made about which vertebral variables play a more important role in different age groups.16, 17, 25
This study had several strengths. First, with a sample size of 794 individuals (329 boys, 465 girls), it included a larger dataset than many previous studies evaluating skeletal age through cervical vertebrae measurements, except for Roman et al.’s24 study.15-17,25 Second, by incorporating multiple regression models (Ridge, LASSO, and ElasticNet), this study enabled a comparative assessment of different predictive methodologies, providing insights into their strengths and weaknesses. Additionally, the Bland-Altman analysis enhanced reliability by quantifying the agreement between VSA and GP skeletal ages, thereby improving the interpretability of the findings. However, some limitations should be acknowledged. The retrospective design and the inclusion of only a single ethnic group may limit the generalizability of the results. Future research should incorporate longitudinal data, investigate the influence of ethnic variability on skeletal age prediction, and validate findings using external datasets to improve model robustness and clinical applicability.
CONCLUSION
In our study, the difference in skeletal age estimation was greater than 0.5 years, which does not provide enough information in clinical practice. Relying on VSA alone to determine the skeletal age of individuals within the Turkish population is insufficient for determining the timing of orthopedic treatment or confirming the end of growth.