Original Article

Interobserver and Intraobserver Reliability of Cephalometric Measurements Performed on Smartphone-Based Application and Computer-Based Imaging Software: A Comparative Study


  • Vinay Kumar Chugh
  • Navleen Kaur Bhatia
  • Dipti Shastri
  • Sam Prasanth Shankar
  • Surjit Singh
  • Rinkle Sardana

Received Date: 12.04.2022 Accepted Date: 29.09.2022 Turk J Orthod 2023;36(2):94-100 PMID: 37346006


The aim was to compare the reliability of cephalometric analysis using a smartphone-based application with conventional computer-based imaging software.


Pre-treatment cephalometric radiographs of 50 subjects (26 males, 24 females; mean age, 19.2±4.2 years) were traced using the OneCeph® application and Dolphin imaging software®. Two independent observers identified seventeen landmarks and performed fourteen cephalometric measurements, with repeated measurements made at an interval of at least four weeks. Interobserver and intraobserver reliability were evaluated using the intraclass correlation coefficient. Student’s t-test was used to compare the means of the two measurement methods for observer 1 and observer 2. Additionally, the time taken to complete the cephalometric measurements was compared between the two methods.


Good (ICC 0.75-0.90) to excellent (ICC 0.90-1.00) interobserver and intraobserver reliability was observed for all hard and soft tissue measurements with both methods. No significant differences were found between the two measurement methods for either observer (p>0.05). The OneCeph application took significantly more time to complete the analysis than the Dolphin imaging software (p<0.001).


Cephalometric measurements made with a smartphone-based application showed good to excellent interobserver and intraobserver reliability and were comparable with those of the computer-based software. Therefore, it can be recommended for clinical use. However, the time taken to complete the cephalometric measurements was greater with the smartphone-based application (OneCeph application) than with the computer-based software (Dolphin imaging software).

Keywords: Cephalometrics, Reliability, OneCeph, Dolphin

Main Points

• The study compared interobserver and intraobserver reliability for cephalometric evaluation between smartphone-based applications (OneCeph®) and computer-based software (Dolphin imaging software®).

• Good to excellent reproducibility and repeatability of cephalometric evaluation were seen with OneCeph, which is comparable to Dolphin software.

• OneCeph took double the time compared with Dolphin imaging software.


Evaluating dental malocclusion and underlying skeletal abnormalities requires cephalometric radiography, which is widely used in orthodontics. However, the procedure is time-consuming and prone to various errors. Technical errors, radiographic acquisition, and landmark identification are among the most common sources of inaccuracy in cephalometric radiography. In recent years, many computer-based cephalometric software programs have been introduced into the market. Clinicians and researchers have successfully adopted them, and they have been in use for the past two decades. Numerous studies have tested the consistency and reliability of Dolphin imaging software® (Dolphin Imaging and Management Solutions, Chatsworth, California, USA), which is considered almost a gold standard in the field.1,2,3,4

The advancement of technology has brought about the advent of smartphones and their applications, leading many professionals to spend more time on them. Recently, mobile technology has evolved on par with computers, with applications designed to mimic computer operations. Some smartphone-based applications have been developed for cephalometric analysis, allowing easy access to various analyses at any time. Promising results from recent research on smartphone-based applications have encouraged further investigation into their effectiveness.5,6,7,8,9 Nevertheless, digital or smartphone applications should be designed to reduce the workload of orthodontists.

Livas et al.7 assessed the diagnostic accuracy of two smartphone-based cephalometric analysis apps and found good to excellent reliability compared with the Viewbox software (Viewbox 4, dHAL Software, Kifissia, Greece). Another study compared the reliability of the OneCeph application® (version beta 1.1, NXS, Hyderabad, Telangana, India), a smartphone-based application, with the conventional hand tracing method and concluded that both methods can be used with good reliability.8 Similarly, a study conducted on a smartphone-based app showed that most cephalometric parameters are comparable with the Dolphin Imaging software.9 However, although smartphone-based apps have been used for quite some time, a robust study evaluating interobserver and intraobserver reliability is still lacking. None of the above studies have estimated the efficiency of cephalometric analysis in terms of time taken to complete the analysis.

This study aimed to compare the interobserver and intraobserver reliability of a smartphone-based application (OneCeph application) with the standard computer-based software (Dolphin imaging software). Our secondary objective was to compare the time required to complete the cephalometric analysis between the two methods. The null hypothesis was that there would be no significant difference in the interobserver and intraobserver reliability of the Dolphin imaging software and the OneCeph application.


In this cross-sectional study, pre-treatment lateral head cephalograms were drawn from the archives of the Orthodontics Division, Department of Dentistry, All India Institute of Medical Sciences Jodhpur, between January 2016 and December 2018. The study received approval from the Institutional Ethics Committee, AIIMS Jodhpur (AIIMS/IEC/2018/689), Rajasthan, India.

A total of 50 pre-treatment standardized digital lateral head cephalograms were selected from healthy patients without any history of systemic diseases (26 males and 24 females), with a mean age of 19.2±4.12 years. The cephalograms were obtained using a standardized machine (NewTom GiANO CEFLA-SC, Cella Dental Group, Italy) in a natural head position. Only good-quality cephalograms without any artifacts were included in the study. Cephalograms on which landmarks could not be identified due to motion, resolution, or lack of contrast were excluded. Radiographs that did not show good superimposition of bilateral anatomical structures with respect to the mid-sagittal plane were not included. Additionally, subjects with gross asymmetry and craniofacial deformity were excluded.

Cephalometric Measurements

The lateral head cephalograms were imported into the semi-automated analysis software. For Method 1, Dolphin Imaging software® (Version 11.7, Chatsworth, California, USA) was installed on a Hewlett-Packard laptop (HP EliteBook Folio 9470m) with Windows 7 Professional (Service Pack 1) and an integrated Intel HD Graphics 4000 chip. A 14-inch HD anti-glare SVA LED panel (Hewlett-Packard Company, Core i5, 8 GB RAM, 2 GB graphics) was used as the output. Landmarks were identified manually within the software using a cursor (input). For Method 2, the OneCeph application® (version beta 1.1, NXS, Hyderabad, Telangana, India) was downloaded from the Google Play Store (Google Inc, Mountain View, Calif) on a OnePlus Android smartphone with a 6.41-inch touch-screen (OnePlus 6T, Android 8, 6 GB RAM). Landmark identification was made manually using the index finger on the touch-screen and refined by repositioning within the application. Each cephalogram was calibrated, and 17 digital landmarks were identified (Figure 1). A total of 14 parameters (nine angular and five linear) were chosen for measurement, including the commonly used skeletal, dental, and soft tissue parameters (Table 1). Figure 2 illustrates the linear and angular measurements used in the study.

Additionally, the time taken to complete all the measurements was recorded in minutes using a stopwatch. After the cephalogram was imported into each software, the time was recorded from the start of the analysis to the completion of all the measurements. An assistant not involved in the study operated the stopwatch and was blinded to the measurements being made.

Interobserver Reliability

Two orthodontists (observer 1, SP; observer 2, NKB) with more than three years' experience performed all measurements on fifty lateral cephalograms. To calculate interobserver reproducibility, the first measurements of observer 1 were compared with the first measurements of observer 2.

Before performing the cephalometric measurements, each orthodontist underwent a one-hour training session to become familiar with the use of the software and the method for making cephalometric measurements. Measurement periods for every session were limited to one hour to prevent operator fatigue. The study was initiated only after both observers had demonstrated the ability to perform the cephalometric measurements independently using both software programs. All cephalometric radiographs were assigned a unique number in a list that did not follow any specific sequence. The images were randomized, and their order was blinded. Observer 1 was blinded to the measurements made by observer 2 and vice versa to ensure reproducibility.

Intraobserver Reliability

For the intraobserver reliability calculation, both observers’ measurements were used. Thirty cephalograms were randomly selected and measured by two observers using both methods. An interval of at least four weeks between the repeated measurements (repeatability) was used.

For calculating the time required to perform the cephalometric measurements, the first measurements by observer 1 and observer 2 were timed.

Statistical Analysis

The sample size was calculated using a web-based sample size calculator for reliability studies developed by Arifin.10 The minimum acceptable reliability was set at 0.75 and the expected reliability at 0.90, which was observed for most variables in the study by Livas et al.7 With 90% power and a significance level of 5%, the minimum sample size needed per group was calculated to be 44. Fifty cephalograms per group were included to increase the power of the study.

The Dahlberg11 formula was used to calculate the method error of each method for all cephalometric measurements. The data were analyzed using SPSS for Windows (Version 23.0, Armonk, NY: IBM Corp). Interobserver and intraobserver reliability were assessed using the intraclass correlation coefficient (ICC; two-way mixed-effects model, single measures, absolute agreement) with 95% confidence intervals (CI). ICC values less than 0.5 were considered to indicate poor reliability, values between 0.5 and 0.75 moderate reliability, values between 0.75 and 0.9 good reliability, and values greater than 0.90 excellent reliability.12 Student’s t-test was used to compare the mean differences and the time taken to complete all the measurements between the two methods. A p-value of <0.05 was considered significant.
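The two statistics used above can be sketched in a few lines of Python. This is an illustrative re-implementation, not the SPSS routine used in the study: `dahlberg_error` applies Dahlberg's formula to paired duplicate measurements, and `icc_a1` computes the single-measures, absolute-agreement ICC (ICC(A,1) in McGraw and Wong's notation, corresponding to the two-way model reported here) from the standard ANOVA mean squares. The function names and the toy angle values are hypothetical.

```python
import numpy as np

def dahlberg_error(first, second):
    """Dahlberg's method error: sqrt(sum(d^2) / 2n) over paired duplicates."""
    d = np.asarray(first, dtype=float) - np.asarray(second, dtype=float)
    return float(np.sqrt(np.sum(d ** 2) / (2 * d.size)))

def icc_a1(ratings):
    """Single-measures, absolute-agreement ICC from a subjects x raters table.

    Uses the two-way ANOVA mean squares (McGraw & Wong's ICC(A,1)).
    """
    Y = np.asarray(ratings, dtype=float)
    n, k = Y.shape
    grand = Y.mean()
    ms_rows = k * np.sum((Y.mean(axis=1) - grand) ** 2) / (n - 1)  # subjects
    ms_cols = n * np.sum((Y.mean(axis=0) - grand) ** 2) / (k - 1)  # raters
    ss_err = np.sum((Y - grand) ** 2) \
        - (n - 1) * ms_rows - (k - 1) * ms_cols
    ms_err = ss_err / ((n - 1) * (k - 1))
    return float((ms_rows - ms_err) /
                 (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n))

# Hypothetical angular readings from two observers on three cephalograms:
obs1, obs2 = [2.0, 4.5, 3.0], [2.5, 4.0, 3.5]
print(dahlberg_error(obs1, obs2))            # method error in degrees
print(icc_a1(np.column_stack([obs1, obs2]))) # ≈ 0.88, "good" per the thresholds above
```

Because the absolute-agreement form keeps the rater variance in the denominator, a systematic offset between observers lowers the ICC even when their readings are perfectly correlated, which is why it is a stricter (and more appropriate) reliability index than the Pearson correlation.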


The measurement error calculated with the Dahlberg11 formula for Method 1 ranged from 0.35 to 0.88 degrees for angular measurements and from 0.31 to 0.60 mm for linear measurements. The measurement error for Method 2 ranged from 0.42 to 1.08 degrees for angular and from 0.42 to 0.66 mm for linear measurements.

Table 2 shows the ICC values for interobserver reliability (reproducibility) of the two measurement methods. For method 1, interobserver reliability was classified as “excellent” (ICC value >0.90) for all measurements. For method 2, interobserver reliability was classified as “excellent” (ICC value >0.90) for all measurements except upper and lower lip to E-line, which showed “good” reliability (ICC value 0.75-0.90).

Tables 3 and 4 demonstrate the results of the intraobserver reliability (repeatability) for observers 1 and 2 using both measurement methods. For both observers, repeatability was classified as “excellent” (ICC value >0.90) for all measurements made with methods 1 and 2, except for E-line to the upper lip and mandibular plane to SN, for which observer 1 showed “good” repeatability (ICC value 0.75-0.90).

The mean values of all cephalometric measurements using methods 1 and 2 are shown in Table 5. No significant difference was recorded for either observer 1 or observer 2 in performing the measurements using each method (p>0.05). A significant difference was observed in the time required to complete the cephalometric measurements between methods 1 and 2, with method 1 taking significantly less time (p<0.001) (Table 6).


This study showed excellent repeatability and reproducibility for hard and soft tissue measurements using Dolphin imaging software, which is consistent with the findings of Kasinathan et al.13, who reported higher reliability for hard tissue measurements using the same software. Compared with manual tracings, a high level of agreement (ICC >0.9) for cephalometric measurements has been reported with Dolphin imaging software.14 It is known to have good intra-rater reliability for most cephalometric parameters and good inter-rater reliability for almost all parameters, similar to this study.15 The OneCeph application’s measurements showed good to excellent reproducibility and repeatability for the cephalometric measurements. Previous studies have reported the OneCeph application to be reliable. However, most studies have used either the Pearson correlation coefficient or Student’s t-test to measure reliability, which is an inaccurate method.8,16 Livas et al.7 reported high validity of the OneCeph application compared with computer-based software using the ICC. However, they did not investigate the Dolphin imaging software in their study.

Good reproducibility of the upper and lower lip to E-line measurements was observed with the OneCeph application. Aksakallı et al.5 found significantly lower values for lower lip to E-line with smartphone applications compared with the Dolphin imaging software; however, the OneCeph application was not investigated in their study. Shettigar et al.9 did not find any difference in the measurement of the lower lip to E-line between the two software programs, although they did not report on the interobserver and intraobserver reliability of the OneCeph application and Dolphin imaging software. It should be noted that the OneCeph application works on a smaller smartphone screen, and the absence of a contrast adjustment tool within the application may affect the accurate identification of soft tissue landmarks. While some landmarks can be refined in the application, there are limitations to adjusting the size and color of the landmark guide, which may reduce the precision of soft tissue landmark identification.

Computer-based software allows not only the adjustment of the contrast of the cephalograms, but also provides users with a modifiable point cursor to locate the various landmarks with higher accuracy.

Dolphin imaging software offers various cephalometric analyses, as well as the ability to refine tracings of different structures. In contrast, the OneCeph application lacks advanced features of cephalometric superimposition, surgical treatment planning, and the ability to create STL files, or perform three-dimensional volume rendering. These limitations, along with the software’s inability to conduct multiple analyses simultaneously, are significant drawbacks compared to the Dolphin software.

The overall reliability of the OneCeph application has been evaluated in multiple studies. However, most of them have only compared it with the manual tracing method.7,8,17 This study is probably the first to report the method error, as well as the interobserver and intraobserver reliability of the OneCeph application. Only one study used Dolphin imaging software for comparison with the OneCeph application; however, they did not use a robust statistical method such as the ICC for reliability assessment.9

A significant difference in the time taken to complete the cephalometric measurements was found between the OneCeph application and Dolphin imaging software: the OneCeph application took nearly twice as long to complete the analysis. This may be attributed to the small screen size of the smartphone, which makes landmark identification and marking more time-consuming, as finger touch may not be as accurate as a cursor on a larger screen. Meriç and Naoumova18 compared fully automated, computerized, app-aided, and manual tracing in terms of tracing time and found that the shortest analysis time was obtained with CephX, followed by CephNinja and Dolphin, whereas manual tracing took the longest. Since there is a significant difference in the speed of computers and smartphones, the performance of a computer is not only better but landmark identification is also faster.

Both software programs showed accuracy and reliability, although the Dolphin imaging software was faster. Dolphin imaging software provides more cephalometric evaluation features, which may be added to smartphone-based applications in the future. However, it should be noted that measurements may vary depending on the screen size and specifications of the smartphone used. Nevertheless, smartphone-based applications are cost-effective, efficient, and readily accessible, making them an attractive option. Smartphone-based applications could play a vital role in cephalometric analysis in day-to-day practice, and the findings of the current study indicate that their use may be advocated.

Study Limitations

Smartphones have different patterns of use, viewing positions, and distances from the eye compared with computers.19 A recent study has shown that smartphones can aggravate subjective ocular symptoms and asthenopia and compromise tear film stability.20 However, these aspects were not analyzed in this study. Another limitation is that timing began only after the cephalogram had been imported into the software; the results might have differed if the time taken to import the cephalogram had also been included in the analysis time.


The following conclusions can be drawn from the study:

- Both Dolphin imaging software and OneCeph application displayed good to excellent interobserver and intraobserver reliability for most cephalometric measurements.

- OneCeph application took nearly twice the time to complete the cephalometric measurements compared to Dolphin imaging software.


Ethics Committee Approval: Ethical approval was obtained from the Institutional Ethics Committee, All India Institute of Medical Sciences Jodhpur, (AIIMS/IEC/2018/689) Rajasthan, India.

Informed Consent: Informed consent was obtained from all patients for orthodontic treatment.

Peer-review: Externally peer-reviewed.

Author Contributions

Concept - V.K.C., S.P.S.; Design - V.K.C., N.K.B.; Supervision - V.K.C., D.S.; Materials - S.P.S., R.S.; Data Collection and/or Processing - S.P.S.; Analysis and/or Interpretation - N.K.B., S.S.; Literature Review - N.K.B., D.S.; Writing - N.K.B., S.S.; Critical Review - V.K.C., D.S.

Declaration of Interests: The authors have no conflicts of interest to declare.

Funding: The authors declared that this study has received no financial support.

  1. Power G, Breckon J, Sherriff M, McDonald F. Dolphin Imaging Software: an analysis of the accuracy of cephalometric digitization and orthognathic prediction. Int J Oral Maxillofac Surg. 2005;34(6):619-626.
  2. Nouri M, Hamidiaval S, Akbarzadeh Baghban A, Basafa M, Fahim M. Efficacy of a Newly Designed Cephalometric Analysis Software for McNamara Analysis in Comparison with Dolphin Software. J Dent (Tehran). 2015;12(1):60-69.
  3. Erkan M, Gurel GH, Nur M, Baris D. Reliability of four different computerized cephalometric analysis programs. Eur J Orthod. 2012;34(3):318-321.
  4. Paixao MB, Sobral MC, Vogel CJ, Araujo TM. Comparative study between manual and digital cephalometric tracing using Dolphin Imaging software with lateral radiographs. Dental Press J Orthod. 2010;15(6):123-130.
  5. Aksakallı S, Yılancı H, Görükmez E, Ramoğlu Sİ. Reliability Assessment of Orthodontic Apps for Cephalometrics. Turk J Orthod. 2016;29(4):98-102.
  6. Kumar M, Kumari S, Chandna A, et al. Comparative evaluation of CephNinja for android and NemoCeph for computer for cephalometric analysis: A study to evaluate the diagnostic performance of CephNinja for cephalometric analysis. J Int Soc Prev Community Dent. 2020;10(3):286-291.
  7. Livas C, Delli K, Spijkervet FKL, Vissink A, Dijkstra PU. Concurrent validity and reliability of cephalometric analysis using smartphone apps and computer software. Angle Orthod. 2019;89(6):889-896.
  8. Zamrik OM, Işeri H. The reliability and reproducibility of an android cephalometric smartphone application in comparison with the conventional method. Angle Orthod. 2021;91(2):236-242.
  9. Shettigar P, Shetty S, Naik RD, Basavaraddi SM, Patil AK. A comparative evaluation of reliability of an android-based app and computerized cephalometric tracing program for orthodontic cephalometric analysis. Biomed Pharmacol J. 2019;12:341-346.
  10. Arifin WN. A web-based sample size calculator for reliability studies. Education in Medicine Journal. 2018;10(3):67-76.
  11. Dahlberg G. Statistical Methods for Medical and Biological Students. Br Med J. 1940;14(2):358-359.
  12. Koo TK, Li MY. A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research. J Chiropr Med. 2016;15(2):155-163.
  13. Kasinathan G, Kumar S, Kommi PB, Sankar H, Sabapathy S, Arani N. Reliability in Landmark Plotting between Manual and Computerized Method - A Cephalometric Study. Int J Sci Stud. 2017;4(12):73-78.
  14. Mahto RK, Kharbanda OP, Duggal R, Sardana HK. A comparison of cephalometric measurements obtained from two computerized cephalometric softwares with manual tracings. J Indian Orthod Soc. 2016;50:162-170.
  15. Khosravani S, Esmaeilia S, Mohammadia NM, Eslamianb L, Dalaiec K, Motamedian SR. Inter and Intra-rater Reliability of Lateral Cephalometric Analysis Using 2D Dolphin Imaging Software. Journal Dental School. 2020;38(4):148-152.
  16. Mohan A, Sivakumar A, Nalabothu P. Evaluation of accuracy and reliability of OneCeph digital cephalometric analysis in comparison with manual cephalometric analysis-a cross-sectional study. BDJ Open. 2021;7(1):22.
  17. Khader DA, Peedikayil FC, Chandru TP, Kottayi S, Namboothiri D. Reliability of One Ceph software in cephalometric tracing: A comparative study. SRM J Res Dent Sci. 2020;11(1):35-39.
  18. Meriç P, Naoumova J. Web-based fully automated cephalometric analysis: Comparisons between app-aided, computerized, and manual tracings. Turk J Orthod. 2020;33(3):142-149.
  19. Jaiswal S, Asper L, Long J, Lee A, Harrison K, Golebiowski B. Ocular and visual discomfort associated with smartphones, tablets and computers: what we do and do not know. Clin Exp Optom. 2019;102(5):463-477.
  20. Choi JH, Li Y, Kim SH, et al. The influences of smartphone use on the status of the tear film and ocular surface. PLoS One. 2018;13(10):e0206541.