This action might not be possible to undo. Are you sure you want to continue?
A Systematic Review and Meta-Analysis of the Diagnostic Accuracy of Fine-Needle Aspiration Cytology for Parotid Gland Lesions
Robert L. Schmidt, MD, PhD, MMed, MBA,1 Brian J. Hall, MD,1 Andrew R. Wilson, MStat,2 and Lester J. Layfield, MD1,2
Key Words: Parotid gland; Fine-needle aspiration; Sensitivity and specificity; Meta-analysis
Upon completion of this activity you will be able to: • describe the role of quality assessment in a meta-analysis. • define the following terms: verification bias, review bias, misclassification bias, timing bias, and bias due to handling of indeterminate results. • discuss the current state of knowledge regarding the factors affecting the diagnostic performance of fine-needle aspiration cytology for the assessment of salivary gland lesions. • list the general categories of factors that lead to variation in diagnostic performance.
The ASCP is accredited by the Accreditation Council for Continuing Medical Education to provide continuing medical education for physicians. The ASCP designates this journal-based CME activity for a maximum of 1 AMA PRA Category 1 Credit ™ per article. Physicians should claim only the credit commensurate with the extent of their participation in the activity. This activity qualifies as an American Board of Pathology Maintenance of Certification Part II Self-Assessment Module. The authors of this article and the planning committee members and staff have no relevant financial relationships with commercial interests to disclose. Questions appear on p 152. Exam is located at www.ascp.org/ajcpcme.
The clinical usefulness of fine-needle aspiration cytology (FNAC) for the diagnosis of parotid gland lesions is controversial. Many accuracy studies have been published, but the literature has not been adequately summarized. We identified 64 studies on the diagnosis of malignancy (6,169 cases) and 7 studies on the diagnosis of neoplasia (795 cases).The diagnosis of neoplasia (area under the summary receiver operating characteristic [AUSROC] curve, 0.99; 95% confidence interval [CI], 0.97-1.00) had higher accuracy than the diagnosis of malignancy (AUSROC, 0.96; 95% CI, 0.94-0.97). Several sources of bias were identified that could affect study estimates. Studies on the diagnosis of malignancy showed significant heterogeneity (P < .001). The subgroups of American, French, and Turkish studies showed greater homogeneity, but the accuracy of these subgroups was not significantly different from that of the remaining subgroup. It is not possible to provide a general guideline on the clinical usefulness of FNAC for parotid gland lesions owing to the variability in study results. There is a need to improve the quality of reporting and to improve study designs to remove or assess the impact of bias.
The value of fine-needle aspiration cytology (FNAC) for diagnosis of parotid gland lesions is controversial. FNAC obviates the need for surgery in up to 33% of patients1-4 and can provide useful information for surgical planning5,6; however, the clinical usefulness of FNAC is questioned because of low sensitivity, variation in reported results, and the belief that most parotid masses require surgery. While FNAC is now a commonplace procedure, some authors, such as Batsakis et al,7 have suggested that FNAC is only cost-effective in limited circumstances. FNAC provides information that informs 2 key decisions in patient management. First, FNAC differentiates between neoplastic and nonneoplastic lesions. Neoplastic lesions usually are managed surgically, whereas nonneoplastic lesions are managed conservatively. Second, given a neoplastic lesion, FNAC determines whether the lesion is malignant or benign, which determines the extent of surgery and, in particular, whether the facial nerve can be spared during surgery. The 2 central research questions regarding the clinical usefulness of FNAC for parotid lesions are: (1) Is FNAC sufficiently sensitive to exclude neoplasia and avoid surgery? (2) Is FNAC sufficiently sensitive for malignancy to allow for facial nerve–sparing surgery? The resolution of these questions requires an accurate assessment of the diagnostic performance of FNAC and an understanding of the causes of variation in performance. Systematic reviews are the cornerstone of evidence-based medicine and provide the basis for the development of guidelines for patient management. Numerous studies on the accuracy of FNAC for the diagnosis of parotid tumors have been published; however, this body of literature has never been
Am J Clin Pathol 2011;136:45-59 45
© American Society for Clinical Pathology
or by a translator working in conjunction with 1 author (R. Studies were eligible if they seemed to contain diagnostic accuracy data on FNAC of salivary gland tumors or head and neck tumors.136:45-59 DOI: 10. Language was not restricted.).H. Studies were included if they contained extractable 46 46 Am J Clin Pathol 2011. We also conducted a quality assessment of included articles to explore potential sources of bias and to provide recommendations to improve the reporting of future studies. Finally. we included only the most complete data.R. Full reports were obtained for all eligible articles. In our opinion. recheck the data used to make a judgment.17 To that end. the statistical techniques for meta-analysis of diagnostic studies have been developed relatively recently.L.J. All studies except 1 reported only cases with histologic verification. We excluded studies using needle core biopsy and included only studies in which the needle size was 20 gauge or smaller (0. previous summaries have not used modern meta-analytic methods to obtain estimates of diagnostic performance or they have used older methods that have been shown to have deficiencies. Our objective was to summarize the evidence on the diagnostic accuracy of FNAC for parotid gland tumors using current guidelines for systematic review and meta-analysis of diagnostic studies. we conducted a comprehensive systematic review of the literature and used metaanalytic methods to develop a summary receiver operating characteristic (SROC) curve of diagnostic performance.S.8-12 the literature has not been subject to a comprehensive systematic review. data on histologically verified cases involving parotid tumors and provided data that enabled lesions to be classified into broad categories (malignant vs benign and nonneoplastic vs neoplastic). © American Society for Clinical Pathology Materials and Methods Literature Search MEDLINE. and the bibliographies of retrieved articles were searched for studies evaluating diagnostic accuracy of FNAC for parotid lesions published between January 1. Quality Assessment Quality assessment of articles written in English was conducted using the QUADAS tool. The consensus approach was required for relatively few items because the initial level of agreement was high (see the “Results” section).S.).15. We excluded case reports and studies with fewer than 10 cases. and discrepancies were resolved by consensus. the consensus process worked quite well because it required each evaluator to revisit the criteria. and true-negatives).S.L. as a result.S. and December 31. correspondence with study authors. false-negatives. and no reviews have examined the accuracy of the diagnosis of neoplasia. A pilot scoring form was developing and tested on a subset of 10 studies. Because our study was concerned with broad categories of disease. Eligibility Titles and abstracts were evaluated independently by 2 authors (R. The degree of agreement was assessed by using the Cohen κ. and B. These items were discussed until a consensus was reached. Scopus was used to perform a “forward search” to obtain articles citing the set of retrieved articles. 1985. Discrepancies sometimes arose owing to differences in judgment. We debated whether to use a consensus approach or use a third evaluator as a tie-breaker.Schmidt et al / FNAC for Parotid Gland Lesions adequately reviewed. Our search strategy was broad and included articles on FNAC of head and neck lesions in addition to salivary glands. Data Extraction Data extraction was completed independently by 2 authors (R. FNAC diagnoses of “suspicious for malignancy” or “atypical” were counted as malignant.). The revised form was then used to evaluate the full set of studies. When results of a study were published more than once.L.L. The literature on group decision making provides no clear guidance as to which procedure provides better decisions. false-positives.16 No previous reviews have included a quality assessment of the literature and examined the potential for bias in study estimates.13. While some studies have summarized the results of selected articles. using a sensitive search strategy developed in consultation with an experienced medical reference librarian. We excluded the 4 cases that had only clinical follow-up. In that study. previous reviews have focused on the diagnosis of malignancy. and B.).60 mm inner diameter). and B. and A. 2010. Data from foreign language articles (non-English) were extracted by pathologists with knowledge of the language. The scores were compared and used to clarify definitions and identify deficiencies in the form. Eligible studies were included if accuracy data could be extracted in the form required for analysis (true-positives.J.W. Prospective and retrospective studies were eligible.14 and. Inadequate or indeterminate biopsy results were not counted in the calculation of accuracy.19 Assessment was completed independently by 2 authors (R.L.18 there were 67 cases with histologic verification and 4 benign cases with clinical follow-up.H. The items with discrepant scores were reviewed. EMBASE.15.S.1309/AJCPOIE0CZNAT6SQ . Inclusion Eligible studies were independently evaluated by 2 authors (R.) for eligibility. Discrepancies due to errors and misinterpretations were corrected.H. In addition. we excluded studies that included only data on the diagnosis of a particular disease entity.J. and discrepancies were resolved by consensus or by correspondence with study authors.
Study Characteristics We collected data on the setting (academic vs community). age distribution. characteristics.6% of the total. Suspicious and atypical results were reclassified as malignant and accounted for 4. There were 7 data sets for the assessment of neoplastic vs nonneoplastic lesions ❚Table 2❚ with a total of 795 cases. We hypothesized that results might vary by location and found that the studies conducted in the United States formed a homogeneous subgroup (P < . College Station. and potential sources of bias (blinding.90. The publication rate increased during the period with half of the studies published in the last 8 years ❚Figure 1❚.28). Japan. location (country).18. and 10 studies indicated that specimens were obtained by nonpathologists (clinicians. a tie-breaker would be determined by the third vote with no guarantee of quality (the tie-breaker could assign scores randomly).169 cases. 95% CI.17-0.99.21 (95% CI. however.83) and 0.94. period of the study. 0.80 (95% CI. the data were insufficient (eg. These cases. In addition. inadequacy rates.6 (95% CI. 0. were excluded from the analysis. Only 2 studies45 were blinded. Computations were done using Stata 10 (StataCorp.97-0. Thus. Statistical Analysis SROC curves were developed by using the hierarchical method13. representing 8. No studies were performed in a community setting. The area under the ROC curve (AUROC) was 0. unusual referral patterns and comorbidities). We hypothesized that study results might vary by time and compared studies completed before 2000 with those completed after 2000. population characteristics (age. the positive predictive value was 0.98. The reports of the eligible studies were screened and resulted in 64 studies that met our inclusion criteria. we compared the diagnostic Am J Clin Pathol 2011.848 titles and abstracts to obtain a set of 551 eligible articles. the consensus process provided stringent quality control on discrepant votes. To that end. These 64 studies contained 64 data sets on the diagnostic accuracy of FNAC for the assessment of malignancy vs benignancy ❚Table 1❚5. 20.8). Diagnostic Accuracy Malignant vs Benign Lesions The SROC curve for malignant vs benign lesions is shown in ❚Figure 2❚. © American Society for Clinical Pathology 47 47 .6.1309/AJCPOIE0CZNAT6SQ Results Literature Search We screened 3. We tested whether we might be able to form a larger homogeneous subgroup by combining the results from highincome countries. 0. sex distribution. 0. Finally. Most studies (46 of 64) did not specify who obtained the sample. pre-2000 vs post-2000) showed significant heterogeneity (P < .97 (95% CI. and 1 study specified that samples were taken by both pathologists and nonpathologists.34 No studies reported the experience of the pathologist. percent verification. We found that studies from France and Turkey also formed homogeneous groups. and the negative predictive value was 0.9-12. We also tested whether results varied by inadequacy rates but found no significant correlation. 0. and Italy were heterogeneous. and indeterminate diagnoses). The diagnostic accuracy of the American studies was not significantly different from the remaining group of studies ❚Table 3❚.001). The summary estimates for the sensitivity and specificity were 0. study design (prospective vs retrospective).96-0. study design.25). post-2000 AUROC. there was no significant difference between these 2 groups (pre-2000 AUROC. and all those reporting were similar. 7 studies specified that samples were obtained by a pathologist. method (experience of pathologist and whether samples were immediately assessed by a pathologist). Approximately two thirds of the studies reported summary statistics on age and sex distributions of patients. In general. experience of the pathologist or immediate assessment) or had too little variability (eg.136:45-59 47 DOI: 10. 0. surgeons.97). 0. 0. To that end. we examined whether the heterogeneity could be caused by a small set of outliers. The results for both groups (ie.001). 95% CI.21-77 with a total of 6. and the negative likelihood ratio was 0. we sequentially dropped studies in order of their contribution to the Cochrane Q statistic until we obtained a homogeneous set.95-0.20 there were 37 FNAC results reported as suspicious and 14 results reported as atypical. while groups composed of studies from Australia. The SROC curve for this group is given in ❚Figure 3❚. There was significant heterogeneity among studies (P < . sex.96 (95% confidence interval [CI]. The largest number of studies (9 of 64) were conducted in the United States. We used the data on study characteristics (described in a preceding section) to investigate sources of heterogeneity.97.2% of the malignant lesions.14 to construct the curves.Anatomic Pathology / Review Article and reevaluate the decision in the light of the other evaluator’s comments. A total of 451 inadequate and 79 indeterminate FNAC results were reported (Table 1). TX) and the metandi procedure for SROC curve analysis. and blinding) to allow for statistical tests.98]. All studies were retrospective with the exception of 2 prospective studies. respectively. The positive likelihood ratio was 28.76-0. The locations are summarized in Table 1. It was necessary to drop more than 20 studies before a homogeneous subset could be obtained and no underlying theme for the homogeneous subset could be identified. or radiologists). In contrast.5-39. The prevalence of malignant disease was 25%.22.94-0.98).
200428 10/Berrone et al.98 1. 199776 63/Zbaren et al. UK. US.00 1.94 0.94 1.82 0.00 0.55 0.00 1.98 1. 200947 30/Kamal and Othman.79 0. 200362 47/Que Hee and Perry. 199748 31/Kaur et al. 198965 50/Schelkun and Grundy. 199823 4/Atula et al.86 0.00 0. 199529 11/Brennan et al. 200567 53/Shashinder et al. 82 33 150 204 310 50 43 85 67 352 103 58 72 60 135 139 80 23 78 174 124 164 128 176 87 121 46 81 101 18 24 166 17 81 22 102 41 16 34 37 53 189 60 36 133 118 155 97 46 8 284 208 70 78 24 129 40 29 53 67 47 28 110 223 TP 16 5 61 23 34 7 17 16 11 145 16 0 6 9 60 12 17 2 7 17 37 30 26 13 16 31 7 19 12 8 4 23 2 8 17 11 4 14 4 3 17 21 27 4 25 16 27 31 11 4 27 54 16 9 12 18 3 7 20 8 11 8 50 31 FP 1 0 11 9 0 1 3 0 1 20 5 3 4 0 1 0 3 1 4 3 1 3 3 13 8 0 3 2 3 3 0 8 0 0 0 2 0 0 1 0 3 6 2 0 5 6 0 3 1 0 2 12 2 1 0 0 1 1 2 1 3 1 5 0 FN 1 0 13 21 7 3 1 5 1 7 7 3 2 0 5 9 3 1 0 6 2 4 6 3 2 9 6 3 6 1 0 5 4 2 4 3 5 2 1 1 5 14 6 6 6 13 20 2 2 0 2 9 5 4 2 2 2 5 2 3 3 2 18 14 TN 64 28 65 151 269 39 22 64 54 180 75 52 60 51 69 118 57 19 67 148 84 127 93 147 61 81 30 57 80 6 20 130 11 71 1 86 32 0 28 33 28 148 25 26 97 83 108 61 32 4 253 133 47 64 10 109 34 16 29 55 30 17 37 178 ND 0 0/4 10/4 3 NR NR 5 NR 4 10 17/16 NR NR NR 3 6 0 2 7 18 3 U 11 NR 57 U NR U 9 NR 4 U U 10 5 12 11 4 U U 9 22 37 U 0/15 U 17 NR 18 2 U 12 6 NR 2 29/37 NR 0 10 45 2 0 6 23 Suspicious* Sensitivity Specificity Location 0 0 8 20 0 U U U 0 0 0 5 0 0 0 0 0 0 0 U 2 U 0 0 0 U 0 U 0 0 0 U U 0 0 0 0 0 0 0 0 0 0 U 0 U 0 0 0 U U 0/14 0 0 0 0 0 2 0 0 0 0 0 0 0.00 0.00 0.00 0.00 1. 201030 12/Buhler et al.67 0. 200344 27/Herrera Hernandez et al.88 1.83 0.91 0.44 0. 200271 58/Uğuz et al. inadequate. NR.80 0.00 — 0. 200027 8/Behbehani et al.95 0.92 0.94 1.73 0.00 0.90 0.92 0. true-negative.69 0.Schmidt et al / FNAC for Parotid Gland Lesions ❚Table 1❚ Included Studies for Diagnosis of Malignant vs Benign Lesions Study 1/Akbas et al. 200564 49/Rodriguez et al. 200011 44/Osanai et al. 198550 33/Kondo et al.70 0.85 1.86 0.85 0.86 0.00 0. United States. 199840 22/Friese. 200012 52/Seethala et al.95 0.78 0.80 0.95 0.00 0. 200968 54/Sonmez.75 1. 199166 51/Schroder et al.88 0. 200163 48/Riley et al. FP.70 0. unknown.78 0.58 0. 200658 42/Mohammed et al.81 0. 200752 35/Lin and Bhattacharyya.98 0. 199018 9/Behzatoglu et al.86 0.00 0.89 1. 199770 57/Tsai and Hsu. 20075 36/Longuet et al. 20059 55/Takashima et al.1309/AJCPOIE0CZNAT6SQ © American Society for Clinical Pathology .93 0.98 0.97 0. 200510 23/Gete Garcia et al.57 0. 200153 37/Lurie et al.82 0.98 1.00 0. 200934 16/Contucci et al. inadequate or No.96 0. “suspicious” or No. inadequate/No.00 0. false-positive.97 1. 199377 No.88 1.94 0. 200625 6/Awan and Ahmad. 200773 60/ Van Lierop et al.136:45-59 DOI: 10. 200243 26/He et al.60 0.94 0.94 0.58 0. 48 48 Am J Clin Pathol 2011.81 0.97 0.81 0.98 0.75 0.76 0.67 1. 199349 32/Knudsen et al.94 1.33 0. 200774 61/Weinberger et al. 200846 29/Jafari et al. 200254 38/Malata et al. U. 201042 25/Gooden et al. 199537 19/Deneuve et al.00 0.00 0.96 0. 200561 46/Pons Rocher et al.80 0.00 0.93 1.00 0.91 0. 20086 64/Zurrida et al.94 0. suspicious/No.60 0. 201038 20/Filho et al. ND.95 0.54 0. 200832 14/Califano et al.52 0. 199755 39/Altuna Mariezkurrena et al.40 0. 200772 59/Upton et al.95 0.00 0. 200335 17/Costas et al. 199624 5/Aversa et al. 200522 3/Al-Khafaji et al.73 0. NZ. truepositive.69 0.79 0.93 1. TP.88 0.91 0. 200421 2/Al Salamah et al.00 0.99 0.99 1.00 0. United Kingdom. 200139 21/Filopoulos et al. indeterminate). not reported. 200859 43/Ortega et al.98 0.97 1. 200036 18/Deans et al. 199357 41/Mianroodi et al. * Given as No. 200751 34/Lim et al. 200731 13/Burgess and Serpell.90 0. 199233 15/Carrillo et al.00 Turkey Saudi Arabia US Turkey Italy Pakistan US Kuwait Turkey Italy England Brazil Australia Italy Mexico Italy Spain Northern Ireland France Brazil Greece Germany Spain Croatia Canada China Columbia Japan France Saudi Arabia Singapore Norway Japan Singapore US France Israel England Spain Italy Australia Canada Mexico Japan France Spain Australia NZ US US Germany US Malaysia Turkey Japan Australia Taiwan Turkey US South Africa US Pakistan US Italy FN.92 0.00 0. 200656 40/Marrazzo et al.94 0. 199969 56/Tew et al.89 0. TN.82 0.91 0. New Zealand. 200360 45/Paris et al. 199275 62/Zafar et al.93 0. nondiagnostic (No.94 0.76 0. 200845 28/Inohara et al.99 0.00 0. 200641 24/Gobić et al.67 1.90 0.98 0.81 0. 200426 7/Bartels et al. false-negative.95 0.97 1.74 0.88 1.94 1.
0.86 1. TN. 0. not reported.91 0. Using the criteria of Fleiss.71 1. 95% CI.82 0. which gives a positive predictive value of 1.8 1. 0.4 Study estimate HSROC curve 95% prediction region Summary point 95% confidence region 0.98) with the accuracy of those completed in non–high-income OECD countries (AUROC.81.67-1.00 1.6 Summary point 95% confidence region 6 4 2 0.0 1 – Specificity ❚Figure 1❚ Number of studies by year of publication. 0.4 0.2 0.00 1. NR. The summary estimate for sensitivity was 0.8 0.96 (95% CI.04 (95% CI. 200752 6/Lurie et al. 200428 3/Buhler et al.097.0 10 0.0 0.00 1. 0. The size of the circle is proportional to the weight given to the study in the final accuracy estimate.8 8 Sensitivity No. true-positive.00 0. The summary estimate for specificity was 0. Neoplastic vs Nonneoplastic Lesions The SROC curve for neoplastic vs nonneoplastic lesions is shown in ❚Figure 4❚. 200254 7/Zurrida et al. Quality Assessment The results of the interrater agreement study are given in ❚Table 4❚. The positive likelihood ratio was 58.83-0. FP.2 0.95-0. 1. Am J Clin Pathol 2011. There was no disagreement on © American Society for Clinical Pathology ❚Figure 2❚ Hierarchical summary receiver operating characteristic (HSROC) curve for the diagnosis of malignancy. and the negative likelihood ratio was 0.0 1.2 Study estimate HSROC curve 95% prediction region 0. Each circle represents a study. nondiagnostic/inadequate.98 (95% CI.0-1.99). false-positive. true-negative.1309/AJCPOIE0CZNAT6SQ 49 49 .0 Specificity ❚Figure 3❚ Hierarchical summary receiver operating characteristic (ROC) curve for diagnosis of malignancy for American studies.00) and showed no significant heterogeneity (P < .9). 200731 4/He et al.00 Location Turkey Turkey Brazil China Singapore Israel Italy FN.136:45-59 49 DOI: 10.09).18).97.2 0. 95% CI.0 (95% CI.6 0. 0.651.00).97 0.79 1. of Studies 0.0 0 1985 1990 1995 2000 2005 2010 0.00 0.98). OECD30 vs non-OECD30) showed significant heterogeneity (P < . 200344 5/Lim et al.6 0. accuracy of studies performed in high-income OECD (Organisation for Economic Co-operation and Development) countries78 (AUROC.4 0.0 0.00 Specificity 0.79 the degree of agreement ranged from good to excellent.00 and a negative predictive value of 0.0 0.001). 1.Anatomic Pathology / Review Article ❚Table 2❚ Included Studies for Diagnosis of Neoplastic vs Nonneoplastic Lesions Study 1/Atula et al.4 0. ND. 204 67 58 121 81 41 223 TP 135 63 42 73 72 31 204 FP 8 0 0 0 2 0 0 FN 35 0 9 7 2 5 0 TN 26 4 7 41 5 5 19 ND 10 4 NR NR 10 11 23 Sensitivity 0. The AUROC curve was 0.76 1.8 Sensitivity 0. false-negative. The results for both groups (ie.98-1. 199377 No. 0.99 (95% CI.6 0. 2. The prevalence of neoplastic disease was 85%.95-0. 199624 2/Behzatoglu et al. 0.01-0. TP.
if indicated.4) 0. 18 as unclear.95-0.05.8 1.10 0. This item was scored positive for all studies.77-0. This item was scored unclear for all studies. Item 3: Is the reference standard likely to correctly classify the target condition? The reference standard (H&E histologic findings) was standard across all studies.9. 2. and if a summary of patient demographics was provided showing that the population was broadly similar to the overall population (age and sex).13-0. .2 P (κ > Z) .43 0. The reference standard is imperfect and gives rise to misclassification errors.61 a particular period were included. and 4 as negative.3) 0.12 Z 7.88) 0.0 Qualitative Agreement* Excellent Excellent Good Good * According to the Fleiss criteria. The results of the quality assessment are given in ❚Table 5❚. .93 (0.4 54. the standard practice is to perform surgery within a relatively short period (eg. 9 through 12.96 0.4 5. The size of the circle is proportional to the weight given to the study in the final accuracy estimate.1309/AJCPOIE0CZNAT6SQ © American Society for Clinical Pathology .83) 0.44 0. We searched the literature but were unable to find any studies on interobserver variation or accuracy studies for the diagnosis of salivary gland tumors.93 0.19 (0.7 (23.8 (6.94 0.4-48. In 3 studies. df. and 14 (which did not vary by study). Item 4: Is the time period between the reference standard and the index test short enough to be reasonably sure that the target condition did not change between tests? The period between FNAC and a definitive histologic diagnosis was not specified in any study.83 (0.18-0.6 0.0 . Item 6: Did patients receive the same reference standard regardless of the index test result? By design. I2 (%) Log rank Q.0 0.Schmidt et al / FNAC for Parotid Gland Lesions ❚Table 3❚ Comparison of American Studies With Non-American Studies for Malignant vs Benign Lesions* American Studies (n = 9) Area under the curve Inconsistency.89 Values in parentheses are 95% confidence intervals.79 50 50 Am J Clin Pathol 2011. This item was scored negative for all studies. Item 5: Did the whole sample or a random selection of the sample receive verification using the reference standard of diagnosis? Complete or random verification was not used in any study.3-12.97 (0.90 0.98) 33.26) 0.00 0.96) <1 (not determined) 0.8 Sensitivity 0. 2. and these were not counted in the rater agreement study. We scored 34 of 59 studies as positive. patients receiving magnetic resonance imaging [MRI] and FNA). we included SE (κ) 0.0 0. The QUADAS survey questions and information about our findings are as follows: Item 1: Were the selection criteria clearly described? Most studies (36/56) clearly specified that all cases within ❚Table 4❚ Rater Agreement for QUADAS Scoring19 QUADAS Item 1 2 8 13 Agreement (%) 98. We scored the item as positive if it seemed likely that consecutive patients were included.2 0.22 (0.136:45-59 DOI: 10.12 0.91-0. Item 2: Was the spectrum of patients representative of the patients who will receive the test in practice? We scored this item as negative if the patients were drawn from an unusual referral pattern (eg.0 99.0 κ 0.97 (0. QUADAS items 3 through 7.87-0. Each circle represents a study.2 0.0 0. 1 month) following FNAC.27) 0.79 (0.97-0. however.7 8. and the error rate would likely vary across studies according to the skill of pathologists.49 0.43 71.23 0.0 . We found no difference in the accuracy of studies in which the spectrum of patients was representative compared with those in which the accuracy was unclear or possibly not representative.93) 8.0 .4 0. 1. In some cases (17/56). the selection criteria were not clear and seemed not to include consecutive cases.4 Expected Agreement (%) 73.74-0.6 0.98) 97 (95-99) 65.12 0. it seemed likely that consecutive cases were included. it was not clearly stated. It is unlikely that most tumors would undergo significant change during that period.2 4. however.4 0. no unusual referral pattern was mentioned.94 (0.8 55. P Sensitivity Specificity Positive likelihood ratio Negative likelihood ratio Disease prevalence Positive predictive value Negative predictive value * Non-American Studies (n = 55) 0.89 0.0 74.0 1 – Specificity Study estimate HSROC curve 95% prediction region Summary point 95% confidence region ❚Figure 4❚ Hierarchical summary receiver operating characteristic (ROC) curve for diagnosis of neoplasia.5 82.
This item was scored as negative for all studies.1309/AJCPOIE0CZNAT6SQ 51 51 . so all cases were verified by the same reference standard. This item was scored positive for all studies. although FNAC findings influence the final diagnosis. Other items such as experience level of the pathologist could have a bearing but were not reported. needle size). Item 12: Were the same clinical data available when test results were interpreted as would be available when the test was used in practice? All the included studies were retrospective with cases drawn from standard practice in which clinical data are available. we view the influence to be relatively minor. Thus. We compared the diagnostic accuracy of studies in which indeterminate results were reported with those in which such results were not reported and found no difference (P > . Some studies reported this aspect in detail. Item 10: Were the index results interpreted without knowledge of the results of the reference standard? This was scored positive for all studies. patients who were scheduled but did not undergo surgery were not included. if so.Anatomic Pathology / Review Article ❚Table 5❚ Summary of QUADAS Quality Survey19 Results* QUADAS Item Description 1 2 3 4 5 6 7 8 9 10 11 12 13 14 * Yes 35 34 0 56 0 56 56 12 56 56 0 56 30 0 Unclear No 17 18 56 0 0 0 0 18 0 0 0 0 15 0 3 4 0 0 56 0 0 26 0 0 56 0 11 56 Were the selection criteria clearly described? Was the spectrum of patients representative of the patients who will receive the test in practice? Is the reference standard likely to correctly classify the target condition? Is the time period between the reference standard and the index test short enough to be reasonably sure that the target condition did not change between tests? Did the whole sample or a random selection of the sample receive verification using the reference standard of diagnosis? Did patients receive the same reference standard regardless of the index test result? Was the interpretation of the reference standard independent of the index test (ie. This item was scored negative for all studies. only histologically verified cases. the preparation of histologic slides is standard. and there is little reason to believe there is significant variation across studies. and. in such cases. the index test did not form part of the reference standard)? The reference standard is independent of the index test. Thus. in all studies. This result is Am J Clin Pathol 2011. Item 11: Were the reference standard results interpreted without knowledge of the results of the index test? It is standard © American Society for Clinical Pathology practice for pathologists to be aware of FNAC results when evaluating histologic slides. This item was scored positive for all studies. Item 7: Was the interpretation of the reference standard independent of the index test (ie. it was impossible to determine whether indeterminate/uninterpretable results were found and. Item 13: Were uninterpretable or intermediate results reported? We found the reporting of uninterpretable results to be quite variable.05). Item 14: Were withdrawals from the study explained? All of the included studies were retrospective with patients selected from surgery rosters. Discussion Our results show that FNAC for parotid gland lesions has high specificity. the index test did not form part of the reference standard)? Was the execution of the index test described in sufficient detail to permit replication of the test? Was the execution of the reference standard described in sufficient detail to permit replication? Were the index results interpreted without knowledge of the results of the reference standard? Were the reference standard results interpreted without knowledge of the results of the index test? Were the same clinical data available when test results were interpreted as would be available when the test was used in practice? Were uninterpretable or intermediate results reported? Were withdrawals from the study explained? Quality assessment was completed on 56 English-language articles. however. Item 9: Was the execution of the reference standard described in sufficient detail to permit replication? No studies described the reference standard in detail. different proportions of cases were verified depending on the result of the index test (see item 5). whereas other studies made no mention of such results. We scored this item as positive for all studies. although the sensitivity is good.136:45-59 51 DOI: 10. and. how they were handled. and relatively few studies indicated whether a pathologist was available to assess adequacy. the technique shows greater specificity than sensitivity. This item was scored positive for all studies. Item 8: Was the execution of the index test described in sufficient detail to permit replication of the test? We scored this as positive if the size of needle and the number of passes were described and if it was indicated whether a pathologist or cytopathologist was immediately available to assess the adequacy of the specimen. In practice. however. Many studies omitted even the most basic description of the procedure (eg. histologic findings are weighted more heavily than FNAC.
as a consequence. Referral patterns are a common source of population differences because the spectrum of disease is dependent on the referral pattern. differences in population and methods). the knowledge gained from the study of heterogeneity can be one of the most important benefits of a meta-analysis.8.Schmidt et al / FNAC for Parotid Gland Lesions consistent with the findings of other reviews. In general. whether the accuracy statistics represent the average performance of a group of pathologists or 1 pathologist). clearly state whether consecutive cases were selected within a specified period. Given the wide variation in results. only a small percentage was estimated to be due to this factor. the non-American group of studies on malignant lesions showed significant heterogeneity. It is important to understand the factors that led to performance variability in order to increase consistency and improve overall performance. Given a set of criteria.80. it is difficult to develop general guidelines for the use of FNAC in the non-American group because the predictive value of FNAC will vary by setting. For example. The American group of studies was homogeneous. however. We found that the accuracy of diagnosis for neoplasia was significantly higher than the accuracy of diagnosis for malignancy. Stata 10) to estimate the degree of heterogeneity due to threshold effects. it has been shown that having a pathologist on site for immediate evaluation of specimen adequacy improves the diagnostic yield84-86. Our results (ie. diagnostic performance could differ between studies owing to differences in the referral pattern and the resulting differences in the spectrum of disease. it was not possible to complete many of the analyses we had planned. This is an important finding because the decision to recommend surgery generally depends on a diagnosis of neoplasia. the spectrum of parotid disease in patients referred for MRI and FNAC might differ from the spectrum of disease in patients who received FNAC only because the subpopulation referred for MRI may involve more complex cases.06). It is important that researchers report the referral pattern and the selection criteria. We attempted to collect data on many study-wide factors. Threshold Differences Study differences can result from differences in the criteria used to assign cases to categories (benign vs malignant). In general. in the context of diagnostic testing.18) than specificity (SD = 0. random variation. In contrast. The knowledge gained from the study of heterogeneity in the non-American subgroup of studies can be used to improve performance in both subgroups.87 It may be possible to reduce performance variation by more uniform application of diagnostic criteria. Differences in accuracy reflect differences in the ability to correctly identify and interpret features. and provide a statistical summary of the population (age and sex distribution). Thus. we found that such data were often poorly reported. Such differences are due to threshold effects and should be distinguished from differences in accuracy. To that end. whether the sample was guided by imaging and the type of imaging. Population differences present another possible factor that could lead to real differences in diagnostic performance. Threshold effects reflect a tradeoff between sensitivity and specificity with constant overall accuracy. understanding causes of real performance variation requires one to study correlations between study-wide factors (PICO) and performance. whereas differences in threshold (constant accuracy) result in variation along the SROC curve. number of passes. few studies in our survey documented whether a pathologist was present. and the staining technique used. 2 pathologists may detect the same features in a case. index test. number of samples obtained. and. and bias related to study design. and the experience level of the pathologist could contribute to variability of results but are often not reported. the number of pathologists involved in the study (ie. For example. Differences in any of these factors can lead to differences in diagnostic performance. We used a statistical procedure (midas. we investigated a range of study-wide factors (described in the “Results” section). Differences in FNAC methods are another factor that could lead to differences in study results. as shown in ❚Figure 5❚. Threshold effects have been shown to account for a significant fraction of interrater disagreement. © American Society for Clinical Pathology . however. Study Heterogeneity Sources of Differences Between Studies There are a large number of study factors that can give rise to differences in diagnostic performance. Similarly. whether a pathologist was available for immediate evaluation of specimen adequacy. We suggest that researchers report the following: needle size. Variation in diagnostic performance can be due to 4 sources82: real differences in test conditions (eg. however. and outcome measure. and it may be possible to use the summary statistics developed herein to develop guidelines for the American setting. one may use more stringent criteria than another in assigning malignancy. threshold effects. significant heterogeneity) suggest that the variability cannot be explained by random variation. stands for population. This result was consistent for the American and non-American subgroups of studies. comparator.136:45-59 DOI: 10. differences in accuracy result in variation in the direction perpendicular to the summary average of the SROC curve. In addition.83 These factors are encapsulated in the acronym PICO that.81 We also found that studies showed much more variability in sensitivity (SD = 0. other factors 52 52 Am J Clin Pathol 2011. Indeed. however. the summary statistics should be interpreted with caution owing to the heterogeneity.1309/AJCPOIE0CZNAT6SQ such as needle size.
false-negatives.8 0.0 1. is the relative proportion of malignant to benign cases that receive histologic verification. Thus.2 0.6 0.71 and 1. In addition.Anatomic Pathology / Review Article Sources of Bias Bias is a second source of potential variation that could have contributed to the heterogeneity in studies. the subset of nonneoplastic cases that proceeds to surgery is unlikely to be representative of the nonneoplastic cases.8 0.3 0. however. when diagnosing malignancy.1 Accuracy Threshold 0. β = the sampling fraction of negative cases.2 0. Almost all cases with a finding of neoplasia are verified by histologic examination. Variation along the SROC curve represents studies with equal accuracy with differing thresholds for malignancy (ie. and false-positives.4 0. Sn' = 0. true-negatives.0 0. © American Society for Clinical Pathology 53 Am J Clin Pathol 2011.0 0. r r = 0.136:45-59 DOI: 10. α = the sampling fraction of positive cases. Both of these factors are likely to bias the sensitivity and specificity in the diagnosis of neoplasia. to be quite high. and TP'. there is more potential for verification bias to affect the diagnosis of neoplasia more than the diagnosis of malignancy because the degree of differential verification is higher in the diagnosis of neoplasia compared with the diagnosis of malignancy. we would expect the relative sampling fraction. FN'.—Because of their retrospective design.4 0.7 0.6 0.1 0. r. Sampling fraction.0 0. more positive samples than negative samples receive histologic verification). r. The sampling fraction.6 Actual Sensitivity 0. For example.9 0.4 0. TN'. whereas only a small subset of the nonneoplastic cases receives histologic verification. and Sp = the actual sensitivity and specificity.98 in equations 3 and 4. our quality survey highlighted several potential sources of bias in this collection of studies.6 0.5 0. respectively. and Sp' = 0. the apparent sensitivity would be biased upward and the apparent specificity would be biased downward. a tradeoff between sensitivity and specificity).0 0.2 0.7 0. say on the order of 5 to 10 (ie. almost all cases proceed to surgery and receive histologic verification.8 1.3 0. Then: ❚Equation 1❚ Sn' = TP' αSn rSn = = TP' + FN' αSn + β(1 – Sn) 1 – (1 – r)Sn ❚Equation 2❚ TN' βSp Sp = = TN' + FP' βSp + α(1 – Sp) r + (1 – r)Sp From which we obtain: Sp' = ❚Equation 3❚ Sn = ❚Equation 4❚ Sn' r + (1 – r)Sn' rSp' 1 – (1 – r)Sp' Sp = Relations 1 and 2. Indeed.5 0.4 0. we obtain estimates of 0. r = α/β = the relative sampling fraction of positive to negative cases. all studies have this source of bias.2 0. verification bias is less of an issue for the diagnosis of malignancy than for the diagnosis of neoplasia. using values of r = 10.1 r=1 r=5 r = 10 ❚Figure 6❚ The effect of verification bias on the observed vs actual sensitivity for diagnosis of malignancy. The effect of verification bias can be estimated as follows: Let Sn. Sn’ and Sp’ = the apparent sensitivity and specificity.8 0. Variation perpendicular to the SROC curve represents differences in accuracy (differences in the ability to detect or correctly interpret case features). and FP' = the observed true-positives.0 1 – Specificity ❚Figure 5❚ Comparison of accuracy vs threshold effects on the summary receiver operating characteristic (SROC) curve.0 0.0 0. Under these conditions.3 0. For the diagnosis of neoplasia.00 for the 1. showing the effect of verification bias on sensitivity and specificity are presented in ❚Figure 6❚ and ❚Figure 7❚.9 0.96. Verification Bias.7 Sensitivity Observed Sensitivity 0. In contrast.1 0.1309/AJCPOIE0CZNAT6SQ 53 53 .9 1.5 0.
While it is not practical to follow up all nonneoplastic cases with histologic studies.90 Specifically.2 0. We varied the sampling fraction. Thus.0 Diagnosis of Malignancy (r = 1. is the relative proportion of neoplastic to nonneoplastic or malignant to benign fineneedle aspiration cytology diagnoses that receive histologic verification. it is important to know how many patients undergo evaluation of a suspected salivary gland lesion. if done at all. In general.00 1.7 0.71 1.97 Estimated Actual 0.1309/AJCPOIE0CZNAT6SQ © American Society for Clinical Pathology .98 Calculations based on equations 3 and 4 (see the text). such follow-up was rarely done and.84 Specificity 0.5 0.5 2 4 5 10 * Sampling fraction.89.79. which shows that our estimate for the sensitivity of diagnosis of neoplasia may be biased upward owing to verification bias but that bias is unlikely to be a major factor in our other estimates.6 Actual Specificity 0.4 0.1 r=1 r=5 r = 10 Nonneoplastic vs Neoplastic Sensitivity 0. we obtain estimates of 0. We expect the sampling fraction of malignant lesions to be similar to that of nonmalignant lesions because most neoplasms are managed with surgery. The likelihood ratios and predictive values are also affected by verification bias because they are functions of sensitivity and specificity.—Only 2 of the studies in this collection were blinded. r.1 0.96 0.00 1.00 1. The sampling fraction. from a clinical perspective. Sp' = 0.93 0. The sampling fraction.95 0. is the relative proportion of benign to malignant or nonneoplastic to fine-needle aspiration cytology diagnoses that receive histologic verification. we believe that verification bias is more of an issue for the diagnosis of neoplasia than for malignancy.9 0.2 and using equations 3 and 4. patients can be followed up with an alternative method of verification such as clinical follow-up.4 0.99 — 1. The sampling fraction.96 for diagnosis of malignant lesions and assuming the relative sampling fraction of malignant to nonmalignant lesions is r = 1.00 0. r.83 0. In our survey.00 ❚Figure 7❚ The effect of verification bias on the observed vs actual specificity for diagnosis of neoplasia. The results ❚Table 7❚ show that the summary estimate of sensitivity is most likely to be affected by verification bias.96 0. ❚Table 7❚ The Effect of Verification Bias on the Summary Receiver Operating Characteristic Curve Estimates Neoplastic vs Malignant (American Studies) Sampling fraction.00 1.0 0.79 0.0 0.0 0.72 0.76 0. r r = 0. the follow-up of negative cases is poorly documented.29 Specificity 0.2) Apparent 0.8 Observed Specificity ❚Table 6❚ The Estimated Effect of Verification Bias in Summary Estimates of Sensitivity and Specificity* Diagnosis of Neoplasia (r = 10) Apparent Sensitivity Specificity * 0.97 for the actual sensitivity and specificity.6 0. is the relative proportion of neoplastic to nonneoplastic fine-needle aspiration cytology diagnoses that receive histologic verification. We conducted a simulation to determine the effect of verification bias on our summary estimates obtained from our SROC curve.97 0. and at each level of r recalculated the entries to Tables 1 and 2 that would have been obtained without verification bias. how many underwent surgery. r.89 0.00 Sensitivity 0.136:45-59 DOI: 10. r* 1 1. r. For example. respectively. At each level of r.96 0.90 0.2 0.3 0.45 — 0. the results of FNAC were known when 1.96 Estimated Actual 0. how many underwent FNAC and. It is worth noting that sensitivity and specificity are not the only summary statistics that are affected by verification bias.8 1.94 0. we reran our analysis and obtained the summary estimates for the 2 homogeneous groups: malignant vs benign (American group) and neoplastic vs nonneoplastic. 54 54 Am J Clin Pathol 2011.Schmidt et al / FNAC for Parotid Gland Lesions actual sensitivity and specificity.98 1. studies need to document the flow of patients through the system as recommended by the STARD initiative guidelines. As discussed previously. is the key performance measure.76 and 0. The study by Rapkiewicz et al88 is an exception and provides a good example of complete documentation of negative cases. These data are required to provide the predictive value of FNAC that. Resolution of verification bias would require studies with follow-up of patients who receive a diagnosis of a nonneoplastic lesion by FNAC.64 0. These results are summarized in ❚Table 6❚. respectively. Review Bias. using our results of Sn' = 0.
It would be helpful to obtain an estimate of the misclassification rate for salivary gland lesions through future studies. In the case of neoplasia. fine-needle aspiration cytology.—We found the reporting of data on inadequate and indeterminate results was often unclear.1 0.3 0. In the case of malignancy. for that reason. and 4.0 0. fine-needle aspiration cytology.9 0. the proportion of true-positive cases is high.227 true-positives. The misclassification rate refers to the error rate of the “gold standard. The potential impact of nondifferential misclassification can be seen by investigating the effect of the misclassification rate on the summary sensitivity and specificity using the totals obtained from our survey.454 true-negatives for the diagnosis of malignancy ❚Table 8❚. we do not believe that review bias is likely to have a large impact. We believe that misclassification errors are most likely nondifferential. We are uncertain as to how much this factor contributed to the variability in study results. in turn.0 0. This source of variation could be eliminated by more standardized reporting. Several articles did not mention such results.4 0.” histologic examination. If such a bias exists. For example. ❚Table 9❚ Summary Totals From Included Studies for Diagnosis of Neoplasia Histologic Diagnosis FNAC Diagnosis Neoplastic Nonneoplastic Neoplastic 620 58 Nonneoplastic 10 107 FNAC. however.0 Misclassification Rate ❚Figure 8❚ The effect of misclassification errors on observed sensitivity and specificity for diagnosis of malignancy.7 Sensitivity 0. Few data are available on the levels of interrater agreement in the diagnosis of salivary gland tumors. 311 false-positives. and. so the level of error and the types of error (differential vs nondifferential misclassification) are unknown. it would tend to increase the correlation between FNAC and histologic findings and inflate the estimates of sensitivity and specificity.2 0.8 1.Anatomic Pathology / Review Article the reference test was conducted and the knowledge of the FNAC results could influence the interpretation of histologic slides. the impact is different for malignancy than for neoplasia.4 0.227 311 Benign 177 4.8 0. future studies should use blinding to remove this source of bias. Am J Clin Pathol 2011.5 0. and misclassification causes a downward bias on sensitivity because the net effect of misclassification is to move cases from the true-negative category to the false-negative category. our survey found a total of 1. and it was unclear whether there were no results of this type. and misclassification causes true-positives to be misclassified as false-positives that.6 0. and our calculations show that relatively small misclassification rates (say 1%) can cause significant bias. Bias Due to Handling of Indeterminate and Inadequate Results. 1. cause a downward bias in specificity. Similar totals are shown for diagnosis of neoplasia in ❚Table 9❚. The ❚Table 8❚ Summary Totals From Included Studies for Diagnosis of Malignancy Histologic Diagnosis FNAC Diagnosis Malignant Benign Malignant 1.454 FNAC.2 0. Because histologic findings are weighted more heavily than FNAC findings. 177 false-negatives. Misclassification Bias. whether they were excluded. or whether © American Society for Clinical Pathology they were incorporated in some other way. Obviously. the way these result categories are handled can have a large influence on diagnostic performance. Renshaw et al87 found a misclassification rate of approximately 3% in a range of surgical specimens. The effect of nondifferential classification on the sensitivity and specificity is shown in ❚Figure 8❚ and ❚Figure 9❚.1309/AJCPOIE0CZNAT6SQ 55 55 . the proportion of true-negative cases is high. We recommend that results be reported in a standardized format as shown in ❚Table 10❚.—The accuracy of the reference standard (definitive histologic diagnosis) is another potential source of bias. The impact of misclassification depends on the distribution of cases in the 2 × 2 table.136:45-59 55 DOI: 10.6 0.0 Specificity Accuracy Estimate: Sensitivity and Specificity 0. Nondifferential misclassification occurs when the error rate of the reference test (definite histologic diagnosis) is independent of the result of the index test.
4 0. however. Thus. the usefulness of FNAC can be evaluated only on a case-by-case basis depending on local diagnostic performance.0 specificity for the diagnosis of neoplasia. Misclassification bias probably leads to an underestimate of the sensitivity in the diagnosis of malignancy and an underestimate of the ❚Table 10❚ Suggested Format for Reporting of Accuracy Results Histologic Diagnosis FNAC Diagnosis Positive Negative Indeterminate Inadequate FNAC.8 0. average and SD) on the time between FNAC and surgery. At present. Centers need to develop systems to assess local diagnostic performance for the diagnosis of neoplasia and malignancy.1 0. Thus. as a result. The summary estimate for the sensitivity of the diagnosis of neoplasia is probably inflated owing to verification bias.—Timing bias occurs when there is significant disease progression during the time between performance of the index test and a reference test. it would be helpful if researchers reported summary statistics (eg.1309/AJCPOIE0CZNAT6SQ . Review bias probably inflates the estimates of sensitivity and specificity.” histologic examination. This is the most important decision because it determines whether a patient goes to surgery.9 0.3 0. whereas a false-negative for neoplasia will usually be detected following surgery by definitive histologic findings. there is room for further research including the following: • Improved estimates of the diagnostic accuracy of the diagnosis of neoplasia. which determines the extent of surgery and.136:45-59 DOI: 10. fine-needle aspiration cytology. We found that studies grouped according to country often formed homogeneous groups with respect to diagnostic performance. given a neoplastic lesion. Timing Bias. Neoplastic lesions generally are managed by surgery. It seems that studies have been driven by the convenience of sampling (surgery lists) and.4 0.7 Specificity 0.2 0.5 0. the FNAC differentiates between neoplastic and nonneoplastic lesions. results for diagnosis of neoplasia and malignancy should be presented in 2 separate tables. The impact of verification bias needs to be better understood before the estimates for neoplasia can be considered reliable. but this effect is probably small. have focused on the diagnosis of malignancy rather than the diagnosis of neoplasia.6 0. the diagnosis of neoplasia could be regarded as the most critical decision.8 1.2 0.6 0. Second. Obtaining such data will require studies with better follow-up of patients who do not undergo surgery. the statistically significant difference seen between the accuracy of the diagnosis of neoplasia and malignancy may be due to bias rather than a real difference in performance. Clinical Implications FNAC provides information that informs 2 key decisions in patient management. whereas nonneoplastic lesions are managed conservatively. A false-negative diagnosis for neoplasia can result in a delay of treatment and progression of disease. 0. it would not be possible to provide general guidelines based on average performance. © American Society for Clinical Pathology Positive Negative Indeterminate 56 56 Am J Clin Pathol 2011. FNAC determines whether the lesion is malignant or benign. the performance variability is too great to provide general guidelines regarding the usefulness of FNAC.Schmidt et al / FNAC for Parotid Gland Lesions 1. Given the variability. While we do not believe this is likely to be a significant source of bias. The misclassification rate refers to the error rate of the “gold standard.0 Misclassification Rate ❚Figure 9❚ The effect of misclassification errors on observed sensitivity and specificity for diagnosis of neoplasia. the data on the diagnosis of neoplasia are sparse. First. Thus. whether the facial nerve can be spared during surgery. Research Needs In our opinion.0 0. Overall. in particular.0 Sensitivity Accuracy Estimate: Sensitivity and Specificity 0. it may be possible to develop guidelines that would apply to countries or regions.
With these improvements. Conclusions The specificity of FNAC is quite high for the diagnosis of neoplasia (0.gov. Fine needle aspiration biopsy cytology of major salivary glands. Sneige N. 15 N Medical Dr East. Chie Minoda. Salt Lake City. University of Utah School of Medicine. 4. et al. Second. Acknowledgments: We thank the following people for assistance with translation: Evrim Ergodan. Nettle WJ. in turn. Egger M. istanbulsaglik. Larissa Furtado. factors that improve performance need to be incorporated into practice to reduce variability that. Biostatistics. Young JE. From the 1Department of Pathology.pdf.7:267-272. Thus. Given the wide variability in accuracy. PhD. Based on our survey.tr/w/tez/pdf/patoloji/dr_birol_sonmez. 2007. Head Neck. it is not possible to give a general guideline regarding the clinical usefulness of FNAC. Address reprint requests to Dr Schmidt: Dept of Pathology. the clinical usefulness of FNAC can be assessed only by evaluating its incremental impact on the final diagnosis. Zbaren P. Universität zu Würzburg. so the diagnostic process often involves other tests such as a clinical examination and imaging studies. MD. Yefim Lavrentyev. 2007. MD. This will require 2 things. Lin AC.29:503-512. Ortega PG. 13. 5. There is a need for more complete reporting and improved study designs to remove sources of bias. We also thank Mary Youngkin for assistance with the literature search. MD. 11. in the face of such variability. Laryngoscope. Batsakis JG. UT 84112. 2. The sensitivity is lower and more variable for the diagnosis of neoplasia (0. MD. however. Glasgow BJ. PhD.Anatomic Pathology / Review Article • Improved study designs and patient tracking systems to eliminate verification bias • Studies to understand the impact of misclassification bias • Studies to further understanding of the contribution of FNAC to the diagnostic process. Reha Toydemir. 2008.59:47-51. Value of fine needle puncture cytology in neoplasms of the parotid gland [in German]. 9.98) and malignancy (0. each practice location must monitor its diagnostic performance to assess the usefulness of FNAC at each individual practice location.96) and for malignancy (0. Thus. having a cytopathologist available to assess slide adequacy at sample collection) would reduce variance and improve overall performance.139:811-815. • Studies to understand and eliminate the causes of performance variability. A unification of models for meta-analysis of diagnostic accuracy studies [published correction appears in Biostatistics. HNO. Sianos J. makes it possible to identify more subtle causes of performance variation. el-Naggar AK. MD.101:245-249. Sonmez B. Fine-needle aspiration biopsy of salivary glands. 1992. Guelat D.46:81-84. et al. Otolaryngol Head Neck Surg.79). Schroder U. Diagnosis of salivary gland tumors by fine-needle aspiration cytology: a review of clinical utility and pitfalls. Frable WJ. Fine-needle aspiration cytology in a regional head and neck cancer center: comparison with a systematic review and meta-analysis. 2010. Parotid tumors: fineneedle aspiration and/or frozen section.30:1246-1252. however. Carolin Teman.90 would do much to rectify this situation. Sotelo-Regil RH. and Holly Zhou. 12. Parotis Bezi Kitlelerinde Ince Igne Aspirasyon Sitolojisi Ve Postoperatif Histopatolojik Degerlendirmenin Karsilastirilmasi Sağlık Bakanlığı. Deeks JJ. provide general guidelines regarding the clinical usefulness of FNAC for diagnosis of parotid gland lesions. there are other sites where this is not the case. 7. The utility of fine needle aspiration in parotid malignancy. Accessed October 15. Ann Otol Rhinol Laryngol. Karen Moser. 2008.48:421-429. there is a need for improved reporting of study factors that can be used to identify the factors that cause performance variability. University of Utah. Harbord RM. 2005. et al. for example. Friese A. Diagn Cytopathol. et al. 10.96). Thus. the guidelines of the STARD initiative89. Acta Cytol. FNAC is not done in isolation. Eckel HE. http://www. et al. Frable MA. Perez VM. a positive diagnosis is quite reliable. Shahab R.136:45-59 57 DOI: 10.9:779]. 2008. Bhattacharyya N. Laura Parnas. Qizilbash AH. Fine-needle aspiration of salivary glands: its utility and tissue effects. Ergebnisse von Feinnadelpunktionszytologie.1309/AJCPOIE0CZNAT6SQ 57 57 . Loosli H.101(2 pt 1):185-188. adoption of the best practices (eg. future studies may provide useful data that can be used to understand causes of performance variation and to © American Society for Clinical Pathology Am J Clin Pathol 2011. Rasche V. More complete reporting using. 3. Biopsia por aspiracion can aguja delgada en el diagnostico de lesiones en region parotidea. 1985. Anya Matynia. Benton JI. 2000. and 2ARUP Laboratories. Otolaryngol Head Neck Surg. Aust N Z J Surg. 1991. The overall accuracy of diagnosis of neoplasia is greater than the accuracy for diagnosis of malignancy. 6. 8. 1989. Salt Lake City. some of the difference may be due to verification bias. It is disappointing that such a large collection of cases does not contribute more to our understanding of performance variation.136:793-798. klinischer Beurteilung und endgültigem histopathologischem Befund im Vergleich bei Parotistumoren. The usefulness of FNAC will vary by location depending on the accuracy obtained at each site. 1991. 2005. 2000.8:239-251. Orell SR. First. References 1. Layfield LJ. there are practice locations where FNAC is sufficiently accurate that a negative result can be used to avoid surgery or to allow for nerve-sparing surgery. MD. Also. Tandon S. Rev Inst Nac Cancerol. Fine needle aspiration in the diagnosis of salivary gland lesions.
et al. 1997.267:601-605. J Otolaryngol.44:515-519. Othman EO. 30. Fine needle aspiration of 123 head and neck masses: an initial experience. 2009. 42. 38. Outcome of surgery for parotid tumours: 5-year experience of a general surgical unit in a teaching hospital. Zupi A. 1995. http://srdta. Acta Otolaryngol. Bollito E. Giardino C. Filho V.Schmidt et al / FNAC for Parotid Gland Lesions 14. Laippala P.0. 15. Preoperative cytology in the management of parotid neoplasms. Acta Otorhinolaryngol Ital. 32.22:781-786.1309/AJCPOIE0CZNAT6SQ © American Society for Clinical Pathology . Harbord RM. 25. An audit of surgery of the parotid gland. Depondt J. Eur J Surg Oncol. Kulak Burun Bogaz Ihtis Derg. Coll Antropol.22:303-306. He Y. Zhang ZY. 49. 35. Buhler RB. 45. Head Neck.38:539-542.23:314-318. Systematic reviews of diagnostic test accuracy. Behzatoglu K. Cubetta M. Leeflang MMG. 44. 1992. Ann R Coll Surg Engl. 2003.20:354-359. Atula T. doi:10. 2002. 40. 2010. 2003. Correlation between fine needle aspiration biopsy and histologic findings in parotid masses: personal experience. 50. Serpell JW. et al. Al Salamah SM.100:133-138.31:351-354. Lim-Tan SK.3:25. 27. Pedisić D. Nestok BR. 16. et al. et al. Bartels S. Bekafigo IS. 23. Watanabe K. 26. Royer B. Cancer. Katz RL. Otolaryngol Head Neck Surg.136:45-59 DOI: 10. et al. Burgess AN. de Carlucci D. 2009.34:345-348. Rev Col Bras Cir. Sergi B.1186/1471-2288-3-25. Filopoulos E. Ahmad Z. Davies B. Evaluation of fine needle aspiration cytology in the diagnosis of cancer of the parotid gland [in Spanish]. 18. Poller D. et al. Mattiola LR. ANZ J Surg.11:294-299. 41. et al. 2006. 19. 46.75:948-952. Value of the cytological diagnosis in the treatment of parotid tumors. 2000. Aversa S. Garcia Alvarez G. Whiting P.61:1095-1103. 1995. Fine needle aspirative biopsy in parotid gland diseases. et al. 2004. et al. Stata J. 24. Parotid gland tumours in 255 consecutive patients: Mount Sinai Hospital’s quality assurance review. Brennan PA. Corina L.147:2824-2827. 51. et al. 17. Gobić MB. Flores L. Al-Shahawi M. Accuracy in the diagnosis of parotid tumours. 2000. Amasio ME. 2006. 2010. Demireller A. Hacker D. Fine-needle aspiration biopsy in the diagnosis of parotid gland lesions: evaluation of 438 biopsies.149:889-897. 29. Int Arch Otorhinolaryngol. Shanghai Kou Qiang Yi Xue. Briggs K. 34. DiTomasso J. et al. Fine needle aspiration puncture in parotid gland lesions. Parotid tumours: correlation between fine needle aspiration biopsy and histological findings [in Spanish].84:153-159. J Craniomaxillofac Surg. D. Acta Otorrinolaringol Esp. Gete Garcia MP. Gooden E. Br J Oral Maxillofac Surg. Behbehani A. Pinheiro JLG. Almodovar Alvarez C. Stat Med. Ann Acad Med Singapore. Deneuve S. 2004.20:2865-2884. Eur Arch Otorhinolaryngol. 2009. et al. Martin-Granizo R. Diagn Cytopathol. Gatsonis C. et al.128:1152-1158.13:15-18. Diagnostic value of fine needle aspiration cytology in parotid tumors. Gatsonis CA. Bahadir B.0. Diagnostic accuracy of fine needle aspiration biopsy in preoperative diagnosis of patients with parotid gland masses. Harbord RM. Dashti H.78:791-793. 2008. Med Principles Pract.cochrane.57:279-283. Inohara H. Kamal SA. 2001. 2009. Deeks JJ BP. eds. Kaplan HH. Fine needle aspiration cytology in the evaluation of parotid gland tumors.48:149-154. 2007. 2010. Diaz Perez JA.12:410-413. Management of parotid gland surgery in a university teaching hospital. The diagnostic value of fine-needle aspiration cytology (FNAC) for lesions in the parotid gland [in Chinese]. Spence RA. 47. Rutter CM. 2008. Chew CT.59:212-216.9:211-229. Minerva Stomatol. Ultrasonography guided fine needle aspiration biopsy of parotid gland masses. 39. 33. Fine needle aspiration cytological study of parotid tumors [in Danish]. Ramirez R. et al. Kozawa T. et al. et al. 2001. J Pak Med Assoc.100:369-373. Khalid K. J Laryngol Otol. Berrone S.54:617-619. et al. 1996. Gatsonis C. Herrera Hernández AA. et al. 2008. 28. J Surg Oncol. The Cochrane Collaboration.77:188-192. Greenman R. Khan IA. Rutjes AWS. Ondolo C. 43. 1993. 20. 2007. Deeks JJ. Contucci AM. 31. Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy Version 1. Acta Cytol. The development of QUADAS: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. Pre-operative evaluation of parotid tumours by fine needle biopsy. et al. Sterne JAC. Angeli S. Carrillo JF. Witterick IJ. 2008. et al.2:27-34. 2010. Lefevre M. et al. A hierarchical regression approach to meta-analysis of diagnostic test accuracy evaluations. 2005. et al.28:189-192.24:180-183. Pract Otorhinolaryngol. 1998. Fine-needle aspiration of 154 parotid masses with histologic correlation: ten-year experience at the University of Texas M. Fine needle aspiration biopsy of the parotid gland: diagnostic problems and 2 uncommon cases. 21. Daskalopoulou D. Kondo A. Kaur A. et al. Garcia CA. Tuna EU. 37. Clinical study of parotid tumors. BMC Med Res Methodol. Ugeskr Laeger. Metandi: meta-analysis of diagnostic accuracy using hierarchical logistic regression. et al. Diagnosis and treatment of parotid tumours. Reitsma JB. ANZ J Surg. et al. Knudsen PJ. Talbot JM. Akahani S. An empirical comparison of methods for meta-analysis of diagnostic accuracy showed hierarchical models are necessary. Akbas Y. J Clin Epidemiol. 1985. Sondermann A. Anderson Cancer Center. Acta Otorrinolaringol Esp. 1990. Parotidectomy: preoperative investigations and outcomes in a single surgeon practice. 48. Jafari A. Am J Otolaryngol. The relative value of fine-needle aspiration and imaging in the preoperative evaluation of parotid masses. 2008. The role of fineneedle aspiration cytology and magnetic resonance imaging in the management of parotid mass lesions.15:185-190. 58 58 Am J Clin Pathol 2011. The value of pre-operative fine-needle aspiration biopsy in planning the management of parotid gland tumours. Califano L. et al. Eriksen HE. Accessed September 15.140:381-385. Br J Oral Maxillofac Surg. 36. Castro P. Fine needle biopsy in the preoperative diagnosis of parotid tumors [in Italian]. Costas A. Deans GT. Al-Khafaji BM. 2003.111:316-321. Tian Z. Quesnel S.27:96-100. 1998. 22. Fine needle aspiration cytology (FNAC) of salivary gland tumours: repeat aspiration provides further information in cases with an unclear initial cytological diagnosis. Yamamoto Y. Whiting P. 2004. Greisen O. Ann Intern Med.48:26-29. Awan MS. org/handbook-dta-reviews. Whiting P. Fine needle aspiration biopsy (FNAB) for lesions of the salivary glands.
96:799-804.136:45-59 DOI: 10. Evaluation of the fine needle aspiration biopsy in the presurgical diagnosis of tumors of the parotid gland [in Spanish]. 65. Singh N. Poole AG. 1991. A comparative study of 200 fine needle aspiration biopsies performed by clinicians and cytopathologists. et al.org/about/country-classifications/countryand-lending-groups. ANZ J Surg.71:345-348. Pitfalls in salivary gland fine-needle aspiration cytology: lessons from the College of American Pathologists Interlaboratory Comparison Program in Nongynecologic Cytology. 71. Sigston EA. Fagan JJ. Baloch ZW. 1997. Fine-needle aspiration of salivary gland lesions: comparison with frozen sections and histologic findings.75:144-146. 2003. Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative.116:1212-1215. New York. 84. 1987. Reitsma JB.47:188-190. Glasgow BJ. 90. Osaki T. Marrazzo A. 75. Ryan D. Role of fine-needle aspiration cytology in the evaluation of parotid tumours. 63. et al. Payne RJ. S Afr J Surg. 62. Tradati N. Glasziou P. Fine-needle aspiration of parotid masses. The World Bank. Clin Chem. Parotid tumors: clinical study of 36 cases. 1997.14:483-487. Connor NP. Tan P. Layfield LJ.1309/AJCPOIE0CZNAT6SQ 59 59 . Upton DC. 79. Agreement and error rates using blinded review to evaluate surgical pathology of biopsy material. 1992. 2002. Eur Arch Otorhinolaryngol. for the STARD Group. 2003. et al. 59. Minerva Chir. Lim CM. 2008. 54. Retrospective review of 242 consecutive patients treated surgically for parotid gland tumours. Bossuyt P. 2009. et al. Seethala RR. Berney D. 2001. Inadequate rates are lower when FNAC samples are taken by cytopathologists.77:742-744. Osanai H. Granter SR. 77. Lurie M. 2003. Isr Med Assoc J. 1981. 66. Fleiss JL. 61. 2003. Bossuyt PM. Am J Clin Pathol. 1992. Gorostiaga Aznar F. 102-103.48:1193-1196.worldbank. et al. 2001. Moisa II. 87. Guedon C. Parotidectomy: ten-year review of 237 cases at a single institution. They J. Stevenson S. 2nd ed. 2007. et al. 83. 2007.129:26-31. BMJ. Pract Otorhinolaryngol. and intraparotid facial nerve anatomy. Rev Laryngol Otol Rhinol (Bord). Diagnostic value of fineneedle aspiration from parotid gland lesions. J Oral Maxillofac Surg. Mohammed F. A review of parotid tumours and their management: a ten-yearexperience. Yuan S. Riley N. Bruns DE. 85. 53. Van Lierop AC. Silver CE.17:96-99. Facon F. Tsai SC. et al. Nallet E. et al. Rodriguez HP.Anatomic Pathology / Review Article 52. Takayama F.158:342-344. 2006. et al. 2010. Paris J. Head Neck. Malignant tumours of the parotid gland: a 12-year review. Preoperative diagnostic values of fine-needle cytology and MRI in parotid gland tumors. Spectrum of head and neck lesions diagnosed by fine-needle aspiration cytology in the pediatric population. 89. Parotidectomy in Cape Town: a review of pathology and management. 100. 2010. Cancer. Dinnes J. et al.27:217-223. ANZ J Surg. Diagnostic value of needle biopsy and frozen section histological examination in the surgery of primary parotid tumors [in French]. Bruns DE. Que Hee CG.50:600-608. Kulak Burun Bogaz Ihtis Derg. et al. Deeks J. Rapkiewicz A.119:797-800. 2006. Med J Malaysia. 69. Shafi M. Rosenberg WW. Fine-needle aspiration biopsy of head and neck lesions. 64. 57. Uğuz MZ. et al. Fine needle aspiration cytology in parotid lumps.33:495-503. 81. Estelles Ferriol E.136:788-792. 2003. Arch Pathol Lab Med. Zurrida S. Bossuyt PM. New Zealand.9:1-113. 2007. 2006. 1993. Fine-needle aspiration biopsy of parotid lesions: comparison with frozen section. Fradis M. 82. Simsir A. Tang IP. Pascal T. Relative accuracy of fine-needle aspiration and frozen section in the diagnosis of lesions of the parotid gland.14:327-331. Reitsma JB. et al. 2005. et al.49:7-18. Kirby J. Loh KS. Burstein DE. Grundy WG. Aust N Z J Surg. J Otolaryngol Head Neck Surg. Irwig L. et al. 2007. Allison R. A methodological review of how heterogeneity has been examined in systematic reviews of diagnostic test accuracy. et al. Parotid gland lesions: diagnosis of malignancy with MRI and flow cytometric DNA analysis and cytology in fine-needle aspiration biopsy. J Pak Med Assoc. et al. 2005. Br J Plast Surg. Fine-needle aspiration of parotid tumors. Weinberger MS. Zafar A. Accessed November 1. 1997. Alasio L. 60. 2005. NY: Wiley. Country and Lending Groups. Designing studies to ensure that estimates of test accuracy are transferable. Taormina P.49:1-6. Am J Surg. An Otorrinolaringol Ibero Am. Nonaka S. Longuet M.116:359-362. 2007. et al. LiVolsi VA. 72. Parotid neoplasms: diagnosis. treatment.21:43-51.324:669-671. Cytopathology. et al. Tew S. http:// data. Wang Q.67:438-441. Wilbur DC.111:242-251.122:51-55. et al. Schelkun PM. Altuna Mariezkurrena X. Le BT. 55.30:571-585. Malignant tumors of the parotid gland [in Spanish]. Koch WM.262:27-31. Philips J. Hsu HT. Richtsmeier WJ. et al. Hughes JH.76:736-739. 74. Utility of immediate on-site cytopathological procurement and evaluation in fine needle aspiration biopsy of head and neck masses. 80. Arch Pathol Lab Med. Otolaryngol Head Neck Surg.49:262-267. © American Society for Clinical Pathology 59 Am J Clin Pathol 2011. An Otorrinolaringol Ibero Am. Fineneedle aspiration of parotid gland lesions. Frozen section for parotid surgery: should it become routine? ANZ J Surg. Onal HK. Meurer WT. Mianroodi AAA. 1989. La Bara G. Vallance NA. Head Neck. 88. 2002. The role of needle aspiration biopsy in the diagnosis of parotid masses [in Italian].72:2306-2311. 70. J Laryngol Otol. Misselevitch I. et al. Asaria J. Carrasco Llatas M. Takashima S. Malata CM. 56. Camilleri IG. Cancer. Wu M. Volk EE.102:1328-1330. Sensitivity and specificity of fine needle aspiration biopsy in parotid masses [in Turkish]. et al. Head Neck. McLean NR. Clin Chem. McNamar JP. Velayutham P. 1993. 73. Hassan SH. 1999. Zulueta Lizaur A. 78. Perry CF. Fine-needle aspiration cytology of parotid tumours: is it useful? ANZ J Surg. Laryngoscope.111:346-353. Health Technol Assess. 2003. Fine-needle aspiration cytology in parotid masses: our experience in Canterbury. Shashinder S. Renshaw AA. Laryngoscope.4:681-683. 67. et al. 58. The STARD statement for reporting studies of diagnostic accuracy: explanation and elaboration. Pons Rocher F. 76. 2002. Cartagena N.64:31-33. Eroğlu OO. 86. et al.37:340-346. 2005. Eisele DW. 2005.45:96-98. 68. et al. Statistical Methods for Rates and Proportions.
This action might not be possible to undo. Are you sure you want to continue?
We've moved you to where you read on your other device.
Get the full title to continue reading from where you left off, or restart the preview.