ORIGINAL ARTICLE
Year : 2022 | Volume : 68 | Issue : 4 | Page : 221-230
The development of QERM scoring system for comprehensive assessment of the Quality of Empirical Research in Medicine - Part 1
Research Quality Improvement Group*
Date of Submission | 04-Jun-2022 |
Date of Decision | 12-Sep-2022 |
Date of Acceptance | 13-Sep-2022 |
Date of Web Publication | 04-Nov-2022 |
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/jpgm.jpgm_460_22
Purpose: Whereas a large number of features are mentioned to connote the quality of medical research, no tool is available to comprehensively measure it objectively across different types of studies. Also, all the available tools are for reporting, and none includes the quality of the inputs and the process of research. The present paper aims to initiate a discussion on the need to develop such a comprehensive scoring system (in the first place), to show that it is feasible, and to describe the process of developing a credible system. Method: An expert group comprising researchers, reviewers, and editors of medical journals extensively reviewed the literature on the quality of medical research and held detailed discussions to parse quality at all stages of medical research into specific domains and items that can be assigned scores on the pattern of quality-of-life scores. Results: Besides identifying the domains of the quality of medical research, a comprehensive tool for scoring emerged that can possibly be used to objectively measure the quality of empirical research comprising surveys, trials, and observational studies. Thus, this can be used as a tool to assess the Quality of Empirical Research in Medicine (QERM). The expert group confirmed its face and content validity. The tool can be used by researchers for self-assessment and improvement before submission of a paper for publication, and reviewers and editors can use it for assessing submissions. Published papers, such as those included in a meta-analysis, can also be rated. Conclusion: It is feasible to devise a comprehensive scoring system comprising domains and items for assessing the quality of medical research end-to-end, from choosing a problem to publication. The proposed scoring system needs to be reviewed by researchers and needs to be validated.
Keywords: Empirical research, medical research quality, QERM score, quality assessment, scoring system, tool to assess quality
How to cite this article: Research Quality Improvement Group*. The development of QERM scoring system for comprehensive assessment of the Quality of Empirical Research in Medicine - Part 1. J Postgrad Med 2022;68:221-30 |
How to cite this URL: Research Quality Improvement Group*. The development of QERM scoring system for comprehensive assessment of the Quality of Empirical Research in Medicine - Part 1. J Postgrad Med [serial online] 2022 [cited 2023 Jun 1];68:221-30. Available from: https://www.jpgmonline.com/text.asp?2022/68/4/221/360454 |
*This group comprises eight authors
A Indrayan1, G Vishwakarma2, RK Malhotra3, P Gupta4, H P S Sachdev5, S Karande6, S Asthana7, S Labani7
1 Clinical Research, Max Healthcare, New Delhi, India
2 Biostatistics, Indian Spinal Injuries Centre, New Delhi, India
3 Surgical Oncology, Dr. B. R. Ambedkar Institute Rotary Cancer Hospital, All India Institute of Medical Sciences, New Delhi, India
4 Pediatrics, University College of Medical Sciences, Delhi, India
5 Pediatrics and Clinical Epidemiology, Sitaram Bhartia Institute of Sciences and Research, New Delhi, India
6 Pediatrics, Seth G. S. Medical College and KEM Hospital, Mumbai, Maharashtra, India
7 Epidemiology and Biostatistics, National Institute of Cancer Prevention and Research (Indian Council of Medical Research), Noida, Uttar Pradesh, India
:: Introduction
The concept of quality generally refers to the inherent properties of a process or product that meet the stated objectives. In the case of medical research, the objective generally is to find new ways to improve health. This is mostly achieved by empirical research. This kind of research is an inherently imperfect endeavor in any field, but medical research is much more afflicted by uncertainties because of its interface with volatile humans. Thus, it requires special tools for assessing quality.
The quality of medical research has been in discussion for quite some time[1],[2],[3],[4] but the concern has steeply increased in the past decade as the number of papers and journals has exponentially increased. Many of these publications are believed to be of dubious quality.[3] While discussing “scandal” in medical research, Altman[5] stated that poor quality is widely acknowledged, but it raises minimal concern among the medical professionals. Ioannidis[6] commented on false findings in much published research and drew attention to the flawed and misleading findings in many publications resulting in huge wastage of resources.[7] Chalmers and Glasziou[8] also highlighted this enormous wastage. Thus, the quality of medical research needs urgent attention—not just for publication, but also the process, so that quality research is conducted and reported.
The quality of medical research has been interpreted differently by different workers. Ioannidis[1] emphasized truthfulness of results, and Mendoza and Garcia[2] expressed concern with reproducibility. The ESHRE Capri Workshop Group[3] showed concern with credibility and utility, and Mische et al.[4] discussed scientific rigor and transparency. With such diversity, it is important that a consensus evolves. The core domains of quality need to be identified, and the components of each domain specified for assessing specific ingredients of quality.
Most comments in the literature on the quality of medical research are based on the publications rather than the actual process. However, quality assurance should be done all through the process, and not just at the end, to ensure excellence in the inputs, the process, and the final product.
An expert Research Quality Improvement Group, comprising researchers, reviewers, and editors of medical journals, held intensive consultations among themselves and considered a wide spectrum of features that constitute “quality.” The endeavor was to include not just the quality of the written draft but also the inputs and the process. The Group explored the possibility of developing a scoring system comprising domains and items that can measure the quality of medical research in a comprehensive and objective manner, much like the quality-of-life index. The deliberations were limited to data-based empirical research because of the widely divergent issues in other kinds of research. They resulted in identifying the domains of the quality of medical research. Based on these domains, a proposal emerged for a tool to assess the Quality of Empirical Research in Medicine (QERM) that can measure the quality of the research process and the output. Evidence from CONSORT, STROBE, and STARD indicates that such guidelines do help in improving the quality of reporting.[9],[10],[11],[12] Adherence to the items of our proposed scoring system may help in improving not just the reporting but the entire process of medical research.
The Group realized at the outset that it is not feasible to observe the research process actually followed by different workers in their institutions and organizations, and that the only effective way for a third party to assess the quality of medical research is by evaluating the written document that describes the research. We included the quality of steps at planning and execution that can be used for self-assessment, and did not restrict ourselves to the quality of manuscripts before publication, when peer review is routinely done. The proposed scoring system contains items that could be useful to frame a worthwhile protocol and to adopt an appropriate process of conducting quality research, including the steps beginning with the choice of the topic and ending with the publication of a paper. Reporting also is a part of this unique scoring system. The scoring may also be useful for reviewers and editors to assess the quality of a submission, which generally includes statements on the conception and the process. Funding agencies can also use this scoring tool to assess the quality of proposals and the reports of concluded projects. Published papers can also be assessed, including those in a meta-analysis.
This communication reports the process adopted and the progress made to quantify the quality of empirical medical research through scoring. This includes a checklist. We follow the dictum that excellence should not be the enemy of good.[13] The objective is to initiate a discussion on the need to develop such a system (in the first place), to show that it is feasible, and to propose a credible scoring system to assess the quality of medical research for review by the researchers.
:: Materials and Methods
To comprehensively capture the traits that reflect the quality of medical research, the following activities were undertaken by the Group.
- Extensive review of the literature concerned with the quality of medical research to identify potential traits that could go into developing a scoring system. For this, the PubMed database was searched for articles containing the terms (“research quality” OR “quality of research”) in the title, published during the last ten years, with full text available. This yielded 84 articles (as of January 22, 2022). This could be a restricted sample but may have provided a reasonable snapshot of the recent thinking on this issue.
- Since most journals around the world depend on the inputs of the reviewers to decide to publish or decline, reviewers' guidelines of several journals were consulted including PloS Reviewers Guidelines[14] and Wiley Guidelines.[15]
- The origin of CONSORT,[16] STROBE,[17] and STARD[18] guidelines was studied to learn lessons regarding how to begin and complete such an exercise, and to pick features for assessing the quality of a research.
- The SQUIRE guidelines[19] and ICMJE recommendations[20] for improved reporting were examined.
- Other quality assessment tools such as OQAQ[21] for review articles, AMSTAR[22] for systematic reviews, QUADAS[23] for diagnostic accuracy studies, NOS[24] for observational studies, and COREQ[25] and the one proposed by Mays and Pope[26] for qualitative research were studied.
- Other relevant articles we could locate on this topic through internet search and cross-references were also studied. The attempt was to comprehensively capture all possible aspects of quality applicable to all levels of empirical research from observational studies to clinical trials.
The articles consulted by us are in the reference list and the [Supplementary Material - 1]. This also includes a separate list of articles on describing the process of developing tools such as checklists and scoring systems without validation at that stage.
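For illustration, the title-restricted PubMed search described above can be composed programmatically. The helper below is a hypothetical sketch (the function name and the exact filter tags are our assumptions, not part of the original search record); it only builds the query string and does not contact PubMed:

```python
def build_pubmed_query(phrases, start_year, end_year, full_text=True):
    """Compose a PubMed query that restricts the given phrases to the
    title, limits the publication-year range, and optionally adds the
    full-text subset filter."""
    title_part = " OR ".join(f'"{p}"[Title]' for p in phrases)
    query = f"({title_part}) AND ({start_year}:{end_year}[pdat])"
    if full_text:
        query += " AND full text[sb]"  # PubMed's full-text subset filter
    return query

# The search reported in the text: two title phrases, last ten years
q = build_pubmed_query(["research quality", "quality of research"], 2012, 2022)
```

Pasting the resulting string into the PubMed search box should reproduce a search of the same shape, although the hit count (84 articles as of January 22, 2022) depends on the date the search is run.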
Because of the enormous and wide variety of information, the Group decided to list all the terms that connote quality and bin them into suitable domains, such that the terms within each domain are related but largely unrelated to the terms in the other domains. The domains are like abstract constructs, whereas the items are the observable entities.
Domains of quality of medical research
The above-mentioned review identified more than 50 terms that could connote different aspects of the quality of medical research. These were classified using established norms.[27] In the context of empirical medical research, these may be understood as adequacy of the process, and truthfulness and applicability of the results and the conclusions, which come from clarity regarding the research question and the methodology of the investigation.[28],[29] This was extended to choosing the problem, the thought process, and the execution of research, besides the draft of the manuscript. The five domains thus identified are in column 1 of [Table 1].

Table 1: Domains of the quality of medical research and their constituents
Brainstorming was done to bin these 50-odd terms into the identified domains. One person (AI) was asked to come up with a classification for review by the Group. After some discussion and modification, the agreed classification is as shown in column 2 of [Table 1]. This classification helped to define each domain so that it has the same meaning for everybody. It also helped to understand what each domain contains and what it represents.
The first four domains comprise the traits to be considered mostly at the time of planning and execution, whereas the fifth is on drafting the manuscript. The draft of the paper should contain the statements on the first four domains plus more as mentioned later. The following are the briefs of each domain. The details are provided in the [Supplementary Material - 2].
The items numbered under each domain in the following paragraphs are based on the aforementioned review of the literature and the Group discussion regarding various components of research. The components begin with the choice of the research question and end with the drafting of the report.
Clarity
Scott-Findlay and Pollock[30] called for conceptual clarity and Shaw[31] discussed clarity in research integrity. An important ingredient of clarity is transparency.[4],[32],[33],[34],[35] A brief of clarity, including transparency, regarding various components of medical research is as follows; the details are in the [Supplementary Material - 2]: (i) unambiguous problem, (ii) complete specification of the objectives, (iii) full clarity regarding the target population, (iv) clarity regarding the design of the study, (v) complete specification of the sample size, considering the exclusions and dropouts, (vi) full specification of the intervention, if any, (vii) clearly formulated tools for data collection, (viii) clear process of elicitation of information, (ix) road map for analysis of data, (x) visualization of the expected results, and (xi) fitting negative and positive results in the research jigsaw puzzle.
Adequacy
Adequacy is sufficiency without overdoing. Bordage[36] and Pierson[37] considered inadequacy of contents as a major cause of rejection of manuscripts.
The “adequacy” domain contains many traits—the most talked about is reproducibility.[2],[38],[39],[40],[41] This is generally understood as the ability of a competent researcher to get nearly the same result using similar material from another setup when methods similar to those of the original investigator are used.[42] This is different from replicability[43] and repeatability, although both can broadly be considered parts of reproducibility. According to Goodman et al.,[44] reproducibility incorporates features of design, reporting, analysis, interpretation, and corroborating studies.
The adequacy of different components of research can be enumerated as follows: (i) original, novel, and justified research question, (ii) sufficient resources for completing the research, (iii) measurable objectives, (iv) adequate intervention to achieve the stated objectives, (v) appropriate target population, (vi) adequate tools for collection of the data, (vii) no relevant variable missed, (viii) study design that takes care of confounders and interactions, (ix) sufficient sample size, not discounting the advantages of small samples,[45] (x) representative sample, (xi) ethical and complete data collection, (xii) appropriate analysis for the type of data and focused on the research question, (xiii) results with sufficient reliability, and (xiv) valid reasons available if the research question is not fully answered. For details, see the [Supplementary Material - 2].
Truthfulness
Reproducible research can have bias when the same bias recurs every time the research is reproduced. A comprehensive list of biases is given by Indrayan and Holt.[46] Validity is the ability to reach the truth with no contrarian example. Since empirical research can never include future subjects, the veracity of the hypothesis cannot be determined, and truthfulness is assumed when its falsehood cannot be demonstrated.[47] Ioannidis[1],[6] emphasized “truthfulness” of results as the core component of the quality of medical research.
The following brief incorporates the “truthfulness” in the context of various components of research. The details are in the [Supplementary Material - 2]. (i) Accurate research question anchored with prior evidence, (ii) objectives directly related to the research question, (iii) target population is specified, (iv) variables chosen provide a correct answer to the research question, (v) valid intervention, (vi) chosen end-points provide answer to the research question, (vii) valid tools for eliciting the data, (viii) appropriate subjects of the study, (ix) accurate measurements, (x) design of the study provides unbiased results, (xi) reliability and power of the study are medically relevant, (xii) correct method of analysis with no P-hacking,[48] (xiii) credible and evidence based results, (xiv) internally and externally validated results, (xv) chance of false results minimized, (xvi) factors other than data that can affect the results are considered,[49] (xvii) conclusion considered corroborative evidence, and (xviii) imperfections in the tools and alternative explanations considered for the conclusion.
Applicability
There may be isolated examples of conclusions that are applicable but not useful, or vice versa, but these two traits generally go together. Limitations, both known and unknown, can hamper applicability. Any medical research is considered good if its results and conclusions are useful for improving the health of people directly or indirectly. Clear, truthful, and reproducible research, described in the preceding sections, does not necessarily imply that it is useful too. Utility relates to the extent to which the results are going to impact practice[50] and stands on its own as a domain of quality. The importance of applicability has been emphasized by Goodman et al.[44] and Ioannidis.[51]
Research component-wise, applicability and utility comprise (i) useful and relevant questions, (ii) research that is timely, with objectives amenable to translation into practice, (iii) study setting specified, (iv) inclusion and exclusion criteria not too restrictive, (v) how the target population would benefit from the research, (vi) effect size that is medically significant, (vii) control group on the existing regimen rather than placebo, (viii) intervention easy to adapt to local conditions, (ix) chosen variables that work under varying conditions, (x) implementation of results feasible under varying conditions, (xi) methodology that can be adopted by other workers to check replicability, (xii) robust results, and (xiii) conclusions with demonstrable applicability. The details of each of these are in the [Supplementary Material - 2].
Reporting
Research generally culminates in a report for dissemination of the findings to the target audience. A huge project may require a full volume, but many reports are published in a journal as a summarized but self-contained paper reflecting the entire process of research and the methodology used to reach a conclusion. This section is restricted to the quality of the draft of a paper for publication in a journal.
Many articles have appeared that advise on how to write a paper for publication.[52],[53],[54] Most of these have implications for the planning and execution also. The guidelines such as CONSORT,[10] STROBE,[11] and STARD[12] are primarily for the content and style of the drafting of specific types of studies, and SAMPL[55] is for statistical reporting.
The basic tenet of quality reporting is that the text describes the full process of research, including the results, in a focused, concise, and precise manner without losing clarity. The draft should be well organized and follow a logical format such as IMRaD.[20]
Many of the following suggestions for quality of a draft may look like a repetition of what we advised earlier in this communication, but the earlier advice was for conception, thought process, and the execution, whereas the advice now is on stating all of that in the draft of the paper. The possibility of an excellent thought process but poor reporting cannot be ruled out. In a way, the following reinforces the advice provided earlier for various domains: (i) accurate title that describes the research, (ii) unambiguous research question, (iii) complete specification of all the variables, (iv) clear identification of the target population, (v) factual design and the guidelines concerned with the design followed, (vi) target and actual sample size with justification, (vii) methods used to elicit the data, (viii) what data collected from whom, and adequacy and correctness of the available data, (ix) complete method of the analysis, (x) SAMPL guidelines[55] followed for statistical reporting, (xi) complete results including the unfavorable ones, (xii) limitations kept in view while interpreting the results, (xiii) the conclusion fully answers the research question, (xiv) all relevant references cited, (xv) data sharing considered for others to replicate, (xvi) supplementary material for fuller explanation, and (xvii) keywords adequately describe the thrust of the research. The details of each of these are in the [Supplementary Material - 2].
:: Results
For a scoring system to be applicable, it must not be too long, and it must be practically feasible and manageable. Thus, only the essential items are included in our proposed scoring, without leaving out any aspect that substantially affects the quality [Table 2]. This scoring gives more weight to the process (methodology), although the outcome (result) also gets its due share. Each item is given a score of 2 for nearly full satisfaction, 1 for partial satisfaction, and 0 for almost no satisfaction. The scores for each domain can be assigned standalone to quantify the quality of different aspects of empirical research, on the pattern of quality-of-life scores, or can be used for self-assessment by the researcher. Higher weight to more important items was considered but was disfavored as it could introduce complexity and subjectivity. Instead, the number of items varies across domains, and this number matched fairly well with our opinion regarding the importance of each domain. For example, the highest number of items (13) is for “Adequacy” and “Truthfulness,” followed by “Clarity” (10 items). The total score is the simple sum, considered valid for reflective scoring models[7] as opposed to formative measurement models with a combination of heterogeneous items.[56] In case categories are preferred, we suggest considering a total score of less than 50 as poor, 50–74 as tolerable, 75–89 as good, and 90+ as excellent. The utility of this scoring system is also in assessing which quality domain of the research is strong and which is weak. The items in column 3 of [Table 2] can also be used as a checklist without scoring.

Table 2: Scoring system for assessing the quality of empirical medical research: from the conceptualisation to the publication - Detailed QERM scoring system (for the researcher)
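The scoring rule just described can be sketched in code. The item names below are hypothetical placeholders for the actual items in Table 2; the 0/1/2 scale and the category cut-offs are taken from the text:

```python
def qerm_total(item_scores):
    """Sum the per-item scores: 2 for nearly full satisfaction,
    1 for partial satisfaction, 0 for almost no satisfaction."""
    if any(score not in (0, 1, 2) for score in item_scores.values()):
        raise ValueError("each item score must be 0, 1, or 2")
    return sum(item_scores.values())

def qerm_category(total):
    """Map a total QERM score to the suggested categories."""
    if total < 50:
        return "poor"
    if total < 75:
        return "tolerable"
    if total < 90:
        return "good"
    return "excellent"

# Hypothetical example: three items, e.g., from the "Clarity" domain
scores = {
    "unambiguous problem": 2,       # nearly fully satisfied
    "objectives specified": 1,      # partially satisfied
    "target population clear": 2,
}
```

Domain subtotals can be obtained the same way by passing only that domain's items, matching the standalone use of domain scores mentioned above.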
The extract in [Table 3] is the QERM-Brief for the reviewers, who have access to only the manuscript and are generally hard pressed for time. In this case, the items are to be scored as 1 for “satisfied” and 0 for “not satisfied.” Presentation, in terms of being “concise, coherent, unambiguous, clear, and understandable, supported by tables and graphs, and following the reporting guidelines,” is given a maximum score of 10, and the statements regarding all other domains together have a maximum score of 40. The editors can devise their own grouping but, in our opinion, manuscripts with a total score of less than 30 can be rejected straightaway, 30–39 treated as requiring revision, and 40+ as acceptable, provided there is no specific concern of the reviewer. The scoring may be able to provide a holistic assessment, largely free of the specialized field of the reviewer. This can minimize the scope for an unbalanced review. Unethical research can be discarded straightaway.

Table 3: Brief QERM score sheet for the reviewers (from the material as much as available from the manuscript) – to be used only for the studies that meet ethical standards
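As a sketch of the QERM-Brief triage rule (the function name and range checks are our own additions; the maxima of 10 and 40 and the decision thresholds come from the text):

```python
def qerm_brief_decision(presentation_score, domain_score):
    """Editorial triage from the brief reviewer score sheet.

    presentation_score: 0-10, for conciseness, coherence, clarity,
        supporting tables/graphs, and adherence to reporting guidelines.
    domain_score: 0-40, for the statements on all other domains together.
    """
    if not 0 <= presentation_score <= 10:
        raise ValueError("presentation score must be between 0 and 10")
    if not 0 <= domain_score <= 40:
        raise ValueError("domain score must be between 0 and 40")
    total = presentation_score + domain_score
    if total < 30:
        return "reject"      # may be rejected straightaway
    if total < 40:
        return "revise"      # requires revision
    return "acceptable"      # provided the reviewer has no specific concern
```

The rule applies only after the ethical-standards gate noted in the Table 3 caption; unethical submissions are discarded before any scoring.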
The experts in the Group verified face and content validity of the scoring systems after intense discussion. The other kinds of validity (including external validity and construct validity) are under process and will be reported separately. This communication is limited to the process we followed and the progress we made toward quantifying the quality of empirical medical research.
:: Discussion
Scoring in clinical practice has brought relief and ease of evaluation in much the same way as the measurement of laboratory parameters has in medical care. Although the perception of the evaluator cannot be completely ruled out, a scoring system can achieve objectivity to a large extent because all the items are fully specified. In the past, scores such as those for quality of life, APACHE for critical care patients, and APGAR for newborns have found wide applications. We expect that the domains and items we list for QERM scoring will help researchers not just in assessing their research but also in making them aware of the features that need to be considered while planning and executing a research project.
Besides the primary purpose of helping to plan, execute, and report quality research, the QERM scoring sheet can serve many other purposes. As discussed by Sandercock and Whitley[57] for quality research, the QERM can help the emerging generation of researchers to get started on a sound footing, create teaching resource, and help clinicians working in resource-poor settings to optimally allocate resources for quality research. Publications and funding proposals can also be evaluated, including the articles in a systematic review/meta-analysis.
Jadad et al.[58] listed 49 items such as random allocation and blinding that connote quality and developed a scale for quality of reporting of RCT with a focus on control of bias. Catillon[59] analyzed more than 20,000 RCTs and used criteria such as “adequate methods” and “poor reporting” for assessing quality. Higgins[60] discussed the Cochrane Collaboration tool for assessing bias in RCTs. Olivo et al.[61] reviewed nine scales used in physical therapy trials and 16 scales used for other areas of healthcare research. All these are focused on specific areas such as health services, dental injuries, or acupuncture. Many of these scales follow the pattern of items or domains and sub-items.
Gabriel and Maxwell[62] emphasized the “level of evidence,” which is an indicator of the potential for bias.[63] Montagna et al.[64] were of the view that “adoption of standardized protocols” can foster the strengthening of scientific publications. For Ueda et al.,[65] the impact factor of the journal is the main consideration for assessing quality! Glasziou et al.[66] mentioned the reliability of answers to important clinical questions as an important component of the quality of medical research. We believe our QERM score is comprehensive and contains all these traits of quality.
Most guidelines require that the checklist be submitted along with the paper for publication without a scoring system, but some do use scores: R-AMSTAR[22] quantifies quality for systematic reviews, QUADAS[23] for diagnostic accuracy studies, NOS[24] for observational studies, and the Jadad scale[58] for RCTs. Our effort is to devise a common scoring system covering several kinds of studies. Moreover, the existing tools mostly focus on reporting and not much on the conceptual framework and the process. The proposed QERM is more comprehensive, with wider coverage, including aspects of conceptualization and the process of research.
We realize that the proposed QERM scoring is complex, but that is because it is comprehensive and incorporates several facets of medical research. The researcher may have to first understand each term signifying the quality [Table 1] of different aspects of the research before using the QERM score. Such rigor is the price we should be prepared to pay for quality research. However, we have also proposed a QERM-Brief [Table 3], especially for the reviewers and editors, for relatively quick assessment, although this too would require more time and effort from the reviewers than the present system. The journals may like to think of incentives for the reviewers, such as priority consideration of their own submissions for publication. The reviewers may also consider the word limit of the journal while assigning scores, because this limit sometimes prevents giving full details; many journals now accept supplementary material for their online version. This Brief version can also be used by researchers who do not wish to be as rigorous with details.
This communication describes our efforts and the process of developing the QERM scoring system, to initiate a discussion on the need to develop such a system in the first place, to show that it is feasible, and to propose a credible scoring system to assess the quality of medical research for review, while the validation is in process. Modifications can be made as needed.
Limitations
The proposed scoring system is only for empirical studies, namely descriptive studies, clinical trials, observational studies, and diagnostic studies. Other kinds of studies, such as laboratory experiments, systematic reviews, and methodological research, are excluded. Aspects such as plagiarism and data manipulation are not considered, presuming that all researchers recognize these as malpractices. The scoring systems are verified for face and content validity; the other kinds of validation are still in process.
:: Conclusion
Research resources are scarce, and they must be spent optimally for beneficial outcomes. Assessing the quality of medical research starting from the planning stage can substantially help in improving the quality—thus in achieving the outcomes that really contribute to improved health. A high QERM score may require meticulous research process from conceiving the research question to drafting the paper for publication. We believe that awareness of such a scoring system and its usage can substantially improve the quality of medical research. At the same time, this may make the research process much more difficult, but quality comes at a cost.
Our scoring system is verified for face validity and content validity as assessed by the experts in the Group. It certainly needs additional evidence which would come from validation and wider applications in diverse settings.
Authors' contribution
AI proposed the idea and piloted the project. All significantly contributed to the deliberations and the draft. All approved the manuscript.
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
:: Supplementary Material - 1
A. Other references consulted for identifying the domains and developing the scoring system
- Antman P. From Aristotle to Descartes – A brief history of quality. Blog. May 08, 2013. [cited 2022 Jun 16]. Available from: https://smartbear.com/de/blog/2013/from-aristotle-to-descartes-a-brief-history-of-qua/
- APPENDIX 1: R-AMSTAR checklist - Quality assessment for systematic reviews. [cited 2022 Jun 16]. Available from: https://perosh.eu/wp-content/uploads/2015/12/R-AMSTAR-Checklist-OSH-Evidence.pdf
- Bellin E, Dubler NN. The quality improvement-research divide and the need for external oversight. Am J Public Health 2001;91:1512-7.
- Burda B, Holmer H, Norris S. Limitations of a measurement tool to assess systematic reviews (AMSTAR) and suggestions for improvement. Syst Rev 2016;5:58.
- CASP. Critical Appraisal Skills Programme. CASP Checklist: 10 questions to help you make sense of a qualitative research. [cited 2022 Jun 16]. Available from: https://casp-uk.net/wp-content/uploads/2018/01/CASP-Qualitative-Checklist-2018.pdf
- CASP. Critical Appraisal Skills Programme. CASP Checklist: 10 questions to help you make sense of a systematic review. [cited 2022 Jun 16]. Available from: https://casp-uk.net/wp-content/uploads/2018/01/CASP-Systematic-Review-Checklist_2018.pdf
- CASP. Critical Appraisal Skills Programme. CASP Checklist: 11 questions to help you make sense of a randomised controlled trial. [cited 2022 Jun 16]. Available from: https://casp-uk.net/wp-content/uploads/2018/01/CASP-Randomised-Controlled-Trial-Checklist-2018.pdf
- CASP. Critical Appraisal Skills Programme. CASP Checklist: 11 questions to help you make sense of a case control study. [cited 2022 Jun 16]. Available from: https://casp-uk.net/wp-content/uploads/2018/01/CASP-Case-Control-Study-Checklist-2018.pdf
- CASP. Critical Appraisal Skills Programme. CASP Checklist: 12 questions to help you make sense of a cohort study. [cited 2022 Jun 16]. Available from: https://casp-uk.net/wp-content/uploads/2018/01/CASP-Cohort-Study-Checklist_2018.pdf
- CASP. Critical Appraisal Skills Programme. CASP Checklist: 12 questions to help you make sense of a diagnostic test study. [cited 2022 Jun 16]. Available from: https://casp-uk.net/wp-content/uploads/2018/01/CASP-Diagnostic-Checklist-2018.pdf
- Singh Chawla D. Hundreds of 'predatory' journals indexed on leading scholarly database. Nature 2021 Feb 8. Epub ahead of print.
- Cherpak LA, Korevaar DA, McGrath TA, Dang W, Walker D, Salameh JP, et al. Publication bias: Association of diagnostic accuracy in radiology conference abstracts with full-text publication. Radiology 2019;292:120-6.
- Choudhuri AR. Identification of criteria for assessing the quality of research. Am J Educ Res 2018;6:592-5.
- Cohen JF, Korevaar DA, Altman DG, Bruns DE, Gatsonis CA, Hooft L, et al. STARD 2015 guidelines for reporting diagnostic accuracy studies: Explanation and elaboration. BMJ Open 2016;6:e012799.
- Emmert-Streib F, Dehmer M, Yli-Harja O. Ensuring quality standards and reproducible research for data analysis services in oncology: A cooperative service model. Front Cell Dev Biol 2019;7:349.
- Faggion CM Jr. The (in)adequacy of translational research in dentistry. Eur J Oral Sci 2020;128:103-9.
- Farrugia P, Petrisor BA, Farrokhyar F, Bhandari M. Practical tips for surgical research: Research questions, hypotheses and objectives. Can J Surg 2010;53:278-81.
- Scott-Findlay S, Pollock C. Evidence, research, knowledge: A call for conceptual clarity. Worldviews Evid Based Nurs 2004;1:92-7; discussion 98-101.
- Gierisch JM, Beadles C, Shapiro A, McDuffie JR, Cunningham N, Bradford D, et al. Health Disparities in Quality Indicators of Healthcare Among Adults with Mental Illness [Internet]. Washington (DC): Department of Veterans Affairs (US); 2014 Oct. [cited 2022 Jun 16]. Available from: https://www.ncbi.nlm.nih.gov/books/NBK299080/
- 10/90 gap. [cited 2022 Jun 16]. Available from: https://en.wikipedia.org/wiki/10/90_gap
- Held L, Schwab S. Improving the reproducibility of the science. Significance 2020;17:10-1.
- Hong PJ, Korevaar DA, McGrath TA, et al. Reporting of imaging diagnostic accuracy studies with focus on MRI subgroup: Adherence to STARD 2015. J Magn Reson Imaging 2018;47:523–44.
- Ioannidis JP. The mass production of redundant, misleading, and conflicted systematic reviews and meta-analyses. Milbank Q 2016;94:485–514.
- Ioannidis JPA. Meta-research: Why research on research matters. PLoS Biol 2018;16:e2005468.
- Kasi P. Research: What, Why and How? The Treatise from Researchers to Researchers. Authorhouse 2009.
- Shamseer L, Moher D, Clarke M, Ghersi D, Liberati A, Petticrew M, et al. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015: Elaboration and explanation. BMJ 2015;350:g7647.
- Macleod MR, Michie S, Roberts I, Dirnagl U, Chalmers I, Ioannidis JP, et al. Biomedical research: Increasing value, reducing waste. Lancet 2014;383:101-4.
- Mårtensson P, Fors U, Wallin SB, Zander U, Nilsson GH. Evaluating research: A multidisciplinary approach to assessing research practice and quality. Research Policy 2016;45: 593–603.
- Meader N, King K, Llewellyn A, Norman G, Brown J, Rodgers M, et al. A checklist designed to aid consistency and reproducibility of GRADE assessments: Development and pilot validation. Syst Rev 2014;3:82.
- Moher D, Cook D, Eastwood S, Olkin I, Rennie D, Stroup DF. Improving the quality of reports of meta-analyses of randomised controlled trials: The QUOROM statement. Quality of Reporting of Meta-analyses. Lancet 1999;354:1896–1900.
- Moher D, Jadad AR, Tugwell P. Assessing the quality of randomized controlled trials. Current issues and future directions. Int J Technol Assess Health Care 1996;12:195-208.
- Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. BMJ 2009;339:b2535.
- Newcastle-Ottawa quality assessment scale case control studies. [cited 2022 Jun 16]. Available from: http://www.ohri.ca/programs/clinical_epidemiology/nosgen.pdf
- Newcastle-Ottawa quality assessment form for cohort studies. [cited 2022 Jun 16]. Available from: https://www.ncbi.nlm.nih.gov/books/NBK115843/bin/appe-fm3.pdf
- Nieminen P, Virtanen JI, Vähänikkilä H. An instrument to assess the statistical intensity of medical research papers. PLoS One 2017;12:e0186882.
- Philipp D. Raising the bar for systematic reviews with assessment of multiple systematic reviews (AMSTAR). BJU Int 2017;119:193.
- Rezaeian M. Disadvantages of publishing biomedical research articles in English for non-native speakers of English. Epidemiol Health 2015;37:e2015021.
- Robinson JJ. Openness and clarity are essential in research reporting. Int Nurs Rev 2011;58:147.
- Shea BJ, Grimshaw JM, Wells GA, Boers M, Andersson N, Hamel C, et al. Development of AMSTAR: A measurement tool to assess the methodological quality of systematic reviews. BMC Med Res Methodol 2007;7:10.
- Shea BJ, Reeves BC, Wells G, Thuku M, Hamel C, Moran J, et al. AMSTAR 2: A critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ 2017;358:j4008.
- Stroup DF, Berlin JA, Morton SC, Olkin I, Williamson GD, Rennie D, et al. Meta-analysis of observational studies in epidemiology: A proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA 2000;283:2008-12.
- Timmer A, Sutherland LR, Hilsden RJ. Development and evaluation of a quality score for abstracts. BMC Med Res Methodol 2003;3:2.
- Vandenbroucke JP, von Elm E, Altman DG, Gøtzsche PC, Mulrow CD, Pocock SJ, et al. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE): Explanation and elaboration. PLoS Med 2007;4(10):e297.
- Verhagen AP, de Vet HC, de Bie RA, Kessels AG, Boers M, Bouter LM, et al. The Delphi list: A criteria list for quality assessment of randomized clinical trials for conducting systematic reviews developed by Delphi consensus. J Clin Epidemiol 1998;51:1235-41.
- von Elm E, Egger M. The scandal of poor epidemiological research. BMJ 2004;329:868-9.
- Wells GA, Shea B, Connell D. The Newcastle Ottawa Scale (NOS) for assessing the quality of nonrandomized studies in meta-analysis. The Ottawa Hospital Research Institute 2013.
- Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS-2: A revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 2011;155:529-36.
- Whiting PF, Weswood ME, Rutjes AW, Reitsma JB, Bossuyt PN, Kleijnen J. Evaluation of QUADAS, a tool for the quality assessment of diagnostic accuracy studies. BMC Med Res Methodol 2006;6:9.
B. References that describe the process of developing an assessment tool without validation (validation was not done or done later in another article)
- Shwartz M, Restuccia JD, Rosen AK. Composite measures of health care provider performance: A description of approaches. Milbank Q 2015;93:788-825.
- Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, et al. Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. Standards for Reporting of Diagnostic Accuracy. Clin Chem 2003;49:1-6.
- Ogrinc G, Mooney SE, Estrada C, Foster T, Goldmann D, Hall LW, et al. The SQUIRE (Standards for QUality Improvement Reporting Excellence) guidelines for quality improvement reporting: Explanation and elaboration. Qual Saf Health Care 2008;17 Suppl 1(Suppl_1):i13-32.
- Whiting P, Rutjes AW, Reitsma JB, Bossuyt PM, Kleijnen J. The development of QUADAS: A tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Med Res Methodol 2003;3:25.
- Tong A, Sainsbury P, Craig J. Consolidated criteria for reporting qualitative research (COREQ): A 32-item checklist for interviews and focus groups. Int J Qual Health Care 2007;19:349-57.
- Mays N, Pope C. Qualitative research in health care. Assessing quality in qualitative research. BMJ 2000;320:50-2.
- CASP. Critical Appraisal Skills Programme. CASP Checklist: 10 questions to help you make sense of a qualitative research. [cited 2022 Jun 16]. Available from: https://casp-uk.net/wp-content/uploads/2018/01/CASP-Qualitative-Checklist-2018.pdf
- Nieminen P, Virtanen JI, Vähänikkilä H. An instrument to assess the statistical intensity of medical research papers. PLoS One 2017;12:e0186882.
- Shea BJ, Grimshaw JM, Wells GA, Boers M, Andersson N, Hamel C, et al. Development of AMSTAR: A measurement tool to assess the methodological quality of systematic reviews. BMC Med Res Methodol 2007;7:10.
- Lazaris AM, Mastoraki S, Kontopantelis E, Seretis K, Karouki M, Moulakakis K, et al. Development of a scoring system for the prediction of early graft failure after peripheral arterial bypass surgery. Ann Vasc Surg 2017;40:206-15.
- Stroup DF, Berlin JA, Morton SC, Olkin I, Williamson GD, Rennie D, et al. Meta-analysis of observational studies in epidemiology: A proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA 2000;283:2008-12.
- Vandenbroucke JP, von Elm E, Altman DG, Gøtzsche PC, Mulrow CD, Pocock SJ, et al. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE): Explanation and elaboration. PLoS Med 2007;4:e297.
- Verhagen AP, de Vet HC, de Bie RA, Kessels AG, Boers M, Bouter LM, et al. The Delphi list: A criteria list for quality assessment of randomized clinical trials for conducting systematic reviews developed by Delphi consensus. J Clin Epidemiol 1998;51:1235-41.
- Wells GA, Shea B, Connell D. The Newcastle Ottawa Scale (NOS) for assessing the quality of nonrandomized studies in meta-analysis. The Ottawa Hospital Research Institute 2013.
- Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS-2: A revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 2011;155:529-36.
- Barbagallo GM, Olindo G, Corbino L, Albanese V. Analysis of complications in patients treated with the X-stop interspinous process decompression system: Proposal for a novel anatomic scoring system for patient selection and review of the literature. Neurosurgery 2009;65:111-19; discussion 119-20.
- Najjar-Pellet J, Jambou P, Jonquet O, Fabry J. Quality assessment in surgical care departments: Proposal for a scoring system in terms of structure and process. Qual Saf Health Care 2010;19:107-12.
- Lee YK, Shin ES, Shim JY, Min KJ, Kim JM, Lee SH, et al. Developing a scoring guide for the Appraisal of Guidelines for Research and Evaluation II instrument in Korea: A modified Delphi consensus process. J Korean Med Sci 2013;28:190-4.
- Zhang J, Han T, Cai Z, Wang Y, Shang X, Yang B, et al. The use of Delphi method and analytical hierarchy process in the establishment of assessment tools in premature ejaculation: The scoring system for premature ejaculation treatment outcomes. Am J Mens Health 2020;14:1557988320975529.
- Allin B, Ross A, Marven S, J Hall N, Knight M. Development of a core outcome set for use in determining the overall success of gastroschisis treatment. Trials 2016;17:360.
- Pluye P, Gagnon MP, Griffiths F, Johnson-Lafleur J. A scoring system for appraising mixed methods research, and concomitantly appraising qualitative, quantitative and mixed methods primary studies in mixed studies reviews. Int J Nurs Stud 2009;46:529-46.
Supplementary Material - 2
Details of the contents of each domain of quality of empirical research
Clarity
The following gives the details of clarity for various steps of medical research, including transparency.
- The problem under investigation is unambiguously conceptualized. For example, the relation between diet and cancer is an unclear question unless the site of cancer and the type of diet to be investigated are specified. A clear question can be answered without many hurdles.
- The objectives of the research should specify the antecedent, intervention, confounder, and outcome variables to be investigated, and each variable should be precisely defined. For example, if obesity is one of the factors under investigation, specify whether it will be measured by body mass index, waist-hip ratio, weight-for-height, or another index; how it will be measured; and why this measurement is more suitable than its competitors.
- The target population to which the conclusion will apply should be well-defined with no ambiguity. Geographic location, ethnicity, age-group, sex, severity of disease, and such other factors should be considered for clearly identifying the target population.
- The researcher must be clear about the design to be used for the study. For example, in the case of a double-blind RCT, there must be complete clarity about the kind of subjects that will receive the test and the control regimen (such as the inclusion-exclusion criteria), where they will come from, how they will be recruited, the method of randomization, and how blinding will be ensured, including the process of concealment. In the case of observational studies, decide whether the study will be retrospective (from known outcome to the antecedents), prospective (from the known antecedents to the outcomes), or cross-sectional (both antecedents and outcomes assessed together). A flow diagram of the subjects of research with appropriate filters can help to achieve further clarity.
- The sample size must be obtained with due consideration of exclusions and dropouts, if any.
- In the case of an intervention, consider the clarity regarding the specifics of the intervention and its duration. Prospective studies should specify the length of follow-up with justification. Also, think of what action can be taken if subjects change their preference, withdraw their consent, develop an adverse effect, or become untraceable, and of the actions to be taken for possible non-response.
- The tools for collection of data from different modes (observations, interview, examination, investigation) should have unambiguous columns and items so that the collection can go on smoothly.
- Clearly visualize the process of eliciting information – what, how often, and from whom, and how it will be recorded. The items of information on the case record form should be clearly worded so that there is no scope for subjective interpretation.
- Think of the method of data analysis that will be deployed, with a clear road map.
- There must be sufficient clarity about the kind of results expected; check that they would clearly answer the research question under investigation. Ifs and buts, if any, should be clearly visualized and accounted for.
- Whereas the results cannot be predicted, the possibilities can be anticipated so that planning can be done accordingly. Place these possibilities into the jigsaw of the research puzzle in the light of current knowledge and develop a perspective of what may emerge as a conclusion, particularly in view of the anticipated limitations of design, data, and analysis.
- Visualize a possible conclusion in case the results are positive as anticipated, or negative against anticipation. This can help in achieving further clarity regarding the proposed research.
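The method-of-randomization point above can be made concrete with a small sketch. The following is an illustrative block-randomization scheme, not a procedure prescribed by QERM; the two arm labels, the block size of four, and the fixed seed are assumptions for the example:

```python
import random

def block_randomize(n_blocks, block_size=4, arms=("A", "B"), seed=2022):
    """Generate a blocked allocation sequence: each block holds equal
    numbers of each arm in random order, so group sizes stay balanced
    throughout recruitment."""
    rng = random.Random(seed)  # fixed seed only to make the example reproducible
    per_arm = block_size // len(arms)
    sequence = []
    for _ in range(n_blocks):
        block = list(arms) * per_arm
        rng.shuffle(block)  # random order within the block
        sequence.extend(block)
    return sequence
```

In practice, such an allocation list would be prepared by someone not involved in recruitment and concealed from the recruiting clinicians, in line with the concealment requirement above.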
Adequacy
The adequacy of different components of research can be explained as follows.
- Adequacy of the research question requires that it is original, novel, and justified, and that its answer will contribute to a better understanding of health processes and outcomes. It should be focused and not aim at too many answers. The rationale of the research question should be clear, and the question should be chosen after a thorough review of the appropriate literature with sincere efforts to identify the gaps.
- Sufficient resources in terms of time, expertise, facilities, and funding should be available to take the study to its logical end.
- The objectives should be framed in measurable format, and the variables studied should be sufficient to provide an answer to the question. No aspect that can have an impact on the outcomes should be left out or ignored; any that is should be acknowledged as a limitation.
- Consider whether the intervention is adequate to provide the answer to the research question.
- The target population should be appropriate for the research question.
- The tools used for collection of data should be able to elicit all the data required for the study.
- All the relevant variables (antecedents, intervention, confounders, and outcomes) should be included to get a full answer to the research question.
- A study design is considered adequate when it can provide a full answer to the research question with sufficient precision and takes care of possible confounders and interactions between factors.
- The sample size should be based on prior consideration of the aimed reliability of the estimate of each parameter under study, or on adequate power to detect a pre-defined, medically justified effect after adjustment for multiple comparisons. In the case of multiple endpoints, the sample size should be adequate for a reliable conclusion on all the endpoints.
- The sample should be representative of the intended target population, and of its segments in case the results are for the segments. A convenience sample is unlikely to be representative, and a large sample of this type will accentuate the bias. A small sample is rarely representative of the cross-section of the target population. An unrealized advantage of a small sample is that full care and sophisticated technology can be used to elicit the right data.
- The procedure for collection of data should be ethical and the collected data should be complete on all the antecedents, interventions, confounders, and outcomes, without being affected by the researcher's attitude.
- The analysis of data should be based on the type of data and should be geared to answer the research question. Statistical significance should be tested for the presence of a medically relevant effect and not against a null of no effect, as the null of no effect is almost sure to be rejected if the sample size is large. Different types of analysis should be done, and all should lead to the same results; if not, valid reasons for the variation should be available. All the assumptions of the methods used for statistical analysis must be verified. The analysis should be rigorous.
- The results obtained or expected should have sufficient reliability after accounting for uncertainties such as in measurements, responses, and assessments. The evidence of the result must be convincing and biologically plausible.
- In case the question is not fully answered by the study, convincing reasons should be available. Such gaps affect the translational potential of the findings to real-life situations.
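To illustrate the sample-size point above, here is a minimal sketch of the standard normal-approximation formula for comparing two proportions; the proportions, the 5% two-sided significance level, and the 80% power used below are illustrative assumptions, not QERM prescriptions:

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Sample size per group for a two-sided comparison of two proportions
    (normal approximation, equal group sizes)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, two-sided
    z_beta = NormalDist().inv_cdf(power)           # quantile for the aimed power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

# Detecting an improvement from 50% to 60% requires far more subjects
# per group than detecting one from 40% to 60%.
```

In a real protocol, the pre-defined medically justified effect would drive the choice of p1 and p2, and the result would then be inflated for anticipated exclusions and dropouts, as noted above.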
Truthfulness
The following points explain the “Truthfulness” in the context of various components of research.
- A research question should be conceived with sufficient accuracy and should be appropriately anchored in the prior evidence. This should be justified by showing gaps in the knowledge.
- The objectives must be directly related to the research question. If they are indirect, valid reasons should be available.
- The target population, as defined by the inclusion-exclusion criteria or by the selected sample, should be valid and not biased (such as volunteers or clinic subjects in some cases).
- Variables chosen for the study should be able to provide correct answer to the research question. For example, they should not be those that are considered to give a favorable result. If surrogates are used, consider how truly they represent the underlying variables.
- The intervention, if any, should be a valid and ethical strategy for the research in hand.
- The chosen endpoints should be capable of providing the right answer to the question and should not be changed mid-way without valid reason. As far as possible, outcome assessment should be blind to the exposure or initial status of the subjects, and perhaps also to the intervention.
- The tools used to elicit the data should have established validity or their validity should be established before using them.
- The subjects of the study should be those that can provide the correct answer to the research questions.
- The measurements should be accurate, as even a slight bias can blow up in some cases due to the butterfly effect.
- The design should be able to provide unbiased results. For example, in the case of clinical trials, randomization or matching, blinding, and equipoise across groups are required to reach a valid answer. In the case of observational studies, an unbiased sample and baseline equivalence between cases and controls, wherever appropriate, are required. Descriptive studies must be based on a representative sample. The design should be executed as planned.
- The aimed reliability of the estimates and proposed power of the statistical comparisons should be based on medically relevant considerations.
- (a) The method of statistical analysis should be right for the variables, sample size, and design, and should be right for answering the research question, with due consideration of the distribution pattern, inter-dependence of various characteristics, confounding, and the need for standardization. (b) Categorization of continuous variables, if done, should be justified, and the categories should be medically relevant. (c) Confounders and baseline inequalities, if present, should be adjusted for at the time of analysis. (d) Ensure that there is no P-hacking, data dredging, cherry-picking, snooping, significance chasing, or data torturing. P-values should be adjusted for multiple uses of the same data so that the chance of reaching the truth increases. Remember that P-values are valid only when the design, data, and analysis are appropriate.
- The results are credible when they are based on evidence available within the study. The effect size and its confidence interval (CI) should be such as to give confidence in the results. In case a sub-group analysis is done, the sub-group results should conform to the overall result, or a valid explanation of the variation should be available. The results should be interpreted with humility, considering the inherent uncertainties in the P-values, the effect size, and the CIs, without over-interpretation.
- Internal and external validity of the results should be demonstrated or acknowledged as a limitation.
- Consider the possibility of reaching a false-positive or false-negative result and think of how this chance can be minimized.
- Realize that the results are based on the data and any set of data is a 'partial representation' of the phenomenon under study. Other factors that can possibly affect the results should also be considered.
- The conclusion should also consider corroborative evidence, from the literature or otherwise, and the limitations, besides the biological plausibility.
- Realize the imperfections in the tools and measurements, and factor this into the conclusion. Possible alternative explanations of the results should be considered and ruled out.
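The advice in point (d) above on adjusting P-values for multiple uses of the same data can be sketched with the Holm step-down procedure, one common family-wise error rate correction (the example P-values used in testing are hypothetical):

```python
def holm_adjust(pvals):
    """Return Holm step-down adjusted P-values (controls the family-wise
    error rate; uniformly less conservative than plain Bonferroni)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest P first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        adj = min(1.0, (m - rank) * pvals[i])   # step-down multiplier m, m-1, ..., 1
        running_max = max(running_max, adj)     # keep adjustments monotone non-decreasing
        adjusted[i] = running_max
    return adjusted
```

A comparison is then declared significant only if its adjusted P-value stays below the chosen level, which is one way to keep the chance of a spurious finding under control when the same data are tested repeatedly.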
Applicability
Research component wise details of applicability (and utility) are as follows.
- Applicability starts with setting up a useful and relevant question, the answer to which can help in improving the health of an identified segment of the population.
- Objectives must be amenable to translation into practice. Timeliness and current importance add to the applicability.
- Setting of the study (community, hospital, or clinic) decides the extent of applicability of the results.
- Inclusion-exclusion criteria should not be too restrictive so that the applicability does not suffer.
- Consider whether the target population is indeed the one that will benefit from the answer to the research question. For example, targeting efficacy is a laudable objective in most trials, but effectiveness in practice is what determines applicability.
- The effect size should be medically significant in the sense of being capable of changing the current practice. Negative result, if any, should be clearly stated.
- The control group – in trials as well as in case-control studies – must be the one in the current practice and not a null group because this determines the utility in retaining or changing the current practice. For example, in diagnostic studies, the control group too should be suspected cases who undergo the concerned medical test.
- The intervention should preferably be such that can be adapted for use in different settings.
- The chosen variables should preferably be those that can provide the answer under varying conditions. For example, mortality end-point for critical conditions can be observed in different settings but the duration of hospital stay in one setup may not necessarily apply to the other setups.
- The data required for implementation of results should be such that is feasible to obtain in varying conditions.
- The entire methodology should be such that another researcher with adequate resources is able to replicate it. Appropriate statistical analysis is crucial for improving the reproducibility of results.
- Consider whether the results obtained or expected to be obtained will be reproducible, generalizable, sustainable, and useful to advance the science of medicine – will they be really applicable. Besides robustness of the results, cost and convenience are sometimes over-riding considerations for applicability of the results.
- The conclusion, after other relevant considerations, should have demonstrable applications. A theoretical construct should be proposed in justification of the conclusion.
Reporting
The following are the details of the reporting of various components of medical research.
- Frame a title that accurately reflects the contents of the paper.
- Unambiguously and accurately state the research question, how it originated, why it is novel and justified, and why it is relevant and important. This must be supported by the gaps found in the review of literature. The review must be comprehensive, including all the relevant literature without leaving out the contrarian view. The aim of the study should be mentioned, and the specific objectives should be in measurable format.
- Mention all the antecedent, intervention, confounder, and outcome variables, including the relevance of each for the research question under study, precisely define them and state how they were measured.
- Clearly identify the target population and state how this population could benefit from the answer to the research question.
- State factual aspects of the design of the study, including the selection of cases and controls, stratification, allocation, randomization, and blinding. Justify the design for providing a valid answer to the research question and state the limitations, if any. Follow CONSORT, STARD, or STROBE guidelines as per the design.
- State the target sample size for each group along with the aimed reliability or power. Where relevant, justify the medically important effect used for calculating the sample size for the specified power. Also state how many subjects could actually be studied, the reasons for any dropouts, and whether or not this will affect the results.
- State the methods (observation, interview, examination, investigation) used to elicit various elements of the data and how they were recorded, including the tools used, such as a structured or unstructured case record form, pre-coded or not, and any scoring system used along with its merits for the study. Also state whether any follow-up was done, why it was needed, how many visits were made, and what was elicited at each follow-up, as well as when and why revisits were made, if any. Comment on the feasibility of these methods in other setups.
- State what data were collected from whom and how the correctness of the data was ensured. Comment on the adequacy of the data to answer the research question, and whether this kind of data can be obtained by other researchers who would want to replicate the study.
- State the complete method of analysis – planned as well as the one actually used – and how this method is consistent with the design of the study. State how the underlying assumptions were checked. Give relevant percentages, correlations, regression equations, P-values, confidence intervals, etc. Also state the limitations of the method of analysis.
- State complete results without suppressing the unfavorable ones. Check that the results really emanate from the evidence presented in the paper and they match with the objectives mentioned earlier. Comment on their reproducibility, reliability, and statistical significance in view of multiple testing if used. State how the internal and external consistency of the results was checked and how the results are robust for applications in other setups. Organize the results with the help of clearly made tables and graphs.
- Use improved SAMPL guidelines for statistical reporting for all the steps where applicable.
- Properly interpret the results in view of the limitations of the design and data. Discuss the strengths and weaknesses of the results and comment on their robustness after resolving any internal conflicts within the study and with the findings of the others. Present the biological plausibility and corroborative evidence in support of the results. Give convincing argument against the alternative viewpoint, if any. The Discussion section should follow a logical sequence and should be coherent. Nothing should be hidden. Do not mix opinions with the evidence-based results.
- Draw a clear conclusion that answers the research question and is consistent with the results. This must consider the biological plausibility, corroborative evidence, and limitations stated earlier. Comment on its applicability in the light of cost-effectiveness but show humility in view of omni-present uncertainty in empirical research.
- All the cited references must be relevant and, as far as possible, recent and from high-quality journals.
- Consider whether the data can be shared. This helps others to check the repeatability.
- Keywords serve the important purpose of quick retrieval of the work in internet searches. The keywords specified should adequately describe the thrust of the research.
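The reporting advice above asks for effect sizes with confidence intervals rather than bare P-values. A minimal sketch for one common effect measure, the risk difference between two groups with a Wald-type interval, follows; the event counts used in testing are hypothetical, and the Wald interval is only one of several available methods:

```python
from math import sqrt
from statistics import NormalDist

def risk_difference_ci(events1, n1, events2, n2, conf=0.95):
    """Risk difference between two groups with a Wald confidence interval."""
    p1, p2 = events1 / n1, events2 / n2
    diff = p1 - p2
    # Standard error of the difference between two independent proportions
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return diff, diff - z * se, diff + z * se
```

If the interval includes zero, the data remain compatible with no difference, which is exactly the nuance that a bare P-value hides and that the reporting guidance above asks authors to convey.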
References
1. Ioannidis JP. How to make more published research true. PLoS Med 2014;11:e1001747.
2. Mendoza D, Garcia CA. Defining research reproducibility: What do you mean? Clin Chem 2017;63:1777.
3. ESHRE Capri Workshop Group. Protect us from poor-quality medical research. Hum Reprod 2018;33:770–6.
4. Mische SM, Fisher NC, Meyn SM, Sol-Church K, Hegstad-Davies RL, Weis-Garcia F, et al. A review of the scientific rigor, reproducibility, and transparency studies conducted by the ABRF Research Groups. J Biomol Tech 2020;31:11–26.
5. Altman DG. The scandal of poor medical research. BMJ 1994;308:283–4.
6. Ioannidis JP. Why most published research findings are false. PLoS Med 2005;2:e124.
7. Shwartz M, Restuccia JD, Rosen AK. Composite measures of health care provider performance: A description of approaches. Milbank Q 2015;93:788–825.
8. Chalmers I, Glasziou P. Avoidable waste in the production and reporting of research evidence. Lancet 2009;374:86–9.
9. Moher D, Jones A, Lepage L, CONSORT Group (Consolidated Standards for Reporting of Trials). Use of the CONSORT statement and quality of reports of randomized trials: A comparative before-and-after evaluation. JAMA 2001;285:1992–5.
10. Kane R, Wang J, Garrard J. Reporting in randomized clinical trials improved after adoption of the CONSORT statement. J Clin Epidemiol 2007;60:241–9.
11. Hendriksma M, Joosten MH, Peters JP, Grolman W, Stegeman I. Evaluation of the quality of reporting of observational studies in otorhinolaryngology – based on the STROBE statement. PLoS ONE 2017;12:e0169316.
12. Korevaar DA, Wang J, van Enst WA, Leeflang MM, Hooft L, Smidt N, et al. Reporting diagnostic accuracy studies: Some improvements after 10 years of STARD. Radiology 2015;274:781–9.
13. Chambers C. The registered reports revolution: Lessons in cultural reform. Significance 2019;16:23–7.
14.
15.
16. Begg C, Cho M, Eastwood S, Horton R, Moher D, Olkin I, et al. Improving the quality of reporting of randomized controlled trials. The CONSORT statement. JAMA 1996;276:637–9.
17. Fernández E, STROBE group. Estudios epidemiológicos (STROBE) [Observational studies in epidemiology (STROBE)]. Med Clin (Barc) 2005;125(Suppl 1):43–8.
18. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, et al. Towards complete and accurate reporting of studies of diagnostic accuracy: The STARD initiative. Standards for Reporting of Diagnostic Accuracy. Clin Chem 2003;49:1–6.
19. Ogrinc G, Mooney SE, Estrada C, Foster T, Goldmann D, Hall LW, et al. The SQUIRE (Standards for Quality Improvement Reporting Excellence) guidelines for quality improvement reporting: Explanation and elaboration. Qual Saf Health Care 2008;17(Suppl 1):i13–32.
20.
21. Oxman A, Guyatt G. Validation of an index of the quality of review articles. J Clin Epidemiol 1991;44:1271–8.
22. Kung J, Chiappelli F, Cajulis OO, Avezova R, Kossan G, Chew L, et al. From systematic reviews to clinical recommendations for evidence-based health care: Validation of revised assessment of multiple systematic reviews (R-AMSTAR) for grading of clinical relevance. Open Dent J 2010;4:84–91.
23. Whiting P, Rutjes AW, Reitsma JB, Bossuyt PM, Kleijnen J. The development of QUADAS: A tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Med Res Methodol 2003;3:25.
24.
25. Tong A, Sainsbury P, Craig J. Consolidated criteria for reporting qualitative research (COREQ): A 32-item checklist for interviews and focus groups. Int J Qual Health Care 2007;19:349–57.
26. Mays N, Pope C. Qualitative research in health care: Assessing quality in qualitative research. BMJ 2000;320:50–2.
27. Ten Have H, Gordijn B. Medical epistemology. Med Health Care Philos 2017;20:451–2.
28. Kemper JM, Wang HTY, Ong AGJ, Mol BW, Rolnik DL. The quality and utility of research in ectopic pregnancy in the last three decades: An analysis of the published literature. Eur J Obstet Gynecol Reprod Biol 2020;245:134–42.
29. Rajabally YA, Fatehi F. Outcome measures for chronic inflammatory demyelinating polyneuropathy in research: Relevance and applicability to clinical practice. Neurodegener Dis Manag 2019;9:259–66.
30. Scott-Findlay S, Pollock C. Evidence, research, knowledge: A call for conceptual clarity. Worldviews Evid Based Nurs 2004;1:92–7.
31. Shaw D. The quest for clarity in research integrity: A conceptual schema. Sci Eng Ethics 2019;25:1085–93.
32. Montenegro-Montero A, García-Basteiro AL. Transparency and reproducibility: A step forward. Health Sci Rep 2019;2:e117.
33. Altman DG, Moher D. Declaration of transparency for each research article. BMJ 2013;347:f4796.
34. Davidoff F. News from the International Committee of Medical Journal Editors. Ann Intern Med 2000;133:229–31.
35. Nosek BA, Alter G, Banks GC, Borsboom D, Bowman SD, Breckler SJ, et al. Scientific standards. Promoting an open research culture. Science 2015;348:1422–5.
36. Bordage G. Reasons reviewers reject and accept manuscripts: The strengths and weaknesses in medical education reports. Acad Med 2001;76:889–96.
37. Pierson DJ. The top 10 reasons why manuscripts are not accepted for publication. Respir Care 2004;49:1246–52.
38. Begley CG, Ioannidis JP. Reproducibility in science: Improving the standard for basic and preclinical research. Circ Res 2015;116:116–26.
39. Munafo MR, Nosek BA, Bishop DVM, Button KS, Chambers CD, du Sert NP, et al. A manifesto for reproducible science. Nat Hum Behav 2017.
40. Wallach JD, Boyack KW, Ioannidis JPA. Reproducible research practices, transparency, and open access data in the biomedical literature. PLoS Biol 2018;16:e2006930.
41. Collins FS, Tabak LA. Policy: NIH plans to enhance reproducibility. Nature 2014;505:612–3.
42. Plesser HE. Reproducibility vs. replicability: A brief history of a confused terminology. Front Neuroinform 2018;11:76.
43.
44. Goodman SN, Fanelli D, Ioannidis JP. What does research reproducibility mean? Sci Transl Med 2016;8:341ps12.
45. Indrayan A, Mishra A. The importance of small samples in medical research. J Postgrad Med 2021;67:219–23.
46. Indrayan A, Holt M. Concise Encyclopedia of Biostatistics for Medical Professionals. Boca Raton: CRC Press; 2017.
47. Holme C. Cultivate absolute accuracy in observation and truthfulness in report. J Adv Nurs 2020;76:1093–4.
48. Andrade C. HARKing, cherry-picking, P-hacking, fishing expeditions, and data dredging and mining as questionable research practices. J Clin Psychiatry 2021;82:20f13804.
49. Tarran B. New year, familiar problems. Significance 2020;17:1.
50. Booth A, Brice A. Evidence-based Practice for Information Professionals: A Handbook. London: Facet Publishing; 2004.
51. Ioannidis JP. Why most clinical research is not useful. PLoS Med 2016;13:e1002049.
52. Knottnerus JA, Tugwell P. How to write a research paper. J Clin Epidemiol 2013;66:353–4.
53. Cooper ID. How to write an original research paper (and get it published). J Med Libr Assoc 2015;103:67–8.
54. Zeiger M. Essentials of Writing Biomedical Research Papers. 2nd ed. New York: McGraw-Hill; 1999.
55. Indrayan A. Reporting of basic statistical methods in biomedical journals: Improved SAMPL guidelines. Indian Pediatr 2020;57:43–8.
56. Jacobs R, Goddard M, Smith PC. How robust are hospital ranks based on composite performance measures? Med Care 2005;43:1177–84.
57. Sandercock P, Whiteley W. How to do high-quality clinical research 1: First steps. Int J Stroke 2018;13:121–8.
58. Jadad AR, Moore RA, Carroll D, Jenkinson C, Reynolds DJ, Gavaghan DJ, et al. Assessing the quality of reports of randomized clinical trials: Is blinding necessary? Control Clin Trials 1996;17:1–12.
59. Catillon M. Trends and predictors of biomedical research quality, 1990–2015: A meta-research study. BMJ Open 2019;9:e030342.
60. Higgins J, Altman D, Gøtzsche P, Jüni P, Moher D, Oxman AD, et al. The Cochrane Collaboration's tool for assessing risk of bias in randomised trials. BMJ 2011;343:d5928.
61. Olivo SA, Macedo LG, Gadotti IC, Fuentes J, Stanton T, Magee DJ. Scales to assess the quality of randomized controlled trials: A systematic review. Phys Ther 2008;88:156–75.
62. Gabriel A, Maxwell GP. Reading between the lines: A plastic surgeon's guide to evaluating the quality of evidence in research publications. Plast Reconstr Surg Glob Open 2019;7:e2311.
63. Indrayan A, Malhotra RK. Medical Biostatistics. 4th ed. Boca Raton: Chapman & Hall/CRC Press; 2018.
64. Montagna E, Zaia V, Laporta GZ. Adoption of protocols to improve quality of medical research. Einstein (São Paulo) 2020;18:1–4.
65. Ueda R, Nishizaki Y, Homma Y, Sanada S, Otsuka T, Yasuno S, et al. Importance of quality assessment in clinical research in Japan. Front Pharmacol 2019;10:1228.
66. Glasziou P, Vandenbroucke J, Chalmers I. Assessing the quality of research. BMJ 2004;328:39–41.
[Table 1], [Table 2], [Table 3]