The development of validated outcome measures is important for improving the practice of evidence-based medicine. Patient-reported variables reflect a patient’s function more accurately than clinician-assessed variables.21 The most relevant single outcome measure for athletes is a return to the preinjury level of functioning and performance in their usual sport.22 Contact athletes may experience symptoms only during training and competitive sport, not during activities of daily living. There is no outcome measure designed and validated specifically for shoulder injuries in rugby players. Established shoulder scoring systems that are currently used to assess shoulder injuries, such as the Constant Score,6 Oxford Shoulder Score8 and Oxford Shoulder Instability Score,9 were developed for use in the general population and have not been validated for use in any group of contact athletes. These instruments either include clinician-assessed variables or focus on activities of daily living. Consequently, existing shoulder scoring systems are not suitable for evaluating shoulder function in high-demand contact athletes such as rugby players.
Three existing scoring systems have been designed for shoulder injuries in athletes. Tibone and Bradley developed a scoring system for shoulder function in athletes. This was modified by Kuhn and Hawkins to create the Athlete Shoulder Assessment Tool.18 The third existing scoring system is the Kerlan-Jobe Orthopaedic Clinic Score.19 None of these scoring systems has been designed or validated for use in any group of contact athletes. The purpose of this research was to develop and validate an athlete-reported scoring system for the specific evaluation of shoulder function in rugby players.
The first stage in developing an athlete-reported shoulder scoring system was to review all existing scoring systems that are relevant to the shoulder to generate a list of potential items for the new scoring system. The literature search identified 61 distinct scoring systems, 46 of which were retrievable. A review of the use of outcome scores in shoulder surgery in 2005 identified a total of 44 different shoulder scores.23 This suggests that the literature search in the item generation phase was comprehensive. The majority (60.9%) of reviewed scoring systems included patient-assessed items only (table 1). This most likely reflects the trend over the last decade to develop patient-reported outcome measures.23
A review of the distribution of items in scoring systems across different domains confirmed that most scoring systems evaluated power (56.5%), range of movement (71.7%), function in necessary activities of daily living (87%) and pain (76.1%). Only a minority of scoring systems (28.3%) directly assessed patient satisfaction, which is now recognised as a reliable indicator of outcome.
Following the literature review and completion of questionnaires by professionals, 20 items were selected for the provisional scoring system by calculating the FIP.15–17
One limitation of the methodology in developing this scoring system is that only eight players were interviewed to determine the FIPs of 105 potential items. A balance must be struck between the number of items reviewed and the number of players interviewed. As scoring systems for shoulder injuries in contact athletes or rugby players have not been developed previously, a comprehensive evaluation of items of relevance to rugby players was favoured over sampling a large number of players. The included players represented a range of positions, player ages and severity of injury (table 2). A limit of 20 items was chosen as this represented a reasonable maximum question load for a respondent. Redundant items within the scoring system can be identified and excluded with confidence only once more players have completed it. It is anticipated that this analysis can be performed when more players (n=100) have completed the scoring system, by eliminating items that correlate highly with each other when assessed using the Pearson correlation coefficient.
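The planned redundancy check could be sketched as follows. This is a minimal illustration only: the 0.9 threshold, the function names, the item labels and the scores are assumptions made for the example and are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def redundant_pairs(item_scores, threshold=0.9):
    """Return (item_a, item_b, r) for item pairs whose |r| meets the threshold.

    item_scores maps an item label to its list of scores, one per player.
    The threshold is an illustrative assumption, not a value from the paper.
    """
    flagged = []
    for a, b in combinations(sorted(item_scores), 2):
        r = pearson_r(item_scores[a], item_scores[b])
        if abs(r) >= threshold:
            flagged.append((a, b, r))
    return flagged


# Invented scores for five hypothetical players, for illustration only.
example_scores = {
    "item_10": [1, 2, 3, 4, 5],
    "item_11": [2, 4, 6, 8, 10],
    "item_12": [3, 1, 4, 1, 5],
}
flagged_pairs = redundant_pairs(example_scores)
```

With these invented data, only the pair of items that rise and fall together is flagged. Which member of a flagged pair to drop would remain a clinical judgement.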
Compared with previous outcome measures, this scoring system contains eight items (items 9, 12–18; table 7) that have not been identified in any previous shoulder scoring system. This indicates that several of the conditioning exercises and skills that are affected by shoulder injuries in rugby players, and are important for their return to full training and competitive sport, are not evaluated in existing shoulder scoring systems. The number of items in the scoring system covering specific domains shows a similar pattern to the domains covered by previous scoring systems (table 1).
The Flesch-Kincaid Grade Level and Flesch Reading Score were computed to assess the readability of the scoring system. The Grade Level for the new scoring system was 9 and the Flesch Reading Score was 46, indicating that the items were easy to comprehend. No rugby players demonstrated or reported any difficulty in understanding the items, indicating that the scoring system was suitably formatted for its intended purpose. The Likert scale response format, which is used in the majority of existing shoulder outcome measures (82.6%), was selected as it has been shown to be more reliable. Items were not weighted, as this is not necessary if items with very low FIPs are eliminated from the final scoring system.24
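For reference, the two readability indices reported above are conventionally defined by the formulas below. The symbols are introduced here for illustration: W is the total number of words, S the number of sentences and Y the number of syllables in the text. Whether the software used in this study applied exactly these standard forms is an assumption.

```latex
\[
\text{Flesch Reading Ease} = 206.835 - 1.015\,\frac{W}{S} - 84.6\,\frac{Y}{W}
\]
\[
\text{Flesch-Kincaid Grade Level} = 0.39\,\frac{W}{S} + 11.8\,\frac{Y}{W} - 15.59
\]
```

Both indices depend only on average sentence length and average syllables per word. A higher Reading Ease value and a lower Grade Level each indicate text that is easier to read.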
The ability of the items of the scoring system to measure the same general latent variable (shoulder function for rugby players in this research) can be assessed by estimating the tool’s internal consistency, which is statistically reported as Cronbach’s α. An ideal scoring system would contain items that are related and internally consistent, but which each also provide unique information. A Cronbach’s α result of less than 0.6 indicates a lack of cohesion, whereas a score of 0.8–0.95 indicates good consistency.8
The scoring system had an overall internal consistency result of 0.96 (table 5), suggesting that, although the items contribute to a cohesive scoring system, some redundancy may exist between them. The exclusion of any single item did not bring the internal consistency result within the desirable range (0.8–0.95). Two or more redundant items may therefore exist. Redundant items may include those assessing overhead function (items 10, 11 and 13), which had similar mean scores and SDs. These results provide supportive evidence for the development of a cohesive scoring system, but definitive identification and elimination of redundant items require further testing of the scoring system, as described above.
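The internal consistency analysis described above, including the check of whether excluding any single item changes the result, can be sketched in a few lines. This is a minimal illustration that assumes the standard formula for Cronbach’s α based on sample variances. The function names and the response data are invented for the example.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(responses):
    """Alpha recomputed with each item dropped in turn, indexed by item position."""
    k = len(responses[0])
    return [
        cronbach_alpha([row[:i] + row[i + 1:] for row in responses])
        for i in range(k)
    ]


# Invented responses (4 hypothetical players x 3 items), for illustration only.
example_responses = [
    [1, 2, 1],
    [2, 3, 2],
    [3, 4, 3],
    [4, 5, 5],
]
alpha = cronbach_alpha(example_responses)
alphas_without_item = alpha_if_item_deleted(example_responses)
```

In this toy data set the first two items move in lockstep, so α is very high. That is the pattern that suggests redundancy rather than added information.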
A reliable outcome measure should produce a similar result on repeated measures of a patient if their condition is unchanged. The most appropriate test statistic for evaluating reliability is the ICC.26
The interval period for reliability testing was set at 2 weeks, as this was deemed long enough for the players to have forgotten their previous responses but short enough to minimise the probability of their level of injury changing. The mean scores of all items were similar at the two time points (table 7). The ICC value was greater than 0.75 for 15 items, indicating acceptable reliability. A further four items demonstrated ‘fair’ reliability. The only item that showed poor reliability was ‘Carrying a ball with strength in the crook of your arm’ (ICC=0.392). This item will be eliminated from the scoring system if it shows similarly poor reliability after more subjects have been tested. The total score for the scoring system demonstrated excellent reliability (ICC=0.941, table 7). If this level of reliability is maintained after further validation, this would indicate that the scoring system could be used not only for large-scale research purposes but also for decision-making regarding treatments for individual athletes.16
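To make the test-retest statistic concrete, the sketch below computes one common form of the ICC from a players-by-occasions table. The text does not state which ICC model was used, so the choice of ICC(2,1) (two-way random effects, absolute agreement, single measurement) is an assumption. The function name and the scores are likewise invented.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    scores is a list of rows, one per player, each holding k repeated scores.
    """
    n = len(scores)
    k = len(scores[0])
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)      # between-player mean square
    ms_cols = ss_cols / (k - 1)      # between-occasion mean square
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Invented total scores at two time points for five hypothetical players.
test_retest_totals = [[10, 11], [20, 19], [30, 31], [40, 39], [50, 51]]
icc_value = icc_2_1(test_retest_totals)
```

Here the differences between players are much larger than the change within any player between occasions, so the ICC is close to 1. A different ICC model would give a somewhat different value from the same data.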
Eleven players were used to test the reliability of the scoring system over time. Although this may seem a small sample, it produced a statistically valid level of reliability using the ICC. The paired Student’s t test results (table 7) demonstrated no significant differences in either individual items or total score between the two time points, indicating that the distribution of the repeated results was not different, and providing further evidence of the reliability of the scoring system.
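The paired comparison of the two time points can be illustrated with the sketch below, which computes the paired t statistic and its degrees of freedom. The function name and the scores are invented for the example and do not reproduce the study data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for a paired Student's t test.

    Raises ZeroDivisionError if every paired difference is identical.
    """
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1


# Invented total scores for five hypothetical players at two time points.
first_occasion = [10, 20, 30, 40, 50]
second_occasion = [11, 19, 31, 39, 51]
t_value, dof = paired_t_statistic(first_occasion, second_occasion)
```

The t statistic is then compared with the t distribution on n−1 degrees of freedom. With the 11 players tested here that is 10 degrees of freedom, for which the two-sided 5% critical value is about 2.23. A statistics package can return the p value directly, for example `scipy.stats.ttest_rel` in SciPy, though using it here is an assumption about tooling and not something the text states.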
This research has developed a reliable provisional scoring system for the evaluation of shoulder function in rugby players. We are continuing this work by validating the scoring system for its responsiveness to change in injury status by recruiting rugby players with new shoulder injuries to complete the scoring system before treatment and during rehabilitation, and by determining the minimal important difference in the overall score that represents a significant change in the clinical condition.
What are the new findings?
This is the first study to define the most important aspects of shoulder function for professional rugby players.
This study identified eight new aspects of shoulder function that are important for rugby players and are not assessed by existing shoulder scores.
This study developed the first reliable athlete-reported scoring system for the assessment of shoulder injuries in rugby players.
How might it impact on clinical practice in the near future?
This scoring system may:
Provide an accurate assessment of injury severity and shoulder function in rugby players after shoulder injuries.
Determine the most appropriate treatment interventions for rugby players sustaining shoulder injuries.
Gauge rehabilitation of shoulder function in rugby players after shoulder injuries.
Contributors SBR designed the study, collected, analysed and interpreted the data, drafted the paper and approved the final version. LF was involved in the initial study conception, study design, interpretation of data and drafting of the paper, and approved the final version. SBR and LF can act as guarantors for the work described in this paper.
Competing interests None.
Ethics approval Research Governance and Ethics Committee of the University of Salford.
Provenance and peer review Not commissioned; externally peer reviewed.
▸ References to this paper are available online at http://bjsm.bmj.com