More on proprioception
A couple of posts back I blogged about a new tool in development called the Comprehensive Observations of Proprioception. I was a little surprised by the editorial decision to publish an article about performance on the tool without first publishing about the tool itself. Now we have a paper on the tool itself - so the ordering of publication is a question for the AJOT editors, not the authors of the paper.
The authors describe the tool as a criterion-referenced observational measure. The test includes 18 items that purportedly represent aspects of proprioceptive function, and the authors use a literature review as one means of substantiating the content validity of the items. As I mentioned in my original post on this matter, I am concerned that some of these items might represent some aspect or measure of proprioception - but then again, they might not. More than a quarter of the items are behavioral observations like 'overactive' and 'enjoyment when being pulled,' and it is unclear to me whether these are functional and discriminating measures of a proprioception construct.
Use of content experts to establish validity is a well-established mechanism in item development. However, the authors report disregarding the content experts on some items: even though the expert ratings did not reach the criterion for inclusion, the authors believed the items should be kept because they are represented in the literature.
Some of the items the experts excluded were curious. They excluded 'Muscle tone is hypotonic' but included 'Decreased muscle tone.' This is very difficult to understand without the benefit of operational definitions for all of these items. A Likert scale was reportedly used to rate item performance from (1) typical performance to (5) most severe form of proprioceptive processing difficulties. If this is a criterion-referenced scale, I was left wondering what the cutoffs are and how mild or moderate forms of proprioceptive processing difficulty were operationally defined, particularly on the 18 items listed. The authors report that the test can be taught, after a brief training, to a therapist with a minimum of two years of experience. I am struggling to understand whether there are operational definitions for this Likert scale that discriminate between ratings of these very vague behavioral observations - and if there are, how could a brief training be enough to reliably rate a child on this tool? If there were true criterion points, this could be very complicated.
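To put some numbers on that reliability worry (a hypothetical sketch - the ratings below are invented, not the authors' data): chance-corrected agreement statistics like Cohen's kappa fall off quickly when raters apply vaguely defined categories, even when the raters mostly agree on the general direction of a child's performance.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters scoring the same children."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Agreement expected if each rater assigned categories independently at chance
    expected = sum(counts_a[c] * counts_b.get(c, 0) for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical 1-5 Likert ratings of ten children by two briefly trained raters
# who agree on direction but often differ by one scale point.
rater_a = [1, 2, 2, 3, 3, 4, 4, 5, 2, 3]
rater_b = [1, 3, 2, 2, 4, 4, 3, 5, 2, 4]
print(round(cohens_kappa(rater_a, rater_b), 2))  # prints 0.35
```

Raw agreement here is 50%, but kappa is only about 0.35 - "fair" agreement at best - which is the kind of result you would expect from a brief training on loosely defined rating anchors.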
This leads into a significant limitation of this tool: it relies on a restricted set of therapists working in a common clinic or set of clinics. This is nothing new - I talked about it in 2009: "It is a pertinent issue - you can't talk about broad validity of constructs when you only include people who are all drinking the same Kool-Aid, so to speak." This is the criticism that convenience sampling invites, which the paper rightly identifies, but it still underscores a really big problem.
Additionally, the tool would be strengthened considerably if evaluators were blinded to diagnostic condition before running comparison analyses between children with known problems and typically developing children.
It would also be strengthened if it didn't use parent report measures as the criterion for validity. For example, is it any surprise that this test correlates with the Body Awareness section of the Sensory Processing Measure? If one test asks a parent to comment on "Bumps or pushes other children" and the other test includes observations of "Crashing/Falling/Running," what else would we expect?
Finally, the test also relies on the Kinesthesia subtest of the SIPT - and this is a major flaw, because any factor analysis is only as good as the data going into it. The KIN subtest is widely known to be weak and unreliable - so why use it as a point of criterion validity at all? There are items on the SIPT that are arguably stronger measures of proprioceptive processing, such as the eyes-closed measures on the SWB subtest, or perhaps even Oral Praxis. Why include the weakest proprioceptive measure on the SIPT? I am generally aware of some work done by graduate students at Brenau University last year on a pilot revision of the Kinesthesia test, but I don't believe it has been published yet. My point is that I know I am not the only person who thinks the KIN test is flawed. Garbage in, garbage out, as they say - which makes it hard to know what to do with a factor analysis that relies on the weakest of all the SIPT subtests.
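The "garbage in, garbage out" point has a classical psychometric form: Spearman's attenuation formula says an observed validity correlation can never exceed the square root of the product of the two measures' reliabilities. So an unreliable criterion like the KIN subtest puts a hard ceiling on what any analysis built on it can show. A minimal sketch (the reliability values are illustrative assumptions, not figures from the COP paper):

```python
import math

def max_observable_correlation(reliability_x, reliability_y):
    """Upper bound on the observed correlation between two measures,
    given their reliabilities - Spearman's attenuation formula with a
    perfect true-score correlation of 1.0."""
    return math.sqrt(reliability_x * reliability_y)

# Illustrative reliabilities only: a new observational tool at 0.80
# and a weak criterion subtest at 0.50.
print(round(max_observable_correlation(0.80, 0.50), 2))  # prints 0.63
```

Even if the COP measured proprioception perfectly, a criterion with a reliability of 0.50 would cap the observable validity correlation around 0.63 - and any factor loadings computed from such data inherit the same noise.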
So there is my blunt analysis - and I am attempting to be as constructive as possible, because researchers presumably benefit from end-users who read their work and try to determine how to use it. Is proprioceptive processing an important construct? It sure is. Do kids who have difficulties sometimes have difficulties with proprioceptive processing? I believe they do.
But to me this tool is too weak. Maybe I would feel differently if I saw operational definitions for these items, operational definitions for the Likert scale, and some understanding of how the cutoffs were determined. I would also like to see evaluators blinded when discriminatory item analyses are run - and I would like to know that content experts were broadly represented, so there was no danger of everyone thinking with a regional mindset.
I look forward to more, so we can continue to move these kinds of ideas forward.
Blanche, E. I., Bodison, S., Chang, M., & Reinoso, G. (2012). Development of the Comprehensive Observations of Proprioception (COP): Validity, reliability, and factor analysis. American Journal of Occupational Therapy, 66, 691-698.