Problems in the Evaluation of College Teaching in Music

October 1, 1992

Evaluation of college music instruction must be performed to assess the work of several different missions pursued by college music faculty. In the typical college music department, the main missions are those of the academic teachers of music classes, the conductors of performing groups, and the studio teachers of solo performance. Frequently, a single faculty member will serve in more than one of these roles.

The evidence examined to evaluate college music teaching, however, is used to make decisions of widely differing importance. These decisions include the annual merit salary review, the annual review of untenured faculty, review for the granting of permanent tenure, and finally review for awarding promotion in rank.

Administrators and evaluation committees are left with the problem of varied evidence being used to make different decisions about different people pursuing different missions. This shows the diversity of functions performed by the college music department, and it establishes a clear need for a methodical approach to gathering and analyzing evidence of quality in music teaching at the college level.

Some music departments have taken a highly informal approach to evaluating the quality of instruction. This type of approach is characterized by a lack of any structure which would determine how faculty are identified as candidates for evaluation, who is responsible for carrying out assessment activities, what type of information may legitimately be gathered, and on what schedule the evaluation will progress.

Informality when conducting evaluations usually remains a popular approach until the first strong-willed people hear the first bad news to come out of their own evaluation. Sooner or later the unstructured approach will be challenged, and it will usually fail to withstand the test. Many informal evaluation methods would not even be upheld if tested in the institution's own grievance procedure.

When the approach to the evaluation of teaching is unstructured, the lack of systematic decision making may lead to judging the quality of teaching by the flawed absence-of-complaints model. In this model, for example, an administrator might tell a faculty colleague that he must be doing well because she hasn't gotten any complaints about him lately. This approach to evaluation could tempt faculty to perform some of their teaching and evaluating duties in a way that might be interpreted as bribery of their students: requiring little work, and giving high grades to everyone.

Systematic evaluation carried out through a constant review of teaching will promote early attention to problems and constructive work to resolve these problems. It will also reward work that is found to already be good.

There are three main reasons to evaluate instruction: for the benefit of the institution's students, for the instructor's self-improvement throughout his or her career, and for eventual decisions of tenure and promotion. This paper will concentrate on the personnel decisions of tenure and promotion because the importance of these decisions is so great in a college teacher's career.


Characteristics of a Model for the Evaluation of College Teaching in Music

What are the essential characteristics of a model for the evaluation of college teaching in music? It must have criteria for determining what is allowable input information. The most basic criteria are that there must be no blockage of information flowing in from evaluators, all incoming information must be assessed for its own validity (meaning accuracy or truthfulness), and all incoming information should be subject to challenge by the person whose teaching is being evaluated.

It is expected that the nature of the input in a responsible evaluation model will range along continua of objective to subjective, student centered to faculty centered, and inside the subject matter area, department, or institution to outside the subject matter area, department, or institution. This very diversity of input will do much to promote fairness in the evaluation.


A Model for the Evaluation of College Teaching in Music

At this point I will begin to describe what I call a responsible model for the evaluation of college instruction in music. As I describe the elements of this model, I will also discuss some of the problems that typically come to those who work with this kind of evaluation. The problems I will examine come from those I have experienced as a faculty member, as a member of the evaluation committee, and as a department-level administrator. Crucial problems were also suggested by a number of music faculty and administrators I interviewed on this topic. The model itself can be described in terms of the kinds of information that are taken in for evaluation, and what is done to process and interpret that information. Although this article deals only with the evaluation of teaching, it should be noted that most universities base their tenure and promotion decisions on a consideration of the candidate's research or creative activity and public service as well as on teaching.


Sources of Information in the Model

1. Student Instructional Rating Forms

Student instructional rating forms, widely known by their acronym SIRS (Student Instructional Rating Scale), stand as one of the most commonly used means of getting information about teaching. They carry elements of both subjective and objective evaluation, they can be made specific to the music department and the subject matter area within music, and they are student centered.

Students are well qualified to report what actually goes on in the classroom, studio, or rehearsal hall, but they are not so well qualified to judge a teacher's scholarship or reputation within the national or international profession. It is important that the items on this form focus mainly upon matters that students are well qualified to judge.

Music departments must decide whether or not to use the same form for each faculty specialty. When different versions are used, the department will then have to guard against the temptation to interpret results as if they all came from the same form.

It is possible, however, to create different forms which have comparable content and use the same numerical reporting scale. This was recently done in the School of Music at Michigan State University. A faculty committee spent two years designing, pilot testing, and refining a set of student instructional rating forms for music classes, applied teaching, and performance ensembles.1

Student evaluations offer a huge amount of evaluative input in proportion to the small amount of time that is needed to secure them. There will always be room for different interpretations when student evaluations are considered, so this form of input information should always be taken through formal procedures for distribution, collection, and interpretation of the forms.

The goal of distribution rules is to assure that each and every student receives a form, to assure that students choose to fill them out honestly and return them without fear, and to prevent the inclusion of spurious information, for example one person completing two forms. The best protection against occasional student dishonesty is high overall participation, in which the few who might lie have their influence diluted by the many who would not. Tight procedures in handling the forms should make it impossible for one student to complete more than one form.
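The dilution argument can be illustrated with simple arithmetic. The sketch below is purely hypothetical: the 1-to-5 scale, the class sizes, the ratings, and the function name mean_rating are all invented for illustration and are not taken from any actual rating form.

```python
from statistics import mean

def mean_rating(honest, spurious):
    """Average of all returned ratings, honest and spurious together."""
    return mean(list(honest) + list(spurious))

# Hypothetical data: every honest student rates the teacher 4 on a 1-5 scale,
# and two spurious forms rate the teacher 1.
low_participation = mean_rating([4] * 4, [1] * 2)    # only 6 forms returned
high_participation = mean_rating([4] * 38, [1] * 2)  # 40 forms returned

print(low_participation, high_participation)
```

Under these assumed numbers, the two spurious forms pull the average from 4 down to 3 when only six forms are returned, but only to 3.85 when forty forms are returned.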

The most typical areas examined in the student instructional rating scale include the instructor's ability to plan, organize, and communicate; the quality of interaction between instructor and student; the perceived appropriateness of the student work load; the instructor's sense of professionalism and ethics; the fairness of grading procedures; and the amount of material learned in the course. Some of the newer scales are adding items about the instructor's sensitivity to issues of race, gender, and ethnicity. It is important to include an open-ended question after the last multiple choice item to pull in information not covered by specific items on the form.

There are a number of desirable practices which optimize use of the student instructional rating scale. Probably the most important of these is that students should respond anonymously. Some authorities feel that the teacher should be out of the room while the ratings are being filled out. A student should collect the ratings in an envelope, and the envelope should be sealed as soon as all students have turned in a rating. As an added safeguard against possible tampering, the instructor could place a strip of tape over the sealed envelope flap and sign his or her name on the tape strip. The student should then deliver the envelope of rating forms to the designated person, usually a department secretary.

Authorities such as Braskamp, Brandenburg, and Ory2 recommend that rating forms be filled out during class, and that regular class time be allotted for this purpose. They also recommend that ratings be collected during the last two weeks of classes rather than during or after the final examination. Braskamp, Brandenburg, and Ory provide a table on pages 44 and 45 listing many factors that have been shown by previous research to influence the student rating of class instruction.

It is important not to force each student to participate in filling out the instructional rating form. In terms of research design and principles of sampling, it would be better for every student to participate, but I have found a number of foreign students over the years who feel it is highly inappropriate for a college student to rate his or her teacher. This viewpoint from other cultures must be recognized and respected.

2. Verbal or Written Evaluations by Students

Another important source of information about the quality of instruction is the verbal or written evaluations volunteered by current or former students. These evaluations would be in addition to the basic responding on the standard SIRS form which is completed by every student. It is always much better if these additional evaluations are written. Verbal reports will consume too much administrator time, and they will introduce an unnecessary risk of misinterpretation if they must be summarized and reported verbally.

Verbal complaints from students can easily place a teacher at an unfair disadvantage because they are usually directed toward administrators, and the typical time demands placed on administrators tend to discourage them from conducting a wide-ranging follow-up. The administrator may simply be too busy to check this complaint by contacting the teacher involved, and the administrator may well be reluctant to do this if the teacher is thought likely to respond with hostility or indignation. However, the administrator will probably remember the negative information provided in this conference with a student, whether the information was entirely valid or not.

An effective denunciation delivered in private can unfortunately have an effect equal to that of scores of favorable written evaluations. It is possible that a teacher who suffers from this criticism will never learn about it in time to mount an effective defense; indeed, he or she may never learn about it at all.

The solution to this problem is for administrators to make sure they are getting a sufficiently large sample of student opinion when harsh criticism of a teacher is delivered verbally and in private. All complaints should be backed up in writing and signed by the complaining individuals, but the names of the complaining individuals should be held confidential by the department administration.

It is possible for a complaining individual to have been motivated by invalid considerations, for example by a personal dislike of the teacher. To guard against this, the responsible administrator should interview a randomly drawn sample of other students from the same class that was the source of the complaint.
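One way an administrator might draw such a sample is sketched below. This is a minimal illustration under stated assumptions, not a prescribed procedure: the roster, the student identifiers, the sample size, and the function name draw_interview_sample are all hypothetical.

```python
import random

def draw_interview_sample(roster, complainants, k, seed=None):
    """Randomly pick up to k students from the class roster,
    leaving out the students who made the original complaint."""
    excluded = set(complainants)
    pool = [student for student in roster if student not in excluded]
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(pool, min(k, len(pool)))

# Hypothetical class of 25 students; one of them lodged the complaint.
roster = [f"student_{i:02d}" for i in range(1, 26)]
sample = draw_interview_sample(roster, ["student_07"], k=5, seed=1992)
print(sample)
```

Drawing the names at random, rather than letting the administrator or the teacher choose them, keeps either party from steering the follow-up toward students known to be friendly or hostile.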

Student evaluations of this type should be in the form of letters in order to balance the great specificity of the typical student instructional rating scale. Because they are a free-form source of information, these letters constitute powerful evidence of a teacher's achievement or deficiency if a number of different writers, who wrote in different years, all point to the same aspects of this person's teaching.

One form of this input, the signed testimonial letter, can be very powerful when authored by former students who have gone on to distinguish themselves in the profession. These reports will be especially authoritative if they have accumulated over a period of years.

If the music department already has an established tradition of evaluating teaching, it will tend to already receive letters of evaluation from the two extremes of opinion: students who feel they have been very well or very poorly served by a teacher. The task of the department administrator is to obtain and keep on file a large enough sample of these letters so that they present an accurate sample of student views.

3. "Archeological" Evidence of Performance as a Teacher

The information included under this rather strange heading is the kind of thing that can be "unearthed" in studies of old department records, newsletters, memos to the faculty, and programs for award ceremonies. It would include the record of student enrollment under the teacher being evaluated, with a tacit assumption that this should be compared to enrollment figures of other teachers who have taught the same or comparable courses. Such a comparison will not be a fair one unless factors such as the course's meeting time, other courses or required activities offered during the same hour, and a wealth of other considerations can all be held equal between the instructors being compared.

The number of invitations from students to serve on oral examination committees or guidance committees for graduate degree programs has been suggested as an indicator of good teaching at the graduate level, with the assumption that more invitations will be extended to those who do the best teaching. However, it is possible that a high number of invitations could also reflect a teacher's personal popularity or expected leniency as a committee member rather than his or her actual teaching achievement.

Another factor which must be considered when counting the number of committee invitations is the question of which instructors the students are exposed to in the process of taking courses required for their degrees. If all other things are kept equal, students are more likely to invite professors with whom they are already familiar to serve on their committees.

Nominations and/or awards made for excellence in teaching are another possible indicator of excellence. It will be useful in this case to note whether the award was primarily determined by faculty or by students, or if the award was a cooperative project administered by both faculty and students.

The specific achievements of a teacher's former students can serve as appropriate input under this heading. This is a difficult thing to judge unless the student's association with this teacher was long and highly specific in terms of professional work. It seems reasonable, for example, to allocate credit to a student's performance teacher of many years when that student attains a first chair position with a major symphony. But for how long should a mentor receive this kind of credit? It becomes a much harder evaluation problem to try to determine what stake the same student's music theory or music history teachers may have had in that student's eventual success.

4. Teacher Products

Only one step away from archeological evidence of a teacher's effectiveness is the evidence which can be assembled and submitted by the teacher. No evaluation of teaching performance is complete without an examination of teacher products. These would typically include course syllabi, reading lists, course calendars, bibliographies and discographies, homework assignments, study guides, examinations, overhead projector transparencies, slides, audio and video tapes, and so forth. Under this heading teachers should also get credit for their work in curriculum development and for any work in research and publication which assists them in their teaching mission.

Teacher products are very attractive components of the overall evaluation model because of their objectivity and permanence. They present little danger of misinterpretation, and they may be sent to outside experts in the appropriate field for authoritative evaluation.

The teacher products category will tend to favor highly organized people who create a large number of clear-cut materials. If a teacher's main appeal is based upon personality or the unique quality of his or her interaction with students, these strengths will not necessarily be reflected in teacher products.

5. Classroom Observation

Classroom observation of actual instruction can function as a source of information about performance as a teacher. Information may be obtained as unobtrusively as by placing an unattended video camera on a tripod aimed more or less in the direction of the teacher, or it can be taken through the comparatively dramatic appearance of the music department's entire promotion and tenure committee accompanied by the department administrator.

Videotape offers the advantages of objectivity and unobtrusiveness. Videotaping will probably not excite students in the way that a visit from a committee or a high-ranking administrator would. Videotape may be viewed and reviewed by the teacher in private for the purposes of self-improvement, and it can be sent to outside experts for their own evaluation. Different sections of the tape may be rewound and replayed as needed.

"Live" observation in the classroom offers a more flexible and a more complete sampling of teaching performance than is usually possible with videotape. If a video camera is unattended, the camera will not be able to move to follow the teacher around the room. Important aspects of teaching may not get sampled at all. A live observer, on the other hand, may use a standard checklist to see that certain basic aspects of teaching will always be observed and rated.

A crucial element of either videotape or live observation is the need to obtain a sufficient sample of the teacher's performance. If the teacher is observed on an unusually good or bad day, it will distort the representation of that person's teaching. Braskamp, Brandenburg, and Ory3 recommend three or four observations per class per semester. Observations should be made by different colleagues or administrators, not always by the same individual. The same authors also caution against undue reliance upon observation in cases where there is sharp philosophical difference or even personal animosity between the teacher and the observer. This can be a particularly serious problem if the same individual serves as chair of the teaching area and de facto observer for all faculty in that area for a long period of years.

6. Subjective Evaluation by Colleagues, Administrators, and Outside Consultants

The final form of information conceptualized in this model is subjective evaluation of the teacher by colleagues, administrators, and outside referees. This evaluation usually takes the form of a letter to the music department administrator or promotion and tenure committee. It is not the same thing as a report on a classroom observation, a critique of a research article, or a review of a performance recital, though it could well be based upon the same considerations as any of these reports. The distinguishing attribute of this form of evaluation will be the fact that it is a global summary bringing together a fair assessment of many different forms of information.

When a teacher's colleagues or administrators provide this type of evaluation, they are usually asked to use every bit of accurate information available to them in preparing their evaluative summary. When an outside referee is asked to write a report, it is often preferable that this referee already be familiar with the teacher's work. Sometimes, however, it may be desirable to have a report from a referee who does not know the teacher being evaluated. This might be the case when the review is expected to be controversial, or when an earlier evaluation is being challenged in court or through a grievance procedure.

Many review procedures allow the teacher being evaluated to nominate a certain proportion of the referees who will be called upon for input, and the usual purpose of this is to assure that at least some of the referees will be inclined to favor the teacher. Typical referees would include both local colleagues and people from outside the institution. A vexing procedural problem can occur when a referee proposed by the teacher declines to serve but proposes someone else. There is no assurance that this third party would have been nominated by the teacher, or that he or she will have any inclination to favor the teacher. The best solution for this kind of problem is for the administrator to ask for extra referees from the teacher so that others can be called upon if the first ones decline.


Applying the Model

Application of this evaluation model is relatively straightforward. There are three consecutive phases which must be completed before a valid decision can be made about the work of the person being evaluated. Those phases are the administrative preparation for evaluation, the sampling of information about teaching performance, and the interpretation of information brought in by the evaluation procedure.

1. Administrative preparation

The most difficult thing about the administrative preparation for a good evaluation of teaching is the need to manage the timetable of events. It will be difficult to delegate some things because of the need for confidentiality. Letters will be needed, and phone calls will probably also be needed to seek replies to letters. Higher administration can change the timetable during the evaluation process. For a number of practical reasons, it will usually be better for a department-level administrator rather than the chair of a faculty committee to perform the duties of administrative preparation for an evaluation.

2. Sampling of Information

The process of evaluating college music instruction can be seen as nothing more nor less than drawing a scientifically balanced sample. Unfortunately, sampling is one of the areas in which things may most easily go wrong in the actual evaluation. When sampling is well done and information is taken for every element in the model, the good quality of the sample will tend to protect the evaluation. An evaluation based upon good sampling will resist problems even as severe as some sources of input having an unfair bias for or against the teacher.

3. Interpretation of Information

The final component of the evaluation is the well-considered interpretation of the information that was collected. This requires the prior existence of a competent structure explaining what the music department expects of its teachers, it requires a fair-minded committee, and it requires committee members who will attend as many meetings as are necessary to interpret the information obtained.

Evaluative information should be considered in the framework of a faculty member's official duties, and these duties should have been spelled out in adequate detail at the time he or she was first hired. Sometimes these duties will change before tenure is awarded or promotion is achieved, and it then becomes the duty of the department administrator to issue a written statement of newly defined duties.

Most departments will find it utterly impossible to assemble a tenure and promotion committee which is completely free of preferences for some faculty and a lesser preference for others. Mild preferences are not the problem, however. It is deep-seated animosity between individuals that will cause the most trouble, and the most effective way to combat this is simply to make sure that membership on the tenure and promotion committee will significantly change from year to year, and to permit faculty to have some control over the year in which they come up for tenure and promotion. The only good way to address the question of meeting attendance is to conduct no business whatever when one or more members of the committee are absent.

The final decision in tenure and promotion reviews is usually made at a level much higher in the university than the music department. The department will be well served, however, if its own reviews are carried out in an exemplary manner. A high-quality evaluation carried out by the department will give the department's recommendation a great deal of weight at every level of consideration. The time needed to perform a good evaluation of teaching will be time well spent by the music department.

1Sample copies of these forms are available on request from Albert LeBlanc, School of Music, Michigan State University, East Lansing, MI 48824-1043.

2Larry A. Braskamp, Dale C. Brandenburg, and John C. Ory, Evaluating Teaching Effectiveness: A Practical Guide (Newbury Park, CA: Corwin Press, Sage Publications, 1984), 51.

3Ibid., 66.

Last modified on October 22, 2018