www.sciedu.ca/wje World Journal of Education Vol. 4, No. 1; 2014
Published by Sciedu Press  ISSN 1925-0746  E-ISSN 1925-0754

Debating International Learner Assessments as a Proxy Measure of Quality of Education in the Context of EFA - A Review Essay

Godfrey Mulongo 1,*

1 Regional Monitoring and Evaluation Specialist at the International Potato Center (CIP) and Ph.D. candidate, Witwatersrand University, South Africa
*Correspondence: Box 34424, Manara Road Plot 10, Dar-es-salaam, Tanzania. E-mail: mulongoe@gmail.com

Received: September 28, 2013    Accepted: December 12, 2013    Online Published: February 11, 2014
doi:10.5430/wje.v4n1p35    URL: http://dx.doi.org/10.5430/wje.v4n1p35

Abstract

This review essay looks at three publications that discuss the contentious issue of evaluating education quality (Note 1) by learner outcomes as a proxy indicator (Note 2). The essay explores the debates and gaps, and proposes recommendations in the context of Education For All (EFA) (Note 3). The three articles reviewed are Harvey Goldstein's (2004) "Education For All: the globalization of learning targets", Angeline Barret's (2009) "The education Millennium Development Goal beyond 2015: Prospects for quality and learners", and Daniel Wagner et al.'s (2012) article on "the debate on learning assessments in developing countries". Goldstein's and Barret's articles argue against adherence to numerical learner achievement targets and explore the possible consequences of such adherence, while the Wagner et al. (2012) articles argue in support of the same.

Keywords: assessment, learner outcomes, quality, indicator

1. Background

The joint UNESCO/UNICEF project on Monitoring Education-For-All Goals with a focus on Learning Achievement (Note 4) began in September 1992.
This was an immediate outcome of the World Declaration on Education-For-All, adopted at Jomtien in March 1990, which pointed to the need "to define acceptable levels of learning acquisition for educational programmes and to improve and apply systems of assessing learning achievement". The understanding was that merely improving the supply of education -- quantity -- was not enough; an improvement in quality was considered vital, as was the means to assess progress made on this front (UNESCO/UNICEF, 1994). However, international learner assessments did not start with the UNESCO/UNICEF project under EFA (Note 5). According to Larry Suter (Note 6), education researchers and policymakers from twelve countries first established a plan for making large-scale cross-national comparisons of student performance as far back as 1958, at the UNESCO Institute for Education in Hamburg, Germany. This led to the first successful large-scale quantitative international study in mathematics, conducted in 1965 by the International Association for the Evaluation of Educational Achievement (IEA), which included Australia, Belgium, England, Finland, France, Germany, Israel, Japan, the Netherlands, Scotland, Sweden, and the United States. Between 1965 and 2001 the IEA sponsored studies of mathematics in 1965, 1982, 1995, and 1999; science in 1970, 1986, 1995, and 1999; reading in 1970, 1991, and 2001; civics in 1970 and 1998; and technology in 1990 and 1999. The Educational Testing Service conducted an International Assessment of Educational Progress in science and mathematics in 1990. The first international learner assessment under the EFA framework began in 1992. Today, the main international assessment frameworks include the International Association for the Evaluation of Educational Achievement, or IEA (e.g.
TIMSS and PIRLS), the OECD (e.g. PISA), the Laboratorio Latinoamericano de Evaluación de la Calidad de la Educación, or LLECE (e.g.
TERCE), and the Southern and Eastern Africa Consortium for Monitoring Educational Quality (SACMEQ), among others. It may therefore seem late in the day, especially now that the EFA vision of 2015 is drawing nigh, to re-open the contentious debate on the use of international numerical learner assessments as a proxy indicator of quality of education. However, it is my contention that this debate was never exhaustively concluded. The paper will analyse the arguments in the three papers and suggest some recommendations from the perspective of a Monitoring and Evaluation (M&E) practitioner.

2. Harvey Goldstein's (2004) Article

Goldstein's article is a critique of the EFA literacy target; his major bone of contention is the inherent potential for distorting effects that 'high stakes' target setting can produce. Citing evidence from England and the USA, where national and state-wide testing were introduced in the 1990s, Goldstein adopts a theoretical, descriptive and normative approach to argue that in England, whilst test score levels in those aspects of the curriculum that are tested in public examinations have risen, there has been a backwash effect on learners and teachers that is detrimental to other aspects of quality. This includes a tendency to de-motivate pupils and increased test anxiety, especially amongst low achievers and teachers. Regarding Texas, Goldstein posits that although high-stakes testing rewarded schools or teachers on the basis of pupils' test scores, with subsequent large gains in student test scores, the gain over time for the same students on the national test was much less than that implied by the Texas test scores.
Technically, the writer contends that by adopting these quantitative targets, individuals are encouraged to adapt their behaviour in order to maximize perceived rewards ("even where this is dysfunctional in educational terms"). He also points out the near-impossible challenge of creating achievement tests that are culturally or educationally specific and thus suitable for particular socio-economic contexts. Goldstein observes that "if a measuring instrument is restricted only to those items for which we might assume there are no locally specific differences, there is then a real question about whether such an instrument is measuring anything useful" (Goldstein, 2004:9). Also, under the section on measuring targets, Goldstein identifies the problematic issues of designing a monitoring framework for any given project. He observes that learning outcomes under the EFA framework do not have a clearly set-out description of what form the relevant assessments might take. In essence, the writer is raising very pertinent issues related to the technical questions surrounding the design of international achievement tests. Finally, perhaps the most forthright section of the article is the one that explores some of the possible consequences of UNESCO's continued adherence to such targets. The writer strongly argues that the imposition of targets on institutions or school authorities can be viewed as an effective means of centralized control, that is, an increasing control of individual systems by institutions such as the World Bank, which may not only lead to the demoralization of poorly performing countries but may also allow the imposition from outside of systemic reforms under the heading of 'remedies'. He strongly recommends that each educational system should consider "developing different criteria for assessing quality, enrolment, etc. instead of monitoring progress towards an essentially artificial set of targets".

3.
Angeline Barret's (2009) Article

This article is an unequivocal supporter of Goldstein's (2004) paper, which argues against the quantitative measurement of learner outcomes as an indicator of quality of education. Specifically, the key issues on which the two writers agree include: the difficulty of contextualizing the assessments/tests to suit different international contexts; the potential unintended outcomes of the assessments for learners and teachers; and the inadequacy of the test instruments to comprehensively "measure learning". On the latter, Barret argues that using quantitative indicators to measure learning outcomes is tantamount to technically neglecting the complex, often culture-specific and political nature of education as a social practice. The writer, while arguing against the adoption of Filmer et al.'s (2006) proposal to replace the Millennium Development Goal (MDG) of universal completion of primary education with a Millennium Learning Goal (MLG), asserts that when learning outcomes are used as an indicator of quality, there is a tendency to privilege cognitive learning outcomes that are amenable to measurement by standardized testing. The writer is not a lone voice as far as this issue is concerned (Note 7). Education experts agree that assessment needs to be 'fit-for-purpose'; that is, it should enable evaluation of the extent to which learners have learned and the extent to which they can demonstrate that learning (Brown & Smith, 1997, in Brown, 2004). Other writers further propose that educationists need to consider not just what they are assessing and how they are doing it (particularly which methods and approaches), but also why (the rationale for assessing) and who (who should participate in the assessment) (Brown, 2004; Wagner, 2012).
Finally, like Goldstein (2004), Barret concludes that international quantitative assessments can have the unintended effect of impoverishing curricula and educational processes, as teachers and learners come under pressure to maximize scores in pen-and-paper tests, irrespective of whether this enhances useful learning outcomes. In essence, the writer finds the quantitative measurement of learner outcomes an inadequate indicator of quality of education.

4. Daniel Wagner et al.'s (2012) Articles

This is a group of papers that provides multiple perspectives in support of learner achievement assessments. The writers adopt various approaches to present their points. For instance, Daniel Wagner's article utilizes a hypothetical theoretical approach to argue in favour of quantitative international assessments. However, he is quick to add that the current learning assessments are a work in progress, and that no one yet has the perfect tool for the many goals and contexts that need to be addressed. Meanwhile, Marlaine Lockheed, in a paper titled "Policies, performance and panaceas: what international large-scale assessments in developing countries", cites the conclusions of four evaluations of the IEA assessments as well as three evaluations of SACMEQ to theorize in support of large-scale international assessments. Lockheed postulates that these assessments motivate regulatory and behavioural policy reforms around the content of teaching and learning; that they create a learning environment in which assessment specialists can improve their technical skills and related performance; and that they increase transparency regarding education system outcomes and human capital development in a cross-national context and support analytic work enabled by data sets.
In essence, giving reasons, the writer clearly discusses the significance of achievement assessments at the national level and is unequivocal in support of the same. Wagner's and Lockheed's papers are the most relevant in this group for the purposes of this review essay. The other papers include Ina Mullis and Michael Martin's, which, citing prePIRLS as an empirical example, explores practical means of contextualizing international test instruments for local educational demands. Amber Gove's article, "Think global, act local: how early reading assessments can improve learning for all", analyses ongoing efforts to establish global-level learning indicators that would require countries to measure the percentage of children meeting locally set targets. The writer does this by drawing comparative examples from the Research Triangle Institute's (RTI) Early Grade Reading Assessment (EGRA), conducted in various countries. Finally, Amy Jo Dowd's paper, "An NGO perspective on assessment choice: from practice to research to practice", relies on anecdotal evidence to discuss how assessment evidence changed 'business as usual' for Save the Children's basic education programmes. The writer uses Save the Children as a case study to demonstrate how assessments can instigate shifts in approach to technical guidance, national implementation, advocacy and equitable impact. This essay will therefore give most attention to the first two papers (Wagner's and Lockheed's).

5. Points of Convergence

The writers seem to have divergent views on most points. However, the points on which they hold similar views include the notion that "quality" of education is a multifaceted concept that must be understood holistically.
The consensus amongst the writers is that quality education is one that improves learning outcomes, meets the social and affective as well as cognitive needs of learners, and creates conditions at the classroom, school and systemic levels that are conducive to learning. The three articles also underscore the need for "quality" education as opposed to a mere preoccupation with "quantity". The writers also agree on a general definition of "learning outcomes" as broadly implying literacy, numeracy and life skills, creative and emotional skills, values and the social benefits of education. It is for this reason that the writers postulate that the measurement of the quality of education can never be a linear process and that quantitative learning outcomes can only be a partial proxy measure. Finally, the writers concur that any attempt, especially by international institutions such as the World Bank, to change these international assessments from low-stakes to high-stakes enterprises, while dangling the "carrot and the stick" on the basis of student test performance, has the potential to introduce distortions in the assessments and therefore compromise their ability to provide both valid and reliable cross-national measures of human capital investments or valuable data for decision making.

6. Points of Divergence

As already mentioned, Goldstein (2004) and Barret (2009) are "fraternal" voices that strongly oppose reliance on numerical learner achievement targets as a proxy indicator in monitoring the quality of education, especially in the context of EFA. In contrast, Wagner et al. (2012) argue in favour of the same. In their arguments, they posit the following reasons:

Goldstein (2004) and Barret (2009) advance the argument that the eventual outcome of pursuing EFA targets may well be an increasing control of individual national systems by institutions such as the World
Bank or aid agencies, supported by global testing corporations. On the other hand, Wagner gives this argument "a contempt card" by insisting that learning assessments have grown increasingly important as policy-makers and other educational consumers (agencies, schools, communities, parents, individuals, etc.) seek to understand what is (and isn't) learned as a function of information inputs, and that these quantitative measures are important for transparency of education system outcomes that may be compared across national contexts, support analytic work based on solid data sets, support teacher professional development, improve instructional design and reduce learning inequities.

Where learning outcomes are used as an indicator of quality, there is a tendency to privilege cognitive learning outcomes that are amenable to measurement by standardised testing. This "testing can have the unintended effect of impoverishing curricula and educational processes as teachers and learners come under pressure to maximize scores in pen and paper tests, irrespective of whether this enhances useful learning outcomes" (Barret, 2009:3). In contrast, Wagner (2012) does not find anything wrong with competitive testing because, in any case, "learning assessments have been around as long as parents have been trying to teach their children, and institutions have been trying to determine who is intellectually fit for a particular job" (ibid: 510). Wagner also argues that international large-scale assessments are targeted neither at improving the individual performance of students nor at the individual effectiveness of teachers or schools, because they are typically sample-based (for content domains as well as by students, teachers and schools).
Therefore, pressure on individual students and teachers should not arise, since these tests "rarely provide information about all students, teachers or schools in a country, they are not good for holding schools or teachers accountable or for creating incentives to reward school or teacher performance" (Wagner et al., 2012:515).

Goldstein and Barret further theorize that these assessments may lead to the demoralization of poorly performing countries and also allow the imposition from outside of systemic reforms under the heading of 'remedies' to put those countries 'on track'. In contrast, Lockheed, in Wagner et al. (2012), contends that international assessments can document the poor performance of a country relative to other countries at similar levels of economic development, which in turn could motivate a country to alter its investments in human capital development through education. According to Lockheed, a detailed analysis of the strengths and weaknesses of a nation's curriculum as compared to those of other countries could motivate regulatory reform regarding the content and methods of instruction, and results relating differences in student achievement to cross-country differences in teaching strategies could in turn invigorate efforts to change the behaviour of teachers through programmes of pre-service and in-service professional development. In effect, what Lockheed is saying is that competition between countries is not necessarily bad, for the "positive jealousy" can in fact be an impetus for the institutional changes necessary for better learning outcomes.

Goldstein hypothesizes that the international numerical achievement targets/assessments neglect the complex, often culture-specific and political nature of education as a social practice.
He particularly raises concerns about the comparability and reliability of the data, and the methodological and operational differences between the various countries arising from these international assessments. Wagner, on the other hand, feels that the technical parameters of a 'good' assessment, such as sample sizes, alpha coefficients, test-retest reliability, and predictive and content validity, are non-issues on which there is generally substantial agreement among test-makers the world over.

Goldstein strongly recommends that each educational system should consider "developing different criteria for assessing quality, enrolment, etc. instead of monitoring progress towards essentially artificial targets set by EFA (…) The emphasis should be on the local context and culture, within which those with local knowledge can construct own aims rather than rely upon common yardsticks implemented from a global perspective". On the other hand, Lockheed thinks that this micro approach will cause countries to miss out on the benefits that accrue from participation, ostensibly because these assessments create a learning environment in which national assessment specialists can improve their technical skills and related performance. "A country's participation in international large-scale assessments reinforces national technical and managerial capacity for assessment, however, through both training and hands-on experience. It exposes participants to international quality standards in testing and measurement; it provides participants' experience with the technical fields of sampling, test development, questionnaire development, data management and quality control; it builds participants' management capacity for undertaking large research endeavours; it helps education officials prepare reports for policy-makers" (ibid: 515).

7.
Why the Divergence?

As already demonstrated, from a conceptual point of view, Goldstein and Barret argue that learner achievement targets are illogical, while Wagner argues in their favour, citing the public good of the tests. The question, therefore, is: why are the writers so divergent in their views? From the outset, the writers hold two opposing epistemological (Note 8) views: positivism and interpretivism (Note 9). Goldstein adopts more of an interpretivist approach, while Wagner is a positivist. For instance, Goldstein (2004) hypothesizes that the international numerical achievement targets/assessments neglect the complex, often culture-specific and political nature of education as a social practice. He particularly raises concerns about the comparability and reliability of the data, and the methodological and operational differences between the various countries arising from these international assessments. Wagner, on the other hand, feels that the technical parameters of a 'good' assessment, such as sample sizes, alpha coefficients, test-retest reliability, and predictive and content validity, are non-issues on which there is generally substantial agreement among test-makers the world over. Goldstein further recommends that each educational system should consider "developing different criteria for assessing quality instead of monitoring progress towards an essentially artificial target set by EFA". In refuting this, Lockheed in Wagner et al. (2012) contends that this micro approach will cause countries to miss out on the many benefits that accrue from participation in international learner assessments. In other words, the writers differ in their viewpoints because they adopt different philosophical positions: Goldstein posits that achievement cannot be universalized, while Wagner thinks learner outcomes are comparable since the tests are standardized. Divergence between the writers also emerges in the way they perceive dependency, i.e.
they take opposing sides as far as dependency theory (Note 10) is concerned (read Noah and Eckstein, 1988). Reading Goldstein's article, one feels that the writer is suspicious about 'who evaluates' and 'for what purposes'. He assumes that there are hegemonic penchants by western multinational/international institutions to abuse the intention and consequences of these assessments. As already highlighted, the writers differ on the possible unintended side-effects of adhering to these assessments. Goldstein fears that they may lead to an impoverished curriculum and put learners and teachers under pressure to maximize scores. On the contrary, Wagner feels that these assessments are targeted neither at improving the individual performance of students nor at the individual effectiveness of teachers or schools, because they are typically sample-based (for content domains as well as by students, teachers and schools), so pressure on individual students and teachers should not arise. Looking at the content of these discussions, it is apparent that the writers differ on two points: why the tests are conducted, and the link between these assessments and the individual learner, teacher and school. We find this divergence emanating purely from differing conceptual understandings, exacerbated by the opposing philosophical positions adopted by the writers. The final reason for the divergence could be the scanty empirical evidence available to support the notion that international assessments actually have an impact on education policy development amongst the participating countries, thus giving Goldstein ammunition. It is likely that some of the contentious issues would have been resolved if adequate scientific evidence were available.

8.
Discussion

Goldstein (2004) and Barret (2009) are justified in suspecting hegemonic tendencies by multinational/international institutions and the way these bodies 'abuse' privilege by imposing control over developing countries in the guise of development aid (read Rodney and Pogge (Note 11) in Milner, 2005) (Note 12). It is important that these issues are flagged every so often so as to deter such tendencies. However, we find their argument about the possibility of institutions such as the World Bank or aid agencies, with the support of global testing corporations, controlling individual national systems presumptuous. The writers should perhaps have cited actual instances to support their argument. Moreover, donor institutions are justified in demanding results whenever states accept aid, because 'the most effective evaluation practice balances accountability and learning' (Jackson, 2013) (Note 13), and this is not in any way tied to the adoption of quantitative indicators such as learner achievements. Demanding results from aid is a basic tenet of accountability (Note 14). Having said that, we concur with Goldstein that such a target should not be set artificially by UNESCO or any other external body but by individual countries, unless the said body is funding the anticipated change. However, I wish Goldstein had strengthened his argument by asking the extent to which these international learning assessments have achieved their objective of helping countries design their own assessment models.