Friday, 11 September 2015

Assessing Writing

  • Features of difficult-to-score essays

    Publication date: January 2016
    Source:Assessing Writing, Volume 27

    Author(s): Edward W. Wolfe, Tian Song, Hong Jiao

    Previous research that has explored potential antecedents of rater effects in essay scoring has focused on a range of contextual variables, such as rater background, rating context, and prompt demand. This study predicts the difficulty of accurately scoring an essay based on that essay's content by utilizing linear regression modeling to measure the association between essay features (e.g., length, lexical diversity, sentence complexity) and raters’ ability to assign scores to essays that match those assigned by expert raters. We found that two essay features – essay length and lexical diversity – account for 25% of the variance in ease of scoring measures, and these variables are selected in the predictive modeling whether the essay's true score is included in the equation or not. We suggest potential applications for these results to rater training and monitoring in direct writing assessment scoring projects. 
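
    A minimal sketch of the kind of analysis described above, in Python with simulated data (the feature values, coefficients, and sample size are placeholders, not the study's measures): regress an ease-of-scoring measure on essay length and lexical diversity and read off R-squared, the share of variance explained that the abstract reports as roughly 25%.

        # Minimal sketch (simulated data, not the study's): regress an
        # ease-of-scoring measure on essay length and lexical diversity
        # and report R^2, the share of variance explained.
        import numpy as np
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(0)
        n_essays = 200
        length = rng.normal(300, 80, n_essays)               # words per essay (simulated)
        lexical_diversity = rng.uniform(0.3, 0.8, n_essays)  # e.g., a type-token ratio (simulated)

        # Simulated ease-of-scoring measure: partly driven by the two features, plus noise.
        ease = 0.005 * length + 3.0 * lexical_diversity + rng.normal(0, 1, n_essays)

        X = np.column_stack([length, lexical_diversity])
        model = LinearRegression().fit(X, ease)
        print("coefficients:", model.coef_)
        print("R^2 (share of variance explained):", round(model.score(X, ease), 3))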




  • Comparing the accuracy of different scoring methods for identifying sixth graders at risk of failing a state writing assessment

    Publication date: January 2016
    Source:Assessing Writing, Volume 27

    Author(s): Joshua Wilson, Natalie G. Olinghouse, D. Betsy McCoach, Tanya Santangelo, Gilbert N. Andrada

    Students who fail state writing tests may be subject to a number of negative consequences. Identifying students who are at risk of failure affords educators time to intervene and prevent such outcomes. Yet, little research has examined the classification accuracy of predictors used to identify at-risk students in the upper-elementary and middle-school grades. Hence, the current study compared multiple scoring methods with regard to their accuracy for identifying students at risk of failing a state writing test. In the fall of 2012, students responded to a persuasive prompt on a computer-based benchmark writing test, and in the spring of 2013 they participated in the state writing assessment. Predictor measures included prior writing achievement, human holistic scoring, automated essay scoring via Project Essay Grade (PEG), total words written, compositional spelling, and sentence accuracy. Classification accuracy was measured using the area under the ROC curve. Results indicated that prior writing achievement and the PEG Overall Score had the highest classification accuracy. A multivariate model combining these two measures resulted in only slight improvements over univariate prediction models. Study findings indicated that the choice of scoring method affects classification accuracy, and that automated essay scoring can be used to accurately identify at-risk students.
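
    A hedged sketch of the classification-accuracy comparison described above, in Python with simulated data (the predictor names and outcome are placeholders, not the study's dataset): compute the area under the ROC curve for each predictor alone and for a model that combines them.

        # Minimal sketch (simulated data, not the study's): compare predictors of
        # failing a state writing test using area under the ROC curve (AUC).
        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import roc_auc_score

        rng = np.random.default_rng(1)
        n = 500
        prior_achievement = rng.normal(0, 1, n)                     # prior writing score (simulated)
        peg_score = 0.6 * prior_achievement + rng.normal(0, 1, n)   # automated essay score (simulated)
        failed = (rng.normal(0, 1, n) - prior_achievement - 0.5 * peg_score > 0).astype(int)

        # Univariate classification accuracy: AUC of each predictor on its own.
        # Predictors are negated so that higher risk maps to higher values.
        print("AUC, prior achievement:", round(roc_auc_score(failed, -prior_achievement), 3))
        print("AUC, PEG overall score:", round(roc_auc_score(failed, -peg_score), 3))

        # Multivariate model combining both predictors.
        X = np.column_stack([prior_achievement, peg_score])
        clf = LogisticRegression().fit(X, failed)
        print("AUC, combined model:", round(roc_auc_score(failed, clf.predict_proba(X)[:, 1]), 3))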




  • Understanding variations between student and teacher application of rubrics

    Publication date: Available online 18 August 2015
    Source:Assessing Writing

    Author(s): Jinrong Li, Peggy Lindsey

    While rubrics have their limitations, many studies show that they can clarify teacher expectations, and in comparison to a simple score or a letter grade, provide more information about the strengths and weaknesses of students’ writing. Few studies, however, have explored the variations between students’ and teachers’ readings of rubrics and how such differences affect student writing. This article describes the findings of a mixed-methods research study designed to identify discrepancies between students’ and teachers’ interpretation of rubrics and investigate how such mismatches influence the use of rubrics. For the study, students and instructors in a first-year writing program at a medium-sized state university were provided with a rubric created for end-of-course assessment and asked to share their understanding of the rubric and apply the rubric to a sample student paper previously normed by faculty. The researchers then explored discrepancies between the students’ and the instructors’ interpretation and application of the rubric in essay evaluation. Data analysis revealed significant differences between faculty and students. The article concludes with suggestions for how to address these differences in the writing classroom. 




  • Assessing the Teaching of Writing: Twenty-First Century Trends and Technologies, A.E. Dayton (Ed.). Utah State University Press, Logan, UT (2015), ISBN: 978-0-87421-954-8

    Publication date: Available online 15 August 2015
    Source:Assessing Writing

    Author(s): Katrina Love Miller






  • Developing rubrics to assess the reading-into-writing skills: A case study

    Publication date: Available online 8 August 2015
    Source:Assessing Writing

    Author(s): Sathena Chan, Chihiro Inoue, Lynda Taylor

    The integrated assessment of language skills, particularly reading-into-writing, is experiencing a renaissance. The use of rating rubrics with verbal descriptors that describe the quality of L2 writing performance is well established in large-scale assessment. However, less attention has been directed towards the development of reading-into-writing rubrics. The task of identifying and evaluating the contribution of reading ability to the writing process and product so that it can be reflected in a set of rating criteria is not straightforward. This paper reports on a recent project to define the construct of reading-into-writing ability for designing a suite of integrated tasks at four proficiency levels, ranging from CEFR A2 to C1. The authors discuss how the processes of theoretical construct definition, together with empirical analyses of test taker performance, were used to underpin the development of rating rubrics for the reading-into-writing tests. Methodologies utilised in the project included questionnaires, expert panel judgement, group interviews, automated textual analysis and analysis of rater reliability. Based on the results of three pilot studies, the effectiveness of the rating rubrics is discussed. The findings can inform decisions about how best to account for both the reading and writing dimensions of test taker performance in the rubric descriptors.




  • Building a better rubric: Mixed methods rubric revision

    Publication date: Available online 6 August 2015
    Source:Assessing Writing

    Author(s): Gerriet Janssen, Valerie Meier, Jonathan Trace

    Because rubrics are the foundation of a rater's scoring process, principled rubric use requires systematic review as rubrics are adopted and adapted (Crusan, 2010, p. 72) into different local contexts. However, detailed accounts of rubric adaptations are somewhat rare. This article presents a mixed-methods (Brown, 2015) study assessing the functioning of a well-known rubric (Jacobs, Zinkgraf, Wormuth, Hartfiel, & Hughey, 1981, p. 30) according to both Rasch measurement and profile analysis (n = 524), which were respectively used to analyze the scale structure and then to describe how well the rubric was classifying examinees. Upon finding that there were concerns about a lack of distinction within the rubric's scale structure, the authors decided to adapt this rubric according to theoretical and empirical criteria. The resulting scale structure was then piloted by two program outsiders and analyzed again according to Rasch measurement, with placement again measured by profile analysis (n = 80). While the revised rubric can continue to be fine-tuned, this study describes how one research team developed an ongoing rubric analysis, something that these authors recommend be developed more regularly in other contexts that use high-stakes performance assessment.




  • Keeping up with the times: Revising and refreshing a rating scale

    Publication date: Available online 6 August 2015
    Source:Assessing Writing

    Author(s): Jayanti Banerjee, Xun Yan, Mark Chapman, Heather Elliott

    In performance-based writing assessment, regular monitoring and modification of the rating scale is essential to ensure reliable test scores and valid score inferences. However, the development and modification of rating scales (particularly writing scales) is rarely discussed in the language assessment literature. The few studies documenting the scale development process have derived the rating scale from analyzing one or two data sources: expert intuition, rater discussion, and/or real performance. This study reports on the review and revision of a rating scale for the writing section of a large-scale, advanced-level English language proficiency examination. Specifically, this study first identified, from the literature, the features of written text that tend to reliably distinguish between essays across levels of proficiency. Next, using corpus-based tools, 796 essays were analyzed for text features that predict writing proficiency levels. Lastly, rater discussions were analyzed to identify components of the existing scale that raters found helpful for assigning scores. Based on these findings, a new rating scale has been prepared. The results of this work demonstrate the benefits of triangulating information from writing research, rater discussions, and real performances in rating scale design.




  • Examining instructors’ conceptualizations and challenges in designing a data-driven rating scale for a reading-to-write task

    Publication date: Available online 5 August 2015
    Source:Assessing Writing

    Author(s): Doreen Ewert, Sun-Young Shin

    Integrated reading-to-write (RTW) tasks have increasingly taken the place of independent writing-only tasks in assessing academic literacy; however, previous research has rarely investigated the development and use of rating scales to interpret and score test takers’ performance on such tasks. This study investigated how four highly experienced ESL instructors developed an empirically derived, binary choice, boundary definition (EBB) rating scale. EBB scales are known to be reliable and effective for assessing specific writing tasks administered for a single population. Nonetheless, evidence suggests that factors outside the curriculum also influence the criteria which shape an EBB scale and thus final placement scores. Analysis of the recorded deliberations provides evidence of instructors’ conceptualizations of reading, writing, and language in the RTW task although each is not equally transparent in the EBB rating scale developed. Understanding the task and the curriculum as well as considering the future training of raters were additional challenges in designing this EBB scale. Despite such challenges, an EBB rating scale has potential to help us better understand the relative contribution of hybrid constructs to the overall quality of RTW task performance and to enhance the linkages among teaching, rating, and future rater-training. 




  • Ed.Board/Aims and scope

    Publication date: July 2015
    Source:Assessing Writing, Volume 25








  • In this issue…

    Publication date: July 2015
    Source:Assessing Writing, Volume 25

    Author(s): Liz Hamp-Lyons






  • Teacher modeling on EFL reviewers’ audience-aware feedback and affectivity in L2 peer review

    Publication date: July 2015
    Source:Assessing Writing, Volume 25

    Author(s): Carrie Yea-huey Chang

    This exploratory classroom research investigated how prolonged one-to-one teacher modeling (the teacher demonstrating desirable behaviors as a reviewer) in feedback to student reviewers’ essays may enhance their audience-aware feedback and affectivity in peer review. Twenty-seven EFL Taiwanese college students from a writing class participated in asynchronous web-based peer reviews. Training was conducted prior to peer reviews, and the teacher modeled the desirable reviewer behaviors in her feedback to student reviewers’ essays to prolong the training effects. Pre-modeling (narration) and post-modeling (process) reviews were analyzed for audience-aware feedback and affectivity. Reviewers’ audience awareness was operationalized as their understanding of reviewer–reviewee/peer–peer relationship and reviewees’ needs of revision-oriented feedback on global writing issues to improve the draft quality. Paired t-tests revealed significantly higher percentages of global feedback and collaborative stance (revision-oriented suggestions), more socio-affective functions, and a higher percentage of personal, non-evaluative reader feedback and a lower percentage of non-personal evaluator feedback in the post-modeling reviews. Such a difference, however, was not found in review tone. Overall, our findings confirm that EFL student reviewers can learn peer review skills through observation of their teachers and use of complementary tools such as checklists. 




  • “I must impress the raters!” An investigation of Chinese test-takers’ strategies to manage rater impressions

    Publication date: July 2015
    Source:Assessing Writing, Volume 25

    Author(s): Qin Xie

    Most studies on holistic scoring procedures adopt a rater perspective, focusing on raters and textual features; few studies adopt a test-taker perspective. This study investigated test-taker perceptions of impression management strategies for holistically scored essay tests and estimated the extent to which such perceptions can predict essay scores. A total of 886 Chinese test-takers took two essay tests and completed a perception questionnaire, where they rated the importance of impression management strategies versus the target language and writing skills for helping them achieve good essay scores. Four raters marked the essay papers holistically; each essay paper received two independent ratings. Questionnaire items were analysed to verify scale reliability and to explore the underlying structure of each perception scale. The analysis identified two opposing factors underlying test-takers’ perceptions of the impression management strategies, which represented a risk-taking approach and a defensive approach to managing rater impressions. Regression analyses found the defensive approach consistently predicted essay scores, i.e. test-takers assigning higher values to the defensive approach achieved significantly higher scores than test-takers who assigned it lower values. Corpus-based textual analyses found defensive writers writing longer essays and committing fewer linguistic errors, while risk-takers used slightly more sophisticated words and sentences but also committed more errors. The findings were discussed in terms of their implications for validity and washback. 




  • ESL essay raters’ cognitive processes in applying the Jacobs et al. rubric: An eye-movement study

    Publication date: July 2015
    Source:Assessing Writing, Volume 25

    Author(s): Paula Winke, Hyojung Lim

    We investigated how nine trained raters used a popular five-component analytic rubric by Jacobs et al. (1981; reproduced in Weigle, 2002). We recorded the raters’ eye movements while they rated 40 English essays because cognition drives eye movement (Reichle, Warren, & McConnell, 2009): By inspecting what raters attend to (on a rubric), we gain insights into their thoughts. We estimated inter-rater reliability for each subcomponent. Attention (measured as total eye-fixation duration and eye-visit count, with the number of words per subcomponent controlled) was associated with inter-rater reliability: Organization (the second category) received the most attention (slightly more than the first, content). Organization also had the highest inter-rater reliability (ICC coefficient = .92). Raters attended least to and agreed least on mechanics (the last category; ICC coefficient = .85). Raters who agreed the most had common attentional foci across the subcomponents. Disagreements were directly viewable through eye-movement-data heatmaps. We discuss the rubric in terms of primacy: raters paid the most attention to organization and content because they were on the left (and read first). We hypothesize what would happen if test developers were to remove the least-reliable (and right-most) subcomponent (mechanics). We discuss rubric design as an important factor in test-construct articulation.
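
    For readers unfamiliar with the reliability statistic quoted above, here is a minimal Python sketch of a two-way random-effects intraclass correlation (ICC(2,1) in Shrout and Fleiss terms) computed on simulated ratings; the article may use a different ICC variant, so this only illustrates the general computation.

        # Minimal sketch (simulated ratings, not the study's data): a two-way
        # random-effects intraclass correlation, ICC(2,1), of the kind reported
        # per rubric subcomponent.
        import numpy as np

        def icc_2_1(ratings):
            """ratings: array of shape (n_essays, n_raters)."""
            n, k = ratings.shape
            grand = ratings.mean()
            ss_rows = k * ((ratings.mean(axis=1) - grand) ** 2).sum()   # between essays
            ss_cols = n * ((ratings.mean(axis=0) - grand) ** 2).sum()   # between raters
            ss_total = ((ratings - grand) ** 2).sum()
            ms_rows = ss_rows / (n - 1)
            ms_cols = ss_cols / (k - 1)
            ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
            return (ms_rows - ms_error) / (
                ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n)

        rng = np.random.default_rng(2)
        true_quality = rng.normal(3, 1, 40)                             # 40 essays (simulated)
        ratings = true_quality[:, None] + rng.normal(0, 0.5, (40, 9))   # 9 raters (simulated)
        print("ICC(2,1):", round(icc_2_1(ratings), 3))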




  • Paul Diederich and the Progressive American High School, R.L. Hampel, in: A Volume in Readings in Educational Thought. Information Age Publishing, Charlotte, NC (2014)

    Publication date: July 2015
    Source:Assessing Writing, Volume 25

    Author(s): Edward M. White






  • Ed.Board/Aims and scope

    Publication date: April 2015
    Source:Assessing Writing, Volume 24








  • Editorial

    Publication date: April 2015
    Source:Assessing Writing, Volume 24

    Author(s): Liz Hamp-Lyons






  • A new approach towards marking large-scale complex assessments: Developing a distributed marking system that uses an automatically scaffolding and rubric-targeted interface for guided peer-review

    Publication date: April 2015
    Source:Assessing Writing, Volume 24

    Author(s): Alvin Vista, Esther Care, Patrick Griffin

    Currently, complex tasks incur significant costs to mark, becoming exorbitant for courses with large numbers of students (e.g., in MOOCs). Large-scale assessments are currently dependent on automated scoring systems. However, these systems tend to work best in assessments where correct responses can be explicitly defined. There is a considerable scoring challenge when it comes to assessing tasks that require deeper analysis and richer responses. Structured peer-grading can be reliable, but the diversity inherent in very large classes can be a weakness for peer-grading systems because it raises objections that peer-reviewers may not have qualifications matching the level of the task being assessed. Distributed marking can offer a solution to handle both the volume and complexity of these assessments. We propose a solution wherein peer scoring is assisted by a guidance system to improve peer review and increase the efficiency of large-scale marking of complex tasks. The system involves developing an engine that automatically scaffolds the target paper based on predefined rubrics so that relevant content and indicators of higher-level thinking skills are framed and drawn to the attention of the marker. Eventually, we aim to establish that the scores produced are comparable to scores produced by expert raters.
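
    A toy sketch of the general idea described above, not the authors' engine: flag sentences of a target paper that contain indicator keywords tied to rubric criteria so they can be framed for the peer marker. The rubric criteria and keyword lists below are hypothetical placeholders.

        # Toy sketch (not the authors' engine): frame sentences of a target paper
        # that contain indicator keywords tied to rubric criteria, so a peer
        # marker's attention is drawn to them. Criteria and keywords are placeholders.
        import re

        rubric_indicators = {
            "evidence":  ["because", "for example", "data", "study"],
            "synthesis": ["however", "in contrast", "therefore", "overall"],
        }

        def scaffold(paper_text):
            """Return, per rubric criterion, the sentences a marker should review."""
            sentences = re.split(r"(?<=[.!?])\s+", paper_text.strip())
            flagged = {criterion: [] for criterion in rubric_indicators}
            for sentence in sentences:
                lowered = sentence.lower()
                for criterion, keywords in rubric_indicators.items():
                    if any(keyword in lowered for keyword in keywords):
                        flagged[criterion].append(sentence)
            return flagged

        sample = ("The model fits well. However, the data are limited. "
                  "For example, only one cohort was sampled.")
        for criterion, hits in scaffold(sample).items():
            print(criterion, "->", hits)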




  • Effectiveness of written corrective feedback: Does type of error and type of correction matter?

    Publication date: April 2015
    Source:Assessing Writing, Volume 24

    Author(s): Nuwar Mawlawi Diab

    The study examines the effect of form-focused corrective feedback (FFCF) on students’ ability to reduce pronoun agreement errors and lexical errors in new essays. Across three assignments, the two experimental groups received direct error correction along with metalinguistic feedback and metalinguistic feedback only, respectively, while the control group self-edited their errors. All groups revised their errors before the next assignment. Students took a pretest and immediate and delayed post-tests, and the two experimental groups were interviewed about the FFCF received. Results of the immediate post-test revealed a significant difference in pronoun agreement errors for the direct metalinguistic group; no significant difference appeared in lexical errors. At the delayed post-test, there was no significant difference among the groups in pronoun agreement errors, but a significant difference appeared in lexical errors for the direct metalinguistic group. Theoretical explanations and pedagogical implications are discussed.




  • Predicting EFL writing ability from levels of mental representation measured by Coh-Metrix: A structural equation modeling study

    Publication date: April 2015
    Source:Assessing Writing, Volume 24

    Author(s): Vahid Aryadoust, Sha Liu

    This study aims to invoke a theoretical model to link the linguistic features of text complexity, as measured by Coh-Metrix, and text quality, as measured by human raters. One hundred and sixty three Chinese EFL learners wrote sample expository and persuasive essays that were marked by four trained raters using a writing scale comprising Word Choice, Ideas, Organization, Voice, Conventions, and Sentence Fluency traits. The psychometric reliability of the writing scores was investigated using many-facet Rasch measurement. Based on the construction–integration (CI) model of comprehension, three levels of mental representation were delineated for the essays: the surface level (lexicon and syntax), the textbase, and the situation model. Multiple proxies for each level were created using Coh-Metrix, a computational tool measuring various textual features. Using structural equation modeling (SEM), the interactions between the three levels of representation, text quality, and tasks were investigated. The SEM with the optimal fit comprised 23 observed Coh-Metrix variables measuring various latent variables. The results show that tasks affected the situation model and several surface level latent variables. Multiple interactions were identified between writing quality and levels of representation, such as the Syntactic Complexity latent variable predicting the situation model and the situation model latent variable predicting Conventions and Organization. Implications for writing assessment research are discussed. 




  • Connecting writing and language in assessment: Examining style, tone, and argument in the U.S. Common Core standards and in exemplary student writing

    Publication date: April 2015
    Source:Assessing Writing, Volume 24

    Author(s): Laura Aull

    Writing assessment criteria often separate language and writing standards, reflecting an implicit dichotomy between “writing” and “language” in which conventions and style can appear tangential to writing categories like argument and development of ideas. This article examines U.S. Common Core standards and student writing selected as exemplifying those standards in light of discourse-level features noted in applied linguistic and composition research. In so doing, it aims to help expose connections between organization, argument/claim development, style, conventions, and tone via patterns in academic writing. In this way, the article considers assessment standards and their use as opportunities to examine and clarify connections between the arguments students are encouraged to construct and the discourse options students have.




  • Handbook of Automated Essay Evaluation, M.D. Shermis, J. Burstein (Eds.). Routledge, New York (2013)

    Publication date: April 2015
    Source:Assessing Writing, Volume 24

    Author(s): Liz Hamp-Lyons






  • Ed.Board/Aims and scope

    Publication date: January 2015
    Source:Assessing Writing, Volume 23








  • In this issue

    Publication date: January 2015
    Source:Assessing Writing, Volume 23

    Author(s): Liz Hamp-Lyons






  • An evaluation of the Writing Assessment Measure (WAM) for children's narrative writing

    Publication date: January 2015
    Source:Assessing Writing, Volume 23

    Author(s): Sandra Dunsmuir, Maria Kyriacou, Su Batuwitage, Emily Hinson, Victoria Ingram, Siobhan O'Sullivan

    The study evaluated the reliability and validity of the Writing Assessment Measure (WAM), developed to reflect the skills which children of different abilities are expected to achieve in written expression, as part of the National Curriculum guidelines in England and Wales. The focus was on its potential use in investigations of children's written narrative in order to inform and target related interventions. The study involved 97 children aged 7–11 from one urban primary school in England. Prompt 1 was administered to all the children in their classrooms together with a standardised written expression test. After three weeks, the same procedure was followed and Prompt 2 was administered. Statistical analyses of the reliability and validity of the instrument showed that it is consistent over time and can be scored reliably by different raters. Content validity of the instrument was demonstrated through inspection of item total correlations which were all significant. Analyses for concurrent validity showed that the instrument correlates significantly with the Wechsler Written Expressive Language sub-test. Significant differences between children of different age and writing skill were also found. The findings indicate that the instrument has potential utility to professionals assessing children's writing. 




  • Ideological and linguistic values in EFL examination scripts: The selection and execution of story genres

    Publication date: January 2015
    Source:Assessing Writing, Volume 23

    Author(s): Corinne Maxwell-Reid, David Coniam

    This article investigates secondary school students’ use of narratives and other story genres in an English language public examination in Hong Kong. Understandings of genre from systemic functional linguistics (SFL) were used to analyse one prompt and the texts it elicited. High graded texts were found to differ from low graded texts in terms of the relative use of story versus argument genres, and also of narrative versus other story genres. Sample texts are used to highlight the role of expanded nominal groups in creating these genres. The texts suggest a need to raise students’, teachers’ and test developers’ awareness of the various purposes of story-telling, and of the generic structures and language resources that can be used to carry out those purposes. 




  • A hierarchical classification approach to automated essay scoring

    Publication date: January 2015
    Source:Assessing Writing, Volume 23

    Author(s): Danielle S. McNamara, Scott A. Crossley, Rod D. Roscoe, Laura K. Allen, Jianmin Dai

    This study evaluates the use of a hierarchical classification approach to automated assessment of essays. Automated essay scoring (AES) generally relies on machine learning techniques that compute essay scores using a set of text variables. Unlike previous studies that rely on regression models, this study computes essay scores using a hierarchical approach, analogous to an incremental algorithm for hierarchical classification. The corpus in this study consists of 1243 argumentative (persuasive) essays written on 14 different prompts, across 3 different grade levels (9th grade, 11th grade, college freshman), and four different time limits for writing or temporal conditions (untimed essays and essays written in 10-, 15-, and 25-minute increments). The features included in the analysis are computed using the automated tools Coh-Metrix, the Writing Assessment Tool (WAT), and Linguistic Inquiry and Word Count (LIWC). Overall, the models developed to score all the essays in the data set report 55% exact accuracy and 92% adjacent accuracy between the predicted essay scores and the human scores. The results indicate that this is a promising approach to AES that could provide more specific feedback to writers and may be relevant to other natural language computations, such as the scoring of short answers in comprehension or knowledge assessments.
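
    A minimal sketch of the two agreement figures quoted above (exact and adjacent accuracy between predicted and human scores), in Python with simulated scores rather than the study's data.

        # Minimal sketch (simulated scores): exact accuracy (predicted score equals
        # the human score) and adjacent accuracy (within one score point),
        # computed between two score vectors.
        import numpy as np

        rng = np.random.default_rng(3)
        human = rng.integers(1, 7, 300)                              # human essay scores, 1-6 scale
        predicted = np.clip(human + rng.integers(-1, 2, 300), 1, 6)  # simulated AES output

        exact = np.mean(predicted == human)
        adjacent = np.mean(np.abs(predicted - human) <= 1)
        print(f"exact accuracy: {exact:.2%}, adjacent accuracy: {adjacent:.2%}")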




  • Toward a validational framework using student course papers from common undergraduate curricular requirements as viable outcomes evidence

    Publication date: January 2015
    Source:Assessing Writing, Volume 23

    Author(s): Diane Kelly-Riley

    Examining gains in undergraduate writing abilities, Haswell (2000) applied a multi-dimensional construct of writing to impromptu writing exams composed in the first and third years. This project replicates Haswell's original study with impromptu writing exams composed at the same points, and extends that methodology to course papers written for common undergraduate curricular contexts—first-year composition, general education requirements, and advanced undergraduate writing in the disciplines requirements—to consider the use of such assessment scores as plausible and appropriate evidence for outcomes assessment purposes within a validational framework (articulated by Kane, 2006, 2013). This study considers the feasibility of reporting such localized assessment information as an alternative way to represent progress in undergraduate writing ability, and reports preliminary evidence suggesting positive effects of distributed writing requirements across undergraduate curricula on student writing performance.




  • Diagnostic Writing Assessment: The Development and Validation of a Rating Scale, U. Knoch. Peter Lang, Frankfurt (2009), ISBN: 978-3-631-58981-6

    Publication date: January 2015
    Source:Assessing Writing, Volume 23

    Author(s): Ashley Velazquez






  • Ed.Board/Aims and scope

    Publication date: October 2014
    Source:Assessing Writing, Volume 22








  • Three current, interconnected concerns for writing assessment

    Publication date: October 2014
    Source:Assessing Writing, Volume 22

    Author(s): Liz Hamp-Lyons






  • Automated Essay Scoring feedback for second language writers: How does it compare to instructor feedback?

    Publication date: October 2014
    Source:Assessing Writing, Volume 22

    Author(s): Semire Dikli, Susan Bleyle

    Writing is an essential component of students’ academic English development, yet it requires a considerable amount of time and effort on the part of both students and teachers. In an effort to reduce their workload, many instructors are looking into the use of Automated Essay Scoring (AES) systems to complement more traditional ways of providing feedback. This paper investigates the use of an AES system in a college ESL writing classroom. Participants included 14 advanced students from various linguistic backgrounds who wrote on three prompts and received feedback from the instructor and the AES system (Criterion). Instructor feedback on the drafts (n =37) was compared to AES feedback and analyzed both quantitatively and qualitatively across the feedback categories of grammar (e.g., subject-verb agreement, ill-formed verbs), usage (e.g., incorrect articles, prepositions), mechanics (e.g., spelling, capitalization), and perceived quality by an additional ESL instructor. Data were triangulated with opinion surveys regarding student perceptions of the feedback received. The results show large discrepancies between the two feedback types (the instructor provided more and better quality feedback) and suggest important pedagogical implications by providing ESL writing instructors with insights regarding the use of AES systems in their classrooms. 




  • Contexts of engagement: Towards developing a model for implementing and evaluating a writing across the curriculum programme in the sciences

    Publication date: October 2014
    Source:Assessing Writing, Volume 22

    Author(s): Ingrid A.M. McLaren

    Reflective in nature, this paper describes the process of implementing a WAC programme in the sciences at a university in the Anglophone Caribbean. It also outlines attempts to justify its continuity by employing ‘utilization-focused evaluation’ which is designed and organized around what information would be most useful to the administration and the way in which this information would be applied. A multi-pronged/naturalist-based approach to evaluating outcomes is applied with a view to offering a spectrum of outcomes based on varying levels of inquiry from a variety of contexts which enable a more informed exploration of the myriad of emergent issues brought to light. The suite of intervention strategies and the quality of evidence produced were well received by the administration who facilitated further funding. Suggestions for programme implementation, evaluation and sustainability are offered. 




  • Just Ask Teachers: Building expertise, trusting subjectivity, and valuing difference in writing assessment

    Publication date: October 2014
    Source:Assessing Writing, Volume 22

    Author(s): Jeff Osborne, Paul Walker

    The authors theorize a method for writing assessment that deemphasizes the traditional privileging of validity and reliability generated from multiple-reader, calibrated scoring of samples of student work. While acknowledging the holistic model's benefits to the field of writing studies, the authors assert that its claims of accuracy and objectivity minimize the numerous tangible and intangible variables that writing teacher/experts understand and value as they evaluate writing. The removal of the “object” – writing artifact – from its context in order to assess it quantitatively diminishes the opportunities for achieving meaningful and pedagogically effective results for a writing program. Rather than calibrating teachers to a rubric, the proposed method here generates a rough calibration of teacher “values” via facilitated conversations, accepting the differences of opinions and “messiness” of teachers’ subjective views of writing. Teachers then periodically assess their students’ performance on these values as well as the course objectives. In this way, the process develops teacher contextual expertise while producing focused assessment data that is both useful for outside agencies and meaningful to the program's goals of improving the teaching of writing. 




  • On the vulnerability of automated scoring to construct-irrelevant response strategies (CIRS): An illustration

    Publication date: October 2014
    Source:Assessing Writing, Volume 22

    Author(s): Isaac I. Bejar, Michael Flor, Yoko Futagi, Chaitanya Ramineni

    This research is motivated by the expectation that automated scoring will play an increasingly important role in high stakes educational testing. Therefore, approaches to safeguard the validity of score interpretation under automated scoring should be investigated. This investigation illustrates one approach to study the vulnerability of a scoring engine to construct-irrelevant response strategies (CIRS) based on the substitution of more sophisticated words. That approach is illustrated and evaluated by simulating the effect of a specific strategy with real essays. The results suggest that the strategy had modest effects, although it was effective in improving the scores of a fraction of the lower-scoring essays. The broader implications of the results for quality assurance and control of automated scoring engines are discussed. 
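
    An illustrative sketch of the kind of simulation described above: substitute "more sophisticated" words into an essay and compare scores before and after. The synonym map is a toy placeholder and score_essay is a hypothetical stand-in, not the scoring engine examined in the article.

        # Illustrative sketch only: simulate a construct-irrelevant response
        # strategy by substituting "more sophisticated" words into an essay and
        # comparing scores before and after. The synonym map is a toy placeholder,
        # and score_essay stands in for whatever engine is being probed.
        sophisticated = {
            "use": "utilize",
            "show": "demonstrate",
            "big": "substantial",
            "idea": "conception",
        }

        def apply_cirs(essay):
            words = essay.split()
            return " ".join(sophisticated.get(w.lower(), w) for w in words)

        def score_essay(essay):
            # Hypothetical stand-in: a length- and word-length-sensitive score.
            words = essay.split()
            return 0.01 * len(words) + 0.5 * (sum(len(w) for w in words) / len(words))

        original = "the writer tried to show a big idea but did not use evidence"
        altered = apply_cirs(original)
        print("score before:", round(score_essay(original), 2))
        print("score after :", round(score_essay(altered), 2))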




  • Reflexive writers: Re-thinking writing development and assessment in schools

    Publication date: October 2014
    Source:Assessing Writing, Volume 22

    Author(s): Mary Ryan

    Writing is a complex and highly individual activity, which is approached in different ways by different writers. Writers reflexively mediate subjective and objective conditions in specific and nuanced ways to produce a product in time and place. This paper uses a critical realist theory of reflexivity to argue that the teaching and assessment of writing must account for the different ways that students manage and make decisions in their writing. Data from linguistically and culturally diverse primary students in Australia are used to illustrate how four distinct reflexive modalities constitute the ways in which students approach writing. The paper offers a new approach to assessing writing for and of learning that considers writers as reflexive and agentic in different ways. It posits the importance of making visible and explicit the context and reflexive decision-making as writers shape a product for a purpose and audience. 




  • Examining genre effects on test takers’ summary writing performance

    Publication date: October 2014
    Source:Assessing Writing, Volume 22

    Author(s): Jiuliang Li

    The task demands of summarization are closely related to the characteristics of source texts, and genre is an essential characteristic. This paper reports an empirical study that examines how text type (genre) affects test takers’ performance on summarization tasks. A sample of 86 students was drawn from an undergraduate program in a Chinese university. The students first wrote summaries of a narrative text and subsequently wrote summaries of an expository text. Genre effects were examined from three perspectives: students’ summary scores, summary scripts, and perception of these effects as reflected in questionnaire surveys and post-test interviews. MFRM analysis showed that participants performed better on expository writing than on narrative text summarization overall. The difficulties of the rubric components differed, with several differing significantly across the tasks. However, the participants generally considered the narrative text summarization to be easier than the expository task according to the results of the questionnaire surveys and interviews. Factors that led to this contradiction between performance and perception were explored by examining the participants’ summaries and their accounts of the task difficulty and test-taking processes. Implications are discussed with reference to summarization task design, summarization teaching, and the relevance of genre effects in the creation of equivalent versions of tests. 




  • The challenges of emulating human behavior in writing assessment

    Publication date: October 2014
    Source:Assessing Writing, Volume 22

    Author(s): Mark D. Shermis

    This is a response to Dr. Les Perelman's critique of Phase I of the Hewlett Trials. His argument is that the construct validity of the study was undermined because there was a high correlation between word count and vendor-predicted scores. The response addresses the argument by showing that correlation does not mean causation. Further, the reply illustrates how predictions are actually formulated in automated essay scoring. The response concludes with an appeal for more research on the underlying constructs associated with writing.




  • The WPA Outcomes Statement: A Decade Later, N.N. Behm, G.R. Glau, D.H. Holdstein, D. Roen, E.M. White (Eds.). The Parlor Press, Anderson, SC (2013)

    Publication date: October 2014
    Source:Assessing Writing, Volume 22

    Author(s): William Condon






  • Thank you to reviewers, 2013

    Publication date: October 2014
    Source:Assessing Writing, Volume 22








  • Ed.Board/Aims and scope

    Publication date: July 2014
    Source:Assessing Writing, Volume 21








  • In this issue

    Publication date: July 2014
    Source:Assessing Writing, Volume 21

    Author(s): Liz Hamp-Lyons






  • Does the writing of undergraduate ESL students develop after one year of study in an English-medium university?

    Publication date: July 2014
    Source:Assessing Writing, Volume 21

    Author(s): Ute Knoch, Amir Rouhshad, Neomy Storch

    English language skills are often listed by employers among key attributes needed for employment and there is a general dissatisfaction with English standards, especially writing skills, following graduation (e.g., Healy & Trounson, 2010; Rowbotham, 2011 in the Australian context). In the case of ESL students, research on whether English proficiency improves after studying at an English-medium university has to date been scarce, and has generally examined students’ gains after a relatively short duration. The current study examined students’ ESL writing proficiency following a year's study in an Australian university. The study used a test-retest design. A range of measures was used to assess writing, including global and discourse measures. All participants were also surveyed and a subset was interviewed. The study found that students’ writing improved after a year of study but only in terms of fluency. There were no observed gains in accuracy, syntactic and lexical complexity. Global scores of writing also showed no change over time. Students stated in their questionnaires and interviews that they did not receive any feedback on their writing from their content lecturers. We discuss our findings in relation to the students’ second language (L2) proficiency and the nature of their immersion experience. 




  • Development and validation of a scale to measure perceived authenticity in writing

    Publication date: July 2014
    Source:Assessing Writing, Volume 21

    Author(s): Nadia Behizadeh, George Engelhard

    The purpose of this study is to examine the reliability and validity of scores obtained from a scale designed to measure authenticity of writing from the writer's perspective: the Perceived Authenticity of Writing (PAW) Scale. Using the concept of funds of knowledge as a framework (Hogg, 2011), 17 items were created to represent three areas of relevance for students: community and global (6 items), personal (5 items), and academic (5 items). One item was intended as a general item to capture an overall rating of authenticity. The PAW Scale was administered to 8th grade students (N = 103), and Rasch measurement theory was used to examine the reliability and validity of the scores. The PAW Scale exhibited good reliability (Rel = .92), and good model-data fit was found for the scale. Validity evidence was also obtained from short written responses, comparison of the conceptual framework and authentic writing theory, and correlations between scores on the PAW Scale and (1) writing self-efficacy (r = .097, ns), (2) writing interest (r = .542, p < .001), (3) mastery goal orientation (r = .446, p < .05), and (4) self-reported prior achievement in writing (r = .116, ns). The PAW Scale offers a promising measure for future research exploring perceived authenticity, including research informing writing assessment policy.




  • The three-fold benefit of reflective writing: Improving program assessment, student learning, and faculty professional development

    Publication date: July 2014
    Source:Assessing Writing, Volume 21

    Author(s): Elizabeth G. Allan, Dana Lynn Driscoll

    This article presents a model of reflective writing used to assess a U.S. general education first-year writing course. We argue that integrating reflection into existing assignments has three potential benefits: enhancing assessment of learning outcomes, fostering student learning, and engaging faculty in professional development. We describe how our research-based assessment process and findings yielded insights into students’ writing processes, promoted metacognition and transfer of learning, and revealed a variety of professional development needs. We conclude with a description of our three-fold model of reflection and suggest how others can adapt our approach. 




  • Assembling validity evidence for assessing academic writing: Rater reactions to integrated tasks

    Publication date: July 2014
    Source:Assessing Writing, Volume 21

    Author(s): Atta Gebril, Lia Plakans

    Integrated writing tasks that depend on input from other language abilities are gaining ground in teaching and assessment of L2 writing. Understanding how raters assign scores to integrated tasks is a necessary step for interpreting performance from this assessment method. The current study investigates how raters approach reading-to-write tasks, how they react to source use, the challenges they face, and the features influencing their scoring decisions. To address these issues, the study employed an inductive analysis of interviews and think-aloud data obtained from two raters. The results of the study showed raters attending to judgment strategies more than interpretation behaviors. In addition, the results found raters attending to a number of issues specifically related to source use: (a) locating source information, (b) citation mechanics, and (c) quality of source use. Furthermore, the analysis revealed a number of challenges faced by raters when working on integrated tasks. While raters focused on surface source use features at lower levels, they shifted their attention to more sophisticated issues at advanced levels. These results demonstrate the complex nature of integrated tasks and stress the need for writing professionals to consider the scoring and rating of these tasks carefully. 




  • Instructional rubrics: Effects of presentation options on writing quality

    Publication date: July 2014
    Source:Assessing Writing, Volume 21

    Author(s): Todd H. Sundeen

    Using rubrics for writing instruction has become a common practice for evaluating the expressive writing of secondary students. However, students do not always receive explicit instruction on rubric elements. When students are explicitly taught elements of writing rubrics, they have a clearer perspective of the expectations for their compositions. This study examined high school student writing under three conditions using instructional rubrics in which students were taught rubric elements, provided with a copy of the rubric, and simply scored using the rubric. Results indicated that when students have access to an instructional rubric either through explicit teaching or by receiving a copy, their writing quality improved. 




  • The WPA Outcomes Statement, validation, and the pursuit of localism

    Publication date: July 2014
    Source:Assessing Writing, Volume 21

    Author(s): Diane Kelly-Riley, Norbert Elliot

    This validation study examines the WPA Outcomes Statement for First-Year Composition, a United States consensus statement for first-year post-secondary writing, as implemented in a unified instructional and assessment environment for first-year college students across three different institution types. Adapting categories of contemporary validation from Kane (2013), we focus on four forms of evidence gathered from early and late-semester student performance (n =153): scoring, generalization, extrapolation, and implication. With an emphasis on education policies in action, the study generates important questions that, in turn, provide a basic framework for further research into the challenges of aligning broad consensus statements with locally developed educational initiatives. 




  • When “the state of the art” is counting words

    Publication date: July 2014
    Source:Assessing Writing, Volume 21

    Author(s): Les Perelman

    The recent article in this journal “State-of-the-art automated essay scoring: Competition results and future directions from a United States demonstration” by Shermis ends with the claims: “Automated essay scoring appears to have developed to the point where it can consistently replicate the resolved scores of human raters in high-stakes assessment. While the average performance of vendors does not always match the performance of human raters, the results of the top two to three vendors was consistently good and occasionally exceeded human rating performance.” These claims are not supported by the data in the study, while the study's raw data provide clear and irrefutable evidence that Automated Essay Scoring engines grossly and consistently over-privilege essay length in computing student writing scores. The state-of-the-art referred to in the title of the article is, largely, simply counting words. 
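
    A minimal sketch of the kind of check behind this critique, in Python with simulated data rather than the Hewlett Trials dataset: correlate essay word counts with machine-predicted scores.

        # Minimal sketch (simulated data, not the Hewlett Trials dataset):
        # correlate essay word count with machine-predicted scores.
        import numpy as np
        from scipy.stats import pearsonr

        rng = np.random.default_rng(4)
        word_count = rng.normal(350, 100, 400)
        machine_score = 0.008 * word_count + rng.normal(0, 0.5, 400)  # simulated AES scores

        r, p = pearsonr(word_count, machine_score)
        print(f"correlation between length and predicted score: r = {r:.2f} (p = {p:.3g})")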




  • Ed.Board/Aims and scope

    Publication date: April 2014
    Source:Assessing Writing, Volume 20








  • In this issue

    Publication date: April 2014
    Source:Assessing Writing, Volume 20

    Author(s): Liz Hamp-Lyons





