ETS Explains: How TOEFL/GRE Questions Are Written

TOEFL® and GRE® are registered trademarks of Educational Testing Service (ETS). This website is not endorsed or approved by ETS.

Before getting into how ETS writes its questions, a quick word about the organization itself.


The short version: the TOEFL, GRE and similar exams that most people take are all produced by ETS.


A slightly fuller introduction:


Educational Testing Service, founded in 1947, is the world’s largest private nonprofit educational testing and assessment organization. It is headquartered in Lawrence Township, New Jersey, but has a Princeton address. 


Next, let's look at how TPO, OG and real exam questions actually get made.


Everything below comes from official ETS materials.


Step 1: Defining Objectives


Educators, licensing boards or professional associations identify a need to measure certain skills or knowledge. Once a decision is made to develop a test to accommodate this need, test developers ask some fundamental questions:


Who will take the test and for what purpose?

What skills and/or areas of knowledge should be tested?

How should test takers be able to use their knowledge?

What kinds of questions should be included? How many of each kind?

How long should the test be?

How difficult should the test be?


Step 2: Item Development Committees


The responsibilities of item development committees may include:

defining test objectives and specifications

helping ensure test questions are unbiased

determining test format (e.g., multiple-choice, essay, constructed-response, etc.)

considering supplemental test materials

reviewing test questions, or test items, written by ETS staff

writing test questions


Step 3: Writing and Reviewing Questions


Each test question, whether written by ETS staff or by item development committees, undergoes numerous reviews and revisions to ensure that it is as clear as possible, that it has only one correct answer among the options provided on the test, and that it conforms to the style rules used throughout the test. Scoring guides for open-ended responses, such as short written answers, essays and oral responses, go through similar reviews.


Step 4: The Pretest


After the questions have been written and reviewed, many are pretested with a sample group similar to the population to be tested. The results enable test developers to determine:


the difficulty of each question

if questions are ambiguous or misleading

if questions should be revised or eliminated

if incorrect alternative answers should be revised or replaced
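
To make the first and last points concrete, here is a minimal sketch of two classical pretest statistics: a difficulty index (the share of pretest takers who picked the keyed answer) and a tally of how often each option was chosen. The response data, the option set "ABCD" and both function names are invented for illustration; the source does not describe the exact statistics ETS computes.

```python
from collections import Counter

def item_difficulty(responses, key):
    """Proportion of test takers who chose the keyed (correct) option."""
    return sum(1 for r in responses if r == key) / len(responses)

def option_counts(responses, options="ABCD"):
    """How often each option was chosen, including options nobody picked."""
    tally = Counter(responses)
    return {opt: tally.get(opt, 0) for opt in options}

# Hypothetical pretest responses to one multiple-choice question keyed "B".
responses = ["B", "B", "A", "B", "C", "B", "A", "B", "B", "A"]

p = item_difficulty(responses, "B")
counts = option_counts(responses)
print(p)       # share of correct answers; higher means an easier question
print(counts)  # a distractor with a count of 0 attracts no one
```

In this made-up sample option "D" is never chosen, which is the kind of signal that could lead a developer to revise or replace an incorrect alternative.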


Step 5: Detecting and Removing Unfair Questions


To meet the stringent ETS Standards for Quality and Fairness guidelines, trained reviewers must carefully inspect each individual test question, the test as a whole and any descriptive or preparatory materials to ensure that language, symbols, words, phrases and content generally regarded as sexist, racist or otherwise inappropriate or offensive to any subgroup of the test-taking population are eliminated.


ETS statisticians also can identify questions on which two groups of test takers who have demonstrated similar knowledge or skills perform differently on the test through a process called Differential Item Functioning (DIF). If one group performs consistently better than another on a particular question, that question receives additional scrutiny and may be deemed biased or unsatisfactory. Note: If people in different groups actually differ in their average levels of relevant knowledge or skills, a fair test question will reflect those differences.
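
The idea of comparing groups "who have demonstrated similar knowledge or skills" can be sketched with the Mantel-Haenszel procedure, one widely used DIF statistic. The passage above does not say which method is applied, so treat this as an assumption. Test takers are first matched on total score, and within each score level the odds of answering the question correctly are compared between a reference group and a focal group. The counts below are invented, and the -2.35 * ln scaling is the conventional "delta" transformation often reported as MH D-DIF.

```python
import math

def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across matched score levels.

    strata: list of (ref_right, ref_wrong, focal_right, focal_wrong),
    one tuple per total-score level.
    """
    num = den = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        n = ref_right + ref_wrong + focal_right + focal_wrong
        if n == 0:
            continue  # nobody at this score level
        num += ref_right * focal_wrong / n
        den += ref_wrong * focal_right / n
    if den == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return num / den

# Hypothetical counts for one question at three matched score levels.
strata = [
    (20, 20, 10, 30),  # low scorers
    (30, 10, 20, 20),  # middle scorers
    (36, 4, 30, 10),   # high scorers
]

alpha = mantel_haenszel_odds_ratio(strata)
mh_d_dif = -2.35 * math.log(alpha)  # negative values favor the reference group
print(alpha, mh_d_dif)
```

An odds ratio near 1 (MH D-DIF near 0) means matched test takers in both groups do about equally well on the question. Because the comparison is made within score levels, a real difference in average ability between the groups does not by itself produce a flag, which is the point of the note above.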

Step 6: Assembling the Test


After the test is assembled, it is reviewed by other specialists, committee members and sometimes other outside experts. Each reviewer answers all questions independently and submits a list of correct answers to the test developers. The lists are compared with the ETS answer keys to verify that the intended answer is, indeed, the correct answer. Any discrepancies are resolved before the test is published.
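
The key check described here amounts to comparing each reviewer's independent answers against the intended key and listing every mismatch. A minimal sketch, with made-up question IDs, reviewer names and answers (the function name is hypothetical as well):

```python
def find_key_discrepancies(answer_key, reviewer_answers):
    """Return {question_id: {reviewer: answer}} for answers that differ from the key."""
    discrepancies = {}
    for reviewer, answers in reviewer_answers.items():
        for qid, keyed in answer_key.items():
            given = answers.get(qid)  # None if the reviewer skipped the question
            if given != keyed:
                discrepancies.setdefault(qid, {})[reviewer] = given
    return discrepancies

# Hypothetical answer key and three independent reviewer answer lists.
answer_key = {"Q1": "A", "Q2": "C", "Q3": "B"}
reviewer_answers = {
    "reviewer_1": {"Q1": "A", "Q2": "C", "Q3": "B"},
    "reviewer_2": {"Q1": "A", "Q2": "D", "Q3": "B"},
    "reviewer_3": {"Q1": "A", "Q2": "D", "Q3": "B"},
}

issues = find_key_discrepancies(answer_key, reviewer_answers)
print(issues)
```

Here two of the three reviewers disagree with the key on Q2, so that question would go back to the developers before publication.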



Step 7: Making Sure — Even After the Test is Administered — that the Test Questions are Functioning Properly


Even after the test has been administered, statisticians and test developers review to make sure that test questions are working as intended. Before final scoring takes place, each question undergoes preliminary statistical analysis and results are reviewed question by question. If a problem is detected, such as the identification of a misleading answer to a question, corrective action, such as not scoring the question, is taken before final scoring and score reporting take place.
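
As a rough sketch of what a question-by-question check followed by "not scoring the question" could look like, the code below uses a classical upper/lower discrimination index: if the stronger half of test takers does worse on a question than the weaker half, the question is flagged and left out of the totals. The source does not name the statistics actually used, so the index, the flagging threshold of 0 and the data are all assumptions for illustration.

```python
def discrimination_index(item_correct, total_scores):
    """Upper-half minus lower-half proportion correct on one question.

    item_correct: 0/1 per test taker; total_scores: matching total scores.
    """
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    half = len(order) // 2
    lower, upper = order[:half], order[-half:]
    p_lower = sum(item_correct[i] for i in lower) / half
    p_upper = sum(item_correct[i] for i in upper) / half
    return p_upper - p_lower

def rescore(correct_matrix, dropped):
    """Total score per test taker, skipping the dropped question indexes."""
    return [sum(v for q, v in enumerate(row) if q not in dropped)
            for row in correct_matrix]

# Hypothetical 0/1 results: one row per test taker, one column per question.
correct_matrix = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 1],
]
totals = [sum(row) for row in correct_matrix]

flagged = [
    q for q in range(len(correct_matrix[0]))
    if discrimination_index([row[q] for row in correct_matrix], totals) < 0
]
final_scores = rescore(correct_matrix, set(flagged))
print(flagged, final_scores)
```

In this invented data the last question is answered correctly only by the lowest scorers, so it is the one flagged and excluded before the final totals are computed.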


Tests are also reviewed for reliability. Performance on one version of the test should reasonably predict performance on any other version of the test. If reliability is high, results will be similar no matter which version a test taker completes.
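
One common way to put a number on "performance on one version predicts performance on another" is parallel-forms reliability: the Pearson correlation between the scores the same people earn on two versions of the test. The sketch below assumes that reading; the scores are invented and the source does not state which reliability coefficient is reported.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical scores of the same five test takers on two versions of a test.
form_a = [20, 24, 25, 27, 30]
form_b = [21, 23, 26, 27, 29]

reliability = pearson(form_a, form_b)
print(round(reliability, 3))
```

A value close to 1 means the two versions rank test takers almost identically, which matches the statement that results will be similar no matter which version someone completes.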




LAB | Test Prep




This article originally appeared on the WeChat official account LABcircle.