A Construct Validation Study of the Grammar Gap-Filling Task in NMET Guangdong (广东省高考英语语法填空题构念效度验证)

Classification number: ____   Confidentiality level: Public   UDC: ____   No.: 20130110075

Guangdong University of Foreign Studies
Master's Thesis

A CONSTRUCT VALIDATION STUDY OF GRAMMAR GAP-FILLING TASK IN NMET GUANGDONG
(广东省高考英语语法填空题构念效度验证)

Applicant: Liu Huiyi (刘慧奕)
Supervisor: Professor Zeng Yongqiang (曾用强)
Degree category: Master of Arts
Discipline: Language Teaching Technology and Assessment
Faculty: Faculty of English Language and Culture (英语语言文化学院)
Thesis submitted: April 20, 2016
Thesis defended: May 28, 2016
Defense committee: Associate Professor Cai Hongwen (蔡宏文, chair), Professor Sun Zhaochun (孙肇春), Dr. Li Jinhui (李金辉), Dr. Li Zirong (李子容)
Degree-granting institution: Guangdong University of Foreign Studies
June 2016

Submitted in partial fulfillment of the requirements for the degree of Master of Arts in Language Teaching Technology and Assessment.

ACKNOWLEDGEMENTS

My deepest gratitude goes first and foremost to my supervisor, Professor ZENG Yongqiang, for his continuous help, invaluable guidance, illuminating instruction and warm-hearted kindness during my three years of postgraduate study. I am very grateful to have met such a kind and distinguished supervisor, who is full of creative ideas and enthusiasm for language testing research. While I was writing my thesis, he offered me many insightful suggestions and helped me with the data collection. Without his help and direction, I would never have been able to complete this thesis.

I would also like to thank all the teachers who taught me during my postgraduate study at Guangdong University of Foreign Studies. Special thanks go to Professor LIU Jianda, who taught us the Statistics course, Professor CAI Hongwen, who also offered me many suggestions, and Professor REN Wei, who gave us impressive lectures.

I also owe my sincere gratitude to my senior schoolmate CHEN Yan. She gave me great help with my experiment by helping me choose appropriate subjects from her class. I thank her for her help, and her lovely students for their support and cooperation.

My gratitude also goes to my classmates and roommates ZHANG Huiling, CHEN Fang and JIANG Chao, who helped me as much as they could throughout the writing of this thesis.

Last, my thanks go to my beloved family for their love and support all these years.
ABSTRACT

This thesis is a validation study of the grammar gap-filling (G/F) task in NMET (National Matriculation English Test) Guangdong. The grammar G/F test has been used in the NMET since 2014, but the task in fact made its first appearance in NMET Guangdong. In 2007, the grammar G/F task appeared in NMET Guangdong for the first time, replacing the traditional multiple-choice (MC) grammar test. Over its years of use in the NMET, the shortcomings of the traditional MC grammar test had become increasingly evident: many students answer the items by resorting to test-wiseness; students learn grammatical knowledge merely by reciting example sentences or rules; and students fail to apply grammatical knowledge in a specific context. By contrast, the grammar G/F task offers test takers a passage with a number of words deleted. Test takers are required to fill the gaps by themselves, without any options to choose from, and the word they supply must be grammatically correct and make the text complete and meaningful. The test designers' intention in designing the grammar G/F task is therefore to measure students' grammatical competence, including form and meaning, at the sentential or discourse level. The task is expected to activate students' grammatical knowledge and encourage them to learn to use grammatical rules in context. But in actual implementation, can the task really measure what it is supposed to measure? This issue needs further research. Moreover, since the task is now used not only in Guangdong province but across the whole nation, a validation study of the task is all the more necessary.

The present study combines quantitative and qualitative approaches to examine the validity of the grammar G/F task in NMET Guangdong. On the one hand, the author interviewed one test designer of the task to find out the construct of the grammar G/F task. On the basis of the interview, a questionnaire for high school English teachers was constructed to investigate their opinions about what the grammar G/F task measures. On the other hand, this study investigates the test takers' answering process through retrospective interviews. After coding participants' answering performance and matching it with the expected performance, it aims to find out whether test takers' answering process reflects the abilities the test designer expects to measure.

The results of the questionnaire demonstrate that, overall, high school English teachers agree with the statements in the questionnaire concerning the construct of the grammar G/F task. As for participants' answering process, it is found that, in most cases, participants answer an item correctly because they have grasped the relevant grammatical knowledge, and fail to answer an item because they lack the underlying abilities. Participants' answering process is thus consistent with the expected answering process.

Generally speaking, the high school English teachers' opinions about the construct of the grammar G/F task agree with the test designers' intention. At the same time, participants' answering process also reveals that they do use the abilities the grammar G/F task is supposed to measure. Therefore, we can conclude that the grammar G/F task has a good level of construct validity.

Key words: grammar gap-filling task, validation study, test-taking process
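摘要 (Chinese abstract, in English translation)

This thesis examines the validity of the grammar gap-filling task in the Guangdong NMET. The Test Specifications (《考试说明》) issued by the national examination center announced that, starting in 2014, the national NMET English papers would use grammar gap-filling in place of multiple-choice items. In fact, grammar gap-filling first appeared in the NMET in the 2007 reform of the Guangdong paper, which took the lead in replacing multiple-choice items with grammar gap-filling. Traditional multiple-choice items had been used in the NMET for many years, and their shortcomings had become increasingly evident. Students often answer multiple-choice items by guessing, and the format leads them to memorize grammar rules or example sentences in isolation rather than apply grammatical knowledge in a specific context. Grammar gap-filling is different: students are given a passage and asked to supply the missing words with no options to choose from. They must rely on their own grammatical knowledge to fill in a suitable word, which must not only be correct in form but also make the text fluent and coherent. The task therefore focuses on testing students' grammatical ability at the discourse level, including the grammatical form and meaning of words. The task designers hope this new task type will activate students' grammatical knowledge and prompt them to apply it in specific contexts. But in actual use, can the task really measure students' grammatical ability? This question urgently needs further study. Moreover, grammar gap-filling now appears not only in the Guangdong NMET but is widely used nationwide, so a validation study of the task is very much needed.

This study combines quantitative and qualitative methods to examine the validity of the task from two angles. On the one hand, the author first interviewed a designer of the Guangdong NMET grammar gap-filling task to learn which abilities the designers intend the task to test. On that basis, a questionnaire on the construct of the task was compiled and distributed to high school English teachers in Guangdong, to check whether the abilities the designers intend to test are also recognized by these teachers. On the other hand, the study uses test-taking process analysis to examine students' answering processes in depth and to judge whether they use the abilities the task is intended to test.

Analysis of the questionnaire shows that the abilities the task is intended to test are recognized by most English teachers, including word formation, the correct grammatical form of words, analysis of sentence structure, and comprehension of discourse meaning. In addition, retrospective interviews analyzing each participant's process in answering the 10 items show that, overall, the participants' answering processes are consistent with the expected processes; that is, participants do use the knowledge points and abilities the task is intended to test.

On the whole, both the teachers' understanding of the construct and the students' actual answering processes indicate that grammar gap-filling does succeed in testing the abilities it sets out to test, and thus has good validity.

Keywords: grammar gap-filling; validity study; test-taking process method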
CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT
摘要
CONTENTS
LIST OF ABBREVIATIONS
LIST OF TABLES
LIST OF FIGURES
CHAPTER ONE INTRODUCTION
  1.1 Research Orientation
  1.2 Research Questions
  1.3 Outline of the Thesis
CHAPTER TWO LITERATURE REVIEW
  2.1 Overview
  2.2 Issues concerning test validity
    2.2.1 Definition of validity
    2.2.2 Development of validity theory
    2.2.3 Types of validity
  2.3 Introduction of Grammar G/F test
  2.4 Grammatical ability
  2.5 Empirical studies both at home and abroad
    2.5.1 Empirical studies abroad
    2.5.2 Empirical studies at home
CHAPTER THREE METHODOLOGY
  3.1 Overview
  3.2 Research questions
  3.3 Research Design
    3.3.1 Instruments
    3.3.2 Participants
    3.3.3 Procedure
  3.4 Analytical Methods
  3.5 Summary
CHAPTER FOUR RESULTS AND DISCUSSION
  4.1 Overview
  4.2 Results of the study
    4.2.1 Results for Research Question 1
    4.2.2 Results for Research Question 2: The comparison between the participants' actual answering process and the test takers' expected performance
  4.3 Discussion
    4.3.1 The First Question of This Study
    4.3.2 The Second Question of This Study
  4.4 Summary
CHAPTER FIVE CONCLUSION
  5.1 Major Findings of This Study
  5.2 Implications of This Study
  5.3 Limitations of This Study
REFERENCES
APPENDICES
  Appendix A
  Appendix B
  Appendix C

LIST OF ABBREVIATIONS

APA    American Psychological Association
CGFT   Constructs of Gap-filling Test
EC     Expected Construct
ETTP   Expected Test-taking Performance
G/F    Gap-filling
MC     Multiple Choice
NMET   National Matriculation English Test
LIST OF TABLES

Table 3-1 Basic information of questionnaire respondents
Table 3-2 Test designers' presumption of the underlying abilities measured by grammar G/F task
Table 3-3 Constructs measured by each item in the present study's grammar G/F task
Table 4-1 Teachers' response of some statements in the questionnaire
Table 4-2 The frequency of participants' response category in each item
Table 4-3 The total frequency of participants' response category

LIST OF FIGURES

Figure 2-1 Definition and components of grammatical knowledge
Figure 3-1 The distribution of the statements in the questionnaire
Figure 4-1 The mean score of each statement in the questionnaire

CHAPTER ONE INTRODUCTION

1.1 Research Orientation

This thesis presents a validation study of the grammar gap-filling (G/F) task in NMET (National Matriculation English Test) Guangdong, using a combination of quantitative and qualitative approaches. According to the National New Course Test Specification, the G/F test has been adopted in the NMET since 2014. In fact, the origin of the G/F test in the matriculation test can be traced back to NMET Guangdong. In 2007, in order to better measure students' grammatical competence, the language experts of NMET Guangdong designed the grammar G/F task to replace the traditional multiple-choice (MC) grammar items. Although MC grammar items are efficient to score, they cannot prevent test takers from using test-wiseness when answering. The format also causes harmful washback, since students learn grammar merely by memorizing example sentences, so that grammatical knowledge is learned in isolation rather than in context. The grammar G/F task is therefore designed to avoid the test-wiseness effect by requiring test takers to fill each blank with an appropriate word in context, with no options to choose from. The word they supply should be not only grammatically correct but also semantically meaningful. Thus one or more areas of grammatical knowledge can be tested at the same time. Following the successful reform in NMET Guangdong, other provinces and municipalities such as Shanghai, Fujian and Hunan also began to follow the trend. Now that the grammar G/F task has been adopted in the NMET, it is crucially important to examine its validity, and the findings of the validation study can also offer some implications for language learning and teaching.

The G/F test is a variant of the cloze test, and numerous studies have investigated cloze tests in large-scale tests such as the NMET, CET or TEM. Some investigate content validity by comparing past authentic NMET cloze tests with the examination syllabus; some inspect face validity by using questionnaires to collect students' attitudes towards the test. However, studies that specifically examine the validity of the grammar G/F task are rare. The scarcity of validation studies concerning the grammar G/F task also makes the present study necessary.

Therefore, the present study attempts to examine the validity of the grammar G/F task in two ways. On the one hand, based on an interview with one test designer of the grammar G/F task in NMET Guangdong, a questionnaire was constructed and distributed to more than 200 high school English teachers from various schools in Guangdong province to investigate their perception of the test. If the constructs that test designers intend to measure in the grammar G/F task are accepted by the high school English teachers, this indicates that the task has a good level of construct validity. If, however, high school teachers do not recognize that the test measures the abilities it is presupposed to measure, then the validity of the grammar G/F task is questionable. That is why high school English teachers' perception of the test is worth investigating.

On the other hand, even if test designers have a sound intention of measuring a set of language abilities, test takers may apply radically different processes to answer the items, so the manner in which the grammar G/F task is performed is also extremely important to investigate. Unlike a previous study (Zhang 2015), which examines the validity of the grammar G/F task in the NMET on the basis of students' scores, the present study focuses on exploring test takers' internal processes through retrospective introspection, to find clues about whether the test takers' actual performance is in line with the test designer's expectations. The abilities that the test designers intend to measure are listed and the expected performance of test takers is specified; students' actual answering process and observed performance are then compared with the ideal performance. If test takers' answering process shows that they have used the expected constructs or abilities, then the test has sound validity. By contrast, if test takers' actual answering process is not consistent with the performance process the test designers expect, then the validity of the test is called into question.

1.2 Research Questions

As mentioned above, the present study aims to find out whether the constructs of the grammar G/F task that the test designers intend to measure are also recognized by high school English teachers. It also investigates whether the test takers, in completing the test, do use the abilities that test designers assume to be tested in the grammar G/F task. The research questions are as follows:

(1) What is the high school English teachers' perception of the construct of the grammar G/F task in NMET Guangdong?
(2) Does test takers' answering performance reflect the abilities that test designers intend to measure?

1.3 Outline of the Thesis

This thesis contains five parts: (1) Introduction, (2) Literature Review, (3) Methodology, (4) Results and Discussion, and (5) Conclusion.

Chapter 2 reviews the related literature, including the definition of validity, the development of validity theory, the major types of validity and an introduction to the grammar G/F task. The definition of grammatical competence is also specified in this chapter. Lastly, empirical studies concerning the test-taking process method as well as the validity of the grammar G/F task are reviewed.

Chapter 3 focuses on the research design of this study, including instruments, participants, procedure, and the handling of the collected data and verbal report
data.

Chapter 4 presents the results from the collected data, including data from the questionnaires and the retrospective interviews. A table of the mean score of each statement in the questionnaire is given, and a table of the frequency of participants' response categories is also presented to help answer the research questions.

Chapter 5 draws a conclusion for the present study; implications and limitations of the study are also discussed.

CHAPTER TWO LITERATURE REVIEW

2.1 Overview

As mentioned in Section 1.1 above, the primary objective of the present study is to examine the construct validity of the grammar G/F task in NMET Guangdong. At least three areas have to be reviewed to lay a theoretical foundation for the present study. Firstly, since this study is a validation study and the construct validity adopted in it is a unitary concept that entails other kinds of validity, the definition of validity, the development of validity theory and the major types of validity are briefly reviewed. Secondly, as the format of the grammar G/F task in the NMET is a variant of the cloze test and its aim is to measure grammatical knowledge, a brief introduction to the grammar G/F task is included in this chapter. Most importantly, the question of what is meant by grammatical ability is also elucidated in this chapter, and Lado's model of grammatical ability serves as the theoretical basis for the present study. In addition, since the present study adopts a process-oriented method, empirical studies investigating the test-taking process and the grammar G/F task are reviewed to offer an empirical basis for the present study's research design and its handling of the verbal reports. Lastly, the chapter is summarized.

2.2 Issues concerning test validity

2.2.1 Definition of validity

In test construction, validity has always been a significant factor that test designers have to consider, and numerous researchers have defined it with different emphases. Briefly, Cronbach (1949) states that a test is valid to the degree that we know what it measures or predicts. Henning (2001: 89) offers an impressive definition
that validity in general refers to "the appropriateness of a given test or any of its component parts as a measure of what it is purported to measure." Alderson, Clapham, and Wall (1995) conceptualize it in a similar way, but they highlight that validity is the extent to which a test measures what it claims to measure. Hunter and Schmidt (1990) also argue that the validity of a test should not be simplified into a yes-or-no question; instead it should be seen as a continuum between invalid and valid. In other words, they point out clearly that the validity of a test is a matter of degree. Bachman and Palmer (1996) focus their understanding of test validity on the relationship between interpretation and scores; they hold that if the interpretation made on the basis of test scores is appropriate and meaningful, then the test is valid. Along similar lines, Weir (2005) further explains that validity refers to the extent to which test scores can truly reflect the test takers' language skills or knowledge.

From this initial review of the diverse definitions of validity, it can be concluded that validity is a matter of degree. Validity is also concerned with the relationship between test performance and the underlying ability to be measured, so it is senseless to say that the test content or the test scores in themselves are valid or not. In brief, validity refers to the degree to which a test measures what it is intended to measure.

2.2.2 Development of validity theory

Validity theory in language testing originates from the field of educational and psychological measurement. As researchers' study and understanding of validity have developed, theories of validity can be classified into three stages: (1) validity as a single concept, (2) various categories of validity, and (3) validity as a unifying concept (Li 2006).

Initially, researchers in the language testing field generally viewed validity as a concept referring to the extent to which test scores can be used to measure or predict abilities, or to the correlation coefficient between test scores and an external criterion. This stage is called the single concept stage. However, with the appearance of content validity, the single concept stage
comes to an end. Researchers begin to investigate validity from various perspectives, so that one test could have several different "validities". Angoff (1988) sums up the different "validities" found in academic articles in the field of language testing and lists as many as 16, including content validity, construct validity, face validity, predictive validity, criterion-related validity, concurrent validity, convergent validity, discriminant validity, factorial validity, ecological validity, operational validity, intrinsic validity, task validity, population validity, temporal validity and validity generalization. Apart from these 16 "validities", the dictionary edited by Davies, Brown, Elder, Hill, Lumley and McNamara (2002) adds others such as external validity, consequential validity, divergent validity and response validity. Thorndike (1997) proposes still more, such as empirical validity, curricular validity, instructional validity, rational validity and logical validity. The kinds of validity in the literature far outnumber those listed here. These various kinds of validity show that the concept of validity was still being refined and developed; there was no consensus on the issue until the trinitarian doctrine was proposed.

In the second edition of Standards for Educational and Psychological Tests and Manuals, published in 1954 by the APA (American Psychological Association), four types of validity are proposed: content validity, predictive validity, concurrent validity and construct validity. In the third and fourth editions, the Standards merge predictive validity and concurrent validity into criterion-related validity. Guion (1980) calls this kind of theory "the holy trinity". The trinitarian doctrine has exerted a profound impact from the 1970s to the present. But it has not escaped criticism from researchers (Cronbach 1971; Guion 1980; Messick 1987) for its arbitrariness in the gathering of validity evidence. In recent studies of validity, construct validity has become dominant, which has brought about the era of validity as a unifying concept. In the following section, the major types of validity, especially construct validity, are reviewed in detail.

2.2.3 Types of validity

2.2.3.1 Construct validity

Before discussing construct validity, the first consideration is to address what a construct is. Cronbach and Meehl (1955) contend that a construct is a postulated attribute of a person, and that this attribute is supposed to be manifested in test performance. Later, Hughes (1989) offers a brief definition: in the field of language testing, a construct refers to any underlying ability which is hypothesized in a theory of language ability. Ebel and Frisbie (1991) suggest that a construct is a psychological attribute; examples such as motivation, attitude, intelligence, anxiety, reading comprehension and dominance are all assumed to be constructs. All these attributes have one common feature: they cannot be observed or measured directly. In short, any attribute that reflects an aspect of human behavior but cannot be directly observed or measured can be a construct. In the field of language testing, a construct refers to test takers' underlying ability or skills, which cannot be measured directly.

As for the concept of construct validity, construct validity was first introduced, along with predictive validity, content validity and concurrent validity, in the Standards for Educational and Psychological Tests and Manuals published in 1954. Since the emergence of this concept, numerous researchers have attempted to define it in their own ways. Nowadays, however, the unifying concept of validity has been accepted by measurement professionals as a whole (Messick 1990). Instead of classifying validity into different categories such as face validity, content validity and criterion-related validity, measurement specialists have begun to view these as aspects of a unitary concept of validity that subsumes all of them. The term construct validity is used to refer to a general, integrated, overarching notion of validity (Bachman 1990; Hughes 2000). The present study also adopts the unitary concept of validity; that is, the validity of the grammar G/F test in this study refers to its construct validity, which entails other kinds of validity such as content validity and face validity.
2.2.3.2 Content validity

Content validity can be traced back to its first appearance in the first edition of Educational Measurement (Lindquist 1942). According to Messick (1987), "content validity is based on professional judgments about the relevance of the test content to the content of a particular behavioral domain of interest and about the representativeness with which item or task content covers that domain." On the basis of this comprehensive definition, it can be concluded, firstly, that in content validation the evidence coming from experts or test takers is crucially important. For the same reason, the present study focuses on the test takers rather than test scores to validate the grammar G/F task. Secondly, it is acknowledged that content validity is mainly concerned with content relevance and content coverage. Unlike construct validity, whose interest lies in the consistency between test scores and their interpretation, content validity attaches more importance to whether the content of the test is sufficiently relevant to the language skills or structures the test is meant to measure, and whether the sample the test selects can represent the behavioral domain in question (Bachman 1990; Henning 2001; Hughes 1989).

Content validity therefore helps to ensure the construct validity of the test: if the test items are not consistent with the objective, or if the test measures knowledge that test takers have never been exposed to, it is very likely that the test is not an accurate measure of what it is presumed to measure. Likewise, if the content of the test favors some particular knowledge or structures to the exclusion of others, the overlooked areas that are not tested will also be ignored in language learning and teaching, resulting in harmful washback. Content validity is therefore also a critical factor in designing tests.
2.2.3.3 Criterion-related validity

Criterion-related validity demonstrates a relationship between test scores and some criterion which we believe is also an indicator of the ability tested (Bachman 1990). It comprises two kinds of validity: concurrent validity and predictive validity. According to Alderson et al. (1995), concurrent validity involves comparing the test scores with some other measure for the same candidates taken at roughly the same time as the test. Concurrent validity is also an empirical validity, in the sense that it is often established by computing a correlation coefficient, and a higher correlation coefficient indicates higher validity. For example, test designers administer their own test together with a standardized test at the same time and calculate the correlation between the scores on the two tests; the higher the correlation coefficient, the more valid the test seems to be.

Predictive validity concerns the degree to which test takers' future performance can be predicted from a prior test, or how well test scores can predict candidates' ability (Hughes 1989). Alderson et al. also propose that the simplest way of predictive validation is to give test takers a test, and then at some appropriate point in the future give them another test. However, predictive validation may encounter problems in practice, because it may be difficult to make sure that every test taker who took the original test can take a test for the second time. This is the so-called truncated sample problem (Alderson et al. 1995). Moreover, if performance on the second test has improved, the reason may be either the enhancement of test takers' language competence or the interval between the two tests. It is also hard to design a parallel test. It is therefore essential to take these issues into consideration when predictive validity is examined.

2.2.3.4 Face validity

Combining the viewpoints of Richards and Schmidt (1998) and Hughes (2000), a test can be described as having face validity if the test appears to measure the knowledge
or abilities it claims to measure. Furthermore, Heaton (2000) elucidates that a test is said to have face validity if the test looks right to other teachers, testers, moderators, and examinees. In summary, face validity represents the viewpoints of the public, whose judgment is not necessarily professional but intuitive and subjective. For this reason, face validity has been opposed by some language testing specialists because it lacks a theoretical basis. Despite this criticism, face validity has received increasing acceptance in the field of language testing, because good face validity can maintain students' motivation. If the test paper looks sound in the eyes of students, most students will try harder. If the test paper seems to have little relevance in the eyes of students, they may not put their maximum effort into performing the tasks in the test; hence, the reliability of the test will be damaged (Heaton 2000). Face validity is therefore considered important with regard to the use of a test, because it has to do with the surface credibility or public acceptability of the test (Davies et al. 2002).

2.3 Introduction of Grammar G/F test

The G/F test is one type of cloze test, alongside the fixed-ratio cloze, the MC cloze and the C-test. The G/F test is also called the rational deletion cloze.

The MC cloze is the most widely used type of cloze test in various tests. It combines the features of the cloze test and MC items: test takers have to restore the word in the blank, but they are given four options as clues from which to choose the correct word. Thus, it can tap test takers' answers. The MC cloze has received approval to some extent because its scoring is objective as well as efficient. But researchers also recognize that the options in the MC cloze inevitably give hints to test takers and thus influence its validity. There has also been intense argument among language testing researchers over what the MC cloze actually measures. Some studies support the claim that it can be used as a reading comprehension test (Bachman 1985; Chavez-Oller, Chihara, Weaver and Oller 1994; Chihara, Oller, Weaver and Chavez-Oller 1977; Fotos
1991; Jonz 1990; McKenna and Layton 1990), while others state that it only measures the ability to use local syntactic constraints (e.g. Alderson 1979; Kibby 1980; Markman 1985; Shanahan, Kamil and Tobin 1982).

The C-test is also a variant of the cloze test. It was designed by Klein-Braley in 1981. To construct it, a passage is selected whose first and last sentences are kept intact; starting from the second word of the second sentence, every other word is deleted, while the first letter or two of each deleted word is retained to help test takers restore the words. Klein-Braley (1984) claimed several merits of the C-test over other cloze tests, including its objective scoring, its simple construction and the larger number of items that can be measured in a passage.

Another type is the fixed-ratio cloze, in which the words in the passage are deleted at regular intervals to form the blanks. Usually a word is deleted every five to eleven words, because if the blanks in the passage are too dense the test is too difficult, and if they are too sparse it is too easy. The weakness of the fixed-ratio cloze is that there is little control over the difficulty level of the blanks. Since the words are deleted at regular intervals, some blanks may turn out to be quite easy while others are extremely difficult; or the word that test designers intend to test may not fall at the interval, while irrelevant words such as figures, numbers, dates or personal names happen to fall in the blanks.

The rational cloze, which is called the G/F cloze, adopts a more flexible way of deleting words. In constructing a rational cloze, the deletion interval serves only as a reference for test constructors, who can deliberately delete particular function words, verbs, nouns and so on. This way of deletion makes the test more targeted. Some test experts argue that it is subjective, because it depends heavily on the judgment of the test constructors, but it allows test constructors to select the test points purposefully. The grammar G/F task in NMET Guangdong belongs to this kind of task, which requires test takers to fill each gap with a response appropriate to the context. The task was designed to replace the faulty MC grammar items in order to better measure learners' grammatical competence. Therefore, unlike other cloze tests, whose principal purpose is to assess overall reading comprehension ability, the grammar G/F task in NMET Guangdong is used as a grammar task to measure learners' grammatical ability. It is thus highly necessary to define clearly what grammatical ability is.

2.4 Grammatical ability

In language learning and teaching, it is acknowledged that grammar has long received considerable attention. But what is commonly meant by 'grammar' in theories of language (Purpura 2004)? In the past, knowledge of grammar was narrowly considered to be a set of linguistic rules; students usually learned grammar by reciting abstract rules, examples and exceptions. However, with the proposal of communicative language competence, linguists' attention shifted from isolated language rules to language use for communication. Teachers and language specialists therefore assert that grammar should not only be learned but also be used in some linguistic context. From then on, grammatical knowledge was no longer viewed as a set of linguistic rules to be memorized, but rather as a set of rules to be applied for communicative purposes, and grammatical ability came to be considered an indispensable part of communicative language ability. Within models of communicative competence, many researchers have also devoted themselves to defining grammatical competence, which is also called grammatical ability.

In the early 1960s, Lado (1961) made an initial attempt to conceptualize grammatical knowledge. He proposed a 'skills-and-elements' model of language knowledge. Skills refer to listening, reading, speaking and writing; elements refer to phonology, structure and lexicon. It is presumed that language learners disclose their knowledge of the elements in the context of the language skills. The problems with Lado's model are that, firstly, grammatical knowledge in his model refers only to morphosyntactic form. Secondly, the elements in his model (phonology, structure and lexicon) are treated as independent. Last, the discrete-point approach, which means measuring one point of grammar at a time, has also received scorn from other
researchers. Rejecting Lado’s skills-and-elements model, Oller (1979) proposed an intricate and radically different definition of grammatical knowledge. He maintains that grammatical knowledge entails grammatical form (including lexicon, morphosyntax and phonology) not only at the sentential level, but also at the suprasentential or discourse level through cohesion and coherence. Oller also put forward the concept of pragmatic expectancy grammar, integrating pragmatic knowledge into grammar. However, both Lado and Oller fail to isolate grammatical competence from other components of language competence.

Canale and Swain (1980) suggest a communicative competence model which embraces grammatical competence, discourse competence, sociolinguistic competence and strategic competence. In their model, grammatical competence refers to knowledge of the rules of phonology, the lexicon, syntax and semantics. Although Canale and Swain recognize that grammatical competence incorporates both form and meaning, they fail to elucidate the relationship between the two.

In 1996, Bachman and Palmer put forward a multi-component model of communicative language ability which also specifies some non-linguistic components of communicative language ability involved in language use. In their model, grammatical competence is divided into sentence-based phonology, graphology, vocabulary and syntax. Grammatical competence, together with textual competence (one’s knowledge of cohesion, rhetorical organization and conversational organization), forms organizational knowledge, which concerns how individuals control language structure to produce grammatically correct utterances, sentences or texts (Purpura 2004). Organizational competence and pragmatic competence in turn form language competence.

Larsen-Freeman (1991, 1997) also proposes a model characterizing grammatical knowledge along three dimensions: linguistic form, semantic meaning and pragmatic use. Linguistic form mainly concerns accuracy, comprising morphology and syntactic patterns; semantic meaning primarily involves lexical meaning and grammatical meaning; pragmatic use is concerned with presuppositions about context
including situational, linguistic, discourse and sociocultural context. These three dimensions are considered to be independent or interconnected.

Among the models proposed by Canale and Swain (1980), Bachman and Palmer (1996) and Larsen-Freeman (1997), there are undeniably some common features: all of them touch upon linguistic form, semantic meaning and pragmatics at some level. But the boundaries between these components are not clearly identified. Purpura’s (2004) model, illustrated in Figure 2-1, addresses this problem. Although the model acknowledges that grammatical knowledge and pragmatic knowledge are highly dependent upon each other in conveying meaning in a specific context, it treats them as separate components of language ability in order to better specify the relationship between the two. Figure 2-1 not only offers a comprehensive and clear definition of grammatical knowledge, but also includes some exemplified components of grammatical and pragmatic knowledge.

The focus of this study is grammatical knowledge, which embodies grammatical form and grammatical meaning at the sentential and suprasentential (discourse) levels. Grammatical form consists of six components: phonological/graphological form, lexical form, morphosyntactic form, cohesive form, information management form and interactional form. Grammatical meaning entails the meanings related to these six aspects. What merits attention is that Purpura’s (2004) model also lists, alongside the components of grammatical knowledge, a series of possible grammatical points that could be used to measure each component. This list informs the methodology of this study.
Grammatical Knowledge
Grammatical form (accuracy) and grammatical meaning (meaningfulness)

SENTENTIAL LEVEL

Phonological/graphological form
• segmental forms
• prosodic forms (stress, rhythm, intonation, volume)
• sound-spelling correspondences
• writing systems

Phonological/graphological meaning
• minimal pairs
• interrogatives, tags
• emphasis/contrast
• homophony (they’re, there)
• homography (the wind, to wind)

Lexical form
• orthographic forms
• syntactic features and restrictions (nouns)
• morphological irregularity
• word formation (compounding, derivational affixation)
• countability and gender restrictions
• co-occurrence restrictions (*depend on, in spite of)
• formulaic forms

Lexical meaning
• denotation and connotation
• meanings of formulaic expressions
• meanings of false cognates
• semantic fields (attributes of words denoting physical attractiveness)
• prototypicality (words denoting physical attractiveness)
• polysemy (head of person/bed/table)
• collocation (table and chair)

Morphosyntactic form
• inflectional affixes (-ed)
• derivational affixes (un-)
• syntactic structures (tense, aspect)
• simple, compound and complex sentences
• voice, mood, word order

Morphosyntactic meaning
• time/duration
• reversive (pack/unpack)
• interrogation, passivization
• cause-effect, factual/counterfactual

DISCOURSE OR SUPRASENTENTIAL LEVEL

Cohesive form
• referential forms (personal, demonstrative, comparative)
• substitution and ellipsis
• lexical forms (repetition)
• logical connectors (therefore)
• adjacency pairs

Cohesive meaning
• possession, reciprocity
• spatial, temporal or psychological links
• informational links to avoid redundancy
• additive, contrast, causal

Information management form
• prosody
• emphatic “do”
• marked word order (clefts)
• given/new organization
• parallelism

Information management meaning
• emphatic meaning
• focal meaning
• contrastive meaning
• foregrounding

Interactional form
• discourse markers (oh, ah)
• communication management strategies (turn-taking, repairs, fillers, paraphrase, word coinage)

Interactional meaning
• disagreement, alignment, hedging
• keeping the conversation moving, interruption
• repair by clarification

Figure 2-1 Definition and components of grammatical knowledge (from Purpura 2004, Assessing Grammar, p. 91)

2.5 Empirical studies both at home and abroad

2.5.1 Empirical studies abroad

As mentioned before, the grammar G/F task is one variant of the cloze test, so empirical studies concerning the G/F task are reviewed here. At the same time, the present study focuses on the investigation of test takers’ answering process, so validation studies adopting a process-oriented method are also reviewed.

Chapelle (1990) conducted a study investigating whether different formats of cloze test would influence measurement, comparing test takers’ performance on four different tests (the fixed-ratio test, the rational (G/F) test, the MC test and the C-test). The results reveal no significant difference in reliability among the four tests; in terms of difficulty, the fixed-ratio test is the most difficult and the MC test is the easiest. The study also compared the convergent and discriminant correlations of the four tests with listening, reading, vocabulary, writing and GEFT tests within Bachman’s (1990) framework. Contrary to the author’s expectation, the rational cloze has a higher correlation with the reading and vocabulary tests than with the writing test, whereas the fixed-ratio cloze has the highest correlation with the writing test. This result suggests that the rational cloze could measure overall language ability to some extent. However, the study did not explore why the fixed-ratio and rational cloze produced correlations of different strength with the other tests.

Apart from studies that investigate test method effects by comparing test takers’ performance on different tests, there is also a group of researchers engaged in validation studies exploring the cognitive answering process of test takers. These studies (Sasaki, 2000; Storey, 1997; Yamashita, 2003) all argue that the closeness of fit between how test constructors intend their tests to be taken and how respondents actually take them should be explored.
In Storey’s (1997) study, the think-aloud method was employed to examine the test-taking process on a discourse cloze test. In his opinion, a cloze item might appear to require intersentential reference to complete it, but a subject may in fact use a quite different strategy to close the gap, so the manner in which test tasks are performed is crucially important. The researcher therefore constructed an MC cloze test by deleting words such as discourse markers and anaphoric pronouns on the basis of their hypothesized importance in measuring intersentential understanding, and then identified a model of ideal performance based on the work of Kintsch and van Dijk (1978). The test takers’ think-aloud protocols were analyzed and their observed performance was compared with the ideal performance model. Overall, the items show varying degrees of construct validity because the observed performance is consistent with the ideal performance model.

Sasaki (2000) was also aware that most studies of cloze tests focused only on the scores of test takers rather than on the test takers themselves. As Sasaki remarked, it was possible that participants’ test performance (i.e. test scores) did not directly reflect their correct grasp of the given problems (Cohen, 1984; Grotjahn, 1986). Motivated by the need to obtain information about test takers’ cognitive answering process, Sasaki adopted immediate retrospective introspection to examine the effects of cultural schemata on test-taking processes for a fixed-ratio cloze test. Subjects were required to complete either a culturally familiar or an unfamiliar version of a cloze test and then gave a verbal report of their answering process. The verbal reports were coded on the basis of Bachman’s (1985) framework with the intention of finding out the amount of text information used to complete the items. The results showed that test takers who read the culturally familiar cloze performed better than those who read the unfamiliar version, and that subjects did use information beyond the sentence level in solving items. The results supported the claim that cloze tests can measure higher-order processing abilities.

To continue this line of investigation, Yamashita (2003) conducted a study to explore how skilled and less skilled EFL readers answered a G/F test
by the think-aloud method. The study focuses on finding out the types of information that test takers use to answer a G/F test, on the basis of the classification of cloze item types developed by Bachman (1985). Results demonstrate that both skilled and less skilled test takers use text-level information more frequently than other types of information. In general, the G/F test generates text-level processing and differentiates well between skilled and less skilled test takers. The study therefore confirms the claim that a G/F test can be used to measure higher-order processing ability.

It seems that studies abroad mainly focus on the validity of cloze tests, and that process-oriented methods of validating tests have received a lot of attention. In the following section, studies at home are reviewed.

2.5.2 Empirical studies at home

In China, there are also a large number of studies striving to validate the CET or TEM tests by gathering verbal report data to complement quantitative evidence (Chen and Li, 2006; Jin and Wu, 1998). Besides, with the appearance of the grammar G/F task in NMET, attention to this task is also increasing. Some recent studies on the validation of the grammar G/F task are reviewed below.

Firstly, Jin and Wu’s (1998) study, which focused on the test-taking process to validate the CET reading comprehension test, lays a theoretical foundation for later qualitative validation studies. They pointed out that previous studies of the validity of CET reading comprehension had collected only quantitative data such as the contents of test papers and test takers’ scores, whereas reading is a psychological process encompassing a series of cognitive activities. They therefore used the introspection method to analyze participants’ verbal report data, trying to find out the steps or methods that participants used when doing the task. They first specified a list of expected reading operations that the participants were supposed to use in their answering process. At the same time, specifications of expected reading performance and of non-contributory test-taking strategies were also stated. Next, they matched test takers’ actual answering performance with the specifications; if the test takers did use the expected operations, then the test was proved to be valid
; vice versa. The results further support the claim that the reading comprehension section of CET could measure the abilities expected by the test constructors.

Similar to Jin and Wu’s (1998) methodology, Chen and Li (2006) also used immediate introspection to examine the validity of the TEM-4 cloze test. The results reveal that the cloze test in TEM-4 has content validity; additionally, analysis of test takers’ answering process shows that students’ difficulty in completing the cloze test lies in differentiating the four options, which indicates that there is still room for improvement in the TEM-4 cloze test.

Hao (2009) investigated test method effects on test-taker performance in terms of responses and test-taking strategies in an MC grammar test and a grammar G/F task. It is found that the grammar G/F task can elicit test takers’ grammatical competence, while in multiple-choice tests students might get the correct answer by using other strategies, such as test-wiseness. The results suggest that, compared with the multiple-choice test, the grammar G/F task could play a better role in testing students’ grammatical competence.

Xu and Zou (2013) attempted to validate the grammar completion test in NMET Shanghai by means of an experimental study and a questionnaire survey. The quantitative data suggest that the grammar G/F task differentiates high-proficiency students from low-proficiency ones well, and is more suitable for use in NMET than the traditional MC grammar items. The qualitative data, however, show no significant difference among different groups of students in their use of question-answering strategies.

The latest study, Zhang (2015), investigated the construct validity of the grammar G/F task in NMET Guangdong by Rasch model analysis. The subjects were senior high school students from Beijing, and the material was the 2012 Guangdong NMET grammar G/F task. Through analysis of the scores of 277 students, the results demonstrate that, despite some irrelevant factors affecting the construct, the test can achieve its intention of measuring test takers’ grammatical abilities in the constructed context. The study also has some limitations, such as that the subjects are not from
Guangdong province, and its recommendation for future studies is to gather information from multiple sources instead of a single source. The present study, which investigates test takers’ actual test-taking process, can therefore complement it by providing additional evidence for further validating the grammar G/F task.

2.6 Summary

This chapter reviews literature on the definition of validity, the different stages of validity theory and the types of validity. It can be concluded that the unified concept of validity is agreed upon among language testing researchers, so in the present study validity refers to construct validity, which subsumes content validity, face validity and criterion-related validity. The review of definitions of grammatical knowledge also suggests that the grammar G/F task is designed to measure grammatical form and meaning at the sentential or suprasentential level. This chapter also reviews empirical studies concerning the process-oriented method and the validation of the grammar G/F task; the present study draws on the method of Jin and Wu’s (1998) study, and the methodology is described in detail in the following chapter.
CHAPTER THREE METHODOLOGY

3.1 Overview

The previous chapter has reviewed the literature on test validity as well as empirical studies concerning the grammar G/F task and the method used in the present study. In this chapter, section 3.2 introduces the research questions of this study, followed by the research design in section 3.3, encompassing the instruments adopted, the participants and the research procedures. In section 3.4, the analytical method for the verbal report data is described in detail. Finally, a summary of this chapter is given in section 3.5.

3.2 Research questions

As mentioned in chapter one, the present study attempts to examine the validity of the grammar G/F task in NMET Guangdong. The test designers’ intention in designing this test is to measure students’ knowledge of grammatical forms and meanings at the sentential or discourse level. It is expected that the test can activate students’ grammatical knowledge and stimulate students to apply it to language in a linguistic context. But can the grammar G/F task really measure what it is assumed to test? The high school English teachers’ understanding of this test is important here, because if the teachers agree with the test designers about what the grammar G/F task measures, this helps to show that the test designers’ intention has been realized. At the same time, teachers’ accurate understanding of the test also guarantees that students prepare for it in the right way. This study therefore administers questionnaires to high school teachers to investigate their view of the abilities measured by the grammar G/F task. At the same time, to ensure the validity of the questionnaire, an
interview with one test designer of the grammar G/F task helps the author to identify the constructs of the task. On the other hand, students’ answering process can also provide powerful evidence for the validity of the grammar G/F task. If the students do use the abilities which the test designers intend to measure to answer the items, then it can be concluded that the test measures what it is supposed to measure. From the participants’ verbal report data, we can therefore find out whether their answering process reflects the abilities that the test designers intend to measure. The present study thus addresses the following questions:

1. What is the high school English teachers’ perception of the construct of the grammar G/F task in NMET Guangdong?
2. Does test takers’ answering performance reflect the abilities that test designers intend to measure?

3.3 Research Design

3.3.1 Instruments

The instruments used in this study comprise a questionnaire for high school English teachers and a grammar G/F task for senior high school students.

Questionnaire for high school English teachers: The questionnaire consists of two sections. The first section contains five MC questions designed to obtain the teachers’ background information, including their gender, age, educational background, teaching experience and the rank of the school at which they teach. The second section comprises 12 statements. These statements describe abilities measured by the grammar G/F task, and the teachers choose a point from 1 to 5 to indicate their opinion (1 means they strongly disagree that the grammar G/F task has measured the ability; 2 means the teachers disagree with the statement that the grammar G/F task has measured the ability; 3 means the teachers are not sure whether the grammar G/F task has measured the ability or not; 4 means the teachers
agree with the statement that the grammar G/F task has measured the ability; 5 means they strongly agree with the statement). Among these 12 statements, 4 describe irrelevant abilities which are not measured by the grammar G/F task; these 4 statements serve as distractors. The other 8 statements, concerning the abilities measured by the grammar G/F task, were constructed on the basis of the interview with one designer of the test and with reference to the theoretical definition and components of grammatical knowledge proposed by Purpura (2004). As illustrated in Figure 3-1, the 8 statements cover abilities measured at the word level, sentence level and discourse level, and the grammatical abilities touch upon two dimensions, i.e. form and meaning.

Questionnaire for high school English teachers
• Statements 1, 2, 4: word level
• Statements 5, 6, 8: sentence level
• Statements 10, 11: discourse level
• Statements 3, 7, 9, 12: distractors

Figure 3-1 The distribution of the statements in the questionnaire

The questionnaire underwent several rounds of discussion with the author’s supervisor and was modified several times after a pilot collection, before a final version, intended to be as effective and sound as possible, was produced (see Appendix A).

Grammar G/F test: To ensure the validity of the grammar G/F task, the grammar G/F test used in the present study is a backup version of the NMET Guangdong test. Thanks to the help of the author’s supervisor, the original grammar G/F task in NMET Guangdong was accessible to the author. Moreover, since the test
is a backup version of NMET Guangdong, it is certain that the test used in this study has not been done by high school students before. The grammar G/F task employed in this study is attached in Appendix B. The passage is an exposition of approximately 150 words about the origin of American Independence Day, a topic which should not be completely unfamiliar to the students, although some blanks do seem a little difficult for high school students to complete.

3.3.2 Participants

Test designer: The test designer is a professional and experienced language assessment researcher who played a crucial role in introducing the grammar G/F task into NMET Guangdong. The interview with the test designer is therefore the most direct and helpful way for the author to find out the underlying abilities that the test designers intend to measure, and it provides valuable information for the construction of the questionnaire.

High school English teachers: 161 high school English teachers from various schools in Guangdong province answered the questionnaire. Their background information is shown in Table 3-1. As the table illustrates, the majority of these teachers are female, and most are aged 31 to 40. 90.38% of them hold a bachelor’s degree while only 9.62% hold a master’s degree. As for teaching experience, more than 60% of them have been teaching for more than 10 years. These teachers come from various schools, including provincial key schools, city key schools, district key schools, ordinary schools and foreign language schools.

Grammar G/F test takers and retrospective interviewees: 8 high school students took part in the experiment, which included completing the grammar G/F test and taking part in a retrospective interview. These students are in Grade 3 at Guangzhou Tieyi Middle School. All of them are quite familiar with the grammar G/F task and do the task frequently in their daily preparation for the college entrance
examination. Among them, there are 6 female students and 2 male students. They were picked by their English teacher as students with a high level of English proficiency in the class. All of them were willing to participate in the experiment.

Table 3-1 Basic information of questionnaire respondents

Category                  Group                      Percent (%)
Gender                    Male                       17.85
                          Female                     82.05
Age                       21-30                      23.08
                          31-40                      53.21
                          41-50                      21.15
                          51-60                      2.56
Education Background      Bachelor Degree            90.38
                          Master Degree              9.62
Teaching Experience       1-3 years                  12.82
                          4-9 years                  22.44
                          9-17 years                 35.26
                          >18 years                  29.49
The Rank Of The School    Provincial Key School      12.82
                          City’s Key School          18.59
                          District Key School        28.85
                          Average School             39.10
                          Foreign Language School    0.64

3.3.3 Procedure

Questionnaire construction and administration: Firstly, an interview with the test designer of the grammar G/F task in NMET Guangdong was conducted on May 18th, 2015. In the interview, the test designer explained the original intention behind the grammar G/F task in NMET Guangdong, the reason why it was decided to replace the MC grammar items with this new test format, and the abilities the designers attempt to measure through this task. According to his explanation, the grammar G/F task can play a better role in activating test takers’ grammatical competence in comparison with MC
items, which may easily result in test-wiseness. The task is designed with the intention of measuring test takers’ grammatical competence in context, activating test takers’ abilities to analyze the blanks at the word, sentence and discourse levels and to make full use of their grammatical knowledge (including knowledge of vocabulary, morphology, syntax and phonology/graphology) and their textual competence (knowledge of the cohesive and rhetorical properties of text). The following table presents the test designers’ presumption of the underlying abilities measured by the grammar G/F task.

Table 3-2 Test designers’ presumption of the underlying abilities measured by grammar G/F task

Dimensions (rows): Form, Structure, Meaning
Word: Lexical knowledge: lexical form, orthographic forms, word formation, countability, voice, mood, collocation etc.
Sentence: Structure of clause and the logic relationship between clauses etc.
Discourse: Cohesion and coherence: referential forms, substitution, ellipsis, logical connectors etc.

Based on this table of the test designers’ presumption of the underlying abilities measured by the grammar G/F task, the statements of the questionnaire were constructed. All statements in the questionnaire center on 3×3 dimensions, that is, the word, sentence and discourse levels, and form, structure and meaning.

The author took advantage of a meeting attended by hundreds of high school English teachers from Guangdong province to distribute the questionnaires on June 16th, 2015. However, after the questionnaires were collected, the percentage of valid questionnaires was not satisfactory, and some issues also emerged from this pilot collection. The final version of the questionnaire was therefore modified and improved in the following respects. Firstly, more care was taken to make the statements easier for respondents to understand. For instance, in statement 10 and statement 11, the definition of coherence and examples of cohesion are given in parentheses to help respondents better understand these two academic terms. Secondly, some statements
needed to be specified more clearly. Therefore, in statement 1 and statement 2, some specific language points are also noted in parentheses. Lastly, the boundaries between the statements were made clearer. Through careful consideration and discussion with the author’s supervisor, the final version of the questionnaire came into being. On October 11th, 2015, the final version was distributed to high school English teachers during another meeting. The host of the meeting told the teachers that it would be appreciated if they could spare a few minutes to complete the questionnaire, and emphasized that they should take the questionnaire seriously since their answers were quite important for the study. In the end, 205 questionnaires were collected, of which 156 were valid.

Conduct of the retrospective interview: In order to investigate students’ answering process, the author conducted the retrospective interviews on November 18th, 2015. The participants were chosen by their English teachers. In the eyes of their English teachers, these eight students have a high level of English proficiency and all of them are serious about studying. They were told by their English teacher to finish the G/F task without consulting a dictionary or other resources, and that afterwards they would be interviewed by the researcher. At the beginning of each interview, the author explained clearly to the student that the researcher would ask some questions about their answering process, and that they were supposed to tell the researcher how they got their answers and why they eventually decided to write a given word in the blank. They were encouraged to describe their answering process in as much detail as possible. Besides, they were assured that there would be no judgment of right or wrong concerning what they said. In addition, the researcher reminded them beforehand that the researcher would sometimes ask further questions about a blank, and that this did not mean their answer for that blank was wrong. What they had to do was to try to explain how they thought while they were doing the grammar G/F task. The interview scheme used in this study is in Chinese (see Appendix C); sometimes the same question was asked in different ways, but the meaning was the same. Each interview lasted 7.5 minutes on average, and all the recordings were transcribed and coded for further analysis.

3.4 Analytical Methods

For the questionnaires, all data were entered into Excel, and the mean score of each statement was calculated in Excel. The mean score of each statement shows the degree to which high school English teachers agree with the statement.

For the verbal reports, the interviews were conducted in Chinese for the sake of clear understanding and smoother expression, and the verbal reports were transcribed in Chinese. The analysis of the verbal report data involves four steps. Firstly, a list of the underlying abilities measured by the grammar G/F task (constructs of the G/F test, or CGFT) is given as follows:

1) CGFT1: The grammar G/F task measures knowledge of word formation, such as conversion of word class and morphological transformation.
2) CGFT2: The grammar G/F task measures the correct grammatical form of words, including singular and plural forms of nouns, third-person singular verbs, tense, comparatives of adjectives and non-predicate verbs.
3) CGFT3: The grammar G/F task measures set phrases and collocations.
4) CGFT4: The grammar G/F task measures set structures.
5) CGFT5: The grammar G/F task measures test takers’ understanding of sentence meaning.
6) CGFT6: The grammar G/F task measures test takers’ analysis of the structure and components of the sentence.
7) CGFT7: The grammar G/F task measures contextual coherence.
8) CGFT8: The grammar G/F task measures contextual cohesion.

The answers of the test takers alone cannot reveal whether the underlying abilities that the test designers aim to measure have actually been realized in the answering process. Therefore, we make our judgment on the basis of the expected test
taking performance. So the second step is to specify the expected test-taking performance (ETTP):

1) ETTP1: Students can apply morphological knowledge, transforming the given word into the correct form to fit the context.
2) ETTP2: Students can apply grammatical rules to decide the appropriate form of a word so that the text is grammatically correct.
3) ETTP3: Students can recognize the usage of certain words and phrases, and can fit them into the blank to make the sentence meaningful.
4) ETTP4: Students can recognize the set structure in the text and fill the blank so that the sentence is complete and grammatically correct.
5) ETTP5: Students can understand the meaning of the sentence and decide which word to put in the blank according to that meaning.
6) ETTP6: Students can analyze the components and structure of the sentence, recognizing the usage and function of the word deleted from the passage.
7) ETTP7: Students can recognize the cohesive ties which hold the text together in its original order.
8) ETTP8: Students can recognize the semantic meaning of the text and fill in a word which makes the text meaningful.

The third step is to analyze the grammar G/F task employed in this study. The constructs measured by each item were analyzed by the author together with two other postgraduate students who also major in language testing and assessment. Following discussion with them, the constructs measured by each item in the grammar G/F task used in this study are coded in Table 3-3.
Table 3-3 Constructs measured by each item in the present study’s grammar G/F test

Item No.    Construct measured by each item
1           CGFT5 & CGFT8
2           CGFT6
3           CGFT6 & CGFT1
4           CGFT6 & CGFT1
5           CGFT5
6           CGFT5 & CGFT7
7           CGFT5, CGFT7 & CGFT3
8           CGFT3
9           CGFT1
10          CGFT5 & CGFT3

Based on the above list of expected test-taking performance and the table of constructs measured by each item, the last step is to match the observed performance of participants with the expected test-taking performance list. The participants’ actual answering performance can then be categorized into the following four situations:

1) Test takers answer the item correctly and have used the abilities assumed to be measured by the grammar G/F task.
2) Test takers answer the item correctly, but have not used the abilities assumed to be measured by the grammar G/F task.
3) Test takers answer the item incorrectly, but have used the abilities assumed to be measured by the grammar G/F task.
4) Test takers answer the item incorrectly, and fail to use the abilities assumed to be measured by the grammar G/F task.

To illustrate the analytical method for the verbal report data, two instances from the participants’ verbal reports, together with their analysis, are shown in Examples 1 and 2.
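The matching step described above can be sketched in code. The following Python fragment is only an illustrative sketch, not part of the study’s procedure: the item-to-construct mapping is copied from Table 3-3, while the function name classify_response, the representation of a coded verbal report as a set of construct labels, and the rule that every expected construct must be evidenced in the report are assumptions made here for illustration.

```python
# Item-to-construct coding, copied from Table 3-3.
ITEM_CONSTRUCTS = {
    1: {"CGFT5", "CGFT8"},
    2: {"CGFT6"},
    3: {"CGFT6", "CGFT1"},
    4: {"CGFT6", "CGFT1"},
    5: {"CGFT5"},
    6: {"CGFT5", "CGFT7"},
    7: {"CGFT5", "CGFT7", "CGFT3"},
    8: {"CGFT3"},
    9: {"CGFT1"},
    10: {"CGFT5", "CGFT3"},
}


def classify_response(item, answered_correctly, observed_constructs):
    """Return the situation number (1-4) for one participant on one item.

    observed_constructs is the set of construct labels that a human coder
    judged to be evidenced in the verbal report (hypothetical representation).
    Assumption: the expected abilities count as "used" only if every
    construct coded for the item appears in the report.
    """
    expected = ITEM_CONSTRUCTS[item]
    used_expected = expected <= set(observed_constructs)
    if answered_correctly:
        return 1 if used_expected else 2
    return 3 if used_expected else 4


# Participant 02, item 1: correct answer, both constructs evidenced.
print(classify_response(1, True, {"CGFT5", "CGFT8"}))  # situation 1
# Participant 06, item 2: wrong answer, sentence structure not analyzed.
print(classify_response(2, False, set()))  # situation 4
```

Under these assumptions, the two printed cases correspond to the codings reached for Participant 02 and Participant 06 in Examples 1 and 2. The judgment of which constructs a report actually evidences remains a human coding decision; the sketch only records its outcome.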
Example 1: Participant 02, Item No. 1

Independence Day is regarded as the birthday of the United States as a free and independent nation. Most Americans simply call __1__ the “Fourth of July,” on __2__ date it always falls.

The expected constructs measured by the item: CGFT5 (understanding the sentence meaning) and CGFT8 (the contextual cohesion).

The retrospective interview of participant 02: The previous sentence has mentioned Independence Day; the first blank is filled according to the sentence before this sentence. And then I analyze this sentence (the sentence with the blank): “most Americans” serves as the subject, “call” is the predicate, and an object is lacking here, so I wrote “it” in the blank to refer to the Independence Day mentioned before.

The analysis of the retrospective interview data: Blank No. 1 requires test takers to fill in a pronoun to substitute for the Independence Day that has appeared in the first sentence. Therefore, item No. 1 measures the ability to understand the meaning of the first sentence as well as contextual cohesion. Participant 02 clearly points out that the answer to the first blank depends on the previous sentence, because Independence Day has appeared in the first sentence. This shows that the participant is clear about the relationship between the sentences and also understands what the sentences are talking about. At the same time, the participant also analyzes the components of the second sentence to check her answer, making sure that the pronoun she has filled in makes the sentence grammatically correct. On the basis of the above analysis, it can be concluded that the participant has answered the blank correctly and, at the same time, has demonstrated the abilities the item is expected to measure. Her answering process therefore matches the expected test-taking performance.
Example 2: Participant 06, Item No. 2

Independence Day is regarded as the birthday of the United States as a free and independent nation. Most Americans simply call ___1___ the "Fourth of July," on ___2___ date it always falls.

The expected construct measured by the item: CGFT6 (analyzing the structure of the sentence).

The retrospective interview of participant 06: Frankly speaking, I am not very sure about my answer to this blank. Therefore, I had to read the sentence "Most Americans ..." again, but what a pity that I was still at a loss. I think the word in the blank should be used to replace something. It occurred to me that it (the word in the blank) is used to refer to "the date", so I guessed the word should be that or which. I put the two words into the blank, and that seemed to make the sentence smoother. So I wrote that.

The analysis of the retrospective interview data: Item 2 aims to measure whether test takers can recognize the use of a preposition followed by a relative pronoun in a non-restrictive attributive clause. It requires test takers to analyze the structure of the sentence. In the retrospective interview, participant 06 mentioned that he could not understand this sentence and that his answer was based on his personal feeling. Even though he thinks the answer might be that or which, he fails to distinguish between the usage of that and which in the non-restrictive clause. He also did not mention that this is a clause, which reveals that he is not clear about the structure of the sentence. So, based on his account, his answer is wrong and he does not demonstrate the ability the item is expected to measure.

3.5 Summary

As mentioned in the overview, this chapter introduces the research questions of the present study, the research design and the analytical method. The author conducted an interview with one test designer of the grammar G/F task to find out the underlying abilities assumed to be measured by the task; constructed, distributed and collected questionnaires; conducted retrospective interviews to investigate the test takers' answering process; and reported how the verbal data were analyzed. The next chapter reports the results of this study.
CHAPTER FOUR RESULTS AND DISCUSSION

4.1 Overview

This chapter reports the results of the study and discusses the findings. Specifically, Section 4.2.1 presents the results of the questionnaire, reporting the mean score of each statement; Section 4.2.2 addresses the second research question by comparing and analyzing test takers' actual performance against the expected answering performance. Section 4.3 is devoted to discussing the reasons underlying the results. Finally, Section 4.4 summarizes the chapter.

4.2 Results of the study

4.2.1 Results for Research Question 1

Research question 1: What is the high school English teachers' perception of the construct of the grammar G/F task in NMET Guangdong?

This question is answered by distributing questionnaires to high school English teachers to obtain their opinions about the underlying abilities measured by the grammar G/F task in NMET Guangdong. The mean score of each statement shows the degree to which high school English teachers agree with the statements in the questionnaire. If the majority of the teachers choose 4 or 5, which stand for agreement and strong agreement with the statement, it shows that the high school English teachers agree that the grammar G/F task has measured the ability in question. The mean score of each statement is therefore calculated and presented in Figure 4-1.
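The two summary measures described above, the mean score of a statement and the share of teachers choosing 4 or 5, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name summarize_statement and the sample responses are hypothetical and are not the study's data.

```python
from statistics import mean


def summarize_statement(responses):
    """Return (mean score, share of responses that are 4 or 5).

    Each response is a rating on the questionnaire's 5-point scale
    (1 = strongly disagree ... 5 = strongly agree).
    """
    if not responses:
        raise ValueError("at least one response is required")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("responses must be integers from 1 to 5")
    agree_share = sum(1 for r in responses if r >= 4) / len(responses)
    return mean(responses), agree_share


# Hypothetical ratings from eight teachers for one statement.
sample = [5, 4, 4, 3, 5, 2, 4, 5]
score, share = summarize_statement(sample)
print(f"mean = {score:.2f}, agreement = {share:.1%}")
```

Reporting the agreement share alongside the mean is useful because a mean near 4 can arise either from broad agreement or from a mix of strong agreement and disagreement.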
Notes: 1 = strongly disagree; 2 = disagree; 3 = not sure; 4 = agree; 5 = strongly agree
Figure 4-1 The mean score of each statement in the questionnaire

Figure 4-1 shows the mean score of each statement in the questionnaire. Only 4 statements have a mean score below 3 points, namely statement 3 (the grammar G/F task has measured test takers' ability to guess the meaning of unfamiliar words with the help of the text); statement 7 (the grammar G/F task has measured test takers' ability to infer the implicit meaning of the context); statement 9 (the grammar G/F task has measured test takers' ability to differentiate important facts from details); and statement 12 (the grammar G/F task has measured test takers' knowledge of English cultural background information). As mentioned in the methodology chapter, these 4 statements are distractors included to check the validity of the questionnaire. The low mean scores of these statements are therefore reasonable, and the disagreement with them further confirms the validity of the questionnaire. The mean scores of the other 8 statements are approximately 4 points or above, which means the statements concerning the underlying
abilities measured by the grammar G/F task are approved by the majority of high school teachers. In addition, among the 8 statements, the mean scores of 3 statements are evidently higher than the others. Consider statement 1 in the questionnaire (i.e. the grammar G/F task has measured test takers' knowledge of word formation): as illustrated in Table 4-1, 95.65% of the high school English teachers agree with this statement. It is thus widely recognized that the grammar gap-filling test focuses on measuring morphological knowledge. Next, statement 2 (the grammar gap-filling task has measured grammatical form) is also commonly recognized by high school teachers. Table 4-1 shows that 100% of the teachers recognize that the grammar G/F task aims to measure grammatical form. Last, statement 8 (the grammar G/F task has measured test takers' ability to analyze sentence structure) is agreed with by 90.07% of the high school English teachers, which shows that most teachers also recognize the ability to analyze sentences as an important aspect measured by the grammar G/F task.

Table 4-1 Teachers' responses to some statements in the questionnaire

Statement                          1        2        3        4         5
1. Word Formation                  0.00%    3.73%    0.62%    22.36%    73.29%
2. Grammatical Form                0.00%    0.00%    0.00%    20.50%    79.50%
8. Sentence Structure Analysis     1.86%    1.86%    6.21%    26.71%    63.36%

Notes: 1 = strongly disagree; 2 = disagree; 3 = not sure; 4 = agree; 5 = strongly agree

Overall, the majority of high school English teachers agree with the underlying abilities assumed to be measured by the grammar G/F task. In other words, the majority of high school English teachers agree that the grammar G/F task has measured test takers' grammatical ability, including grammatical form and meaning at the sentential or discourse level. More specifically, the teachers agree that the grammar G/F task has measured grammatical knowledge encompassing word formation, grammatical form, set expressions, set structures, sentence meaning, analysis of sentence structure, and contextual coherence and cohesion. Notably, morphological
knowledge and analytical ability gain the widest recognition among the high school English teachers, which suggests that the grammar G/F task measures not only grammatical ability but also test takers' analytical ability.

4.2.2 Results for Research Question 2: The comparison between the participants' actual answering process and the test takers' expected performance

The grammar G/F task in this study contains 10 items. For each item, the frequency of four kinds of situations in the participants' actual answering process is calculated: test takers answer the item correctly using the expected ability supposed to be measured by the item; test takers answer the item correctly without using the expected ability; test takers answer the item incorrectly even though they have used the expected ability; and test takers answer the item incorrectly because they lack the expected ability. The result is presented in Table 4-2.

Table 4-2 The frequency of participants' response category in each item

Item No. (answer)       √ using EC    √ not using EC    × not using EC    × using EC    Total
1. (it)                 8             0                 0                 0             8
2. (which)              5             0                 3                 0             8
3. (located)            2             3                 3                 0             8
4. (were involved)      0             1                 5                 2             8
5. (The)                8             0                 0                 0             8
6. (As)                 2             1                 4                 1             8
7. (not)                6             0                 2                 0             8
8. (from)               2             0                 6                 0             8
9. (official)           7             0                 0                 1             8
10. (as)                0             0                 8                 0             8

Note: EC = expected construct; √ = answer the item correctly; × = answer the item incorrectly
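One way to read Table 4-2 is to flag the items on which at least one participant answered correctly without using the expected construct, since these are the cases that could threaten construct validity. The sketch below does this with the frequencies copied from Table 4-2; the names TABLE_4_2 and items_with_unexpected_success are assumptions introduced for illustration.

```python
# Frequencies copied from Table 4-2. Each row is
# (correct using EC, correct not using EC,
#  incorrect not using EC, incorrect using EC); EC = expected construct.
TABLE_4_2 = {
    1: (8, 0, 0, 0),
    2: (5, 0, 3, 0),
    3: (2, 3, 3, 0),
    4: (0, 1, 5, 2),
    5: (8, 0, 0, 0),
    6: (2, 1, 4, 1),
    7: (6, 0, 2, 0),
    8: (2, 0, 6, 0),
    9: (7, 0, 0, 1),
    10: (0, 0, 8, 0),
}


def items_with_unexpected_success(table):
    """Return the items with at least one correct answer given without the EC."""
    return [item for item in sorted(table) if table[item][1] > 0]


# Every row should account for all eight participants.
assert all(sum(row) == 8 for row in TABLE_4_2.values())

flagged = items_with_unexpected_success(TABLE_4_2)
unflagged = [item for item in sorted(TABLE_4_2) if item not in flagged]
print("correct without EC on items:", flagged)
print("no such case on items:", unflagged)
```

Checking only the second column reflects the concern that a correct answer may be reached by guessing or by other strategies rather than by the ability the item targets.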
From the table above, we can see that item 1 requires test takers to understand the meaning of the context and fill in a pronoun that substitutes for "Independence Day" in the previous sentence. The first blank thus aims to measure whether test takers can understand the meaning of the sentence and clarify the connection between sentences. The table shows that item 1 is answered correctly by all participants and that all of them use the expected construct measured by this item, which supports the validity of this item. All the participants have also succeeded in item 5, which examines the use of the definite article to refer to the war mentioned earlier. From the following retrospective interview data, we can see that all the test takers possess the abilities measured by these two items.

Retrospective interview data about item No. 1:
Participant 02: At the beginning, the text mentioned Independence Day, and the second sentence is based on the first sentence. The second sentence ... call sth. sth. So the answer is it, call it the "Fourth of July", namely call the Independence Day the "Fourth of July".

Retrospective interview data about item No. 5:
Participant 05: It has mentioned "in a war" before, so in this sentence, the war appears again, and the definite article "the" is used to refer to the war.

After analyzing the retrospective interview data of each participant, it can be concluded that all the participants answer item 1 and item 5 correctly and that all of them have used the underlying abilities assumed to be measured by these two items. In terms of these two items, therefore, the participants' actual answering process is completely consistent with the expected answering performance.

Another item that deserves attention is item 9, which measures the knowledge of word formation. All the participants have applied the expected construct measured by this item. However, in the retrospective interview, all the
participants expressed their awareness that the given word office should be changed into its adjective form, but one participant made a spelling mistake and therefore failed to answer the item correctly. This also shows that she had not fully grasped the knowledge, because she wrote the wrong answer "offical" instead of "official". In this sense, the item successfully measures the expected construct. This item also supports the claim that the grammar G/F task can elicit test takers' grammatical competence, while in MC tests students might get the correct answers by non-contributory test-taking strategies rather than grammatical competence (Hao 2009). If this item were examined in MC format, test takers could easily choose the right answer by ruling out the irrelevant options, and the correct option could give test takers a hint, so the item would fail to reflect test takers' ability.

Compared with item 1 and item 5, which are answered correctly by all participants, item 10 is so difficult that none of the participants have answered it correctly. The item is illustrated as follows:

The Declaration of Independence, signed by leaders from the colonies, stated this clearly, and for the first time in an ___9___ (office) document the colonies were referred to ___10___ the United States of America.
Answer: 9. official 10. as

The item is supposed to measure the collocation refer to sb/sth as sth as well as test takers' understanding of the context. However, in the retrospective interview, no participant mentioned the collocation. Even though some test takers expressed an accurate understanding of the sentence meaning, all of them failed to answer the item correctly because they were unfamiliar with this collocation. A majority of them fail to treat refer to ... as as a whole; instead they mistake to for an infinitive marker, so most of them fill the blank with the verb be. Although no participant has answered the item correctly, all of them fail because they do not possess the knowledge the item is expected to measure. In this sense, the item has also justified its validity in measuring the constructs. Their performance on this item also supports
the claim that the grammar G/F task focuses on measuring test takers' analytical ability. This result is also consistent with the results from the questionnaires. In terms of the results for question 1, high school English teachers also strongly agree with the statement that the grammar G/F task measures test takers' ability to analyze the sentence.

Next, consider item 3 and item 4. These two items are worthy of attention because the two blanks come from the same sentence, which is so complex that almost no participant succeeded in figuring out the components of the whole sentence. Since this sentence is quite long and representative, the whole sentence is given below:

At that time, the people of the 13 British colonies ___3___ (locate) along the eastern coast of what is now the United States ___4___ (involve) in a war over what they considered unjust treatment by the king and parliament in Britain.
Answer: 3. located 4. were involved

These two items mainly measure test takers' ability to analyze the structure of the sentence and decide the appropriate form of the verbs. The subject of this sentence is "the people of the 13 British colonies", while the predicate verb is "involve" instead of "locate"; therefore, the verb "locate" serves as a non-predicate verb modifying the noun before it. After the function of these two words in the sentence has been clarified, the correct grammatical form of involve has to be determined. Item 4 also examines the collocation be involved in; therefore, the correct grammatical form of involve is were involved, serving as the predicate. The retrospective interview data show that most test takers have no difficulty identifying the subject the people of the 13 British colonies, while some participants readily take locate as the predicate. They notice the adverbial of time at that time, so these two test takers wrote the answer "located", using the past tense of this verb. Although this form is the same as the past participle of locate, the reasoning behind their answer is not correct. Their performance is therefore not consistent with the expected test-taking performance. The
main reason for this phenomenon is that most test takers focus only on the blank and fail to form a picture of the complete sentence. It never occurs to them that, of the two words locate and involve, only one serves as the predicate of the sentence while the other is a non-predicate verb. Some test takers said in the retrospective interview that they tried to figure out the relationship between the main clause and the subordinate clause in this complex sentence, but the sentence was too complicated and difficult for them. In the end, they had to give up analyzing the structure of the sentence and filled the blank by guessing or by sense of grammar, with a lot of uncertainty. That is why, in the table, three test takers answer item 3 correctly but have not used the expected construct. Their performance on this item is therefore not consistent with the expected performance. Here are three instances from the participants' verbal data about these two items:

Participant 06: The people of the 13 British colonies is the subject, and after this subject there is no verb before what is now ..., so obviously the part from the people of the 13 British colonies ... to ... along the eastern coast ... is a short sentence, and locate is supposed to be filled in as the predicate of this short sentence. Since there is at that time, this happened in the past, so I filled the blank with located. For the next blank I filled in involving, because this is a clause led by what, and its predicate is is, so this blank should be filled with a non-predicate verb. Involve follows United States, and I think the verb should be in its active voice here, so the blank should be involving.

Participant 07: I just think locate should use its passive voice and involve should use its active voice. Because I see the people of the 13 British colonies is the subject, the relationship between locate and the subject is passive, so locate uses its passive voice. As for item 4, the clause has what is now, so the relationship between United States and involve is active, so I used its active voice involving.

Participant 02: These two items are so difficult, so I just had to do them by guessing. I think at that time reminds me to choose the past tense of locate, so I filled the blank with located. Then, as for the blank of involve, I hesitated between the answers
of were involved and involving. I tried to analyze the structure of the sentence, but I really couldn't do it. So I had no choice but to fill it in by sense of grammar.

From the above retrospective interview data, it can be seen that test takers find these two items too difficult. Participant 02 tries to analyze the whole sentence from a broader perspective, but she expressed her helplessness about the two items. Participant 06 and participant 07 focus only on a single clause and do not consider the structure of the whole sentence; this incomplete consideration results in their failure on these items. In brief, even if these two items are difficult for high school students, they work well in measuring test takers' ability to analyze the context. Participants' performance on these two items also reveals a deficiency in test takers' thinking: most test takers focus on a narrow area or the local level instead of forming a whole picture when they complete the blanks.

As for item 6, one participant answers the item correctly without using the expected constructs. In her retrospective data, she remarked that she did not know how to do it, but in light of the meaning of the context, she filled the blank with as. She also mentioned that she had thought of other answers, such as when or during, but she could not tell the difference between these words. She just felt that the other answers were not appropriate, without being able to articulate any reasons. So among eight participants, one participant's answering process does not reflect the abilities which are expected to be measured by the item, and this ratio is understandable. Overall, this item does not pose a threat to the construct validity of the grammar G/F test.

In terms of item 2, item 7 and item 8, no participant has answered the item correctly without using the expected constructs, so participants' performance on these items basically conforms to the expected test-taking performance.

The above is the individual analysis of each item in the grammar G/F task. Seven items (item 1, item 2, item 5, item 7, item 8, item 9 and item 10) play a perfect role in measuring the constructs they aim to measure: all the participants either answer the item correctly using the expected construct or answer the item
incorrectly because they lack the underlying ability. The other three items, item 3, item 4 and item 6, inevitably encounter a few cases in which participants answer the blank correctly without using the expected constructs. Overall, however, the grammar G/F task does a good job in measuring test takers' grammatical competence in context. This conclusion can be further consolidated by the data in the following table.

Table 4-3 The total frequency of participants' response category

Response                       Category        Frequency    Percentage
Answer the item correctly      Using EC        40           90.1%
                               Not using EC    4            9.9%
                               Total           44           100%
Answer the item incorrectly    Using EC        6            16.7%
                               Not using EC    30           83.3%
                               Total           36           100%

This table shows that the participants give right answers 44 times. Test takers answer the blank correctly using the expected construct 40 times, accounting for 90.1% of the right answers, and answer correctly without using the expected construct only 4 times, accounting for 9.9% of the right answers. The participants give wrong answers 36 times. Test takers answer the blank incorrectly because they lack the expected construct 30 times, accounting for 83.3% of the wrong answers, and answer incorrectly while using the expected construct 6 times, accounting for 16.7%. Generally speaking, in most cases, if test takers answer the items correctly, they possess the expected ability measured by the items; if test takers answer the items incorrectly, they do not possess the expected abilities.
4.3 Discussion

4.3.1 The First Question of This Study

The first question is about the high school teachers' perception of the constructs of the grammar G/F task. The results show that all the statements in the questionnaire apart from the distractors have received approval from the majority of the teachers. Statement 1, statement 2 and statement 8 have received the widest recognition from high school English teachers. That is, the statements that the grammar G/F task has measured word formation, correct grammatical word form, and the analysis of sentence structure and components have attracted the most attention from these teachers. Statement 6, statement 10 and statement 11, which are supposed to be important, have relatively low mean scores. A tentative reason might be that statement 6, which concerns test takers' understanding of the sentence meaning, is taken for granted by the high school teachers: in the eyes of these teachers, understanding the meaning of the sentence is a prerequisite for students to fill the blank and complete the task. They therefore normally attach more importance to the other constructs.

According to Purpura (2004), the grammar G/F task measures grammatical form and meaning at both the sentential and discourse levels. The results of the questionnaire show that high school teachers basically agree that the grammar G/F task has measured students' knowledge of word form and contextual meaning. However, the results also show that cohesion and coherence receive less attention from the high school English teachers. It is likely that some teachers focus on teaching students test techniques while failing to guide students towards a comprehensive understanding of the context. This may easily result in students' unsatisfactory performance in the task, because the answer in the blank should be not only grammatically right but also semantically meaningful. Cohesion and coherence should therefore be given more importance in daily English teaching and learning.

Generally speaking, the constructs of the grammar G/F task in NMET Guangdong have been approved by the high school English teachers. And the recognition of high
school teachers about the constructs of the grammar G/F task in NMET Guangdong can also promote the beneficial washback of this task.

4.3.2 The Second Question of This Study

The second question concerns whether test takers' actual answering performance can reflect the abilities that test designers intend to measure. Based on the retrospective interviews with the 8 participants and the analysis of their answering process for each item, Table 4-2 shows that, broadly speaking, the grammar G/F task can achieve its original purpose of measuring grammatical ability. As illustrated in the first and third columns of the table, for item 1, item 2, item 5, item 7, item 8, item 9 and item 10, the percentage of answering the item correctly using the EC together with answering the item incorrectly without using the EC is 100%. These seven items thus reveal that all participants' actual performance is consistent with the expected test-taking performance.

In terms of item 3 and item 4, because the two blanks come from the same long and complex sentence, and the past participle of locate and its past tense are the same, two participants coincidentally answer the item correctly without using the expected construct. One participant answers item 4 just by sense of grammar and fails to explain the reasons. So there are some special cases in which one or two test takers' answering process cannot reflect their abilities. Since the total frequency is 8, this phenomenon is reasonable. This unanticipated performance also gives us some hints for designing a grammar task: we should make sure that all the correct answers of the test takers reflect that they have grasped the ability.

Basically speaking, the gap-filling test can measure what it is supposed to measure, and the test takers' answering process can reflect the abilities that the test designer intends to measure.
4.4 Summary

This chapter attempts to answer the two research questions in this study. Section 4.2.1 provides the results for question 1. The questionnaire data are analyzed by calculating the mean score of each statement. Figure 4-1 shows that, except for the distractors, the mean scores of all the other statements concerning the construct of the grammar G/F task in NMET Guangdong are approximately 4 points or above, which shows that the majority of the high school English teachers think the grammar G/F task has measured the construct that the test designer intends to measure. The consensus among these high school teachers can also support the beneficial washback effect of this task in NMET Guangdong.

Section 4.2.2 provides the calculation of the frequency of test takers' response categories. The results demonstrate that 7 of the 10 items show 100% consistency between the test takers' actual performance and their expected performance. That means that, when test takers do these items, they either answer correctly by using the expected abilities or constructs in their answering process, or fail because they lack the expected construct or abilities which the blank aims to measure. As seen in Table 4-3, answering the item correctly using the expected constructs accounts for 90.1% of the right answers, while answering the item incorrectly due to not using the expected construct accounts for 83.3% of the wrong answers. These figures show that test takers' performance is in line with the expected performance and that their answering performance can reflect the abilities that the test designer intends to measure.
CHAPTER FIVE CONCLUSION

This chapter consists of three sections. The first section reports the major findings of this study. The second section puts forward some implications for grammar teaching and learning. The last section points out the limitations of this study.

5.1 Major Findings of This Study

The present study investigates two questions. The first is to find out what abilities, in the eyes of high school teachers, are measured by the grammar G/F task in NMET Guangdong. This question is answered by collecting questionnaire data consisting of statements concerning the construct of the grammar G/F task. The mean score of each statement shows whether high school English teachers agree with the statement or not. The second research question is to find out whether test takers' performance in the grammar G/F task can reflect the abilities the test designer intends to measure. Through calculation of the mean scores of the statements in the questionnaire and analysis of the retrospective interview data, the findings are as follows:

Firstly, the mean scores of the statements in the questionnaire, apart from the distractors, reach about 4 points, which means the statements are agreed with by most high school English teachers. Therefore, the constructs measured by the grammar G/F task have also been recognized by high school English teachers; namely, the grammar G/F task measures abilities including grammatical form and meaning at the sentential and discourse levels.

Secondly, the statements that the grammar G/F task can measure test takers' knowledge of word formation, correct grammatical form and the ability to analyze sentence structure have gained the widest recognition from the high school English teachers. This shows that grammar G/F tasks can measure one or more areas of grammatical knowledge. This result is consistent with the viewpoint of Purpura (2004). However, the mean scores of the statements concerning textual cohesion and
coherence are comparatively lower, which means these two concepts need to be given more importance in future language teaching.

In addition, matching the 8 participants' actual answering performance with the expected performance shows that, all in all, test takers' answering process is in line with the expected performance. In other words, test takers answer an item correctly because they use the expected construct, while they fail to answer an item correctly because they lack the expected construct. Seven items in the grammar G/F task show 100% consistency between the test takers' actual answering process and the expected performance. That is, the grammar G/F task can do a good job in measuring what it intends to measure and in avoiding irrelevant factors, especially test-wiseness.

The detailed analysis of the participants' answering process also shows that the grammar gap-filling task does not simply measure grammatical ability; instead, the task emphasizes students' analytical ability and the ability to apply grammatical knowledge in context. This finding is consistent with the validation report of NMET Guangdong in 2008 (Zeng 2009). The grammar G/F task is designed to solve a current problem with students, i.e. students can recognize some grammatical points, but their sense of grammar is deficient. The grammar G/F task can therefore measure students' grammatical ability at the discourse level and activate their grammatical knowledge.

5.2 Implications of This Study

The results of this study have some implications for English grammar testing, teaching and learning. Firstly, the study shows that the grammar G/F task can measure students' grammatical ability not only at the word level but also at the sentential and discourse levels. Teachers should therefore guide students to pay more attention to understanding the whole context, and students should learn to comprehend grammatical knowledge at a higher-order level.

Secondly, the grammar G/F task emphasizes the ability to analyze the sentence, so students' analytical ability needs to be improved. Because the grammar G/F task does not
simply measure grammatical points but instead focuses on measuring students' application of grammatical knowledge, students need to learn to analyze the sentence.

Thirdly, textual cohesion and coherence should receive more attention from high school English teachers. These two concepts pose more challenges for students, since students can sometimes think of many different words to fill in a blank but neglect that some of these words fail to make the text a coherent whole. Students' overall language ability should therefore be improved.

In a word, the grammar G/F task can successfully measure grammatical abilities, including grammatical form and meaning, at a higher level, so students' grammatical ability can be measured in a more comprehensive way. Improving students' awareness of context is important for grammar learning.

5.3 Limitations of This Study

Although this study combines quantitative and qualitative methods to investigate the validity of the grammar G/F task in NMET Guangdong, there is still room for improvement in further study.

On the one hand, the study adopts a process-oriented method in order to investigate test takers' cognitive process when answering and to obtain detailed information. However, because the subjects are only high school students, some of them may find it difficult to explain the reasons for their answers. This may be because they do not use the expected constructs, or because their ability to talk about grammar is limited, since we cannot deny the existence of situations in which students do the task correctly but cannot express their ideas clearly.

On the other hand, the results for question 2 are based on the frequency with which test takers' performance is or is not consistent with the expected performance; a larger number of subjects might reflect the issue better.
REFERENCES

Alderson, J. C. (1979) The Effect on Cloze Tests in Deletion Frequency. Journal of Research in Reading 2(2), 108-119.
Alderson, J. C. (1979) The cloze procedure and proficiency in English as a foreign language. TESOL Quarterly 13, 219-27.
Alderson, J. C. (1980) Native and nonnative speaker performance on cloze tests. Language Learning 30, 59-76.
Alderson, J. C., Clapham, C., & Wall, D. (1995) Language Test Construction and Evaluation. Beijing: Foreign Language Teaching and Research Press.
Anastasi, A. (1997) Psychological Testing. Englewood Cliffs, NJ: Prentice Hall.
Angoff, W. H. (1988) Validity: An evolving concept. In H. Wainer & H. Braun (Ed.), Test Validity. Hillsdale, NJ: Lawrence Erlbaum.
Bachman, L. F. (1985) Performance on cloze tests with fixed-ratio and rational deletions. TESOL Quarterly 19, 535-56.
Bachman, L. F. (1990) Fundamental Considerations in Language Testing. Oxford: Oxford University Press.
Bachman, L. F., & Cohen, A. D. (2002) Language testing-SLA interfaces: An update. In L. F. Bachman & A. D. Cohen (Ed.), Interfaces between second language acquisition and language testing research (pp. 1-31). Beijing: Foreign Language Teaching and Research Press.
Bachman, L. F., & Palmer, A. S. (1996) Language Testing in Practice. Oxford: Oxford University Press.
Canale, M., & Swain, M. (1980) Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics 1, 1-47.
Chapelle, C. A. (1998) Construct definition and validity inquiry in SLA research. In L. F. Bachman & A. D. Cohen (Ed.), Interfaces Between Second Language Acquisition and Language Testing Research (pp. 32-70). Cambridge: Cambridge University Press.
Chapelle, C. A., & Abraham, R. G. (1990) Cloze method: what difference does it make? Language Testing 7, 121-146.
Chavez-Oller, M. A., Chihara, T., Weaver, K. A., & Oller, J. W. Jr. (1994) When are cloze items sensitive to constraints across sentences? In Oller, J. W. Jr., & Jonz,
J. (Ed.), Cloze and coherence (pp. 229-245). London: Associated University Press.
Chen Xiaokou & Li Shaoshan (2006) TEM-4 wanxing tiankong ceshi jiegou xiaodu yanjiu dati guocheng fenxifa (A study of the construct validity of TEM-4 cloze test: A process approach). Xiandai Waiyu (Modern Foreign Languages) 1, 71-77.
Chihara, T., Oller, J., Weaver, K., & Chavez-Oller, M. A. (1977) Are cloze tests sensitive to constraints across sentences? Language Learning 27, 63-73.
Cohen, A. D. (1984) On taking language tests: what the students report. Language Testing 1(1), 70-81.
Cronbach, L. J. (1949) Essentials of Psychological Testing. New York: Harper.
Cronbach, L. J. (1971) Validity. In R. L. Thorndike (Ed.), Educational Measurement. Washington, DC: American Council on Education.
Cronbach, L. J., & Meehl, P. E. (1955) Construct validity in psychological tests. Psychological Bulletin 52, 281-302.
Davies, A., Brown, A., Elder, C., Hill, K., Lumley, T., & McNamara, T. (2002) Dictionary of Language Testing. Beijing: Foreign Language Teaching and Research Press.
Ebel, R. L., & Frisbie, D. A. (1991) Essentials of Educational Measurement (Rev. edn.). Englewood Cliffs, NJ: Prentice-Hall.
Ericsson, K. A., & Simon, H. A. (1984) Protocol analysis: Verbal reports as data. Cambridge: MIT Press.
Faerch, C., & Kasper, G. (1987) Introspection in second language research (Rev. edn.). Clevedon: Multilingual Matters Ltd.
Fotos, S. S. (1991) The cloze test as an integrative measure of EFL proficiency: a substitute for essays on college entrance examinations? Language Learning 41, 313-36.
Grotjahn, R. (1986) Test validation and cognitive psychology: some methodological considerations. Language Testing 3, 159-85.
Guion, R. M. (1980) On Trinitarian conceptions of validity. Professional Psychology 11, 385-398.
Hao Jin (2009) Test methods effects on students' performance: A comparative study of grammar tests. Unpublished MA thesis. Guangzhou: Guangdong University of Foreign Studies.
Heaton, J. B. (2000) Writing English Language Tests. Beijing: Foreign Language Teaching and Research Press.
Henning, G. (2001) A Guide to Language Testing: Development, Evaluation and
Research. Beijing: Foreign Language Teaching and Research Press.
Hughes, A. (1989) Testing for Language Teachers. Cambridge: Cambridge University Press.
Hughes, A. (2000) Testing for Language Teachers. Beijing: Foreign Language Teaching and Research Press.
Hunter, J. E., & Schmidt, F. L. (1990) Methods of meta-analysis: Correcting error and bias in research findings. Newbury Park, CA.
Ingram, E. (1977) Basic concepts in testing (Rev. edn.). Oxford: Oxford University Press.
Jin Yan & Wu Jiang (1998) Yi neixingfa jianyan CET yuedu lijie ceshi de xiaodu (Introspection method to examine the construct validity of reading comprehension in CET). Waiyu Jie (Foreign Language World) 2, 47-52.
Jonz, J. (1990) Another turn in the conversation: what does cloze measure? TESOL Quarterly 24, 61-83.
Kibby, M. W. (1980) Intersentential processes in reading comprehension. Journal of Reading Behavior 12, 299-312.
Kintsch, W., & van Dijk, T. A. (1978) Toward a model of text comprehension and production. Psychological Review 85, 363-94.
Klein-Braley, C. (1984) A cloze-up on the C-test: a study in the construction validation of authentic tests. Language Testing 2(1), 76-104.
Lado, R. (1961) Language Testing: the Construction and Use of Foreign Language Tests. London: Longman.
Lado, R. (1961) Language Testing. New York: McGraw-Hill.
Larsen-Freeman, D. (1997) Chaos/complexity science and second language acquisition. Applied Linguistics 48, 141-65.
Li Qinghua (2006) Yuyan ceshi zhi xiaodu lilun fazhan wushi nian (Evolution of validity theory in the past 50 years). Xiandai Waiyu (Modern Foreign Languages) 1, 87-95.
Lindquist, E. F. (1942) A First Course in Statistics. Boston: Houghton Mifflin.
Markman, P. L. (1985) Rational deletion cloze and global comprehension in German. Language Learning 35, 423-30.
McDonough, J., & McDonough, S. (2000) Research Methods for English Language Teachers. Foreign Language Teaching and Research Press.
McKenna, M. C., & Layton, K. (1990) Concurrent validity of cloze as a measure of intersentential comprehension. Journal of Educational Psychology 82, 372-77.
Messick, S. (1980) Test validity and the ethics of assessment. American Psychologist 35, 12-27.
Messick, S. (1987) Validity (Rev. edn.). New York: American Council on Education/Macmillan.
Messick, S. (1990) Fundamental Considerations in Language Testing. Oxford University Press.
Messick, S. (1996) Validity and Washback in Language Testing. Research report: RR-96-17. Princeton: ETS.
Nisbett, R. E., & Wilson, T. D. (1977) Telling more than we can know: Verbal reports on mental processes. Psychological Review 84(3), 231-259.
Nunan, D. (1992) Research methods in language learning. Cambridge: CUP.
Oller, J. W., Jr. (1979) Language Tests at School. London: Longman.
Purpura, J. E. (2004) Assessing Grammar. Cambridge: Cambridge University Press.
Richards, J. C., & Schmidt, R. W. (1998) Longman Dictionary of Language Teaching & Applied Linguistics. Beijing: Foreign Language Teaching and Research Press.
Sasaki, M. (2000) Effects of cultural schemata on students’ test-taking processes for cloze tests: a multiple data source approach. Language Testing 17, 85-114.
Schmidt, B. (1994) The effect of topic familiarity on second language listening comprehension. The Modern Language Journal 78, 179-189.
Shanahan, T., Kamil, M. L., & Tobin, A. W. (1982) Cloze as a measure of intersentential comprehension. Reading Research Quarterly 17, 229-55.
Someren, M. W. V., Barnard, Y. F., & Sandberg, J. A. C. (1994) The Think Aloud Method: a practical guide to modeling cognitive processes. London: Academic Press Limited.
Storey, P. (1997) Examining the test-taking process: a cognitive perspective on the discourse cloze test. Language Testing 14, 214-231.
Thorndike, R. M. (1997) Measurement and Evaluation in Psychology and Education (Rev. edn.). Upper Saddle River, NJ: Merrill.
Xu Liang & Zou Shen (2013) Gaokao yingyu Shanghai juan yufa pingjia moshi de duibi fenxi yu sikao (Study on assessment of grammar in NMET Shanghai paper). Waiyu Dianhua Jiaoxue (Media in Foreign Language Instruction) 154, 11-16.
Yamashita, J. (2003) Process of taking a gap-filling test: comparison of skilled and less skilled EFL readers. Language Testing 3, 267-293.
Weir, C. J. (2005) Language Testing and Validation. New York: Palgrave Macmillan.
Zeng Yongqiang (2009) Basis of English language testing. South China University of
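Technology Press.
Zhang, Chunqing (2015) Gaokao yingyu yufa tiankong gounian xiaodu de Rasch moxing fenxi (A Rasch-based construct validation of GMET Grammar Gap-filling Task). Xiandai Waiyu (Modern Foreign Languages) 38(2), 258-268.

APPENDICES

Appendix A Questionnaire for high school English teachers

Questionnaire on the NMET Grammar Gap-filling Task

Dear teacher,
This questionnaire collects teachers’ views on which student abilities the NMET grammar gap-filling task is able to assess. Please take a few minutes to answer the following questions carefully; your answers are very important to this study. Thank you very much for your participation and cooperation!

1. Your gender: A. Male B. Female
2. Your age: A. 21-30 B. 31-40 C. 41-50 D. 51-60
3. Your education: A. Junior college diploma B. Bachelor’s degree C. Master’s degree D. Other
4. Your years of teaching: A. 1-3 years B. 4-9 years C. More than 9 years D. More than 18 years
5. The school where you teach is ____. A. Provincial key school B. Municipal key school C. District (county) key school D. Ordinary secondary school E. Foreign language school
6. Which knowledge and abilities do you think the NMET grammar gap-filling task mainly assesses? Please indicate your honest opinion by ticking (√) one of 1, 2, 3, 4, 5 for each statement.

Scale: 1 = Strongly disagree; 2 = Somewhat disagree; 3 = Not sure; 4 = Somewhat agree; 5 = Strongly agree

Statement (tick one of 1 2 3 4 5)
1. The task assesses word formation (e.g. conversion of word class, changes of word form). 1 2 3 4 5
2. The task assesses the correct grammatical form of words (including singular and plural nouns, third-person singular verbs, verb tenses, comparative adjectives, non-finite verbs, etc.). 1 2 3 4 5
3. The task assesses the ability to guess word meaning from context. 1 2 3 4 5
4. The task assesses fixed collocations in phrases. 1 2 3 4 5
5. The task assesses the use of fixed sentence patterns. 1 2 3 4 5
6. The task assesses comprehension of sentence meaning. 1 2 3 4 5
7. The task assesses the ability to infer implied meaning. 1 2 3 4 5
8. The task assesses the analysis of sentence structure and constituents. 1 2 3 4 5
9. The task assesses the ability to distinguish important facts from details. 1 2 3 4 5
10. The task assesses the analysis of contextual coherence (i.e. understanding the meaning relations between adjacent clauses). 1 2 3 4 5
11. The task assesses contextual cohesion (i.e. interpreting one part of the text depends on other content in the text; cohesive devices include grammatical devices such as reference, substitution and ellipsis, lexical devices such as reiteration, and logical connectives). 1 2 3 4 5
12. The task assesses background knowledge of British and American culture. 1 2 3 4 5

Thank you again for your cooperation and participation! (^_^)
Please leave the completed questionnaire on your seat or hand it to the collection staff at the exit when you leave.

Appendix B The Interview Guideline

Instructions before the interview:
Next, I will ask you about how you answered the items in this passage: what you were thinking while doing the task and how you arrived at each answer. Just tell me about your thinking process in as much detail as possible. There are no right or wrong answers, and this will not affect you in any way. For some blanks I may ask follow-up questions, but that does not mean you answered incorrectly. All right, let us go through the blanks one by one.

Basic interview prompts:
For blank *, how did you think it through? / How did you arrive at this answer? / Why did you fill in this word?

If the test taker has difficulty expressing the thinking process, or cannot articulate it, follow up with: What information led you to fill in this word?
If the test taker’s account relates to sentence meaning, follow up to confirm: Did you rely on understanding the meaning?
If the test taker’s account relates to grammar, follow up to confirm: Did you analyse the structure of the sentence?
If the test taker says a blank is uncertain, ask why it could not be answered: Do you understand the meaning of this sentence? Did you try to analyse the sentence structure?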
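Appendix C Grammar Gap-filling test

Grammar gap-filling (10 items, 1.5 points each, 15 points in total)

Read the passage below. In accordance with the grammaticality of the sentence structure and the coherence of the context, fill in each blank with one appropriate word or with the correct form of the word given in brackets, and write your answers in the corresponding positions numbered 1-10 on the answer sheet.

Independence Day is regarded as the birthday of the United States as a free and independent nation. Most Americans simply call __1__ the “Fourth of July,” on __2__ date it always falls. The holiday recalls the signing of the Declaration of Independence on July 4, 1776. At that time, the people of the 13 British colonies __3__ (locate) along the eastern coast of what is now the United States __4__ (involve) in a war over what they considered unjust treatment by the king and parliament in Britain. __5__ war began in 1775. __6__ the war continued, the colonists realized that they were fighting __7__ just for better treatment; they were fighting for freedom __8__ England’s rule. The Declaration of Independence, signed by leaders from the colonies, stated this clearly, and for the first time in an __9__ (office) document the colonies were referred to __10__ the United States of America.

Keys: 1. it 2. which 3. located 4. were involved 5. The 6. As 7. not 8. from 9. official 10. as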