Cambridge Dictionary


Collocations with thesis

These are words often used in combination with thesis.


thesis collocation verbs



The Academic Collocation List (ACL) Common academic collocations

This page describes the Academic Collocation List (ACL), explaining what it is and giving a full list of collocations in the ACL, sorted by headword. There is also, in another section, an ACL highlighter which can be used to highlight ACL words in a text, as well as an ACL mind map creator.

What is the ACL?


The Academic Collocation List (ACL) is a list containing 2,469 of the most frequent and useful collocations which occur in written academic English. It can be seen as a collocational companion to the Academic Word List (AWL), consisting of collocations (or word combinations) rather than single words.

The ACL was developed by Kirsten Ackermann and Yu-Hua Chen using the Pearson International Corpus of Academic English (PICAE), with advice from English teaching experts to ensure the collocations chosen would be useful to students of English. The ACL gives around 1.4% coverage of words in academic English (based on the source corpus used in the study). In contrast, the same collocations give only 0.1% coverage for a general corpus, showing they are indeed much more common in academic than general English.
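The coverage figures above can be pictured with a rough sketch. The source does not describe how coverage was computed, so the following is only one plausible reading: it assumes a collocation counts when its two words appear next to each other in exactly the listed form, and the function name `collocation_coverage` and the sample sentence are made up for illustration.

```python
def collocation_coverage(tokens, collocations):
    """Share of tokens that fall inside an adjacent two-word collocation.

    Assumption: only exact, adjacent matches count. The ACL study may have
    used a different (e.g. lemma-based) procedure.
    """
    pairs = {tuple(c.lower().split()) for c in collocations}
    lowered = [t.lower() for t in tokens]
    covered = set()  # indices of tokens that belong to a matched collocation
    for i in range(len(lowered) - 1):
        if (lowered[i], lowered[i + 1]) in pairs:
            covered.update((i, i + 1))
    return len(covered) / len(tokens) if tokens else 0.0


# Invented sentence using two collocations mentioned on this page.
tokens = "Researchers generally agree that an alternative approach is needed".split()
coverage = collocation_coverage(tokens, ["generally agree", "alternative approach"])
```

Running the same function over an academic text and a general text would give the kind of contrast described above, under the stated matching assumption.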

Many of the words in the ACL are also contained in the AWL, e.g. alternative approach appears in the ACL, and both of these words appear in the AWL. However, there are many word combinations which are very common in academic writing which contain one word or no words from the AWL, such as generally agree (both of these words actually appear in the GSL). Studying collocations is an important way to build up your academic vocabulary, and the Academic Collocation List is one possible tool to help you do this.


The Academic Collocation List

The 2,469 collocations in the ACL are listed below. The list has been adapted for this website by collecting collocations under headwords, in the same way that words in the AWL are categorised. In addition, the collocations have been listed under both of the headwords they contain in order to make them easier to find. This means, for example, that the collocation great accuracy appears both under the headword great and the headword accurate. The collocation accurate description likewise appears under accurate, as well as under the headword describe. This means each collocation appears twice in the list, once for each headword. Where words occur in the AWL, the AWL headword has been used, e.g. the AWL headword for academic is academy.
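The double listing described above can be sketched as a small index. This is a minimal illustration, not the website's actual implementation: the `HEADWORDS` mapping and the function name `index_by_headword` are assumptions, and the entries are limited to the examples given in the text.

```python
from collections import defaultdict

# Hypothetical word -> headword mapping, limited to the examples in the text.
HEADWORDS = {
    "accuracy": "accurate",
    "accurate": "accurate",
    "description": "describe",
    "great": "great",
}


def index_by_headword(collocations):
    """List each two-word collocation under the headword of both of its words."""
    index = defaultdict(list)
    for collocation in collocations:
        for word in collocation.split():
            # Fall back to the word itself when no headword is known.
            headword = HEADWORDS.get(word, word)
            if collocation not in index[headword]:
                index[headword].append(collocation)
    return dict(index)


acl_index = index_by_headword(["great accuracy", "accurate description"])
```

In this sketch, both collocations end up under the headword accurate, while great and describe each hold one.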

There are three versions of this list on the website:

  • ordered by headword (this page)
  • listed according to collocation type (adj + n etc.)
  • listed by frequency


Author: Sheldon Smith    ‖    Last modified: 28 November 2022.

Sheldon Smith is the founder and editor of EAPFoundation.com. He has been teaching English for Academic Purposes since 2004.


Learn English at IELTSsongngu.com

Collocations of the word thesis

1 part of a university degree

ADJ.

doctoral, MA, Master's, MSc, PhD, etc. | chemistry, geology, etc. | research

VERB + THESIS

work on | complete, do, write (up) | submit | publish

THESIS + NOUN

research, title, topic

PREP.

in a/the ~ | ~ about/on

2 statement of an idea/a theory

ADJ.

basic, central, main

VERB + THESIS

prove, support | disprove | advance | challenge, refute


A completely new type of dictionary of word collocations that helps students and advanced learners study, write, and speak natural-sounding English effectively. This online dictionary is especially helpful when preparing for the IELTS and TOEFL tests.

  • Collocations (common word combinations such as 'bright idea' or 'talk freely') are the essential building blocks of natural-sounding English. The dictionary contains over 150,000 collocations for nearly 9,000 headwords.
  • The dictionary shows words commonly used in combination with each headword: nouns, verbs, adjectives, adverbs, prepositions, and common phrases.
  • The collocation dictionary is based on the 100-million-word British National Corpus.
  • Over 50,000 examples show how the collocations are used in context, with grammar and register information where helpful.
  • The clear page layout groups collocations according to part of speech and meaning, and helps users quickly pinpoint the headword, sense, and collocation they need.


Collocation

What is a collocation?


Examples of Collocation


Have, Take, and Make

  • have a baby, have breakfast, have fun, have a headache, have an illness, have a good time
  • take advice, take a bath, take medicine, take a picture, take a shower, take your time
  • make breakfast, make a cake, make a mistake, make some tea, make a wish

If you want to sound like a native speaker, you must recognize and learn the collocations.


This page was written by Craig Shrives .



Definition of thesis noun from the Oxford Advanced Learner's Dictionary

  • Students must submit a thesis on an agreed subject within four years.
  • He presented this thesis for his PhD.
  • a thesis for a master's degree
  • He's doing a doctoral thesis on the early works of Shostakovich.
  • Many departments require their students to do a thesis defense.
  • She completed an MSc by thesis.
  • her thesis adviser at MIT
  • in a/​the thesis
  • thesis about


  • 1 School of Foreign Languages, Shanghai Jiao Tong University, Shanghai, China
  • 2 Institute of Corpus Studies and Applications, Shanghai International Studies University, Shanghai, China
  • 3 College of Education, King Saud University, Riyadh, Saudi Arabia

The investigation of learners’ interlanguage could greatly contribute to the teaching of English as a foreign language and the development of teaching materials. The present study investigates the collocational profiles of large-scale written production by English learners with varied L1 backgrounds and different proficiency levels. Using the British National Corpus as the reference corpus, candidate word combinations in the learner data were extracted with corpus query language, and collocations were then identified by t-score using the Python programming language. The resulting collocation list consists of 2,501 make/take + noun (direct object) collocations. Findings show that proficient learners tend to use collocations containing more semantically complicated and abstract noun elements for varied communication tasks. Moreover, advanced learners are inclined to use collocations comprised of more difficult and longer noun elements.

Introduction

Collocational competence has been widely recognized as a prerequisite for native-like mastery of a target language and has attracted substantial attention in the field of language acquisition (Gablasova et al., 2017; Altun, 2021; Cao and Badger, 2021). Along with other types of prefabricated language, collocations help language users economize on cognitive processing effort and reduce dysfluency and hesitation (Fillmore, 1979; Hunston and Francis, 2000). Collocations, or “arbitrarily restricted lexeme combinations” (Nesselhauf, 2005), such as commit crime or make a joke, have been found to take up a large proportion of native speakers’ language production (Cowie, 1992). Based on the investigation of an academic corpus, Howarth (2013) revealed that 41% of verb + noun pairings consist of collocations or idioms. Prefabricated language, including collocation, accounts for approximately half of spoken and written texts (Erman and Warren, 2000). Unfortunately, multiple studies have shown that language learners, even those with higher proficiency, have difficulties approximating native speakers’ collocation use (Fan, 2009; Granger and Paquot, 2009; Li and Schmitt, 2010; Yamashita and Jiang, 2010; Laufer and Waldman, 2011). So far, few studies have investigated learners’ collocation use on the basis of large-scale longitudinal learner data covering multiple proficiency levels. The present study sets out to address this gap and help clarify the developmental patterns of learners’ productive collocational competence.

Previous Studies

The past decades have witnessed a surge of research on learners’ collocational competence. Much of the current literature pays particular attention to productive collocation use in learner corpora. Such studies differ markedly in the way they define and identify collocation. Taking a statistical perspective on collocation, some have utilized the frequency-based approach, which is frequently adopted in computational linguistics (Gyllstad, 2007; Nguyen and Webb, 2016; Liu and Afzaal, 2020, 2021). Studies of this kind are highly quantitative (Lee and Shin, 2021) and based on the notion that collocation pertains to the “probability of occurrence of their constituent words” (Henriksen, 2013, p. 31). In contrast to the frequency-based approach, some have adopted the phraseological approach and define collocation by delimiting it from other significant types of combinations, namely, free combinations and idioms, in terms of their degree of transparency and commutability (Nesselhauf, 2005). Aisenstadt (1979) viewed collocation as “combinations of two or more words used in one of their regular, non-idiomatic meanings, following certain structural patterns, and restricted in their commutability not only by grammatical and semantic valency”. Cowie et al. (1988, p. 71) defined collocation by distinguishing it from the other types of multi-word units. Fewer studies have combined the two approaches to avoid both the inconsistency of human judgment in the phraseological approach and the risk, in the frequency-based approach, of retrieving n-grams that are devoid of meaning, such as “and the” and “by the” (Szudarski and Carter, 2016; Nizonkiza, 2017).

The existing literature also varies greatly in methodology and research design. Most such studies focused on learners of English from the same first language (L1) background, for instance, Chinese learners (Li and Schmitt, 2010) and Japanese learners (Yamashita and Jiang, 2010; Saito and Liu, 2021). Moreover, various kinds of measures have been used for the identification of collocations. For example, studies adopting the phraseological approach may rely on native speakers’ judgment (Nesselhauf, 2005), while those taking the frequency-based approach tend to make use of indices such as the z-score, t-score, and MI score (Durrant and Schmitt, 2009; Granger and Bestgen, 2014; Gablasova et al., 2017). However, it should be noted that the manual filtering of native-like collocations in learner data is tremendously time-consuming under either approach. Researchers taking the phraseological approach must apply the relevant criteria, such as transparency and commutability, to identify word combinations one by one. The frequency-based approach, in turn, requires measuring the association strength of each word combination based on its frequency information in a large-scale reference corpus. Automatic and reliable identification of the nativelikeness of given word combinations in learner data by means of a programming language is therefore needed. What is more, only a handful of studies have assessed the amount and quality of collocation use, and these concern intermediate or advanced language learners, with beginners being given insufficient attention (Siyanova-Chanturia, 2015). The number of studies that have investigated a large amount of L2 production by learners at different proficiency levels has been fairly modest until now.

So far, much attention has been accorded to clarifying the deviance in learners’ collocation use by comparing it against that of native speakers (Siyanova and Schmitt, 2008; González and Ramos, 2013). Among the various collocation types investigated in previous studies, verb + noun collocations were found to be particularly challenging for language learners (Bahns, 1993; Wang and Shaw, 2008; Tsai, 2015). Laufer and Waldman (2011) found that learners at all proficiency levels produced a significantly smaller number of verb + noun collocations than native speakers and that errors appear to persist in fairly advanced learners’ language production. Altenberg and Granger (2001) analyzed EFL learners’ use of collocations comprised of high-frequency verbs and concluded that this collocation type is surprisingly error-prone and poses great problems for both beginners and proficient learners. Different reasons why the uptake of verb + noun collocations is hampered have been proposed. Boers et al. (2014) suggested that high-frequency verb elements in collocations can be problematic as they “contribute relatively little to the semantics of” the collocation as a whole and barely grab learners’ attention. In addition, they summarized that learners may experience particular problems when dealing with semantically related words (e.g., make and do in make a mess and do damage) and formally related words (e.g., make and take in make a drawing and take a photo). Barcroft (2006) stated that unfamiliar word elements in collocations would hinder learners from mastering the form of collocations by exhausting cognitive processing resources.

Although many extant studies have focused on the number of collocations accurately used (Siyanova-Chanturia, 2015) or the errors made by learners (Phoocharoensil, 2014; Kim, 2018), studies aiming to identify the properties of learners’ collocation use at different proficiencies are relatively scarce. The following studies may have implications for such a research aim. Peters (2016) investigated the learning burden of different types of collocation in connection with their congruency (presence or absence of a literal L1 translation equivalent), collocate-node relationships, and length of constituent words. According to her research, incongruent collocations and verb + noun collocations tend to cause more difficulty in acquisition. Moreover, collocations composed of longer words are more challenging to master in a form recall test. Another relevant work was undertaken by Uchida (2015), which highlighted the influence exerted by L2 input. While the factors that could account for learners’ developmental patterns do not yet seem to be conclusive, target language input is considered crucial for language acquisition and can explain certain acquisition sequences (Rankin and Unsworth, 2016). Reports have shown that second language acquisition is “heavily input-oriented” (Dietrich et al., 1995, p. 271) and input driven (Goldschneider and DeKeyser, 2001). Textbooks in an EFL environment are one of the major sources of target language input and could greatly influence learners’ acquisition. Uchida analyzed the delexical verb + noun collocations taught in EFL textbooks and proposed that the features of noun elements within collocations (viz. semantic fields, concreteness or abstractness, and difficulty levels) may help characterize learners’ collocation use at different proficiencies. Uchida called for further analysis to verify this assumption and explore more properties to profile learners’ collocation use.

Building on the findings of Uchida’s research, we assume that learners at higher proficiency levels may start to be exposed to collocations made up of more difficult nouns belonging to varied and abstract semantic fields, and we hypothesize that they tend to use such collocations as proficiency increases. Moreover, referring to Peters’ research, we speculate that collocations containing longer noun elements are better mastered by advanced learners because of their relatively heavier learning burden. Therefore, based on the literature review above, this paper seeks to clarify the characteristics of learners’ productive collocational competence at different proficiency levels. Learners’ collocation use is compared in the following aspects: the difficulty level, semantic fields, and length of the constituent noun elements in the collocation.

This study’s view of collocation is that it can be defined as the co-occurrence of lexical items that “appear with greater than random probability” within a specific span ( Hoey, 1991 , p. 7). Lexical items here refer to lexemes. Hence, for instance, make a decision and make decisions are considered as instances of one collocation. This study took a broad view of collocation which does not distinguish between collocations and idioms. Moreover, this study utilized a criterion from the phraseological approach and paid attention to the syntactic relationship between the constituent words in collocations as well.

This analysis focuses on make/take + direct object collocations for the following reasons. Firstly, verb + noun collocations carry essential information, which is indispensable in communication and frequently used by language users (Gyllstad, 2007). Secondly, compared with collocations made up of more complicated verbs, collocations consisting of common verbs are more likely to be used by beginner learners, thus allowing researchers to observe the developmental patterns of collocation use from lower levels to advanced ones. Thirdly, make and take are among the most frequently used verbs in the learner corpus.

Based on the discussion above, the present research aims to investigate the following research questions:

How does CEFR proficiency level impact three collocational properties? More specifically, what are the semantic features of the direct object in make/take + noun collocations across CEFR levels? Do advanced learners use collocations consisting of more difficult and longer noun elements?

Materials and Methods

Learner Data

The second release of the large-scale learner corpus, the EF-Cambridge Open Language Database (Geertzen et al., 2013; henceforth EFCAMDAT), was used in this study. The EFCAMDAT comprises 1,180,310 compositions submitted by 174,743 language learners as assignments to Englishtown, an online English language school (Huang et al., 2018). Learners represent about 200 nationalities, with Brazilians, Chinese, Mexicans, and Germans accounting for 70% of the compositions. Students’ proficiency levels are validly determined by their performance in a placement test when they start or advance to a language course. The EFCAMDAT is a pseudo-longitudinal corpus containing essays written by learners whose proficiency levels span from A1 to C2 on the Common European Framework of Reference for Languages (CEFR), which enables the exploration of a general developmental pattern of learners’ collocational knowledge.

Compositions in the EFCAMDAT are elicited by means of writing tasks on a wide variety of topics and graded by teachers. There are 128 different writing activities in the full course which consist of topics, such as editing an online profile, writing to a pen pal, and reporting a news story ( Geertzen et al., 2013 ). These topics help to generate varied situations for eliciting a wide variety of collocations from the learners. Each writing task suggests an expected word count according to the complexity of the topic and learners’ language proficiency, ranging from approximately 30 words in lower levels to approximately 150 words in higher levels.

To obtain a manageable data size, we randomly extracted 3,600 compositions from each of the A1, A2, B1, and B2 CEFR levels for analysis. Due to the relatively small number of essays written by advanced learners, compositions written by learners at the C1 and C2 levels were treated as a whole to represent essays written by C level learners, from which 3,600 scripts were extracted. Table 1 presents a summary of the total number of words of extracted data.


Table 1 . Summary of randomly extracted learner data.

The EFCAMDAT data were uploaded to Sketch Engine (Kilgarriff et al., 2014) and tagged with the TreeTagger tag set (Santorini, 1990). Firstly, two corpus query language (CQL) queries, namely [lemma="make"][]{0,4}[tag="NN.?"] and [lemma="take"][]{0,4}[tag="NN.?"], were run on Sketch Engine to extract make/take + noun sequences. Each query matches a form of the verb, followed by up to four intervening tokens, followed by a noun. All the retrieved sequences were manually checked to remove the infelicitous or incomplete ones. For example, take dog may be extracted from take care of my dog, which would be removed as dog is not the direct object of take.
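The effect of the query can be imitated outside Sketch Engine. The following is a minimal sketch rather than the authors' procedure: it assumes the text is already available as (word, lemma, tag) triples, and the function name `extract_candidates` and the tags in the example are illustrative.

```python
import re

NOUN_TAG = re.compile(r"NN.?")  # same tag pattern as in the CQL query


def extract_candidates(tagged, verbs=("make", "take"), max_gap=4):
    """Return (verb lemma, noun lemma) pairs with at most max_gap tokens between them."""
    pairs = []
    for i, (_, lemma, _) in enumerate(tagged):
        if lemma not in verbs:
            continue
        # The noun may sit 1 to max_gap + 1 positions after the verb.
        window = tagged[i + 1 : i + 2 + max_gap]
        for _, noun_lemma, tag in window:
            if NOUN_TAG.fullmatch(tag):
                pairs.append((lemma, noun_lemma))
    return pairs


# "take care of my dog" as (word, lemma, tag) triples; tags are illustrative.
sentence = [
    ("take", "take", "VV"),
    ("care", "care", "NN"),
    ("of", "of", "IN"),
    ("my", "my", "PP$"),
    ("dog", "dog", "NN"),
]
pairs = extract_candidates(sentence)
```

The sketch returns both take + care and take + dog, which shows why the manual check described above is needed: a window-based match cannot tell that dog is not the direct object of take.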

Secondly, in line with Durrant and Schmitt (2009), the British National Corpus (BNC) was used as the reference corpus to retrieve frequency information for the component words within each sequence and for the sequence as a whole. We first downloaded the BNC XML edition (BNC XML, available at http://www.natcorp.ox.ac.uk/; BNC Consortium, 2007) and removed its XML tags. It was then uploaded to Sketch Engine for the extraction of make/take + noun sequences using the same corpus query language. Afterward, the retrieved sequences devoid of meaning were eliminated by hand. Ninety percent of the data in the BNC consists of written language extracted from a wide range of registers, such as novels, news, and theses. It contains a 100-million-word sample of modern British English from the late 20th century, which makes it a credible reference source. What is more, its convenient accessibility and handy XML format have made it an ideal reference corpus for our analysis.

Thirdly, a Python (version 3.7.2) script was written to calculate the t-score of each make/take + noun sequence found in the learner corpus based on its observed frequency in the reference corpus, the BNC. Among the various measures of collocational association strength, the t-score was selected to identify collocations in the learner data. The t-score is considered to be one of the major measures of collocation strength and more reliable than other measures, such as the z-score (Schmitt, 2010). Following Durrant (2008), make/take + noun pairings in our research with a t-score higher than 3.9 were regarded as collocations. Table 2 summarizes the number of make/take + noun combinations and collocations used by learners across proficiencies. About 61% of the make/take + direct noun object combinations were identified as collocations.
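The text does not reproduce the script or the exact formula, so the following is only a sketch of the computation under a common definition of the t-score in corpus linguistics, t = (O - E) / sqrt(O), where O is the observed pair frequency and E = f(verb) × f(noun) / N is the frequency expected by chance. The function name `t_score` and the counts are hypothetical.

```python
import math


def t_score(pair_freq, verb_freq, noun_freq, corpus_size):
    """t-score = (O - E) / sqrt(O), with E = f(verb) * f(noun) / N."""
    if pair_freq == 0:
        return 0.0  # an unseen pair cannot qualify as a collocation
    expected = verb_freq * noun_freq / corpus_size
    return (pair_freq - expected) / math.sqrt(pair_freq)


T_THRESHOLD = 3.9  # cut-off reported in the text

# Hypothetical counts for illustration only, not real BNC frequencies.
score = t_score(
    pair_freq=100, verb_freq=200_000, noun_freq=10_000, corpus_size=100_000_000
)
is_collocation = score > T_THRESHOLD
```

With these invented counts the expected frequency is 20, so the score is (100 - 20) / 10 = 8.0 and the pairing would pass the 3.9 threshold.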


Table 2 . Summary of make/take + noun patterns identified from learner data.

The following information was annotated for each collocation for analysis. First of all, the UCREL Semantic Analysis System (USAS; Rayson et al., 2004) was used to annotate the noun elements in each collocation with semantic tags via the free USAS English web tagger. USAS is a semantic analysis system based on Tom McArthur’s Longman Lexicon of Contemporary English (McArthur, 1981) and is reported to achieve a precision as high as 91% (Rayson et al., 2004). The semantic tags refer to semantic fields, which are collections of related word senses. Word senses are grouped into 21 major discourse fields. The tagged results were manually checked and corrected. The list of semantic fields and their sub-categories is summarized in Table 3, based on the tag set description provided in the USAS (Piao et al., 2015).


Table 3 . Summary of USAS tag set.

Secondly, the English Vocabulary Profile (EVP; Kurtes and Saville, 2008) was employed to annotate noun elements with difficulty levels. The EVP project was initiated to substantiate the vocabulary that L2 learners typically know at different CEFR levels. This project assigns each word in its wordlist a CEFR level between A1 and C2, underpinned by extensive research on the 50-million-word Cambridge Learner Corpus and by curricula analysis (Capel, 2010). The EVP was reported to be an effective and promising benchmark (Leńko-Szymańska, 2015). The VLOOKUP function in Microsoft Excel was used to assign EVP levels to the noun elements within each collocation.

Moreover, to verify whether advanced learners tend to use collocations comprised of longer noun elements, the LEN function was used to calculate the length of each noun.
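The two spreadsheet steps above, a table lookup and a character count, have direct counterparts in Python. The sketch below is an illustration rather than the authors' workflow: `EVP_LEVELS` is a hypothetical excerpt whose levels are placeholders, not real EVP entries, and `annotate_noun` is an invented helper name.

```python
# Hypothetical word -> CEFR level table; the levels are placeholders,
# NOT real English Vocabulary Profile entries.
EVP_LEVELS = {"photo": "A1", "decision": "B1", "initiative": "C1"}


def annotate_noun(noun, levels=EVP_LEVELS):
    """Return the noun with its looked-up level (None if unlisted) and its length."""
    return {
        "noun": noun,
        "level": levels.get(noun),  # plays the role of VLOOKUP
        "length": len(noun),        # plays the role of LEN
    }


rows = [annotate_noun(n) for n in ["photo", "decision", "unlisted"]]
```

A noun missing from the table gets a level of None instead of a spreadsheet error, which makes unlisted words easy to filter before comparing levels.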

The Semantic Fields of Noun Elements Within Make/Take  + Noun Collocations

This section analyzes the semantic features of noun elements in learners’ collocation use. Figure 1 shows the proportion of noun elements belonging to different semantic fields used by learners at each CEFR level. What stands out most is that the percentage of nouns belonging to semantic fields A (general & abstract terms), X (psychological actions, states, & processes), and I (money & commerce in industry) increased as learners’ language ability improved. In contrast, the number of nouns belonging to semantic fields B (the body & the individual) and F (food & farming) decreased at higher levels. Moreover, the results indicate that noun elements belonging to certain semantic fields are used only by learners at relatively higher proficiencies. For instance, nouns within semantic field G (government & the public domain) are used only by B2 and C learners, and those within E (emotional actions, states, & processes) are used by learners above the A2 level.

Figure 1 . Percentage of nouns in each semantic field across CEFR level.

Table 4 displays the two semantic fields accounting for the highest ratio at each CEFR level, as well as frequently used examples from each. Nouns from semantic fields A and S account for the highest ratios among the collocations used by advanced learners.

Table 4 . Top two semantic fields at each level.

To better characterize the CEFR levels in terms of the distribution of the semantic fields of noun elements, a correspondence analysis was conducted using R 3.5.1, with learners’ proficiency levels as the row variable and semantic fields as the column variable. Correspondence analysis summarizes multiple data sets and visualizes their relationship in a two-dimensional graph ( Ishikawa et al., 2010 ). By plotting the groups of compositions at each CEFR level and the semantic features of noun elements together in this two-dimensional space, we can observe which features best distinguish each group. This method fits the current analysis because it can deal with categorical data ( Ishikawa et al., 2010 ).

Figure 2 is the bi-plot of CEFR levels and the semantic fields of noun elements. The cumulative contribution rate of Dimension 1 and Dimension 2 in our correspondence analysis is 89.31%, indicating that these two dimensions explain most of the variance between CEFR levels and semantic fields. To understand the relationship between the row and column variables, we first draw a vector connecting the origin and the plotted point of a semantic field (K, for instance). A perpendicular line is then drawn from the position of each CEFR level to this vector, and we observe how close the foot of each perpendicular lies to the point K. It can be seen from the bi-plot that A2 is the closest, A1 follows, and the other levels are the furthest. Accordingly, noun elements from semantic field K are most characteristic of A2 learners and least associated with intermediate and advanced learners. All in all, the bi-plot shows that A1 learners are more likely to use nouns belonging to B (The body & the individual) and F (Food & farming), while A2 learners prefer those from H (Architecture, buildings, houses, & the home) and K (Entertainment, sports, & games). Moreover, the higher-level learners (B1, B2, and C) are plotted very close to each other, which shows that their use of noun elements is relatively similar in terms of semantic features. Learners at these three levels appear to rely on nouns belonging to A (General & abstract terms), S (Social actions, states, & processes), E (Emotional actions, states, & processes), and Q (Linguistic actions, states, & processes), among others.

Figure 2 . Bi-plot of correspondence analysis: CEFR levels and semantic fields of noun elements.
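
For readers who want to see the mechanics behind such a bi-plot, the following is a minimal sketch of simple correspondence analysis computed through a singular value decomposition, together with the projection reading described above. It is not the authors' R code: the function name, the four levels and fields shown, and all counts are invented for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def correspondence_analysis(table):
    """Simple correspondence analysis of a two-way contingency table via SVD."""
    counts = np.asarray(table, dtype=float)
    P = counts / counts.sum()              # correspondence matrix
    r = P.sum(axis=1)                      # row masses
    c = P.sum(axis=0)                      # column masses
    expected = np.outer(r, c)              # proportions expected under independence
    S = (P - expected) / np.sqrt(expected)  # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    inertia = sv ** 2                      # principal inertia of each dimension
    explained = inertia / inertia.sum()    # contribution rate of each dimension
    row_coords = (U * sv) / np.sqrt(r)[:, None]     # principal coordinates, rows
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]  # principal coordinates, columns
    return explained, row_coords, col_coords


# Invented counts: rows are CEFR levels, columns are semantic fields.
levels = ["A1", "A2", "B1", "B2"]
fields = ["B", "K", "A", "S"]
counts = [
    [30, 12, 10, 8],
    [14, 28, 12, 10],
    [8, 10, 30, 22],
    [6, 8, 28, 30],
]
explained, row_coords, col_coords = correspondence_analysis(counts)
print("Cumulative contribution of Dim 1 + Dim 2:", explained[:2].sum())

# Reading the bi-plot numerically: project each level onto the vector that
# runs from the origin to field K, using the first two dimensions only.
k = col_coords[fields.index("K"), :2]
projection = row_coords[:, :2] @ k / np.linalg.norm(k)
for level, value in zip(levels, projection):
    print(level, round(float(value), 3))
```

A larger projection value means the level lies further along the direction of field K, which corresponds to dropping a perpendicular from the level's point onto that vector.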

The Difficulty Level of Noun Elements in Collocations

To clarify whether advanced learners tend to use collocations consisting of more difficult noun elements, we annotated each noun element with its difficulty level, i.e., its EVP level. Nouns annotated as A1 are assumed to be the easiest words, while those annotated as C2 are the most difficult. Goodman and Kruskal’s gamma coefficient was used to measure the association between the two ordinal variables ( G = 0.36, p < 0.01), indicating a positive relationship between learners’ CEFR level and the EVP level of noun elements. Table 5 presents the adjusted residuals from our analysis, which reflect the difference between the observed and expected values for each cell.

Table 5 . Crosstabulation of proficiency level and difficulty level of noun elements.

According to Table 5 , the adjusted residuals of A1 and A2 learners for noun elements at the A1 difficulty level are the greatest, while those of C learners are the smallest. This implies that A2 learners are most inclined to use noun elements at the A1 level, whereas C-level learners are least inclined to do so. Meanwhile, the residuals of C learners for noun elements at the C1 and C2 difficulty levels are the greatest, indicating that advanced learners are the most likely to use these difficult nouns. The residual analysis further confirms the tendency of students with higher English proficiency to use more difficult words.
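
The two statistics used in this subsection can be sketched as follows, assuming the standard textbook definitions: gamma as (C − D) / (C + D) over concordant and discordant pairs, and the adjusted residual as the observed minus expected count divided by its estimated standard error. The function names and the toy table are illustrative assumptions; the counts are invented and are not the values in Table 5.

```python
from math import sqrt


def goodman_kruskal_gamma(table):
    """Gamma = (C - D) / (C + D) for a table with ordered rows and columns."""
    rows, cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):      # only cells in lower rows
                for m in range(cols):
                    if m > j:                 # higher on both variables
                        concordant += table[i][j] * table[k][m]
                    elif m < j:               # higher on one, lower on the other
                        discordant += table[i][j] * table[k][m]
    return (concordant - discordant) / (concordant + discordant)


def adjusted_residuals(table):
    """(observed - expected) / sqrt(expected * (1 - row share) * (1 - col share))."""
    n = sum(map(sum, table))
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    result = []
    for i, row in enumerate(table):
        result_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            se = sqrt(expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n))
            result_row.append((observed - expected) / se)
        result.append(result_row)
    return result


# Invented 2 x 2 table: rows = lower/higher learner level,
# columns = easier/harder noun level.
toy = [[20, 10], [10, 20]]
print(goodman_kruskal_gamma(toy))
print(adjusted_residuals(toy))
```

Cells with large positive adjusted residuals are used more often than independence would predict, which is how the pattern for each proficiency group is read off.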

The Length of Noun Elements in Collocations

The relationship between learners’ proficiency levels and the length of noun elements is presented in Figure 3 . As can be seen from the graph, learners’ collocation use at B and C levels contains a greater proportion of noun elements composed of seven or more letters.

Figure 3 . The relationship between the CEFR level and length of noun elements.

We employed mixed-effects models to analyze the contribution of multiple factors to the length of the noun elements used by learners. Learners’ CEFR level was set as the fixed effect, while individual learner, nationality, and writing topic were set as random effects (see footnote 1). The A1 level was set as the reference level of the categorical predictor variable. The statistical analysis was conducted using the lmerTest package in R (version 3.6.3).

Table 6 presents the parameter estimates from the model. The results indicate a statistically significant difference in noun length between the reference level (A1) and the B1 level ( Estimate = 0.92, SE = 0.45, t = 2.04, p < 0.05). Accordingly, the expected noun length at the B1 level is 0.92 letters longer than at A1. Moreover, there was a significant difference between the A1 level and the C level ( Estimate = 0.94, SE = 0.44, t = 2.1, p < 0.05), suggesting that the expected noun length is 0.94 letters longer at the C level than at the A1 level. Nevertheless, the analysis found no significant differences between A1 and A2 or between A1 and B2.

Table 6 . Summary of the mixed effects model for the length of noun elements.

The present study was designed to provide insights into the characteristics of learners’ collocation use at different proficiency levels. We randomly selected 18,000 essays from the EFCAMDAT and extracted 4,071 make/take  + noun pairings. T-score value, a measure of collocational strength, for each make/take  + noun pairing was calculated, based on which 2,501 pairings were identified as collocations. Those collocations were annotated with necessary information and then examined concerning the semantic features, difficulty levels, and length of noun elements in collocation. Our focus was on the different characteristics of EFL learners’ collocation use at each proficiency level.
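
As a reminder of how a t-score of this kind is usually obtained, here is a minimal sketch assuming the common corpus-linguistic formulation t = (O − E) / √O, where O is the observed frequency of the pairing and E is the frequency expected if verb and noun co-occurred by chance. The function name and all frequencies are invented for illustration, and the exact formulation used in the study may differ.

```python
from math import sqrt


def t_score(pair_freq, verb_freq, noun_freq, corpus_size):
    """t = (O - E) / sqrt(O), with E = f(verb) * f(noun) / N."""
    if pair_freq <= 0:
        raise ValueError("the pairing must be attested at least once")
    expected = verb_freq * noun_freq / corpus_size
    return (pair_freq - expected) / sqrt(pair_freq)


# Invented reference-corpus frequencies for one make + noun pairing.
score = t_score(pair_freq=100, verb_freq=10_000, noun_freq=5_000,
                corpus_size=1_000_000)
print(score)
```

A pairing is then treated as a collocation when its score exceeds whatever cut-off the analyst has chosen.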

Over half of the make/take  + noun combinations used by learners at each proficiency level were identified as collocations with t -scores higher than 3.9. In terms of the semantic features of noun elements within collocation, the quantitative analysis has found an association between the proficiency level and the semantic fields of noun elements. It was found that beginner learners mainly used collocations containing nouns belonging to the semantic fields B, K, H, and F which were about everyday activities and concrete objects. The advanced learners are found to behave in a similar way regarding the semantic elements they used. They tended to use collocations belonging to semantic fields, such as A, S, E, and Q, which are concerned with abstract social/psychological/political topics. Our analysis has shown that proficient learners tend to use noun elements of higher difficulty levels. Moreover, although there was no significant difference in the length of noun elements used by A1 and A2 learners, B1 and C learners were found to use longer nouns than A1 learners. To summarize, our analysis implies that EFL beginners tend to use make/take  + noun collocations containing relatively concrete, easy, and short noun elements, while advanced EFL learners manage to combine the common verbs with semantically more complicated, difficult, and relatively longer nouns for various communication tasks. This result aligns with previous research conducted by Namvar (2012) , who found a strong and positive relationship between learners’ collocational knowledge and their overall proficiency. In addition, the current analysis also echoes that of Nizonkiza (2012) which suggested that learners’ productive collocational knowledge develops as their proficiency increases.

The present results are significant as they facilitate our understanding of the developmental patterns of EFL learners’ productive collocational competence. EFL learners are widely assumed to focus on learning individual words without paying attention to their co-occurring companions ( Wray, 2002 ). However, our analysis has shown that over half of the make/take + noun combinations used by learners in writing tasks are native-like collocations. Meanwhile, their collocational competence keeps growing until they are able to use collocations containing noun elements from more varied semantic fields, which may enable them to accomplish diversified communication activities. This study has identified that EFL learners tend to try out combinatorial mechanisms and mimic the way native speakers combine words from the early stages of language learning. Our study supports Durrant (2008) and Durrant and Schmitt (2010) , who claimed that EFL learners “do retain information about what words appear together in their input” (p. 1) and that intensive exposure to collocations can improve their language acquisition. Therefore, the findings can be considered positive news for EFL teachers, as students’ productive collocational knowledge appears to develop as their proficiency grows. In accordance with Boers et al. (2014) , we propose that involving learners in extensive exposure to collocations and varied communication tasks would encourage the deliberate learning of collocations and elicit diverse collocations from them.

Our results provide implications for the teaching of collocations as well. We found that the semantic fields of noun elements characterize learners’ collocation use at different proficiencies. Beginner learners’ ability to combine verbs with abstract noun elements from semantic fields such as social action and economics appears to be underdeveloped. Instruction that facilitates the mastery of collocations containing noun elements within such semantic fields can be particularly beneficial to EFL learners at lower proficiencies. With respect to the ideal way of presenting collocations, Lewis (1993) stated that vocabulary organized according to topics or semantic fields is memorized more effectively than randomly occurring vocabulary. Meanwhile, Karoly (2005) has also emphasized that learners should record collocations in an organized way. Therefore, we encourage EFL teachers and material compilers to present collocations according to specific semantic fields and have learners acquire them collectively.

The findings also expand the previous work which set out to examine the possibility of utilizing learner’s collocational competence as a possible criterial feature. According to Hawkins and Buttery (2010) , criterial feature refers to “linguistic properties that are characteristic and indicative of L2 proficiency at each level, on the basis of which examiners make their practical assessments (p. 2).” It has immediate implications for EFL learners, teachers, as well as teaching material compilers. The present study captures a set of properties characterizing learners’ productive collocational knowledge across CEFR levels. Collocations within semantic fields, for instance, G and E, are used by learners at certain proficiency levels only, which appear to be promising criterial features that could distinguish learners at adjacent CEFR levels. Future studies on the current topic are highly recommended.

However, the findings need to be interpreted with caution for the following reasons. Firstly, the use of nouns from different semantic fields tends to be greatly influenced by the topic of the given task. Many studies have shown that the lexical choices language users make differ markedly across disciplines, registers, and genres ( Biber and Conrad, 1999 ; Hyland, 2008 ). L2 learners’ language production is no exception. Alexopoulou et al. (2017) investigated pairs of tasks in three task types, viz. narrative, descriptive, and professional, and found that topics such as cruise complaints elicit compositions with higher linguistic complexity than topics such as a job ad. Higher Englishtown-level learners might have been assigned more complicated writing tasks that required abstract nouns for successful completion. Therefore, further research based on essays written under the same pair of tasks would offer more valid information on learners’ collocation use. Secondly, it is important to bear in mind that the vocabulary learning mechanism is extremely complicated and dynamic. The increase in learners’ overall proficiency and their accumulated learning of individual words might greatly influence which words they decide to use in collocations. The investigated properties of collocation use may therefore also be a result of the increase in overall proficiency.

The present study investigated the properties of learners’ productive collocational competence at each CEFR level. The main findings are that beginner learners are able to use verb + noun collocations consisting of nouns concerning concrete objects and daily activities, while intermediate and advanced learners are able to use collocations containing semantically varied and complicated noun elements. Moreover, the results suggest that proficient learners tend to use more difficult and relatively longer noun elements in make/take + noun collocations. The findings of this study have a number of practical implications for language teaching and the exploration of criterial features. The study also provides a time- and energy-efficient way of identifying collocations using a programming language, based on the observed frequency of the collocations and their constituent words in a large-scale native corpus.

Data Availability Statement

The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author.

Author Contributions

MA is the main corresponding author of this manuscript and completed the overall write-up of the manuscript. The analysis and discussion sections were written by XD. All authors contributed to the article and approved the submitted version.

This project is funded by the research supporting project number (RSP-2021/251), King Saud University, Riyadh, Saudi Arabia.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Acknowledgments

We are thankful to the reviewers for their dedicated efforts to substantiate this manuscript. We have improved the paper with their valuable comments and suggestions.

1. It should be noted that the topic ID used in the present study may not be as credible as it could have been, as the original EFCAMDAT data happen to contain texts which did not coincide with the listed task prompt. The EFCAMDAT Cleaned Subcorpus ( Shatz, 2020 ), a derivative corpus of EFCAMDAT, has achieved higher reliability in the annotation of task prompts (topic ID) and can be an ideal option in future analysis.

Aisenstadt, E. (1979). Collocability restrictions in dictionaries. Int. J. App. Ling. 45, 71–74.

Alexopoulou, T., Michel, M., Murakami, A., and Meurers, D. (2017). Task effects on linguistic complexity and accuracy: A large-scale learner corpus analysis employing natural language processing techniques. Lang. Learn. 67(Suppl.1), 180–208. doi: 10.1111/lang.12232

Altenberg, B., and Granger, S. (2001). The grammatical and lexical patterning of MAKE in native and non-native student writing. Appl. Linguis. 22, 173–195. doi: 10.1093/applin/22.2.173

Altun, H. (2021). The learning effect of corpora on strong and weak collocations: implications for corpus-based assessment of collocation competence. Inter. J. Assess. Tools Educ. 8, 509–526. doi: 10.21449/ijate.845051

Bahns, J. (1993). Lexical collocations: a contrastive view. ELT J. 47, 56–63. doi: 10.1093/elt/47.1.56

Barcroft, J. (2006). Can writing a new word detract from learning it? More negative effects of forced output during vocabulary learning. Second. Lang. Res. 22, 487–497. doi: 10.1191/0267658306sr276oa

Biber, D., and Conrad, S. (1999). Lexical bundles in conversation and academic prose. Lang. Comput. 26, 181–190.

BNC Consortium. (2007). British National Corpus, XML edition. Oxford Text Archive. Available at: http://hdl.handle.net/20.500.12024/2554 (Accessed April 1, 2019).

Boers, F., Demecheleer, M., Coxhead, A., and Webb, S. (2014). Gauging the effects of exercises on verb–noun collocations. Lang. Teach. Res. 18, 54–74. doi: 10.1177/1362168813505389

Cao, D., and Badger, R. (2021). Cross-linguistic influence on the use of L2 collocations: the case of Vietnamese learners. Appl. Linguistic. Rev. doi: 10.1515/applirev-2020-0035

Capel, A. (2010). Insights and issues arising from the English profile wordlists project. Res. Notes 41, 2–7. doi: 10.1017/S2041536210000048

Cowie, A. P. (1992). “Multiword lexical units and communicative language teaching,” in Vocabulary and Applied Linguistics (London: Palgrave Macmillan), 1–12.

Cowie, A., Cater, R., and McCarthy, M. (1988). Stable and creative aspects of vocabulary. Vocabulary and language teaching , 139.

Dietrich, R., Klein, W., and Noyau, C. (1995). The Acquisition of Temporality in a Second Language. Vol. 7. John Benjamins Publishing: Netherlands.

Durrant, P. (2008). High Frequency Collocations and Second Language Learning. PhD thesis, University of Nottingham, Nottingham.

Durrant, P., and Schmitt, N. (2009). To what extent do native and non-native writers make use of collocations? De Gruyter Mouton 47, 157–177. doi: 10.1515/iral.2009.007

Durrant, P., and Schmitt, N. (2010). Adult learners’ retention of collocations from exposure. Second. Lang. Res. 26, 163–188. doi: 10.1177/0267658309349431

Erman, B., and Warren, B. (2000). The idiom principle and the open choice principle. Text Talk 20, 29–62. doi: 10.1515/text.1.2000.20.1.29

Fan, M. (2009). An exploratory study of collocational use by ESL students - A task based approach. System 37, 110–123. doi: 10.1016/j.system.2008.06.004

Fillmore, C. J. (1979). “On fluency,” in Individual Differences in Language Ability and Language Behavior. eds. C. J. Fillmore, D. Kempler, and W. S.-Y. Wang (New York, NY: Elsevier), 85–101.

Gablasova, D., Brezina, V., and McEnery, T. (2017). Collocations in corpus-based language learning research: identifying, comparing, and interpreting the evidence. Lang. Learn. 67(Suppl. 1), 155–179. doi: 10.1111/lang.12225

Geertzen, J., Alexopoulou, T., and Korhonen, A. (2013). Automatic Linguistic Annotation of Large Scale L2 Databases: The EF-Cambridge Open Language Database (EFCAMDAT). Proceedings of the 31st Second Language Research Forum . (Somerville, MA: Cascadilla Proceedings Project).

Goldschneider, J. M., and DeKeyser, R. M. (2001). Explaining the “natural order of L2 morpheme acquisition” in English: A meta-analysis of multiple determinants. Lang. Learn. 51, 1–50. doi: 10.1111/1467-9922.00147

González, A. O., and Ramos, M. A. (2013). A comparative study of collocations in a native corpus and a learner corpus of Spanish. Procedia Soc. Behav. Sci. 95, 563–570. doi: 10.1016/j.sbspro.2013.10.683

Granger, S., and Bestgen, Y. (2014). The use of collocations by intermediate vs. advanced non-native writers: A bigram-based study. Inter. Rev. Appl. Ling. Lang. Teach. 52, 229–252. doi: 10.1515/iral-2014-0011

Granger, S., and Paquot, M. (2009). Lexical Verbs in Academic Discourse: A Corpus-Driven Study of Learner Use. London: Bloomsbury.

Gyllstad, H. (2007). Testing English Collocations: Developing Receptive Tests for Use with Advanced Swedish Learners. Sweden: Lund University.

Hawkins, J. A., and Buttery, P. (2010). Criterial features in learner corpora: theory and illustrations. Eng. Profile J. 1:103. doi: 10.1017/S2041536210000103

Henriksen, B. (2013). Research on L2 learners’ collocational competence and development -a progress report. (eds.) C. Bardel, C. Lindqvist, and B. Laufer 29–56.

Hoey, M. (1991). Patterns of Lexis in Text. United Kingdom: Oxford University Press.

Howarth, P. A. (2013). Phraseology in English Academic Writing: Some Implications for Language Learning and Dictionary Making. Vol . 75. Berlin: Walter de Gruyter.

Huang, Y., Murakami, A., Alexopoulou, T., and Korhonen, A. (2018). Dependency parsing of learner English. Inter. J. Corpus Ling. 23, 28–54. doi: 10.1075/ijcl.16080.hua

Hunston, S., and Francis, G. (2000). Pattern Grammar: A Corpus-Driven Approach to the Lexical Grammar of English. John Benjamins Publishing: Netherlands.

Hyland, K. (2008). As can be seen: lexical bundles and disciplinary variation. Engl. Specif. Purp. 27, 4–21. doi: 10.1016/j.esp.2007.06.001

Ishikawa, S., Maeda, T., and Yamasaki, M. (2010). Gengo Kenkyu no tame no Toukei Nyumon [An introduction to statistics for language studies]. Kuroshio Publishing.

Karoly, A. (2005). The importance of raising collocational awareness in the vocabulary development of intermediate level learners of English. Eger J. Eng. Stud. 5, 58–69.

Kilgarriff, A., Baisa, V., Bušta, J., Jakubíček, M., Kovář, V., Michelfeit, J., et al. (2014). The sketch engine: ten years on. Lexicography 1, 7–36. doi: 10.1007/s40607-014-0009-9

Kim, S. (2018). EFL learners’ dictionary consultation behaviour during the revision process to correct collocation errors. Int. J. Lexico. 31, 312–326.

Kurtes, S., and Saville, N. (2008). The English Profile Programme – An overview, Research Notes. 33, Cambridge: Cambridge ESOL. 2–4.

Laufer, B., and Waldman, T. (2011). Verb-noun collocations in second language writing: A corpus analysis of Learners’ English. Lang. Learn. 61, 647–672. doi: 10.1111/j.1467-9922.2010.00621.x

Leńko-Szymańska, A. (2015). The English vocabulary profile as a benchmark for assigning levels to learner corpus data. Learner Corpora Lang. Test. Assess. 69, 115–140. doi: 10.1075/scl.70.05len

Lewis, M. (1993). The Lexical Approach: The State of ELT and a Way Forward. Hove: Language Teaching Publication.

Lee, S., and Shin, S. (2021). Towards Improved Assessment of L2 Collocation Knowledge. Lang. Assess. Quarterly 18, 419–445. doi: 10.1080/15434303.2021.1908295

Li, J., and Schmitt, N. (2010). The Development of Collocation Use in Academic Texts by Advanced L2 Learners: A Multiple Case Study Approach. In Perspectives on Formulaic Language: Acquisition and Communication (London, New York: Continuum), 23–46.

Liu, K., and Afzaal, M. (2020). Lexical Bundles: A Corpus-driven investigation of Academic Writing Teaching to ESL Undergraduates. Int. J. Emerg. Technol. 11, 476–482.

Liu, K., and Afzaal, M. (2021). Syntactic complexity in translated and non-translated texts: A corpus-based study of simplification. Plos one 16:e0253454. doi: 10.1371/journal.pone.0262050

McArthur, T. (1981). Longman Lexicon of Contemporary English. United Kingdom: Longman Group Limited.

Namvar, F. (2012). The relationship between language proficiency and use of collocation by Iranian EFL students. Lang. Ling. Lit. 18:41.

Nesselhauf, N. (2005). Collocations in a Learner Corpus. Vol . 14 . John Benjamins Publishing: Netherlands.

Nguyen, T., and Webb, S. (2016). Examining second language receptive knowledge of collocation and factors that affect learning. Lang. Teach. Res. 21, 298–320.

Nizonkiza, D. (2012). Quantifying controlled productive knowledge of collocations across proficiency and word frequency levels. Stud. Second Lang. Learn. Teach. 2, 67–92. doi: 10.14746/ssllt.2012.2.1.4

Nizonkiza, D. (2017). “Predictive power of controled productive knowledge of collocations over L2 proficiency,” in Usage-Based Approaches to Language Acquisition and Language Teaching , Vol . 11 (Berlin: De Gruyter Mouton), 263–286.

Peters, E. (2016). The learning burden of collocations: The role of interlexical and intralexical factors. Lang. Teach. Res. 20, 113–138. doi: 10.1177/1362168814568131

Piao, S. S., Bianchi, F., Dayrell, C., D’egidio, A., and Rayson, P. (2015). Development of the multilingual semantic annotation system. Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies .

Phoocharoensil, S. (2014). Exploring learners’ developing L2 collocational competence. Theor. Prac. Lang. Stu. 4:2533

Rankin, T., and Unsworth, S. (2016). Beyond poverty: engaging with input in generative SLA. Second. Lang. Res. 32, 563–572. doi: 10.1177/0267658316648732

Rayson, P., Archer, D., Piao, S., and McEnery, A. M. (2004). The UCREL semantic analysis system.

Saito, K., and Liu, Y. (2021). Roles of collocation in L2 oral proficiency revisited: different tasks, L1 vs. L2 raters, and cross-sectional vs. longitudinal analyses. Second. Lang. Res. doi: 10.1177/0267658320988055

Santorini, B. (1990). Part-of-speech tagging guidelines for the penn treebank project (3rd revision). Tech. Rep. 570:4.

Schmitt, N. (2010). Researching Vocabulary: A Vocabulary Research Manual. Germany: Springer.

Shatz, I. (2020). Refining and modifying the EFCAMDAT: lessons from creating a new corpus from an existing large-scale English learner language database. Inter. J. Lear. Corpus Res. 6, 220–236. doi: 10.1075/ijlcr.20009.sha

Siyanova, A., and Schmitt, N. (2008). L2 learner production and processing of collocation: A multi-study perspective. Can. Mod. Lang. Rev. 64, 429–458. doi: 10.3138/cmlr.64.3.429

Siyanova-Chanturia, A. (2015). Collocation in beginner learner writing: A longitudinal study. System 53, 148–160. doi: 10.1016/j.system.2015.07.003

Szudarski, P., and Carter, R. (2016). The role of input flood and input enhancement in EFL learners’ acquisition of collocations. Int. J. Appl. Linguist. 26, 245–265. doi: 10.1111/ijal.12092

Tsai, K.-J. (2015). Profiling the collocation use in ELT textbooks and learner writing. Lang. Teach. Res. 19, 723–740. doi: 10.1177/1362168814559801

Uchida, S. (2015). Kihon doushi no korokeshonn nanido sokutei [The measurement of the difficulty level of collocation composed of basic verbs: an investigation of a teaching material corpus based on CEFR levels]. Gengo Shori Gakkai Nenji Daikai Happyou Ronbunshu 21, 880–883.

Wang, Y., and Shaw, P. (2008). Transfer and universality: collocation use in advanced Chinese and Swedish learner English. ICAME J. 32, 201–232.

Wray, A. (2002). Formulaic Language and the Lexicon. New York: Cambridge University Press.

Yamashita, J., and Jiang, N. (2010). L1 influence on the acquisition of L2 collocations: Japanese ESL users and EFL learners acquiring English collocations. TESOL Q. 44, 647–668. doi: 10.5054/tq.2010.235998

Keywords: collocations, foreign language writing, lexical developmental patterns, EFCAMDAT, corpus analysis

Citation: Du X, Afzaal M and Al Fadda H (2022) Collocation Use in EFL Learners’ Writing Across Multiple Language Proficiencies: A Corpus-Driven Study. Front. Psychol . 13:752134. doi: 10.3389/fpsyg.2022.752134

Received: 02 August 2021; Accepted: 06 January 2022; Published: 09 February 2022.

Copyright © 2022 Du, Afzaal and Al Fadda. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Muhammad Afzaal, [email protected] , orcid.org/0000-0003-4649-781X

The effectiveness of corpus-based training on collocation use in L2 writing for Chinese senior secondary school students

Liuqin Fang is an English teacher at Gertrude Simon Lutheran College. She received her Master’s Degree in TESOL from The Education University of Hong Kong and her Bachelor’s Degree in TESOL Education from Brigham Young University Hawaii. Her research interests include corpus linguistics, data-driven learning (DDL), and English for academic purposes (EAP).

Qing Ma is an associate professor at the Department of Linguistics and Modern Language Studies, The Education University of Hong Kong. Her main research interests include second language vocabulary acquisition, corpus linguistics, computer assisted language learning (CALL) and mobile assisted language learning (MALL).

Jiahao Yan is a Research Associate and post-graduate student at the Education University of Hong Kong. He has taken part in a number of research projects and produced many insightful research outputs. His research interests include corpus linguistics, data-driven learning (DDL) and English for academic purposes (EAP).

Corpus tools are known to be effective in helping L2 learners improve their writing, especially regarding their use of words. Most corpus-based L2 writing research has focused on university students while little attention has been paid to secondary school L2 students. This study investigated whether senior secondary school students in China, upon receiving corpus-based training under the framework of data-driven learning (DDL), could improve their vocabulary use, especially the use of collocations, in their writing for the International English Language Testing System (IELTS) test. Twenty-two students aged 16–18 in a senior secondary school in Nanchang, China who were planning to take the IELTS exam participated in the study. Corpus of Contemporary American English (COCA) and Word and Phrase were the main corpora that the participants used to learn various search functions. Pre-writing and post-writing tests were administered to measure the effect of corpus training. In addition, a questionnaire and interviews were used to collect students’ perspectives and attitudes. The results indicate that students made improvement in word selection after three corpus training sessions, and their attitudes towards corpus use were positive even though they were restricted from using computers to access corpora inside their school.

1 Introduction

In view of rapid globalization, it is not surprising that the number of English language learners continues to increase as English is embraced as the contemporary lingua franca. Meanwhile, many researchers have noted that vocabulary and grammar learning can be considered the major focus of second/foreign language acquisition ( Pan & Xu, 2011 ; Song & Chen, 2017 ; Zhang, 2011 ). Academic writing in English is an essential skill for students planning to enter university; however, writing in English is a major challenge for students in China because traditional teaching approaches emphasize correct grammar rather than well-structured and well-argued academic writing ( Zhang & Luo, 2004 ). Consequently, many Chinese students are capable of recognizing grammatical errors but are deficient in writing academically ( Cai, 2013 ; Zhang, 2018 ), particularly regarding word choice. This is because their academic vocabulary is limited and they are unfamiliar with how words in English are habitually juxtaposed or collocated ( Nation, 2001 ). In recent decades, the number of Chinese students taking IELTS has increased rapidly. However, the failure to use learned vocabulary appropriately in academic writing is one of the biggest obstacles to achieving higher scores ( Yao, 2014 ).

Johns (1991) promoted data-driven learning (DDL), in which English learners use language data sets, such as corpora, to search for semantic information, and which has proved effective in learning collocations. Considerable research has found that corpus tools are effective in helping L2 learners improve their vocabulary use and collocation competence in academic writing (Chan & Liou, 2005; Charles, 2007; Kennedy & Miceli, 2010; Lee & Swales, 2006; Sun & Wang, 2003; Todd, 2001; Wu, 2016). However, most research on corpus-based L2 writing has focused on university students, while little research has been carried out in secondary school settings. Therefore, the present study aims to investigate the effect that corpus-tool training has on secondary school students’ academic writing, with the ultimate aim of improving their IELTS performance.

2 English writing and vocabulary learning of Chinese students

Traditionally in China, students are encouraged to produce error-free sentences from elementary school to university, and teachers focus on delivering grammar rules and analyzing students’ academic performance based on dictation and quizzes ( Duan & Yang, 2016 ). Under this learning approach, students have limited opportunities to learn content-based knowledge where unknown words are taught through context. Despite the focus on vocabulary knowledge, previous research has revealed that Chinese students have difficulty in converting their receptive vocabulary to productive vocabulary in their writing ( Wen, 2018 ; Zhai, 2016 ; Zhou, 2010 ); they also lack skills and strategies to write in English ( Sang, 2017 ). Conveying a clear written message using appropriate words is a major challenge for Chinese learners in their academic writing ( Clark & Yu, 2020 ). Chinese students may have acquired a passive knowledge of many lexical items throughout a long period of English learning, but they still fail to access this vocabulary knowledge to express their thoughts clearly ( Li & Schmitt, 2009 ).

In recent years, a growing number of Chinese students have been studying abroad, most of them taking IELTS as a qualifying standard. It is estimated that 50% of IELTS test takers in Asia are from mainland China. Of the four language skills tested in IELTS, Chinese students tend to score lower in writing than in the other skills (British Council, 2020b). In IELTS writing Task 2, test takers are required to write at least 250 words in a formal, academic style, which is difficult for many Chinese students (An, 2013; Hao, 2016; Mayor, 2006). Despite a large vocabulary and a solid foundation in grammar, Chinese students’ scores on IELTS academic writing are below average (Yao, 2014). A common error in IELTS writing is inappropriate collocation use (Futagi, Deane, Chodorow, & Tetreault, 2008).

To increase vocabulary competence, research reveals that a greater focus is needed on vocabulary learning, especially collocations (Conzett, 2000; Men, 2015; Webb, Newton, & Chang, 2013). Knowledge of collocations is important for fostering language learning (Nation & Webb, 2011); however, Chinese learners’ writing competence has been found to be weak in this area (Hou, 2014; Zou, 2019). Zou (2019) summarized six major collocation types in Chinese student writing, namely, noun-noun, noun-verb, verb-noun, adjective-noun, verb-adjectives, and adjective-adverb, where more than half of the errors belong to verb-noun collocations. Hou (2014) also identified verb-noun as well as adjective-noun collocations as the two most frequent collocation error types. Zou (2019) explained that these collocation errors can be attributed to factors such as first language transfer and misuse of synonyms. Several studies have demonstrated that exposing students to concordance lines from corpora is a useful approach to improving their collocation performance (Li, 2017; Saeedakhtar, Bagerin, & Abdi, 2020; Vyatkina, 2017; Wu, 2016).

3 Data-driven learning

Beginning with the work by Johns (1991) and Sinclair (1991) , the application of DDL has been advocated by scholars because corpora provide rich authentic examples regarding how words are used in real contexts. Gilquin and Granger (2010) also promoted DDL as a field with pedagogical value for language learners. DDL allows students to learn and internalize frequency patterns and contextual information regarding the appropriate use of words. Under this approach, learners are described as “language detectives” ( Johns, 1997 :101) as they can actively detect how words are used in context. Research has revealed considerable benefits from using corpora as a pedagogical tool in L2 writing ( Chang, 2014 ; Crosthwaite, 2017 ; Kennedy & Miceli, 2010 ; Yoon & Hirvela, 2004 ).

One of the most important roles of corpus use is error correction ( Crosthwaite, 2017 ; Vyatkina, 2017 ; Yoon & Hirvela, 2004 ). It is found that corpus-led DDL instruction can help L2 students identify and correct errors in their writing ( Crosthwaite, 2017 ; Li, 2017 ; Saeedakhtar et al., 2020 ; Todd, 2001 ; Vyatkina, 2017 ; Wu, 2016 ). Wu (2016) noted two broad types of collocation errors, (1) Adj. + noun and (2) Verb + noun, and found that Chinese learners with both high and low proficiency levels improved their collocation knowledge with the help of corpus tools. Li (2017) focused on verb-preposition collocations and found that Chinese postgraduates who had no previous knowledge of corpora displayed a significant improvement in using this type of collocation in their academic writing.

Furthermore, after being introduced to the basic functions of corpora, learners can use corpora as a reference/editing tool without receiving any comments or error correction from their teachers ( Charles, 2012 , 2014 ; Kennedy & Miceli, 2010 ; Yoon & Hirvela, 2004 ). Kennedy and Miceli (2010) , for example, designed a longitudinal training program and found that their students were able to use various search functions when facing collocation difficulties, and subsequently built their own corpus for later consultation. Studies have also revealed that students hold positive attitudes and perceptions towards corpus use ( Charles, 2012 , 2014 ; Crosthwaite, 2017 ; Yoon & Hirvela, 2004 ). Charles (2012) investigated 40 English for Academic Purposes (EAP) students’ evaluations of using their self-created corpora to solve collocation errors and found that 70% of the students wanted to use corpora in their EAP studies over the long-term.

This study addresses two research questions:

(1) In what ways does corpus training benefit secondary school students’ collocation competence and lexical quality in IELTS academic writing?

(2) How do senior secondary school students perceive corpus training, and what attitudes do they hold towards corpora use in EAP writing?

4 Methodology

4.1 Participants

Twenty-two grade eleven Chinese students volunteered to participate in the study. They were studying at a senior high school in Nanchang, Jiangxi, and wished to pursue higher education overseas. The participants were aged 16–18, and had been learning English as a foreign language for about 14 years. The average English proficiency level among the participants was 5.0 on the IELTS exam. For confidentiality, letter pseudonyms (A to V) are used throughout the paper. Normal ethical procedures were followed throughout the study.

4.2 Research instruments

In total, three instruments were designed for data collection in the current study: (1) one pre-test and one post-test, (2) one questionnaire (see Appendix II), and (3) interviews (see Appendix III). An intervention (corpus training) took place between the two writing tests. The two tests were compositions written by students before and after the intervention (three training sessions); both were assessed to investigate whether students were able to reduce their error frequency (and served as a formative exercise).

4.3 Procedure

Students were first asked to complete the pre-test. Following Wu (2016), students were not allowed to refer to any tools so that the researcher could evaluate students’ baseline writing skills and collocation knowledge. Then, by examining errors in the students’ pre-test scripts, the researcher designed the teaching materials for the corpus training sessions. Three corpus training sessions were then conducted (see Section 4.3.1) to demonstrate how corpora function as an alternative reference tool for academic writing. Following the training sessions, students completed the post-test with guidance from the corpora to measure the change in their collocation knowledge. Apart from consulting COCA and Word and Phrase Concordance, students were also allowed to use dictionaries, which are their habitual reference tools in writing. Finally, a questionnaire was administered (see Appendix II), and interviews were conducted to determine the students’ attitudes as well as their overall evaluation of corpus use.

The prompts for the two compositions focused on educational matters and were taken from the IELTS writing database on Task 2:

Pre-test prompt: “In the past, the role of teachers was to provide information. Today, students have access to wide sources of information. There is, therefore, no role of teachers in modern education. To what extent do you agree or disagree?”
Post-test prompt: “In the past, lectures were used as a way of teaching large numbers of students, but now with the development of technology for education, many people think there is no justification for attending lectures. To what extent do you agree or disagree?”

4.3.1 Training sessions

Three corpus training sessions (see Appendix IV ) were designed to introduce corpus-based reference tools for students to consult as they wrote. Three major types of collocation errors were addressed in this study as shown in the literature review, namely adjective + noun, verb + preposition, and verb + noun. The collocations were addressed in the three training sessions (30 min each), and these errors were collected from students’ scripts in the pre-test. During the three training sessions, COCA and Word and Phrase Concordance were used as the main reference tools. Participants were taught to use various search functions from the two corpora and were asked to examine the correctness of collocations as well as to provide suggested answers to the extension tasks either in class or at home (see Table 1 ). The aim of the training sessions was to provide alternative tools for enhancing the students’ writing competence, i.e., using the collocates function of COCA and the frequency list & analyze texts function of Word and Phrase Concordance. Meanwhile, concordance lines from the corpora could provide students with rich authentic examples to discover language patterns (see Tables 1 and 2 ).

Table 1: Three examples of collocation errors for each corpus training session.

Table 2: The display of concordances.

Each corpus training session was divided into three parts: lead-in activity (to check students’ comprehension and increase their awareness of the usage of collocations); hands-on corpus search (to show students how to consult corpora to check the correctness of collocations and to explore the synonyms of the target words); and extension tasks (to provide students with more opportunities to work individually). Each corpus training session focused on two search functions of COCA: (1) Collocation and (2) Synonyms. Some simple search functions catering to the learners’ current writing competence were designed in the teaching guide as students had no background knowledge of corpora (see Appendix IV for a sample of instructional materials design).

In IELTS writing, students are required to use different expressions to avoid repetition as well as to prevent monotonous-sounding writing. Therefore, synonym learning was chosen in the design of the teaching guides. The collocates function from COCA was introduced to help students examine collocation accuracy. The frequency lists function from Word and Phrase was presented to help learners generate language patterns, and the analyze texts function was an alternate way to search for synonyms. An inductive method was adopted for instruction to activate learners’ schemata to become familiar with corpora along with the searching functions. The three corpus training sessions were delivered by one of the researchers at a computer lab in a large classroom setting.
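The idea behind a collocates search and a concordance display can be illustrated with a minimal sketch. It is not how COCA or Word and Phrase are implemented; the toy sentences, the function names, and the window sizes are all assumptions made for illustration only.

```python
from collections import Counter
import re

# Toy corpus: invented sentences, NOT drawn from COCA or Word and Phrase.
TOY_CORPUS = [
    "Many students apply for a scholarship before they apply to a university.",
    "She decided to apply for the job after graduation.",
    "These rules apply to all students in the school.",
    "You can apply for a visa online.",
]

def tokenize(text):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def right_collocates(sentences, node, span=1):
    """Count the words found within `span` tokens to the right of `node`."""
    counts = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens):
            if token == node:
                counts.update(tokens[i + 1 : i + 1 + span])
    return counts

def concordance(sentences, node, width=3):
    """Return keyword-in-context lines: up to `width` tokens on each side of `node`."""
    lines = []
    for sentence in sentences:
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens):
            if token == node:
                left = " ".join(tokens[max(0, i - width) : i])
                right = " ".join(tokens[i + 1 : i + 1 + width])
                lines.append(f"{left} [{node}] {right}")
    return lines

collocates = right_collocates(TOY_CORPUS, "apply")
kwic = concordance(TOY_CORPUS, "apply")

for line in kwic:
    print(line)
print(collocates.most_common())
```

Reading the printed lines side by side is the kind of inductive pattern-finding the training asked of students: the learner compares what follows "apply for" with what follows "apply to" and infers the difference.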

4.4 Data collection and analysis

4.4.1 Assessing the students’ writing

It is generally accepted that students’ progress can be adopted as a measure of the effectiveness of a given teaching innovation. In this case, a pre-writing task (the pre-test) before the corpus training intervention and a post-writing task (the post-test) were conducted to measure learners’ performance and progress using IELTS-oriented prompts. Four marking criteria were used in the assessment of the two tests: “task achievement, coherence and cohesion, lexical resource, and grammatical range and accuracy” (British Council, 2020a). In addition, the raters scored lexical quality separately based on the IELTS rubric with reference to British Council (2020a) (see Appendix I). Students’ writing samples were independently rated by one of the authors and one writing teacher in the school. Disagreements were resolved via discussion. Mean scores and standard deviations were generated.

4.4.2 Questionnaire and interviews

The questionnaire and the interviews were designed to measure learners’ perceptions of and overall attitudes towards corpus usage after the training sessions. Two parts were included in the questionnaire: (1) Background information and an examination of students’ prior knowledge of corpora and (2) An evaluation and investigation of their attitudes towards corpora use. Fifteen items from the evaluation part were designed with a 7-point Likert preference scale, ranging from “strongly agree” (7 points) to “strongly disagree” (1 point) (see Appendix II ). Inspired by Yoon and Hirvela (2004) and Crosthwaite (2017) , the questionnaire data were categorized into four dimensions: (1) Reflection and perception of corpus use after corpus training; (2) Evaluation of a corpus; (3) Learners’ autonomy in choosing a corpus to consult when writing; and (4) Attitudes towards corpus use. Semi-structured interviews ( Appendix III ) were conducted to gain insight into students’ attitudes towards corpora; most questions were taken from Chang (2014) .
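As a rough sketch of how per-dimension means can be derived from 7-point Likert responses, the snippet below averages ratings across respondents and items. The assignment of the 15 items to the four dimensions and the two respondents are hypothetical; the actual mapping is defined by the questionnaire in Appendix II.

```python
from statistics import mean

# Hypothetical item-to-dimension mapping (assumption, not taken from Appendix II).
DIMENSIONS = {
    "reflection_and_perception": [1, 2, 3, 4],
    "evaluation_of_corpus": [5, 6, 7, 8],
    "learner_autonomy": [9, 10, 11],
    "attitudes": [12, 13, 14, 15],
}

def dimension_means(responses):
    """Average the 1-7 ratings per dimension.

    `responses` is a list of dicts mapping item number -> rating.
    """
    result = {}
    for name, items in DIMENSIONS.items():
        ratings = [respondent[item] for respondent in responses for item in items]
        result[name] = round(mean(ratings), 2)
    return result

# Two invented respondents, for illustration only.
responses = [
    {item: 6 for item in range(1, 16)},
    {**{item: 7 for item in range(1, 9)}, **{item: 5 for item in range(9, 16)}},
]
scores = dimension_means(responses)
print(scores)
```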

The interview data were analyzed to answer research question 2 regarding the learners’ attitudes and overall evaluation of the corpus training and usage. Eleven of the 22 participants were interviewed individually for 10–15 minutes each. The interview responses were recorded and then transcribed and translated by the authors. Each participant viewed the transcripts and approved the contents. The interview data were then coded independently by two of the researchers, and an inter-rater reliability of 85% was reached. The two sets of codes were then reduced to one set via discussion between the two raters.
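The paper does not state how the inter-rater figure was calculated. One common option is simple percent agreement, sketched below under that assumption; the code labels and the 20 coded segments are invented for illustration.

```python
def percent_agreement(codes_a, codes_b):
    """Share of segments (in percent) to which two coders assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must code the same number of segments")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * matches / len(codes_a)

# Invented codes for 20 interview segments (hypothetical data).
coder_a = ["benefit"] * 12 + ["difficulty"] * 8
coder_b = ["benefit"] * 10 + ["difficulty"] * 9 + ["benefit"]

agreement = percent_agreement(coder_a, coder_b)
print(f"{agreement:.0f}% agreement")
```

Percent agreement does not correct for chance agreement; a chance-corrected statistic such as Cohen's kappa would be an alternative.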

5 Results

The results are presented in two main parts based on the two research questions: (1) The effectiveness of corpus training and its usage as measured by frequency of errors and quality of writing and (2) The students’ overall attitudes towards corpora. The effectiveness of corpus training was divided into two categories: (1) Collocation error frequency and (2) Lexical quality scores.

5.1 Collocation error frequency

The effectiveness of corpus training and its usage was measured based on two aspects: (1) The frequency of collocation errors in the two writing tests and (2) The lexical quality scores in the two tests. The results are displayed in Tables 3–6. As the study took place in the evenings during students’ self-study period and all students participated on a voluntary basis, some students missed either the pre-test or the post-test for various reasons. Seven students took both the pre-test and the post-test, so the analysis focuses on the scripts of these seven students.

Table 3: Frequency of collocation errors in the pre-test and post-test (n = 7). Note: Only the data of students with a full data set were used; due to practical constraints, some students’ data were missing.

Table 4: T-test of error frequency in collocation. Note: p < 0.05 means significant.

Table 5: Means and standard deviation of writing scores (n = 7).

Table 6: T-test of writing scores.

The mean error frequency of the seven students in the pre-test and post-test indicates that learners made a minor improvement of 0.72 in the post-test. However, the t-test (t = 1.94; p = 0.09 > 0.05) showed no statistically significant difference in error frequency between the pre- and post-writing tasks, possibly due to the relatively small sample size.
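The paper does not name the t-test variant. Because the same seven students wrote both tests, a paired-samples t-test is a natural reading, and the sketch below computes one under that assumption. The error counts are invented; they are not the study's data.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(pre, post):
    """Return the paired-samples t statistic and its degrees of freedom."""
    if len(pre) != len(post):
        raise ValueError("Each student needs both a pre-test and a post-test value")
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    # t = mean difference divided by its standard error
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Invented collocation error counts for seven students (hypothetical data).
pre_errors = [5, 4, 6, 3, 5, 4, 6]
post_errors = [4, 4, 5, 3, 3, 4, 5]

t_stat, df = paired_t(pre_errors, post_errors)

# Two-tailed critical value of t for df = 6 at alpha = 0.05.
CRITICAL_T_DF6 = 2.447
print(f"t({df}) = {t_stat:.2f}, significant at 0.05: {abs(t_stat) > CRITICAL_T_DF6}")
```

With only seven pairs, the statistic must exceed 2.447 to reach significance at the 0.05 level, which is consistent with the reported t = 1.94 falling short.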

5.2 Lexical quality and writing scores

The mean writing score of the post-test (7.5) (Table 5) indicates that the students’ quality of writing increased by an average of 0.64 over the pre-test. In addition, the t-test on the writing scores (t = 1.94; p = 0.00 < 0.05) suggested a statistically significant difference between the pre- and post-writing tasks, which showed that learners made slight improvements in writing scores after three corpus training sessions.

5.3 Questionnaire

The questionnaire was divided into two categories, (1) The background information of students’ prior knowledge in writing and (2) The evaluation of their attitudes towards the corpora. Students indicated in the survey that the e-dictionary was the most popular tool to consult when encountering any vocabulary problems in writing before receiving any corpus training. However, after receiving the corpus instruction, most participants expressed that they would seek help from the corpora to solve any problems they encountered.

The latter part of the questionnaire was divided into four dimensions (see Table 7). The results indicate that learners’ overall attitude towards corpus training was positive, with a mean score of 5.87 out of 7. Their perceptions and reflections about corpus use after corpus training were positive, with a mean of 5.88. This result indicates that students believed that the three corpus training sessions were useful for learning new words and collocations when writing academic essays. Students rated their evaluation of the corpus highly (mean = 5.93), and their attitude towards the corpus received the highest score (mean = 5.94). They expressed a strong willingness to continue using corpora, such as COCA and Word and Phrase, in their future English learning and academic writing. However, learner autonomy (mean = 5.67) received a relatively low score. This is understandable since full learner autonomy in corpus use is likely to develop only after sustained usage.

Table 7: Results of the questionnaire.

5.4 Interview

Coding of the interviews generated two main codes – benefits and difficulties of corpus use. These are elaborated here with excerpts from the students.

5.4.1 Benefits of corpus use

5.4.1.1 Corpus as a beneficial reference tool in writing

The search functions of the corpora (COCA and Word and Phrase Concordance) motivated the students to actively incorporate their search results into their academic writing, especially the “collocates” and “analyze texts” functions, which they could consult to find appropriate synonyms to replace the target words. Most students claimed that the corpora facilitated and guided the development of their writing abilities. They were more willing to use corpora to find suitable synonyms and to examine how to use words and collocations appropriately. Prior to the corpus training, learners tended to translate directly from Chinese when writing English essays, as noted in the literature (Zheng & Chang, 2014). For example:

The size of my vocabulary is small. If I encounter any questions in choosing which word to use in IELTS writing, I will think in Chinese first, and then I will translate it into English. (A)
I would look up the Chinese meaning in an electronic dictionary first and then choose a word that I had not used before in my writing so that my essay would be more academic. (B)

After being introduced to the concepts of corpora, many students showed interest in using corpora to help them overcome difficulties in collocating words. Students indicated that they appreciated the teaching guides from the corpus training sessions, which were practical and closely matched the needs of the IELTS writing topics for which they were preparing. Moreover, the use of collocation errors drawn from students’ previous writing motivated them to find ways to solve their own problems. During the training aimed at collocations and synonyms, students enjoyed being given more opportunities to work individually, using corpora to identify appropriate words to collocate with target words. Some students looked in their notebooks and used corpora to check whether the phrases they had previously written were correct.

IELTS candidates are also required to use a variety of words and expressions in the writing section of the test according to the marking criteria; this requirement motivated students to turn to corpus searches to reduce monotony. Students C and E mentioned:

I have benefited from Word and Phrase Concordance in finding synonyms. Meanwhile, I also used both Word and Phrase and COCA to find the words which can be juxtaposed with these synonyms. Furthermore, the collocation and synonym learning through Word and Phrase and COCA provided me a new learning opportunity for using different expressions to avoid repeating the same word and enhanced language accuracy, which improved the quality of my writing. (C)
Writing is my weak point. I always use one word to express my opinion in my writing, so corpora offered me a chance to look up the collocation and synonym usage which helped me improve my writing ability. (E)

5.4.1.2 Advantages of using concordance lines

Two advantages of using concordance lines were (1) the rich authentic examples provided for generating language patterns and (2) the opportunity to practice summarizing the usages of collocations. Specifically, summarizing requires students to generate different language patterns for two similar collocations, such as “apply to” and “apply for.” Student B reported that summarizing required him to think critically, while student D stated that concordance lines are more efficient than dictionaries in terms of self-discovery:

Summarizing requires my deep knowledge of vocabulary. For example, when we had the training to find out the difference between “apply for” and “apply to” through the concordance lines, I was not able to figure it out without the help from the teacher. After I tried two to three times, I found out that the concordance lines provided me examples to analyze what kinds of words can be added after the target word. (B)
Concordance lines require me to self-discover the language patterns while dictionaries present all the patterns for learners. In this case, learners will just copy the phrases without knowing the usage. However, I can find the language patterns through concordance lines and use them in my writing which can make me feel successful. Thus, I think I can improve with the help of corpora. (D)

5.4.2 Difficulties students encountered in using corpus tools

5.4.2.1 Practical problems of corpus use in classrooms with short training sessions

The interview responses also indicated that it was difficult for learners to search for information through the corpora due to limited training. During the training session on verb + preposition collocations, the sheer abundance of concordance lines appearing in their second language overwhelmed the students, preventing them from promptly deciphering language patterns. Having only three 30-minute training sessions was also insufficient for learners to become fully familiar with the search functions. The excerpts below illustrate this:

It takes me a long time to search for one word because I am not familiar with the searching functions. (E)
We had never heard of corpora before, so it was hard for us to remember all the searching steps in such a short time. (C)
When I use these two corpora, it is easy for me to mix up the order of searching for words. For example, I always forget when I should change the order of the words to either the left side or the right side. Thus, I think it will be better if I can receive more corpus-training. (F)

5.4.2.2 Constraints of electronic device usage in classrooms

Due to school policy, students’ use of electronic devices during class was restricted, which reduced their opportunities to become familiar with corpora. Although some students were able to use their mobile phones to look up collocations, the small screen made it difficult for them to continue using corpora to search for words. It was also found that some students did not have access to corpora for their post-writing task. Therefore, the only reference tool they could use was a dictionary.

I did not use corpora because I cannot use laptops at school. I used my mobile phone to search for words, but the words and screens are too small. We are studying for IELTS, and it required handwriting. Thus, we are not allowed to use computers to write essays at school. (H)
I really appreciated the training sessions because I learned something new that can be applied to my academic writing. However, the sessions were too short, and our writing teachers never mentioned corpora before. So, I do not think I will use corpora at this moment, but I will use it later when I enter university. At that time, I will have my own laptop. Moreover, we are learning IELTS, which requires us to write essays in a written form. If I am learning TOEFL (Test of English as a Foreign Language), then I think it is definitely a good resource, and our teacher will allow us to use computers in school because TOEFL requires candidates to use the computer to write the essay. (G)

6 Discussion and implications

6.1 Benefits of using corpus tools to facilitate writing

The higher scores on IELTS writing task 2 show that the students improved their lexical use and tended to reduce errors in their writing after corpus-based training. According to the questionnaire, the students had barely heard of the concept of collocations and had no prior knowledge of corpora in English vocabulary learning. In this sense, the study achieved two aims, namely, to familiarize students with collocations and provide them with tools to improve their writing. Similar to existing studies on DDL ( Crosthwaite, 2017 ; Li, 2017 ; Saeedakhtar et al., 2020 ; Vyatkina, 2017 ; Wu, 2016 ; Yoon & Hirvela, 2004 ), students made noticeable improvement on collocations and vocabulary use. We therefore conclude that Chinese secondary students can benefit from DDL in their language learning, especially their collocation competence.

This finding contributes to the application of DDL for international English learners. According to Futagi et al. (2008), collocation errors are not only common among Chinese IELTS examinees, but also frequently appear in academic writing by other L2 learners, including college students in Saudi Arabia and international students in American universities. Like Futagi et al. (2008), our study highlights the need for learners to focus on collocation strings and distinguish synonyms based on their collocations. Smirnova (2017) also found that the experience of using corpora proved effective for EFL learners in a Russian university. She claimed the use of corpora improved students’ understanding of usage patterns, helped them correct collocation errors autonomously, and more importantly, increased their scores in the IELTS writing section. In addition, students in the current study expressed their willingness to use corpora when practicing for other tests, such as TOEFL and CET (College English Test). Therefore, it is suggested that non-native English learners refer to corpora to improve their writing competence and increase their scores in English tests.

The inductive thinking involved when searching linguistic items and looking for language use patterns in corpora helps language learners take charge of their own learning beyond the examples provided in textbooks ( Crosthwaite, 2019 ). In the current study, the secondary students also showed the ability to correct errors on their own. Searching and examining language use in corpora provided them with the opportunity to correct writing errors. Crosthwaite (2019) suggests that the more pre-tertiary students interact with corpora data, the more they develop problem solving skills and autonomous learning, which are important learning strategies required after graduation. However, corpus-using strategies take time to develop. Charles (2014) , for example, showed that 70% of students practiced for 12 months before getting accustomed to using corpora. Likewise, Yoon and Hirvela (2004) found learners needed a long time to develop their abilities in editing lexico-grammatical errors in the corpora. This explains why some students from the present study claimed that the limited number of training sessions prevented them from being fully familiar with various search functions embedded in the corpus websites. Early studies, Thurstun and Candlin (1998) and Sun (2000) , revealed that students may not be confident in reading concordance lines in English. This concern was also evident in our study. Thus, it appears that if corpora training were to become a part of the English curriculum, students would require prolonged training to become more skillful and confident in using search functions and benefit more from corpus use.

6.2 Attitudes and overall evaluations of corpora

6.2.1 Student attitudes and overall evaluations of corpora

The results reveal that students held positive attitudes towards corpora use in academic writing. Even though they were not able to fully familiarize themselves with the search functions, they regarded corpora as beneficial alternative tools to better understand language patterns as well as to enhance their confidence and ability in writing. This finding is consistent with Yoon and Hirvela (2004) , who stated that learners were positive towards corpora as they regarded them as helpful reference tools not only for obtaining word usage but also for developing writing skills. Chang (2014) also showed that learners appreciated the benefits corpora provide, even though they had not yet developed a deep knowledge of corpus use.

There appear to be several reasons for the students’ positive attitudes towards corpora usage, as indicated by responses to the questionnaire. In addition to having their attention drawn to lexical mistakes by the researcher during the training sessions, some students used corpora after the training to check whether the phrases previously written in their notebooks were appropriate. The questionnaire responses implied that dictionaries were commonly used by students before the corpus training sessions; however, after consulting corpora, they found that words from dictionaries were not always appropriate for certain contexts. As in Chan and Liou (2005), students were able to transfer collocation patterns from examples they learned during the lesson to new sentences they wrote outside of class. In addition, after training, many learners asked the researcher to teach them more search functions, such as “all forms” and “parts of speech” from the Word and Phrase Concordance (see Figure 1).

Figure 1: Search functions of Word and Phrase that students were interested in learning.

Students showed their appreciation for learning about collocations through corpora even though the training sessions were limited. Some students stated that using various expressions can enhance one’s ability to use lexical resources. Most of the students were fully engaged in learning about collocations during the three training sessions, especially the hands-on activities, which encouraged them to further practice, leading to a highly positive attitude towards using corpora.

Nevertheless, some negative aspects were evident. Because self-discovery plays an important role in corpora usage, particularly for the pattern “verb + preposition,” it was found that some students were unable to develop their own language patterns through concordance lines without any hints from their teacher even after discussions with their groupmates. Also, some students had a difficult time locating information similar to what they wanted when faced with such an abundance of concordance lines, which caused them to give up. These reasons probably explain why learners’ autonomy was rated the lowest in the survey among all dimensions.

6.2.2 Technology issues related to corpora use

One salient factor that caused learners to have a negative evaluation of corpus usage was the unstable speed of the Internet. Due to the school policy of banning computers or electronic devices, the training sessions were conducted in the school’s computer lab, where the speed of the Internet was relatively slow, thus reducing the students’ practice time. A few of the students explained that the slow speed of the Internet was one possible factor inhibiting the effectiveness of the training. It prevented students from having more opportunities to practice search functions. In the early days of corpora usage, Sun (2000) also noted that network speed and stability are two major problems. This challenge thus remains.

Although students were generally positive about corpora usage, they had not yet gained the habit of using them. According to Charles (2014), developing a habit of using corpora means that students can continue to use them after the training is completed. However, because the students could access computers only during the training sessions or at home on the weekend, the opportunities for students to consult corpora and develop a behavioral pattern of using them were limited.

One solution, noted by students, is to create a record of the collocations they searched for in the corpora. Some students made a record of the suggested answers from their corpus search to replace their errors during the training, which they could later refer to for further study or review. Some of them also copied some concordance lines so that they could learn the difference between two phrases in their free time. This notetaking was especially important in the present context of their English learning because students were restricted from using electronic devices in the classroom. Further, teachers need to urge their schools to provide facilities for students while encouraging students to be proactive in their learning by keeping records of the new collocations they have learned.

7 Conclusion

The study explored the effectiveness of corpus training sessions in IELTS academic writing among secondary school students by assessing their ability to learn from a corpus-based intervention; afterwards, their attitudes towards the corpora were gathered. The results of this research shed positive light on the efficacy of corpus training for improving senior secondary school students’ lexical usage. Specifically, the findings revealed that the students used collocations more appropriately in their writing after the intervention and had mostly positive views about using corpora. Moreover, the questionnaire responses suggest that students had ample opportunities to use new vocabulary or phrases in their weekly writing practice. Students also received adequate support from their writing teachers for further improvements in areas such as grammatical accuracy and coherence. However, corpus tools are relatively new to Chinese teachers and students, and not all writing teachers are able to point out collocation errors in students’ texts or even know about corpora, indicating a need for bringing corpora usage into language teacher training.

The findings also revealed some limitations and difficulties that the students encountered in using corpus tools. One key element that emerged from this study is that it is difficult for Chinese secondary school students to have access to electronic devices to use corpora. Students could only use computers during the training sessions and on weekends. Therefore, we recommend that schools give students access to computers after class so that they can consult corpora and create records of concordance lines for certain language items to analyze in their free time. Although only three corpus training sessions were conducted, the findings revealed the positive effects of DDL for pre-tertiary students. Since learning corpus search functions and the analysis of concordance lines can be a very complicated process (Heather & Helt, 2012; Naismith, 2017), it is suggested that instructors provide adequate corpus training for students so that they can be more autonomous and confident in corpora use over the course of their long-term language learning.

In conclusion, the study showed that in general, secondary L2 learners can improve their writing skill and are positive towards corpora usage. However, due to practical constraints, only seven participants’ full data sets were used in the quantitative scoring part of the study, which limits its generalizability; thus, the results are indicative only. The qualitative parts of the study gathered positive views from a larger group of students indicating good potential. Although the learners’ writing scores improved, collocations and synonyms learned through corpora are still only one instructional element among a myriad of elements that go into L2 writing.

There are a few limitations in the study. First, this is a small-scale study that only examined limited data on student consultation of corpora as reference tools for academic writing. Second, due to some practical constraints caused by the school policy, some students did not make full use of corpora, which may have undermined the results of the study. Even so, our data are indicative of the good potential of corpus use in enhancing secondary school students’ collocation learning in academic writing. Future studies of corpus training for language learners could recruit a larger number of participants and include an expanded list of the types of collocation errors L2 students make.

Research funding: The article was funded by the CRAC project (Ref: 03AAB) by the Education University of Hong Kong and it was also supported by Project of Discipline Innovation and Advancement (PODIA)-Foreign Language Education Studies at Beijing Foreign Studies University (Ref: 2020SYLZDXM011).

Appendix I: Criteria for lexical quality dimension (adapted and edited from British Council, 2020a)

Appendix II: Questionnaire about using COCA & Word and Phrase Concordance in EAP writing (adapted from Yoon & Hirvela, 2004)

Background information

Gender: Male ____ Female _____

Country where you were born: _____________

First language: _____________

In general, do you like to try new things? Yes ________ No _______

One time a week

Two times a week

More than three times a week

What resources do you use for English writing? (e.g., dictionary) ________________________

Had you heard about corpus before the corpus training? Yes ____ (see Q. 8) No (see Q. 10) _____

Had you used corpus before the corpus training? Yes ______ (see Q. 9) No (see Q. 10) ______

Which corpus did you use before? ________________________

Which corpus did you use for your second writing? COCA _____ Word and Phrase Concordance______ or please provide the name of the corpus you used _________

Evaluations of using COCA & Word and Phrase Concordance

Strongly disagree

Somewhat disagree

Somewhat agree

Strongly agree

Appendix III: Semi-structured interview about using COCA in EAP writing (Chang, 2014)

Which corpus search function(s) do you think is(are) useful?

Why do you think COCA or Word and Phrase Concordance is or isn’t helpful?

Did you encounter any difficulty in using COCA & Word and Phrase Concordance? If so, how did you overcome it?

After you learned about COCA & Word and Phrase Concordance, do you think it is useful when compared to a dictionary? Why?

In what ways does COCA & Word and Phrase Concordance benefit your academic writing? What difficulties did COCA & Word and Phrase Concordance help you resolve?

How long does it take for you to search one collocation in COCA & Word and Phrase Concordance?

How frequently did you use COCA or Word and Phrase Concordance when you were doing the second writing task?

Do you still want to continue to use COCA or Word and Phrase Concordance after the three training sessions? Why or Why not?

How important is it for teachers to introduce corpus in class? Why?

Appendix IV: Teaching guides for the corpus training session (verb + noun)

Step 1: Matching exercises

Step 2: Hands-on corpus search

– Show students how to search for the words “meet” and “encounter” using the collocates function of COCA.

– Look at the frequency page to see whether “questions” can follow “meet” and “encounter”.

– Show students how to search for the word “question” using the collocates function of COCA to see which verbs commonly collocate with “questions”.

Step 3: Extension task

– First, students search for the words “tell” and “express” using the collocates function of COCA.

– Second, they check the frequency page for these two words to see whether “opinions” collocates with them.

– Third, they search for the word “opinion” using the collocates function of COCA to see which verbs commonly collocate with “opinion”.

Step 4: Synonym learning

– Show students how to find synonyms of “question” that collocate with “encounter” and synonyms of “encounter” that collocate with “question”.

Step 5: Extension task (synonyms)

– Students individually search for synonyms of “opinion” that collocate with “express” and synonyms of “express” that collocate with “opinion”.

Step 6: Homework

Although modern technology can provide lots of information, it will also bring some disadvantages.

I said some unpleasant comments about her work. (Say comments)

Did you have some exercise today? (Have exercise)
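The collocate lookups described in Steps 2 to 5 can be illustrated with a minimal sketch. It assumes a handful of invented sentences in place of COCA, which the training sessions queried through its own search interface, and it only counts candidate verbs that occur within a few words to the left of a node noun. The names `SAMPLE_CORPUS`, `tokenize`, and `left_collocates` are hypothetical and are not part of any corpus tool.

```python
import re
from collections import Counter

# Invented sentences standing in for corpus data (an assumption; not taken from COCA).
SAMPLE_CORPUS = [
    "Students often encounter problems when they write essays.",
    "We encounter difficult questions in every exam.",
    "The speaker answered questions from the audience.",
    "She raised questions about the policy.",
    "Many people express opinions online.",
    "He was asked to express his opinion clearly.",
    "They answered the questions quickly.",
]


def tokenize(text):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def left_collocates(sentences, node_forms, candidates, span=3):
    """Count candidate words found up to `span` tokens before a node word."""
    counts = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens):
            if token in node_forms:
                window = tokens[max(0, i - span):i]
                counts.update(w for w in window if w in candidates)
    return counts


# Candidate verbs mirror the teaching guide; inflected forms are listed by hand
# because this sketch does no lemmatization or part-of-speech tagging.
verbs = {"meet", "encounter", "answered", "raised", "tell", "express"}

question_counts = left_collocates(SAMPLE_CORPUS, {"question", "questions"}, verbs)
opinion_counts = left_collocates(SAMPLE_CORPUS, {"opinion", "opinions"}, verbs)

print("Verbs before 'question(s)':", dict(question_counts))
print("Verbs before 'opinion(s)':", dict(opinion_counts))
```

A verb that never appears near the node word simply has a count of zero, which is the toy equivalent of finding no entry for “meet” + “questions” on a frequency page. A real corpus interface additionally offers lemma and part-of-speech filters, which this sketch leaves out.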

An, Y. 安亚伦 (2013). [Study of repetition in Chinese IELTS candidates’ English written texts with an angle of systemic-functional linguistics] 功能语言学视角下的中国雅思考生写作文本中的重复手法研究. Henan: Henan University.

Boulton, A., & Cobb, T. (2017). Corpus use in language learning: A meta‐analysis. Language Learning, 67(2), 348–393. https://doi.org/10.1111/lang.12224

British Council. (2020a). IELTS TASK 2 Writing band descriptors [public version]. Retrieved from https://takeielts.britishcouncil.org/sites/default/files/ielts_task_2_writing_band_descriptors.pdf

British Council. (2020b). Test taker performance 2019. Retrieved from https://www.ielts.org/research/test-taker-performance

Cai, L. J. (2013). Students’ perceptions of academic writing: A need analysis of EAP in China. Language Education in Asia, 4(1), 5–22. https://doi.org/10.5746/leia/13/v4/i1/a2/cai

Chan, T. P., & Liou, H. C. (2005). Effects of web-based concordancing instruction on EFL students’ learning of verb-noun collocations. Computer Assisted Language Learning, 18(3), 231–251. https://doi.org/10.1080/09588220500185769

Chang, J. Y. (2014). The use of general and specialized corpora as reference sources for academic English writing: A case study. ReCALL, 26(2), 243–259. https://doi.org/10.1017/s0958344014000056

Charles, M. (2007). Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions. Journal of English for Academic Purposes, 6(4), 289–302. https://doi.org/10.1016/j.jeap.2007.09.009

Charles, M. (2012). ‘Proper vocabulary and juicy collocations’: EAP students evaluate do-it-yourself corpus-building. English for Specific Purposes, 31(2), 93–102. https://doi.org/10.1016/j.esp.2011.12.003

Charles, M. (2014). Getting the corpus habit: EAP students’ long-term use of personal corpora. English for Specific Purposes, 35, 30–40. https://doi.org/10.1016/j.esp.2013.11.004

Clark, T., & Yu, G. (2020). Beyond the IELTS test: Chinese and Japanese postgraduate UK experiences. International Journal of Bilingual Education and Bilingualism, 1–19. https://doi.org/10.1080/13670050.2020.1829538

Conzett, J. (2000). Integrating collocation into a reading and writing course. In M. Lewis (Ed.), Teaching collocation: Further developments in the lexical approach (pp. 70–87). Hove: Language Teaching Publications.

Crosthwaite, P. (2017). Retesting the limits of data-driven learning: Feedback and error correction. Computer Assisted Language Learning, 30(6), 447–473. https://doi.org/10.1080/09588221.2017.1312462

Crosthwaite, P. (2019). Data-driven learning and younger learners: Introduction to the volume. In P. Crosthwaite (Ed.), Data-driven learning for the next generation: Corpora and DDL for pre-tertiary learners (pp. 1–10). London: Routledge. https://doi.org/10.4324/9780429425899-1

Duan, Y. H., & Yang, X. Y. (2016). Difficulties of Chinese students with their academic English: Evidence from a China-United States university program. SFERC, 2016, 45–51.

Futagi, Y., Deane, P., Chodorow, M., & Tetreault, J. (2008). A computational approach to detecting collocation errors in the writing of non-native speakers of English. Computer Assisted Language Learning, 21(4), 353–367. https://doi.org/10.1080/09588220802343561

Gilquin, G., & Granger, S. (2010). How can data-driven learning be used in language teaching. The Routledge handbook of corpus linguistics, 359–370. https://doi.org/10.4324/9780203856949-26

Hao, Y. (2016). A contrastive study of lexical chunks in high- and low-score IELTS compositions [Master’s thesis]. Shanghai International Studies University, Shanghai.

Heather, J., & Helt, M. (2012). Evaluating corpus literacy training for pre-service language teachers: Six case studies. Journal of Technology and Teacher Education, 20(4), 415–440.

Hou, J. (2014). Analyzing collocation errors in EFL Chinese learners’ writings based on corpus. Higher Education of Social Science, 7(1), 87–91.

Johns, T. (1991). From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning. CALL Austria, 10, 14–34. https://doi.org/10.1111/j.1467-971x.1991.tb00171.x

Johns, T. (1997). Contexts: The background, development and trialling of a concordance-based CALL program. In A. Wichmann, S. Fligelstone, T. McEnery, & G. Knowles (Eds.), Teaching and language corpora (pp. 100–115). Harlow: Addison Wesley Longman. https://doi.org/10.4324/9781315842677-9

Kennedy, C., & Miceli, T. (2010). Corpus-assisted creative writing: Introducing intermediate Italian learners to a corpus as a reference resource. Language, Learning and Technology, 14(1), 28–44.

Lee, D., & Swales, J. (2006). A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self-compiled corpora. English for Specific Purposes, 25(1), 56–75. https://doi.org/10.1016/j.esp.2005.02.010

Li, S. (2017). Using corpora to develop learners’ collocational competence. Language, Learning and Technology, 21(3), 153–171.

Li, J., & Schmitt, N. (2009). The acquisition of lexical phrases in academic writing: A longitudinal case study. Journal of Second Language Writing, 18(2), 85–102. https://doi.org/10.1016/j.jslw.2009.02.001

Mayor, B. M. (2006). Dialogic and hortatory features in the writing of Chinese candidates for the IELTS test. Language Culture and Curriculum, 19(1), 104–121. https://doi.org/10.1080/07908310608668757

Men, H. (2015). Vocabulary increase and collocation learning: A corpus-based cross-sectional study of Chinese EFL learners [Doctoral dissertation]. Birmingham City University, Birmingham.

Naismith, B. (2017). Integrating corpus tools on intensive CELTA courses. ELT Journal, 71(3), 273–283. https://doi.org/10.1093/elt/ccw076

Nation, I. S. (2001). Learning vocabulary in another language. Cambridge, UK: Cambridge University Press. https://doi.org/10.1017/CBO9781139524759

Nation, I. S., & Webb, S. A. (2011). Researching and analyzing vocabulary. Boston, MA: Heinle, Cengage Learning.

Pan, Q., & Xu, R. J. (2011). Vocabulary teaching in English language teaching. Theory and Practice in Language Studies, 1(11), 1586–1589. https://doi.org/10.4304/tpls.1.11.1586-1589

Saeedakhtar, A., Bagerin, M., & Abdi, R. (2020). The effect of hands-on and hands-off data-driven learning on low-intermediate learners’ verb-preposition collocations. System, 91, 102268. https://doi.org/10.1016/j.system.2020.102268

Sang, Y. (2017). Investigate the “issues” in Chinese students’ English writing and their “reasons”: Revisiting the recent evidence in Chinese academia. International Journal of Higher Education, 6(3), 1–11. https://doi.org/10.5430/ijhe.v6n3p1

Sinclair, J. (1991). Corpus concordance collocation. Oxford: Oxford University Press.

Smirnova, E. A. (2017). Using corpora in EFL classrooms: The case study of IELTS preparation. RELC Journal, 48(3), 302–310. https://doi.org/10.1177/0033688216684280

Song, M., & Chen, L. P. (2017). A review on English vocabulary acquisition and teaching research in recent 30 years in China. Science Journal of Education, 5(4), 174–180. https://doi.org/10.11648/j.sjedu.20170504.18

Sun, Y. C. (2000). Using on-line corpus to facilitate language learning. Paper presented at the Annual Meeting of the Teachers of English to Speakers of Other Languages. British Columbia: Canada.

Sun, Y. C., & Wang, L. Y. (2003). Concordancers in the EFL classroom: Cognitive approaches and collocation difficulty. Computer Assisted Language Learning, 16(1), 83–94. https://doi.org/10.1076/call.16.1.83.15528

Thurstun, J., & Candlin, C. (1998). Concordancing and the teaching of the vocabulary of academic English. English for Specific Purposes, 17(3), 267–280. https://doi.org/10.1016/S0889-4906(97)00013-6

Todd, R. W. (2001). Induction from self-selected concordances and self-correction. System, 29(1), 91–102. https://doi.org/10.1016/s0346-251x(00)00047-6

Vyatkina, N. (2017). Data-driven learning of collocations: Learner performance, proficiency, and perceptions. Language Learning & Technology, 20(3), 159–179.

Webb, S., Newton, J., & Chang, A. (2013). Incidental learning of collocation. Language Learning, 63(1), 91–120. https://doi.org/10.1111/j.1467-9922.2012.00729.x

Wen, Q. (2018). The production-oriented approach to teaching university students English in China. Language Teaching, 51(4), 526–540. https://doi.org/10.1017/s026144481600001x

Wu, Y. J. A. (2016). The effects of utilizing corpus resources to correct collocation errors in L2 writing – Students’ performance, corpus use, and perceptions. In S. Papadima-Sophocleous, L. Bradley, & S. Thouesny (Eds.), CALL communities and culture – Short papers from EUROCALL 2016 (pp. 479–484). Dublin: Research-publishing.net. https://doi.org/10.14705/rpnet.2016.eurocall2016.610

Yao, S. (2014). An analysis of Chinese students’ performance in IELTS academic writing. The New English Teacher, 8(2), 104–138.

Yoon, H., & Hirvela, A. (2004). ESL student attitudes toward corpus use in L2 writing. Journal of Second Language Writing, 13(4), 257–283. https://doi.org/10.1016/j.jslw.2004.06.002

Zhai, L. (2016). A study on Chinese EFL learners’ vocabulary usage in writing. Journal of Language Teaching and Research, 7(4), 752–759. https://doi.org/10.17507/jltr.0704.16

Zhang, B. (2011). A study of the vocabulary learning strategies used by Chinese students [Master’s thesis]. Kristianstad University, Kristianstad.

Zhang, Z. (2018). Academic writing difficulty of Chinese students: The cultural issue behind Chinese and British academic writing styles. Studies in Literature and Language, 17(2), 118–124. https://doi.org/10.3968/10570

Zhang, W., & Luo, L. (2004). Some thoughts on the development and current situation of college English teaching in China. Foreign Language World, 3, 2–7.

Zheng, C., & Chang, C. E. (2014). Shi chuandi yiyi haishi xuanyao xingshi: Waiyu xiezuo gongcheng zhong de xing-yi chanjie xianxiang [Communication or show-off: A study of the confusion in meaning-structure relationship in undergraduate English writing]. Modern Foreign Languages, 37(4), 513–524.

Zhou, S. (2010). Comparing receptive and productive academic vocabulary knowledge of Chinese EFL learners. Asian Social Science, 6(10), 14. https://doi.org/10.5539/ass.v6n10p14

Zou, Q. (2019). A corpus-based study of verb-noun collocation errors in Chinese non-English majors’ writings. In 4th international conference on contemporary education, social sciences and humanities (ICCESSH 2019). Paris: Atlantis Press. https://doi.org/10.2991/iccessh-19.2019.188

© 2021 Liuqin Fang et al., published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.

Journal of China Computer-Assisted Language Learning

thesis collocation verbs

COMMENTS

  1. thesis collocations

    Words often used with thesis in an English sentence: basic thesis, central thesis, doctoral thesis, general thesis, graduate thesis, main…

  2. Academic Collocation List

    The Academic Collocation List (ACL) is a list containing 2,469 of the most frequent and useful collocations which occur in written academic English. It can be seen as a collocational companion to the Academic Word List (AWL), consisting of collocations (or word combinations) rather than single words. The ACL was developed by Kirsten Ackermann and Yu-Hua Chen using the Pearson International ...

  3. thesis

    THESIS + NOUN research, title, topic . PREP. by ~ She completed an MSc by thesis. | in a/the ~ research presented in a thesis | ~ about/on He's doing a doctoral thesis on the early works of Shostakovich. 2 statement of an idea/a theory . ADJ. basic, central, main . VERB + THESIS prove, support The results of the experiment support his central ...

  4. thesis

    the deadline for the thesis. [present, offer, proffer] a thesis (about) in support of your thesis. support your thesis with. the thesis is supported by. your [essay, writing] needs a clear thesis. the main thesis of the [paper, report, book] a [strong, concise] thesis. n as adj.

  5. PDF Master English Collocations & Phrasal Verbs: The Ultimate Phrasal Verbs

    English Collocations & Phrasal verbs: The Ultimate Phrasal Verbs and Collocations Book for Learning English " from the English Vocabulary & Grammar Series. This book is packed full of collocations and phrasal verb ... Example I handed in my thesis. Meaning: to submit Phrasal Verb join in Example She joined in the conversation at the party ...

  6. PDF Collocation Networks of Selected Words in Academic Writing: A Corpus

    As a matter of fact, there are two types of collocations: lexical and grammatical collocations. The former type usually contains two lexical elements ( noun+ adjective, verb+ adjective). While the latter, basically is formed by combining a verb and preposition (e.g. depend on) or adjective with a preposition such as (good at, ready for, bored ...

  7. thesis collocation

    VERB + THESIS . prove, support . The results of the experiment support his central thesis. | disprove | advance . He advanced the thesis that too much choice was burdensome to people.

  8. Online OXFORD Collocation Dictionary of English

    A completely new type of dictionary with word collocation that helps students and advanced learners effectively study, write and speak natural-sounding English . This online dictionary is very helpful for the education of the IELTS, TOEFL test. Level: Upper-Intermediate to Advanced. Key features of oxford dictionary online.

  9. PDF English Collocations in Use Advanced

    Collocations are not just a matter of how adjectives combine with nouns. They can refer to any kind of typical word combination, for example verb + noun (e.g. arouse someone's interest, lead a seminar), adverb + adjective (e.g. fundamentally different), adverb + verb (e.g. flatly contradict), noun + noun (e.g. a lick of paint, a team of experts,

  10. About Oxford Collocations Dictionary

    Collocation is the way words combine in a language to produce natural-sounding speech and writing. For example, in English you say strong wind but heavy rain. It would not be normal to say heavy wind or strong rain. And whilst all four of these words would be recognized by a learner at pre-intermediate or even elementary level, it takes a ...

  11. Collocation: Explanation and Examples

    A collocation is a group of words that sound natural when used together. For example: fast train. (Using "fast" with "train" sounds natural to a native speaker. This is an example of a collocation.) quick train (unnatural) (This is not technically wrong, but using "quick" with "train" sounds unnatural, even though the words are perfectly ...

  12. PDF The Use of English Collocations in Written Translation

    19 verb patterns: pattern D: verb + preposition (p. x-xxii) The second group is lexical collocations. However, no prepositions, clauses or infinitives are included, but consist of diverse combinations of nouns, verbs, adjectives and adverbs. Six categories and examples are: verb + noun: Collocations that denote Activation or Creation: set an alarm

  13. Collocation: Theoretical Considerations, Methods and Techniques for

    (3) v + prep (phrasal verbs collocation): rely on, dry up, look after, and put off. (4) adj + prep: dependent on, familiar with, close to and angry with. (5) quantifier + n: a pride of lions, a ...

  14. thesis noun

    Collocations Scientific research Scientific research Theory. formulate/ advance a theory/ hypothesis; build/ construct/ create/ develop a simple/ theoretical/ mathematical model; develop/ establish/ provide/ use a theoretical/ conceptual framework; advance/ argue/ develop the thesis that…; explore an idea/ a concept/ a hypothesis; make a prediction/ an inference

  15. Frontiers

    Secondly, compared with collocations made up of more complicated verbs, collocations consisting of common verbs are more likely to be used by beginner learners, thus allowing researchers to observe the developmental patterns of collocation use from lower levels to advanced ones. ... news, and thesis. It contains a 100 million-word sample of ...

  16. Dissertations / Theses: 'Collocation'

    Consult the top 50 dissertations / theses for your research on the topic 'Collocation.'. Next to every source in the list of references, there is an 'Add to bibliography' button. Press on it, and we will generate automatically the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver ...

  17. PDF Verb-Noun Collocations in L2 Writing in an English-Medium Instruction

    3.1. Graduation Thesis (GT) Corpus. In order to investigate noun-verb collocations, a learner corpus was compiled with the drafts of graduation theses of 132 students, amounting to 282,293 words in total to form the Graduation Thesis (GT) corpus. The quantity and quality were not the same among the student drafts.

  18. (PDF) A Linguistic Study Of The Use Of Collocation In ...

    The present thesis is a linguistic study that aims at: a- identifying the significant patterns of collocation of specific words within data of five novels. b- shedding light on the possible ...

  19. University of Massachusetts Amherst ScholarWorks@UMass Amherst

    SECOND LANGUAGE ACQUISITION OF CHINESE VERB-NOUN COLLOCATIONS . A Thesis Presented . by . YING CAI . Submitted to the Graduate School of the . University of Massachusetts Amherst in partial fulfillment . of the requirements for the degree of . MASTER OF ARTS . September 2017 .

  20. PDF Semantic prosody and collocation: A corpus study of the near ...

    clarify this confusion through a corpus-based investigation of the target synonymous verbs persist and persevere with focus on distribution across genres, collocations, and semantic preference/prosody. The ... Collocation occurs in statistically significant manners (Lewis, 2000). An example of common English collocations is the adjective + noun

  21. The effectiveness of corpus-based training on collocation use in L2

    Corpus tools are known to be effective in helping L2 learners improve their writing, especially regarding their use of words. Most corpus-based L2 writing research has focused on university students while little attention has been paid to secondary school L2 students. This study investigated whether senior secondary school students in China, upon receiving corpus-based training under the ...

  22. PDF Collocations of Light Verb Make in Chinese Master Theses

    a light verb, and the whole collocation is semantically equivalent to the lexical verb "choose." In contrast, make a law is neither equivalent to a lexical verb nor a collocation of the light verb make. In general, the development of theories on light verbs has been a gradual process, with researchers building

  23. Verb + Noun: Verb Collocations Examples in English • 7ESL

    Examples of collocations with have in English. Have a bath. Have a drink. Have a good time. Have a haircut. Have a holiday. Have a problem. Have a relationship. Have a rest.