Rethinking educational assessments: the matrimony of exams and coursework

Standardised tests are firmly cemented in education systems across the globe, but whether they assess students’ ability better than coursework still divides opinion.

Proponents of exam assessments argue that despite being stressful, exams are beneficial for many reasons, such as:

  • They provide motivation to study;
  • Results are a good measure of the student’s own work and understanding (and not anyone else’s); and
  • They are a fair way of assessing students’ knowledge of a topic, since every student answers the same questions under the same conditions.

But the latter may not be entirely true. A Stanford study says question format can impact how boys and girls score on standardised tests. Researchers found that girls perform better on standardised tests that have more open-ended questions, while boys score higher when the tests include more multiple-choice questions.

Meanwhile, The Hechinger Report notes that assessments, when designed properly, can support, not just measure, student learning, building their skills and granting them the feedback they need.

“Assessments create feedback for teachers and students alike, and the high value of feedback – particularly timely feedback – is well-documented by learning scientists. It’s useful to know you’re doing something wrong right after you do it,” it said.

Exams are important for students, but they must be designed properly to ensure they support student learning. Source: Shutterstock

Conversely, critics of exams say the obsession with test scores comes at the expense of learning – students memorise facts, while some syllabi place little emphasis on applying knowledge and do little to develop students’ critical thinking skills.

Meanwhile, teachers have argued that report card grades aren’t the best way to measure a student’s academic achievement, adding that they measure effort more than achievement.

Coursework, on the other hand, assesses a wider range of skills – it can consist of a range of activities such as quizzes, class participation, assignments and presentations. These steady assessments over an academic year arguably give a fairer representation of students’ educational attainment while also catering for different learning styles.

Quizzes can be useful as they keep students on their toes and encourage them to study consistently, while giving educators a yardstick of how well students are faring. Group work, however, can open up a can of worms when lazy students latch on to hard-working peers to pull up their grades, or when work is unevenly distributed among teammates.

Exams and coursework clearly test students’ different ‘muscles’, but do they complement one another in supporting students’ learning outcomes and developing students as a whole?

The shifting tides

Coursework can develop skills such as collaboration and critical thinking among students, which exams cannot. Source: Shutterstock

News reports suggest that some countries are gradually moving away from an exam-oriented education system; these include selected schools in the US and Asian countries.

Last year, Malaysia’s Education Minister, Dr Maszlee Malik, said students from Year One to Three would no longer sit for exams from 2019, enabling the ministry to implement Classroom-Based Assessment (PBD), which focuses on a pupil’s learning development.

Meanwhile, Singapore is cutting down on the number of exams for selected primary and secondary school levels, while Georgia’s school graduation exams will be abolished from 2020. Finland is known for not having standardised tests, with the exception of one exam at the end of secondary school.

Drawing from my experience, I found that a less exam-oriented system greatly benefitted me.

Going through 11 years of the Malaysian national education system made it clear that I did not perform well in an exam-oriented environment. I was often ‘first from the bottom’ in class, which did little to boost my confidence in school.

For university, I set out to select a programme that was less exam-oriented and eventually chose the American Degree Programme (ADP), while many of my schoolmates went with the popular A-Levels before progressing to their degree.

With the ADP, the bulk of student assessments (about 70 percent, depending on your institution) came from assignments, quizzes, class participation, presentations and the like, while the remaining 30 percent was via exams. Under this system, I found myself flourishing for the first time in an academic setting – my grades improved, I was more motivated to attend my classes and learned that I wasn’t as stupid as I was often made out to be during my school days.

This system of continuous assessments worked more in my favour than the stress of sitting for one major exam. In the former, my success or failure in an educational setting was not entirely dependent on how well I could pass standardised tests that required me to regurgitate facts through essays and open-ended or multiple choice questions.

Instead, I had more time to grasp new and alien concepts and, through activities that promoted continuous development, I was able to digest and understand them better.

Mixed assessments in schools and universities can be beneficial for developing well-rounded individuals. Source: Shutterstock

Additionally, shy students such as myself are caught between a rock and a hard place – contribute to class discussions or get a zero for class participation, and take part in group and solo presentations or risk getting zero for oral presentations.

One benefit of this system is that it gives you the chance to play to your strengths and work hard towards securing top marks in areas you care about. If you prefer the examination or assignment portion, for example, you can knock it out of the park in those areas to pull up your grades.

Some students may be all-rounders who perform well in both exam-oriented and coursework assessments, but not all students can say the same. However, the availability of mixed assessments in schools and universities can be beneficial for developing well-rounded individuals.

Under this system, students who perform poorly in exams will still have to go through them anyway, while students who excel in exam-oriented conditions are also forced to undergo other forms of assessments and develop their skill sets, including creativity, collaboration, oral and critical thinking skills.

Students who argue that their grades will fall under mixed-assessments should rethink the purpose of their education – in most instances, degrees aim to prepare people for employment.

But can exams really prepare students for employment where they’ll be working with people with different skills, requiring them to apply critical thinking and communication skills over a period of time to ensure work is completed within stipulated deadlines, despite hiccups that can happen between the start and finishing line of a project?

It’ll help if parents, educators and policymakers get on board, too, instead of merely pushing children and students to obtain a string of As.

Grades hold so much power over students’ futures – from the ability to get an academic scholarship to gaining entry to prestigious institutions – and this means it can be difficult to get students who prefer one mode of assessment to convert to one that may potentially negatively affect their grades.

Ideally, education shouldn’t be about pitting one student against the other; it should be based on attaining knowledge and developing skills that will help students in their future careers and make positive contributions to the world.

Exams are still a crucial part of education, as some careers depend on a student’s academic attainment (medicine, for instance). But rather than having one form of assessment over the other, a matrimony of the two may help develop holistic students and better prepare them for the world they’ll soon be walking into.


Students on the Future of Exams

Constance Lam and Joanna Berry

23 August 2022 Last update: 12/08/22 11:12

About Constance: I’m currently studying for my master’s in publishing at University College London, and in 2021 I graduated from Durham University with a degree in English Literature. During the pandemic, I studied remotely back home in Hong Kong and returned to start my master’s course.

Since the start of Covid-19, students and educators have adapted rapidly to changing examination methods, such as online assessment. In June, we conducted a student panel to explore how 17 students from around the world perceive these changes.

How do students prepare for exams and retain information afterwards? Will assessments and learning resources move online or remain the same?

Survey Overview

Our student panel is global and includes students at all stages, from first year undergraduates through to PhD students. The participants in this survey were from six different regions: South America, Middle East, Asia Pacific, United Kingdom, North America, and Europe, and included undergraduate, postgraduate, and PhD students.

How do students prepare for exams?

I remember revising for my English Literature exams and using several revision methods: I spent hours making my own notes, mind maps, and flashcards. In this survey, ‘using your notes’ was the most popular resource, chosen by 88% of the participants, but I was surprised to find that flashcards were the least popular – only 2 participants, or 12% of the total, thought they were useful. For me, handwritten flashcards were the most effective method, because I could remember information more easily with active recall and visual cues.

In terms of revision time, most participants studied at least a few weeks before the exam, but some students said they couldn’t give one set answer, because each exam was different. My experience also varied depending on the exam. Some of my modules were more difficult to prepare for in advance because you were assessed on the spot – for example, being given a literature extract to analyse.

Retaining information after exams

Students generally agreed that interesting subjects were much easier to remember. One participant mentioned it depended on their study method: ‘If it was a subject I am genuinely interested in, then I studied in-depth, so my retention remains at high levels even years after the exam. On the other hand, if I studied only to pass the exam and forget about it, if not interested in the material, then I tend to forget most of it, with at least 50% retention.’

My experience was largely similar – I don’t remember the maths equations or the historical facts that I memorised for secondary school exams, but I still clearly remember the literary theories that I wrote about in exams, even three years on.

Coursework vs. exams

Students had varying views about the relevance of coursework and exams: while some believe that both are ‘equally important’, others think that coursework is starting to play a more dominant role than examinations.

One law student discussed the impact of the pandemic on their assessments: ‘My subject is currently assessed by coursework and no examinations. Unfortunately, this is partly due to the ongoing coronavirus pandemic.’

This student also raises an interesting point about how methods of assessment differ by the subject area. While their law faculty only uses coursework, other participants from social science and STM backgrounds said that coursework and exams were ‘equally weighted’ and ‘intertwined’ for their subject areas. 

However, many universities continue to keep examinations as part of assessment, simply by moving them online. In my case, I completed two years of online undergraduate exams with a 24-hour time limit, but my coursework was also assessed.

The future of assessments

Most panellists agree that universities are progressing towards digital examinations. However, some are confident that the way exams are conducted will not change anytime soon, while others say the future of assessments is difficult to predict.

One student highlights that coursework and online assessments can help level the academic playing field: ‘I believe universities will continue to make coursework a more integral part of their assessment strategy and also assemble online assessments for students who may be unable to physically attend lectures because of illness or personal circumstances to participate and have the ability to complete their studies.’

The survey results revealed that students use a wide variety of revision methods – which ones do you prefer?

Interested in getting a head start on the next academic year? Check out our Good Student collection, a selection of introductory titles on a variety of topics.


Impact of coursework on attainment dependent on student characteristics

Comparing specifications from 5 subject areas, at GCSE and A level, with and without coursework, to investigate how attainment differs depending on student characteristics.

Applies to England

The impact of coursework on attainment dependent on student characteristics.

Ref: Ofqual/20/6629


Ofqual commissioned this work in 2019 in order to gain deeper insight into the relationship between different forms of assessment within high stakes qualifications, and to understand the extent to which different types of assessment – examination and non-examined assessment (coursework) – might be associated with different levels of performance for different groups or types of students. While there are often views or observations around, say, whether a particular gender ‘does better’ in examinations, or whether students of lower socio-economic status may not have the same access to resources at home to support coursework, the research literature to support these views is not as substantial as it could be, and often does not adequately control for other factors.

This research aims to make the most of existing data on performance in different types of assessment. Over the last two decades, there have been changes in the proportion of examined and non-examined assessment (coursework), and this research uses that data to see whether there is any evidence that differences in learner characteristics (gender, ethnicity, special educational needs, socio-economic class) are associated with different patterns of performance in relation to different types of assessment. The data are useful in that they comprise large, whole-cohort data for the years analysed. The insights are likely to provide a useful evidence base and have a bearing on future decisions about the use of different types of assessment in high stakes examinations.

Key findings

  • There is little evidence that coursework has any impact on outcomes for students of different socio-economic statuses (SES) or for students with special educational needs.
  • Male students perform better than female students in wholly examined GCSE specifications and also in GCSE specifications where there is a greater level of control in the coursework. Female students tend to have better outcomes than males where internally set, internally marked coursework is included.
  • There is little indication of different outcomes for students of different ethnicities across different assessment types, except for students of Chinese ethnicity, who, despite performing well overall, perform relatively poorly when entered for specifications with coursework.
  • In specifications where coursework was optional, the examined alternative appears to provide a safety net for less able students who failed to submit coursework.


End-of-course exams benefit students—and states

Education reformers in the United States have stumbled when it comes to high schools and the achievement evidence shows it. National Assessment results in grade twelve have been flat for a very long time. ACT and SAT scores are flat. U.S. results on PISA and TIMSS are essentially flat. College remediation rates—and dropout rates—remain high. Advanced Placement (AP) participation is up, but success on AP exams is not—and for minority students it’s down. And while high school graduation rates are up—and it’s indisputably a good thing for young people to acquire that credential—it’s not so good when there’s reason to believe it does not signify levels of learning that augur success in post-high-school pursuits.

We at the Fordham Institute have a longstanding interest in strengthening student achievement and school performance, and it’s no secret that we’re accountability hawks: We believe strongly that results—and growth in results—are what matter in education, and we’ve been concerned for some time about ways in which the appearance or assertion of improvement may conceal something far more disappointing. In that connection, previous Fordham studies have unmasked what we termed the “proficiency illusion,” the “accountability illusion,” the rise of often-questionable “credit recovery,” and the discrepancy between teacher-conferred grades and student performance on statewide assessments.

On the upside, we’ve also documented respectable—and authentic—achievement gains in the early grades, particularly among disadvantaged and low-achieving youngsters and children of color. But high schools, as we’ve noted on multiple occasions, remain a huge challenge.

Nor have federal efforts to strengthen academic performance via school accountability ever gotten much traction at the high school level, where—under No Child Left Behind and now the Every Student Succeeds Act—there’s been more emphasis on graduation rates than on student achievement. To their credit, most states, at one point or another, have supplemented those efforts by instituting their own exam-based requirements for students before awarding diplomas. These have taken the form of multisubject graduation tests—the best known probably being the Massachusetts MCAS exam—as well as subject-specific end-of-course exams (EOCs).

Both were extensively used until just a few years ago. At their high-water mark, graduation tests were required by thirty states and EOCs were employed by thirty jurisdictions (there’s double counting there, as the two types of tests overlap somewhat). Both, however, are now in decline. For the class of 2020, students in just twelve states will have taken a graduation test, and in twenty-six states, students will have taken one or more EOCs.

Three factors seem to have driven that decline: the overriding push for higher graduation rates, which militates against anything that might get in the way; the nationwide backlash against testing in general; and a handful of studies indicating that requiring students to pass a graduation test may discourage them and lead to more dropouts, which is obviously bad for them and would also depress the graduation rate without much evidence of a positive impact on student achievement.

Yet very little prior research has looked at EOCs in particular. Our new report, End-of-Course Exams and Student Outcomes, helps remedy that. We wondered: How, exactly, do states employ EOCs? And what difference, if any, do they make for student achievement and graduation rates? If they cause more harm than good, states may be right to downplay or discard them. If, on the other hand—and unlike graduation exams—they do good things for kids or schools, it’s possible that states, in turning away from EOCs, are throwing a healthy baby out with the testing bathwater.

We entrusted this inquiry to Fordham’s own Adam Tyner and Lafayette College economist Matthew Larsen, and they’ve done a first-rate job, the more so considering how challenging it is to corral EOCs separately from other forms of testing, how tricky it is to determine exactly what a test is being “used for,” and how many different tests and states are involved and over such a long period of time. It’s also a big problem that the nation lacks a reliable gauge of state-by-state achievement at the twelfth-grade level—a challenge that the National Assessment Governing Board recently promised to address, but not until 2027!

Tyner and Larsen learned much that’s worth knowing and sharing because the implications for state (and district and school) policy and practice are potentially quite valuable. Probably most important, EOCs, properly deployed, have positive (albeit modest) academic benefits and do so without causing kids to drop out or graduation rates to falter. “In other words,” write the authors, “the key argument against exit exams—that they depress graduation rates—does not hold for EOCs.” Instead, these exams “are generally positively correlated with high school graduation rates.” Better still, “The more EOCs a state administers, the better is student performance on college-entrance exams, suggesting that the positive effects of EOCs may be cumulative.”

Nor are those the only potential benefits associated with strategic deployment of EOCs. External exams are a good way for states to maintain uniform content and rigor in core high school courses and keep a check on the local impulse (often driven as much by parents as by teachers or administrators) to inflate student grades. At the same time, EOCs can motivate students to take those courses more seriously and tend to place teachers and their pupils on the “same team”—for when the exam is external, the teacher becomes more coach than judge.

Such exams also lend themselves to an individualized, “mastery”-based education system in which students proceed through their coursework at their own speed, often with the help of technology as well as teachers. (To optimize this benefit, “end-of-unit” exams would be even more beneficial than the kind given only at the end of a semester or a year.)

We’re surely not suggesting that states go crazy with EOCs—there’s little danger of that happening in today’s climate anyway—but we do suggest that policymakers take seriously both the good that these exams can do and the potential harm from scrapping or softening them. And softening seems to be underway in more and more places, as states create detours around EOCs for kids who have trouble passing them, delay the year when they must actually be passed, or turn them into part of a student’s course grade rather than actually requiring that kids pass them.

As we said, we’re accountability hawks and thus generally opposed to softening. Yet as Tyner and Larsen note, EOCs have the virtue of flexibility. States can deploy them in various ways: some firmer, some softer, and some simply as a source of valuable information for teachers, parents, school leaders, and policymakers. At a time when states are back in the driver’s seat on school and student accountability, that’s mostly a good thing. But at a time when high school performance is flat, flat, flat, it seems to us that wise educators and policymakers alike should use every tool in their toolbox to build the scaffolding for major improvement. EOCs are such a tool.



Open Research Online - ORO

Coursework versus examinations in end-of-module assessment: a literature review.


Richardson, John T. E. (2015). Coursework versus examinations in end-of-module assessment: a literature review. Assessment & Evaluation in Higher Education, 40(3), pp. 439–455.

DOI: https://doi.org/10.1080/02602938.2014.919628

In the UK and other countries, the use of end-of-module assessment by coursework in higher education has increased over the last 40 years. This has been justified by various pedagogical arguments. In addition, students themselves prefer to be assessed either by coursework alone or by a mixture of coursework and examinations than by examinations alone. Assessment by coursework alone or by a mixture of coursework and examinations tends to yield higher marks than assessment by examinations alone. The increased adoption of assessment by coursework has contributed to an increase over time in the marks on individual modules and in the proportion of good degrees across entire programmes. Assessment by coursework appears to attenuate the negative effect of class size on student attainment. The difference between coursework marks and examination marks tends to be greater in some disciplines than others, but it appears to be similar in men and women and in students from different ethnic groups. Collusion, plagiarism and personation (especially ‘contract cheating’ through the use of bespoke essays) are potential problems with coursework assessment. Nevertheless, the increased use of assessment by coursework has generally been seen as uncontentious, with only isolated voices expressing concerns regarding possible risks to academic standards.


Think Student

Coursework vs Exams: What’s Easier? (Pros and Cons)

In A-Level, GCSE, General by Think Student Editor, September 12, 2023

Coursework and exams are two different techniques used to assess students on certain subjects. Both of these methods can seem like a drag when trying to get a good grade, as they both take so many hours of work! However, is it true that one of these assessment techniques is easier than the other? Some students pick subjects specifically because they are only assessed via coursework or only assessed via exams, depending on what they find easiest. However, could there be a definite answer to what is the easiest?

If you want to discover whether coursework or exams are easier and the pros and cons of these methods, check out the rest of this article!

Disclaimer: This article is solely based on one student’s opinion. Every student has different perspectives on whether coursework or exams are easier. Therefore, the views expressed in this article may not align with your own.

Coursework vs exams: what’s easier?

The truth is that whether you find coursework or exams easier depends on you and how you like to work. Different students learn best in different ways and as a result, will have differing views on these two assessment methods.

Coursework requires students to complete assignments and essays throughout the year which are carefully graded and moderated. This work makes up a student’s coursework and contributes to their final grade.

In comparison, exams often only take place at the end of the year. Therefore, students are only assessed at one point in the year instead of throughout. All of a student’s work then leads up to them answering a number of exams which make up their grade.

There are pros and cons for both of these methods, depending on how you learn and are assessed best. Therefore, whether you find coursework or exams easier or not depends on each individual.

Is coursework easier than exams?

Some students believe that coursework is easier than exams. This is because it requires students to work on it all throughout the year, whilst having plenty of resources available to them.

As a result, there is less pressure on students at the end of the year, as they have gradually been able to work hard on their coursework, which then determines their grade. If you do coursework at GCSE or A-Level, you will generally have to complete an extended essay or project.

Some students find this easier than exams because they have lots of time to research and edit their essays, allowing the highest quality of work to be produced. You can discover more about coursework and tips for how to make it stand out if you check out this article from Oxford Royale.

However, some students actually find coursework harder because of the amount of time it takes and all of the research involved. Consequently, whether you prefer coursework or not depends on how you enjoy learning.

What are the cons of coursework?

As already hinted at, the main con of coursework is the amount of time it takes. In my experience, coursework was always such a drag because it took up so much of my time!

When you hear that you have to do a long essay, roughly 2000-3000 words, it sounds easily achievable. However, the amount of research you have to do is immense, and then editing and reviewing your work takes even more time.

Coursework is not something to be over and done with within a week. It requires constant revisiting and rephrasing as you make it sound as professional and high quality as possible. Teachers are also unable to give lots of help to students doing coursework, because it is supposed to be an independent project.

Teachers can give some advice, but not too much support. This can be difficult for students who are used to being given lots of help.

You also have to be very careful with what you actually write. If any of your coursework is plagiarised, it could be disqualified. Therefore, it is very important that you pay attention to everything you write and make sure that you don’t copy directly from other websites. This can make coursework a risky assessment method.

You are allowed to use websites for research, but you must reference them correctly. This can also be a difficult skill for some students to learn!

What are the pros of coursework?

Some of the cons of coursework already discussed can actually be seen as pros by some students! Due to coursework being completed throughout the year, this places less pressure on students, as they don’t have to worry about final exams completely determining their grade.

Some subjects require students to sit exams and complete some coursework. However, if a student already knows that they have completed some high-quality coursework when it comes to exam season, they are less likely to place pressure on themselves. They know that their coursework could save their grade even if they don’t do very well on the exam.

A lot of coursework also requires students to decide what they want to research or investigate. This allows students to be more creative, as they decide what to research, depending on the subject. This can make school more enjoyable and also give them more ideas about what they want to do in the future.

If you are about to sit your GCSEs and are thinking that coursework is the way to go, check out this article from Think Student to discover which GCSE subjects require students to complete coursework.

What are the cons of exams?

Personally, I hated exams! Most students share this opinion. After all, so much pressure is put on students to complete a set of exams at the end of the school year. Therefore, the main con of sitting exams is the amount of pressure that students are put under.

Unlike coursework, students are unable to go back and revisit the answers to their exams over many weeks. Instead, after those 2 (ish) hours are up, you have to leave the exam hall and that’s it! Your grade will be determined from your exams.

This can be seen as a flawed method, as it doesn’t take students’ performance throughout the rest of the year into account. Consequently, if a student is just having a bad day and messes up one of their exams, nothing can be done about it!

If you are struggling with exam stress at the moment, check out this article from Think Student to discover ways of dealing with it.

Exams also require an immense amount of revision which takes up time and can be difficult for students to complete. If you want to discover some revision tips, check out this article from Think Student.

What are the pros of exams?

Exams can be considered easier, however, because they are over quickly. Unlike coursework, all students have to do is stay in an exam hall for a couple of hours and it’s done! If you want to discover how long GCSE exams generally last, check out this article from Think Student.

Alternatively, you can find out how long A-Level exams are in this article from Think Student. There is no need to work on one exam paper for weeks – apart from revising of course!

Revising for exams does take a while; however, revising can also be beneficial because it increases a student’s knowledge. Going over information again and again means that the student is more likely to remember it and use it in real life. This differs greatly from coursework.

Finally, the main advantage of exams is that they are much harder to cheat in. Firstly, this includes outright cheating – there have been issues in the past with students getting other people to write their coursework essays.

However, it also includes the help you get. Some students may have an unfair advantage if their teachers offer more help and guidance with coursework than at other schools. In an exam, it is purely the student’s work.

While this doesn’t necessarily make exams easier than coursework, it does make them fairer, and is the reason why very few GCSEs now include coursework.

If you want to discover more pros and cons of exams, check out this article from AplusTopper.

What type of student are coursework and exams suited to?

You have probably already gathered from this article whether exams or coursework are easier. This is because it all depends on you. Hopefully, the pros and cons outlined have helped you to decide whether exams or coursework is the best assessment method for you.

If you work well under pressure and prefer getting assessed all at once instead of gradually throughout the year, then exams will probably be easier for you. This is also true if you are the kind of person that leaves schoolwork till the last minute! Coursework will definitely be seen as difficult for you if you are known for doing this!

However, if, like me, you buckle under pressure and prefer having lots of time to research and write a perfect essay, then you may find coursework easier. Despite this, most GCSE subjects are assessed via exams. Therefore, you won’t be able to escape all exams!

As a result, it can be useful to find strategies that will help you work through them. This article from Think Student details a range of skills and techniques which could be useful to use when you are in an exam situation.

Exams and coursework are both difficult in their own ways – after all, they are used to thoroughly assess you! Depending on how you work best, it is up to you to decide whether one is easier than the other and which method that is.

Are exams a more effective tool to assess students than coursework?

Matthew Murchie, 15, St Joseph College

For thousands of years, exams have been used to assess students' abilities and intelligence. But recently, more schools have been opting for a different approach to assessment: coursework.

Exams seem, at first sight, to be an excellent way to assess students. They test a specific syllabus that everyone follows and they put students on an even footing as all students sit for the same exam papers.

Unfortunately, basing a student's academic abilities on a single exam paper is hardly an accurate method of assessment. What if the student is not on form on the day of the exam? What if the student is sick and can't make it to the exam? It is often the case that a talented student misses great opportunities purely because they couldn't turn up for a single important exam.

Coursework, on the other hand, provides a steady assessment over the course of months, ensuring that students’ results are an accurate summary of their academic standards.


Choosing the Right Statistical Test | Types & Examples

Published on January 28, 2020 by Rebecca Bevans . Revised on June 22, 2023.

Statistical tests are used in hypothesis testing . They can be used to:

  • determine whether a predictor variable has a statistically significant relationship with an outcome variable.
  • estimate the difference between two or more groups.

Statistical tests assume a null hypothesis of no relationship or no difference between groups. Then they determine whether the observed data fall outside of the range of values predicted by the null hypothesis.

If you already know what types of variables you’re dealing with, you can use the flowchart to choose the right statistical test for your data.

Statistical tests flowchart


Statistical tests work by calculating a test statistic – a number that describes how much the relationship between variables in your test differs from the null hypothesis of no relationship.

It then calculates a p value (probability value). The p-value estimates how likely it is that you would see the difference described by the test statistic if the null hypothesis of no relationship were true.

If the value of the test statistic is more extreme than the statistic calculated from the null hypothesis, then you can infer a statistically significant relationship between the predictor and outcome variables.

If the value of the test statistic is less extreme than the one calculated from the null hypothesis, then you can infer no statistically significant relationship between the predictor and outcome variables.
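
To make the test statistic and p-value concrete, here is a minimal sketch in Python using SciPy (an assumed library choice, not something the article specifies); the two simulated samples stand in for real data.

```python
# A minimal sketch of computing a test statistic and p-value with SciPy.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
group_a = rng.normal(loc=70, scale=8, size=50)   # e.g. exam scores, cohort A
group_b = rng.normal(loc=66, scale=8, size=50)   # e.g. exam scores, cohort B

# Independent two-sample t-test: the statistic measures how far the observed
# difference in means is from the "no difference" null hypothesis.
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# A small p-value (commonly p < 0.05) suggests the observed difference would
# be unlikely if the null hypothesis of no relationship were true.
```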

You can perform statistical tests on data that have been collected in a statistically valid manner – either through an experiment, or through observations made using probability sampling methods.

For a statistical test to be valid, your sample size needs to be large enough to approximate the true distribution of the population being studied.

To determine which statistical test to use, you need to know:

  • whether your data meets certain assumptions.
  • the types of variables that you’re dealing with.

Statistical assumptions

Statistical tests make some common assumptions about the data they are testing:

  • Independence of observations (a.k.a. no autocorrelation): The observations/variables you include in your test are not related (for example, multiple measurements of a single test subject are not independent, while measurements of multiple different test subjects are independent).
  • Homogeneity of variance: the variance within each group being compared is similar among all groups. If one group has much more variation than others, it will limit the test’s effectiveness.
  • Normality of data: the data follows a normal distribution (a.k.a. a bell curve). This assumption applies only to quantitative data.

If your data do not meet the assumptions of normality or homogeneity of variance, you may be able to perform a nonparametric statistical test, which allows you to make comparisons without any assumptions about the data distribution.

If your data do not meet the assumption of independence of observations, you may be able to use a test that accounts for structure in your data (repeated-measures tests or tests that include blocking variables).
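
These assumption checks can be run directly in code; the sketch below, again assuming SciPy is available, applies the Shapiro-Wilk and Levene tests to made-up data and falls back to a nonparametric test when the assumptions look violated.

```python
# A hedged sketch of checking common assumptions; the arrays are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
coursework = rng.normal(65, 10, 40)
exams = rng.exponential(scale=60, size=40)   # deliberately skewed

# Normality check (Shapiro-Wilk) and homogeneity of variance (Levene).
print("Shapiro p (coursework):", stats.shapiro(coursework).pvalue)
print("Shapiro p (exams):     ", stats.shapiro(exams).pvalue)
print("Levene p:              ", stats.levene(coursework, exams).pvalue)

# If either assumption fails, fall back to a nonparametric alternative,
# e.g. the Mann-Whitney U test instead of the independent t-test.
print("Mann-Whitney p:", stats.mannwhitneyu(coursework, exams).pvalue)
```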

Types of variables

The types of variables you have usually determine what type of statistical test you can use.

Quantitative variables represent amounts of things (e.g. the number of trees in a forest). Types of quantitative variables include:

  • Continuous (aka ratio variables): represent measures and can usually be divided into units smaller than one (e.g. 0.75 grams).
  • Discrete (aka integer variables): represent counts and usually can’t be divided into units smaller than one (e.g. 1 tree).

Categorical variables represent groupings of things (e.g. the different tree species in a forest). Types of categorical variables include:

  • Ordinal: represent data with an order (e.g. rankings).
  • Nominal: represent group names (e.g. brands or species names).
  • Binary: represent data with a yes/no or 1/0 outcome (e.g. win or lose).

Choose the test that fits the types of predictor and outcome variables you have collected (if you are doing an experiment, these are the independent and dependent variables). Consult the tables below to see which test best matches your variables.

Parametric tests usually have stricter requirements than nonparametric tests, and are able to make stronger inferences from the data. They can only be conducted with data that adheres to the common assumptions of statistical tests.

The most common types of parametric test include regression tests, comparison tests, and correlation tests.

Regression tests

Regression tests look for cause-and-effect relationships. They can be used to estimate the effect of one or more continuous variables on another variable.

  • Simple linear regression. Predictor: 1 continuous variable; outcome: 1 continuous variable. Example: What is the effect of income on longevity?
  • Multiple linear regression. Predictor: 2 or more continuous variables; outcome: 1 continuous variable. Example: What is the effect of income and minutes of exercise per day on longevity?
  • Logistic regression. Predictor: 1 continuous variable; outcome: binary. Example: What is the effect of drug dosage on the survival of a test subject?
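
As a hedged illustration of the first row above, a simple linear regression can be run with SciPy (a library choice assumed here, not named by the article); the income and longevity values are simulated purely to mirror the example.

```python
# A minimal sketch of a regression test; the data are made up.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
income = rng.normal(50, 15, 200)                        # predictor (continuous)
longevity = 70 + 0.1 * income + rng.normal(0, 5, 200)   # outcome (continuous)

result = stats.linregress(income, longevity)
print(f"slope = {result.slope:.3f}, p = {result.pvalue:.4f}")
# A significant slope suggests a statistically significant linear
# relationship between the predictor and the outcome.
```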

Comparison tests

Comparison tests look for differences among group means. They can be used to test the effect of a categorical variable on the mean value of some other characteristic.

T-tests are used when comparing the means of precisely two groups (e.g., the average heights of men and women). ANOVA and MANOVA tests are used when comparing the means of more than two groups (e.g., the average heights of children, teenagers, and adults).

  • Paired t-test. Predictor: 1 categorical variable (paired groups); outcome: quantitative. Example: What is the effect of two different test prep programs on the average exam scores for students from the same class?
  • Independent t-test. Predictor: 1 categorical variable (independent groups); outcome: quantitative. Example: What is the difference in average exam scores for students from two different schools?
  • ANOVA. Predictor: 1 categorical variable (3 or more groups); outcome: 1 quantitative variable. Example: What is the difference in average pain levels among post-surgical patients given three different painkillers?
  • MANOVA. Predictor: 1 or more categorical variables; outcome: 2 or more quantitative variables. Example: What is the effect of flower species on petal length, petal width, and stem length?
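
A comparison test from the table can be reproduced the same way; the sketch below, again assuming SciPy, runs a one-way ANOVA on simulated pain scores for three hypothetical painkillers.

```python
# A hedged sketch of a comparison test (one-way ANOVA); the data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
painkiller_a = rng.normal(5.0, 1.0, 30)   # reported pain levels, group A
painkiller_b = rng.normal(4.5, 1.0, 30)
painkiller_c = rng.normal(3.8, 1.0, 30)

f_stat, p_value = stats.f_oneway(painkiller_a, painkiller_b, painkiller_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# ANOVA tests whether at least one group mean differs; a follow-up post-hoc
# test would be needed to identify which groups differ.
```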

Correlation tests

Correlation tests check whether variables are related without hypothesizing a cause-and-effect relationship.

These can be used to test whether two variables you want to use in (for example) a multiple regression test are autocorrelated.

  • Pearson’s r. Variables: 2 quantitative variables. Example: How are latitude and temperature related?
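
The correlation test above maps to a one-line SciPy call; the latitude and temperature values below are simulated purely to echo the example.

```python
# A minimal sketch of a correlation test (Pearson's r); illustrative data only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
latitude = rng.uniform(0, 60, 100)
temperature = 30 - 0.4 * latitude + rng.normal(0, 3, 100)

r, p_value = stats.pearsonr(latitude, temperature)
print(f"r = {r:.2f}, p = {p_value:.4f}")
# r near -1 or +1 indicates a strong linear association; the p-value tests
# whether the correlation differs from zero.
```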

Non-parametric tests don’t make as many assumptions about the data, and are useful when one or more of the common statistical assumptions are violated. However, the inferences they make aren’t as strong as with parametric tests.

  • Spearman’s r: use in place of Pearson’s r when the data are ordinal or not normally distributed.
  • Sign test: use in place of the one-sample t-test.
  • Kruskal–Wallis H: use in place of ANOVA.
  • ANOSIM: use in place of MANOVA.
  • Wilcoxon rank-sum test: use in place of the independent t-test.
  • Wilcoxon signed-rank test: use in place of the paired t-test.
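
When the parametric assumptions fail, the alternatives above apply; the sketch below, again assuming SciPy, swaps ANOVA for the Kruskal–Wallis H test on deliberately skewed data.

```python
# A hedged sketch of a nonparametric alternative to ANOVA; simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
group_a = rng.exponential(2.0, 25)   # skewed outcome, group A
group_b = rng.exponential(2.5, 25)
group_c = rng.exponential(3.5, 25)

h_stat, p_value = stats.kruskal(group_a, group_b, group_c)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
# Kruskal-Wallis compares groups using ranks, so it does not assume the
# data are normally distributed.
```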


This flowchart helps you choose among parametric tests. For nonparametric alternatives, check the table above.

Choosing the right statistical test


Statistical tests commonly assume that:

  • the data are normally distributed
  • the groups that are being compared have similar variance
  • the data are independent

If your data does not meet these assumptions, you might still be able to use a nonparametric statistical test, which has fewer requirements but also makes weaker inferences.

A test statistic is a number calculated by a statistical test. It describes how far your observed data is from the null hypothesis of no relationship between variables or no difference among sample groups.

The test statistic tells you how different two or more groups are from the overall population mean, or how different a linear slope is from the slope predicted by a null hypothesis. Different test statistics are used in different statistical tests.

Statistical significance is a term used by researchers to state that it is unlikely their observations could have occurred under the null hypothesis of a statistical test. Significance is usually denoted by a p-value, or probability value.

Statistical significance is arbitrary – it depends on the threshold, or alpha value, chosen by the researcher. The most common threshold is p < 0.05, which means that the data is likely to occur less than 5% of the time under the null hypothesis.

When the p -value falls below the chosen alpha value, then we say the result of the test is statistically significant.

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).

Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).

You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results.

Discrete and continuous variables are two types of quantitative variables :

  • Discrete variables represent counts (e.g. the number of objects in a collection).
  • Continuous variables represent measurable amounts (e.g. water volume or weight).


ORIGINAL RESEARCH article

Gender performance gaps across different assessment methods and the underlying mechanisms: the case of incoming preparation and test anxiety.

Shima Salehi

  • 1 Graduate School of Education, Stanford University, Stanford, CA, United States
  • 2 Department of Biology Teaching and Learning, University of Minnesota, Minneapolis, MN, United States
  • 3 Department of Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, MN, United States
  • 4 Department of Chemistry, University of Minnesota, Minneapolis, MN, United States
  • 5 Department of Ecology, Evolution and Behavior, University of Minnesota, Minneapolis, MN, United States
  • 6 Department of Biological Sciences, Auburn University, Auburn, AL, United States

A persistent “gender penalty” in exam performance disproportionately impacts women in large introductory science courses, where exam grades generally account for the majority of the students' assessment of learning. Previous work in introductory biology demonstrates that some social psychological factors may underlie these gender penalties, including test anxiety and interest in course content. In this paper, we examine the extent that gender predicts performance across disciplines, and investigate social psychological factors that mediate performance. We also examine whether a gender penalty persists beyond introductory courses, and can be observed in more advanced upper division science courses. We ran analyses (1) across two colleges at a single institution: the College of Biological Sciences and the College of Science and Engineering (i.e., physics, chemistry, materials science, math); and (2) across introductory lower division courses and advanced upper division courses, or those that require a prerequisite. We affirm that exams have disparate impacts based on student gender at the introductory level, with female students underperforming relative to male students. We did not observe these exam gender penalties in upper division courses, suggesting that women are either being “weeded out” at the introductory level, or “warming to” timed examinations. Additionally, results from mediation analyses show that across disciplines and divisions, for women only, test anxiety negatively influences exam performance.
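
For readers unfamiliar with mediation analysis, the sketch below shows the general idea using statsmodels on entirely hypothetical data; it is not the authors' actual model, and the variable names (female, anxiety, exam) are placeholders chosen only to mirror the abstract.

```python
# A generic mediation sketch (not the study's analysis): does a mediator
# (test anxiety) carry part of the association between an exposure (gender)
# and an outcome (exam score)? All data are simulated.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n = 500
df = pd.DataFrame({"female": rng.integers(0, 2, n)})
df["anxiety"] = 2.0 + 0.8 * df["female"] + rng.normal(0, 1, n)
df["exam"] = 75 - 1.5 * df["anxiety"] + rng.normal(0, 5, n)

# Path a: exposure -> mediator; paths b and c': mediator and exposure -> outcome.
path_a = smf.ols("anxiety ~ female", data=df).fit()
paths_bc = smf.ols("exam ~ female + anxiety", data=df).fit()

indirect = path_a.params["female"] * paths_bc.params["anxiety"]
print("indirect (mediated) effect:", round(indirect, 2))
print("direct effect of gender:   ", round(paths_bc.params["female"], 2))
```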

Introduction

To effectively promote student groups who have been historically underrepresented in science, technology, engineering, and math (STEM), we need to provide students from different backgrounds equal opportunities to perform in these fields. Results from previous studies, however, demonstrate that schools are still unable to provide all students with equal opportunities, as evidenced by gaps in performance based on gender and other demographic descriptors of student identities ( McGrath and Braunstein, 1997 ; Kao and Thompson, 2003 ; DeBerard et al., 2004 ; Ballen and Mason, 2017 ). Demographic gaps in performance in higher education can be partly explained by demographic gaps in student incoming preparation ( Sun, 2017 ; Salehi et al., 2019 ). However, they can also be due to biased education structures such as methods used to assess student performance in STEM fields ( Stanger-Hall, 2012 ), introductory gateway courses that “weed out” students ( Mervis, 2010 , 2011 ), traditional uninterrupted lectures rather than high-structure active learning methods ( Haak et al., 2011 ; Ballen et al., 2017b ), feelings of exclusion ( Hurtado and Ruiz, 2012 ), stereotype threat ( Steele, 1997 ; Cohen et al., 2006 ), and discrimination ( Milkman et al., 2015 ). While many demographics and identities remain underrepresented in STEM, such as certain racial and ethnic groups, and first-generation college students, the work described herein focuses broadly on women in STEM.

Previous work has demonstrated that using high stakes exams as an assessment method has disparate impacts on male and female students. Even after controlling for student incoming preparation, this work shows female students underperformed on exams across multiple introductory biology courses, due in part to test anxiety ( Ballen et al., 2017a ). This negative effect of anxiety on performance was observed only for female students, and only on exam assessments. Anxiety did not impact male student performance or female student performance on non-exam assessments such as homework and in-class assignments.

Women's underperformance on exams is troubling for two reasons in particular. First, exam scores usually constitute a high proportion of grades in introductory courses ( Koester et al., 2016 ). If the primary assessment method in entry-level STEM courses leads to a “gender penalty” for female students, then institutions are creating an early obstacle that may prevent women from advancing to the upper-level subject material. Second, studies across STEM courses show that in some disciplines, low performance in introductory courses is disproportionately impactful for women, often resulting in the abandonment of their major, while men with similar performance are more likely to continue in the discipline ( Grandy, 1994 ; McCullough, 2004 ; Rask and Tiefenthaler, 2008 ; Rauschenberger and Sweeder, 2010 ; Creech and Sweeder, 2012 ; Eddy and Brownell, 2016 ; Koester et al., 2016 ; Matz et al., 2017 ). Among women who perform well in introductory courses, Marshman et al. (2018) showed those who received high scores on a physics conceptual survey (or who were receiving A's) reported similar self-efficacy measures as male students with medium or low scores on the physics conceptual surveys (or who were receiving B's and C's). Therefore, female underperformance on exams, if generalizable across disciplines and over time, leads to a consequential gender performance gap that systematically disadvantages female students during their undergraduate pursuit of a degree.

Our previous work showed that women in introductory biology classes underperformed relative to men on exams, and that exam anxiety and interest in course content mediated the relationship between incoming preparation and exam performance ( Ballen et al., 2017a ). Until now, it was unclear whether the patterns we observed in undergraduate biology persist (1) in other disciplines, and (2) among students who have advanced beyond introductory science courses. First, biological sciences are among the most female-dominant fields in undergraduate STEM; ~60% of undergraduate students in the life sciences are women ( Neugebauer, 2006 ). If the gender gap in exam performance in introductory biology is due in part to the impact of test anxiety, this gap might be even more pronounced in male-dominated STEM fields where women are susceptible to negative social and learning experiences (e.g., tokenism, gender stereotypes about science abilities; Kanter, 1977 ; Miller et al., 2015 ).

Second, gender gaps in exam performance can be moderated by characteristics of the learning environment. Examples of characteristics that have documented impacts on student performance or experiences include group composition (e.g., gender ratio; Dahlerup, 1988 ; Sullivan et al., 2018 ), instructor traits (e.g., instructor gender: Crombie et al., 2003 ; Cotner et al., 2011 , or attitude: Alsharif and Qi, 2014 ; Cooper et al., 2018b ), and class size ( Ballen et al., 2018 ). These characteristics may vary across disciplines, divisions, or even over a single semester. For example, upper division courses differ from lower division courses in a number of ways, and performance gaps present at the introductory level might not be apparent in more advanced courses. In upper division courses, student grades are less reliant on scalable multiple-choice exams; instead, the reliance on “lower stakes” assessments might ameliorate the negative impact of test anxiety on performance. Alternatively, or additionally, capable but test-anxious women may be weeded out at the lower division, or become acclimated to high stakes exams—or develop tools to counter test anxiety–as they progress through higher education.

In this study, we examined the generalizability of the exam gender gap across different STEM fields, and across both lower and upper division courses. We also studied how underlying social psychological mechanisms that have been previously studied in the context of student performance in STEM courses (e.g., test anxiety, interest in the course material; Ballen et al., 2017a) change over time, and how they function as mediators of the gender gap in exam performance.

We address two multi-part questions as they apply across different fields (the College of Biological Sciences and the College of Science and Engineering) and divisions (lower and upper division courses):

1. Gender gap in different assessment methods (RQ1): (A) Do we observe a gender gap in performance across different assessment methods (i.e., exam, non-exam, laboratory, and course grades)? (B) To what extent can these potential gender gaps be explained by incoming preparation (as measured by students' American College Testing entrance exam score, hereafter ACT)?

2. Social psychological mediators of exam gender gap (RQ2): (A) How do test anxiety and interest in course content mediate performance outcomes on exams? (B) How do these two social psychological factors vary based on gender and over the course of a semester?

Data Collection

The study is based on a secondary analysis of previously collected data that were provided by CB and SC. The IRB of the University of Minnesota exempted this study from the ethics review process (University of Minnesota IRB 00000800).

Class Performance

Administrative data were obtained from 5,864 students from 2015 to 2017. Courses included those offered by the College of Biological Sciences (CBS) or the College of Science and Engineering (CSE). A subset of the CBS data were explored in prior reports (e.g., Ballen et al., 2017a; Cotner and Ballen, 2017). CBS is a relatively small college (~2,500 undergraduates) with a large percentage of women (the 2018 first-year class was 66% female) and is restricted to biological fields including neuroscience, ecology, and genetics (“College of Biological Sciences,” 2018). However, the lower-division courses involved in this study primarily target non-biology majors, and only one of the courses enrolls students interested in pursuing biology as a major. These introductory biology courses not only include the standard curriculum, but also include courses that are customized to student interests, such as “Environmental Biology: Science and Solutions” and the “Evolution and Biology of Sex,” all of which fulfill introductory biology requirements for the university. The upper-division courses in CBS enroll predominantly students majoring in biology. CSE is a larger college (~5,500 undergraduates) with a relatively small percentage of women (in 2018, women made up 27.4% of its graduate and undergraduate students, an all-time high), and houses the departments of chemistry, physics and astronomy, chemical engineering and materials science, computer science and engineering, and the school of mathematics (“CSE: By numbers,” 2018). Lower division courses included a mix of majors and non-majors, and upper division courses primarily served students who intended to major in the discipline (e.g., chemistry, computer science). We only included students who reported their gender in our sample (N = 5,766) ( Table 1 ).


Table 1 . Descriptive statistics.

In this sample, we compared (1) average exam scores, or scores on all high-stakes assessments that accounted for a relatively large portion of a student's grade, (2) average non-exam scores including in-class assignments, credit for participation, and group work (note: these scores do not include out-of-class homework), (3) average laboratory scores, where applicable, and (4) final course grades (i.e., student cumulative performance in the course based on their performance on all exam, lecture, and laboratory activities). For each of these items, we transformed all raw percentage scores into class Z-scores (a measure of how many standard deviations a value is from the class section's mean score) for ease of interpretation. We calculated Z-scores using the formula Z-score = (X–μ)/σ, where X is the grade of interest, μ is the class mean score, and σ is the standard deviation.
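As a rough illustration of this standardization step (a sketch, not the authors' code), the snippet below computes class Z-scores within each section using pandas; the column names "section" and "exam_pct" are hypothetical.

```python
import pandas as pd

def class_z_scores(df: pd.DataFrame, score_col: str, section_col: str = "section") -> pd.Series:
    """Standardize each raw score against its own class section's mean and SD."""
    grouped = df.groupby(section_col)[score_col]
    return (df[score_col] - grouped.transform("mean")) / grouped.transform("std")

# Hypothetical raw exam percentages for two class sections.
grades = pd.DataFrame({
    "section": ["A", "A", "A", "B", "B", "B"],
    "exam_pct": [72.0, 85.0, 64.0, 90.0, 78.0, 81.0],
})
grades["exam_z"] = class_z_scores(grades, "exam_pct")
print(grades)
```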

Social Psychological Factors

In addition to performance data, for a subsample, we also examined change in exam anxiety and interest in course content over time. We conducted a survey at the beginning of the semester (pre-survey) and at the end of the semester before the final exam (post-survey). The survey included measures of student interest in course content as well as test anxiety ( Table 2 ). For both metrics, we used multi-item constructs from Pintrich et al.'s (1993) Motivated Strategies for Learning Questionnaire (MSLQ; Table 2 ). The MSLQ is a common tool for assessing motivated strategies for learning, with historically high reliability and validity across different student populations (e.g., Pintrich et al., 1993; McClendon, 1996; Büyüköztürk et al., 2004; Feiz and Hooman, 2013; Jakešová and Hrbáčková, 2014). Items on each subscale were rated on a 7-point scale (1 = not at all true for me to 7 = very true for me). Factor loadings of items were between 0.64 and 0.87 for interest in course content, and between 0.73 and 0.89 for test anxiety. In the reliability study, the internal consistency (Cronbach's alpha) coefficients for these two subscales were 0.89 and 0.88, respectively.


Table 2 . Items used in a survey of students in courses offered by the College of Biological Sciences (CBS) and the College of Science and Engineering (CSE) at the University of Minnesota ( N = 3,368).
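For readers unfamiliar with the internal consistency statistic reported above, the following is a minimal sketch of Cronbach's alpha computed on simulated 7-point survey items; it uses random data for illustration only, not the study's responses.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) array of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# Simulated responses: 200 students answering five 7-point test-anxiety items
# that share a common underlying "anxiety level" plus small item-level noise.
rng = np.random.default_rng(42)
base = rng.integers(1, 8, size=(200, 1))
noise = rng.integers(-1, 2, size=(200, 5))
anxiety_items = np.clip(base + noise, 1, 7)
print(f"alpha = {cronbach_alpha(anxiety_items):.2f}")
```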

Statistical Analyses

For our analyses, we parsed our data across colleges and divisions: lower division CBS, lower division CSE, upper division CBS, and upper division CSE. We divided data across colleges because the students may be systematically different in each college in ways that impact our outcome variables. Differences across colleges in our sample are discussed in the data collection section. We also divided data across divisions for each college, as there may be a selection bias among those who pursue upper division courses. Upper division courses target students who have already chosen a STEM field for their major, while lower division courses also target non-major students. For each of the four sub-samples, we examined: (RQ1.A) the gender performance gap across different assessment methods (e.g., exam, non-exam, laboratory, and overall course grade); (RQ1.B) the impact of incoming preparation on assessment measures for men and women; (RQ2.A) the mediation effects of test anxiety and interest in course content on exam performance across genders; and (RQ2.B) how social psychological factors vary across genders and over time.

RQ1.A. Gender Gaps in Performance Across Different Assessment Methods

First, we analyzed gender performance gaps for different assessment methods without controlling for student incoming preparation and other demographic factors. These raw, “transcriptable” performance measures are what students see on their transcripts, use to assess their performance relative to their peers, and submit in graduate school applications. In order to examine the gender gap in performance, we used mixed-model regression analysis to predict student performance by gender without controlling for student incoming preparation. In this analysis, we included the fixed effect of gender, and the random effects of courses and sections to reflect the nested structure of the data (i.e., when sections are nested within courses).
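The authors do not publish their analysis code here; as a rough Python analogue (a sketch on synthetic data with assumed column names, using statsmodels rather than whatever software the authors used), the model for RQ1.A might look like the following, with a gender fixed effect, a random intercept per course, and a variance component for sections nested within courses.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Purely synthetic data with the columns the model expects:
# exam_z (class Z-score), gender (0 = male, 1 = female), course, and section.
rng = np.random.default_rng(0)
n = 600
grades = pd.DataFrame({
    "gender": rng.integers(0, 2, n),
    "course": rng.choice(["bio1", "chem1", "phys1"], n),
    "section": rng.choice(["s1", "s2", "s3"], n),
})
grades["exam_z"] = -0.2 * grades["gender"] + rng.normal(0, 1, n)  # built-in gap of -0.2 SD

model = smf.mixedlm(
    "exam_z ~ gender",                          # fixed effect of gender
    data=grades,
    groups="course",                            # random intercept for each course
    vc_formula={"section": "0 + C(section)"},   # sections nested within courses
)
result = model.fit()
print(result.summary())                         # the gender coefficient estimates the gap
```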

RQ1.B. The Impact of Incoming Preparation on Gender Performance Gaps

Second, we examined the gender gap in student performance while controlling for student incoming preparation as well as their underrepresented minority (URM) status and first generation status (FGEN). Here we define URM students as those who are African American, Latino/a, Pacific Islander, and Native American. Incoming preparation was measured as students' American College Testing (ACT) score. The ACT is a standardized test that covers English, mathematics, reading, and science reasoning, and is commonly used for college admissions as well as in education research as a general measure of “incoming academic preparation.” High schools in the United States vary substantially with respect to coursework, institution type (e.g., public, private, home-schooled), size, and grading scale. Admissions officers in higher education use tests such as the ACT to place student metrics such as grades and class rank in a national perspective ( https://www.act.org ). However, the location of public schools in the United States also dictates financial resources committed to them, such that a district with higher socio-economic status has more educational resources going to each individual student ( Parrish et al., 1995 ). Thus, variation observed in ACT score can also be explained by socio-economic status of students or proxies thereof (e.g., minority status, first-generation status; Carnevale and Rose, 2013 ).

For this analysis, to find the simplest best-fitting model, we first started with a basic additive model. Then, we added different interaction terms between variables to this basic model, and tested whether addition of any interaction term would significantly improve the fit of the model. Our final model included gender (a factor with two levels: male = 0, female = 1), URM status (a factor with two levels: non-URM = 0, URM = 1), first generation (FGEN) status (i.e., whether the student was among the first generation in their family to attend university; a factor with two levels: continuing generation = 0, first generation = 1), and ACT score, as well as any interaction terms between these variables that improved the model fit significantly. Similar to the previous analyses, we also included the random effects of courses and sections.
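To make the model-selection step concrete, here is a similarly hedged sketch (synthetic data, hypothetical column names, statsmodels) of testing whether a single gender-by-ACT interaction improves on the additive model via a likelihood-ratio test; the authors' actual procedure and software may differ.

```python
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

# Synthetic, illustrative data with the covariates described above.
rng = np.random.default_rng(1)
n = 800
d = pd.DataFrame({
    "gender": rng.integers(0, 2, n),
    "urm": rng.integers(0, 2, n),
    "fgen": rng.integers(0, 2, n),
    "act": rng.normal(0, 1, n),                 # incoming preparation, already normalized
    "course": rng.choice(["c1", "c2", "c3", "c4"], n),
})
d["exam_z"] = 0.4 * d["act"] - 0.15 * d["gender"] + rng.normal(0, 1, n)

# Fit both models by maximum likelihood (reml=False) so their log-likelihoods are
# comparable when the fixed-effect structures differ.
additive = smf.mixedlm("exam_z ~ gender + urm + fgen + act", data=d,
                       groups="course").fit(reml=False)
interaction = smf.mixedlm("exam_z ~ gender * act + urm + fgen", data=d,
                          groups="course").fit(reml=False)

lr_stat = 2 * (interaction.llf - additive.llf)  # likelihood-ratio statistic
p_value = stats.chi2.sf(lr_stat, df=1)          # the models differ by one parameter
print(f"LR = {lr_stat:.2f}, p = {p_value:.3f}") # keep the interaction only if it improves fit
```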

RQ2.A. The Mediation Effect of Social Psychological Factors on Student Performance

For a subsample of students from whom we collected surveys, we used structural equation modeling (SEM) with the lavaan R package ( Rosseel, 2012 ) in order to test the structural relationship between incoming preparation, self-reported test anxiety, interest in course content, and exam performance for different genders. SEM is a statistical tool that allows us to address mechanisms underlying documented trends ( Taris, 2002; Jeon, 2015 ). We used CFI, RMSEA, and SRMR to evaluate model fits. In this analysis, we normalized ACT score, test anxiety, and interest in course content for the whole sample. The normalized scores represent a measure of how many standard deviations a value is from the sample mean score. For students' general levels of test anxiety and course interest, we used data from the survey administered at the beginning of the semester. The descriptive statistics of the subsample used in SEM are reported in Table 3 .


Table 3 . Descriptive statistics of the data used for structural equation modeling.

RQ2.B. The Variation of Social Psychological Factors Across Genders and Time

To examine the variation of social psychological factors across genders and time, we analyzed how test anxiety and interest in course content vary over the semester for men and women. We used mixed-model multivariable regression analyses to regress either of these two psychological factors on gender and time points (beginning and end of the semester), while including the random effect of students.
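A rough sketch of this repeated-measures regression (again synthetic data and assumed column names, in statsmodels) is shown below: anxiety is regressed on gender and time point with a random intercept per student.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic long-format survey data: one row per student per time point.
rng = np.random.default_rng(2)
n_students = 300
students = pd.DataFrame({
    "student": np.arange(n_students),
    "gender": rng.integers(0, 2, n_students),   # 0 = male, 1 = female
})
long = pd.concat([students.assign(time="pre"), students.assign(time="post")],
                 ignore_index=True)
long["anxiety"] = (
    0.4 * long["gender"]                        # women report higher anxiety on average
    + 0.2 * (long["time"] == "post")            # anxiety drifts upward over the semester
    + rng.normal(0, 1, len(long))
)

model = smf.mixedlm(
    "anxiety ~ gender + C(time, Treatment(reference='pre'))",
    data=long,
    groups="student",                           # random intercept per student
)
print(model.fit().summary())
```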

Results

Figure 1 shows the average normalized score for different assessment methods across genders. In the next section, we report the sizes of gender gaps, and their significance across colleges and divisions for different assessment methods, based on single-variable mixed-model regressions.


Figure 1 . Average normalized performance for women and men across different assessment methods (exam, non-exam assessments, laboratory, and overall course grade), disciplines (College of Biological Sciences, CBS; College of Science and Engineering, CSE), and divisions (lower division or upper division courses).

Consistent with the pattern observed in Ballen et al. (2017b) , in lower division courses in CBS, women underperformed by a relatively small but significant margin on exams ( p = 0.033) ( Table 4 ). However, they significantly overperformed relative to men in non-exam ( p < 0.0001) and laboratory measures ( p < 0.0001). Due to their overperformance in these measures, women overperformed relative to men in overall course grades ( p = 0.044). For CSE lower division courses, which include more male-stereotyped STEM disciplines such as physics, math, and chemistry, we observed the same trend of female underperformance on exams ( p < 0.0001); of note, the size of the exam gender gap in CSE was three times that of CBS (−0.24 compared to −0.08 standard deviation). However, there was no gender gap in the non-exam measure ( p = 0.233), and women significantly overperformed relative to men in the laboratory measure ( p = 0.033). Due in part to the difference in the size of the gender gap in exams, as well as the differential weighting of exams in the overall course grade (e.g., Cotner and Ballen, 2017 ), women underperformed relative to men in overall course grades in lower division CSE ( p = 0.002).


Table 4 . Regression estimates for the gender gap in performance in lower division courses in the College of Biological Sciences (CBS) and the College of Science and Engineering (CSE) across different assessment methods, including the overall course grade.

For upper division students in both CBS and CSE, we observed no influence of gender on exam performance (p CBS = 0.164, p CSE = 0.987, Table 5 ). However, in non-exam assessments, women marginally overperformed relative to men in CBS courses (b CBS = 0.28, p CBS = 0.072), and significantly overperformed in CSE courses (b CSE = 0.54, p CSE < 0.0001). On overall course grades, we did not observe gender differences across disciplines (p CBS = 0.108; p CSE = 0.352, Table 5 ). Due to a left skew in our data, to be conservative we also re-ran all the above analyses using non-parametric tests. The outcomes were the same, and the non-parametric results were very similar to the regression results reported here ( Tables S1, S2 ).


Table 5 . Regression estimates for the gender gap in performance in upper division courses in the College of Biological Sciences (CBS) and the College of Science and Engineering (CSE) across different assessment methods, including the overall course grade.

To analyze what portion of gender performance gaps described in Tables 4 , 5 are due to differences in incoming academic preparation, we used mixed-model multivariable linear regression with ACT as a measure of incoming preparation. In this analysis, we also controlled for URM and FGEN status of students. Table 6 reports the coefficients of the simplest best fitting regression models predicting performance in each assessment method for lower division courses across colleges. In the following, we will focus on the effect of gender in this analysis. However, there are noteworthy effects of URM and first generation status on student performance that we discuss in detail in a forthcoming publication (Salehi et al., in preparation). Because our data violated the assumption of normality of residual distribution, we also analyzed the data using robust regression ( Yaffee, 2002 ; Koller, 2015 , 2016 ). The results of robust regression are reported in ( Tables S3, S4 ). The results of these analyses were aligned with the following reported results.


Table 6 . The simplest best fitting models predicting performance in lower division courses across different assessment methods in the College of Biological Sciences (CBS) and the College of Science and Engineering (CSE).

For lower division courses, we found no significant gender gap in exam performance in CBS courses ( p = 0.404) after controlling for incoming preparation ( Table 6 ). However, even after controlling for incoming preparation, women performed 0.19 standard deviation lower than men on exams in lower division CSE courses ( p < 0.0001). In contrast, women overperformed relative to men in non-exam and laboratory scores by 0.23 ( p < 0.0001), and 0.22 standard deviation ( p < 0.0001), respectively, in lower division CBS courses; and 0.17 standard deviation ( p = 0.004) in laboratory scores in lower division CSE courses. Women also marginally overperformed by 0.10 standard deviation in non-exam scores of lower division CSE courses ( p = 0.090). For the overall course grade, after controlling for incoming preparation, women significantly overperformed relative to men by 0.16 standard deviation in CBS courses ( p = 0.0001), but they marginally underperformed by 0.09 standard deviation in CSE courses ( p = 0.057).

In upper division courses, after controlling for incoming preparation, we found no gender difference in exam performance for both colleges ( Table 7 ). However, female students overperformed in non-exam measures significantly in upper division CSE courses, and marginally in CBS courses. They had on average 0.58 standard deviation higher non-exam score in upper division CSE courses ( p < 0.0001), and 0.28 standard deviation in CBS courses ( p = 0.096). Upper division CSE courses in this sample did not have lab components, and in CBS courses, we did not observe differences in lab scores ( p = 0.330). For the overall course grade, there was no gender gap in CBS ( p = 0.178), and marginal female overperformance of 0.21 standard deviation in CSE ( p = 0.095). This marginal overperformance of females in CSE can be explained by their overperformance in non-exam assessments.


Table 7 . Predictors of performance in upper division courses across different assessment methods in the College of Biological Sciences (CBS) and the College of Science and Engineering (CSE).

In summary, female students only underperformed on exams in lower division, introductory courses. After accounting for incoming preparation through ACT score, this gender gap in exam performance closed in one college (CBS), and decreased in size in the other (CSE). In other forms of assessment, if we observed any gender difference, it was female students outperforming their male counterparts.

Previous work demonstrates that test anxiety and interest in the course content exert gender-specific impacts on exam performance in introductory biology ( Ballen et al., 2017b ). To test whether these patterns persisted across different disciplines and divisions, we re-tested the same model on this larger sample using structural equation modeling (SEM) analysis. We fit the hypothesized model, shown in Figure 2 , to four sub-samples of data (CBS lower division, CSE lower division, CBS upper division, CSE upper division), and used gender as a grouping variable to fit this model to the data of each gender separately.


Figure 2 . Hypothesized model of the relationship among incoming preparation (ACT), social psychological factors (test anxiety, and interest in course content), and exam performance. We fit the structural equation model to data collected from men and women separately in the College of Biological Sciences (CBS) and the College of Science and Engineering (CSE), at both lower and upper division courses.

We hypothesized that for women and men in each of these four sub-samples, exam performance is influenced by incoming preparation, test anxiety, and interest in course content. Furthermore, test anxiety and interest in course content are influenced by student incoming preparation. Therefore, this model suggests that incoming preparation influences student exam performance directly, as well as indirectly through test anxiety and interest in course content. In other words, test anxiety and interest in course content partially mediate the effect of incoming preparation on exam performance. By fitting this model to the data of each gender separately, we tested whether these mediation effects are different across genders for each sub-sample.
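The authors fit this model with the lavaan package in R; as a simplified, purely illustrative alternative (not their analysis), the sketch below estimates the direct path and the anxiety-mediated indirect path with ordinary regressions and the product-of-coefficients approach, on synthetic normalized data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic, already-normalized variables mirroring the hypothesized model:
# ACT influences anxiety and interest, and all three influence exam performance.
rng = np.random.default_rng(3)
n = 500
d = pd.DataFrame({"act": rng.normal(0, 1, n)})
d["anxiety"] = -0.3 * d["act"] + rng.normal(0, 1, n)                  # path a
d["interest"] = 0.1 * d["act"] + rng.normal(0, 1, n)
d["exam"] = 0.5 * d["act"] - 0.2 * d["anxiety"] + rng.normal(0, 1, n)

# Product-of-coefficients view of partial mediation: ACT affects exams directly
# (path c') and indirectly through anxiety (path a times path b).
a = smf.ols("anxiety ~ act", data=d).fit().params["act"]
outcome = smf.ols("exam ~ act + anxiety + interest", data=d).fit()
b = outcome.params["anxiety"]
c_direct = outcome.params["act"]

print(f"direct effect of ACT on exam (c') = {c_direct:.2f}")
print(f"indirect effect via anxiety (a x b) = {a * b:.2f}")
```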

Acceptable ranges for SEM fit indices are: 0–0.07 for the root mean square error of approximation (RMSEA), above 0.95 for the comparative fit index (CFI), and 0–0.1 for the standardized root mean square residual (SRMR) ( Taris, 2002 ). This model had acceptable fit indices for all four subsamples, suggesting that it adequately describes the variation in the data (CBS-lower: RMSEA = 0.064, CFI = 0.990, SRMR = 0.020; CBS-upper: RMSEA = 0.000, CFI = 1.00, SRMR = 0.017; CSE-lower: RMSEA = 0.057, CFI = 0.989, SRMR = 0.023; CSE-upper: RMSEA = 0.000, CFI = 1.000, SRMR = 0.021).

For women in CBS courses, test anxiety negatively influenced exam scores in both lower and upper divisions; for male students, however, test anxiety did not correlate with exam scores ( Figure 3 ). Further, for CBS female students, ACT score was also negatively correlated with test anxiety. Therefore, this model suggests that incoming preparation influences student exam performance positively and directly, as well as indirectly through test anxiety (the red path in Figure 3 ). It is also notable that this indirect effect was stronger for upper division courses, as the negative relationship between test anxiety and female student exam score was stronger in upper division courses than in lower division courses. One standard deviation increase in test anxiety decreased exam score by 0.11 standard deviation in lower division courses and by 0.34 standard deviation in upper division courses.


Figure 3 . Partial mediation analyses show differences in the significant effects of incoming preparation (ACT) on exam performance for female (Left) and male (Right) students in lower division and upper division courses in the College of Biological Sciences (CBS). Red arrows depict negative effects and blue arrows show positive effects. ACT has direct, positive effects on exam performance for all lower division students and for female upper division students. This effect was marginally significant for male upper division students. For female students in the lower and upper division courses, ACT negatively predicts test anxiety, which in turn influences exam performance. For male students at the lower and upper division, ACT negatively affects test anxiety, but test anxiety does not in turn influence exam performance. In the graphs, “e” circles indicate error terms in estimations of the structural equation model variables.

Similarly, in CSE, test anxiety negatively correlated with exam scores for female students in both lower and upper divisions, but did not correlate with exam scores for male students in either division ( Figure 4 ). However, unlike CBS, female student anxiety was correlated with their ACT scores only for the lower division courses, and not for the upper division courses. Therefore, test anxiety mediates an indirect effect of ACT on exam performance only in lower division CSE courses. As in CBS, the negative influence of test anxiety on exam score increased in the upper division courses. One standard deviation increase in test anxiety decreased exam score by 0.14 standard deviation in lower division courses and 0.25 standard deviation in upper division courses. In summary, while the relationship between incoming preparation and test anxiety varied across women studying STEM at the University, we observed a consistently significant negative relationship between test anxiety and exam performance.


Figure 4 . Partial mediation analyses show differences in the significant effects of incoming preparation (ACT) on exam performance for female and male students in the College of Science and Engineering (CSE) in lower division and upper division courses. Red arrows depict negative effects and blue arrows show positive effects. ACT has direct, positive effects on exam performance for all lower division and upper division students. For female students at the lower division, ACT predicts test anxiety, which negatively predicts exam performance. For women in upper division courses, ACT does not predict test anxiety, but test anxiety negatively predicts exam performance. For male students at the lower and upper division, ACT negatively affects test anxiety, but test anxiety does not in turn influence exam performance. In the graphs, “e” circles indicate error terms in estimations of the structural equation model variables.

Our results suggest that regardless of discipline, exam performance for women was negatively influenced by their test anxiety, and surprisingly, this influence was more pronounced in upper division courses ( Figures 3 , 4 ).

In all four sub-samples, except for CBS upper division courses ( p = 0.272), women reported significantly higher levels of test anxiety than men ( Figure 5 ). Women reported on average 0.35 standard deviation higher levels of test anxiety ( p = 0.0001) than men in CBS lower division courses, and 0.6 standard deviation higher level of test anxiety in both lower and upper division CSE courses ( p < 0.0001). Furthermore, in CSE upper division courses, test anxiety increased by 0.29 standard deviation over the semester ( p < 0.0001).


Figure 5 . Change in test anxiety over the course of the semester for students in CBS and CSE for women (green) and men (blue) in lower division courses (Top) and upper division courses (Bottom) . The survey was administered at the beginning of the semester (pre-survey) and at the end of the semester (post-survey; i.e., after students completed the last in-class test, but before their final exam). On average, women (green) reported higher levels of test anxiety than men (blue) in lower division courses in the College of Biological Sciences (CBS) and in both upper and lower divisions in the College of Science and Engineering (CSE). In upper division CSE, average test anxiety significantly increased for all students over the course of the semester.

Interest in course content was not a significant factor in predicting exam performance in any of the subsamples. That said, we still examined the variation in interest across genders and over the semester. We found no gender difference in interest in upper division courses across both colleges (p CBS = 0.257, p CSE = 0.665), and no significant change in interest over the semester in either college (p CBS = 0.900, p CSE = 0.131). However, in lower division courses, female students expressed 0.35 standard deviation higher interest in course content than male students in CBS courses ( p = 0.0001), and 0.49 standard deviation lower interest in course content than male students in CSE courses ( p < 0.0001). For all students, interest increased by 0.18 standard deviation in CBS lower division courses ( p = 0.023), and decreased by 0.17 standard deviation in CSE lower division courses ( p = 0.005). Changes over time in interest in lower division courses were not different between genders (p CBS = 0.648, p CSE = 0.711) ( Figure 6 ).


Figure 6 . Change in interest in course content over the course of the semester for students in CBS and CSE for women (green) and men (blue) in lower division courses (Top) and upper division courses (Bottom) . The survey was administered at the beginning (pre-survey) and at the end of semester (post-survey).

Discussion

Gaps in academic performance are attributable to a host of different external factors, including measures of academic preparation. However, even when accounting for preparation (e.g., via the ACT, SAT, or high-school grade-point average), achievement in some disciplines can be predicted by student characteristics such as gender, underrepresented minority status, and first generation status. We explored how factors other than these unidimensional categories of student identity—such as social psychological factors—impacted performance among students in science. We focused on mechanisms that underlie the gender-based performance gaps in different assessment methods across STEM fields and divisions.

We showed that women only underperformed in high stakes examinations in lower division introductory courses across multiple STEM fields. However, in non-exam and laboratory assessment methods in these introductory courses, either there was no gender gap or female students overperformed relative to their male peers. In CBS courses the gender gap in exam performance became non-significant when incoming preparation was accounted for. However, in CSE, even after controlling for incoming preparation, we observed a significant gender gap in exam scores. For upper division courses, unlike lower division courses, there was no gender gap in exam performance; and similar to lower division courses, in non-exam and laboratory assessment methods, either there was no gender gap, or female students overperformed relative to their male peers.

The gender difference in “transcriptable” grades in introductory courses in the two colleges could be due to several factors. First, the courses included in this study in CBS are some of a number of courses that meet the university's liberal education requirement for “biology-with-lab.” The majority of the CBS courses included in this study do not serve as prerequisites for any other courses nor are they specifically required for most majors. All the CSE courses in this study are prerequisites for numerous courses and are required (or one of several challenging course options) for various majors. Therefore, the pressure to perform in the introductory level courses included in this study might be very different between the colleges. The grade pressure in a biology-with-lab course that is not a requirement for a student's major is likely lower than the grade pressure in courses that are considered gateways into a major. Further, this grade pressure may differentially impact the level of exam anxiety students feel. However, we did not see meaningful differences in test anxiety between the two colleges in these lower division courses.

We examined the mediation impact of test anxiety and interest in course content on gender performance on exams. The underperformance of women in lower division exams was explained in part by reported test anxiety. In upper division courses, which lack gender gaps in exam performance, test anxiety still negatively impacted exam performance for women, but not for men. For the remainder of this work, we further explore the phenomenon of anxiety—both general and test anxiety, especially as it pertains to gender-biased gaps in performance in STEM fields.

Test anxiety is common among university students; in one sample of undergraduates, 30.0% of males and 46.3% of females reported suffering from test anxiety. In this same report, students often declined seeking help from their peers or instructor for fear of the stigma associated with test anxiety ( Gerwing et al., 2015 ). A meta-analysis of 126 studies found a negative correlation between test anxiety and performance, reporting that overall, students who reported low test anxiety overperformed relative to students who reported high test anxiety by nearly half of a standard deviation ( Seipp, 1991 ).

Further, women are more likely than men to be diagnosed with a generalized anxiety disorder ( Wittchen et al., 1994 ; Kessler et al., 2005 ; Leach et al., 2008 ). Similarly, some investigators have documented higher levels of test anxiety in women than in their male peers ( Osborne, 2001 ; Núñez-Peña et al., 2016 ; Ballen et al., 2017a ). Our current work connects these threads by demonstrating that test anxiety negatively impacts exam performance for women, but not for men. Not only do these data confirm prior findings ( Ballen et al., 2017a ), but they elaborate on earlier work by identifying these trends in courses offered through multiple STEM disciplines besides biology.

While some hypothesize that heightened emotionality during an exam causes heightened anxiety, which in turn depresses performance ( Maloney and Beilock, 2012 ; Ramirez et al., 2013 ), others suggest that it is the awareness of poor past performance that causes test anxiety ( Hembree, 1990 ). In the first case, it is the anxiety that leads to the poor performance, and in the second, it is the poor performance that leads to the anxiety. Regardless of the origins of anxiety, there are certainly strategies instructors can use to minimize test anxiety and its impacts—strategies that are likely to benefit all students. And, given the demonstrated connection between introductory-level performance and retention in STEM ( Seymour and Hewitt, 1997 ), it is worthwhile to pay closer attention to social psychological factors—such as test anxiety—that may disadvantage underrepresented groups.

How Can Instructors Address Student Anxiety?

It may be difficult to target each individual student's experience of anxiety, especially in the lower-division, high-enrollment courses. However, there are certain strategies instructors can employ to decrease the anxiety itself, or the impacts of the anxiety on a student's performance.

Rethinking assessment can be a helpful strategy that directly addresses test anxiety. Prior work in several introductory biology courses demonstrated that gendered performance gaps disappeared when exams were devalued in favor of the addition of multiple, lower-stakes assessments—possibly as a result of a reduction in test anxiety ( Ballen et al., 2017a ; Cotner and Ballen, 2017 ). The fact that women in our sample were more likely to underperform, relative to their male peers, on anxiety-inducing high stakes exams, combined with the fact that, across the board, women express higher levels of test anxiety, suggests that minimizing the impact of exams could lower performance gaps–such as those documented here.

Instructors can also create a classroom environment that minimizes general anxiety. Tanner (2013) discusses several instructional strategies for creating a welcoming classroom environment and reducing general class anxiety—from playing music before class to taking time to hear a range of student voices. Avoiding anxiety-inducing behaviors such as cold-calling on individual students ( England et al., 2017 ; Cooper et al., 2018a ), and opting for less stressful options such as calling on groups via randomly appointed spokespeople can minimize anxiety ( Rocca, 2010 ). And creating a pattern of frequently encountered behaviors will allow students to adjust to the specific in-class expectations of the instructor ( McCroskey, 2009 ). Finally, simply being transparent in expectations (about grading, test content, learning goals) can minimize anxiety ( Neer, 1990 ). These are strategies that target student general anxiety in class, not particularly their test anxiety. While these two anxiety constructs can be positively correlated, they might differ significantly as well. Future works should explore whether and how reducing general anxiety in class would impact test anxiety, and how this effect is moderated by demographic status.

Recommendations for Future Research

While there is compelling evidence that test anxiety, as well as anxiety in general, affects performance and retention, there is little if any work demonstrating the impacts of the above interventions on student anxiety, or the connections between lowered anxiety and improved performance. Thus, future work could measure the impacts of experimentally reducing anxiety on student outcomes such as performance, self-efficacy, sense of social belonging, and retention. For example, instructors could incorporate weekly quizzes, instead of or in addition to higher-stakes midterm exams, to test whether this reduces test anxiety, and in turn, improves performance for those impacted by test anxiety. Additionally, adding constructed response questions to summative assessments in large enrollment courses mitigates gender-biased performance outcomes ( Stanger-Hall, 2012 ), and future work would benefit from exploring the impacts of different types of exam questions on student anxiety. Also, offering the option of retaking high stakes exams might reduce the anxiety associated with single metrics, as could extending the time allowed to complete exams.

With this current work, we cannot explain why the gendered gaps in performance disappear in upper division courses. Are women being “weeded out” after introductory courses, are they learning to cope over time, or benefitting from small classrooms ( Ballen et al., 2018 )? Also, because the populations are different—representing a greater range of majors in the introductory courses—we cannot rule out the possibility that the differences seen in lower division courses are driven largely by students not intending to major in science. These questions should be experimentally addressed, and will also benefit from longitudinal studies of individual students in the STEM pathway.

In this study we did not have any data about specific instructional practices employed in each particular course. Therefore, we could not examine how instructional practices in each course influenced gender gaps in different assessment methods. Second, the courses in our sample do not represent a cross-section of all courses offered in each college, nor were they selected to be contrasting cases of instructional practices. Our data collection was based on a convenience sample of instructors willing to share their data. Given that, we could only examine whether, on average, there existed gender gaps in different assessment methods in a set of different STEM courses in lower and upper divisions. Despite differences in student composition across two diverse colleges, the similar results we observed suggest these trends are generalizable to science majors and non-majors. Future studies might explore how different instructional practices affect demographic performance gaps, and which STEM fields have been more successful in employing these equitable instructional practices.

Finally, we used ACT as a measure of incoming preparation. We recognize that the ACT itself is a crude measure of student incoming preparation, and that it is also a high stakes examination. Other metrics, such as high-school ranking or GPA, might give a more accurate snapshot of a student's incoming preparation. Given the evidence from this study and previous studies, it is clear that the way in which we assess students should be reconsidered—not only within colleges and universities, but also in the admission process of higher education.

For investigators, there is still much work to be done to establish the salient connections between student affect, performance, and retention in STEM. And for instructors, it's clearly time to reconsider long-standing norms related to assessment strategies. Specifically, it may be time for a shift away from reliance on high stakes, timed examinations, which have negative effects on female students and may not be indicative of a student's ability to succeed in a discipline. Rather, we encourage the use of evaluation that measures relevant skills, encourages growth, and allows instructors and students to better assess student progress.

Data Availability Statement

The datasets generated for this study are available on request to the corresponding author.

Ethics Statement

We received IRB exemption to work with student data from University of Minnesota, IRB 00000800.

Author Contributions

SS, SC, and CB contributed to the conception and design of the study. CB organized the database. SS performed the statistical analysis. SS and CB wrote the first draft of the manuscript. SC wrote sections of the manuscript. All authors contributed to manuscript revision, read, and approved the submitted version.

Funding

This work was funded in part by a Research Coordination Network grant no. 1729935, awarded to SC and CB, from the National Science Foundation, RCN–UBE Incubator: Equity and Diversity in Undergraduate STEM.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

We thank Daniel Baltz for help with data organization, and the students who contributed performance data and their honest input on surveys.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/feduc.2019.00107/full#supplementary-material

References

Alsharif, N. Z., and Qi, Y. (2014). A three-year study of the impact of instructor attitude, enthusiasm, and teaching style on student learning in a medicinal chemistry course. Am. J. Pharmac. Educ. 78:132. doi: 10.5688/ajpe787132


Ballen, C. J., Aguillon, S. M., Brunelli, R., Drake, A. G., Wassenberg, D., Weiss, S. L., et al. (2018). Do small classes in higher education reduce performance gaps in STEM?. BioScience . 68, 593–600. doi: 10.1093/biosci/biy056


Ballen, C. J., and Mason, N. A. (2017). Longitudinal analysis of a diversity support program in biology: a national call for further assessment. Bioscience 67, 367–373. doi: 10.1093/biosci/biw187

Ballen, C. J., Salehi, S., and Cotner, S. (2017a). Exams disadvantage women in introductory biology. PLoS ONE 12:e0186419. doi: 10.1371/journal.pone.0186419

Ballen, C. J., Wieman, C., Salehi, S., Searle, J. B., and Zamudio, K. R. (2017b). Enhancing diversity in undergraduate science: self-efficacy drives performance gains with active learning. CBE-Life Sci. Educ . 16:ar56. doi: 10.1187/cbe.16-12-0344

Büyüköztürk, S., Akgün, Ö. E., Özkahveci, Ö., and Demirel, F. (2004). The validity and reliability study of the Turkish version of the motivated strategies for learning questionnaire. Educ. Sci. Theory Pract. 14, 821–833. doi: 10.12738/estp.2014.3.1871

Carnevale, A. P., and Rose, S. (2013). Socioeconomic Status, Race/Ethnicity, and Selective College Admissions . Center on Education and the Workforce.


Cohen, G. L., Garcia, J., Apfel, N., and Master, A. (2006). Reducing the racial achievement gap: a social-psychological intervention. Science 313, 1307–1310. doi: 10.1126/science.1128317

Cooper, K. M., Downing, V. R., and Brownell, S. E. (2018a). The influence of active learning practices on student anxiety in large-enrollment college science classrooms. Int. J. STEM Educ. 5:23. doi: 10.1186/s40594-018-0123-6

Cooper, K. M., Hendrix, T., Stephens, M. D., Cala, J. M., Mahrer, K., Krieg, A., et al. (2018b). To be funny or not to be funny: gender differences in student perceptions of instructor humor in college science courses. PLoS ONE 13:e0201258. doi: 10.1371/journal.pone.0201258

Cotner, S., Ballen, C., Brooks, D. C., and Moore, R. (2011). Instructor gender and student confidence in the sciences: a need for more role models. J. College Sci. Teach . 40, 96–101.

Cotner, S., and Ballen, C. J. (2017). Can mixed assessment methods make biology classes more equitable? PLoS ONE 12:e0189610. doi: 10.1371/journal.pone.0189610

Creech, L. R., and Sweeder, R. D. (2012). Analysis of student performance in large-enrollment life science courses. CBE Life Sci. Educ. 11, 386–391. doi: 10.1187/cbe.12-02-0019

Crombie, G., Pyke, S. W., Silverthorn, N., Jones, A., and Piccinin, S. (2003). Students perceptions of their classroom participation and instructor as a function of gender and context. J. High. Educ. 74, 51–76. doi: 10.1353/jhe.2003.0001

Dahlerup, D. (1988). From a small to a large minority: women in Scandinavian politics. Scand. Pol. Stud. 11, 275–298. doi: 10.1111/j.1467-9477.1988.tb00372.x

DeBerard, M. S., Spielmans, G. I., and Julka, D. L. (2004). Predictors of academic achievement and retention among college freshmen: a longitudinal study. Coll. Stud. J . 38, 66–81.

Eddy, S. L., and Brownell, S. E. (2016). Beneath the numbers: a review of gender disparities in undergraduate education across science, technology, engineering, and math disciplines. Phys. Rev. Phys. Educ. Res. 12:20106. doi: 10.1103/PhysRevPhysEducRes.12.020106

England, B. J., Brigati, J. R., and Schussler, E. E. (2017). Student anxiety in introductory biology classrooms: perceptions about active learning and persistence in the major, PLoS ONE 12:e0182506. doi: 10.1371/journal.pone.0182506

Feiz, P., and Hooman, H. A. (2013). Assessing the Motivated Strategies for Learning Questionnaire (MSLQ) in Iranian students: construct validity and reliability. Proc. Soc. Behav. Sci. 84, 1820–1825. doi: 10.1016/j.sbspro.2013.07.041

Gerwing, T. G., Rash, J. A., Allen Gerwing, A. M., Bramble, B., and Landine, J. (2015). Perceptions and incidence of test anxiety. Can. J. Scholars. Teach. Learn. 6:3. doi: 10.5206/cjsotl-rcacea.2015.3.3

Grandy, J. (1994). Gender and ethnic differences among science and engineering majors: experiences, achievements, and expectations. ETS Res. Rep. Ser . 1994, i−63. doi: 10.1002/j.2333-8504.1994.tb01603.x

Haak, D. C., HilleRisLambers, J., Pitre, E., and Freeman, S. (2011). Increased structure and active learning reduce the achievement gap in introductory biology. Science 332, 1213–1216. doi: 10.1126/science.1204820

Hembree, R. (1990). The nature, effects, and relief of mathematics anxiety. J. Res. Math. Educ. 21, 33–46. doi: 10.2307/749455

Hurtado, S., and Ruiz, A. (2012). The climate for underrepresented groups and diversity on campus. Am. Acad. Polit. Soc. Sci. 634, 190–206. doi: 10.1177/0002716210389702

Jakešová, J., and Hrbáčková, K. (2014). The Czech adaptation of motivated strategies for learning questionnaire (MSLQ). Asian Soc. Sci. 10, 72–78. doi: 10.5539/ass.v10n12p72

Jeon, J. (2015). The strengths and limitations of the statistical modeling of complex social phenomenon: focusing on SEM, path analysis, or multiple regression models. Int. J. Soc. Behav. Educ. Econ. Bus. Ind. Eng. 9, 1594–1602.

Kanter, R. M. (1977). Some effects of proportions on group life: Skewed sex ratios and responses to token women. Am. J. Soc . 82, 965–990. doi: 10.1086/226425

Kao, G., and Thompson, J. S. (2003). Racial and ethnic stratification in educational achievement and attainment. Ann. Rev. Soc. 29, 417–442. doi: 10.1146/annurev.soc.29.010202.100019

Kessler, R. C., Chiu, W. T., Demler, O., Merikangas, K. R., and Walters, E. E. (2005). Prevalence, severity, and comorbidity of 12-month DSM-IV disorders in the National Comorbidity Survey Replication. Arch. Gen. Psychiatry . 62, 617–627. doi: 10.1001/archpsyc.62.6.617

Koester, B. P., Grom, G., and McKay, T. A. (2016). Patterns of gendered performance difference in introductory STEM courses. arXiv preprint arXiv:1608.07565.

Koller, M. (2015). robustlmm: Robust Linear Mixed Effects Models. R package version 2.1 . Available online at: http://CRAN.R-project.org/package=robustlmm (accessed August 8, 2019).

Koller, M. (2016). robustlmm: an R package for robust estimation of linear mixed-effects models. J. Stat. Softw. 75, 1–24. doi: 10.18637/jss.v075.i06

Keywords: gender, STEM equity, high stakes assessment, test anxiety, mediation analysis

Citation: Salehi S, Cotner S, Azarin SM, Carlson EE, Driessen M, Ferry VE, Harcombe W, McGaugh S, Wassenberg D, Yonas A and Ballen CJ (2019) Gender Performance Gaps Across Different Assessment Methods and the Underlying Mechanisms: The Case of Incoming Preparation and Test Anxiety. Front. Educ. 4:107. doi: 10.3389/feduc.2019.00107

Received: 07 May 2019; Accepted: 13 September 2019; Published: 27 September 2019.


Copyright © 2019 Salehi, Cotner, Azarin, Carlson, Driessen, Ferry, Harcombe, McGaugh, Wassenberg, Yonas and Ballen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Shima Salehi, salehi@stanford.edu


How Common is Cheating in Online Exams and did it Increase During the COVID-19 Pandemic? A Systematic Review

  • Open access
  • Published: 04 August 2023
  • Volume 22, pages 323–343 (2024)


  • Philip M. Newton (ORCID: orcid.org/0000-0002-5272-7979)
  • Keioni Essex


Academic misconduct is a threat to the validity and reliability of online examinations, and media reports suggest that misconduct spiked dramatically in higher education during the emergency shift to online exams caused by the COVID-19 pandemic. This study reviewed survey research to determine how common it is for university students to admit to cheating in online exams, and how and why they do it. We also assessed whether these self-reports of cheating increased during the COVID-19 pandemic, along with an evaluation of the quality of the research evidence which addressed these questions. 25 samples were identified from 19 studies, including 4672 participants, going back to 2012. Online exam cheating was self-reported by a substantial minority (44.7%) of students in total. Pre-COVID this figure was 29.9%, but during COVID it jumped to 54.7%, although these samples were more heterogeneous. Individual cheating was more common than group cheating, and the most common reason students reported for cheating was simply that there was an opportunity to do so. Remote proctoring appeared to reduce the occurrence of cheating, although data were limited. However, a number of methodological features reduce confidence in the accuracy of all these findings. Most samples were collected using designs which make it likely that online exam cheating is under-reported, for example convenience sampling, modest sample sizes and insufficient information to calculate response rates. No studies considered whether samples were representative of their population. Future approaches to online exams should consider how the basic validity of examinations can be maintained, given the substantial numbers of students who appear willing to admit engaging in misconduct. Future research on academic misconduct would benefit from large representative samples and guarantees of participant anonymity.


Introduction

Distance learning came to the fore during the global COVID-19 pandemic. Distance learning, also referred to as e-learning, blended learning or mobile learning (Zarzycka et al., 2021), is defined as learning with the use of technology where there is a physical separation of students from the teachers during the active learning process, instruction and examination (Armstrong-Mensah et al., 2020). This physical separation was key to a sector-wide response to reducing the spread of coronavirus.

COVID prompted a sudden, rapid and near-total adjustment to distance learning (Brown et al., 2022; Pokhrel & Chhetri, 2021). We all, staff and students, had to learn a lot, very quickly, about distance learning. Pandemic-induced 'lockdown learning' continued, in some form, for almost two years in many countries, prompting predictions that higher education would be permanently changed by the pandemic, with online/distance learning becoming much more common, even the norm (Barber et al., 2021; Dumulescu & Muţiu, 2021). One obvious potential change would be the widespread adoption of online assessment methods. Online exams offer students increased flexibility, for example the opportunity to sit an exam in their own homes. This may also reduce some of the anxiety experienced when sitting in-person exams in an exam hall, and potentially reduce the administrative cost to universities.

However, assessment poses many challenges for distance learning. Summative assessments, including exams, are the basis for making decisions about the grading and progress of individual students, while aggregated results can inform educational policy such as curriculum or funding decisions (Shute & Kim, 2014). Thus, it is essential that online summative assessments can be conducted in a way that allows their basic reliability and validity to be maintained. During the pandemic, universities shifted in-person exams to an online format very rapidly, with limited time to ensure that these methods were secure. There were subsequent media reports that academic misconduct was now 'endemic', with universities supposedly 'turning a blind eye' towards cheating (e.g. Henry, 2022; Knox, 2021). However, it is unclear whether this media anxiety is reflected in the real-world experience of universities.

Dawson defines e-cheating as 'cheating that uses or is enabled by technology' (Dawson, 2020, p. 4). Cheating itself is then defined as the gaining of an unfair advantage (Case and King 2007, in Dawson, 2020, p. 4). Cheating poses an obvious threat to the validity of online examinations, a format which relies heavily on technology. Noorbehbahani and colleagues recently reviewed the research literature on a specific form of e-cheating: online exam cheating in higher education. They found that students use a variety of methods to gain an unfair advantage, including accessing unauthorized materials such as notes and textbooks, using an additional device to go online, collaborating with others, and even outsourcing the exam to be taken by someone else. These findings map onto the work of Dawson (2020), who found a similar taxonomy when considering 'e-cheating' more generally. These behaviours can be driven by a variety of motivations, including a fear of failure, peer pressure, a perception that others are cheating, and the ease with which they can do it (Noorbehbahani et al., 2022). However, it remains unclear how many students are actually engaged in these cheating behaviours. Understanding the scale of cheating is an important pragmatic consideration when determining how, or even if, it could or should be addressed. There is an extensive literature on the incidence of other types of misconduct, but cheating in online exams has received less attention than other forms of misconduct such as plagiarism (Garg & Goel, 2022).

One seemingly obvious response to concerns about cheating in online exams is to use remote proctoring systems, wherein students are monitored through webcams and use locked-down browsers. However, the efficacy of these systems is not yet clear, and their use has been controversial, with students feeling that they are 'under surveillance' and anxious about being unfairly accused of cheating, or about technological problems (Marano et al., 2023). A recent court ruling in the USA found that the use of a remote proctoring system to scan a student's private residence prior to taking an online exam was unconstitutional (Bowman, 2022), although, at the time of writing, this case is ongoing (Witley, 2023). There is already a long history of legal battles between the proctoring companies and their critics (Corbyn, 2022), and it is still unclear whether these systems actually reduce misconduct. Alternatives have been offered in the literature, including guidance for how to prepare online exams in a way that reduces the opportunity for misconduct (Whisenhunt et al., 2022), although it is unclear whether this guidance is effective either.

There is a large body of research literature which examines the prevalence of different types of academic dishonesty and misconduct. Much of this research is in the form of survey-based self-report studies. There are some obvious problems with using self-report as a measure of misconduct; it is a ‘deviant’ or ‘undesirable’ behaviour, and so those invited to participate in survey-based research have a disincentive to respond truthfully, if at all, especially if there is no guarantee of anonymity. There is also some evidence that certain demographic characteristics associated with an increased likelihood of engaging in academic misconduct are also predictive of a decreased likelihood of responding voluntarily to surveys, meaning that misconduct is likely under-reported when a non-representative sampling method is used such as convenience sampling (Newton, 2018 ).

Some of these issues with quantifying academic misconduct can be partially addressed by the use of rigorous research methodology, for example using representative samples with a high response rate, and clear, unambiguous survey items (Bennett et al., 2011 ; Halbesleben & Whitman, 2013 ). Guarantees of anonymity are also essential for respondents to feel confident about answering honestly, especially when the research is being undertaken by the very universities where participants are studying. A previous systematic review of academic misconduct found that self-report studies are often undertaken with small, convenience samples with low response rates (Newton, 2018 ). Similar findings were reported when reviewing the reliability of research into the prevalence of belief in the Learning Styles neuromyth, suggesting that this is a wider concern within survey-based education research (Newton & Salvi, 2020 ).

However, self-report remains one of the most common ways that academic misconduct is estimated, perhaps in part because there are few other ways to meaningfully measure it. There is also a basic, intuitive objective validity to the method; asking students whether they have cheated is a simple and direct approach, when compared to other indirect approaches to quantifying misconduct, based on (for example) learner analytics, originality scores or grade discrepancies. There is some evidence that self-report correlates positively with actual behaviour (Gardner et al., 1988 ), and that data accuracy can be improved by using methods which incentivize truth-telling (Curtis et al., 2022 ).

Here we undertook a systematic search of the literature in order to identify research which studied the prevalence of academic dishonesty in summative online examinations in Higher Education. The research questions were as follows:

How common is self-report of cheating in online exams in Higher Education? (This was the primary research question, and studies were only included if they addressed this question).

Did cheating in online exams increase during the COVID-19 pandemic?

What are the most common forms of cheating?

What are student motivations for cheating?

Does online proctoring reduce the incidence of self-reported online exam cheating?

Method

The review was conducted according to the principles of the PRISMA statement for reporting systematic reviews (Moher et al., 2009), updated for 2020 (Page et al., 2021). We adapted this methodology from previous work systematically reviewing survey-based research in education, misbelief and misconduct (Fanelli, 2009; Newton, 2018; Newton & Salvi, 2020), reflecting the limited nature of the outcomes reported in these studies (i.e. the percentage of students engaging in a specific behaviour).

Search Strategy and Information Sources

Searches were conducted in July and August 2022. Searches were first undertaken using the ERIC education research database (eric.ed.gov) and then with Google Scholar. We used Google Scholar since it covers grey literature (Haddaway et al., 2015), including unpublished Masters and PhD theses (Jamali & Nabavi, 2015) as well as preprints. The Google Scholar search interface is limited, and the search returns can include non-research documents such as citations, university policies and handbooks on academic integrity, as well as multiple versions of papers (Boeker et al., 2013). It is also not possible to exclude the results of one search from another. Thus it is not possible for us to report accurately the numbers of included papers returned from each term. 'Daisy chaining' was also used to identify relevant research from studies that had already been identified using the aforementioned literature searches, and from recent reviews on the subject (Butler-Henderson & Crawford, 2020; Chiang et al., 2022; Garg & Goel, 2022; Holden et al., 2021; Noorbehbahani et al., 2022; Surahman & Wang, 2022).

Selection Process

Search results were individually assessed against the inclusion/exclusion criteria, starting with the title, followed by the abstract and then the full text. If a study clearly did not meet the inclusion criteria based on the title then it was excluded. If the author was unsure, then the abstract was reviewed. If there was still uncertainty, then the full text was reviewed. When a study met the inclusion criteria (see below), the specific question used in that study to quantify online exam cheating was then itself also used as a search term. Thus the full list of search terms used is shown in Supplementary Online Material S1 .

Eligibility Criteria

The following criteria were used to determine whether to include samples. Many studies included multiple datasets (e.g. samples comprising different groups of students, across different years). The criteria here were applied to individual datasets.

Inclusion Criteria

Participants were asked whether they had ever cheated in an online exam (self-report).

Participants were students in Higher Education.

Reported both total sample size and percent of respondents answering yes to the relevant exam cheating questions, or sufficient data to allow those metrics to be calculated.

English language publication.

Published 2013–present, with data collected 2012–present. We wanted to evaluate a 10-year timeframe. In 2013, at the beginning of this time window, the average time needed to publish an academic paper was 12.2 months, ranging from 9 months (chemistry) to 18 months (business) (Björk & Solomon, 2013). It would therefore be reasonable to conclude that a paper published in 2013 was most likely submitted in 2012. Thus we included papers whose publication date was 2013 onwards, unless the manuscript itself specifically stated that the data were collected prior to 2012.

Exclusion Criteria

Asking participants whether they would cheat in exams (e.g. Morales-Martinez et al., 2019), or not allowing for a distinction between self-reported intent and actual cheating (e.g. Ghias et al., 2014).

Phrasing of survey items in a way that does not allow for frequency of online exam cheating to be specifically identified according to the criteria above. Wherever necessary, study authors were contacted to clarify.

Asking participants ‘how often do others cheat in online exams’.

Asking participants about helping other students to cheat.

Schools, community colleges/further education, MOOCS.

Cheating in formative exams, or studies that did not distinguish between formative and summative assessment, e.g. quizzes vs. exams (e.g. Alvarez et al., 2022; Costley, 2019).

Estimates of cheating from learning analytics or other methods which did not include directly asking participants if they had cheated.

Published in a predatory journal (see below).

Predatory Journal Criteria

Predatory journals and publishers are defined as “ entities which prioritize self-interest at the expense of scholarship and are characterised by false or misleading information, deviation from best editorial and publication practices, a lack of transparency, and/or the use of aggressive and indiscriminate solicitation practices .” (Grudniewicz et al., 2019 ). The inclusion of predatory journals in literature reviews may therefore have a negative impact on the data, findings and conclusions. We followed established guidelines for the identification and exclusion of predatory journals from the findings (Rice et al., 2021 ):

Each study which met the inclusion criteria was checked for spelling, punctuation and grammar errors as well as logical inconsistencies.

Every included journal was checked against open access criteria;

If the journal was listed on the Directory of Open Access Journals (DOAJ) database (DOAJ.org) then it was considered to be non-predatory.

If the journal was not present in the DOAJ database, we looked for it in the Committee on Publication Ethics (COPE) database (publicationethics.org). If the journal was listed on the COPE database then it was considered to be non-predatory.

Only one paper was flagged by these criteria: it contained logical inconsistencies and its journal was not listed on either DOAJ or COPE. For completeness we also searched an informal list of predatory journals ( https://beallslist.net ) and the journal was listed there. Thus the study was excluded.

All data were extracted by both authors independently. Where the extracted data differed between authors, this was resolved through discussion. Data extracted were, where possible, as follows:

Author/date

Year of Publication

Year study was undertaken. If this was a range (e.g. Nov 2016–Apr 2017) then the most recent year was used as the data point (e.g. 2017 in the example). If it was not reported when the study was undertaken, then we recorded the year that the manuscript was submitted. If none of these data were available then the publication year was entered as the year that the study was undertaken.

Publication type. Peer reviewed journal publication, peer reviewed conference proceedings or dissertation/thesis.

Population size. The total number of participants in the population from which the sample is drawn and which it is supposed to represent. For example, if the study is surveying 'business students at University X', is it clear how many business students are currently at University X?

Number Sampled. The number of potential participants, from the population, who were asked to fill in the survey.

N. The number of survey respondents.

Cheated in online summative examinations. The number of participants who answered 'yes' to having cheated in online exams. Some studies recorded the frequency of cheating on a scale, for example a 1–5 Likert scale from 'always' to 'never'. In these cases, we collapsed all positive reports into a single number of participants who had ever cheated in online exams. Some studies did not ask for a total rate of cheating (i.e. cheating by any/all methods) and so, for analysis purposes, the method with the highest rate of cheating was used (see Results); a short coding sketch of these rules is given after this list.

Group/individual cheating. Where appropriate, the frequency of cheating via different methods was recorded. These were coded according to the highest level of the framework proposed by Noorbehbahani (Noorbehbahani et al., 2022 ), i.e. group vs. individual. More fine-grained analysis was not possible due to the number and nature of the included studies.
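The collapsing and substitution rules above can be expressed compactly in code. The following is a minimal sketch in Python (our illustration, not the extraction scripts actually used in the review); the record structure and field names such as `likert_counts`, `total_cheated` and `per_method_counts` are hypothetical.

```python
# Sketch: derive a single "ever cheated" count per sample, following the
# extraction rules described above. All field names are hypothetical.

def ever_cheated_count(sample: dict) -> int:
    """Return the number of respondents who reported any cheating.

    - If the study used a frequency scale (e.g. 'never' to 'always'),
      collapse every positive response into one 'ever cheated' count.
    - If a total rate across all methods was reported, use it directly.
    - Otherwise, substitute the method with the highest reported rate.
    """
    if "likert_counts" in sample:
        return sum(n for answer, n in sample["likert_counts"].items()
                   if answer != "never")
    if "total_cheated" in sample:
        return sample["total_cheated"]
    return max(sample["per_method_counts"].values())


example = {"n": 60, "likert_counts": {"never": 40, "rarely": 10,
                                      "sometimes": 5, "often": 3, "always": 2}}
print(ever_cheated_count(example) / example["n"])  # 20/60 ≈ 0.33
```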

Study Risk of Bias and Quality metrics

Response rate. Defined as "the percentage of people who completed the survey after being asked to do so" (Halbesleben & Whitman, 2013).

Method of sampling. Recorded as one of the following: convenience sampling, where all members of the population were able to complete the survey, but data were analysed from those who voluntarily completed it; or 'unclassifiable', where it was not possible to determine the sampling method based on the data provided (no other sampling methods were used in the included studies).

Ethics. Was it reported whether ethical/IRB approval had been obtained? (note that a recording of ‘N’ here does not mean that ethical approval was not obtained, just that it is not reported)

Anonymity. Were participants assured that they were answering anonymously? Students who are found to have cheated in exams can be given severe penalties, and so a statement of anonymity (not just confidentiality) is important for obtaining meaningful data.

Synthesis Methods

Data are reported as mean ± SEM unless otherwise stated. Datasets were tested for normal distribution using a Kolmogorov-Smirnov test prior to analysis and parametric tests were used if the data were found to be normally distributed. The details of the specific tests used are in the relevant results section.
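As a rough illustration of this synthesis pipeline (not the analysis scripts actually used in the review), the normality check followed by a parametric comparison might look like the following in Python with SciPy; the percentage values are invented placeholders.

```python
# Sketch of the synthesis steps described above: mean ± SEM,
# Kolmogorov-Smirnov normality check, then a parametric test.
# The percentages below are invented placeholders, not study data.
import numpy as np
from scipy import stats

group_a = np.array([12.0, 25.0, 31.0, 40.0, 18.0, 27.0])  # % cheating, condition A
group_b = np.array([45.0, 60.0, 52.0, 70.0, 38.0, 66.0])  # % cheating, condition B

for name, data in [("A", group_a), ("B", group_b)]:
    sem = data.std(ddof=1) / np.sqrt(len(data))
    # KS test against a normal distribution with the sample's own parameters
    ks_stat, ks_p = stats.kstest(data, "norm", args=(data.mean(), data.std(ddof=1)))
    print(f"Group {name}: {data.mean():.1f} ± {sem:.1f} (KS p = {ks_p:.2f})")

# If both groups look normally distributed, compare them with an unpaired t-test.
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```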

Results

Twenty-five samples were identified from 19 studies, containing a total of 4672 participants. Three studies contained multiple distinct samples from different participants (e.g. data were collected in different years (Case et al., 2019; King & Case, 2014), were split by two different programmes of study (Burgason et al., 2019), or differed by whether exams were proctored or not (Owens, 2015)). Thus, these samples were treated as distinct in the analysis since they represent different participants. Multiple studies asked the same groups of participants about different types of cheating, or about the conditions under which cheating happens; the analysis of these is explained in the relevant results subsection. A summary of the studies is in Table 1. The detail of each individual question asked of study participants is in supplementary online data S2.

Descriptive Metrics of Studies

Sampling Method

23/25 samples were collected using convenience sampling. The remaining two did not provide sufficient information to determine the method of sampling.

Population Size

Only two studies reported the population size.

Sample Size

The average sample size was 188.7 ± 36.16.

Response Rate

Fifteen of the samples did not report sufficient information to allow a response rate to be calculated. The ten remaining samples returned an average response rate of 55.6 ± 10.7%, with a range from 12.2% to 100%.

Eighteen of the 25 samples (72%) stated that participant responses were collected anonymously.

Seven of the 25 samples (28%) reported that ethical approval was obtained for the study.

How Common is Self-Reported Online Exam Cheating in Higher Education?

In total, 44.7% of participants (2088/4672) reported engaging in some form of cheating in online exams. This analysis included those studies where total cheating was not recorded, and so the most commonly reported form of cheating was substituted in. To check the validity of this substitution, a separate analysis was conducted of only those studies where total cheating was recorded. In this case, 42.5% of students (1574/3707) reported engaging in some form of cheating. An unpaired t-test was used to compare the percentage cheating from each group (total vs. highest frequency), and returned no significant difference (t(23) = 0.59, P = 0.56).

Did the Frequency of Online Exam Cheating Increase During COVID?

The samples were classified as having been collected pre-COVID, or during COVID (no samples were identified as having been collected ‘post-COVID’). One study (Jenkins et al., 2022 ) asked the same students about their behaviour before, and during, COVID. For the purposes of this specific analysis, these were included as separate samples, thus there were 26 samples, 17 pre-COVID and 9 during COVID. Pre-COVID, 29.9% (629/2107) of participants reported cheating in online exams. During COVID this figure was 54.7% (1519/2779).

To estimate the variance in these data, and to test whether the difference was statistically significant, the percentages of students who reported cheating in each study were grouped into pre- and during-COVID and the average calculated for each group. The average pre-COVID was 28.03 ± 4.89% (N = 17), whereas during COVID the average was 65.06 ± 9.59% (N = 9). An unpaired t-test was used to compare the groups, and returned a statistically significant difference (t(24) = 3.897, P = 0.0007). The effect size (Hedges' g) was 1.61, indicating that the COVID effect was substantial (Fig. 1).

[Figure 1: Increased self-report of cheating in online exams during the COVID-19 pandemic. Data represent the mean ± SEM of the percentages of students who self-report cheating in online exams pre- and during COVID. *** = P < 0.005, unpaired t-test.]
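For readers who wish to reproduce this kind of comparison on their own data, a minimal sketch follows. The two arrays are placeholders rather than the study-level percentages analysed here, and Hedges' g is computed with the usual small-sample correction.

```python
# Sketch: unpaired t-test plus Hedges' g for two sets of study-level
# cheating percentages. The numbers are placeholders, not the review data.
import numpy as np
from scipy import stats

pre_covid = np.array([5.0, 12.0, 20.0, 28.0, 33.0, 41.0, 19.0, 25.0])
during_covid = np.array([40.0, 55.0, 63.0, 72.0, 81.0, 58.0])

def hedges_g(a: np.ndarray, b: np.ndarray) -> float:
    """Standardised mean difference with Hedges' small-sample correction."""
    n1, n2 = len(a), len(b)
    pooled_sd = np.sqrt(((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1))
                        / (n1 + n2 - 2))
    d = (b.mean() - a.mean()) / pooled_sd          # Cohen's d
    correction = 1 - 3 / (4 * (n1 + n2) - 9)       # small-sample correction
    return d * correction

t_stat, p_value = stats.ttest_ind(pre_covid, during_covid)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}, g = {hedges_g(pre_covid, during_covid):.2f}")
```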

To test the reliability of this result, we conducted a split-sample test as in other systematic reviews of the prevalence of academic misconduct (Newton, 2018), wherein the data for each group were ordered by size and then every other sample was extracted into a separate group. So, the sample with the lowest frequency of cheating was allocated to Group A, the next smallest to Group B, the next to Group A, and so on. This was conducted separately for the pre-COVID and during-COVID groups. Each half-group was then subjected to an unpaired t-test to determine whether cheating increased during COVID in that group. Each returned a significant difference (t(10) = 2.889, P = 0.016 for odd-numbered samples; t(12) = 2.48, P = 0.029 for even-numbered samples). This analysis gives confidence that the observed increase in self-reported online exam cheating during the pandemic is statistically robust, although there may be other variables which contribute to it (see Discussion).
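A minimal sketch of this split-sample procedure, using placeholder values rather than the review data, is given below.

```python
# Sketch: split-sample robustness check as described above.
# Order each group's study-level percentages, allocate alternating samples
# to sub-groups A and B, then re-run the comparison within each half.
from scipy import stats

def alternate_split(values):
    ordered = sorted(values)
    return ordered[0::2], ordered[1::2]   # sub-group A, sub-group B

# Placeholder percentages, not the 17 + 9 samples analysed in the review.
pre_covid = [5, 12, 19, 20, 25, 28, 33, 41]
during_covid = [40, 55, 58, 63, 72, 81]

pre_a, pre_b = alternate_split(pre_covid)
dur_a, dur_b = alternate_split(during_covid)

for label, pre, dur in [("A", pre_a, dur_a), ("B", pre_b, dur_b)]:
    t, p = stats.ttest_ind(pre, dur)
    print(f"Half {label}: t = {t:.2f}, p = {p:.3f}")
```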

Comparison of Group vs. Individual Online Exam Cheating in Higher Education

In order to consider how best to address cheating in online exams, it is important to understand the specific behaviours of students. Many studies asked multiple questions about different types of cheating, and these were coded according to the typology developed by Noorbehbahani, which has a high-level code of 'individual' and 'group' (Noorbehbahani et al., 2022). More fine-grained coding was not possible due to the variance in the types of questions asked of participants (see S2). 'Individual' cheating meant that, whatever the type of cheating, it could be achieved without the direct help of another person. This could be looking at notes or textbooks, or searching for materials online. 'Group' cheating meant that another person was directly involved, for example by sharing answers, or having them sit the exam on behalf of the participant (contract cheating). Seven studies asked their participants whether they had engaged in different forms of cheating where both formats (Group and Individual) were represented. For each study we ranked all the different forms of cheating by the frequency with which participants reported engaging in them. For all seven of the studies which asked about both Group and Individual cheating, the most frequently reported cheating behaviour was an Individual cheating behaviour. For each study we then calculated the difference between the two by subtracting the frequency of the most commonly reported Group cheating behaviour from the frequency of the most commonly reported Individual cheating behaviour. The average difference was 23.3 ± 8.0 percentage points. These two analyses indicate that individual forms of cheating are more common than cheating which involves other people.
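This per-study comparison reduces to ranking the reported rates and differencing the most common Individual and the most common Group behaviour; a small sketch with hypothetical study data follows.

```python
# Sketch: for each study, compare the most common Individual cheating
# behaviour with the most common Group behaviour. Data are hypothetical.
studies = [
    {"individual": {"notes": 35.0, "web search": 42.0},
     "group": {"shared answers": 20.0, "impersonation": 3.0}},
    {"individual": {"textbook": 28.0},
     "group": {"shared answers": 15.0}},
]

differences = []
for study in studies:
    top_individual = max(study["individual"].values())
    top_group = max(study["group"].values())
    differences.append(top_individual - top_group)

mean_diff = sum(differences) / len(differences)
print(f"Mean difference (Individual - Group): {mean_diff:.1f} percentage points")
```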

Effect of Proctoring/Lockdown Browsers

The majority of studies did not make clear whether their online exams were proctored or unproctored, or whether they involved the use of related software such as lockdown browsers. Thus it was difficult to conduct definitive analyses to address the question of whether these systems reduce online exam cheating. Two studies did specifically address this issue, and in both cases there was a substantially lower rate of self-reported cheating where proctoring systems were used. Jenkins et al., in a study conducted during COVID, asked participants whether their instructors used 'anti cheating software (e.g., Lockdown Browser)' and, if so, whether they had tried to circumvent it. 16.5% of participants admitted to doing this, compared to the overall rate of cheating of 58.4%. Owens asked about an extensive range of different forms of misconduct in two groups of students whose online exams were either proctored or unproctored. The total rates of cheating in each group did not appear to be reported. The most common form of cheating was the same in both groups ('web search during an exam') and was reported by 39.8% of students in the unproctored group but by only 8.5% in the proctored group (Owens, 2015).

Reasons Given for Online Exam Cheating

Ten of the studies asked students why they cheated in online exams. These reasons were initially coded by both authors according to the typology provided by Noorbehbahani et al. (2022). Following discussion between the authors, the typology was revised slightly to that shown in Table 1, to better reflect the reasons given in the reviewed studies.

Descriptive statistics (the percentages of students reporting the different reasons as motivations for cheating) are shown in Table 2. Direct comparison between the reasons is not fully valid since different studies asked about different options, and some studies offered multiple options whereas others only identified one. However, in the four studies that offered multiple options to students, three ranked 'opportunities to cheat' as the most common reason (and the fourth study did not have this as an option). Thus students appear to be most likely to cheat in online exams when there is an opportunity to do so.

Discussion

We reviewed data from 19 studies, including 25 samples totaling 4672 participants. We found that a substantial proportion of students, 44.7%, were willing to admit to cheating in online summative exams. This total number masks a finding that cheating in online exams appeared to increase considerably during the COVID-19 pandemic, from 29.9% to 54.7%. These are concerning findings. However, there are a number of methodological considerations which influence the interpretation of these data. These considerations all lead to uncertainty regarding the accuracy of the findings, although a common theme is that, unfortunately, the issues highlighted seem likely to result in an under-reporting of the rate of cheating in online exams.

There are numerous potential sources of error in survey-based research, and these may be amplified where the research is asking participants to report on sensitive or undesirable behaviours. One of these sources of error comes from non-respondents, i.e. how confident can we be that those who did not respond to the survey would have given a similar pattern of responses to those that did (Goyder et al., 2002; Halbesleben & Whitman, 2013; Sax et al., 2003)? Two ways to minimize non-respondent error are to increase the sample size as a percentage of the population, and then simply to maximise the percentage of the invited sample who respond to the survey. However, only nine of the samples reported sufficient information to even allow the calculation of a response rate, and only two reported the total population size. Thus for the majority of samples reported here, we cannot even begin to estimate the extent of the non-response error. For those that did report sufficient information, the response rate varied considerably, from 12.2% to 100%, with an average of 55.6%. Thus a substantial number of the possible participants did not respond.

Most of the surveys reviewed here were conducted using convenience sampling, i.e. participation was voluntary and there was no attempt to ensure that the sample was representative, or that the non-respondents were followed up in a targeted way to increase the representativeness of the sample. People who voluntarily respond to survey research are, compared to the general population, older, wealthier, more likely to be female and educated (Curtin et al., 2000 ). In contrast, individuals who engage in academic misconduct are more likely to be male, younger, from a lower socioeconomic background and less academically able (reviewed in Newton, 2018 ). Thus the features of the survey research here would suggest that the rates of online exam cheating are under-reported.

A second source of error is measurement error: for example, how likely is it that those participants who do respond are telling the truth? Cheating in online exams is clearly a sensitive subject for potential survey participants. Students who are caught cheating in exams can face severe penalties. Measurement error can be substantial when asking participants about sensitive topics, particularly when they have no incentive to respond truthfully. Curtis et al. conducted an elegant study to investigate rates of different types of contract cheating and found that rates were substantially higher when participants were incentivized to tell the truth, compared to traditional self-report (Curtis et al., 2022). Another method to increase truthfulness is to use a Randomised Response Technique, which increases participants' confidence that their data will be truly anonymous when self-reporting cheating (Mortaz Hejri et al., 2013) and so leads to increased estimates of the prevalence of cheating behaviours when measured via self-report (Kerkvliet, 1994; Scheers & Dayton, 1987). No studies reviewed here reported any incentivization or use of a randomized response technique, and many did not report IRB (ethical) approval or that participants were guaranteed anonymity in their responses. Absence of evidence is not evidence of absence, but it again seems reasonable to conclude that the majority of the measurement error reported here will also lead to an under-reporting of the extent of online exam cheating.

However, there are many variables associated with the likelihood of committing academic misconduct (also reviewed in Newton, 2018). For example, in addition to the aforementioned variables, cheating is also associated with individual differences such as personality traits (Giluk & Postlethwaite, 2015; Williams & Williams, 2012), motivation (Park et al., 2013), age and gender (Newstead et al., 1996) and studying in a second language (Bretag et al., 2019), as well as situational variables such as discipline studied (Newstead et al., 1996). None of the studies reviewed here can account for these individual variables, and this perhaps partly explains the wide variance in the studies reported, where the percentage of students willing to admit to cheating in online exams ranges from essentially none to all students in different studies. However, almost all of the variables associated with differences in the likelihood of committing academic misconduct were themselves determined using convenience sampling. In order to begin to understand the true nature, scale and scope of academic misconduct, there is a clear need for studies using large, representative samples, with appropriate methodology to account for non-respondents, and rigorous analyses which attempt to identify those variables associated with an increased likelihood of cheating.

There are some specific issues which must be considered when determining the accuracy of the data showing an increase in cheating during COVID. In general, the pre-COVID group appears to be a more homogeneous set of samples; for example, 11 of the 16 samples are from students studying business, and 15 of the 16 pre-COVID samples are from the USA. The during-COVID samples are from a much more diverse range of disciplines and countries. However, the increase in self-reported cheating was replicated in the one study which directly asked students about their behaviour before, and during, the pandemic; Jenkins and co-workers found that 28.4% of respondents were cheating pre-COVID, nearly doubling to 58.4% during the pandemic (Jenkins et al., 2022), very closely mirroring the aggregate results.

There are some other variables which may be different between the studies and so affect the overall interpretation of the findings. For example, the specific questions asked of participants, as shown in the supplemental online material ( S2 ) reveal that most studies do not report on the specific type of exam (e.g. multiple choice vs. essay based), or the exam duration, weighting, or educational level. This is likely because the studies survey groups of students, across programmes. Having a more detailed understanding of these factors would also inform strategies to address cheating in online exams.

It is difficult to quantify the potential impact of these issues on the accuracy of the data analysed here, since objective measures of cheating in online exams are difficult to obtain in higher education settings. One way to achieve this is to set up traps for students taking closed-book exams. One study tested this using a 2.5 h online exam administered for participants to obtain credit from a MOOC. The exam was set up so that participants would "likely not benefit from having access to third-party reference materials during the exam". Students were instructed not to access any additional materials or to communicate with others during the exam. The authors built a 'honeypot' website which contained all of the exam questions, each with a 'click to show answer' button. If exam participants went online and clicked that button, the site collected information which allowed the researchers to identify the unique ID of the test-taker. This approach was combined with a more traditional analysis of the originality of the free-text portions of the exam. Using these methods, the researchers estimated that ~30% of students were cheating (Corrigan-Gibbs et al., 2015b). This study was conducted in 2014–15, and the data align reasonably well with the pre-COVID estimates of cheating found here, giving some confidence that the self-report measures reported here are in the same ballpark as objective measures, albeit from only one study.

The challenges of interpreting data from small convenience samples will also affect the analysis of the other measures made here: that students are more likely to commit misconduct on their own, because they can. The overall pattern of findings, though, does align somewhat, suggesting that concerns may be with the accuracy of the numbers rather than a fundamental qualitative problem (i.e. it seems reasonable to conclude that students are more likely to cheat individually, but it is challenging to put a precise number to that finding). For example, the apparent increase in cheating during COVID is associated with a rapid and near-total transition to online exams. Pre-COVID, the use of online exams would have been a choice made by education providers, presumably with some efforts to ensure the security and integrity of that assessment. During COVID lockdown, the scale and speed of the transition to online exams made it much more challenging to put security measures in place, and this would therefore almost certainly have increased the opportunities to cheat.

It was challenging to gather more detail about the specific types of cheating behaviour, due to the considerable heterogeneity between the studies regarding this question. The sector would benefit from future large-scale research using a recognized typology, for example those proposed by Dawson (Dawson, 2020 , p. 112) or Noorbehbahani (Noorbehbahani et al., 2022 ).

Another important recommendation that will help the sector in addressing the problem is for future survey-based research on student dishonesty to make use of the abundant methodological research undertaken to increase the accuracy of such surveys, in particular the use of representative sampling, or analysis methods which account for the challenges posed by unrepresentative samples. Data quality could also be improved by the use of question formats and survey structures which motivate or incentivize truth-telling, for example methods such as the Randomised Response Technique, which increase participant confidence that their responses will be truly anonymous. It would also be helpful to report on key methodological features of survey design: pilot testing, scaling, reliability and validity, although these are commonly underreported in survey-based research generally (Bennett et al., 2011).

Thus an aggregate portrayal of the findings here is that students are committing misconduct in significant numbers, and that this has increased considerably during COVID. Students appear to be more likely to cheat on their own, rather than in groups, and most commonly motivated by the simple fact that they can cheat. Do these findings and the underlying data give us any information that might be helpful in addressing the problem?

One technique deployed by many universities to address multiple forms of online exam cheating is to increase the use of remote proctoring, wherein student behaviour during online exams is monitored, for example, through a webcam, and/or their online activity is monitored or restricted. We were unable to draw definitive conclusions about the effectiveness of remote proctoring or other software such as lockdown browsers in reducing cheating in online exams, since very few studies stated definitively that the exams were, or were not, proctored. The two studies that examined this question did appear to show a substantial reduction in the frequency of cheating when proctoring was used. Confidence in these results is bolstered by the fact that these studies both directly compared unproctored vs. proctored/lockdown-browser conditions. Other studies have used proxy measures for cheating, such as time engaged with the exam and changes in exam scores, and these studies have also found evidence for a reduction in misconduct when proctoring is used (e.g. Dendir & Maxwell, 2020).

The effectiveness (or not) of remote proctoring in reducing academic misconduct seems like an important area for future research. However, there is considerable controversy about the use of remote proctoring, including legal challenges to its use and considerable objections from students, who report a net negative experience, fuelled by concerns about privacy, fairness and technological challenges (Marano et al., 2023), and so it remains an open question whether this is a viable option for widespread general use.

Honour codes are a commonly cited approach to promoting academic integrity, and so (in theory) reducing academic misconduct. However, empirical tests of honour codes show that they do not appear to be effective at reducing cheating in online exams (Corrigan-Gibbs et al., 2015a , b ). In these studies the authors likened them to ‘terms and conditions’ for online sites, which are largely disregarded by users in online environments. However in those same studies the authors found that replacing an honour code with a more sternly worded ‘warning’, which specifies the consequences of being caught, was effective at reducing cheating. Thus a warning may be a simple, low-cost intervention to reduce cheating in online exams, whose effectiveness could be studied using appropriately conducted surveys of the type reviewed here.

Another option to reduce cheating in online exams is to use open-book exams. This is often suggested as a way of simultaneously increasing the cognitive level of the exam (i.e. it assesses higher-order learning) (e.g. Varble, 2014), and was suggested as a way of reducing the perceived or potential increase in academic misconduct during COVID (e.g. Nguyen et al., 2020; Whisenhunt et al., 2022). This approach has an obvious appeal in that it eliminates the possibility of some common forms of misconduct, such as the use of notes or unauthorized web access (Noorbehbahani et al., 2022; Whisenhunt et al., 2022), and can even make this a positive feature, i.e. encouraging the use of additional resources in a way that reflects the fact that, for many future careers, students will have access to unlimited information at their fingertips, and the challenge is to ensure that students have learned what information they need and how to use it. This approach certainly fits with our data, wherein the most frequently reported types of misconduct involved students acting alone, and cheating 'because they could'. Some form of proctoring or other measure may still be needed in order to reduce the threat of collaborative misconduct. Perhaps most importantly though, it is unclear whether open-book exams truly reduce the opportunity for, and the incidence of, academic misconduct, and if so, how we might advise educators to design their exams, and exam questions, in a way that delivers this as well as the promise of 'higher order' learning. These questions are the subject of ongoing research.

In summary, then, there appear to be significant levels of misconduct in online examinations in Higher Education. Students appear to be more likely to cheat on their own, motivated by an examination design and delivery which makes it easy for them to do so. Future research in academic integrity would benefit from large, representative samples using clear and unambiguous survey questions and guarantees of anonymity. This will allow us to get a much better picture of the size and nature of the problem, and so design strategies to mitigate the threat that cheating poses to exam validity.

Alvarez, H. T., Dayrit, R. S., Dela Cruz, M. C. A., Jocson, C. C., Mendoza, R. T., Reyes, A. V., & Salas, J. N. N. (2022). Academic dishonesty cheating in synchronous and asynchronous classes: A proctored examination intervention. International Research Journal of Science, Technology, Education, and Management, 2(1).

Armstrong-Mensah, E., Ramsey-White, K., Yankey, B., & Self-Brown, S. (2020). COVID-19 and distance learning: Effects on Georgia State University School of Public Health students. Frontiers in Public Health, 8. https://doi.org/10.3389/fpubh.2020.576227

Barber, M., Bird, L., Fleming, J., Titterington-Giles, E., Edwards, E., & Leyland, C. (2021). Gravity assist: Propelling higher education towards a brighter future. Office for Students. https://www.officeforstudents.org.uk/publications/gravity-assist-propelling-higher-education-towards-a-brighter-future/

Bennett, C., Khangura, S., Brehaut, J. C., Graham, I. D., Moher, D., Potter, B. K., & Grimshaw, J. M. (2011). Reporting guidelines for Survey Research: An analysis of published Guidance and Reporting Practices. PLOS Medicine , 8 (8), e1001069. https://doi.org/10.1371/journal.pmed.1001069 .


Björk, B. C., & Solomon, D. (2013). The publishing delay in scholarly peer-reviewed journals. Journal of Informetrics , 7 (4), 914–923. https://doi.org/10.1016/j.joi.2013.09.001 .

Blinova, O. (2022). What COVID taught us about assessment: Students' perceptions of academic integrity in distance learning. INTED2022 Proceedings, 6214–6218. https://doi.org/10.21125/inted.2022.1576

Boeker, M., Vach, W., & Motschall, E. (2013). Google Scholar as replacement for systematic literature searches: Good relative recall and precision are not enough. BMC Medical Research Methodology , 13 , 131. https://doi.org/10.1186/1471-2288-13-131 .

Bowman, E. (2022, August 26). Scanning students’ rooms during remote tests is unconstitutional, judge rules. NPR . https://www.npr.org/2022/08/25/1119337956/test-proctoring-room-scans-unconstitutional-cleveland-state-university .

Bretag, T., Harper, R., Burton, M., Ellis, C., Newton, P., Rozenberg, P., Saddiqui, S., & van Haeringen, K. (2019). Contract cheating: A survey of Australian university students. Studies in Higher Education, 44(11), 1837–1856. https://doi.org/10.1080/03075079.2018.1462788

Brown, M., Hoon, A., Edwards, M., Shabu, S., Okoronkwo, I., & Newton, P. M. (2022). A pragmatic evaluation of university student experience of remote digital learning during the COVID-19 pandemic, focusing on lessons learned for future practice. EdArXiv . https://doi.org/10.35542/osf.io/62hz5 .

Burgason, K. A., Sefiha, O., & Briggs, L. (2019). Cheating is in the Eye of the beholder: An evolving understanding of academic misconduct. Innovative Higher Education , 44 (3), 203–218. https://doi.org/10.1007/s10755-019-9457-3 .

Butler-Henderson, K., & Crawford, J. (2020). A systematic review of online examinations: A pedagogical innovation for scalable authentication and integrity. Computers & Education , 159 , 104024. https://doi.org/10.1016/j.compedu.2020.104024 .

Case, C. J., King, D. L., & Case, J. A. (2019). E-Cheating and Undergraduate Business Students: Trends and Role of Gender. Journal of Business and Behavioral Sciences , 31 (1). https://www.proquest.com/openview/9fcc44254e8d6d202086fc58818fab5d/1?pq-origsite=gscholar&cbl=2030637 .

Chiang, F. K., Zhu, D., & Yu, W. (2022). A systematic review of academic dishonesty in online learning environments. Journal of Computer Assisted Learning , 38 (4), 907–928. https://doi.org/10.1111/jcal.12656 .

Corbyn, Z. (2022, August 26). ‘I’m afraid’: Critics of anti-cheating technology for students hit by lawsuits. The Guardian . https://www.theguardian.com/us-news/2022/aug/26/anti-cheating-technology-students-tests-proctorio .

Corrigan-Gibbs, H., Gupta, N., Northcutt, C., Cutrell, E., & Thies, W. (2015a). Measuring and maximizing the effectiveness of Honor Codes in Online Courses. Proceedings of the Second (2015) ACM Conference on Learning @ Scale , 223–228. https://doi.org/10.1145/2724660.2728663 .

Corrigan-Gibbs, H., Gupta, N., Northcutt, C., Cutrell, E., & Thies, W. (2015b). Deterring cheating in online environments. ACM Transactions on Computer-Human Interaction, 22(6), 28:1–28:23. https://doi.org/10.1145/2810239

Costley, J. (2019). Student perceptions of academic dishonesty at a Cyber-University in South Korea. Journal of Academic Ethics , 17 (2), 205–217. https://doi.org/10.1007/s10805-018-9318-1 .

Curtin, R., Presser, S., & Singer, E. (2000). The Effects of Response Rate Changes on the index of consumer sentiment. Public Opinion Quarterly , 64 (4), 413–428. https://doi.org/10.1086/318638 .

Curtis, G. J., McNeill, M., Slade, C., Tremayne, K., Harper, R., Rundle, K., & Greenaway, R. (2022). Moving beyond self-reports to estimate the prevalence of commercial contract cheating: An Australian study. Studies in Higher Education, 47(9), 1844–1856. https://doi.org/10.1080/03075079.2021.1972093

Dawson, R. J. (2020). Defending Assessment Security in a Digital World: Preventing E-Cheating and Supporting Academic Integrity in Higher Education (1st ed.). Routledge. https://www.routledge.com/Defending-Assessment-Security-in-a-Digital-World-Preventing-E-Cheating/Dawson/p/book/9780367341527 .

Dendir, S., & Maxwell, R. S. (2020). Cheating in online courses: Evidence from online proctoring. Computers in Human Behavior Reports , 2 , 100033. https://doi.org/10.1016/j.chbr.2020.100033 .

Dumulescu, D., & Muţiu, A. I. (2021). Academic leadership in the time of COVID-19—Experiences and perspectives. Frontiers in Psychology, 12. https://doi.org/10.3389/fpsyg.2021.648344

Ebaid, I. E. S. (2021). Cheating among Accounting Students in Online Exams during Covid-19 pandemic: Exploratory evidence from Saudi Arabia. Asian Journal of Economics Finance and Management , 9–19.

Elsalem, L., Al-Azzam, N., Jum’ah, A. A., & Obeidat, N. (2021). Remote E-exams during Covid-19 pandemic: A cross-sectional study of students’ preferences and academic dishonesty in faculties of medical sciences. Annals of Medicine and Surgery , 62 , 326–333. https://doi.org/10.1016/j.amsu.2021.01.054 .

Fanelli, D. (2009). How many scientists fabricate and falsify Research? A systematic review and Meta-analysis of Survey Data. PLOS ONE , 4 (5), e5738. https://doi.org/10.1371/journal.pone.0005738 .

Gardner, W. M., Roper, J. T., Gonzalez, C. C., & Simpson, R. G. (1988). Analysis of cheating on academic assignments. The Psychological Record , 38 (4), 543–555. https://doi.org/10.1007/BF03395046 .

Garg, M., & Goel, A. (2022). A systematic literature review on online assessment security: Current challenges and integrity strategies. Computers & Security , 113 , 102544. https://doi.org/10.1016/j.cose.2021.102544 .

Gaskill, M. (2014). Cheating in Business Online Learning: Exploring Students’ Motivation, Current Practices and Possible Solutions. Theses, Student Research, and Creative Activity: Department of Teaching, Learning and Teacher Education . https://digitalcommons.unl.edu/teachlearnstudent/35 .

Ghias, K., Lakho, G. R., Asim, H., Azam, I. S., & Saeed, S. A. (2014). Self-reported attitudes and behaviours of medical students in Pakistan regarding academic misconduct: A cross-sectional study. BMC Medical Ethics , 15 (1), 43. https://doi.org/10.1186/1472-6939-15-43 .

Giluk, T. L., & Postlethwaite, B. E. (2015). Big five personality and academic dishonesty: A meta-analytic review. Personality and Individual Differences , 72 , 59–67. https://doi.org/10.1016/j.paid.2014.08.027 .

Goff, D., Johnston, J., & Bouboulis, B. (2020). Maintaining academic Standards and Integrity in Online Business Courses. International Journal of Higher Education , 9 (2). https://econpapers.repec.org/article/jfrijhe11/v_3a9_3ay_3a2020_3ai_3a2_3ap_3a248.htm .

Goyder, J., Warriner, K., & Miller, S. (2002). Evaluating Socio-economic status (SES) Bias in Survey Nonresponse. Journal of Official Statistics , 18 (1), 1–11.

Grudniewicz, A., Moher, D., Cobey, K. D., Bryson, G. L., Cukier, S., Allen, K., Ardern, C., Balcom, L., Barros, T., Berger, M., Ciro, J. B., Cugusi, L., Donaldson, M. R., Egger, M., Graham, I. D., Hodgkinson, M., Khan, K. M., Mabizela, M., Manca, A., & Lalu, M. M. (2019). Predatory journals: No definition, no defence. Nature , 576 (7786), 210–212. https://doi.org/10.1038/d41586-019-03759-y .

Haddaway, N. R., Collins, A. M., Coughlin, D., & Kirk, S. (2015). The role of Google Scholar in evidence reviews and its applicability to Grey Literature Searching. Plos One , 10 (9), https://doi.org/10.1371/journal.pone.0138237 .

Halbesleben, J. R. B., & Whitman, M. V. (2013). Evaluating Survey Quality in Health Services Research: A decision Framework for assessing Nonresponse Bias. Health Services Research , 48 (3), 913–930. https://doi.org/10.1111/1475-6773.12002 .

Henry, J. (2022, July 17). Universities “turn blind eye to online exam cheats” as fraud rises. Mail Online . https://www.dailymail.co.uk/news/article-11021269/Universities-turning-blind-eye-online-exam-cheats-studies-rates-fraud-risen.html .

Holden, O. L., Norris, M. E., & Kuhlmeier, V. A. (2021). Academic Integrity in Online Assessment: A Research Review. Frontiers in Education, 6. https://doi.org/10.3389/feduc.2021.639814 .

Jamali, H. R., & Nabavi, M. (2015). Open access and sources of full-text articles in Google Scholar in different subject fields. Scientometrics , 105 (3), 1635–1651. https://doi.org/10.1007/s11192-015-1642-2 .

Janke, S., Rudert, S. C., Petersen, Ä., Fritz, T. M., & Daumiller, M. (2021). Cheating in the wake of COVID-19: How dangerous is ad-hoc online testing for academic integrity? Computers and Education Open , 2 , 100055. https://doi.org/10.1016/j.caeo.2021.100055 .

Jantos, A. (2021). Motives for cheating in summative e-assessment in higher education - A quantitative analysis. EDULEARN21 Proceedings, 8766–8776. https://doi.org/10.21125/edulearn.2021.1764 .

Jenkins, B. D., Golding, J. M., Le Grand, A. M., Levi, M. M., & Pals, A. M. (2022). When Opportunity knocks: College Students’ cheating amid the COVID-19 pandemic. Teaching of Psychology, 00986283211059067. https://doi.org/10.1177/00986283211059067 .

Jones, I. S., Blankenship, D., & Hollier, G. (2013). Am I cheating? An analysis of online student perceptions of their behaviors and attitudes. Proceedings of ASBBS, 59–69. http://asbbs.org/files/ASBBS2013V1/PDF/J/Jones_Blankenship_Hollier(P59-69).pdf .

Kerkvliet, J. (1994). Cheating by Economics students: A comparison of Survey results. The Journal of Economic Education , 25 (2), 121–133. https://doi.org/10.1080/00220485.1994.10844821 .

King, D. L., & Case, C. J. (2014). E-cheating: Incidence and trends among college students. Issues in Information Systems, 15 (1), 20–27. https://doi.org/10.48009/1_iis_2014_20-27 .

Knox, P. (2021). Students “taking it in turns to answer exam questions” during home tests. The Sun . https://www.thesun.co.uk/news/15413811/students-taking-turns-exam-questions-cheating-lockdown/ .

Larkin, C., & Mintu-Wimsatt, A. (2015). Comparing cheating behaviors among graduate and undergraduate online business students. Journal of Higher Education Theory and Practice, 15 (7), 54–62.

Marano, E., Newton, P. M., Birch, Z., Croombs, M., Gilbert, C., & Draper, M. J. (2023). What is the Student Experience of Remote Proctoring? A Pragmatic Scoping Review . EdArXiv. https://doi.org/10.35542/osf.io/jrgw9 .

Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., & The PRISMA Group. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLOS Medicine, 6 (7), e1000097. https://doi.org/10.1371/journal.pmed.1000097 .

Morales-Martinez, G. E., Lopez-Ramirez, E. O., & Mezquita-Hoyos, Y. N. (2019). Cognitive mechanisms underlying the engineering students’ desire to cheat during online and onsite statistics exams, 8 (4), 1145–1158.

Mortaz Hejri, S., Zendehdel, K., Asghari, F., Fotouhi, A., & Rashidian, A. (2013). Academic disintegrity among medical students: A randomised response technique study. Medical Education , 47 (2), 144–153. https://doi.org/10.1111/medu.12085 .

Newstead, S. E., Franklyn-Stokes, A., & Armstead, P. (1996). Individual differences in student cheating. Journal of Educational Psychology , 88 , 229–241. https://doi.org/10.1037/0022-0663.88.2.229 .

Newton, P. M. (2018). How Common Is Commercial Contract Cheating in Higher Education and Is It Increasing? A Systematic Review. Frontiers in Education, 3. https://doi.org/10.3389/feduc.2018.00067 .

Newton, P. M., & Salvi, A. (2020). How Common Is Belief in the Learning Styles Neuromyth, and Does It Matter? A Pragmatic Systematic Review. Frontiers in Education , 5 . https://doi.org/10.3389/feduc.2020.602451 .

Nguyen, J. G., Keuseman, K. J., & Humston, J. J. (2020). Minimize Online cheating for online assessments during COVID-19 pandemic. Journal of Chemical Education , 97 (9), 3429–3435. https://doi.org/10.1021/acs.jchemed.0c00790 .

Noorbehbahani, F., Mohammadi, A., & Aminazadeh, M. (2022). A systematic review of research on cheating in online exams from 2010 to 2021. Education and Information Technologies . https://doi.org/10.1007/s10639-022-10927-7 .

Owens, H. (2015). Cheating within Online Assessments: A Comparison of Cheating Behaviors in Proctored and Unproctored Environment. Theses and Dissertations . https://scholarsjunction.msstate.edu/td/1049 .

Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., Shamseer, L., Tetzlaff, J. M., Akl, E. A., Brennan, S. E., Chou, R., Glanville, J., Grimshaw, J. M., Hróbjartsson, A., Lalu, M. M., Li, T., Loder, E. W., Mayo-Wilson, E., McDonald, S., & Moher, D. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ, 372, n71. https://doi.org/10.1136/bmj.n71 .

Park, E. J., Park, S., & Jang, I. S. (2013). Academic cheating among nursing students. Nurse Education Today , 33 (4), 346–352. https://doi.org/10.1016/j.nedt.2012.12.015 .

Pokhrel, S., & Chhetri, R. (2021). A Literature Review on Impact of COVID-19 pandemic on teaching and learning. Higher Education for the Future , 8 (1), 133–141. https://doi.org/10.1177/2347631120983481 .

Rice, D. B., Skidmore, B., & Cobey, K. D. (2021). Dealing with predatory journal articles captured in systematic reviews. Systematic Reviews , 10 , 175. https://doi.org/10.1186/s13643-021-01733-2 .

Romaniuk, M. W., & Łukasiewicz-Wieleba, J. (2022). Remote and stationary examinations in the opinion of students. International Journal of Electronics and Telecommunications , 68 (1), 69.

Sax, L. J., Gilmartin, S. K., & Bryant, A. N. (2003). Assessing response Rates and Nonresponse Bias in web and paper surveys. Research in Higher Education , 44 (4), 409–432. https://doi.org/10.1023/A:1024232915870 .

Scheers, N. J., & Dayton, C. M. (1987). Improved estimation of academic cheating behavior using the randomized response technique. Research in Higher Education , 26 (1), 61–69. https://doi.org/10.1007/BF00991933 .

Shute, V. J., & Kim, Y. J. (2014). Formative and Stealth Assessment. In J. M. Spector, M. D. Merrill, J. Elen, & M. J. Bishop (Eds.), Handbook of Research on Educational Communications and Technology (pp. 311–321). Springer. https://doi.org/10.1007/978-1-4614-3185-5_25 .

Subotic, D., & Poscic, P. (2014). Academic dishonesty in a partially online environment: A survey. Proceedings of the 15th International Conference on Computer Systems and Technologies , 401–408. https://doi.org/10.1145/2659532.2659601 .

Surahman, E., & Wang, T. H. (2022). Academic dishonesty and trustworthy assessment in online learning: A systematic literature review. Journal of Computer Assisted Learning. https://doi.org/10.1111/jcal.12708 .

Tahsin, M. U., Abeer, I. A., & Ahmed, N. (2022). Note: Cheating and Morality Problems in the Tertiary Education Level: A COVID-19 Perspective in Bangladesh. ACM SIGCAS/SIGCHI Conference on Computing and Sustainable Societies (COMPASS) , 589–595. https://doi.org/10.1145/3530190.3534834 .

Valizadeh, M. (2022). Cheating in online learning programs: Learners’ perceptions and solutions. Turkish Online Journal of Distance Education, 23 (1). https://doi.org/10.17718/tojde.1050394 .

Varble, D. (2014). Reducing Cheating Opportunities in Online Test. Atlantic Marketing Journal , 3 (3). https://digitalcommons.kennesaw.edu/amj/vol3/iss3/9 .

Whisenhunt, B. L., Cathey, C. L., Hudson, D. L., & Needy, L. M. (2022). Maximizing learning while minimizing cheating: New evidence and advice for online multiple-choice exams. Scholarship of Teaching and Learning in Psychology , 8 (2), 140–153. https://doi.org/10.1037/stl0000242 .

Williams, M. W. M., & Williams, M. N. (2012). Academic dishonesty, Self-Control, and General Criminality: A prospective and retrospective study of academic dishonesty in a New Zealand University. Ethics & Behavior , 22 (2), 89–112. https://doi.org/10.1080/10508422.2011.653291 .

Witley, S. (2023). Virtual Exam Case Primes Privacy Fight on College Room Scans (1). https://news.bloomberglaw.com/privacy-and-data-security/virtual-exam-case-primes-privacy-fight-over-college-room-scans?context=search&index=1 .

Zarzycka, E., Krasodomska, J., Mazurczak-Mąka, A., & Turek-Radwan, M. (2021). Distance learning during the COVID-19 pandemic: Students’ communication and collaboration and the role of social media. Cogent Arts & Humanities , 8 (1), 1953228. https://doi.org/10.1080/23311983.2021.1953228 .

Acknowledgements

We would like to acknowledge the efforts of all the researchers whose work was reviewed as part of this study, and their participants who gave up their time to generate the data reviewed here. We are especially grateful to Professor Carl Case at St Bonaventure University, NY, USA for his assistance clarifying the numbers of students who undertook online exams in King and Case (2014) and Case et al. (2019).

No funds, grants, or other support was received.

Author information

Authors and Affiliations

Swansea University Medical School, Swansea, SA2 8PP, Wales, UK

Philip M. Newton & Keioni Essex

Contributions

PMN designed the study. PMN + KE independently searched for studies and extracted data. PMN analysed data and wrote the results. KE checked analysis. PMN + KE drafted the introduction and methods. PMN wrote the discussion and finalised the manuscript.

Corresponding author

Correspondence to Philip M. Newton.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Ethics approval and consent to participate

This paper involved secondary analysis of data already in the public domain, and so ethical approval was not necessary.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic Supplementary Material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Rights and Permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Newton, P.M., Essex, K. How Common is Cheating in Online Exams and did it Increase During the COVID-19 Pandemic? A Systematic Review. J Acad Ethics 22, 323–343 (2024). https://doi.org/10.1007/s10805-023-09485-5

Accepted: 17 June 2023

Published: 04 August 2023

Issue Date: June 2024

DOI: https://doi.org/10.1007/s10805-023-09485-5

Keywords

  • Academic Integrity
  • Distance Learning
  • Digital Education

Should coursework be included in exams?

As the exams regulator prepared to end the arrangement whereby coursework counts towards grades for science A levels, Tim Oates, Group Director of Assessment Research and Development at Cambridge Assessment, considered the impact of current coursework – or controlled assessment – arrangements on science teaching.

Speaking at the Association for Science Education Conference at the University of Birmingham on 9 January, Tim said that although it is widely recognised that the current system of coursework assessment does not work, it could be detrimental to promote approaches that might also result in the abolition of science experiments in the classroom. He went on to say that there is "unequivocal evidence from many years of research that shows that children and young people acquire understanding of vital aspects of biology, chemistry and physics far more effectively when programmes include learning grounded in experiments in the classroom".

Tim called for a shift in thinking about the role and use of practical work. Acknowledging the pressures on teachers to enhance exam grades, he said that some mistakenly believed practical work was simply a necessary part of preparation for the exam. Drawing on examples from medical education, he claimed that a rich mixture of practice and theory results in deep, secure learning which ensures that pupils are ready to go into industry or higher education with robust practical, as well as theoretical, scientific knowledge.

Explaining that much of the recent criticism of controlled assessment in science has been that it tends to assess knowledge rather than practical activity, Tim called for greater precision about what the purpose of coursework should be in terms of learning outcomes. This, he said, would ensure that it can be designed more effectively into learning programmes and public examinations, where marks would need to measure specific constructs. Perhaps controversially, he suggested that we consider whether coursework should be included within public examinations at all, saying that "we should not be irrationally opposed to high quality written end-assessment focussing on knowledge, where the best preparation is rich, immersive practical work".

In 2013, Tim published a paper on alternative approaches to the appropriate placing of ‘coursework components’ in GCSE examinations. At the same time, our UK exam board, OCR, proposed that practical experiments in science, fieldwork in geography and creative activities in arts subjects, among others, should continue to be a part of the subject syllabus – with the knowledge assessed as part of the final exams – but with a key change being that the coursework itself would not contribute to a final grade.

Related materials

  • Coursework – radical solutions in demanding times (PDF, 354KB)

Related links

  • OCR responds to DfE and Ofqual consultations on GCSE reform

Research Matters

Research Matters is our free biannual publication which allows us to share our assessment research, in a range of fields, with the wider assessment community.

The A-level debacle shows why coursework and AS-levels should never have been scrapped

The all-or-nothing education system introduced by Michael Gove was always bound to fail at some point. 

By Rohan Banerjee

The impact of Covid-19 on A-level results this year has renewed the case for coursework. With pupils unable to sit their exams due to the lockdown, grades were awarded based on predictions from their teachers, which were then moderated by Ofqual, England’s exam regulator, and its equivalents in Northern Ireland and Wales. Ofqual used a statistical model, which took into account factors such as a school’s recent exam history and students’ previous external exam results.
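To make the mechanics concrete, here is a minimal sketch of a rank-based moderation scheme of the general kind described above. It is an illustration only, not Ofqual's actual algorithm; the function name, grade shares and student IDs are hypothetical.

```python
# Hypothetical sketch of rank-based grade moderation. This is NOT Ofqual's actual
# model; it only illustrates the general idea of mapping a school's current cohort
# onto that school's historical grade distribution.

def moderate_grades(cohort_ranks, historical_grade_shares,
                    grades=("A*", "A", "B", "C", "D", "E", "U")):
    """cohort_ranks: student IDs ordered best to worst (e.g. a teacher ranking).
    historical_grade_shares: fraction of past students at this school awarded each
    grade, in the same order as `grades`; the shares should sum to 1."""
    n = len(cohort_ranks)
    boundaries, cumulative = [], 0.0
    for share in historical_grade_shares:
        cumulative += share
        boundaries.append(cumulative)  # cumulative share of the cohort up to this grade
    results = {}
    for position, student in enumerate(cohort_ranks):
        percentile = (position + 0.5) / n  # midpoint percentile of this rank
        for grade, boundary in zip(grades, boundaries):
            if percentile <= boundary:
                results[student] = grade
                break
    return results

# Example: a school that historically awarded 10% A*, 20% A, 30% B, 25% C, 10% D, 4% E, 1% U
print(moderate_grades(["s1", "s2", "s3", "s4", "s5"],
                      [0.10, 0.20, 0.30, 0.25, 0.10, 0.04, 0.01]))
# {'s1': 'A*', 's2': 'A', 's3': 'B', 's4': 'C', 's5': 'D'}
```

The defining property of any scheme like this is that each student's grade is drawn from the school's historical distribution rather than from the student's own work, which is the criticism developed in the following paragraphs.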

In other words, students have not been judged solely on their own performance, but also on other people’s, or on other people’s perceptions of them. And this seems remarkably unfair.

Almost 40 per cent (39.1 per cent) of A-Level grades were downgraded. The 39.1 per cent total comprised 35.6 per cent of results being lowered by one grade, 3.3 per cent reduced by two grades, and 0.2 per cent by three grades. A frenzy around university places has followed, with thousands of students, as it stands, potentially missing out on the chance to attend their preferred institution.
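The breakdown quoted above is internally consistent, as a quick check of the arithmetic shows:

```python
# Figures quoted in the paragraph above (per cent of A-level grades downgraded in 2020)
one_grade, two_grades, three_grades = 35.6, 3.3, 0.2
print(round(one_grade + two_grades + three_grades, 1))  # 39.1, i.e. almost 40 per cent
```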

[see also:  The A-level results injustice shows why algorithms are never neutral ]

Students can appeal their grades – as many will do – by notifying their school or college, which will then send evidence, such as their mock exam results, to Ofqual. But the lack of consistency in how mocks are administered from school to school presents another challenge. 

[see also:  A majority of UK voters think that teachers alone should set this year’s exam results ]

The all-or-nothing education system installed in England when Michael Gove was education secretary (2010-2014) was bound to be exposed eventually. And the pandemic underlines its absurdity.

The Conservatives scrapped coursework for most subjects across GCSEs and A-levels between 2013 and 2017. The party also decoupled AS-levels, the exams sat at the end of Year 12, from the overall A-level grade awarded in Year 13, meaning that a student’s qualification is assessed entirely on one exam or a series of exams at the end of a two-year course. 

The abandonment of modular course structures – which allowed people to pace and compartmentalise their learning over several different sittings – means that students are under extreme pressure throughout their studies, culminating in one intense flurry that takes little account of their health or other circumstances.

Coursework – usually essays or project-based reports – is often criticised as a less intense mode of assessment that is susceptible to cheating. Parental and teacher influence or input into submitted work are cited as the key problems. But, surely, there are ways of better policing coursework, rather than abandoning it entirely?

If students had completed externally moderated coursework before the pandemic, it would have given a more reliable projection of their ability than the grades awarded to former students in previous years. Moreover, that criterion unfairly disadvantages high-achieving pupils who attend historically lower-achieving schools.

[see also:  Top A-level grades soar at private schools as sixth form colleges lose out ]

While coursework isn’t suited to every subject – maths and the sciences lend themselves more easily to exams – there is some merit in the skills it requires. Researching, referencing and reading broadly to produce one overall project over an extended period of time are the essence of most university courses: the very thing that A-levels are supposed to lead towards. Why, then, should the process of learning be reduced to a giant memory test? 

A return to coursework, with reforms enabled by technology, is possible in the future. Students can be assessed remotely and even, if essential, under timed conditions – albeit for longer periods than an hour or two. Coursework also better serves those students who struggle with exam-induced anxiety. And if Ofqual is so concerned about teacher input, why not let the exam boards grade the coursework, as they would have done for modular exams in the past? 

This year’s A-levels – and most likely the GCSE results announced next week – have delivered mass injustice. Too many students have been let down by a postcode lottery. But, beyond that, the rigid, unforgiving absolutism of the education system has been exposed. Coursework is not a panacea but it is, perhaps, a leveller. And in the event of another pandemic, it would at least give students more agency over their futures.

AP Statistics

AP Statistics Course and Exam Description

This is the core document for the course.

Preview the Revised AP Statistics Course Framework

This draft AP Statistics course framework (.pdf) includes proposed revisions to the course content. Revisions would take effect in the 2026-27 school year or later.

Course Overview

AP Statistics is an introductory college-level statistics course that introduces students to the major concepts and tools for collecting, analyzing, and drawing conclusions from data. Students cultivate their understanding of statistics using technology, investigations, problem solving, and writing as they explore concepts like variation and distribution; patterns and uncertainty; and data-based predictions, decisions, and conclusions.

Course and Exam Description

This is the core document for this course. Unit guides clearly lay out the course content and skills and recommend sequencing and pacing for them throughout the year. The CED was updated in March 2021.

Course Resources

AP Statistics Course Overview

This resource provides a succinct description of the course and exam.

AP Statistics Course and Exam Description Walk-Through

Learn more about the CED in this interactive walk-through.

AP Statistics Course at a Glance

Excerpted from the AP Statistics Course and Exam Description, the Course at a Glance document outlines the topics and skills covered in the AP Statistics course, along with suggestions for sequencing.

AP Statistics CED Errata Sheet

This document details the updates made to the course and exam description (CED) in March 2021.

Course Content

Based on the Understanding by Design® (Wiggins and McTighe) model, this course framework provides a clear and detailed description of the course requirements necessary for student success. The framework specifies what students must know, be able to do, and understand, with a focus on three big ideas that encompass the principles and processes in the discipline of statistics. The framework also encourages instruction that prepares students for advanced coursework in statistics or other fields using statistical reasoning and for active, informed engagement with a world of data to be interpreted appropriately and applied wisely to make informed decisions.

The AP Statistics framework is organized into nine commonly taught units of study that provide one possible sequence for the course. As always, you have the flexibility to organize the course content as you like.

Exam weighting by unit (multiple-choice section):

  • Unit 1: Exploring One-Variable Data (15%–23%)
  • Unit 2: Exploring Two-Variable Data (5%–7%)
  • Unit 3: Collecting Data (12%–15%)
  • Unit 4: Probability, Random Variables, and Probability Distributions (10%–20%)
  • Unit 5: Sampling Distributions (7%–12%)
  • Unit 6: Inference for Categorical Data: Proportions (12%–15%)
  • Unit 7: Inference for Quantitative Data: Means (10%–18%)
  • Unit 8: Inference for Categorical Data: Chi-Square (2%–5%)
  • Unit 9: Inference for Quantitative Data: Slopes (2%–5%)
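As a quick sanity check on the list above (my own arithmetic, not part of the College Board framework): the minimum weightings sum to 75% and the maximums to 120%, so a multiple-choice section totalling 100% fits within the stated bands.

```python
# Consistency check of the AP Statistics unit weighting ranges listed above
unit_weights = {
    "Unit 1": (15, 23), "Unit 2": (5, 7),  "Unit 3": (12, 15),
    "Unit 4": (10, 20), "Unit 5": (7, 12), "Unit 6": (12, 15),
    "Unit 7": (10, 18), "Unit 8": (2, 5),  "Unit 9": (2, 5),
}
low_total = sum(low for low, high in unit_weights.values())    # 75
high_total = sum(high for low, high in unit_weights.values())  # 120
print(low_total, high_total, low_total <= 100 <= high_total)   # 75 120 True
```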

Course Skills

The AP Statistics framework included in the course and exam description outlines distinct skills that students should practice throughout the year—skills that will help them learn to think and act like statisticians.

Skills and exam weighting (multiple-choice section):

  • 1. Selecting Statistical Methods: Select methods for collecting and/or analyzing data for statistical inference. (15%–23%)
  • 2. Data Analysis: Describe patterns, trends, associations, and relationships in data. (15%–23%)
  • 3. Using Probability and Simulation: Explore random phenomena. (30%–40%)
  • 4. Statistical Argumentation: Develop an explanation or justify a conclusion using evidence from data, definitions, or statistical inference. (25%–35%)
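To illustrate what the "Using Probability and Simulation" skill looks like in practice, here is a minimal, self-contained example of estimating a probability by simulation; the scenario and numbers are my own and are not taken from the framework.

```python
import random

# Estimate P(at least 60 heads in 100 fair coin flips) by simulation,
# the kind of task the "Using Probability and Simulation" skill covers.
def estimate_probability(trials=50_000):
    hits = 0
    for _ in range(trials):
        heads = sum(random.random() < 0.5 for _ in range(100))
        if heads >= 60:
            hits += 1
    return hits / trials

print(estimate_probability())  # roughly 0.028, close to the exact binomial probability
```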

AP and Higher Education

Higher education professionals play a key role in developing AP courses and exams, setting credit and placement policies, and scoring student work. The AP Higher Education section features information on recruitment and admission, advising and placement, and more.

This chart shows recommended scores for granting credit, and how much credit should be awarded, for each AP course. Your students can look up credit and placement policies for colleges and universities on the AP Credit Policy Search.

Meet the AP Statistics Development Committee

The AP Program is unique in its reliance on Development Committees. These committees, made up of an equal number of college faculty and experienced secondary AP teachers from across the country, are essential to the preparation of AP course curricula and exams.

AP Statistics Development Committee

Introduction to Probability and Statistics: Exams with Solutions (MIT OpenCourseWare)

Instructors

  • Dr. Jeremy Orloff
  • Dr. Jennifer French Kamrin

Departments

  • Mathematics

Topics

  • Discrete Mathematics
  • Probability and Statistics

