I am returning to this model in my classes: pen-and-paper quizzes, no digital devices. I also give seven equally weighted quizzes to lower the stakes of each individual one. I have reduced the project/programming weight from 60-80% of the grade to 50%, because it is not possible to tell whether the students actually did the work.
Mostly. 50% for the midterm and final, plus 10% of the 50% project weight goes to individual contributions, to account for varying interest in and contribution to the project work.
But the problem is, students need to learn to do the easy things themselves before they can do the hard things with LLMs.
If you ask them to build a web browser when they can't do a hello world on their own, it's going to be a disaster. LLMs are like dumb juniors that you command, but students are less skilled than dumb juniors when they start programming classes.
Why should my 5 year old learn anything if he can just ask chatGPT?
Using ChatGPT as a professional is different from using it for homework. Homework and school teach you many things, not only the subject. You discover how you learn, what your interests are, etc.
ChatGPT can assist with learning too, but it SHOULD NOT be doing any of the work for the student. It is okay to ask "can you explain big O" and then ask follow-up questions. However, "give me a method to reverse a string" will only hurt.
Depends. Does the college want to graduate computer scientists or LLM operators?
More importantly, does the student want to be a computer scientist or an LLM operator? If they think the future belongs to LLM operators (not a bet I'd recommend), college might not even be the right path for them versus a trade school / bootcamp.
Do you think children should still be expected to be able to do arithmetic by hand?
I think the answer maybe comes down to figuring out exactly what the goal of school is. Are you trying to educate people or train them? There is for sure a lot of overlap, but I think there's a pretty clear distinction and I definitely favor the education side. On the job, a person with a solid education should be able to use whatever language or framework they need with very little training required.
At the university, you are supposed to learn the foundational knowledge. If you let the LLM do the work, you are simply not learning. There are no shortcuts.
And learning how to use LLMs is pathetically easy. Really.
Open book exams are not a new thing and I've often had them for STEM disciplines (maths and biology). Depending on the subject, you will often fail those unless you had a good prior understanding of the material.
If you can pass an exam just by googling something, it means you're only testing rote memorization, and maybe a better design is needed where synthesis and critical-thinking skills are evaluated more actively.
I make a point of only using references that are either available for free online or through our university’s library subscriptions. These are all electronic. My open book exam became an open computer exam when I realized students were printing hundreds of pages just for a 3-hour exam. This semester I’m switching to no-computer, bring your own printed cheat-sheet for the exam.
I had a Continuous and Discrete Systems class that allowed open everything during exams. You could google whatever you wanted but the exam was so lengthy that if you had to google something, you really did not have much time to do it and would definitely not have enough time to do it a second time. I would load up a PDF of the chapters and lectures I needed and my homeworks for that unit with everything properly labeled. It was much faster looking for a similar problem you already did in the homework than trying to find the answer online.
Offer to make everyone espresso and macchiato with your GPU cooling module. They won't be able to hear the fan over the grinder and pump and milk foamer!
Except that the physical book isn't the way people look up facts these days.
The purpose of the open book test is not having to know all the facts (formulas), but proving that you can find them and apply them. (Finding is part of it: the more you have to look up, the less time you have to use it, so there is an optimisation problem of which things to remember and which to look up.)
In modern times you wouldn't look those up in a book, so other research techniques are required to deal with real life (which advanced certifications should prove).
Going to university isn't how people learn these days, so there is already a real-world disconnect, fundamentally. But that's okay as it isn't intended to be a reflection of the real world.
Observation? Children show clear signs of learning before they even make it through their first year out of the womb. Man, most people don't even consider university as an option until they are around 17-18 years of age, after they have already learned the vast majority of the things they will learn in life.
Data? Only 7-8% of the population have a university degree. Obviously you could learn in university without graduating, and unfortunately participation data is much harder to come by, but there is no evidence to suggest that the non-completion rate is anywhere near high enough to think that even a majority of the population have set foot in a university, even for just one day. If we go as far as to assume a 50% dropout rate, that is still no more than 16% of the population. Little more than rounding error.
Nothing? It's a random comment on the internet. It is not necessarily based on anything. Fundamentally, comments are only ever written for the enjoyment of writing. One trying to derive anything more from it has a misunderstanding of the world around them. I suppose you have a point that, for those who struggle to see the obvious, a university education would teach the critical thinking necessary to recognize the same. But, the fact that we are here echoes that university isn't how people learn these days.
> citing
Citing...? Like, as in quoting a passage? I can find no reason why I would want to repeat what someone else has written about. Whatever gives you enjoyment, but that seems like a pointless waste of time. It is already right there. You must be trying to say something else by this? I, unfortunately, am not in tune with your pet definition.
> This approach still works, why do something else?
One issue is that the time provided to mark each piece of work continues to decrease. Sometimes you are only getting 15 minutes for 20 pages, and management believe that you can mark back-to-back from 9-5 with a half hour lunch. The only thing keeping people sane is the students that fail to submit, or submit something obviously sub-par. So where possible, even for designing exams, you try to limit text altogether. Multiple choice, drawing lines, a basic diagram, a calculation, etc.
Some students have terrible handwriting. I wouldn't be against the use of a dumb terminal in an exam room/hall. Maybe in the background it could be syncing the text and backing it up.
> Unless you're specifically testing a student's ability to Google, they don't need access to it.
I've been the person testing students, and I don't always remember everything. Sometimes it is good enough for the students to demonstrate that they understand the topic enough to know where to find the correct information based on a good intuition.
Your blue book is being graded by a stressed out and very underpaid grad student with many better things to do. They're looking for keywords to count up, that's it. The PI gave them the list of keywords, the rubric. Any flourishes, turns of phrase, novel takes, those don't matter to your grader at 11 pm after the 20th blue book that night.
Yeah sure, that's not your school, but that is the reality of ~50% of US undergrads.
Very effective multiple choice tests can be given that require work to be done before selecting an answer, so they can be machine graded. Not ideal in every case, but a very high-quality test can be made multiple choice for hard science subjects.
But again, the test creator matters a lot here too. To make such an exam is quite the labor, especially as many/most PIs have other, better things to do. Their incentives are grant money, then papers, then in a distant 3rd their grad students, and finally undergrad teaching. Many departments are explicit about this. To spend the limited time on a good undergrad multiple choice exam is not in the PI's best interest.
Which is why, in this case of a good Scantron exam, they're likely to just farm it out to Claude. Cheap, easy, fast, good enough. A winner in all dimensions.
Also, as an aside to the above, an AI with OCR for your blue book would likely be the best realistic grader too. Needs less coffee after all
This is what my differential equations exams were like almost 20 years ago. Honestly, as a student I considered them brutal (10 questions, no partial credit available at all) even though I'd always been good at math. I scraped by but I think something like 30% of students had to retake the class.
Now that I haven't been a student in a long time and (maybe crucially?) that I am friends with professors and in a relationship with one, I get it. I don't think it would be appropriate for a higher level course, but for a weed-out class where there's one Prof and maybe 2 TAs for every 80-100 students it makes sense.
> Very effective multiple choice tests can be given, that require work to be done before selecting an answer, so it can be machine graded.
As someone who has been part of the production of quite a few high stakes MC tests, I agree with this.
That said, a professor would need to work with a professional test developer to make an MC test that is consistently good, valid, and reliable.
Some universities have test dev folks as support, but many/most/all of them are not particularly good at developing high quality MC tests imho.
So, for anyone in a spot to do this, start test dev very early, ideally create an item bank that is constantly growing and being refined, and ideally have some problem types that can be varied from year-to-year with heuristics for keys and distractors that will allow for items to be iterated on over the years while still maintaining their validity. Also, consider removing outliers from the scoring pool, but also make sure to tell students to focus on answering all questions rather than spinning their wheels on one so that naturally persistent examinees are less likely to be punished by poor item writing.
Pros and cons. Multiple choice can be frustrating for students because it's all or nothing. Spend 10+ minutes on a question, make a small calculation error, and end up with a zero. It's not a great format for a lot of questions.
They're also susceptible to old-school cheating - sharing answers. When I was in college, multiple choice exams were almost extinct because students would form groups and collect/share answers over the years.
You can solve that but it's a combinatorial explosion.
A long time ago, when I handed out exams, I used to program each question into a generator that produced both not-entirely-identical questions for each student (typically, only the numeric values changed) and the matching answers for whoever was in charge of assessing.
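Roughly, the idea looks like this (a minimal reconstruction in Python, not the original code; the Ohm's-law question is just an example item type):

```python
# Minimal sketch of a per-student question generator: same question shape,
# different numbers, plus a matching answer key for the assessor.
import random

def ohms_law_item(rng: random.Random) -> tuple[str, str]:
    # One example item type; a real exam would define one function per question.
    voltage = rng.choice([6, 9, 12, 24])
    resistance = rng.choice([100, 220, 470, 1000])
    question = (f"A {voltage} V source is connected across a {resistance} ohm "
                f"resistor. What current flows, in mA?")
    answer = f"{1000 * voltage / resistance:.2f} mA"
    return question, answer

def generate_exam(student_id: str) -> list[tuple[str, str]]:
    # Seeding with the student ID makes each paper reproducible, so the
    # answer key can be regenerated at marking time without storing it.
    rng = random.Random(student_id)
    return [ohms_law_item(rng)]

for sid in ["alice", "bob"]:
    for question, answer in generate_exam(sid):
        print(sid, "|", question, "| key:", answer)
```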
For large classes or test questions used over multiple years, you need to take care that the answers are not shared. It means having large question banks which will be slowly collected. A good question can take a while to design, and it can be leaked very easily.
Stanford started doing 15 minute exams with ~12 questions to combat LLM use. OTOH, I got final project feedback from them that was clearly done by an LLM :shrug:
> I got final project feedback from them that was clearly done by an LLM
I've heard of this and have been offered "pre-prepared written feedback banks" for questions, but I write all of my feedback from scratch every time. I don't think students should have their work marked by an LLM or feedback given via an LLM.
An LLM could have a place in modern marking, though. A student submits a piece of work and you may have some high level questions:
1. Is this the work of an LLM?
2. Is this work replicated elsewhere?
3. Is there evidence of poor writing in this work?
4. Are there examples where the project is inconsistent or nonsensical?
And then the LLM could point to areas of interest for the marker to check. This wouldn't be to replace a full read, but would be the equivalent of passing a report to a colleague and saying "is there anything you think I missed here?".
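For what it's worth, a minimal sketch of that second-pass idea (assuming the OpenAI Python client; the model name and prompt wording are placeholders, not anything I actually use in marking):

```python
# Sketch only: ask an LLM to flag passages worth a closer human look,
# never to assign a grade. Assumes the `openai` package and an
# OPENAI_API_KEY in the environment; the model name is a placeholder.
from openai import OpenAI

CHECKS = [
    "Does any passage read like generic LLM output?",
    "Does any part look replicated from elsewhere?",
    "Where is the writing notably poor or unclear?",
    "Are any sections internally inconsistent or nonsensical?",
]

def flag_for_review(submission_text: str) -> str:
    client = OpenAI()
    prompt = (
        "You are helping a human marker decide where to look more closely. "
        "Do not grade. For each question below, quote the relevant passage "
        "and briefly explain why it deserves attention.\n\n"
        + "\n".join(f"- {q}" for q in CHECKS)
        + "\n\nSubmission:\n" + submission_text
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

The output would only ever be pointers for the marker, in the spirit of handing a report to a colleague and asking whether they see anything you missed.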
> Then they should have points deducted for that. Effective communication of answers is part of any exam.
Agreed. Then let me type my answers out like any reasonable person would do.
For reference…
For my last written blue book exam (in grad school) in the 90s, the professor insisted on blue books and handwriting.
I asked if I could type my answers or hand write my answers in the blue books and later type them out for her (with the blue book being the original source).
I told her point blank that my “clean” handwriting was produced at about a third of the speed that I can type, and that my legible chicken scratch was at about 80% of my typing rate. I hadn’t handwritten anything longer than a short note in over 5 years. She insisted that she could read any handwriting, and she wasn’t tech savvy enough to monitor any potential cheating in real time (which I think was accurate and fair).
I ended up writing my last sentence as the time ran out. I got an A+ on the exam and a comment about one of my answers being one of the best and most original that she had read. She also said that I would be allowed to type out my handwritten blue book tests if I took her other class.
All of this is to say that I would have been egregiously misgraded if “clean handwriting” had been a requirement. There is absolutely no reason to put this burden on people, especially as handwriting has become even less relevant since that exam I took in the 90s.
I was in university around the same time. While there I saw a concerted effort to push online courses. Professors would survey students, fishing for interest. It was unpopular. To me the motivation seemed clear: charge the same or more for tuition, but reduce opex. Maybe even admit more students and just have them be remote. It watered down the value of the degree while working towards a worse product. Why would a nonprofit public university be working on maximizing profit?
Universities aren’t profit maximizing. They are admin maximizing. Admin are always looking to expand admin's budget. Professors, classrooms, and facilities all divert money away from admin, and they don’t want to pay for them unless they have to.
> Why would a nonprofit public university be working on maximizing profit?
Because 'nonprofit' is only in reference to the legal entity, not the profit-seeking people working there? There is still great incentive to increase profitability.
You're thinking of not-for-profit. Non-profits do not seek increased profitability in the same way since it's expected (mandated?) they don't have any.
So they can educate more students? Many university classes are lecture only with 200+ students in the class and no direct interaction with profs. Those courses might as well be online.
One potential answer is that this tests more heavily for the ability to memorise, as opposed to understanding. My last exams were over ten years ago and I was always good at them because I have a good medium-term memory for names and numbers. But it's not clearly useful to test for this, as most names and numbers can just be looked up.
When I was studying at university there was a rumour that one of the dons had scraped through his fourth-year exams despite barely attending lectures, because he had a photographic memory and just so happened to leaf through a book containing a required proof the night before the exam. That gave him enough points despite not necessarily understanding what he was writing.
Obviously very few students have that sort of memory, but it's not necessarily fair to give advantage to those like me who can simply remember things more easily.
Have you ever seen a programmer who really understands C going to stackoverflow every time they have to use an fopen()? Memorization is part of understanding. You cannot understand something without it being readily available in your head
Right, and a lot of them probably got that understanding by going to stackoverflow every time they needed to use fopen() until they eventually didn’t need to anymore.
In the book days, I sometimes got to where I knew exactly where on a page I would find my answer without remembering what that answer was. Nowadays I remember the search query I used to find an answer without remembering what that answer was.
I wrote a long answer, but I realised that even advanced C users are unlikely to have memorised every possible value of errno and what they all mean when fopen errors. There's just no point as you can easily look it up. You can understand that there is a maximum allowable number of opened files without remembering what exact value errno will have in this case.
Yes, I have. I do it too, even some basic functions, I would look up on SO.
You really just need to know that there's a way to open files in C.
I don't think you can reach any sort of scale of breadth or depth if you try to memorize things. Programmers have to glue together a million things, it's just not realistic for them to know all the details of all of them.
It's unfortunate for the guy who has memorized all of K&R, but we have tools now to bring us these details based on some keywords, and we should use them.
I dunno, I went to a high school reunion last year, and a dude seemed to know people's phone numbers from 30 years ago.
If he could remember that sort of thing, I can believe there are people who can remember the steps of a proof, which is a much less random thing that you can feel your way around, given a few cues from memory.
Plus, realistically, how closely does an examiner read a proof? They have a stack of dozens of almost the same thing, I bet they get pretty tired of it and use a heuristic.
I think many people who grew up before cell phones remember phone numbers from the past. I just thought about it and can list the phone numbers of 3 houses that were on my childhood street in the early 2000s + another 5 that were friends in the area. I remember at least a handful of cell phone numbers from the mid to late 2000s as friends started to get those; some of them are still current. On the other hand, I don't know the number of anyone I've met in the last 15 years besides my wife, and haven't tried to.
When I was in university, in my program, the most common format was that you were allowed to bring in a single page of notes (which you prepared ahead of time based on your understanding of what topics were likely to come up). That seemed to work fine for everyone.
My students then often ask me to do the same and permit them to bring one page of notes, as described above.
Then I would say: just assume you're writing the exam under those rules and work on your one-pager of notes; optimize your notes by copying and re-writing them a few times. Now, the only difference between my exam and that one is that the night before, you memorize your one-pager (if you re-wrote it a few times, you should be able to recreate it purely from memory from that practice alone).
I believe having had all material in your memory at the same time, at least once for a short while, gives students higher self-confidence; they may forget stuff again, but they hopefully remember the feeling of mastering it.
I teach at MSc level. My students are scattered around the country and the world, which makes hand-written exams tricky. Luckily, the questions they are asked to address in the essay I set following their coursework are such that chatbots produce appallingly bad submissions.
In my case, I set some course-work, where they have to log in to a Linux server in the university and process a load of data, get the results, and then write the essay about the process. Because the LLM hasn't been able to log in and see the data or indeed the results, it doesn't have a clue what it's meant to talk about.
For most of the low-hanging fruit it's as easy as copy-pasting the question into multiple LLMs and logging the output.
Do it again from a different IP or two.
There will be some pretty obvious patterns in the responses. The smart kids will do minor prompt engineering ("explain like you're Peter Griffin from Family Guy" or whatever), but even then there will be some core similarities.
Or follow the example of someone here and post a question with hidden characters that will show up differently when copy-pasted.
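One variant of that hidden-character trick takes only a few lines of Python (a sketch; zero-width joiners are just one option, and students who know to strip them will):

```python
# Sketch: pepper the question text with zero-width characters. They are
# invisible on screen but survive copy-paste, so any submission or prompt
# log that still contains the pattern was almost certainly copied verbatim.
ZWJ = "\u200d"  # zero-width joiner

def watermark(question: str, every: int = 7) -> str:
    # Insert a zero-width joiner after every `every`-th character.
    out = []
    for i, ch in enumerate(question, start=1):
        out.append(ch)
        if i % every == 0:
            out.append(ZWJ)
    return "".join(out)

def looks_copied(text: str) -> bool:
    return ZWJ in text

q = watermark("Explain why quicksort is O(n log n) on average.")
print(q)                 # renders normally on screen
print(looks_copied(q))   # True for a verbatim copy-paste
```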
I don't know what you majored in. But when I was a CS major, maybe 50% of my grade came from projects. We wrote a compiler from scratch, wrote something that resembled a SQL engine from scratch, and wrote sizeable portions of an operating system. In my sophomore year we spent at least 20 hours a week on various projects.
We could use any resource we could find as long as we didn't submit anything we didn't write ourselves. This meant Stack Overflow and online documentation.
There is no way you can test a student's ability to implement a large, complex system with thousands of lines of code in a three hour exam. There is just no way. I am not against closed book paper exams, I just wish the people touting them as the solution can be more realistic about what they can and cannot do.
I had some take-home exams in Physics where you could use the internet, books, anything except other people (though that was honor-code based). Those were some of the hardest exams I ever took in my life: pages and pages of mathematical derivations. An LLM, given how good they now are at constructing mathematics, would actually have handled those pretty well.
People really struggle to go back once a technology has been adopted. I think for the most part, people cannot really evaluate whether or not the technology is a net positive; the adoption is more social than it is rational, and so it'd be like asking people to change their social values or behaviors.
It was the same when I graduated 6 years ago. We had projects to test our ability to use tools and such, and I guess in that context LLMs might be a concern. But exams were pencil and paper only.
10 years ago, we wrote exams by hand with whatever we understood (in our heads).
No colleagues, no laptops, no internet, no LLMs.
This approach still works, why do something else? Unless you're specifically testing a student's ability to Google, they don't need access to it.