
I don't understand.

10 years ago, we wrote exams by hand with whatever we understood (in our heads.)

No colleagues, no laptops, no internet, no LLMs.

This approach still works, why do something else? Unless you're specifically testing a student's ability to Google, they don't need access to it.





I am returning to this model in my classes: pen and paper quizzes, no digital devices. I also give seven equally weighted quizzes so that no single one carries too much weight. I have reduced the project/programming weight from 60-80% of the grade to 50% because it is no longer possible to tell whether the students actually did the work.

I am also doing the same. 50% for project work and 50% for individual work, including paper and pen exams with no digital devices allowed.

The days of take home exams and coding lab assignments are gone...


is "individual work" the pen and paper?

Mostly. 50% for midterm and final, plus 10% of the 50% project work is individual contributions to the project to account for varying interest/contributions to the project work.

For a project I'm not so sure banning LLMs is actually the right approach.

Industry is full of people trying to use them to become more productive.

Why wouldn't you let students use the same tools?

Seems like you need to make the projects much harder.


But the problem is, students need to learn to do the easy things themselves before they can do the hard things with LLMs.

If you ask them to build a web browser when they can't write a hello world on their own, it's going to be a disaster. LLMs are like dumb juniors that you command, but students are less skilled than dumb juniors when they start programming classes.


Why should my 5 year old learn anything if he can just ask chatGPT?

Using ChatGPT as a professional is different from using it for homework. Homework and school teach you many things, not only the subject. You discover how you learn, what your interests are, etc.

ChatGPT can assist with learning too, but it SHOULD NOT be doing any of the work for the student. It is okay to ask "can you explain big O" and then ask follow-up questions. However, "give me a method to reverse a string" will only hurt.


Depends. Does the college want to graduate computer scientists or LLM operators?

More importantly, does the student want to be a computer scientist or an LLM operator? If they think the future belongs to LLM operators (not a bet I'd recommend), college might not even be the right path for them versus a trade school / bootcamp.


It's like learning to factor polynomials even though a computer algebra system on a graphing calculator can do that.

Do you think children should still be expected to be able to do arithmetic by hand?

I think the answer maybe comes down to figuring out exactly what the goal of school is. Are you trying to educate people or train them? There is for sure a lot of overlap, but I think there's a pretty clear distinction and I definitely favor the education side. On the job, a person with a solid education should be able to use whatever language or framework they need with very little training required.


Don’t compare LLMs to calculators. Only one of those is deterministic.

We're trying to evaluate the student not the LLM. You need to tease apart their contributions. Isn't this obvious?

At the university, you are supposed to learn the foundational knowledge. If you let the LLM do the work, you are simply not learning. There are no shortcuts.

And learning how to use LLMs is pathetically easy. Really.


> Industry is full of people trying to use them to become more productive

The goal of learning is not to be as productive as possible. You need to learn the material so you can check and fix the output of an LLM.

> Why wouldn't you let students use the same tools?

Students should use those tools, but only after they've learned how to do the work without them.

> Seems like you need to make the projects much harder.

The goal of a project should be to learn the material with hands on experience.

I will always be grateful to my school teachers that forced me to learn arithmetic and not rely on the calculators we all carry in our pockets.


Open book exams are not a new thing and I've often had them for STEM disciplines (maths and biology). Depending on the subject, you will often fail those unless you had a good prior understanding of the material.

If you can pass an exam just by googling something, it means you're just testing rote memorization, and maybe a better design is needed, one where synthesis and critical-thinking skills are evaluated more actively.


Open book, sure. But you don't even need a computer for that.

I make a point of only using references that are either available for free online or through our university’s library subscriptions. These are all electronic. My open book exam became an open computer exam when I realized students were printing hundreds of pages just for a 3-hour exam. This semester I’m switching to no-computer, bring your own printed cheat-sheet for the exam.

And even if you are allowed to use a computer, you cannot use the internet (and it should not be hard to prevent that).

I had a Continuous and Discrete Systems class that allowed open everything during exams. You could google whatever you wanted, but the exam was so lengthy that if you had to google something, you really did not have much time to do it and would definitely not have enough time to do it a second time. I would load up a PDF of the chapters and lectures I needed, plus my homework for that unit, with everything properly labeled. It was much faster to look for a similar problem you had already done in the homework than to try to find the answer online.

Local LLMs

Be sure to bring an extra power strip for all your plugs and adaptors.

https://www.tomshardware.com/pc-components/gpus/tiny-corp-su...


My laptop runs gpt-oss 120B with none of that. Don't know how long though. I suspect a couple of hours continuous.

Which laptop?

ROG Flow Z13 with maxxed out RAM.

Nice laptop. I love my current laptop in general, but it is lagging in performance.

how is anyone going to be able to take a test with all of the noise from that fan as it cranks through tokens?

Offer to make everyone espresso and macchiato with your GPU cooling module. They won't be able to hear the fan over the grinder and pump and milk foamer!

You can make it as slow as you want. At half TDP it is silent.

Except that a physical book isn't the way people look up facts these days.

The purpose of an open book test is not to require knowing all the facts (formulas) but to prove you know how to find them and how to apply them. (Finding is part of it: the more time you spend looking things up, the less time you have to apply them, so there is an optimisation problem in deciding which things to remember and which to look up.)

In modern times you wouldn't look those up in a book, so other research techniques are required to deal with real life (which advanced certifications should test).


Going to university isn't how people learn these days, so there is already a real-world disconnect, fundamentally. But that's okay as it isn't intended to be a reflection of the real world.

> Going to university isn't how people learn these days

That’s a surprising statement that doesn’t ring true to me, what are you basing that off of / citing?


> what are you basing that off of

Observation? Children show clear signs of learning before they even make it through their first year out of the womb. Man, most people don't even consider university as an option until they are around 17-18 years of age, after they have already learned the vast majority of the things they will learn in life.

Data? Only 7-8% of the population have a university degree. Obviously you could learn in university without graduating, and unfortunately participation data is much harder to come by, but there is no evidence to suggest that the non-completion rate is anywhere near high enough to think that even a majority of the population have set foot in a university, even for just one day. If we go as far as to assume a 50% dropout rate, that is still no more than 16% of the population. Little more than rounding error.

Nothing? It's a random comment on the internet. It is not necessarily based on anything. Fundamentally, comments are only ever written for the enjoyment of writing. One trying to derive anything more from it has a misunderstanding of the world around them. I suppose you have a point that, for those who struggle to see the obvious, a university education would teach the critical thinking necessary to recognize the same. But, the fact that we are here echoes that university isn't how people learn these days.

> citing

Citing...? Like, as in quoting a passage? I can find no reason why I would want to repeat what someone else has written about. Whatever gives you enjoyment, but that seems like a pointless waste of time. It is already right there. You must be trying to say something else by this? I, unfortunately, am not in tune with your pet definition.


> This approach still works, why do something else?

One issue is that the time provided to mark each piece of work continues to decrease. Sometimes you are only getting 15 minutes for 20 pages, and management believe that you can mark back-to-back from 9-5 with a half hour lunch. The only thing keeping people sane is the students that fail to submit, or submit something obviously sub-par. So where possible, even for designing exams, you try to limit text altogether. Multiple choice, drawing lines, a basic diagram, a calculation, etc.

Some students have terrible handwriting. I wouldn't be against the use of a dumb terminal in an exam room/hall. Maybe in the background it could be syncing the text and backing it up.

> Unless you're specifically testing a student's ability to Google, they don't need access to it.

I've been the person testing students, and I don't always remember everything. Sometimes it is good enough for the students to demonstrate that they understand the topic enough to know where to find the correct information based on a good intuition.


I want to echo this.

Your blue book is being graded by a stressed out and very underpaid grad student with many better things to do. They're looking for keywords to count up, that's it. The PI gave them the list of keywords, the rubric. Any flourishes, turns of phrase, novel takes, those don't matter to your grader at 11 pm after the 20th blue book that night.

Yeah sure, that's not your school, but that is the reality of ~50% of US undergrads.


Very effective multiple choice tests can be given that require work to be done before selecting an answer, so they can be machine graded. Not ideal in every case, but a very high-quality test can be made multiple choice for hard science subjects.

True! Good point!

But again, the test creator matters a lot here too. Making such an exam is quite a labor, especially as many/most PIs have other, better things to do. Their incentives are grant money, then papers, then in a distant third their grad students, and finally undergrad teaching. Many departments are explicit on this. Spending the limited time on a good undergrad multiple choice exam is not in the PI's best interest.

Which is why, in this case of a good Scantron exam, they're likely to just farm it out to Claude. Cheap, easy, fast, good enough. A winner in all dimensions.

Also, as an aside to the above, an AI with OCR for your blue book would likely be the best realistic grader too. It needs less coffee, after all.


This is what my differential equations exams were like almost 20 years ago. Honestly, as a student I considered them brutal (10 questions, no partial credit available at all) even though I'd always been good at math. I scraped by but I think something like 30% of students had to retake the class.

Now that I haven't been a student in a long time and (maybe crucially?) that I am friends with professors and in a relationship with one, I get it. I don't think it would be appropriate for a higher level course, but for a weed-out class where there's one Prof and maybe 2 TAs for every 80-100 students it makes sense.


> Very effective multiple choice tests can be given, that require work to be done before selecting an answer, so it can be machine graded.

As someone who has been part of the production of quite a few high stakes MC tests, I agree with this.

That said, a professor would need to work with a professional test developer to make a MC that is consistently good, valid, and reliable.

Some universities have test dev folks as support, but many/most/all of them are not particularly good at developing high quality MC tests imho.

So, for anyone in a spot to do this, start test dev very early, ideally create an item bank that is constantly growing and being refined, and ideally have some problem types that can be varied from year-to-year with heuristics for keys and distractors that will allow for items to be iterated on over the years while still maintaining their validity. Also, consider removing outliers from the scoring pool, but also make sure to tell students to focus on answering all questions rather than spinning their wheels on one so that naturally persistent examinees are less likely to be punished by poor item writing.


Pros and cons. Multiple choice can be frustrating for students because it's all or nothing. Spend 10+ minutes on a question, make a small calculation error, and end up with a zero. It's not a great format for a lot of questions.

They're also susceptible to old-school cheating - sharing answers. When I was in college, multiple choice exams were almost extinct because students would form groups and collect/share answers over the years.

You can solve that but it's a combinatorial explosion.


A long time ago, when I gave exams, I used to program each exam question into a generator that produced both not-entirely-identical questions for each student (typically, only the numeric values changed) and the matching answers for whoever was in charge of assessing.

That was a bit time-consuming, of course.


For large classes or test questions used over multiple years, you need to take care that the answers are not shared. It means having large question banks which will be slowly collected. A good question can take a while to design, and it can be leaked very easily.

Scantron and a #2 pencil.

Stanford started doing 15 minute exams with ~12 questions to combat LLM use. OTOH I got a final project feedback from them that was clearly done by an LLM :shrug:

> I got a final project feedback from them that was clearly done by an LLM

I've heard of this and have been offered "pre-prepared written feedback banks" for questions, but I write all of my feedback from scratch every time. I don't think students should have their work marked by an LLM or feedback given via an LLM.

An LLM could have a place in modern marking, though. A student submits a piece of work and you may have some high level questions:

1. Is this the work of an LLM?

2. Is this work replicated elsewhere?

3. Is there evidence of poor writing in this work?

4. Are there examples where the project is inconsistent or nonsensical?

And then the LLM could point to areas of interest for the marker to check. This wouldn't be to replace a full read, but would be the equivalent of passing a report to a colleague and saying "is there anything you think I missed here?".


> Some students have terrible handwriting.

Then they should have points deducted for that. Effective communication of answers is part of any exam.


> Then they should have points deducted for that. Effective communication of answers is part of any exam.

Agreed. Then let me type my answers out like any reasonable person would do.

For reference…

For my last written blue book exam (in grad school) in the 90s, the professor insisted on blue books and handwriting.

I asked if I could type my answers or hand write my answers in the blue books and later type them out for her (with the blue book being the original source).

I told her point blank that my “clean” handwriting was produced at about a third of the speed that I can type, and that my legible chicken scratch was at about 80% of my typing rate. I hadn’t handwritten anything longer than a short note in over 5 years. She insisted that she could read any handwriting, and she wasn’t tech savvy enough to monitor any potential cheating in real time (which I think was accurate and fair).

I ended up writing my last sentence as the time ran out. I got an A+ on the exam and a comment about one of my answers being one of the best and most original that she had read. She also said that I would be allowed to type out my handwritten blue book tests if I took her other class.

All of this is to say that I would have been egregiously misgraded if “clean handwriting” had been a requirement. There is absolutely no reason to put this burden on people, especially as handwriting has become even less relevant since that exam I took in the 90s.


I personally don't believe that terrible handwriting should have any hold over a computer science student.

Doctors (medicine) get away with it.

> Then they should have points deducted for that. Effective communication of answers is part of any exam.

...even when it's for a medical reason?


I was in university around the same time. While there I saw a concerted effort to push online courses. Professors would survey students fishing for interest. It was unpopular. To me the motivation seemed clear: charge the same or more for tuition, but reduce opex. Maybe even admit more students and just have them be remote. It watered down the value of the degree while working towards a worse product. Why would a nonprofit public university be working on maximizing profit?

Online courses also increase admin overhead.

Universities aren’t profit maximizing. They are admin maximizing. Admins are always looking to expand the admin budget. Professors, classrooms, and facilities all divert money away from admin, and they don’t want to pay for them unless they have to.

Also applies to hospitals in USA.


> Why would a nonprofit public university be working on maximizing profit?

Because 'nonprofit' is only in reference to the legal entity, not the profit-seeking people working there? There is still great incentive to increase profitability.


You're thinking of not-for-profit. Non-profits do not seek increased profitability in the same way since it's expected (mandated?) they don't have any.

I'm not thinking of either. The profit-seeking people looking to increase their profitability spoken of are neither non-profits nor not-for-profits.

So they can educate more students? Many university classes are lecture only, with 200+ students in the class and no direct interaction with profs. Those courses might as well be online.

One potential answer is that this tests more heavily for the ability to memorise, as opposed to understanding. My last exams were over ten years ago and I was always good at them because I have a good medium-term memory for names and numbers. But it's not clearly useful to test for this, as most names and numbers can just be looked up.

When I was studying at university there was a rumour that one of the dons had scraped through their fourth-year exams despite barely attending lectures, because he had a photographic memory and just so happened to leaf through a book containing a required proof, the night before the exam. That gave him enough points despite not necessarily understanding what he was writing.

Obviously very few students have that sort of memory, but it's not necessarily fair to give advantage to those like me who can simply remember things more easily.


Have you ever seen a programmer who really understands C going to stackoverflow every time they have to use fopen()? Memorization is part of understanding. You cannot understand something without it being readily available in your head.

Right, and a lot of them probably got that understanding by going to stackoverflow every time they needed to use fopen() until they eventually didn’t need to anymore.

In the book days, I sometimes got to where I knew exactly where on a page I would find my answer without remembering what that answer was. Nowadays I remember the search query I used to find an answer without remembering what that answer was.


I wrote a long answer, but I realised that even advanced C users are unlikely to have memorised every possible value of errno and what they all mean when fopen errors. There's just no point as you can easily look it up. You can understand that there is a maximum allowable number of opened files without remembering what exact value errno will have in this case.

Yes, I have. I do it too, even some basic functions, I would look up on SO.

You really just need to know that there's a way to open files in C.

I don't think you can reach any sort of scale of breadth or depth if you try to memorize things. Programmers have to glue together a million things, it's just not realistic for them to know all the details of all of them.

It's unfortunate for the guy who has memorized all of K&R, but we have tools now to bring us these details based on some keywords, and we should use them.


I still look up PHP builtins often because they're so inconsistent. What comes first, the needle or the haystack?

> because he had a photographic memory and just so happened to leaf through a book containing a required proof

It makes for good rumours and TV show plots, but this sort of "photographic memory" has never been shown to actually exist.


Huh, TIL [0]. Thanks. There are people who can perform extraordinary memory feats, but they're very rare and/or self-trained.

[0] https://skeptoid.com/episodes/542


I dunno, I went to a high school reunion last year, and a dude seemed to know people's phone numbers from 30 years ago.

If he could remember that sort of thing, I can believe there are people who can remember the steps of a proof, which is a much less random thing that you can feel your way around, given a few cues from memory.

Plus, realistically, how closely does an examiner read a proof? They have a stack of dozens of almost the same thing, I bet they get pretty tired of it and use a heuristic.


I think many people who grew up before cell phones remember phone numbers from the past. I just thought about it and can list the phone numbers of 3 houses that were on my childhood street in the early 2000s + another 5 that were friends in the area. I remember at least a handful of cell phone numbers from the mid to late 2000s as friends started to get those; some of them are still current. On the other hand, I don't know the number of anyone I've met in the last 15 years besides my wife, and haven't tried to.

>His photographic memory manifested itself early — he would amuse his parents’ friends by instantly memorizing pages of phone books on command.

https://medium.com/young-spurs/the-unsung-genius-of-john-von...


When I was in university, in my program, the most common format was that you were allowed to bring in a single page of notes (which you prepared ahead of time based on your understanding of what topics were likely to come up). That seemed to work fine for everyone.

I have a colleague who does that.

My students then often ask me to do the same, to permit them to bring one page of notes as he does.

Then I would say: just assume you're writing the exam with him and work on your one-pager of notes, optimize your notes by copying and re-writing them a few times. Now, the only difference between my exam and his exam is that the night before, you memorize your one-pager (if you re-wrote it a few times you should be able to recreate it purely from memory from that practice alone).

I believe having had all material in your memory at the same time, at least once for a short while, gives students higher self-confidence; they may forget stuff again, but they hopefully remember the feeling of mastering it.


I teach at MSc level. My students are scattered around the country and the world, which makes hand-written exams tricky. Luckily, the nature of the questions they have to address in the essay I assign following their coursework is such that chatbots produce appallingly bad submissions.

This is great. Do you have advice on making questions that LLMs are bad at answering?

In my case, I set some course-work, where they have to log in to a Linux server in the university and process a load of data, get the results, and then write the essay about the process. Because the LLM hasn't been able to log in and see the data or indeed the results, it doesn't have a clue what it's meant to talk about.

for most of the low hanging fruit it's as easy as copy-pasting the question into multiple LLMs and logging the output

do it again from a different IP or two.

there will be some pretty obvious patterns in responses. the smart kids will do minor prompt engineering "explain like you're Peter Griffin from Family Guy" or whatever, but even then there will be some core similarities.

or follow the example of someone here and post a question with hidden characters that will show up differently when copy-pasted.


They don't work very well with large numbers. Try asking Claude to find the prime factors of 83521.

I don't know what you majored in. But when I was a CS major, maybe 50% of my grade came from projects. We wrote a compiler from scratch, wrote something that resembled a SQL engine from scratch, and wrote sizeable portions of an operating system. In my sophomore year we spent at least 20 hours a week on various projects.

We could use any resource we could find as long as we didn't submit anything we didn't write ourselves. This meant stackoverflow and online documentation.

There is no way you can test a student's ability to implement a large, complex system with thousands of lines of code in a three hour exam. There is just no way. I am not against closed book paper exams, I just wish the people touting them as the solution can be more realistic about what they can and cannot do.


I had some take home exams in Physics where you could use the internet, books, anything except other people (but that was honor-code based). Those were some of the hardest exams I ever took in my life. Pages and pages of mathematical derivations. An LLM, given how good they can be at constructing mathematics, would actually have handled those exams pretty well.

People really struggle to go back once a technology has been adopted. I think for the most part, people cannot really evaluate whether or not the technology is a net positive; the adoption is more social than it is rational, and so it'd be like asking people to change their social values or behaviors.

> 10 years ago, we wrote exams by hand with whatever we understood (in our heads.)

You did, but the best exam I had was open book bring anything. 25 and some change years ago even.

I've also had another professor do the "you can bring one A4 sheet with whatever notes you want to make on it."


It was the same when I graduated 6 years ago. We had projects to test our ability to use tools and such, and I guess in that context LLMs might be a concern. But exams were pencil and paper only.

I think the key difference is what you're trying to measure

Grading the students. Usually for bigger classes, universities (at least mine) don't provide adequate support for grading the tests.

I go to school right now, and most classes actually enforce paper and pencil tests despite how annoying it is to grade and code on.

We had to write C code on paper. It was horrible.

It's not like writing prose. And there is no syntax highlighting, no compiler errors.


I don't know if it's the reason, but some students do need a computer for medical reasons.

We had open notebook group exams back then too.

Optics.


