Hacker News | erlapso's comments

Question: Did you try Greptile?


Must have missed it when we did our research but it looks promising. What does it excel at?


I just open-sourced CodeBeaver, a tool I built after LLM-generated code kept sneaking weird bugs into my projects.

With just a few lines of YAML, CodeBeaver can:

- Run end-to-end (E2E) tests written in natural language

- Generate, maintain, and execute unit tests automatically

- Analyze test failures to determine if it’s a bug or just a flaky test

You can run it locally with a quick pip install, or integrate it into CI/CD with GitHub Actions, where it will even open PRs with missing tests.

It's basically vibe testing :D

We use BrowserUse for the E2E tests, o3-mini for the unit test generation, plus a bunch of shell scripts to make everything seamless.

Currently supports Python & TypeScript, with more languages on the roadmap. Would love to hear your thoughts!


Super interesting approach! We've been working on the opposite: always getting your unit tests written with every PR. The idea is that you don't have to bother running or writing them; you just get them delivered in your GitHub repo. You can check it out here: https://www.codebeaver.ai


First, I'm a fan of LLMs reducing friction in tests, but I would be concerned with the false sense of confidence here. The demo gif shows "hey I wrote your tests and they pass, go ahead and merge them"

OP makes a valid point

> Now we contend with the “who guards the guard” problem. Because LLMs are unreliable agents, it might so happen that Claude just scammed us by spitting out useless (or otherwise low-effort) test cases. [...] So it’s time to introduce some human input in the form of additional test cases, which is made extra convenient since the model already provided the overall structure of our test. If those cases pass, we can be reasonably confident in integrating this function into our codebase.

In our repos, I would love to have an LLM tool/product that helps out with test writing, but the workflow certainly needs to have some human in the loop for the time being. More like "Here I got you started with test coverage, add a few more of your own" or "Give me a few bullet points of cases that should pass or fail" and review the test code, not "go ahead and merge these tests I wrote for you"


Test driven development is sequenced the way it is for a reason. Getting a failing test first builds confidence that the test is, you know, actually testing something. And the process of writing the tests is often where the largest amount of reasoning about design choices takes place.
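
To make the red/green ordering concrete, here is a minimal Python sketch. Everything in it is hypothetical and made up for illustration: the `slugify` function, the `slugify_stub` placeholder, and the `passes` helper are assumptions, not code from any project discussed here. It only shows that the same test fails before the implementation exists and passes afterwards, which is the evidence that the test is checking something.

```python
import unittest


def slugify_stub(title: str) -> str:
    """Hypothetical placeholder that exists before the real code (the "red" step)."""
    raise NotImplementedError


def slugify(title: str) -> str:
    """Hypothetical implementation, written only after the test was seen to fail."""
    return "-".join(title.lower().split())


def make_test_case(func):
    """Build a TestCase that exercises whichever implementation is passed in."""

    class SlugifyTest(unittest.TestCase):
        def test_spaces_become_hyphens(self):
            self.assertEqual(func("Hello Big World"), "hello-big-world")

    return SlugifyTest


def passes(func) -> bool:
    """Run the suite against func and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(make_test_case(func))
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("stub passes:", passes(slugify_stub))  # red: the test should fail here
    print("real passes:", passes(slugify))       # green: the same test should pass here
```

If a test passes against the stub as well, that is a hint it is not really testing anything.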

Having an LLM generate the tests after you've already written the code for them is super counterproductive. Who knows whether those tests actually test anything?

I know this gets into "I wanted AI to do my laundry, not my art" territory, but a far more rational division of labor is for the humans to write the tests (maybe with the assistance of an autocomplete model) and give those as context for the AI. Humans are way better at thinking of edge cases and design constraints than the models are at this point in the game.


But the CEO has a different level of understanding of logistics compared to the mayor. They may be looking at the same port, but they see different things. The CEO saw bottlenecks like Neo sees the Matrix.


Why doesn't the mayor of a city with one of America's most important ports call in experts like this the second trouble started? It was this easy and he never bothered to ask the experts?


> Why doesn't the mayor of a city with one of America's most important ports call in experts like this the second trouble started? It was this easy and he never bothered to ask the experts?

Because we pay our politicians terribly low compared to other leadership positions.

Our best leaders have gone to Facebook / Google to make better ads. It makes no sense for an 18-year-old going into college to study political theory and become a mayor by 30 or so.

Our political system is broken because there are no incentives to get good leaders into it. There are far more leadership positions available in private industry, and they all pay maybe 500% higher.

Remember: Senators are only paid like $180,000/year. Most other positions are paid much, much less. In contrast, you can easily get $250k+/year as a VP for... well... pretty much anyone else (Exxon, Facebook, Microsoft). Reach "3-letter" positions (CEO, CFO, CIO) at FAANGs and you're upwards of $1MM/year.

--------

Bonus points: a typical VP at Microsoft probably doesn't have to worry about legitimate death threats / assassination attempts like our politicians do. It's a quieter, safer, easier life. As a politician, you put your family through hell, the media hound you and try to dig up dirt on you constantly, etc.

Does anyone here actually want to be a politician? Or would you rather continue your path in Engineering / programming / whatever you're doing right now? I'm not necessarily saying Hacker News is the "best and brightest", but... a lot of us are at least _trying_ to be the best-and-brightest in our selective fields. How many of us actually think about going into politics?


This. But it's not like politicians aren't intelligent and ambitious, so many of them look to earn money in other ways, e.g. the stock market, which gets dangerously close to a conflict of interest because they are, by design, there to regulate industry.


$180K puts you well within the top 20% in income. pay is not the problem. in fact, trying to solve politician quality by increasing pay would likely worsen the problem by misaligning incentives even more. also the assumption that the best and the brightest are managers at tech companies is amusingly naive.


That's for a literal federal Senator. Even mayors don't make near that in the general case.


The mayor of Long Beach makes >180k in pay and benefits: https://patch.com/california/longbeach-ca/long-beach-mayor-r...


He makes $143k + medical and pension.

Those numbers are nutty; I know 22-year-olds who make more than that.


But how much opportunity do they have for graft and corruption? Most of such money does not actually go through the mayor's bank account; instead, it is directed to people who then provide favors, e.g. employing his associates. Informal exchange of favors is the lifeblood of politicians.


Maybe if we paid a decent salary, we wouldn't get the bottom feeders who are only in the job for grift and corruption opportunities.


You don't seem to be getting what motivates people to go into politics. They are not looking for ways to avoid involvement in corruption. The opportunity to be involved in the favors economy is most of the job's appeal. Paying them more would just cost more.


I can see how you can think that's the case if you set up the incentives to only attract those people.


It is the nature of the job to set up its own incentives. How it is is exactly how the people who do it want it to be.


Because in this case the "solution" doesn't solve the rest of the problem: that there aren't enough truckers or locomotives to haul the cargo inward to their domestic destinations due to the unprecedented demand for shipped goods, which is why containers were piling up in the port in the first place.

This just solves the problem of allowing slightly more ships to offload their cargo before they run out of space again. But as there are 100+ ships currently waiting to offload, this expanded "buffer" still isn't big enough.

EDIT: left out of the one-sided linked article: the city of Long Beach had been planning to waive the stacking requirements for a while prior to the Flexport CEO going on his rant due to pressure from the White House dating back to this summer. Container storage near (not in) the ports actually falls into 3 separate jurisdictions: the ports of LA and Long Beach, and the cities of Long Beach, LA, and Wilmington, and required coordination between all these agencies, coordination with the logistics companies operating at the ports, and coordination with the domestic shipping companies that would be moving containers out of the container storage areas (via truck or train).


> Why doesn't the mayor of a city with one of America's most important ports call in experts like this the second trouble started?

(1) Because the Mayor of Long Beach is a primus inter pares legislator; as is common for cities in California, Long Beach has a Council-Manager system, in which the chief executive is the appointed City Manager.

(2) But, anyhow, under the City Charter (basically, its Constitution) the harbor is actually governed by the Harbor Commission, which (like the city itself) also has an appointed chief executive (the Executive Director).

So, the question should probably be “Why didn't the Executive Director of the Harbor Commission call in experts like this?” (or, why aren’t the members and Executive Director of the Harbor Commission experts like this in the first place?)


Because incompetence is everywhere. I think most people assume that high level positions are filled by people who know what they’re doing, but my experience has shown that to be an incorrect assumption time and time again.


It's not a requirement that high level people know what they're doing: in fact, I'd go so far as to say that's impossible.

What is a requirement (when you're a high level person) is ensuring people under you know what they're doing.

It feels like we have far too little of that in our culture.

It doesn't take a rocket scientist to do enough research to understand if the person who's advising you on rocket science knows what they're about. It takes a good reading list, some time, and effort.

And yet far too many manager+ just... don't.

Which allows frauds to persist on teams, and ultimately breaks things when they're asked to advise or implement things they're unqualified to do.

Every good company I've worked at expected its managers and advisors to get up to speed ASAP on (insert new thing they're working on). Every bad company had a culture where that wasn't a manager's or advisor's job, and it was sufficient to repackage the words of direct reports.


What interaction does the mayor (or the administration) of a city normally have with the ports? I mean, beyond keeping an eye on the wear to road surfaces of port traffic.


In this case, it's a city-operated port, so it's different from a private port or one operated by a special-purpose public agency outside of city government. But operation is, under the City Charter, in the hands of a Harbor Commission, so it's under a special body within city government. The Mayor, Council, and City Manager (the last holding the executive role lots of people associate with the title “mayor”, because Long Beach follows the Council-Manager model, not the Strong Mayor model) therefore have a less direct role than might at first be assumed from “city-operated port”.


According to the Bloomberg article in the WaPo, the regulation was suspended by Long Beach's city manager, not its mayor.


This is inching close to the conclusion that mandatory expert panels are required for government to function.

But then you go back to the problem of "who determines who are the experts". Case in point: the anti-vaccine politicians dredge up the 1 out of 1,000 doctors who spouts whatever fits their narrative. Lots of people die gasping for air unnecessarily as a result...

And we have no idea how to begin to solve that problem while keeping a functional democracy, it seems.

Sorry to bring in vaccines into the topic - it's just the clear parallel between these situations that I wanted to draw on.

Experts are what you want them to be


Huh? Calling up your local shipping exec for a meeting is most definitely not forming "expert panels"


It might be. If they are just called at random, things are fine. However, if you do a little work you can figure out who will support whatever position you want.

A few months ago I listened to one "expert panel" called before Congress about high-speed rail. Most of the people didn't have any useful expertise on the subject. There was the union rep who considered anything good so long as it makes jobs (if they could dig and refill the same hole all day, that would be good). There was the "you are not listening to NIMBYs enough" person, without any acknowledgement of how much NIMBYs had been listened to. There were several people who defined HSR as so slow that Amtrak meets it.

I believe the above is typical of congressional hearings, though I don't have 4 hours to sit through them on a regular basis. (I had a lot of long compiling tasks to do that day.)


Yes, it is. And it can also result in decisions that benefit your local shipping exec over any other considerations. Informality in cases like this is just another way to say "completely avoids oversight."


> This is inching close to the conclusion that mandatory expert panels are required for government to function.

Notionally, one would think that's what the existing Harbor Commission is.


Ryan Petersen is a smart guy with a good perspective but he started Flexport in 2013. He didn't go to Cal Maritime; he went to Berkeley. He doesn't have a deck license or even a CDL; he was a member of Cal Sailing.

He is, however, smart, and smart is good. Time will tell whether his suggestion was a major factor or just a good idea.

I like his Twitter thread:

  What caused all the supply chain bottlenecks? Modern finance with its obsession with "Return on Equity."
https://twitter.com/typesfast/status/1453753924960219145


> The new changes will take effect in 2022, and will apply to doctors, hospitals and air ambulances, though not ground ambulances.

From 2022, and no ground ambulances. This is a joke.


Optimists here would claim that this is slightly less of a joke than the status quo.


How come it covers air ambulances but not ground ambulances?

I know the answer is probably that the ambulance companies lobbied (bribed) the hell out of these laws, but with the ambulance costs this is comical.


Yes, exactly. We think that a tool that does not encompass every use case and device is too narrow and not appealing to the design community.


Agreed, as in: Steampunk is actually a form of retrofuturism


The Wikipedia article does describe steampunk as a high-profile example.

Also interesting to note the two non-distinct forms of retrofuturism: "the future as seen from the past" and "the past as seen from the future". Steampunk is an example of the latter.


Renditions of things described by Jules Verne could be considered both "the future as seen from the past" and "Steampunk"


Or working models of Babbage's machines!


I really like it!


cool! :)


What do you think about conversational interfaces?


Why didn't you contribute to https://github.com/FezVrasta/bootstrap-material-design/ instead?


To make money I'd assume.

