Hacker News | lou1306's comments

Good question! It's a bit of a stretch. BEAM has mailboxes, non-blocking sends, and asynchronous handling of messages, whereas the original CSP is based on blocking sends and symmetric channels. Symmetric means there is no real difference between sends and receives: two processes synchronise when they are both willing to exchange the same data on the same channel. (A "receive" is just a nondeterministic action where you are willing to synchronise on any value on a channel.)

Occam added types to channels and distinguished sends/receives, which is the design also inherited by Go.

In principle you can emulate a mailbox/message queue in CSP with a chain of processes, one per queue slot, but accounting for BEAM's weak-ish ordering guarantees might be complicated (I suppose you would have to allow adjacent queue slots to swap messages under specific conditions).
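To make the idea concrete, here's a minimal sketch in Go (whose channels are blocking, CSP-style): an n-slot FIFO mailbox built purely from unbuffered channels, with one forwarding process per queue slot. The names (`slot`, `mailbox`) are made up for illustration, and this deliberately ignores the ordering-guarantee subtleties mentioned above.

```go
package main

import "fmt"

// slot models one CSP-style queue cell: a process that repeatedly
// synchronises on `in` to accept a message, then synchronises on
// `out` to pass it along. Each slot can "hold" at most one message
// (the one it is currently blocked trying to forward).
func slot(in <-chan string, out chan<- string) {
	for msg := range in {
		out <- msg
	}
	close(out) // propagate shutdown down the chain
}

// mailbox chains n slots together and returns the send and receive
// ends of the resulting n-place FIFO queue, built only from
// blocking (unbuffered) channels.
func mailbox(n int) (chan<- string, <-chan string) {
	head := make(chan string)
	in := head
	for i := 0; i < n; i++ {
		next := make(chan string)
		go slot(in, next)
		in = next
	}
	return head, in
}

func main() {
	send, recv := mailbox(3)
	send <- "a" // does not block the sender: slots absorb the message
	send <- "b"
	close(send)
	for m := range recv {
		fmt.Println(m) // prints a, then b: FIFO order is preserved
	}
}
```

With n slots, a sender can run up to n messages ahead of the receiver before blocking, which is exactly the bounded-buffer behaviour a mailbox approximates.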


You can't explicitly allocate cache, but you can lay things out in memory to minimize cache misses.
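As a hypothetical sketch of what that layout control looks like in practice (Go, with invented field names): splitting "hot" fields from "cold" ones, i.e. array-of-structs vs. struct-of-arrays. Keeping the hot field contiguous means each 64-byte cache line fetched carries 8 useful float64s instead of 1.

```go
package main

import (
	"fmt"
	"time"
)

const n = 1 << 21 // ~2M elements

// Array-of-structs: the hot field x is interleaved with cold data,
// so a pass over x drags each whole 64-byte struct through the cache.
type particle struct {
	x    float64
	cold [7]float64 // rarely-touched payload padding out the cache line
}

func sumAoS(ps []particle) (s float64) {
	for i := range ps {
		s += ps[i].x
	}
	return
}

// Struct-of-arrays: hot values live in their own contiguous slice,
// so every fetched cache line holds 8 useful values.
func sumSoA(xs []float64) (s float64) {
	for _, x := range xs {
		s += x
	}
	return
}

func main() {
	aos := make([]particle, n)
	soa := make([]float64, n)
	for i := range aos {
		aos[i].x, soa[i] = 1, 1
	}

	t0 := time.Now()
	a := sumAoS(aos)
	t1 := time.Now()
	b := sumSoA(soa)
	t2 := time.Now()

	// Same result, but the SoA pass touches 1/8th the memory.
	fmt.Printf("AoS %v  SoA %v  (sums: %v %v)\n", t1.Sub(t0), t2.Sub(t1), a, b)
}
```

The exact speedup depends on the cache hierarchy and the hardware prefetcher, but the principle — fewer cache lines touched per useful byte — is the whole game.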

A fun fact for the people who like to go down rabbit holes: there is an x86 technique called cache-as-RAM (CAR) that allows you to explicitly allocate a range of memory to be stored directly in cache, avoiding DRAM entirely.

CAR is often used in early boot, before the DRAM is initialized. It works because the x86 cache-disable bit actually only decouples the cache from memory: the CPU will still use the cache if you primed it with valid cache lines before setting the bit.

So the technique is to mark a particular range of memory as write-back cacheable, prime the cache with valid cache lines for the entire region, and then set the bit to decouple the cache from memory. Now every access to this memory region is a cache hit that doesn't write back to DRAM.

The one downside is that when CAR is on, any cache you don't allocate as memory is wasted. You could allocate only half the cache as RAM to a particular memory region, but the disable bit is global, so the other half would just sit idle.


Thanks – I was wondering how code that initializes DRAM actually runs

Out of curiosity, why has there not been a slight paradigm shift in modern system programming languages to expose more control over the caches?

Same reason as the failure of Itanium's VLIW instructions: you don't actually want to force the decision of what is in the cache back to compile time, when the relevant information is better available at runtime.

Also, carrying additional information on instructions costs instruction bandwidth and I-cache space.


> you don't actually want to force the decision of what is in the cache back to compile time, when the relevant information is better available at runtime

That is very context-dependent. In high-performance code having explicit control over caches can be very beneficial. CUDA and similar give you that ability and it is used extensively.

Now, for general "I wrote some code and want the hardware to run it fast with little effort from my side", I agree that transparent caches are the way.


x86 provides this control with non-temporal load/store instructions.

That solves the pollution problem, but it doesn't pin cache lines. It also doesn't cover the case PPC handles, where you want to assert a line is valid without actually fetching it (dcbz).

That seems correct, but it also doesn’t account for managed languages with runtimes like JavaScript or Java or .NET, which probably have a lot of interesting runtime info they could use to influence caching behavior. There’s an amount of “who caches the cacher” if you go down this path (who manages cache lines for the V8 native code that is in turn managing cache lines for jitted JavaScript code), but it still seems like there is opportunity there?

That's a strange statement. It's certainly not black and white, but the compiler has explicit lifetime information, while the cache infrastructure is using heuristics. I worked on a project which supported region tags in the cache for compiler-directed allocation and it showed some decent gains (in simulation).

I guess this is one place where it seems possible to allow for compiler annotations without disabling the default heuristics so you could maybe get the best of both.


There are cache control instructions already. The reason it goes no further than prefetch/invalidate hints is probably that exposing a fuller API at the chip level to control the cache would overcomplicate designs and wouldn't be a backwards-compatible/stable API. Treating the cache as RAM would also require a controller, which then also needs to receive instructions, or the CPU would suddenly have to manage the cache itself.

I can understand why they just decide to bake the cache algorithms into hardware, validate it, and be done with it. I'd love it if a hardware engineer or more well-read fellow could chime in.


Another reason for doing cache algorithms in hardware is that cache access (especially for level 1 caches) has to be low latency to be useful.

Because programmers are in general worse at managing them than the basic LRU algorithm.

And because the abstraction is simple and easy enough to understand that when you do need close control, it's easy to achieve by just writing to the abstraction. Careful control of data layout and nontemporal instructions are almost always all you need.


There has been! Intel has Cache Allocation Technology, and I was very peripherally involved in reviewing research projects at Boston University into this. One that I remember was allowing the operating system to divide up cache and memory bandwidth for better prioritization.

https://www.intel.com/content/www/us/en/developer/articles/t...


This is not applicable to most programming scenarios since the cache gets trashed unpredictably during context switches (including the user-level task switches involved in cooperative async patterns). It's not a true scratchpad storage, and turning it into one would slow down context switches a lot since the scratchpad would be processor state. Maybe this can be revisited once even low-end computers have so many hardware cores/threads that context switches become so rare that the overhead is not a big deal. But we are very far from anything of the sort.

I would say this is the main benefit of CUDA programming on GPUs: you get to control shared (on-chip) memory. Maybe NVIDIA will bring it to the CPU now that they make CPUs.

They believe so because we have spent decades using the term AI for another category of symbolic methods (search-based chess engines, theorem provers, planners). In the areas where they were successful, these methods _were_ infallible (of course, compared to humans and modulo programming bugs).

Meanwhile, neural techniques have flown under the public consciousness radar until relatively recent times, when they had a huge explosion in popularity. But the term "AI" had retained that old aura of superhuman precision and correctness.


The fact is, in search, if you get one single result, that raises some red flags (unless you're extremely gullible). But chatbots will give you an answer + a reference, and never mention *how many* references on the 'net actually support that answer.

Then why aren't companies offering minimum-wage SWE-1 jobs already? Could it be that the output of an AI tool still needs a modicum of skill and craft to evaluate?


Internships are exactly that, and they are getting longer. I also think this is what will stop roles being transferred to other parts of the world.

Of course using AI is a skill. But the effort needed to get there is becoming lower and lower.


Well, not exactly. An internship is a temporary position, which people mostly take to improve a CV at an early career stage, or as a fallback after being laid off. A "minimum wage job" is... a job.


Internships at my employer pay 6 figures. I've never seen a minimum wage SWE internship.


Hell I had an internship in 1995 and they paid $10 an hour then and provided housing.

For context, my take-home was $650 every two weeks, which matched my total quarterly tuition at school and, the next year, the cost of renting a one-bedroom in the northern burbs of Atlanta.


They might manage to pin it all on Alan Dye, who recently jumped ship to Meta.


If you are so adamant about this, why don't you release all your own code in the public domain? Aren't you gatekeeping knowledge too?


I agree with GP, and so, yes, I release everything I do — code and the hundreds of thousands of painstakingly researched, drafted, deeply thought through words of writing that I do — using a public domain equivalent license (to ensure it's as free as possible), the zero clause BSD.


That's commendable, but unfortunately I asked GP.


Is there a link?


Sure!

Personal blog: https://neonvagabond.xyz/ (591,305 total words, written over 6 years; feel free to do whatever you want with it)

My personal github page: https://github.com/alexispurslane/ (I only recently switched to Zero-Clause BSD for my code, and haven't gotten around to re-licensing all my old stuff, but I give you permission to send a PR with a different license to any of them if you wanna use any of it)


The first three things are, in this order: collaborative editing, collaborative editing, collaborative editing. Seriously, this cannot be overstated.

Then: the LaTeX distribution is always up to date; you can run it on limited resources; it has an endless supply of conference and journal templates (so you don't have to scavenge them yourself off a random conference/publisher website); the Git backend means a) you can work offline and b) version control comes for free. And these are just off the top of my head.


One answer is right under Introduction:

> Content portability

> Users move between hosts without losing their content, audience, or metadata.


Did that require an entire new protocol though? I am 100% sure that if Twitter, Facebook and all the other platforms decided that they want to offer a way to move around accounts they could do it.


Maybe, but coordination is the problem. What does that data look like, what does the target look like, can they be transformed?

ATProto has Lexicons, which are more about social coordination than schemas for data correctness.

https://pfrazee.com/blog/lexicon-guidance

The protocol is much more than data portability, it essentially turns the global social media system into a giant distributed system anyone can participate in at any point. Imagine if FB also let you tap into the event stream or produce your own event stream other FB users could listen to in the official FB app. That would be a pretty awesome requirement for all social media apps, yea?

https://atproto.com/articles/atproto-for-distsys-engineers


> it essentially turns the global social media system into a giant distributed system anyone can participate in at any point.

Don’t we already have that, and it’s called “the web”? It’s already a giant distributed system anyone can participate in at any point.

What are we really gaining here?


A shared event bus, Lexicons for coordination, apps that store user data in the user's own database, separation of the client from app data.


If they decided to, sure, they could. They don't want to and never will.


I am not debating that. But this same reasoning applies to the AT Protocol or any other implementation: you have to be willing to implement the features and use the protocol. So I still don’t see why this is any different.


You keep asking questions, rejecting answers, and then saying you don't understand.

Perhaps it is time to read more about the protocol directly instead of asking questions on HN to poke holes in it from a position of ignorance.


> relatively harmless and minor errors

They are not harmless. These hallucinated references are ingested by Google Scholar, Scopus, etc., and with enough time they will poison those wells. It is also plain academic malpractice, no matter how "minor" the reference is.

