The big reason is that domestic ISPs don't want to switch (not just in the US, but everywhere really.)
Data centers and most physical devices made the jump pretty early (I don't recall a time when the VPS providers I used didn't offer IPv6, and every device I've used in the last two decades has supported it besides some retro handhelds), but domestic ISPs have been lagging behind. Mobile networks are switching en masse because they're simply running into the internal limits of IPv4.
Domestic ISPs don't have that pressure; unlike mobile networks (where 1 connection needing an IP = 1 device), they have an extra layer in place (1 connection needing an IP = 1 router and its intranet), which relieves most of it.
The lifespan of domestic-ISP-provided hardware is also completely unbound by anything resembling a security patch cycle, cost amortization or value depreciation. If an ISP supplies a device, it's going to stay in place until it fundamentally breaks to the point where it quite literally doesn't work anymore (basically hardware failure). It took over 10 years to kill WEP in favor of WPA on consumer-grade hardware. To support IPv6, domestic ISPs would need a mass recall of all their ancient tech, and they don't want to do that because there's no real pressure to.
IPv6 exists concurrently with IPv4, so it's easier for ISPs to make anyone wanting to host things pay extra for an IPv4 address (externalizing an ever-increasing cost onto sysadmins as the IPv4 space runs out of addresses) rather than upgrade the underlying tech. The internet default for user-facing stuff is still IPv4, not IPv6.
If you want to force IPv6 adoption, major sites basically need to stop routing over IPv4. Let's say Google becomes inaccessible over IPv4 - I guarantee you that within a year, ISPs will suddenly see a much greater shift towards IPv6.
It's frustrating that even brand new Unifi devices that claim to support IPv6 are pretty broken when you actually try to use it. So it could easily be another 10 years from now, unless they can patch it upward in software.
I can understand in theory why they wouldn't want to back up .git folders as-is. Git has a serious object count bloat problem if you have any repository with a good amount of commit history, which causes a lot of unnecessary overhead in just scanning the folder for files alone.
I don't quite understand why it's still like this; it's probably the biggest reason why git tends to play poorly with a lot of filesystem tools (not just backups). If it'd been something like an SQLite database instead (just an example really), you wouldn't get so much unnecessary inode bloat.
At the same time, Backblaze is a backup solution. The need to back up everything is sort of baked in. They promise to be the third layer in a three-layer strategy (a directly connected backup, an in-home backup, an external backup), and that third one is probably the single most important of them all, since it's the one you should be touching the least in an ideal scenario. They really shouldn't be excluding any files whatsoever.
The cloud service exclusion is similar, only much worse. Imagine getting hit by a cryptoworm. Your cloud storage tool will dutifully sync everything encrypted, junking up your storage across devices, and because restoring old versions is both painful and near impossible at scale, you need an actual backup solution for exactly that situation. Backblaze excluding files in those folders feels like a complete misunderstanding of what their purpose should be.
I've actually spent some time debugging why git causes so many issues with the backup software I use (restic).
Ironically, I believe you have it backwards: pack files, git's solution to the "too many tiny files" problem, are the issue here; not the tiny files themselves.
In my experience, incremental backup software works best with many small files that never change. Scanning is usually just a matter of checking modification times and moving on. This isn't fast, but it's fast enough for backups and can be optimized by monitoring for file changes in a long-running daemon.
However, lots of mostly identical files ARE an issue for filesystems as they tend to waste a lot of space. Git solves this issue by packing these small objects into larger pack files, then compressing them.
Unfortunately, it's those pack files that cause issues for backup software: any time git "garbage collects" and creates new pack files, it ends up deleting and creating a bunch of large files filled with what looks like random data (due to compression). Constantly creating/deleting large files filled with random data wreaks havoc on incremental/deduplicating backup systems.
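A quick way to see that churn in a throwaway repo (assuming git is installed; `git repack -ad` stands in here for what gc does internally):

```shell
# Throwaway repo; two repacks produce two differently named pack files,
# so a deduplicating backup sees a brand new large "random data" file.
cd "$(mktemp -d)"
git init -q
git -c user.email=a@b -c user.name=demo commit -q --allow-empty -m one
git repack -a -d -q                  # everything into a single pack
before=$(ls .git/objects/pack/*.pack)
git -c user.email=a@b -c user.name=demo commit -q --allow-empty -m two
git repack -a -d -q                  # old pack deleted, new one written
after=$(ls .git/objects/pack/*.pack)
echo "$before"
echo "$after"
```

The pack name is a hash of its contents, so adding one commit and repacking replaces the old pack with a new, differently named file containing mostly the same data.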
Why should a file backup solution adapt to work with git? Or any application? It should not try to understand what a git object is.
I'm paying to copy files from a folder to their servers; just do that, no matter what the file is. Stay at the filesystem level, not the application level.
I'm not saying Backblaze should adapt to git; the issue isn't application related (besides git being badly configured by default; there is a fix via git gc, it's just that git gc basically never runs with the default settings).
It's that to back up a folder on a filesystem, you need to traverse that folder and check every file in that folder to see if it's changed. Most filesystem tools usually assume a fairly low file count for these operations.
Git, rather unusually, tends to produce a lot of files in regular use: before packing, every commit and object is stored as its own file on the filesystem (branches are just pointer files). Packing fixes that by compressing commits and objects together, but it doesn't happen continuously (only after an initial clone or when the garbage collector runs). Iterating over a .git folder can take a lot of time in a code path that's typically not very well optimized, since most "normal" people don't have thousands of tiny files of sprawled-out application state in their folders.
The correct solution here is either for git to change, or for Backblaze to implement better iteration logic (which will probably require special handling for git..., so it'd be more "correct" to fix up git, since Backblaze's tools aren't the only ones with this problem.)
7za (the compression app) does blazingly fast iteration over any kind of folder. This doesn't require special code for git. Backblaze's backup app could do the same but rather than fix their code they excluded .git folders.
When I backup my computer the .git folders are among the most important things on there. Most of my personal projects aren't pushed to github or anywhere else.
Fortunately I don't use Backblaze. I guess the moral is don't use a backup solution where the vendor has an incentive to exclude things.
IMHO, you can't do blazingly fast iteration over folders with small files in Windows, because every open is hooked by the anti-virus, and there goes your performance.
Windows has a much harsher approach to file locking than Linux, and backup software like Backblaze absolutely should be making use of it (lest it back up files while they're being modified), but that also means the software effectively has to ask the OS to lock each file, then release the lock when it's done. With a large number of files, that stacks up.
Linux file locking is, to put it mildly, deficient. Most software doesn't even bother acquiring locks in the first place. Piling onto that, basically nobody actually uses POSIX locks because the API has some very heavy footguns (most notably, every lock on a file is released whenever any close() on that file is called, even if another part of the same process still holds its own lock). Most Linux file locks instead work on the honor system: you create a file called filename.lock in the same directory as the file you're working on, and any software that sees filename.lock exists should leave the file alone.
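A minimal sketch of that honor-system convention (file names here are made up); `noclobber` makes the redirect fail if the lock file already exists, which makes acquisition atomic on a local filesystem:

```shell
# Hedged sketch of the filename.lock convention; "data.txt" is hypothetical.
cd "$(mktemp -d)"
target="data.txt"
acquire() { ( set -o noclobber; echo "$$" > "$target.lock" ) 2>/dev/null; }

if acquire; then
  status="acquired"        # safe to touch $target now
  rm -f "$target.lock"     # release by deleting the lock file
else
  status="busy"            # someone else holds it; the honor system says back off
fi
echo "$status"
```

This only works if every participant plays along, which is exactly the "honor system" point above.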
Nobody using file locks is probably the bigger reason why Linux chokes less on fast iteration than Windows, given that Windows is slow with loads of files even when you aren't running a virus scanner.
> Linux file locking is to put it mildly, deficient.
Since the introduction of flock on Linux, how bad is it really though? I don't see why one would need kludges like filename.lock. Though of course flock is still an "honor system" as you put it.
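For what it's worth, the flock(1) wrapper from util-linux makes flock easy to use from scripts, though as noted it's still advisory; a small Linux-specific sketch (lock file path made up):

```shell
# flock(1) from util-linux; only cooperating processes are excluded.
lockfile="$(mktemp)"

# Uncontended: the command runs under the lock.
first=$(flock -n "$lockfile" -c 'echo got-lock')

# Contended: hold the lock on fd 9, then a second non-blocking attempt fails.
second=$(
  exec 9>"$lockfile"
  flock -n 9
  flock -n "$lockfile" -c 'echo never-printed' || echo refused
)
echo "$first / $second"
```

A process that never calls flock can still read and write the file freely, which is the "honor system" part.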
Same - on one of my computers (Linux, btw) the only directories in the list of directories to back up are .git directories. That's what I'm concerned with, so that's what I back up. And it works just fine, with my provider.
Actually, once the initial backup is done there's no reason to rescan for changes. They could use Windows' file-change notification facilities (ReadDirectoryChangesW, or the NTFS USN journal) to learn when any file is modified or created and add it to the backup list.
No they don't. They just have to price the product to reflect changing user patterns. When Backblaze started, it was simply "we back up all the files on your drive"; they didn't even have a restore feature, that was your job when you needed it. Over time user behavior changed: these cloud drives were a huge data source they hadn't priced in, git gave them problems they hadn't factored in, etc. The issue is that their solution was to exclude it all, which makes them a half-baked solution for many of their users; they should have just changed the pricing and supported the backups people need today.
Well, I checked and it looks like none of my .git repos are backed up. All attempts to restore only restore the working copy. -_- I'm not sure why it was working for the person in the comment I linked.
I think it's understandable for both Backblaze and most users, but surely the solution is to add `.git` to their default exclusion list which the user can manage.
I think they shouldn't back up git objects individually because git handles the versioning information. Just compress the .git folder itself and back it up as a single unit.
Better yet, include deduplication, incremental versioning, verification, and encryption. Wait, that's borg / restic.
This is a joke, but honestly anyone here shouldn't be directly backing up their filesystems and should instead be using the right tool for the job. You'll make the world a more efficient place, have more robust and quicker to recover backups, and save some money along the way.
Eh, you really shouldn't do that for any kind of file that acts like a (an impromptu) database. This is how you get corruption. Especially when change information can be split across more than one file.
Sorry, what are you saying shouldn't be done? Backing up untracked/modified files in a git repo? Or compressing the .git folder and backing it up as a unit?
> Backing up untracked/modified files in a git repo?
This. It's best done as an atomic operation, such as a VSS-style snapshot, so the result is consistent, with no (or paused) writes to the files. Something like a zip is generally better because it occupies the filesystem for less time than the upload typically takes.
It's probably primarily because Linus is a kernel and filesystem nerd, not a database nerd, so he preferred to just use the filesystem, whose performance characteristics he understood well (at least on Linux).
> SourceGear Vault Pro is a version control and bug tracking solution for professional development teams. Vault Standard is for those who only want version control. Vault is based on a client / server architecture using technologies such as Microsoft SQL Server and IIS Web Services for increased performance, scalability, and security.
I decided to look into this (git gc should also be doing this), and I think I figured out why it's such a consistent issue with git in particular. Running git gc does properly pack objects together and reduce inode count to something much more manageable.
It's the same reason why the postgres autovacuum daemon tends to be borderline useless unless you retune it[0]: the defaults are barmy. git gc only runs if there's 6700 loose unpacked objects[1]. Most typical filesystem tools tend to start balking at traversing ~1000 files in a structure (depends a bit on the filesystem/OS as well, Windows tends to get slower a good bit earlier than Linux).
To fix it, running
> git config --global gc.auto 1000
should retune it, and any subsequent commit to your repos will trigger garbage collection properly once there are around 1000 loose files. Pack-file management seems properly tuned by default: at more than 50 packs, gc will repack into a larger pack.
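To see whether gc is doing anything, `git count-objects` reports the loose-object count; a throwaway-repo sketch (in a real repo you'd just run `git count-objects -v` in place):

```shell
# Throwaway repo just to show the counters moving.
cd "$(mktemp -d)" && git init -q
git -c user.email=a@b -c user.name=demo commit -q --allow-empty -m demo
loose_before=$(git count-objects | awk '{print $1}')  # loose object count
git gc --quiet                                        # pack everything now
loose_after=$(git count-objects | awk '{print $1}')
echo "loose objects: $loose_before -> $loose_after"
```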
[0]: For anyone curious, the default postgres autovacuum setting only runs when 10% of a table consists of dead tuples (roughly: deleted rows plus every old revision of an updated row). If you're working with a beefy table, you're never hitting 10%. Either tune it down or create an external cronjob that runs vacuum analyze more frequently on the tables you need to keep speedy. I'm pretty sure the defaults are tuned solely to keep Postgres' internal tables fast, since those are small enough that the 10% threshold actually triggers.
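A hedged sketch of that retuning (the database "mydb" and table "events" are made up; requires a running Postgres): a per-table override makes autovacuum fire on an absolute dead-row count instead of the 10% scale factor.

```shell
# scale_factor 0 means the absolute threshold alone decides when to vacuum.
psql -d mydb -c "
  ALTER TABLE events SET (
    autovacuum_vacuum_scale_factor = 0,
    autovacuum_vacuum_threshold    = 50000
  );"
```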
A few thousand files shouldn't be a problem to a program designed to scan entire drives of files. Even in a single folder and considering sloppy programs I wouldn't worry just yet, and git's not putting them in a single folder.
The trick is to not overengineer your hobby if you're only doing it to prove a point.
ie. Yes, you could run a full on corporate CA, issue SSL certificates for your domains, manually rig up wireguard and run your own internal corporate VPN... or you just accept that your grand total of 1 concurrent user on an intranet is probably just better served by setting up Tailscale and a wildcard LE certificate so that the browser shuts up. (Which is still not great, but the argument over HTTPS and intranets is not for right now.)
Same with other deployment tools like Docker - yes, there's a ton of fancy ways to do persistent storage for serverless setups but get real: you're throwing the source folder in /opt/ and you have exactly one drive on that server. Save yourself the pain and just bind mount it to somewhere on your filesystem. Being able to back the folder up just with cp/rsync/rclone/scp is a lot easier than fiddling with docker's ambiguous mess of overlay2 subfolders.
Every overengineered decision of today is tomorrow's "goddammit, I need to ssh into the server again for an unexpected edge case".
Spoken from the pretty obvious position of never having to have worked a low-wage people facing job.
Here's the real situation: the people who pick up the phone when you call aren't paid much above minimum wage. They have zero institutional power to fix anything. You're yelling at people who themselves are barely making enough money to get by.
It is worthless to yell at these people because they can't fix shit; they don't set policies and they have no power to change things. At best your yelling is counterproductive to what you want done (since now the front-facing employee dislikes you personally and is less inclined to try and help you out), and at worst it gets you into further trouble when you do need something routine done (since now you're on the list of people the employees won't put any extra effort in for, because they're jerks).
There are people that get paid to be the complaints facing entity of the organization, who are paid to withstand whatever shit you can throw at them and who have an ability to fix up whatever you needed in specific. They're not the people that pick up the phone.
What you need to do is channel your inner Karen and ask to speak to the manager. The manager can help you with this sort of thing; they're the ones who can actually do something about the machine, because they have a career they want to grow and risk actual consequences for pissing people off.
Be polite (but firm; you don't need to be walked over) to the first-tier support employees, even if they can't help you. Save the complaints for the manager (whom you shouldn't be afraid to ask to speak to either). The manager's job is to deal with the real complaints, not the routine stuff that just happens to need a human involved. They took the job of being the face of the machine for reasons other than "I literally need a minimum-wage job to survive".
The employee didn't mistreat anyone. She simply stated the procedure (which sucks!).
It was OOP who chose to escalate this into malicious compliance and ascribed a lot more to her attitude than what was actually said. OOP assumed she was out to get him specifically, when nothing in the described call even suggests as much.
The correct response would've been to ask for the manager and if the manager chooses to stonewall in an obnoxious way (which is possible!), then you pull the frustrating fax from hell on them. At that point, you're not just speaking to someone who has no power to fix shit, you're talking to someone who does have the power to fix shit and chooses to be a stick in the mud about it. That's when being a jerk back is deserved.
Being a jerk to low paid employees in this manner is unacceptable, rude and makes me think a lot less of the person writing it.
lol. stopped reading after your first line because I've worked every low-wage, customer facing job you can imagine. shoe salesman, phone rep for verizon and then t mobile and then at&t, fast food, local diner waitstaff, office receptionist, contract installer, HVAC repair, cable service tech. that's a truncated list. I have the opinions I have because I've had those jobs, not in spite of them. I know how I carried myself, and it was a very low bar to reach. it's only when people don't reach that bar that I raise issue. because the bar is "know where you are, know what you can do, know what you can't do, and be as accommodating and responsive to the client/customer as you possibly can be, given your constraints." doesn't feel onerous to me. and in this specific case, I don't even have a problem with karen, per se. at least not from the content of the story. my reply was in response to other people insisting that karen needs to be coddled because all she did was answer a phone and this horrible man sent her a fax! (the horror)
I don't dislike Codeberg inherently, but it's not a "true" GitHub replacement. It can handle a good chunk of GitHub repositories (namely those of well-established FOSS projects that want everything a proper capital-P Project has), but if you're just looking for a generic place to put code projects that aren't necessarily intended for public release and support (i.e. random automation scripts, scraps of concepts that never really got off the ground, things not super cleaned up), it's not really for that: private repositories are discouraged according to their FAQ and are very limited (up to 100 MB).
They also don't want to host your homepage, so if GitHub Pages is why you used GitHub, they are not a replacement.
Unfortunately I don't think there's really an answer to that conundrum that doesn't involve just spinning up your own git server and accepting all the operational overhead that comes with it. At least Forgejo (software behind Codeberg) is FOSS, so you can do that and it should cover most of what you need (and while you're in the realm of having a server, a Pages-esque replacement is trivial since you're configuring a webserver anyway.) Maybe Gitlab.com, although I am admittedly unfamiliar with how Gitlab's "main" instance has changed over the years wrt features.
> If you do not contribute to free/libre software (or if it is limited to your personal homepage), and we feel like you only abuse Codeberg for storing your commercial projects or media backups, we might get unhappy about that.
Emphasis mine. This isn't about if it's technically possible (it certainly is), it's whether or not it's allowed by their platform policies.
Their page publishing feature seems more like it's meant for projects and organizations rather than individual people. The way it's described here indicates that using them to host your own blog/portfolio/what have you is considered to be abusing their services.
Seems fair to me, they're a nonprofit that exists in our lived reality and not an abusive monopolist that can literally throw a billion dollars to subsidize loss leaders.
All it shows the world is why there needs to be a VAT like tax against US digital services to help drive a public option for developers.
There's no reason why the people can't make our own solutions rather than be confined to abusive private US tech platforms.
Are you seriously trying to pitch the flaming garbage heap that is Microsoft Windows as "better technology"? Microsoft is a predator, they offer licenses to schools at a knock-down rate in order to nurture a dependency on their product. The volume of cash that has been extracted from the general populace in this way is obscene. To top it off they have gone out of their way to sabotage free and open competitors, limiting the market to their shitty and overpriced offerings.
Disagree; the only alternative is to let the people decide. I don't trust a dozen men who already hold deeply undemocratic beliefs to dictate the direction of tech for society.
You are against democracy, I am not. Democracy has led to some of the best advances of civilization, all oligarchies have done is introduce mass poverty, mass misery, and mass death.
At least with democracy we went to the moon for mankind, not shareholders.
Reading what you quoted, no it is not, as long as you contribute to free software or you have projects that are open source. Not just your personal homepage. If you only have a personal homepage and nothing else that is open source, then they have a problem.
Which makes it not really a suitable replacement for GitHub, which is my entire point.
Keep in mind, I'm not saying Codeberg is bad, but its terms of use are pretty clear: they only really want FOSS, and anyone with something other than FOSS had better look elsewhere. GitHub let you put up basically anything that's "yours" and the license wasn't really their concern; that isn't the case with Codeberg. It's not about price either; it'd be fine if the offer were "either give us $5 for the privilege of private repositories or only publish and contribute public FOSS code". I'm fine paying cash for that if need be.
One of the big draws of GitHub (and what got me to properly learn git) back in the day with GitHub Pages in particular was "I can write an HTML page, do a git push and anyone can see it". Then you throw on top an SSG (GitHub had out of the box support for Jekyll, but back then you could rig Travis CI up for other page generators if you knew what you were doing), and with a bit of technical knowledge, anyone could host a blog without the full on server stack. Codeberg cannot provide that sort of experience with their current terms of service.
Even sourcehut has, from what I can tell, a more lenient approach to what they provide (and the only reason why I wouldn't recommend sourcehut as a GitHub replacement is because git-by-email isn't really workable for most people anymore). They encourage FOSS licensing, but from what I can tell don't force it in their platform policies. (The only thing they openly ban is cryptocurrency related projects, which seems fair because cryptocurrency is pretty much always associated with platform abuse.)
I mean, it is arguably much easier to just write the HTML page and upload it with FTP and everyone can see it. I never understood why github became a popular place to host your site in the first place.
> I never understood why github became a popular place to host your site in the first place.
Easy: it was free, it was accessible to people that couldn't spend money for a hosting provider (read: high schoolers) and didn't impose arbitrary restrictions on what you were hosting.
Back then, your options as a high school student were basically to either try and reskin a closed off platform as much as you could (Tumblr could do that, but GitHub Pages also released in the time period where platforms were cracking down on all user customization larger than "what is my avatar") or to accept that the site you wanted to publish your stuff on could disappear at any moment the sketchy hosting provider that provided you a small amount of storage determined your bandwidth costs meant upselling you on the premium plan.
GitHub didn't impose those restrictions, in exchange for being a bit less interactive when it came to publishing things (so no comment section without using Disqus or something like that, and chances are you didn't need comments anyway, so win-win). That's why it got a lot more popular than just using an FTP server.
There are multiple reasons why FTP by itself became obsolete. Some of them I can think of off the top of my head:
1) Passive mode. What is it and why do I need it? Well, you see, back in the old days, .... It took way too long for this critical "option" to become well supported and used by default.
2) Text mode. No, I don't want you to corrupt some of my files based on half-baked heuristics about what is and isn't a text file, and it doesn't make any sense to rewrite line endings anymore anyway.
3) Transport security. FTPS should have become the standard decades ago, but it still isn't to this day. If you want to actually transfer files using an FTP-like interface today, you use SFTP, which is a totally different protocol built on SSH.
chrome and firefox dropped support for it 5 years or so ago, it has had a lot of security issues over the years, was annoying over NAT, and there are better options for secure bulk transfers (sftp, rsync, etc)
Depending on your hardware (SBC), FTP can also be several times faster than SFTP for transferring files over a LAN. Though I'll admit to having used other protocols like torrents for large files that had bad transfers or other issues (low-quality connection issues causing dropped connections, etc).
Finding an HTTP+FTP host was easier than finding github. Your OS probably had an FTP client installed already, and even finding another one is easier than finding git, and most definitely easier than learning it.
And if you already knew how to write/make HTML you'd for sure already know all of that too.
This is definitely a matter of perspective. I have had a Github account since 2010, and git comes installed on Linux and macOS.
I don't always have a server available to host an HTTP+FTP server on. Or want to pay for one, or spend time setting one up. I can trust that Github Pages will have reasonable uptime, and I won't have to monitor it at all.
> And if you already knew how to write/make HTML you'd for sure already know all of that too.
This seems unnecessarily aggressive, and I don't really understand where it's coming from.
BTW, you can absolutely host plain HTML with Github Pages. No SSG required.
> And if you already knew how to write/make HTML you'd for sure already know all of that too.
That's a completely false statement. My kid took very basic programming classes in school which covered HTML so they could build webpages, which is a fantastic instant-results method. Hooray, now the class is finished, he wants to put it on the web. Just like millions of other kids who couldn't even spell FTP.
I touched on the issues with FTP itself in another comment, but who can forget the issues with HTTP+FTP, like: modes (644 or 755? wait, what is a umask?), .htaccess, MIME mappings, "why isn't index.html working?", etc. Every place had a different httpd config and a different person you had to email to (hopefully) get it fixed.
There were a lot of sites that provided some cPanel-like option as long as you were OK with yourcoolname.weirdhostingname.com. I believe they all came with a file browser and the ever-present public_html folder.
There was geocities (now gone) and a couple of *.tk domains that would inject their ads all over your page. Neither makes a great substitute for GitHub pages these days.
I just checked, I’m not using the feature but my current ISP still offers it: https://assistance.free.fr/articles/631 (10 GB FTP storage tied to the ISP-specific e-mail address).
Having looked it up, mine makes it an add-on service for 1,045円/month + 5,500円 set-up fee, at which point you might as well use a dedicated VPS service (which is probably what's actually going on behind the scenes anyway).
That FAQ snippet is insane to me. Maybe it's a cultural thing but I'd never do business with a company that has implicit threats in their ToS based on something so completely arbitrary.
The worst part is really the unclear procedure. If they set out terms that say they'll give me 4 weeks to migrate projects they don't like off the platform, with n email reminders in between, then that's not ideal but fine. As it is, I'd be worried I'll wake up to data loss if they get 'unhappy'. I have the same problem with sourcehut, actually, with their content policy.
What an absurd double standard. The language is patterned after GitHub's own caveats about misuse of GitHub Pages:
> you may receive a polite email from GitHub Support suggesting strategies[… such as, and including] moving to a different hosting service that might better fit your needs
GitHub Pages has never been a free-for-all. The acceptable use policy makes it clear:
> the primary focus of the Content posted in or through your Account to the Service should not be advertising or promotional[…] You may include static images, links, and promotional text in the README documents or project description sections associated with your Account, but they must be related to the project you are hosting on GitHub
Well it's kind of describing the reality that exists at other companies today. Most ToS's have clauses where they can kick you off for not using it as intended, solely at their discretion. At least these guys are honest and upfront about it. I do agree though some more guidelines around their policy would be nice.
> Unfortunately I don't think there's really an answer to that conundrum that doesn't involve just spinning up your own git server and accepting all the operational overhead that comes with it.
Hmm all that operational overhead... Of an ssh server? If you literally just want a place to push some code, then that really isn't that hard.
Lots and lots of programmers have very little understanding, and especially operational knowledge, of how to host a public service. You can be an extreme graphics programmer and not know the web stack at all.
And no, it's not that hard once you learn. Except now it's a never-ending chore when it used to be an appliance. Instead of a car, you have a project car.
> Lots and lots of programmers have very little understanding, and especially operational knowledge, of how to host a public service. You can be an extreme graphics programmer and not know the web stack at all.
Can confirm.
Also, not everyone who wants to share content publicly has a domain name with which to do so, or the kind of Internet connection that allows running a server. If you include "hosting" by using a hosting provider... it's perfectly possible (raises hand) to not even have any experience with that after decades of writing code and being on the Internet. (Unless you count things like, well, GitHub and its services, anyway.)
I think both of you are misunderstanding what I proposed. You just need a single VM with an ssh server. Literally no web service needed, if all you want to do is host some code remotely.
I didn't misunderstand. Sshd is still a public-facing network service. Most folks don't already know how, and don't want, to set up a machine that's always on, restarts on power loss, has a static IP or dynDNS, a domain name, proper routing, open ports, certs and enough bandwidth, and that's before you even worry about actual security beyond what's needed just to make it work. It's actually a big annoyance if you don't do it all the time.
The rest of the owl: go to a provider, set up a VM (20 questions), log in as root over SSH, set up firewalls, create a non-root user (useradd or adduser? depends on whether you want a home dir, I guess), debug why you can't ssh in, finally get in, sudo apt update, sudo apt install git (or is it named something else?), install fail2ban, configure the firewall.
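To be fair, once you're past all of that, the actual git part is tiny. A minimal sketch, assuming a Debian-ish VM and a hypothetical user/host (alice@myvm.example.com):

```shell
# On the server: create a bare repo under your own account
# (no dedicated "git" user needed)
ssh alice@myvm.example.com 'git init --bare ~/repos/project.git'

# Locally: point an existing repo at it and push over plain SSH
git remote add origin alice@myvm.example.com:repos/project.git
git push -u origin main
```

SSH key auth and the existing OS account do all the access control here; there's no web service in sight.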
If it's your ssh server and it's single user you don't need to use the "git@" part at all.
Just store the repo and access it with your account.
The whole git@ thing exists because most "forge" software is built around a single dedicated user doing everything, rather than taking advantage of the OS's users, permissions and ACL system.
For a single user it's pointless, and for anyone who knows how to set up filesystem permissions it's not necessary.
There isn't much advantage that can be taken from O/S users and perms anyway, at least as far as git is concerned. When using a shared-filesystem repository over SSH (or NFS etc.), the actually usable access levels are: full, including the abilities to rewrite history, forge commits from other users, and corrupt/erase the repo; read-only; and none.
Git was built to be decentralized, with everyone having their own copy. In an organization, someone trusted holds the keys to the canonical version. If you need to discuss and review patches, you use a communication medium (email, forums, IRC, a shared folder, ...).
Git was built to be decentralized but it ended up basically displacing all other version control systems, including centralized ones. There are still some holdouts on SVN and even CVS, and there are niche professional fields where other systems are preferred due to tighter integration with the rest of their tools and/or better binary file support, but, for most people, Git is now synonymous with version control.
But if you're just looking for a generic place to put code projects that aren't necessarily intended for public release and support (e.g. random automation scripts, scraps of concepts that never really got off the ground, things that aren't super cleaned up), they're not really for that - private repositories are discouraged according to their FAQ and are very limited (up to 100 MB).
Until the AI scrapers[1] come for you at 5k requests per second and you're doing operations in hard-mode.
1. Most forges have HTTP pages for discoverability. I suppose one could hypothetically set up an SSH-only forge and statically generate an HTML site periodically, but this is already advanced ops for the average GitHub user.
This isn't a real thing, and if it ever becomes a thing you can sue them for DDoS and send Sam Altman to jail. AI scraping is in the realm of 1-5 requests per second, not 5000.
Hey, I’m building Monohub as a GitHub alternative, and private repositories are a key feature - it started as a place for me to host my own random stuff. Monohub [dot] dev is the URL. It’s quite early in development, so it’s rough around the edges. It has PR support, though.
Hosted in EU, incorporated in EU.
Would be happy if you tried it out — maybe it’s something for you.
I started developing it as a slim wrapper around Git to support my own needs. At the same time, it is essential to have rich features like pull requests/code review, so I started focusing on designing a tool that strikes an appropriate balance between being minimalistic and functional. One thing that I focus on is allowing users to disable any feature they don't need.
And the site also uses Cloudflare (for domain registrar, DNS and CDN):
ipinfo monohub.dev
Core
- IP 188.114.96.1
- Anycast true
- Hostname
- City San Francisco
- Region California
- Country United States (US)
- Currency USD ($)
- Location 37.7621,-122.3971
- Organization AS13335 Cloudflare, Inc.
- Postal 94107
- Timezone America/Los_Angeles
Auth is hosted by Kinde (an Australian company, uses AWS)
FWIW, Pierre's "Code Storage" project [1] seems like it simplifies a lot of the operational overhead of running git servers, if what you want is "an API for git push". Not affiliated with the company (and I haven't tried it myself, so I can't vouch for how well it works), I just think it's a neat idea.
I think "Code Storage" (definitely needs a unique name), is less an API for git push (surely git push is that API?), and more an API for "git init"? It seems to be Git as infrastructure, rather than Git as a product. i.e. if you're using it for a single repo it's probably not a good fit, it's for products that themselves provide git repos.
Either take some responsibility and properly evaluate what that convenience means for you long term, or don't complain when they leverage that vendor lock-in to your disadvantage.
Yeah, but oh boy is a private GitLab server complicated. The Omnibus installation helps manage that, but if you outgrow it you're in for a complicated time.
Also, GitLab has CVEs like every other week... You're going to be on that upgrade train, unless you keep access locked down (no internet access!) and accept the admittedly lower risk of a vulnerable GitLab server on your LAN/VPN.
Even if GitLab is fully updated, you're fighting bot crawlers 24/7.
I think the internet has "GitHub Derangement Syndrome" right now. It's an outlet for people's frustration.
The current trend reminds me a lot of the couple years we had where Game Developers were that outlet. They needed to "Wake up" and not "Go woke, go broke". An incredible amount of online discourse around gaming was hijacked by toxic negativity.
I'm sure every individual has their really good logical reasons, but zooming out I think there is definitely a similar social pathology at play.
> I think the internet has "GitHub Derangement Syndrome" right now. It's an outlet for people's frustration.
I would argue that the open source people aren't the only ones paying attention right now.
If you are hosting proprietary code on GitHub, it has become clear that Microsoft is going to feed it into their AI training set. If you don't want that, you have no choice but to leave GitHub.
I take it you've never disabled Windows telemetry settings and had them magically restored after an update?
This company either does what it wants to abuse people, or is too incompetent to make their software work as instructed. Both possibilities are bad. I expect the same translates to GitHub.
While the donation banner doesn't seem like an issue to me, the WMF comparison is extremely inappropriate if they want to talk about appropriate means of fundraising.
The WMF is notorious for donation banners that make wildly exaggerated claims about the state of the Foundation. It needs some money to stay operational, but it is not, by any real stretch of the imagination, in financial trouble or losing its independence for lack of donations: it has a massive endowment that could run Wikipedia for the next 50 years or so, and major corporations already give the WMF money to keep it in the air. That makes the statements in those donation messages very deceptive to regular readers, scaring people in third-world countries into parting with their meager savings out of fear that Wikipedia will vanish. And in general, their donation drives are extremely intrusive on the respective Wikipedias.
I understand that the Document Foundation just wants to bring donations to the attention of their users, but the WMF is the worst point to compare it to.
If anything I think the WMF approach is why people are upset with the LibreOffice banner.
They have been breeding bad will and it is overflowing onto others.
That said, the failure of this post to recognise the problem with the WMF approach does not build confidence in their ability to recognise when users might have a legitimate complaint. That leads users to wonder where LibreOffice is headed.
I think that may be the first time I've seen licensing drama over something as minor as adding another author to the copyright list.
Pretty sure those are completely standard for major changes in maintainers/hostile forks/acknowledging major contributors. I've seen a lot of abandoned MIT/BSD projects add a new line for forks/maintainers being active again in order to acknowledge that the project is currently being headed by someone else.
From my "I am not a lawyer" view, Kludex is basically correct, although I suppose to do it "properly", he might need to just duplicate the license text in order to make it clear both contributors licensed under BSD 3-clause. Probably unnecessary though, given it's not a license switch (you see that style more for ie. switching from MIT to BSD or from MIT/BSD to GPL, since that's a more substantial change); the intent of the license remains the same regardless and it's hard to imagine anyone would get confused.
I suspect (given the hammering on it in responses), that Kludex asking ChatGPT if it was correct is what actually pissed off the original developer, rather than the addition of Kludex to the list in and of itself.
The original author said they were “the license holder”, specifically with a “the”, in discussions around both Starlette and MkDocs, which yes, just isn’t true even after rounding the phrase to the nearest meaningful, “the copyright holder”. This appears to be an honest misconception of theirs, so, not the end of the world, except they seem to be failing at communication hard enough to not realize they might be wrong to begin with.
Note though that with respect to Starlette this ended up being essentially a (successful and by all appearances not intentionally hostile?) project takeover, so the emotional weight of the drama should be measured with respect to that, not just an additional copyright line.
Inherently, not really. An expired, self-signed or even incorrect (as in, the wrong domain is listed) certificate can be used to secure a connection just as well as a perfectly valid certificate.
Rather, the purpose of all of these systems (in theory) is to verify that the certificate belongs to the correct entity, and not some third party that happens to impersonate the original. It's not just security, but also verification: how do I know that the server that responds to example.com controls the domain name example.com (and that someone else isn't just responding to it because they hijacked my DNS.)
The expiration date mainly exists to protect against 2 kinds of attacks: the first is that, if it didn't exist, if you somehow obtained a valid certificate for example.com, it'd just be valid forever. All I'd need to do is get a certificate for example.com at some point, sell the domain to another party and then I'd be able to impersonate the party that owns example.com forever. An expiration date limits the scope of that attack to however long the issued certificate was valid for (since I wouldn't be able to re-verify the certificate.)
The second is to reduce the value of a leaked certificate. If you assume that any certificate issued will leak at some point, regardless of how it's secured (because you don't know how it's stored), then the best thing you can do is make it so that the certificate has a limited lifespan. It's not a problem if a certificate from say, a month ago, leaks if the lifespan of the certificate was only 3 days.
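As a concrete sketch of how a client sees this: after a TLS handshake, Python's ssl module exposes the peer certificate's notAfter field as a string, and "is this expired?" is just a timestamp comparison (the date below is made up):

```python
import ssl
import time

# "notAfter" as it appears in a peer certificate dict, e.g. from
# ssl.SSLSocket.getpeercert(); this particular date is made up.
not_after = "Jun  1 12:00:00 2026 GMT"

expires_at = ssl.cert_time_to_seconds(not_after)  # epoch seconds, UTC
remaining_days = (expires_at - time.time()) / 86400

if remaining_days < 0:
    print("certificate expired")
else:
    print(f"certificate valid for roughly {remaining_days:.0f} more days")
```

The shorter that window is, the less a leaked key is worth to an attacker.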
Those are the on paper reasons to distrust expired certificates, but in practice the discussion is a bit more nuanced in ways you can't cleanly express in technical terms. In the case of a .mil domain (where the ways it can resolve are inherently limited because the entire TLD is owned by a single entity - the US military), it's mostly just really lazy and unprofessional. The US military has a budget of "yes"; they should be able to keep enough tech support around to renew their certificates both on time and to ensure that all their devices can handle cert rotations.
Similarly, within a network you fully control, the issues with a broken certificate setup mostly just come down to really annoying warnings rather than any actual insecurity; it's hard to argue that the device is being impersonated when it's literally sitting right across from you and you see the lights on it blink when you connect to it.
Most of the issues with bad certificate handling come into play only when you're dealing with an insecure network, where there's a ton of different parties that could plausibly resolve your request... like most of the internet. (The exception being specialty domains like .gov/.mil and other such TLDs that are owned by singular entities and as a result have secondary, non-certificate ways in which you can check if the right entity owns them, such as checking which entity the responding IP belongs to, since the US government literally owns IP ranges.)
It's probably a bit of both from what I've seen of how Americans tend to react to their government doing things (online anyways).
The US's quagmire of incoherent laws and many jurisdictions seems to be a bad combination of:
* Apathetic voters that are raised on a media diet of "big government bad", which impedes any regulations on a federal level. (Note that this is irrespective on if the voters actually want a small government, it's what they're led to believe.)
* Politicians that don't like to give up power; there's an unusual desire for local/state US officials to claim responsibility and get very pissy when the federal government steps in with a standardized solution. This is very unusual compared to other countries; punting responsibilities to local officials in other countries is generally seen as a way for politicians to abdicate responsibility by letting it die in micromanagement and overworked administrative workers and isn't popular to do anymore these days. (This is also a two way street, where federal US lawmakers can abdicate making any legislation that isn't extremely popular by just punting it down to the states, even if they have legal majorities.)
* The US court system overly favors case law over statute. Laws in the US are permitted to be painfully underdefined on the assumption that the courts will work out the finer details. It's an old system, designed for the days of bad infrastructure across large distances (like the British Empire it was copied from): it empowers the judiciary to make a snap decision even if there's no law on the books (yet), or a law hasn't reached the court in question. The result is a bunch of different jurisdictions, each with slightly different rules. It also encourages other bad behavior, like jurisdiction shopping, where people find the court most favorable to them and craft "the perfect case" to get the case law they want on the books, overriding similar decisions, and so on and so forth. In other countries, a supreme court ruling doesn't have nearly the same lasting impact that one does in the US.
* And finally, the entire system is effectively kept stuck in place because lobbyists like it this way: if they want to kill regulation, they just get some states to pass on it, then hem and haw at the notion of a federal regulation. Politicians keep the system in place on their own; lobbyists provide the grease/excuse to keep doing it. (And those lobbyists these days also have an increasing amount of ownership over US media, so the rhetoric about voters not liking big-government regulation is reinforced there as well.)
It didn't end up this way on purpose; the historical reasons for this are mostly untied from lobby interests (which is mostly just "the US is the size of a continent in width", "states didn't actually work together that much at first" and "the US copied shit from the British Empire"), but they're kept this way by lobby interests.
That does sound like there's an exploitable element there, doesn't it?
Statistically speaking, most people use one of the biggest email providers, which use their own models to detect spam (or even quietly drop messages). If you're doing an unpopular ToS change, why not craft the mail to still be RFC compliant, but in such a way that it won't be let through by any of the big providers? Then you can just claim the problem is on the user's side.
For example, the Message-ID header is technically not required (SHOULD rather than MUST), but as a spam detection measure, Gmail just drops the message entirely for workspace domains: https://news.ycombinator.com/item?id=46989217
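As a sketch of the trick (addresses are hypothetical): the message below is valid per RFC 5322 with or without the last header, since Message-ID is only a SHOULD - but omitting it is exactly the kind of thing a strict spam filter can punish.

```python
import email.utils
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "noreply@example.com"   # hypothetical sender
msg["To"] = "user@example.com"        # hypothetical recipient
msg["Subject"] = "Upcoming Terms of Service changes"
msg["Date"] = email.utils.formatdate(localtime=False)
msg.set_content("We are updating our terms...")

# RFC 5322 sect. 3.6.4: Message-ID SHOULD be present, not MUST.
# Leave this line out and the message is still compliant - but some
# providers reportedly drop such messages as likely spam.
msg["Message-ID"] = email.utils.make_msgid(domain="example.com")

print(msg["Message-ID"])
```

A sender who "forgets" that one line gets a message that validates fine in testing and still vanishes for most recipients.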
The exploitability goes both ways, I think. Users can also mark similar emails as spam to keep such emails out of their inboxes. Not sure how one could deal with that.