S3 as a Git remote and LFS server

by kbumsik on 10/19/24, 10:37 AM with 52 comments
by mdaniel on 10/19/24, 5:20 PM

All this mocking when moto exists is just :-( https://github.com/awslabs/git-remote-s3/blob/v0.1.19/test/r...

Actually, moto is just one band-aid for that problem: there are SO MANY S3 storage implementations, including the pre-license-switch Apache 2 version of MinIO (one need not use the bleeding edge for something as relatively stable as the S3 API)

by Scribbd on 10/19/24, 9:28 PM

This is something I was trying to implement myself. I am surprised it can be done with just an S3 bucket. I was messing with API Gateways, Lambda functions and DynamoDB tables to support the S3 bucket. It didn't occur to me to implement it client side. I might have stuck a bit too closely to the LFS test server implementation. https://github.com/git-lfs/lfs-test-server
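To make the client-side idea concrete, here is a minimal sketch of what a client can compute locally before talking to a bucket: the Git LFS pointer text (the version / oid / size format from the Git LFS spec) and an object key derived from the SHA-256 of the content. The `object_key` layout and the `prefix` argument are hypothetical, chosen for illustration; they are not claimed to match what git-remote-s3 or lfs-test-server actually use.

```python
import hashlib


def lfs_pointer(content: bytes) -> str:
    """Return the Git LFS pointer text for the given file content."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


def object_key(prefix: str, content: bytes) -> str:
    """Hypothetical bucket key layout: <prefix>/lfs/<sha256 of content>."""
    oid = hashlib.sha256(content).hexdigest()
    return f"{prefix}/lfs/{oid}"


if __name__ == "__main__":
    data = b"hello"
    print(lfs_pointer(data), end="")
    print(object_key("my-repo", data))
```

Because the key is content-addressed, uploading the same large file twice maps to the same key, so no server-side bookkeeping is needed to deduplicate it.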

by CGamesPlay on 10/20/24, 1:20 PM

If you are interested in using S3 as a git remote but are concerned with privacy, I built a tool a while ago to use S3 as an untrusted git remote using Restic. https://github.com/CGamesPlay/git-remote-restic

by zmmmmm on 10/20/24, 10:25 AM

Just remember, the minimum billing increment for file size is 128KB in real AWS S3. So your Git repo may be a lot more expensive than you would think if you have a giant source tree full of small files.
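A rough worked example of that effect, assuming every stored object is billed at a minimum of 128 KiB as the comment states. Whether such a minimum applies depends on the storage class, so the constant below is an assumption, and the file counts and sizes are made up for illustration.

```python
# Assumption taken from the comment above: each object is billed as at least 128 KiB.
MIN_BILLABLE_BYTES = 128 * 1024


def billable_bytes(sizes, minimum=MIN_BILLABLE_BYTES):
    """Sum object sizes, rounding each object up to the minimum billable size."""
    return sum(max(size, minimum) for size in sizes)


# Hypothetical source tree: 10,000 files of 2 KiB each, stored one object per file.
sizes = [2 * 1024] * 10_000
actual = sum(sizes)
billed = billable_bytes(sizes)

if __name__ == "__main__":
    print(f"actual bytes: {actual}")
    print(f"billed bytes: {billed}")
    print(f"overhead factor: {billed / actual:.0f}x")
```

Under these assumptions the tree holds about 20 MB of data but is billed as about 1.3 GB, a 64x difference. Storing the repository as a few large objects instead of one object per file avoids most of that overhead.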

by doctorpangloss on 10/20/24, 3:48 PM

https://alanedwardes.com/blog/posts/serverless-git-lfs-for-g...

I’ve used this guy’s CloudFormation template since forever for LFS on S3.

GitHub has to lower its egregious LFS pricing.

by x3n0ph3n3 on 10/20/24, 4:36 AM

Wow, AWS really wants to get rid of CodeCommit.

by Evidlo on 10/19/24, 11:37 PM

For the LFS part there is also DVC, which works better than git-lfs and natively supports S3.

by milkey_mouse on 10/20/24, 11:18 AM

You can also do this with Cloudflare Workers for fewer setup steps/moving parts:

https://github.com/milkey-mouse/git-lfs-s3-proxy

by philsnow on 10/19/24, 7:13 PM

I'm surprised they just punt on concurrent updates [0] instead of locking with something like DynamoDB, like Terraform does.

[0] https://github.com/awslabs/git-remote-s3?tab=readme-ov-file#...
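As a sketch of the kind of locking suggested here: the usual pattern is a conditional write ("create this item only if it does not already exist") that acts as a mutex around the push. The code below uses an in-memory `LockTable` as a stand-in for a table with conditional writes; `LockTable`, `put_if_absent`, `delete_if_owner`, and `push_with_lock` are all hypothetical names for illustration, not part of git-remote-s3, Terraform, or any AWS SDK.

```python
import threading


class LockTable:
    """In-memory stand-in for a table that supports conditional writes."""

    def __init__(self):
        self._items = {}
        self._mutex = threading.Lock()  # makes each table operation atomic

    def put_if_absent(self, key, owner):
        """Create the lock item only if no item exists for this key."""
        with self._mutex:
            if key in self._items:
                return False
            self._items[key] = owner
            return True

    def delete_if_owner(self, key, owner):
        """Release the lock only if it is still held by this owner."""
        with self._mutex:
            if self._items.get(key) == owner:
                del self._items[key]
                return True
            return False


def push_with_lock(table, ref, owner, do_push):
    """Run do_push() only while holding the lock for ref; return True if it ran."""
    if not table.put_if_absent(ref, owner):
        return False  # someone else is pushing to this ref
    try:
        do_push()
        return True
    finally:
        table.delete_if_owner(ref, owner)


if __name__ == "__main__":
    table = LockTable()
    ok = push_with_lock(table, "refs/heads/main", "alice", lambda: print("pushing"))
    print("push ran:", ok)
```

A real deployment would also need lock expiry so that a client that crashes mid-push does not block the ref forever; that is left out of this sketch.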

by kernelsanderz on 10/20/24, 9:11 PM

I’ve been using https://github.com/jasonwhite/rudolfs, which is written in Rust. It’s high performance but doesn’t have all the features (auth) that you might need.

by fortran77 on 10/19/24, 8:50 PM

Amazon has deprecated AWS CodeCommit, so this may be an interesting alternative.

by tonymet on 10/19/24, 10:18 PM

How does it handle incremental changes? If it’s writing your entire repo on a loop, I could see why AWS would promote it.

by WhyNotHugo on 10/20/24, 2:53 PM

git-annex also has native support for S3.

by xena on 10/20/24, 12:48 PM

How do you install this? Homebrew broke global pip install. Is there a homebrew package or something?

by mattxxx on 10/20/24, 1:30 PM

This seems wrong, since you can't push transactionally + consistently in S3.

They address this directly in their section on concurrent writes: https://github.com/awslabs/git-remote-s3?tab=readme-ov-file#...

And in their design: https://github.com/awslabs/git-remote-s3?tab=readme-ov-file#...

But it seems like this is just the wrong tool for the job (hosting git repos).

by Havoc on 10/21/24, 12:12 PM

Does this work with other S3 implementations like MinIO?