Backup/Restore to/from AWS S3 bucket #100
+1
Would you intend to configure credentials to write to S3 via some AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY? Or via which other method?
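For concreteness, a minimal sketch of the static-credential route, assuming the keys live in a Kubernetes Secret and are injected as environment variables; the Secret name and pod wiring below are placeholders, not the chart's actual schema:

```yaml
# Hypothetical example: AWS credentials held in a Kubernetes Secret and
# exposed to a backup container as AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY.
apiVersion: v1
kind: Secret
metadata:
  name: neo4j-backup-aws-credentials    # placeholder name
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: "AKIA...EXAMPLE"
  AWS_SECRET_ACCESS_KEY: "...example..."
---
# The backup pod would then pick up both keys via envFrom, roughly:
#   containers:
#     - name: backup
#       envFrom:
#         - secretRef:
#             name: neo4j-backup-aws-credentials
```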
Yes, that could be the initial setup, but I would also consider the possibility of having the ServiceAccount associated with the Pod linked to an AWS IAM Role when deployed inside an AWS EKS cluster (see IAM Roles for Service Accounts).
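For reference, a minimal sketch of what the IRSA side looks like in the cluster, assuming the IAM role and the EKS OIDC provider already exist; the names and ARN below are placeholders:

```yaml
# Hypothetical example: a ServiceAccount annotated for IAM Roles for Service
# Accounts (IRSA). Pods running under this ServiceAccount receive temporary
# credentials for the referenced IAM role, so no static keys are stored.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: neo4j-backup                      # placeholder name
  annotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::ACCOUNT_ID:role/ROLE_NAME"
---
# A backup Job/CronJob would reference it in its pod spec, roughly:
#   spec:
#     serviceAccountName: neo4j-backup
```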
We would also be interested in this, specifically with IRSA.
@bbenzikry has submitted a PR on this topic here: #118 👀
@moxious Thanks for letting me know. I see a few issues with the PR:
annotations:
  eks.amazonaws.com/role-arn: "arn:aws:iam::ACCOUNT_ID:role/ROLE_NAME"
Unrelated to that, really nice work on https://github.com/neo4j-contrib/neo4j-spark-connector; I'm really looking forward to a 3.0.0 release.
On your point #1, I think there's a clear tension between the design approach of "make a backup container and expand its support for multi-cloud" (which is what the PR is evolving) and "make a bespoke backup approach for each cloud". The flexibility & power to use cloud-specific things like IRSA is eventually going to pull in the direction of option #2. As noted in the PR comments, eventually someone will ask for Azure, and then Alibaba, and so on. TBH, my main goal for the backup approach in this repo is to act as an example. In working with production customers, they end up having so many custom requirements that a pre-cooked single backup container usually ends up infeasible. For example, they'll want a split schedule: incremental every 24 hours, full backup weekly (sketched just below). Or they have pre-preparation requirements, etc. No single approach can capture all the requirements, frankly. Add to this that the cloud k8s distributions evolve and various security features keep getting added, and you get the picture. This is a long way of saying it doesn't bother me if the PR doesn't support IRSA. But if it really bothers you, we need to consider whether modifying this is the right approach, or whether the best available guidance to people will be to roll their own on the basis of an example.

On your second point -- volume sizing. I get this pain, but it's a complex issue. Sizing backups & database volumes the same is definitely not the right approach. Neo4j has a lot of options dealing with things like transaction & log retention, and we'd usually expect backups to be smaller than a real database. Doing them separately is a point of complexity, but also... flexibility. This could be broken out into a separate issue for discussion, though, in terms of (a) what can be done and (b) what's good to do here.

On your third point -- yes, I'm aware. A hard limitation here is that the neo4j backup tool is a CLI that doesn't know how to write to any storage layer other than local disk. So fundamentally all backup approaches will "double the data": you take a backup to local disk, then push it up to cloud storage, leaving 2 full copies of the same data. As I've thought through this, the best we can do here is choose the best-performing storage layer. I've looked into some options where you can mount S3 storage as a local FS, but this has so many extra issues & complexity that it doesn't seem like a good option for the kind of file access we're doing here.
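To make the split-schedule example concrete, a sketch of how it could be wired up with two CronJobs, assuming a backup image that accepts the backup type as an argument; the image, arguments, and schedules are placeholders, not something this chart ships:

```yaml
# Hypothetical example: incremental backups daily, a full backup weekly.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: neo4j-backup-incremental
spec:
  schedule: "0 3 * * *"              # every day at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: backup
              image: my-registry/neo4j-backup:latest   # placeholder image
              args: ["--mode=incremental"]             # placeholder flag
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: neo4j-backup-full
spec:
  schedule: "0 3 * * 0"              # every Sunday at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: backup
              image: my-registry/neo4j-backup:latest   # placeholder image
              args: ["--mode=full"]                    # placeholder flag
```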
What bothers me the most are the security implications.
I don't really see it that way, as the PR already has cloud-vendor-specific code (e.g. using a baked image with specific AWS / GCP CLIs, expecting credentials to be in a certain format, etc.). I'm just afraid that what will happen is that users will use this with a non-rotated secret they'll forget about. Obviously this could also go in the direction of showcasing a more agnostic solution (e.g. Vault), but I would think that would be even more of a bother. As IRSA support is off the table, would you be open to the following?
Thanks for clarifying this - the easiest way I can think of is using some kind of elastic store or pre-provisioned options (Gluster, etc.), but that's really outside of the chart's purview IMO.
It must be said, because it's not explicit & clear everywhere: there are some general security guidelines for running Neo4j, but I know you're asking about k8s specifically, and that information hasn't been written up. The status of this repo at the moment is that it's an open source project by yours truly; it has no official standing and isn't officially supported by Neo4j.
Yep. But it must be said: people will frequently do this even if you write detailed guidance telling them not to. Doing it in the more complicated way is simply more work, and people who use the helm chart are not frequently looking to become experts in these issues; they're usually trying to get something rolling. Endemic to the industry, really.
More agnostic solutions are separately problematic, as a lot of people are in locked-down environments where they can't necessarily adopt extra recommended technologies. For example, you can do nice things if you require this or that k8s add-on, but requiring such tends to progressively narrow the audience that can apply it, which is why most of the repo uses very vanilla k8s techniques, and why I haven't jumped on #42.
This is fine, and I'm not hostile to it, but this is expanding the ask; what's really wanted is a generalized approach that works across the 3 major clouds, with security best practices for securing the backups & specifying the access-credentials path. That's a legitimate request, but a big chunk of work for me alone. In the context of this issue & the linked PR, what we need to do is decide whether some iteration of the PR makes meaningful progress towards that goal.
Thanks for mentioning that; I was under the assumption that it was officially supported.
I hope I'll have some time in the coming weeks; maybe I'll try and have a go at this to help out. In general, thinking more about cloud-specific auth methods, perhaps this should be a larger issue that adds a generalized approach for dealing with cloud auth for both the cluster itself and backup/restore procedures - but this should probably be done once there's progress on related issues in other projects that can help such security features become more mainstream in Neo4j use (neo4j-contrib/neo4j-apoc-procedures#1596 as an example).
Would love it if you could weigh in on that; I didn't see anything specific about full/incremental backup sizes in the operations manual or via a quick search.
I don't think there's much guidance on differential storage planning. K8s is a relatively new environment for Neo4j. In a lot of deployments, it's done on VMs in environments where disk is cheap & plentiful, and storage is planned for in combination, since folks would run the backup tool on one of their server instances. As a result, only k8s folks tend to think about capacity planning in separate terms like this. In actuality, when working with a particular team, you tend to do it based on their needs. The variables at play are things like transaction log retention settings, log file retention & rotation settings, and overall ingest rate, which give you very different results for hyper-active transactional setups vs. big analytical graphs.
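For anyone trying to reason about sizing, the main knobs mentioned above live in neo4j.conf; a small illustrative fragment follows (Neo4j 4.x setting names, and the values are examples rather than recommendations):

```properties
# Transaction log retention largely determines how much the live store grows
# beyond what a backup of the same data would occupy.
dbms.tx_log.rotation.retention_policy=1 days
dbms.tx_log.rotation.size=250M
# Internal log file rotation & retention.
dbms.logs.debug.rotation.keep_number=7
dbms.logs.debug.rotation.size=20M
```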
* #backup aws support #100
* #backup secretName option
* #restore aws support for core pods
* #restore aws support for read pods
* #default values updated
* #single secret path for backup
* #restore deployment-scenarios
Closing this issue in light of the merge of the linked PR above, and also some odds and ends in this PR: #122. I'm working on the release process for 4.1.3-1 right now, and it'll probably be a few hours before the docs site is updated, but that should happen soon via a different build chain.
Many thanks to @srfaytkn, who did the bulk of the work to bring in the implementation of this!
Provide backup/restore helm chart for AWS S3 bucket