
Configure s3 for checkpointing #246

Open
mootezbessifi opened this issue Jan 16, 2022 · 1 comment


@mootezbessifi

Dears,

It is recommended to copy the proper S3 filesystem plugin jar (whether it is the Hadoop or the Presto one) into the plugin path before starting the JobManager. How can this be supported from the CR configuration?

My regards


liad5h commented Mar 23, 2022

@mootezbessifi I don't know if this is still relevant, but this is what we did to store our checkpoints and savepoints in AWS S3.
Under flinkConfig we set the following (a CR sketch follows the note below):

  • s3.access-key: <your access key>
  • s3.secret-key: <your secret key>
  • state.checkpoints.dir: s3://<bucket name>/checkpoints/
  • state.savepoints.dir: s3://<bucket name>/savepoints/
  • state.backend: filesystem

s3.access-key and s3.secret-key are not required if you are running on EKS / EC2 with an IAM role that can access S3.
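Putting it together, a minimal sketch of the CR fragment. The apiVersion/kind wrapper and the application name are assumptions that depend on your operator and its CRD version; only the flinkConfig keys come from the list above:

# Hypothetical wrapper; adjust apiVersion/kind to your operator's CRD
apiVersion: flink.k8s.io/v1beta1
kind: FlinkApplication
metadata:
  name: my-app                              # hypothetical name
spec:
  flinkConfig:
    s3.access-key: <your access key>        # omit when using an IAM role
    s3.secret-key: <your secret key>        # omit when using an IAM role
    state.checkpoints.dir: s3://<bucket name>/checkpoints/
    state.savepoints.dir: s3://<bucket name>/savepoints/
    state.backend: filesystem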

Create the S3 bucket.
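If you don't have one yet, it can be created with the AWS CLI; a one-line sketch, where the bucket name and region are placeholders:

aws s3 mb s3://<bucket name> --region <region>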

In your Dockerfile, add the following (see https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/filesystems/plugins/ for more details):

# Create the plugin directory and copy the bundled S3 filesystem jar into it
RUN mkdir -p /opt/flink/plugins/s3-fs-hadoop/
RUN cp /opt/flink/opt/flink-s3-fs-hadoop-*.jar /opt/flink/plugins/s3-fs-hadoop/ && chown -R flink: /opt/flink/plugins/s3-fs-hadoop/
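Alternatively, the official Flink Docker images can enable a bundled plugin at container startup without a custom image, via the ENABLE_BUILT_IN_PLUGINS environment variable; a sketch of the container env entry, assuming Flink 1.14.3 (the jar name must match your Flink version):

# Hypothetical env entry on the JobManager/TaskManager containers
env:
  - name: ENABLE_BUILT_IN_PLUGINS
    value: flink-s3-fs-hadoop-1.14.3.jar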
