This project demonstrates how to create an Amazon MWAA environment that uses AWS CodeArtifact for Python dependencies. With this setup, MWAA does not need internet access through a NAT Gateway, which reduces infrastructure cost.
An AWS Lambda function runs every 10 hours to obtain an authorization token for AWS CodeArtifact, which is then used to create the index-url for the pip remote repository (the CodeArtifact repository). The generated index-url is saved to a codeartifact.txt file, which is then uploaded to an Amazon S3 bucket. MWAA fetches the DAGs and codeartifact.txt at runtime and installs Python dependencies from the CodeArtifact repository.
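The token refresh described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual Lambda handler: the function names `build_index_url` and `refresh_index_url`, their parameters, and the 12-hour token lifetime are assumptions made for the example. The boto3 calls (`get_authorization_token`, `get_repository_endpoint`, `put_object`) are real, and CodeArtifact accepts the fixed user name `aws` with the token as the password.

```python
from urllib.parse import quote, urlsplit


def build_index_url(repository_endpoint: str, auth_token: str) -> str:
    """Return a pip option line pointing at a CodeArtifact PyPI repository."""
    parts = urlsplit(repository_endpoint)
    # pip expects the "simple" index under the repository endpoint.
    path = parts.path.rstrip("/") + "/simple/"
    # Percent-encode the token defensively so it is safe inside a URL.
    password = quote(auth_token, safe="")
    return f"--index-url {parts.scheme}://aws:{password}@{parts.netloc}{path}"


def refresh_index_url(domain, domain_owner, repository, bucket,
                      key="codeartifact.txt"):
    """Hypothetical handler body: fetch a token and upload the index-url file."""
    import boto3  # imported lazily so build_index_url works without boto3

    codeartifact = boto3.client("codeartifact")
    token = codeartifact.get_authorization_token(
        domain=domain,
        domainOwner=domain_owner,
        durationSeconds=43200,  # assumption: 12 h, longer than the 10 h schedule
    )["authorizationToken"]
    endpoint = codeartifact.get_repository_endpoint(
        domain=domain,
        domainOwner=domain_owner,
        repository=repository,
        format="pypi",
    )["repositoryEndpoint"]
    boto3.client("s3").put_object(
        Bucket=bucket,
        Key=key,
        Body=build_index_url(endpoint, token).encode("utf-8"),
    )
```

Keeping the URL construction in a pure function makes it possible to check the format locally without AWS credentials.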
.
├── infra/ // AWS CDK infrastructure
├── mwaa-ca-bucket-content/ // DAGs and requirements.txt
├── lambda/ // Lambda handler
├── .env // Environment variables
├── Makefile // Make rules for automation
Before moving on with the project deployment, complete the following checks:
- Install npm on your machine
- Install Python on your machine
- Ensure that AWS CLI is installed and configured on your machine
- Ensure that AWS CDK is installed and configured on your machine
NOTE: This project uses AWS CDK version 1.102.0, hence the same version or higher is required.
To create a virtual environment, run the following make rule:
# from the root directory
$ make venv
This rule creates a virtual environment in infra/venv and installs all the necessary dependencies.
Set the environment variables in the .env file:
- AWS_REGION: AWS region to which you wish to deploy this project
- BUCKET_NAME: a unique name for the Amazon S3 bucket that will contain Airflow DAGs
- AIRFLOW_VERSION: Apache Airflow version (v1.10.12 or v2.0.2), set to the latest, v2.0.2
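Before deploying, it can help to sanity-check these values. The sketch below is an optional, hypothetical helper and not part of the project: it assumes the .env file uses plain KEY=VALUE lines, accepts the Airflow version with or without the leading "v", and applies the general S3 bucket naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens).

```python
import re

REQUIRED_KEYS = ("AWS_REGION", "BUCKET_NAME", "AIRFLOW_VERSION")
SUPPORTED_AIRFLOW_VERSIONS = ("1.10.12", "2.0.2")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def check_env(values):
    """Return a list of problems; an empty list means the values look fine."""
    problems = [f"{key} is not set" for key in REQUIRED_KEYS if not values.get(key)]
    bucket = values.get("BUCKET_NAME", "")
    if bucket and not re.fullmatch(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]", bucket):
        problems.append("BUCKET_NAME is not a valid S3 bucket name")
    version = values.get("AIRFLOW_VERSION", "")
    # Assumption: tolerate both "v2.0.2" and "2.0.2" spellings.
    if version and version.lstrip("v") not in SUPPORTED_AIRFLOW_VERSIONS:
        problems.append("AIRFLOW_VERSION must be v1.10.12 or v2.0.2")
    return problems
```

For example, `check_env(parse_env(open(".env").read()))` returns an empty list when all three variables are present and well formed.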
Execute the deploy rule to deploy the infrastructure:
# from the root directory
$ make deploy
NOTE: When prompted to confirm the deployment, type y and press Enter.
To destroy all resources created for this project, execute the destroy rule:
# from the root directory
$ make destroy
NOTE: When prompted to confirm the deletion, type y and press Enter.
To install your preferred Python dependencies in your MWAA environment, update the requirements.txt file and upload it to the S3 bucket. For the changes to take effect, update your MWAA environment by selecting the new version of requirements.txt. You can do so in the AWS Console or via the AWS CLI.
Upload requirements.txt with the new Python dependencies:
aws s3 cp mwaa-ca-bucket-content/requirements.txt s3://YOUR-BUCKET-NAME/
To list the versions of requirements.txt, run:
aws s3api list-object-versions --bucket YOUR-BUCKET-NAME --prefix requirements.txt
Finally, update your MWAA environment with the new version of requirements.txt:
aws mwaa update-environment --name mwaa_codeartifact_env --requirements-s3-object-version OBJECT_VERSION
If you build your own Python packages, you could also add this process of updating requirements.txt and the MWAA environment to your release pipeline.
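As one possible shape for such a pipeline step, the three CLI commands above can be combined with boto3. This is a sketch under stated assumptions, not part of the project: the function names and default arguments are hypothetical, and it assumes versioning is enabled on the bucket. The boto3 calls used (`upload_file`, `list_object_versions`, and `update_environment` with `RequirementsS3ObjectVersion`) correspond to the CLI commands shown earlier.

```python
def latest_version_id(versions):
    """Pick the VersionId flagged IsLatest from an S3 'Versions' listing."""
    for version in versions:
        if version.get("IsLatest"):
            return version["VersionId"]
    raise LookupError("no latest version found; is bucket versioning enabled?")


def publish_requirements(bucket, environment_name="mwaa_codeartifact_env",
                         path="mwaa-ca-bucket-content/requirements.txt",
                         key="requirements.txt"):
    """Upload requirements.txt and point the MWAA environment at its newest version."""
    import boto3  # imported lazily so latest_version_id works without boto3

    s3 = boto3.client("s3")
    s3.upload_file(path, bucket, key)

    listing = s3.list_object_versions(Bucket=bucket, Prefix=key)
    # Prefix matching can return other keys, so keep exact matches only.
    exact = [v for v in listing.get("Versions", []) if v["Key"] == key]
    version_id = latest_version_id(exact)

    boto3.client("mwaa").update_environment(
        Name=environment_name,
        RequirementsS3ObjectVersion=version_id,
    )
    return version_id
```

Calling `publish_requirements("YOUR-BUCKET-NAME")` from a release job would then trigger the environment update after each new requirements.txt is uploaded.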
This library is licensed under the MIT-0 License. See the LICENSE file.