Enable Bottlerocket on Neuron Instance types (Inferentia and Trainium) #8173

Open: vigh-m wants to merge 3 commits into main

@vigh-m vigh-m commented Jan 30, 2025

Description

Issue:

Closes: #8111

Description:

This change adds support for Bottlerocket AMIs on Neuron (Inferentia and Trainium) instance types. There is no new variant, so the standard Bottlerocket AMIs also work on Neuron instances.

This also updates the neuron-device-plugin.yaml spec to match the new Neuron Helm chart. The update was generated by running:

helm template neuron-helm-chart oci://public.ecr.aws/neuron/neuron-helm-chart

and pulling the relevant pieces out.

Testing

  • make test

  • Manual Testing: Launched a cluster with the following config.yaml:

    ---
    apiVersion: eksctl.io/v1alpha5
    kind: ClusterConfig
    
    metadata:
      name: mng-testing
      region: us-west-2
      version: '1.30'
    
    managedNodeGroups:
      - name: inf2-mng-test
        instanceType: inf2.xlarge
        amiFamily: Bottlerocket
        minSize: 1
        maxSize: 2
        desiredCapacity: 1
        volumeSize: 100
        disableIMDSv1: true
        iam:
          attachPolicyARNs:
            - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
            - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
            - arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess
            - arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
            - arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy
            - arn:aws:iam::aws:policy/AWSXrayWriteOnlyAccess
        ssh:
            allow: true
            publicKeyName: "angrykitten_server_key"
        bottlerocket:
          enableAdminContainer: true
          settings:
            motd: "Hello from eksctl!"
            kubernetes:
              device-ownership-from-security-context: true
    
    nodeGroups:
      - name: inf2-testing-1
        instanceType: inf2.xlarge
        desiredCapacity: 1
        amiFamily: Bottlerocket
        disableIMDSv1: true
        iam:
          attachPolicyARNs:
            - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
            - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
            - arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess
            - arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
            - arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy
            - arn:aws:iam::aws:policy/AWSXrayWriteOnlyAccess
        ssh:
            allow: true
            publicKeyName: "angrykitten_server_key"
        bottlerocket:
          enableAdminContainer: true
          settings:
            motd: "Hello from eksctl!"
            kubernetes:
              device-ownership-from-security-context: true

    The cluster came up as expected. The Neuron devices are recognized and allocatable on the nodes.
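
    One way to confirm that the devices are schedulable, not only listed as allocatable, is to run a throwaway pod that requests one. The manifest below is a minimal sketch and was not part of the testing above. The pod name and busybox image are illustrative choices, and it assumes the device plugin advertises the `aws.amazon.com/neuron` resource and exposes devices under `/dev/neuron*`.

    ```yaml
    # Hypothetical smoke-test pod; the name and image are illustrative only.
    apiVersion: v1
    kind: Pod
    metadata:
      name: neuron-smoke-test
    spec:
      restartPolicy: Never
      containers:
        - name: probe
          image: public.ecr.aws/docker/library/busybox:latest
          # List the device nodes made visible to the container (assumed path).
          command: ["sh", "-c", "ls -l /dev/neuron*"]
          resources:
            limits:
              # Assumed resource name advertised by the Neuron device plugin.
              aws.amazon.com/neuron: 1
    ```

    If the pod is scheduled and its logs list a device node, the plugin is allocating devices on the Bottlerocket node. A pod stuck in Pending would suggest the resource is not being advertised.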

Checklist

  • Added tests that cover your change (if possible)
  • Added/modified documentation as required (such as the README.md, or the userdocs directory)
  • Manually tested
  • Made sure the title of the PR is a good description that can go into the release notes
  • (Core team) Added labels for change area (e.g. area/nodegroup) and kind (e.g. kind/improvement)

BONUS POINTS checklist: complete for good vibes and maybe prizes?! 🤯

  • Backfilled missing tests for code in same general area 🎉
  • Refactored something and made the world a better place 🌟

@github-actions github-actions bot left a comment

Hello vigh-m 👋 Thank you for opening a Pull Request in the eksctl project. The team will review the Pull Request and aim to respond within 1-10 business days. Meanwhile, please read the Contribution and Code of Conduct guidelines. You can find more information about eksctl on our website.

@bryantbiggs bryantbiggs added the kind/feature (New feature or request) and area/nodegroup labels Jan 30, 2025
Successfully merging this pull request may close these issues.

[Feature] Enable launching Bottlerocket AMIs on AWS Neuron Instances