2016-09-03 61 views
1

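python - Mounting an EBS volume using boto3

I want to use AWS Spot instances to train neural networks. To prevent loss of the model when the spot instance is terminated, I plan to create a snapshot of the EBS volume, create a new volume from it, and attach that volume to a reserved instance. How can I mount the EBS volume, or otherwise make it available, using Python and boto3?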

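These are the steps used on Linux to make the volume available, but I want to automate the process so that I don't need to SSH into the instance every time. Here is the code I use to attach the volume: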

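import boto3

ec2 = boto3.resource('ec2')

spot = ec2.Instance('i-9a8f5082')
res = ec2.Instance('i-86e65a13')

# Snapshot the existing volume and wait until the snapshot is complete.
snapshot = ec2.create_snapshot(VolumeId="vol-5315f7db", Description="testing spot instances")
snapshot.wait_until_completed()

# Create a new volume from the snapshot and wait until it can be attached.
volume = ec2.create_volume(SnapshotId=snapshot.id, AvailabilityZone='us-west-2a')
ec2.meta.client.get_waiter('volume_available').wait(VolumeIds=[volume.id])

# Attach the new volume (not the original one) to the reserved instance.
res.attach_volume(VolumeId=volume.id, Device='/dev/sdy')
snapshot.delete()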

Answers

0

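You need to run the mount command on the instance. There are two ways to do that. One is to send the command over an ssh connection, as @mootmoot wrote. The other is to send the command using the AWS SSM service, as @Mark B wrote.

Example of sending bash commands to instances using AWS SSM: below is a detailed SSM solution sample; you can ignore the parts you don't need.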

import time

import boto3

# Amazon EC2 Systems Manager requires:
# 1. An IAM role for the EC2 instances that will process commands. The instance must use a
#    Systems Manager role (here it was assigned when the instance was created).
# 2. A separate role for the users executing commands. The AWS IAM user that owns the access
#    and secret keys must have SSM permission (e.g. AmazonSSMFullAccess).
# http://docs.aws.amazon.com/systems-manager/latest/userguide/sysman-configuring-access-policies.html
def execute_commands_on_linux_instances(commands, instance_ids):
    client = boto3.client('ssm', **conn_args)  # Need your credentials here

    outputs = []
    not_executed = []

    # Select only the instances that have an active SSM agent.
    resp = client.describe_instance_information(MaxResults=20)['InstanceInformationList']
    all_ssm_enabled_instances = [ins['InstanceId'] for ins in resp]
    ssm_enabled_instances = list(set(all_ssm_enabled_instances).intersection(instance_ids))
    not_worked_instances = list(set(instance_ids).difference(all_ssm_enabled_instances))

    if not ssm_enabled_instances:
        print("None of the requested instances has a working SSM agent!")
        return ssm_enabled_instances, not_worked_instances, outputs

    # Now, send the command. `commands` is a single shell command string.
    resp = client.send_command(
        DocumentName="AWS-RunShellScript",
        Parameters={'commands': [commands]},
        InstanceIds=ssm_enabled_instances,
    )

    # Get the command ID generated by send_command.
    com_id = resp['Command']['CommandId']

    # Wait until the command status leaves Pending and InProgress.
    while client.list_commands(CommandId=com_id)['Commands'][0]['Status'] in ('Pending', 'InProgress'):
        time.sleep(1)  # pause between polls instead of busy-waiting

    # Get the responses the instances gave to this command (stdout and stderr).
    # An instance that received the command but could not execute it (response code -1)
    # is recorded in not_executed instead.
    for i in ssm_enabled_instances:
        resp2 = client.get_command_invocation(CommandId=com_id, InstanceId=i)
        if resp2['ResponseCode'] == -1:
            not_executed.append(i)
        else:
            outputs.append({'ins_id': i, 'stdout': resp2['StandardOutputContent'],
                            'stderr': resp2['StandardErrorContent']})

    # Move the instances that never executed the command to not_worked_instances.
    ssm_enabled_instances = list(set(ssm_enabled_instances).difference(not_executed))
    not_worked_instances.extend(not_executed)

    return ssm_enabled_instances, not_worked_instances, outputs
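
To tie this back to the question, here is a minimal sketch of building the mount command that would be passed as `commands`. The helper name `build_mount_command` and the `/data` mount point are made up for illustration. It assumes the commands run with root privileges and that the volume was restored from a snapshot, so it already carries a filesystem and no mkfs step is needed. Depending on the kernel and instance type, a volume attached as /dev/sdy may appear under a different name such as /dev/xvdy; `lsblk` on the instance shows the actual name.

```python
import shlex


def build_mount_command(device, mount_point):
    """Return one shell command line that mounts `device` at `mount_point`."""
    dev = shlex.quote(device)
    mnt = shlex.quote(mount_point)
    # Create the mount point if needed, then mount the existing filesystem.
    return "mkdir -p {0} && mount {1} {0}".format(mnt, dev)


# Hypothetical usage with the function above (device name is an assumption):
# cmd = build_mount_command('/dev/xvdy', '/data')
# ok, failed, outputs = execute_commands_on_linux_instances(cmd, ['i-86e65a13'])
```

The command is returned as a single string because the function above wraps `commands` in a one-element list.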

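Create the instances with an IAM instance profile that has the required role, which in turn has the required policy. As a result of creating the instance this way, it has a running SSM agent: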

def create_ec2_instance(node_type):
    # Define the user data to run at instance launch; it installs the SSM agent.
    userdata = """#cloud-config

    runcmd:
    - cd /tmp
    - sudo yum install -y https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm
    """

    ec2_r = boto3.resource('ec2', **conn_args)

    rolename = "amazonec2ssmrole"
    i_pro_name = "ins_pro_for_ssm"

    # Create an IAM instance profile and add the required role to it.
    # Create the role and attach a policy to it if it does not exist yet.
    # Instances use this role to establish the SSM (EC2 Systems Manager) connection.
    iam = boto3.resource('iam', **conn_args)
    no_such_entity = iam.meta.client.exceptions.NoSuchEntityException

    try:
        iam.meta.client.get_instance_profile(InstanceProfileName=i_pro_name)
    except no_such_entity:
        iam.create_instance_profile(InstanceProfileName=i_pro_name)
    try:
        iam.meta.client.get_role(RoleName=rolename)
    except no_such_entity:
        iam.create_role(
            AssumeRolePolicyDocument='{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":["ec2.amazonaws.com"]},"Action":["sts:AssumeRole"]}]}',
            RoleName=rolename)
        role = iam.Role(rolename)
        role.attach_policy(PolicyArn='arn:aws:iam::aws:policy/service-role/AmazonEC2RoleforSSM')
        iam.meta.client.add_role_to_instance_profile(InstanceProfileName=i_pro_name, RoleName=rolename)

    iam_ins_profile = {'Name': i_pro_name}

    # Instance type and root volume size (GiB) for each supported node type.
    node_specs = {"Medium": ('t2.medium', 20), "Micro": ('t2.micro', 10)}
    if node_type not in node_specs:
        print("Node Type Error")
        return -1
    instance_type, volume_size = node_specs[node_type]

    instance = ec2_r.create_instances(
        ImageId='ami-aa5ebdd2',
        MinCount=1,
        MaxCount=1,
        UserData=userdata,
        InstanceType=instance_type,
        KeyName=key_pair_name,  # name of an existing EC2 key pair, defined elsewhere
        IamInstanceProfile=iam_ins_profile,
        BlockDeviceMappings=[{"DeviceName": "/dev/xvda", "Ebs": {"VolumeSize": volume_size}}])

    # Wait for the instance state, default --> one wait is 15 seconds, 40 attempts
    print('Waiting for instance {0} to switch to running state'.format(instance[0].id))
    waiter = ec2_r.meta.client.get_waiter('instance_running')
    waiter.wait(InstanceIds=[instance[0].id])
    instance[0].reload()
    print('Instance is running, public IP: {0}'.format(instance[0].public_ip_address))

    return instance[0].id
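
An instance that has reached the running state is not necessarily registered with Systems Manager yet, because the agent still has to start and check in. Here is a rough sketch of a check that could be run before sending commands. It assumes `ssm` is a boto3 SSM client created by the caller (for example `boto3.client('ssm', **conn_args)`); the helper name `ssm_online_instance_ids` is made up for illustration.

```python
def ssm_online_instance_ids(ssm, instance_ids):
    """Return the subset of `instance_ids` whose SSM agent reports as Online."""
    resp = ssm.describe_instance_information(
        Filters=[{'Key': 'InstanceIds', 'Values': list(instance_ids)}])
    # Keep only the instances whose agent has checked in recently.
    return [info['InstanceId']
            for info in resp['InstanceInformationList']
            if info.get('PingStatus') == 'Online']
```

Calling this in a loop with a short sleep until the new instance ID shows up is one way to avoid sending a command too early.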

Don't forget to give SSM permission (e.g. AmazonSSMFullAccess) to the AWS IAM user that owns the access key and secret key.

By the way, conn_args can be defined as follows:

conn_args = {
    'aws_access_key_id': Your_Access_Key,
    'aws_secret_access_key': Your_Secret_Key,
    'region_name': 'us-west-2'
}
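
If you would rather not keep the keys in the source file, one option is to read them from the environment. This is only a sketch: the function name `conn_args_from_env` is made up, and it assumes the standard AWS environment variables `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` and `AWS_DEFAULT_REGION` are set.

```python
import os


def conn_args_from_env(default_region='us-west-2'):
    """Build conn_args from environment variables instead of literals."""
    return {
        # A missing key raises KeyError, which surfaces the problem early.
        'aws_access_key_id': os.environ['AWS_ACCESS_KEY_ID'],
        'aws_secret_access_key': os.environ['AWS_SECRET_ACCESS_KEY'],
        'region_name': os.environ.get('AWS_DEFAULT_REGION', default_region),
    }
```

boto3 also reads these variables on its own when no explicit credentials are passed, so in that setup `conn_args` could shrink to just the region.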
1

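You have to perform those steps in the operating system. You can't perform them through the AWS API (Boto3). Your best bet is to script those steps and then kick off the script somehow via Boto3, possibly using the AWS SSM service.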
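
A minimal sketch of that idea, assuming `ssm` is a boto3 SSM client created by the caller and the instance already runs the SSM agent with a suitable role. The function name `run_script_via_ssm` and its parameters are made up for illustration, and the script lines are whatever mount steps you scripted.

```python
import time


def run_script_via_ssm(ssm, instance_id, script_lines, poll_seconds=2, timeout_seconds=120):
    """Run shell lines on one instance through SSM and return (status, stdout, stderr)."""
    sent = ssm.send_command(
        InstanceIds=[instance_id],
        DocumentName='AWS-RunShellScript',
        Parameters={'commands': list(script_lines)},
    )
    command_id = sent['Command']['CommandId']

    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        time.sleep(poll_seconds)
        try:
            result = ssm.get_command_invocation(CommandId=command_id, InstanceId=instance_id)
        except ssm.exceptions.InvocationDoesNotExist:
            continue  # the invocation may not be visible right after sending
        if result['Status'] not in ('Pending', 'InProgress', 'Delayed'):
            return (result['Status'],
                    result['StandardOutputContent'],
                    result['StandardErrorContent'])
    raise TimeoutError('command {0} did not finish in time'.format(command_id))


# Hypothetical usage (device and mount point are assumptions):
# ssm = boto3.client('ssm')
# run_script_via_ssm(ssm, 'i-86e65a13', ['mkdir -p /data', 'mount /dev/xvdy /data'])
```

Passing the client in keeps credential handling out of the helper and makes it easy to exercise with a stub.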

1

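What is wrong with sending and executing a script remotely over ssh? Assuming you are using Ubuntu, i.e.

ssh -i your.pem [email protected]_name_or_ip 'sudo bash -s' < mount_script.sh

If you tag these resources, you can later use boto3 to look them up by a common tag name instead of depending on specific static IDs.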
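
The tag lookup could look roughly like this. It is a sketch that assumes `ec2_client` is a boto3 EC2 client created by the caller; the function name `find_instance_ids_by_tag` and the example tag `Role=trainer` are made up.

```python
def find_instance_ids_by_tag(ec2_client, tag_key, tag_value):
    """Return the IDs of running instances that carry the given tag."""
    paginator = ec2_client.get_paginator('describe_instances')
    pages = paginator.paginate(Filters=[
        {'Name': 'tag:' + tag_key, 'Values': [tag_value]},
        {'Name': 'instance-state-name', 'Values': ['running']},
    ])
    # Instances are nested inside reservations in each response page.
    return [inst['InstanceId']
            for page in pages
            for reservation in page['Reservations']
            for inst in reservation['Instances']]


# Hypothetical usage:
# ec2_client = boto3.client('ec2')
# ids = find_instance_ids_by_tag(ec2_client, 'Role', 'trainer')
```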