Tag: microsoft azure

  • Use Additional Storage Accounts with HDInsight Hive

    When you create an HDInsight Hadoop cluster you pass in one or more storage accounts and their associated keys. This allows you to access the files on all associated storage accounts from the cluster. If you want to use public storage that isn’t passed in at create time that’s easy – simply supply the storage account name each time you run a job. But how do you access data on private storage accounts that need an access key?

    The steps are laid out in this wiki by Eric Hanson: Using an HDInsight Cluster with Alternate Storage Accounts and Metastores

    http://social.technet.microsoft.com/wiki/contents/articles/23256.using-an-hdinsight-cluster-with-alternate-storage-accounts-and-metastores.aspx

    I am providing a variable-based variation of the PowerShell sample for Hive. To set up PowerShell for use with Azure, see Getting Started with Azure PowerShell Cmdlets–Subscription Management.

    First you will set some values for your environment. If you use your default subscription you don’t need to pass in the subscription name and select it. However, you will always need to specify the HDInsight cluster name. In this example $undefinedStorageAccount is the name of an account that you want to access from a cluster but you didn’t define it when you created the cluster. You always need to specify which container to use for any given reference so you also need to define $undefinedContainer. If the storage account belongs to the current subscription you can simply ask Azure to return the key (#commented out in the example below) or you can paste in the key that someone has given you.

    $subscriptionName = "LocalAzureSubscriptionName"
    $clusterName = "HDInsightClusterName"
    $undefinedStorageAccount = "AdditionalStorageAccount"
    $undefinedContainer = "ContainerOnAdditionalStorageAccount"
    #$undefinedStorageKey = Get-AzureStorageKey $undefinedStorageAccount | %{ $_.Primary }
    $undefinedStorageKey = "YourActualAccessKeyFromAzurePortal"

    Now choose which of your locally defined subscriptions to use:

    Select-AzureSubscription -SubscriptionName $subscriptionName

    Set the context of the cluster you want to use:

    Use-AzureHDInsightCluster $clusterName

    Now let’s check your HDInsight cluster properties.

    $defaultStorageAccount  = (Get-AzureHDInsightCluster -Name $clusterName).DefaultStorageAccount.StorageAccountName #default storage account
    $defaultContainerName   = (Get-AzureHDInsightCluster -Name $clusterName).DefaultStorageAccount.StorageContainerName
    $definedStorageAccounts = (Get-AzureHDInsightCluster -Name $clusterName).StorageAccounts #if no 2nd account is associated, no value is returned

    Let’s check the values and verify that the storage account you want to use is not listed as either the DefaultStorageAccount (every cluster has one) or as one of the additional known storage accounts configured during provisioning (you may have zero, one, or many).

    write-host "===Default storage account"
    $defaultStorageAccount
    write-host "===Default container name"
    $defaultContainerName
    write-host "===Other defined storage accounts for this cluster"
    $definedStorageAccounts
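
    Rather than comparing the names by eye, the check can be scripted. The following is a minimal sketch, not part of the original sample: Test-StorageAccountAssociated is a hypothetical helper name, and it assumes you pass it plain account names (for example the StorageAccountName of each entry, which is an assumption about the shape of the objects in $definedStorageAccounts).

```powershell
# Hypothetical helper: returns $true when $AccountName matches one of the
# account names already associated with the cluster.
function Test-StorageAccountAssociated {
    param(
        [Parameter(Mandatory)] [string] $AccountName,
        [string[]] $KnownAccountNames = @()
    )
    # Compare on the short name only, so "abc" and
    # "abc.blob.core.windows.net" are treated as the same account.
    $shortName = ($AccountName -split '\.')[0]
    foreach ($known in $KnownAccountNames) {
        if ($known -and (($known -split '\.')[0] -ieq $shortName)) {
            return $true
        }
    }
    return $false
}

# Example with literal values; in the script above you would pass
# $undefinedStorageAccount and the names of the default and additional accounts.
$known = @('mydefault.blob.core.windows.net', 'other.blob.core.windows.net')
Test-StorageAccountAssociated -AccountName 'abc' -KnownAccountNames $known
```

    If the function returns True, the cluster already has the key and the extra steps below are not needed for that account.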

    Next we’ll get a non-recursive listing of the files in the default location:

    invoke-hive "dfs -ls wasb://$defaultContainerName@$defaultStorageAccount/;" #default storage

    And then try to get a listing for the private storage account that we have not associated with the cluster:

    invoke-hive "dfs -ls wasb://$undefinedContainer@$undefinedStorageAccount.blob.core.windows.net/;" #not associated, errors

    Because the storage account access key is not yet known you will see an error similar to this one:

    Logging initialized using configuration in file:/C:/apps/dist/hive-0.12.0.2.0.7.0-1559/conf/hive-log4j.properties
    ls: org.apache.hadoop.fs.azure.AzureException: Unable to access container xyz in account abc using anonymous credentials, and no credentials found for them  in the configuration.
    Command failed with exit code = 1

    But we can fix this! From PowerShell we can pass in “defines” statements to change configuration values, add libraries, etc.

    $defines = @{}
    $defines.Add("fs.azure.account.key.$undefinedStorageAccount.blob.core.windows.net", $undefinedStorageKey)
    Invoke-Hive -Defines $defines -Query "dfs -ls wasb://$undefinedContainer@$undefinedStorageAccount.blob.core.windows.net/;"

    The access key is only available to this Hive query, but now that I have the variables set I can pass it in to other queries as well. Happy Hiving!
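
    If you plan to reuse the key across several queries or accounts, it may help to build the defines and the wasb:// URI in one place. This is only a sketch under my own naming: New-WasbAccess is a hypothetical helper, not an Azure cmdlet, and it simply repeats the key format and URI format used in the example above.

```powershell
# Hypothetical helper: builds the -Defines hashtable and the wasb:// URI
# for one storage account, using the same formats as the example above.
function New-WasbAccess {
    param(
        [Parameter(Mandatory)] [string] $StorageAccount,  # short name, no DNS suffix
        [Parameter(Mandatory)] [string] $Container,
        [Parameter(Mandatory)] [string] $StorageKey
    )
    $endpoint = "$StorageAccount.blob.core.windows.net"
    New-Object PSObject -Property @{
        Defines = @{ "fs.azure.account.key.$endpoint" = $StorageKey }
        Uri     = "wasb://$Container@$endpoint/"
    }
}

# Placeholder values; substitute your own account, container, and key.
$access = New-WasbAccess -StorageAccount 'abc' -Container 'xyz' -StorageKey 'PlaceholderKey'
$access.Uri

# With the HDInsight cmdlets loaded and a cluster selected, the values can be reused:
# Invoke-Hive -Defines $access.Defines -Query "dfs -ls $($access.Uri);"
```

    Keeping the key in a variable rather than pasting it into each query also makes it easier to keep it out of saved scripts.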

    I hope you enjoyed this small bite of Big Data!

  • Getting Started with Azure PowerShell Cmdlets–Subscription Management

    I’ve started using the Azure PowerShell cmdlets more often to manage virtual machines and HDInsight in Azure. Once you connect to a subscription everything just works. However, the initial steps to get one or more subscriptions configured to be used from your machine or understanding how to change subscription information on your machine can be confusing. Some of the docs are contradictory, outdated, or incomplete. Often they assume you are only a co-admin of one subscription. The below steps should get you going with Azure cmdlets whether you admin one or many subscriptions.

    You need to enable your machine to talk to one or more Azure subscriptions. The first step is creating a certificate. Do NOT do this if you already used the PublishSettings commands unless you first use Remove-AzureSubscription (which removes the locally stored information about the specified subscription). Makecert is more secure than PublishSettings, especially if you (a given email address) have multiple co-administrators per subscription and/or you (a given email address) are a co-administrator of multiple subscriptions.

    The steps to get going are documented in Shep’s blog “Cloud Spelunking, Managing Azure form your Desktop via PowerShell (the Setup)” http://blogs.msdn.com/b/sql_shep/archive/2013/03/29/cloud-spelunking-managing-azure-form-your-desktop-via-powershell.aspx. I’ll go a bit deeper and fill in a few additional details on what Shep calls the “hard” option.

    Create a Certificate

    If you have IIS, Visual Studio, or the Windows SDK you will have some variation of a “Developer Command Prompt” (or VS201x or Visual Studio Command Prompt). Open that command prompt with the “run as administrator” option. Replace YourCertName with a meaningful name and run the below command. The cert always goes to the cert store on your local machine – the last parameter is an optional file-based copy of that certificate that we will need for the next step. If you don’t specify the location it goes to %windir%\system32. Be very protective of the .cer file – delete it once you have uploaded it. You can always generate another file if you need it.

    makecert -sky exchange -r -n "CN=<YourCertName>" -pe -a sha1 -len 2048 -ss My "c:\temp\<YourCertName>.cer"

    This certificate is yours – do not share it with others. The .cer file is only a copy of the public part of the certificate; makecert loaded the actual certificate, including its private key, into your personal certificate store (Manage User Certificates). If you want to reuse the certificate on other machines that you control, export it from that store together with its private key (a .pfx file) and import it into the certificate store on each machine. The .cer file alone is not enough to authenticate from another machine.
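
    To confirm that the certificate landed in the store, and to read its thumbprint locally, something like the following should work. It is a sketch with a hypothetical helper name (Get-CertThumbprintBySubject), and it assumes the certificate is in the current user's personal store, which is where the PowerShell code later in this post looks for it.

```powershell
# Hypothetical helper: returns the thumbprints of certificates in a store
# whose subject is exactly "CN=<name>". Returns nothing when the store
# path is unavailable or no certificate matches.
function Get-CertThumbprintBySubject {
    param(
        [Parameter(Mandatory)] [string] $CertName,
        [string] $StorePath = 'cert:\CurrentUser\My'   # assumed store location
    )
    if (-not (Test-Path $StorePath)) { return }
    Get-ChildItem $StorePath |
        Where-Object { $_.Subject -eq "CN=$CertName" } |
        ForEach-Object { $_.Thumbprint }
}

# Replace YourCertName with the name you passed to makecert.
Get-CertThumbprintBySubject -CertName 'YourCertName'
```

    If nothing is returned, check the name you used with makecert and which store you are looking in.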

    Upload Certificate to Azure Subscription(s)

    Generally you will not want to share certificates with others. Any certificate you use must be in your personal certificate store (Manage User Certificates), which is where the PowerShell code later in this post reads it from. The same certificate must also be uploaded to the portal and associated with each subscription you wish to manage from your machine.

    From your local machine where you created the certificate in the above step:

    • Log in to the Azure Portal with an email address that is associated with the subscription you want to use from your own machine.
    • Scroll to the bottom of the left pane and choose “SETTINGS”

    • Choose “MANAGEMENT CERTIFICATES”

    • Click on the “UPLOAD” button in the middle of the bar at the bottom of the screen.

    • In the “Upload a management certificate” dialog navigate to the location specified in the last parameter above or %windir%\system32 if you didn’t specify a location. Choose the .cer file you just created with makecert (or export a certificate from the local certificate store – just make sure it has the right properties). If you have multiple subscriptions there is a second drop-down box where you need to choose the subscription that the certificate will be associated with.

    • Repeat for any additional subscriptions that you want to manage with the same certificate (or create one certificate per subscription for additional security granularity).

    Install and Configure the Azure PowerShell Cmdlets

    Follow the steps here to install the Azure Cmdlets. Basically you are selecting “Azure PowerShell” from the Web Platform Installer. You can also check in the Web Platform Installer for updated versions of the cmdlets.

    A very common setting that many admins set is the RemoteSigned Execution Policy. This is less secure than AllSigned or Restricted but allows you to use most downloaded scripts.

    Open Windows Azure PowerShell with the “run as admin” option and run:

    Set-ExecutionPolicy RemoteSigned -Force
    Get-ExecutionPolicy -List

    If you see errors when setting the execution policy, search on your specific error or start with this blog: “Set-ExecutionPolicy : Windows PowerShell updated your execution policy successfully, but the setting is overridden by a policy defined at a more specific scope!!!” You may need to open “Edit Group Policy” (in Windows 8 that opens the Local Group Policy Editor) and make a change. Sometimes you may need to set each individual scope, but process scope settings go back to the default when the process is closed:

    Set-ExecutionPolicy RemoteSigned -Scope Process -Force

    Then import the Azure cmdlets:

    Import-Module Azure

    You can now close the PowerShell window; the remaining steps do not need “run as admin”.

    Enable PowerShell to use a Subscription via a Certificate

    Repeat this section on each machine that will be used to execute PowerShell code. Also repeat for additional subscriptions on each machine.

    Open Windows Azure PowerShell. Optionally type ISE to open the Integrated Scripting Environment where you can edit, save, and run collections of cmdlets.

    First, set some variables. You will need to copy some basic settings from the Azure Management Portal. On the far left side of the portal, scroll all the way to the bottom and choose “SETTINGS” and then “MANAGEMENT CERTIFICATES” (see the “Upload Certificate to Azure Subscription(s)” section of this blog for more details – you are copying from the same place where you uploaded the certificate). Choose the certificate you just uploaded. Don’t worry if the values are cut off on the screen: if you highlight and copy, you will get the whole value, even the part that doesn’t show. Replace the values of $subID and $thumbprint below, but do not change the $myCert line, because it is built from $thumbprint. Execute the code in the PowerShell window.

    #copy SUBSCRIPTION ID from portal 
    #lower left, settings, management certificates
    $subID = "11111111-2222-3333-4444-555555555555"
    #copy THUMBPRINT from portal 
    #lower left, settings, management certificates
    $thumbprint = "1234567891234567891234567891234567891234"
    $myCert = Get-Item cert:\CurrentUser\My\$thumbprint
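
    A thumbprint pasted from a web page or dialog can pick up stray spaces or invisible characters, and the certificate lookup then fails. As a precaution you could clean the value first. This is a minimal sketch: ConvertTo-CleanThumbprint is a hypothetical helper name, and the length check assumes the usual 40 hex character SHA-1 thumbprint.

```powershell
# Hypothetical helper: strips anything that is not a hex digit, upper-cases
# the result, and checks for the 40 hex characters of a SHA-1 thumbprint.
function ConvertTo-CleanThumbprint {
    param([Parameter(Mandatory)] [string] $Thumbprint)
    $clean = ($Thumbprint -replace '[^0-9a-fA-F]', '').ToUpperInvariant()
    if ($clean.Length -ne 40) {
        throw "Expected 40 hex characters but found $($clean.Length)."
    }
    return $clean
}

# Example: the placeholder thumbprint from above, pasted with stray spaces.
$thumbprint = ConvertTo-CleanThumbprint ' 12 34 56 78 91 23 45 67 89 12 34 56 78 91 23 45 67 89 12 34 '
$thumbprint
```

    The cleaned $thumbprint can then be used in the Get-Item line exactly as before.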
    

    Now set the subscription name you will use to refer to this subscription from this machine. In most cases you will choose the NAME of the subscription from the portal but that is not required. The matching between your machine’s knowledge of the subscription and the subscription on Azure is done via the SUBSCRIPTION ID. Update $localSubName below and execute the code in the PowerShell window. Note that the local subscription name is case-sensitive.

    #subname to be used locally
    #usually you will choose the actual subscription name
    #stored in %appdata%\Windows Azure PowerShell\WindowsAzureProfile.xml
    $localSubName = "MyFavSub"
    

    Now that you have set the values for your own environment, run the code to actually update your machine’s knowledge of the subscription. Note that I used the back tick “`” to specify that the command continues on a new line.

    Set-AzureSubscription -SubscriptionName $localSubName `
    -SubscriptionId $subID -Certificate $myCert

    Some operations rely on a default storage account, so you may want to set the storage account to use for each subscription.

    #optionally set "current" storage account for this sub
    $defaultStorageAccount = 'MyFavStorageAccount'
    Set-AzureSubscription -SubscriptionName $localSubName `
    -CurrentStorageAccount $defaultStorageAccount
    

    Next you can set the default subscription that you will start with when you open PowerShell on this machine (note that we’ve changed from the Set cmdlet to the Select one):

    Select-AzureSubscription -Default $localSubName

    You can change which of the configured subscriptions is the current one:

    Select-AzureSubscription -Current $localSubName

    Check to see which subscription you are currently using:

    Get-AzureSubscription -Current
    (Get-AzureSubscription -Current).SubscriptionName

    Verify that you can connect and list the services associated with the current subscription:

    Get-AzureService | select ServiceName

    Look at the Local Configuration

    Now let’s look at what got updated on the local machine.

    Open File Explorer and go to %appdata%\Windows Azure PowerShell. Open WindowsAzureProfile.xml in Notepad or your favorite editor. Here are a few of the key values for each subscription you have mapped on your machine:

    IsDefault tells you which one is the default subscription for your machine:

    <IsDefault>true</IsDefault>

    The thumbprint id is stored as the ManagementCertificate:

    <ManagementCertificate>1234567891234567891234567891234567891234</ManagementCertificate>

    The local name you chose for the subscription is stored in Name (to avoid confusion, choose the name used in the portal):

    <Name>MyFavSub</Name>

    The subscription id is stored in SubscriptionId:

    <SubscriptionId>11111111-2222-3333-4444-555555555555</SubscriptionId>
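
    If you have several subscriptions mapped, reading these values with PowerShell can be quicker than scanning the file. The sketch below runs against a small inline sample rather than your real file. The four inner element names are the ones shown above, but the wrapper element names (Subscriptions, Subscription) are assumptions of mine, so the query matches on the inner names only.

```powershell
# Sample fragment for illustration only. The element names inside each
# subscription come from the text above; the wrapper element names
# ("Subscriptions", "Subscription") are assumptions.
$sample = [xml]@'
<Subscriptions>
  <Subscription>
    <IsDefault>true</IsDefault>
    <ManagementCertificate>1234567891234567891234567891234567891234</ManagementCertificate>
    <Name>MyFavSub</Name>
    <SubscriptionId>11111111-2222-3333-4444-555555555555</SubscriptionId>
  </Subscription>
</Subscriptions>
'@

# local-name() keeps the query independent of any XML namespace on the file.
$nodes = $sample.SelectNodes("//*[*[local-name()='SubscriptionId']]")
$summary = foreach ($node in $nodes) {
    New-Object PSObject -Property @{
        Name           = $node.SelectSingleNode("*[local-name()='Name']").InnerText
        SubscriptionId = $node.SelectSingleNode("*[local-name()='SubscriptionId']").InnerText
        IsDefault      = $node.SelectSingleNode("*[local-name()='IsDefault']").InnerText -eq 'true'
    }
}
$summary | Format-Table Name, SubscriptionId, IsDefault -AutoSize
```

    To try it against your own profile, you could replace the inline sample with [xml](Get-Content "$env:APPDATA\Windows Azure PowerShell\WindowsAzureProfile.xml"). Treat the file as read-only and make changes through the cmdlets.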

    Remove Subscription

    If you need to remove a subscription from your machine, whether because you no longer have access to it or because you want to change one of the properties such as the name or which certificate you use, you can use Remove-AzureSubscription. This updates your local %appdata%\Windows Azure PowerShell\WindowsAzureProfile.xml.

    #RemoveSub
    #Remove my machine's knowledge of a subscription 
    #Removes info from %appdata%\Windows Azure PowerShell\WindowsAzureProfile.xml
    Remove-AzureSubscription -SubscriptionName MyFavSub

    Sample Script

    Here is a handy dandy cut/paste version of the above PowerShell code to add a subscription and make it your default and current subscription:

    #copy SUBSCRIPTION ID from portal
    #lower left, settings, management certificates
    $subID = "YourOwnSubID"
    #copy THUMBPRINT from portal
    #lower left, settings, management certificates
    $thumbprint = "YourCertThumbprint"
    $myCert = Get-Item cert:\CurrentUser\My\$thumbprint
    #subname to be used locally
    #usually you will choose the actual subscription name
    #stored in %appdata%\Windows Azure PowerShell\WindowsAzureProfile.xml
    $localSubName = "YourSubscriptionName"
    #optionally set "current" storage account for this sub
    $defaultStorageAccount = 'OptionalDefaultStorage'
    Set-AzureSubscription -SubscriptionName $localSubName `
        -SubscriptionId $subID -Certificate $myCert
    Set-AzureSubscription -SubscriptionName $localSubName `
        -CurrentStorageAccount $defaultStorageAccount
    Select-AzureSubscription -Default $localSubName
    Select-AzureSubscription -Current $localSubName
    Get-AzureSubscription -Current
    (Get-AzureSubscription -Current).SubscriptionName

    You are Ready for PowerShell Gooey Goodness!

    Woohoo! Now you can access your Azure subscriptions from your machine without entering ids and passwords. You can automate, simplify, and standardize any Azure activity that has an associated cmdlet! Happy PowerShelling!