Befriending Dragons

Transform Tech with Anti-bullying Cultures



Sample PowerShell Script: HDInsight Custom Create

This is a working script I use to create various HDInsight clusters. For a truly reproducible, automated environment you would want to put this into a .ps1 script that accepts parameters (see here for an example). However, you may find the method below good for learning and experimenting. Replace all the “YOURxyz” placeholders with your actual information. Beware of oddities introduced by cut/paste, such as spaces being replaced by line breaks or straight quotes being replaced by smart quotes. The # marks a comment; some commands that you rarely run are commented out, so remove the # to run them if you need them.
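As a rough idea of what the parameterized approach could look like, here is a minimal sketch that gathers the quick-create settings into a hashtable for splatting. The function name `New-QuickCreateArguments` and its parameter names are my own invention for illustration; only the hashtable keys come from the quick-create sample further down, and the actual cluster call is left commented out because it needs the Azure cmdlets and a live subscription.

```powershell
# Hypothetical helper (not part of the Azure cmdlets): collects the
# quick-create settings into one hashtable so they can be splatted
# onto New-AzureHDInsightCluster.
function New-QuickCreateArguments {
    param(
        [Parameter(Mandatory = $true)][string]$ClusterName,
        [Parameter(Mandatory = $true)][string]$Location,
        [Parameter(Mandatory = $true)][string]$StorageAccount,
        [Parameter(Mandatory = $true)][string]$StorageKey,
        [Parameter(Mandatory = $true)][string]$Container,
        [Parameter(Mandatory = $true)][string]$SubscriptionId,
        [int]$NumOfNodes = 1   # start small, as in the sample below
    )

    # Keys match the parameter names used by the quick-create sample
    return @{
        Name                        = $ClusterName
        ClusterSizeInNodes          = $NumOfNodes
        Subscription                = $SubscriptionId
        Location                    = $Location
        DefaultStorageAccountName   = "${StorageAccount}.blob.core.windows.net"
        DefaultStorageAccountKey    = $StorageKey
        DefaultStorageContainerName = $Container
    }
}

# Example use with placeholder values (replace the YOURxyz parts):
$quickCreate = New-QuickCreateArguments -ClusterName 'YOURNewHDInsightClusterName' `
    -Location 'YOURDataCenter' -StorageAccount 'YOURExistingStorageAccountName' `
    -StorageKey 'YOURStorageKey' -Container 'YOURExistingContainerName' `
    -SubscriptionId 'YOURSubscriptionId'

# Assumes the Azure cmdlets are installed and a subscription is selected:
# New-AzureHDInsightCluster @quickCreate -Credential (Get-Credential)
```

Keeping the settings in one hashtable means the same values can be logged, reviewed, or reused before anything is actually created.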

# This PowerShell script is meant to be a cut/paste of specific parts, it is NOT designed to be run as a whole.

# Do once after you install the cmdlets
#Get-AzurePublishSettingsFile
#Import-AzurePublishSettingsFile "C:\Users\YOURDirectory\Downloads\YOURName-credentials.publishsettings"

# Use if you admin more than one subscription
#Get-AzureAccount # This may be needed to log in to Azure
Select-AzureSubscription -SubscriptionName YOURSubscription
Get-AzureSubscription -Current

# Many things are easier in the ISE
ise

###############################################
### create clusters ###
###############################################

# Add your specific information here
# Previous failures may make a name unavailable for a while - check to see if previous cluster was partially created
$ClusterName = "YOURNewHDInsightClusterName" #the name you will give to your cluster
$Location = "YOURDataCenter" #cluster data center must be East US, West US, or North Europe (as of December 2013)
$NumOfNodes = 1 #start small
$StorageAcct1 = "YOURExistingStorageAccountName" #currently must be in same data center as the cluster
$DefaultContainer = "YOURExistingContainerName" #already exists on the storage account

# These variables are automatically set for you
$FullStorage1 = "${StorageAcct1}.blob.core.windows.net"
$Key1 = Get-AzureStorageKey $StorageAcct1 | %{ $_.Primary }
$SubID = Get-AzureSubscription -Current | %{ $_.SubscriptionId }
$SubName = Get-AzureSubscription -Current | %{ $_.SubscriptionName }
$Cert = Get-AzureSubscription -Current | %{ $_.Certificate }
$Creds = Get-Credential -Message "New admin account to be created for your HDInsight cluster" #this prompts you

###############################################
# Sample quick create
###############################################
# Equivalent of quick create
# The ` specifies that the cmd continues on the next line, beware of artificial line breaks added during cut/paste from the blog
New-AzureHDInsightCluster -Name $ClusterName -ClusterSizeInNodes $NumOfNodes -Subscription $SubID -Location "$Location" `
-DefaultStorageAccountName $FullStorage1 -DefaultStorageAccountKey $Key1 -DefaultStorageContainerName $DefaultContainer -Credential $Creds

###############################################
# Sample custom create
###############################################
#https://hadoopsdk.codeplex.com/wikipage?title=PowerShell%20Cmdlets%20for%20Cluster%20Management
# Most params are the same as quick create, use a new cluster name
# Pass in a 2nd storage account, a SQLAzure db for the metastore (assume same db for Oozie and Hive), add Avro library, some config values
# Execute all the variable settings from above

# This value is set for you, don’t change!
$configvalues = New-Object 'Microsoft.WindowsAzure.Management.HDInsight.Cmdlet.DataObjects.AzureHDInsightHiveConfiguration'

# Add your specific information here
$ClusterName = "YOURNewHDInsightClusterName"
$StorageAcct2 = "YOURExistingStorageAccountName2"
$MetastoreAzureSQLDBName = "YOURExistingSQLAzureDBName"
$MetastoreAzureServerName = "YOURExistingSQLAzureServer.database.windows.net" #gives a DNS error if you don't use the full name
$configvalues.Configuration = @{ "hive.exec.compress.output"="true" }  #this is an example of a config value you may pass in

# These variables are automatically set for you
$FullStorage2 = "${StorageAcct2}.blob.core.windows.net"
$Key2 = Get-AzureStorageKey $StorageAcct2 | %{ $_.Primary }
$MetastoreCreds = Get-Credential -Message "existing id/password for your SQL Azure DB (metastore)" #This prompts for the existing id and password of your existing SQL Azure DB

# Add a config file value
# Add AVRO SerDe libraries for Hive (on storage 1)
$configvalues.AdditionalLibraries = New-Object 'Microsoft.WindowsAzure.Management.HDInsight.Cmdlet.DataObjects.AzureHDInsightDefaultStorageAccount'
$configvalues.AdditionalLibraries.StorageAccountName = $FullStorage1
$configvalues.AdditionalLibraries.StorageAccountKey = $Key1
$configvalues.AdditionalLibraries.StorageContainerName = "hivelibs" #container called hivelibs must exist on specified storage account
# Create custom cluster
New-AzureHDInsightClusterConfig -ClusterSizeInNodes $NumOfNodes `
| Set-AzureHDInsightDefaultStorage -StorageAccountName $FullStorage1 -StorageAccountKey $Key1 -StorageContainerName $DefaultContainer `
| Add-AzureHDInsightStorage -StorageAccountName $FullStorage2 -StorageAccountKey $Key2 `
| Add-AzureHDInsightMetastore -SqlAzureServerName $MetastoreAzureServerName -DatabaseName $MetastoreAzureSQLDBName -Credential $MetastoreCreds -MetastoreType OozieMetastore `
| Add-AzureHDInsightMetastore -SqlAzureServerName $MetastoreAzureServerName -DatabaseName $MetastoreAzureSQLDBName -Credential $MetastoreCreds -MetastoreType HiveMetastore `
| Add-AzureHDInsightConfigValues -Hive $configvalues `
| New-AzureHDInsightCluster -Subscription $SubID -Location "$Location" -Name $ClusterName -Credential $Creds

###############################################
# get status, properties, etc.
###############################################
#$SubName = Get-AzureSubscription -Current | %{ $_.SubscriptionName }
Get-AzureHDInsightProperties -Subscription $SubName
Get-AzureHDInsightCluster -Subscription $SubName
Get-AzureHDInsightCluster -Subscription $SubName -name YOURClusterName

###############################################
# remove cluster
###############################################
#Remove-AzureHDInsightCluster -Name $ClusterName -Subscription $SubName
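
Since cut/paste from a blog can swap straight quotes and hyphens for smart quotes and dashes, a small cleanup pass over the pasted text may save some head scratching. This is only a sketch: `ConvertTo-PlainPunctuation` is a hypothetical helper name of my own, and it covers just the common curly quotes and en/em dashes. It will also straighten apostrophes inside comments, which is harmless.

```powershell
# Hypothetical helper: replaces typographic quotes and dashes with the
# plain ASCII characters a script is normally written with.
function ConvertTo-PlainPunctuation {
    param(
        [Parameter(Mandatory = $true)][AllowEmptyString()][string]$Text
    )

    # Build the character classes from code points so this helper
    # contains no typographic characters itself
    $doubleQuotes = "[{0}{1}{2}]" -f [char]0x201C, [char]0x201D, [char]0x201E
    $singleQuotes = "[{0}{1}{2}]" -f [char]0x2018, [char]0x2019, [char]0x201A
    $dashes       = "[{0}{1}]" -f [char]0x2013, [char]0x2014

    $Text = $Text -replace $doubleQuotes, '"'
    $Text = $Text -replace $singleQuotes, "'"
    $Text = $Text -replace $dashes, '-'
    return $Text
}

# Example: a pasted line with an en dash and curly double quotes
$pasted = 'Select-AzureSubscription ' + [char]0x2013 + 'SubscriptionName ' +
          [char]0x201C + 'YOURSubscription' + [char]0x201D
$clean = ConvertTo-PlainPunctuation -Text $pasted
```

Run the cleaned text through a quick visual check before executing it, since this does nothing about spaces that were turned into line breaks.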



Interview with Julie Strauss–Microsoft BI WIT

Julie Strauss is a very accomplished and respected Senior PM at Microsoft. Her current role is technical assistant for Microsoft Data Platform Group (DPG) Corporate Vice President Quentin Clark. She has been the public face of Microsoft BI at conferences and helps deliver great technical content and data stories to the public. Julie loves to help others so she has shared some background on herself and some great business advice that could be helpful to others seeking to improve their success.

Julie saw a job posting for the support team in Microsoft Norway (at the time Great Plains) looking for an individual willing to learn the ins and outs of the Microsoft BI products. She was excited that the posting indicated a willingness to learn was more important than previous knowledge of the particular Microsoft product. This was how and why Julie came here – she loves the technology and the data driven parts of the business and finds them fascinating.

Julie has a notable role with a wide range of responsibilities. The majority of her time is spent working on strategic projects to meet the goals of the team at the DPG Vice President level. Projects can vary in nature and cover everything from exploratory and technical projects to organizational projects. She gets to work with many areas of the business and enjoys interactions across the org. In addition to these internal facing responsibilities Julie also manages a set of customer and partner engagements for the business. Overall this role has provided Julie with an amazing learning opportunity. She gets to widen her scope while maintaining her data and BI focus and also use her years of experience from responsibilities ranging through sales, marketing, support, engineering, program management and people management. She merged these experiences into a role as technical assistant that utilizes some aspects of all those areas. Throughout her career she has chosen new jobs that allowed her to stretch and grow with a significant amount of change. But throughout it all she kept one core thing the same – her focus on BI and data. This mix of old and new in each role helps her cultivate new skills while leveraging what she already knows and expanding her influence. Within Microsoft there are many opportunities, something Julie feels is unique in the corporate world, and we can all find a way to shine and grow here.

Julie has an extensive network she finds invaluable in navigating all that opportunity. Her network lets her know about new opportunities and the network members also influence decision makers. She emphasizes that your reputation is everything, and your network carries that reputation to others. In a strong network everyone is contributing to each other’s success. She has a large network, though at any given point in time she is only actively interacting with a few people.

In addition to a network of contacts, Julie has closer relationships with a smaller group of people as both a mentor and a mentee. When Julie made the decision to move from marketing to engineering she leveraged her close mentoring relationship with Donald Farmer. Donald knew Julie and her work ethic and was willing to take a chance on Julie’s ability to succeed even though on paper it wasn’t an obvious fit. She stresses the importance of having semi-formal mentoring relationships with people at various levels. She asks various mentors for advice with experiences, projects, and specific interactions. Julie contributes back as a mentor to others – this keeps her coaching skills active. Julie observed that while she doesn’t treat her mentees differently based on their gender they tend to bucket themselves. More often than not women ask how to handle a specific situation or how to become more efficient or appear more confident. On the other hand men are more likely to ask task oriented questions such as how to make a specific change or how to write a better spec. She enjoys helping with both types of questions. Some of her mentees and mentors are people she already knew and some are people she grew to know only after the mentor-mentee relationship started.

I asked Julie what advice she feels has been most important to her success and would be helpful to others in the organization. In addition to networking and mentors, she offered these pearls of wisdom:

  • Be willing to take risks and take on new challenges. She has few regrets because she goes after what she wants. She does wonder if having no regrets at all means she didn’t stretch enough. You have to find your own balance.
  • Be true to who you are – how people see you, your brand, should reflect the real you. For Julie it has been very important to never compromise on being true to herself. Julie’s brand is “Give me a challenge and I will work my butt off to get it done, being creative as needed, bringing in people who will make it work.”
  • Never be a victim. Women are strong.
  • Pick something concrete to improve upon and just do it. For example, Julie was ranked as the lowest presenter at a conference. She decided to become a top 10 presenter – she achieved that goal and grew to truly enjoy presenting along the way.
  • Find work you love. Julie finds data fascinating because it is very tangible and with BI you control how it leads to insights, learnings, and possibilities. She loves how data and BI let you use your own imagination and set your own boundaries.
  • State your needs and get buy-in. For example, you might tell your manager that you want a promotion and lay out your plan to get there. Then you ask, “Is this realistically going to get me to my goal?” Make sure your manager understands your value and gives you feedback, then follow through on the actions with appropriately timed check-ins on whether you are still on track.

Over the years Julie has lived in Denmark, Norway, the UK, and the US. She is always looking for new challenges whether it’s how to succeed in a new country or job or taking on a demanding project. Whatever she does she is working hard and getting things done. Follow her advice – build your network, find a mentor or two, be clear on expectations, and always be true to who you are.

I want to thank Julie for sharing herself and her ideas with us – it can be tough to open up but Julie did a stellar job!



Self-Service BI Works!

When I talk to people about adding self-service BI to their company’s environment I generally get a list of reasons why it won’t work. Some things I commonly hear:

  • I can’t get anyone in IT or on the business side to even try it.
  • The business side doesn’t know how to use the technology.
  • This threatens my job.
  • I just don’t know where to start either politically/culturally or with the technology.
  • I have too many other things to do.
  • How can it possibly be secure, allow standardization, or result in quality data and decisions?
  • That’s not the way we do things.
  • I don’t really know what self-service BI means.

#PASSBAC 2013 Cindy and Eduardo 

So what is a forward-thinking BI implementer to do? Well, Intel just went out and did it, blowing through the supposed obstacles. Eduardo Gamez of Intel’s Technology Manufacturing Engineering (TME) group interviewed business folks to find those who were motivated for change, found a great pilot project with committed employees, and drove the process forward. They put up a “sandbox” environment for the business to use and came up with a plan for monitoring the sandbox activity to find models and reports worth adding to their priority queue for enterprise BI projects. The business creates its own data models and its own reports for both high and low priority items. IT provides the infrastructure and training, including products like Analysis Services, PowerPivot, Power View, SharePoint, Excel, SQL Server, and various data sources. The self-service models and reports are useful to the business: they reduce manual effort, deliver the reports people want much faster, and ultimately drive better, more agile business decisions. If a model isn’t quite right after the first try, the business can quickly modify it. The same models and reports are useful to IT: they serve as very refined and complete requirements docs that shorten the time to higher quality enterprise models and reports, and they free up IT resources to build a more robust infrastructure and to concentrate on projects that require specialized IT knowledge. Everyone wins with a shorter time to decision, higher quality decisions, and a significant impact on the bottom line.

Learn more about how Intel TME is implementing self-service BI:

Eduardo (eduardo.m.gamez@intel.com) and I (cgross@microsoft.com or @SQLCindy) are happy to talk to you about Self-Service BI – let us know what you need to know!


How_Intel__Integrates_Self-Service_BI_with_IT_for_Better_Business_Results_[DAV-208-M].zip



PASS BAC PREVIEW SERIES: SQL Professionals and the World of Self-service BI and Big Data

Are you excited about the upcoming PASS Business Analytics Conference? You should be! This conference will offer a wide range of sessions about Microsoft’s End to End Business Intelligence (including Self-Service BI), Analytics, Big Data, Architecture, Reporting, Information Delivery, Data Management, and Visualization solutions. Whether you are an implementer, a planner, or a decision maker there is something here for you!


What makes this conference different? Why should you put in the effort to attend this conference in particular? We are seeing a paradigm shift focused on shorter time to decision, more data available than ever before, and the need for self-service BI. Exciting technology solutions are being presented to deal with these needs, and new architectural skills are needed to implement them properly. Self-Service BI and Big Data are very different in many ways, but both respond to the same problem: the need for additional insights, with less time spent getting to those insights and the resulting impactful decisions. Self-Service BI via PowerPivot, Power View, Excel, and existing and new data sources including HDInsight/Hadoop (usually via Hive) offers fast time to decision, but you still sometimes need Enterprise BI to add value via services such as data curation, data stewardship, collaboration tools, additional security, training, and automation. Add in the powerful new data sources available with Big Data technologies such as HDInsight/Hadoop, which can also reduce time to decision and open up all sorts of new opportunities for insight, and you have many powerful new areas to explore. Not to mention that Dr. Steven Levitt, author of Freakonomics and SuperFreakonomics, is one of the keynote speakers!

Read more about my thoughts on Self-Service BI and Big Data in this #PASSBAC guest blog published today: PASS BAC PREVIEW SERIES: SQL Professionals and the World of Self-service BI and Big Data

And sign up for the session I am co-presenting at #PASSBAC with Eduardo Gamez of Intel: How Intel Integrates Self-Service BI with IT for Better Business Results

Take a look at all the information tagged with #PASSBAC and tweeted by @PASSBAC; there are some good blogs, preview sessions, and tidbits being posted. Get your own Twibbon for Twitter, Facebook, or however you want to use it; the Twibbon site will add a ribbon to the picture of your choice.

If you’re going to be in Chicago anyway, you might as well stay a few extra days for two nearby SQL Saturdays. The weekend before the conference, take a short hop over to Madison, WI for #SQLSAT206 on April 6, 2013 at the Madison Area Tech College. Then head over to the bacon, uhhh, PASS BA CONference April 10-12. Stay one more day in Chicago (technically Addison, IL) for the #SQLSAT211 sessions at DeVry. This is a great opportunity for even more SQL Server immersion and networking!

See you at #PASSBAC in Chicago in April!

@SQLCindy

Small Bites of Big Data