Tag Archives: HDP
Hadoop Likes Big Files
One of the frequently overlooked yet essential best practices for Hadoop is to prefer fewer, bigger files over more, smaller files. How small is too small and how many is too many? How do you stitch together all those small Internet of Things files into files “big enough” for Hadoop to process efficiently? The Problem…
Create HDInsight Cluster in Azure Portal
Creating an HDInsight cluster from the Azure portal is very easy. However, sometimes you want all the choices and best practices explained as well as the “how to”. I have created a series of slides with audio recordings to walk you through the process and choices. They are available as sessions 1-8 of “Create HDInsight…
Master Choosing the Right Project for Hadoop
Hadoop is the hot buzzword of the Big Data world, and many IT people are being told “go create a Hadoop cluster and do some magic”. It’s hard to know where to start or which projects are a good fit. The information available online is sparse, often conflicting, and usually focused on how to solve…
Understanding WASB and Hadoop Storage in Azure
Yesterday we learned Why WASB Makes Hadoop on Azure So Very Cool. Now let’s dive deeper into Windows Azure storage and WASB. I’ll answer some of the common questions I get when people first try to understand how WASB is the same as and different from HDFS. What is HDFS? The Hadoop Distributed File System…
Why WASB Makes Hadoop on Azure So Very Cool
Data. It’s all about the data. We want to make more data-driven decisions. We want to keep more data so we can make better decisions. We want that data stored cheaply, easily accessible, and quickly ingested. Hadoop promises to help with all those things. However, when you deal with Hadoop on-premises you have a…
Taking Flight a.k.a. The Data Dragon’s Life After Microsoft
Cross-posted (with slightly worse formatting) from http://befriendingdragons.com/2014/07/23/taking-flight-a-k-a-the-data-dragons-life-after-microsoft/ Life is a journey – we can choose to fly through it with our wings spread to catch and channel the winds, or we can let the winds pummel us to the ground. I choose to take flight, enjoy the journey, and land on my feet. Then take off again. Even when…