One of the frequently overlooked yet essential best practices for Hadoop is to prefer fewer, bigger files over more, smaller files. How small is too small and how many is too many? How do you stitch together all those small Internet of Things files into files “big enough” for Hadoop to process efficiently? The Problem … Continue reading “Hadoop Likes Big Files”
Tag Archives: best practices
Create HDInsight Cluster in Azure Portal
Creating an HDInsight cluster from the Azure portal is very easy. However, sometimes you want all the choices and best practices explained as well as the “how to”. I have created a series of slides with audio recordings to walk you through the process and choices. They are available as sessions 1-8 of “Create HDInsight … Continue reading “Create HDInsight Cluster in Azure Portal”
Master Choosing the Right Project for Hadoop
Hadoop is the hot buzzword of the Big Data world, and many IT people are being told “go create a Hadoop cluster and do some magic”. It’s hard to know where to start or which projects are a good fit. The information available online is sparse, often conflicting, and usually focused on how to solve … Continue reading “Master Choosing the Right Project for Hadoop”
Azure Maximums and Resource Usage from PowerShell
Have you ever struggled to find out how many VM cores, HDInsight cores, storage accounts, or other Azure resources your subscription is set to allow or how many you actually use? Maybe you want to use this information in your automation scripts to avoid trying to create components for which you don’t … Continue reading “Azure Maximums and Resource Usage from PowerShell”
SQL PASS: All the Magic Knobs – Tools
In my All the Magic Knobs talk at #SQLPASS 2011 I discussed some easy ways to determine if you’re using some of the performance magic for SQL Server. When you have many consolidated, non-tier 1 databases you don’t have a lot of control over, the best way … Continue reading “SQL PASS: All the Magic Knobs – Tools”
SQL PASS: All the Magic Knobs
SQL PASS 2011 DBA-319-C #SQLPASS
All the Magic Knobs – Low Effort, High Return Tuning
Key points covered:
· Power Savings = High Performance
· Smart Virtualization
· Enough Hardware
· Control other apps, filter drivers
· Optimize for ad hoc workloads = ON
· Compression = ON
· Set LPIM + Max Server Memory
· Pre-size files, avoid shrink and autogrow
· Fast …
Continue reading “SQL PASS: All the Magic Knobs”
Taming the Tempdb Tempest – WI SQL Server Virtual User Group, 22 Apr 2011
Thanks to the Wisconsin Virtual SQL Server User Group for letting me talk about tempdb today! The slides and demo queries are attached. Once the recording is available I will update this blog with a link to it.
Taming the Tempdb Tempest Summary:
· Multiple data files of the same size, one log file
· …
Continue reading “Taming the Tempdb Tempest – WI SQL Server Virtual User Group, 22 Apr 2011”
General Hardware/OS/Network Guidelines for a SQL Box
I have put together some general guidelines for how you want a server to be delivered to the DBA team for a new SQL Server install. You won’t necessarily use all of them, but consider it a starting point for your SQL Server install standards. Places where it may be common to change the statements are … Continue reading “General Hardware/OS/Network Guidelines for a SQL Box”
Compilation of SQL Server TempDB IO Best Practices
It is important to optimize TempDB for good performance. In particular, I am focusing on how to allocate files. TempDB is a unique database in several ways. The ones most relevant to this discussion are:
· It is often one of the busiest databases on an instance. This means the performance of TempDB is …
Continue reading “Compilation of SQL Server TempDB IO Best Practices”