Hadoop Likes Big Files

One frequently overlooked yet essential best practice for Hadoop is to prefer fewer, larger files over many small ones. How small is too small, and how many is too many? How do you stitch together all those small Internet of Things files into files “big enough” for Hadoop to process efficiently? The Problem… Continue reading “Hadoop Likes Big Files”

Azure Data Factory: Hub Not Found

You can use the new Azure portal to create or edit Azure Data Factory components. Once you are done, you may want to automate the creation of future Data Factory components with PowerShell. In that case, you can use the JSON files you edited in the portal GUI as configuration files for the PowerShell cmdlets. For… Continue reading “Azure Data Factory: Hub Not Found”

SQL PASS: All the Magic Knobs

SQL PASS 2011 DBA-319-C #SQLPASS: All the Magic Knobs – Low Effort, High Return Tuning. Key points covered: Power Savings = High Performance; Smart Virtualization; Enough Hardware; Control other apps, filter drivers; Optimize for ad hoc workloads = ON; Compression = ON; Set LPIM + Max Server Memory; Pre-size files, avoid shrink and autogrow; Fast… Continue reading “SQL PASS: All the Magic Knobs”

Do I need DTC for my SQL Server?

I get a lot of questions about the “best” way to configure the Distributed Transaction Coordinator (DTC) for SQL Server on a Windows 2008 cluster. There is no one best way to do it, and the first question to ask should be, “Do you even use DTC, and if so, how and how often?” If… Continue reading “Do I need DTC for my SQL Server?”
