360Data - Organize, protect, connect and present your information

Wednesday, November 6, 2013

10 SQL Azure Gotchas

Another list! Honestly, I am perfectly capable of stringing words together into a semi-coherent self-contained piece with a beginning, middle and end, but this, I'm afraid is going to be another list. This one focusses on some of the differences you'll face if you're used to working with a traditional local SQL Server server or database and you're faced with working in the cloud for the first time. Don't be afraid, though: once you get used to it, SQL Azure (or Windows Azure SQL Database as we're somewhat tiresomely supposed to refer to it these days) rocks.

Without further ado, the list:

- Can't connect? Make sure your current external IP address is in the permitted list.

- It's perfectly possible to have a view or a stored procedure in a traditional SQL database that
refers to a non-existent SQL table or column. You can't run this view or SP successfully, but it'll sit there quite happily and be included in backups, restores, etc. However, the Azure Management Portal will report errors if you attempt to back up a database (export to BACPAC file) that includes such a view.

- Much the same happens when you try to upload a local database DACPAC to Azure. Views with external references (objects in other databases or on linked servers) aren't allowed and will need to be dropped before uploading.

- DACPAC upload doesn't like extended properties, either.

- Make sure the connection string or connection options for your application specify encrypted connections. Azure doesn't allow unencrypted connections.

- No database context switching is allowed in a script, so use of USE isn't permitted.

- The upload doesn't like a SQL Server 2012 database as source, either. (Note: I haven't tried this very recently, and I can't imagine in any case that it's a situation that'll continue very much longer even if it's still the case).

- Don't try uploading a DACPAC containing logins that don't (or can't - like Windows authentication AD
logins) exist on the Azure server. You'll be sorr-eeeee.

- Working with ADODB recordsets can be tricky; if Azure doesn't maintain the connection, your recordset object is suddenly null. There's a good article here on handling dropped connections in your application. My experience, however, is that connections are very stable in general, and when your application is dropping the connection there's usually a problem in the underlying data source or query. This is a topic I may revisit later, because it's a whole series of gotchas all on it's own.

- Tables with varchar/nvarchar(max) fields can't be reindexed online with the REBUILD WITH ONLINE option...but a simple index REORGANIZE isn't supported in Azure. Think about whether you really need those (MAX) length fields, or if you can use restricted-length fields instead.

Labels: Azure, Cloud

Thursday, May 30, 2013

Azure Wish List

Microsoft really hit the bullseye with Windows Azure SQL Database, delivering an amazingly mature and feature-rich product practically from day one. It's not perfect, though; a number of functions and features that we've been using for years on local SQL Server installs are still missing, as well as a few items that would make Azure easier to use and a more attractive choice for new development work. Here's my wish list:

Make it possible to use the Profiler

The SQL Server Profiler is one of the most important tools in the database administrator/developer's toolbox, giving detailed, fine-grained realtime insight into database activity. It's unmissable for both performance analysis and tuning and debugging. Use of the tool on Azure databases isn't supported at the moment, and it's a big gap. There's good and useful information available from the online management application, but this needs to be real-time, filterable, sortable and configurable to match the power of SQL Profiler.

Add Pricing analysis tools/reports

Microsoft offer a number of different pricing plans for their cloud database service, but what they all boil down to is a fixed price per database per month, plus charges based on the amount of outgoing data traffic. What I miss, though, is a simple way to see which queries or database activity in a given database are generating the most billable traffic. In other words, which queries and commands are the most costly in terms of euros and cents rather than in the traditional SQL usage of "costly" meaning resource-intensive. It's relatively easy to find slow-performing queries and memory-intensive commands, for instance, but now we need ways to match recorded database activity to the charges we see appearing on the monthly invoices.

Add the possibility to schedule database operations

The ability to schedule tasks is unmissable. We don't necessarily need the sophistication of the Jobs functionality offered in traditional offline SQL Server installations, but the basic ability to schedule SQL commands at fixed times or outside production hours is sorely missed. Even the possibility to run command-line tasks like you can with the Windows Scheduler would already be a big improvement.

These three items seem like "must-haves" to me. In the "nice-to-have" corner, the following:

Make it easier to upload and convert a database.

Converting an existing traditional offline SQL Server database for use on an Azure cloud database server is by no means impossible, but it's not as easy as it should be. Firstly, the "Extract Data-tier Application" wizard (that name alone!) stops at the first problem or Azure-incompatible usage it encounters rather than analyzing the source database for any and all potential issues at once. This means that arriving at a clean uploadable DACPAC (deployable database package) can involve starting and restarting the wizard many times, while fixing reported issues in between. Worse, the resulting package only contains database objects, no data, meaning that this needs to be scripted or uploaded separately. It's all do-able and these are not insuperable problems, but it should be possible to streamline the process.

Make it possible to edit directly in SSMS

With an offline traditional SQL Server installation, the SQL Server Management Studio tool we all know and love makes it easy to edit data directly in a table for those quick manual edits. Right-click on a table in the Object Explorer, "Edit Top 200 Rows", and Bob's your uncle. You can't do this with a table in an Azure database though. Stranger still, you can edit records directly in an Azure-hosted table using SQL Server Data Tools, so there doesn't appear to be any really good reason for preventing this in Management Studio. This is by no means an important drawback, but it is one that I encounter (and curse) frequently.

Make it easier to switch database context

Another minor but daily irritation is the necessity to disconnect a query and reconnect it when you want to switch between databases on the same cloud server, rather than simply selecting a different database from the Management Studio "Available Databases" drop-down. Again, there doesn't seem to be a good reason for this limitation: if you start a new query window in the master database, you can choose any available user database from the list, but once you've chosen, the only way to switch to another database is to disconnect your query.

And finally...

A snappier name than "Windows Azure SQL Database". What, exactly was wrong with "SQL Azure"?

Labels: Azure, Cloud, Development

Monday, May 20, 2013

Wish I'd discovered this years ago

One of the tasks I'm most frequently confronted with is updating or converting database queries. These are very often formatted as a fairly indigestible single block of text with no line breaks, just one solid word-wrapped block of code. This is mostly the result of people using WYSIWYG query editors where the underlying query code is invisible. The first thing I do, then, is to try to tidy up the code to make it readable and get a sense of what the query is supposed to accomplish, and how. This involves, among other things, inserting line breaks so that each field gets its own line, and formatting joins so that the joined tables and columns are distinct and easily readable. As you can imagine, this can be an tedious process, particularly if you're doing it a lot.

After spending most of a day last week doing nothing else but tidying up and converting Access queries to SQL views, I got fed up with doing it by hand and did a bit of Googling, and discovered this brilliant trick. In SQL Management Studio (and any other Visual Studio-based editor) you can use the Find and Replace dialog to replace any character with line breaks. Simply tick the Use check box and select the "Regular Expressions" option, then use "\n" as the replacement string. In the example below, I replace each comma (which occur after each field name in a query SELECT list) with comma + line break (", \n"), click Replace All and half the work in reformatting my query is done. Replacing with tab characters is simply a case of using "\t" instead of "\n" (so replacing "JOIN " with "JOIN \t" begins tidying up the table join clauses).

It's obviously possible to do much more sophisticated replace actions with this capability (there's a good list here), but this is already a life-saver. Hallelujah!

Labels: General

Saturday, March 16, 2013

I.T. phone home

A moment's silence, please: my Android smartphone died last week (after, in fairness, a couple of years of serious abuse). This means that I've had to fall back on a four-year-old mobile, one of the first Android smartphones. This is a somewhat painful experience when you've been used to the speed and ease of use of a relatively modern phone, but I can make and receive calls and use e-mail and my calendar, so I at least have the basics covered.

This means, of course, that I needed to choose a new phone. In 2013, the smartphone market has really matured. Every model from every manufacturer supports the same basic functionality; telephony (obviously), messaging, web browsing, location services & mapping, camera and social media. The only real differences are in design, UI and the data infrastructure and application ecosystem that underlie them.

Not that these are the only factors that play a role when we choose our phones, of course. Emotion is involved too; Apple's iPhone range has, until recently, included what were by some distance the most appealing and beautifully designed and engineered objects in the smartphone universe. (In my opinion, however, the elongated iPhone 5 is a design compromise and just looks wrong compared to its elegant forebears). The user interface is completely tired and unhelpful...all those rows of icons insisting that you press here to check your mail, here to check your agenda. I had an iPad for a while and it was a wonderful way to browse the web, but a tragedy as far as doing anything useful was concerned. Merely trying to get a document onto to it to read was an exercise in patience, and the iPhone is obviously no different.

The last time I chose a phone, Microsoft wasn't a realistic contender, but the current iteration of their smartphone OS, Windows Phone 8, demanded careful attention. It has by miles the most useful and appealing UI of any phone OS, with customizable information and functionality presented clearly and beautifully. I've had my eye on the new Nokia, and it really ticks a lot of boxes. I run my business on Google Apps, though, and WP8's integration with Google services is still patchy.

At the end of the day, then, I've decided to stick with Android this time. It's simple to set up the UI just the way you want it, there's huge app support and a vast array of different hardware available to run it on. The main reason for my choice, though is that my data is safe, instantly available and fully integrated. I can walk out of a phone shop with any Android phone, log in and within seconds I've got every e-mail I've ever sent or received (since 1995!), contacts and my full business and personal agendas, all synced automatically without me doing any more configuration beyond hat initial login.

Ultimately, the deciding factor is data portability. This, more so than app support or hardware is why I believe it's so difficult for Microsoft or other players to break the iOS/Android duopoly, but it's also valid at company and enterprise level for larger systems.

Features, performance, price, service and other factors will always play a role when companies need to update or replace their systems, but smart companies nearly always choose a straightforward data migration over new functionality, unless the business case for that functionality is so compelling that it justifies the costs of lost, incomplete or inaccessible data.

Smart companies know that value sits in data, not in applications, systems or hardware.

Tuesday, January 29, 2013

Azure databases

I've been working intensively with Windows Azure SQL for the past six months, and it's been tremendously interesting - exciting, even - to be in near the beginning of a technology that's maturing at such an astonishing speed. It feels similar to the sort of gold rush excitement that surrounded the web around 1994, when the possibilities of the web were only being hinted at. Who could have predicted then that we'd already live in a world of mobile internet, streaming video, and omnipresent social media?

Similiarly, it's difficult to predict all the uses to which Microsoft's cloud architecture solutions will be put in the next few years, but the impact is likely to be seismic. I smile when I see reports predicting Microsoft's demise just because Windows Phone hasn't taken off, or Windows 8, when it's obvious to me that what they're doing in the background with cloud engineering is where the real action is. It's difficult to overstate how impressive the various Azure services are, how well implemented, and the sheer value for money offered.

Just from the point of SQL databases, I'd be inclined to say that I will never willingly implement a physical SQL server again, or recommend the same to any client. The sheer ease, speed and cost-effectiveness of creating a new cloud database server just crush the limited benefits of a physical machine. The ability to fine-tune database files and queries to match the hardware of a physical database server remains, of course, but even this is a dying art these past few years since virtualization and SANs became commonplace. Who gets to install SQL Server on bare metal any more?

The learning curve is shallow. Most existing tools just work, and most existing applications that communicate with a traditional on-premises SQL Server database will do the same with an Azure SQL database without difficulty. The biggest change for me as a developer was reusing skills I'd learned long ago, when LAN bandwidth was limited and expensive; keeping data traffic lean and to-the-point, and pushing as much work as possible back to the (cloud) server rather than processing large data sets on the client workstation in code.

There are obviously a few things still missing. Top of my wishlist is the SQL Profiler, for instance, even though the online toolset is surprisingly thorough in the amount of data and metadata offered. I'd also like to see some sort of analysis tool that checks your existing offline database for potential issues and offers to fix them, or script the fixes. As is, deploying an existing database to Azure is a question of trial-and-error where the wizard fails a number of times, failing on each encountered problem until all issues are resolved.

Even with these caveats, though, it's an astonishingly mature platform. Last week, someone asked me if I had a USB stick and I had to think for a moment, because between SkyDrive and Google Drive I haven't copied files to physical media in an age. It's these changes, the ones you barely notice while they're happening, that are fundamental. Windows Azure services feels like one of these shifts to me, but on a far grander scale, and I'm really happy to be involved. I know what I'm going to be doing for the rest of my working career.

Labels: Azure, Cloud, Development, General

Saturday, August 18, 2012

Forgotten, but not gone

My first contact with Microsoft SQL Server was with version 7.0 back in the late nineties. Back then, there wasn't a whole lot more to the product than the database engine and the tools to manage it. That was plenty to be going on with, mind you: even then, there was a huge amount of information to be absorbed if you wanted to claim understanding of how to build, implement & manage databases built on SQL Server.

5 versions later, though, the amount of functionality bundled with even the Standard Edition is staggering, encompassing not just the relational database engine but reporting, BI, and advanced data management. Few can now claim with any authority to be a 'SQL Server expert'; each area within SQL Server is a lifetime's work on its own at this stage, even if new product versions with new functions and capabilities weren't arriving every few years. 'Expertise' is in many instances having enough background & experience with the product that you know in general terms what capabilities are there & where to find them, and being quickly able to get up to speed on the details of the less everyday areas of the suite and implement them successfully as the need arises.

All of which is a very long-winded way of saying that, once you've been working with SQL Server for more than a few years, you've likely forgotten more than you know about the parts of the product you're not using daily (or you've got a much better memory than I have).

A reminder came this week. I was working with a client who wanted to configure his server in such a way that another contractor could create, modify, test and schedule SQL jobs relating to the contractor's own project, but not modify other jobs (like general backup & maintenance jobs, for instance). Sysadmin permissions, then, were too broad, but any of the other fixed server roles were too restricted.

We took a look in MSDB, though, and came up with the SQLAgentOperatorRole role. This was added in SQL Server 2005, but I'd never had cause to use it in the meantime and subsequently forgotten about it.

For this purpose, though, it's almost ideal. We added the contractor's user account and a test account as members of the group, then logged in again with the test account to see what we could & couldn't do. As the test user, we could add new jobs, add job steps, modify both at will and schedule them as necessary. We weren't able to pick existing job schedules, interestingly, but were forced to create our own set (I can see the point of this, but it goes a bit far for most situations).

Better still, we couldn't modify other jobs (jobs with other owners) with one important caveat: we were able to disable and enable jobs and job schedules from other owners. It's not at all clear to me in which situations it would be desirable to block someone from, say, editing a job description while allowing them to disable the entire job, but that appears to be the situation.

We decided to use the role nonetheless, deciding that 95% was better than nothing. I dug a bit further afterwards, though, and discovered that the role SQLAgentUserRole fit our needs better. As a member of this role, you can still create, modify, run and schedule your own jobs, but neither modify or disable jobs from other owners.

Re-learning stuff like this, and getting to use it in production scenarios, is what keeps this line of work continually interesting.

Labels: Database Engine, SQL 2008, SQL 2008 R2

Friday, May 25, 2012

Being Resourceful

A client of mine has a system in development that consists of a normal, multi-user .NET desktop application plus a number of services that manage some occasional heavy-duty background processing on the database server. The problem is, when the services are running, the application performance grinds to a standstill and users are left for minutes on end waiting for their application to react.

This obviously isn't acceptable, but lowering the impact of the background processes isn't straightforward, either. It's not as simple as just rewriting a runaway query or adding a few indexes, and re-architecting the entire solution far into development just isn't going to happen. The question was, would it be possible to limit the amount of resources these background processes consume? My customer decided to try the Resource Governor feature in SQL Server but had no useful or tangible performance improvement, so I was asked to take a look at their setup and see what, if anything, was going wrong.

My own experience with the Resource Governor was limited to what I'd read about it, as well as trying out some example test setups, but I'd never configured or observed it on a semi-live environment, so I was delighted to get the opportunity to learn more about it.

We spent almost two days in total configuring and testing, making tiny changes in the RG configuration and observing the change in behaviour and processing time for similar workloads. With the RG disabled, the background processes would take a full 100% of the server processor time for minutes on end, so the goal was to brake this enough to leave enough processor capacity for the desktop app users without overly limiting the background processing.

Our results were surprising - to me, at least. The first surprise was how far down we had to take all the settings before we saw any effect at all. The second surprise (and I'd imagine that it's not always the case), was that we consistently observed a sort of threshold effect while changing settings; nothing would change in observed or measured load through a range of changes until suddenly, with one change up or down, there'd be a huge increase or decrease in consumed resources. After a lot of experimentation and measurement, we eventually arrived at a configuration that gave us a decent mix of resource use and allocation.

So, is the Resource Governor a worthwhile addition to the DBA's toolbox? Absolutely. You're obviously still better off trying to tackle the source of excessive resource consumption rather than artificially limiting it, but when this isn't possible the Resource Governor gives you one more option to work with before going to the boss, cap in hand, and asking for a more powerful server. It's tricky enough to configure properly and not massively well documented yet but you can definitely get results with it.

Labels: Database Engine, Performance, SQL 2008 R2