“Unknown Error” and OOTB SharePoint 2007 builtin Workflows

If you are playing with Approval workflows, and you find that they are erroring out even when you think they should have completed successfully, make sure you aren’t in a situation where you’re updating the approval status without having the approval functionality of your document or workflow library enabled.

It’s all built-in, but it doesn’t all automatically enable itself on need.

I ultimately found my answer on SharePoint Blogs, but my discovery route was circuitous.

First, there were the Unknown Error statuses on the workflow status page. These workflows would error out and require termination from the workflow status page so they’d stop contributing to my active workflow counts. Searching around, I found out (look at Eilene Hao’s response to Misha) that you can get more info about these errors from your Diagnostic Logs. To find out where those logs are kept, go to SharePoint Central Administration, open the Operations page/tab, and under the Logging and Reporting section click the link to Diagnostic Logging. On the next page, the log location is listed under Trace Log.

Go to that directory on your front-end web server(s) and use a decent directory search/grep (I used TextPad, which I have installed in portable mode on a USB key) to find log files containing the word “workflow”. Upon doing that, I found the error string, “System.ArgumentNullException: Value cannot be null. Parameter name: name”. Googling that eventually led me back to the SharePoint Blogs post above.
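If you don’t have a grep-style tool handy, the same search can be sketched in a few lines of Python. This is only a rough stand-in for the TextPad search I actually used: the function name, the “*.log” file pattern and the encoding sniffing are my assumptions, not anything SharePoint prescribes.

```python
from pathlib import Path


def _guess_encoding(path):
    """Return 'utf-16' if the file starts with a UTF-16 byte order mark, else 'utf-8'."""
    with path.open("rb") as handle:
        head = handle.read(2)
    return "utf-16" if head in (b"\xff\xfe", b"\xfe\xff") else "utf-8"


def find_log_matches(log_dir, needle="workflow", pattern="*.log"):
    """Yield (file name, line number, line text) for every line containing needle.

    The match is case-insensitive. log_dir is whatever directory the
    Diagnostic Logging page lists under Trace Log.
    """
    needle_lower = needle.lower()
    for path in sorted(Path(log_dir).glob(pattern)):
        # Undecodable bytes are replaced rather than aborting the search.
        with path.open("r", encoding=_guess_encoding(path), errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if needle_lower in line.lower():
                    yield path.name, line_number, line.rstrip()
```

Call it as find_log_matches(r"<your trace log directory>") and loop over the results. Passing needle="ArgumentNullException" narrows things down to the error I ran into.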

So the fix is that if you’re going to use an ootb (out of the box) builtin workflow that updates the approval status of an item, you should also enable the “Require content approval for submitted items?” option in Versioning Settings for the list settings. This will do a lot of things automatically for you:

  • When a list item is changed, the approval status for the item gets automatically changed back to “Pending”.
  • When a person with the permission level to approve items looks at the list, they get a special view option called “Approve/reject Items”. (So they can bypass the approval workflow)
  • The workflow that updates approval status stops erroring out.

It’s all pretty cool, but you have to know how it all hooks together.

SharePoint 2003 Reader Permissions include… Export to Spreadsheet!

I had to find this out for a customer this morning.

It’s true: anyone who can read a list is also allowed to export it to a spreadsheet. We may need to “lock this down,” but all I can think of is to use JavaScript to hide the control, which is not really locked down per se.

Web Part Installs on SharePoint 2003 in environments where you don’t control the Database

In a continuing series of mishaps involved with a very abstracted permissions environment where one group controls all admin rights on the server, except for the group that controls all admin rights on the Database, I discovered today that if you need to install a Web Part in SharePoint 2003, you need to do it on the Web Server(s) in question while running in the context of an account that has proper permissions (I assume at least DB Creator & Security Admin if not also System Admin) within the SQL Server where your Configuration Database lives.

What we were seeing was that the Office Web Parts installer and two MSIs for custom web parts ran. The Office Web Parts installer reported that it had installed correctly. The MSIs, on the other hand, reported an install error and referred to a log that was essentially empty. (MSI Web Part Packager installer logs go in C:\Documents and Settings\\Local Settings\Temp\wppackager.log.) Some third-party Web Part vendors do document this unhelpful MSI install failure (unlike Microsoft, who don’t seem to cover it in any KB article), and suggest using stsadm to install instead.

We then tried using stsadm.exe to install the MSI packages, and instead got the familiar Configuration Database connectivity error that refers to KB 823287:

Cannot connect to the configuration database. For tips on troubleshooting this error, search for article 823287 in the Microsoft Knowledge Base at http://support.microsoft.com.
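For reference, here is a rough Python sketch of scripting an stsadm install so that the exit code and error text are captured rather than lost. It makes several assumptions: that the web part is available as a package file the addwppack operation accepts (typically a CAB), that stsadm.exe lives in the usual SharePoint 2003 location, and the function names are mine. Running it under an account without rights on the configuration database should simply surface the same error shown above.

```python
import subprocess

# Assumed default location of stsadm.exe on a SharePoint 2003 (WSS 2.0) server.
DEFAULT_STSADM = (
    r"C:\Program Files\Common Files\Microsoft Shared"
    r"\web server extensions\60\BIN\stsadm.exe"
)


def build_addwppack_command(package_path, stsadm_path=DEFAULT_STSADM,
                            global_install=False, force=False):
    """Build the argument list for 'stsadm -o addwppack'."""
    command = [stsadm_path, "-o", "addwppack", "-filename", package_path]
    if global_install:
        command.append("-globalinstall")  # install to the GAC instead of the bin folder
    if force:
        command.append("-force")  # overwrite a package that is already installed
    return command


def install_web_part_package(package_path, **options):
    """Run stsadm and return (exit code, combined stdout and stderr)."""
    result = subprocess.run(
        build_addwppack_command(package_path, **options),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout + result.stderr
```

The point of keeping the command builder separate is that you can print and review the exact command line before anyone with the right database permissions runs it.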

Here’s a link I found very helpful to help explain/troubleshoot the issue.

Unfortunately, at this client, I will now need to punt to a policy question about how to approach the install next, but hopefully in your situation you’ll be able to just use an account with the right permissions to do the install.

I have been negligent – bullet updates, but I’ll get around to the major stuff later

Since I fully expect next month to be a slow month, I should be able to catch up a little.

Anyhow:

  • I am installing the Release bits of Microsoft Office 2007. I don’t know if I’ve already plugged CCleaner but I’m doing so again. I needed it because Office 2007 Beta 2 Technical Refresh didn’t uninstall entirely cleanly. An add-on I’d installed after the original install had to be manually removed, but it didn’t show up in my Add/Remove Programs, so CCleaner was instrumental in my being able to find and uninstall the bugger so I could go ahead with the install of the Release version.
  • It turns out that the extended problems I had properly creating the Shared Service Provider portion of MOSS 2007 were due to two factors:
    • I had neglected to complete the MOSS 2007 Beta 2 TR install properly. I’ll go back to that article and add the details in, but instead of running the configuration wizard right away, I should instead have uninstalled Windows Workflow Foundation from Add/Remove Programs, installed the .NET 3.0 Framework RC bits and then run the configuration wizard.
    • I wasn’t thinking about permissions and rights properly, so I was creating the app pool for the Web Application that was to support the SSP with Network Service as the ID. Network Service is a machine-local identity (across the network it authenticates as that machine’s computer account), so it wasn’t mapping to the Network Service ID on the database server (2-server setup). What I should have done was create the app pool with a domain account ID that had sufficient perms on both boxes and on the SQL Server itself. It never ceases to amaze me how my mind will just drop stuff. This stuff holds for SharePoint 2003 too and I know that cold, but I just didn’t make the leap to apply it to my knowledge of MOSS 2007. Duh.
  • So I need to blog permissions articles that have been popping up on Technet/MSDN lately.
  • I also need to update on my/my company’s progress in fixing (or trying to fix) the Full Text Search in our production deployment of SharePoint 2003. Client still not interested in calling Microsoft Product Support Services. Now it looks like it might have to do with the Cluster configuration and the FTDATA folder. If it isn’t that, then both my company and I are tapped out, and it is totally time to stop playing political games and just call Microsoft PSS.
  • There are some links I found to training materials that I’ll also blog (I’ve been doing research on behalf of my client’s Training department).
  • I’ll be working on customizing my company’s portal soon, and doing a little mini-app with a guy based in the Richmond office, so we’ll see how well the development/customization process on MOSS 2007 collaborates. More updates there, hopefully by next week.

Anyway, been terribly busy, too busy, perhaps, to blog, but I’ll try to return to it, because taking notes is important to me, and putting it here means I can find it wherever I have Net access, and maybe it’ll help out other folks too.

Full Text Search and Account Permissions

This is a more extended writeup of running Windows SharePoint Services 2003 and SQL Full Text Search on a Database box where Local Administrators (BUILTIN\Administrators) don’t have System Admin access in SQL Server 2000. (I mentioned this briefly in the Changing SharePoint Service Accounts article.)

Essentially, you’ll run up against this security policy requirement in some environments. It’s a sensible policy to make in situations/operations where the Local Administrators (of whom many are also Domain Administrators) are folks who are different from the folks who own, run and are responsible for the SQL Servers.

Part of the motivation for this separation is, of course, political. In some organizations you’ll find that folks in one team don’t want to share permissions/rights with other teams who aren’t directly responsible for the upkeep or maintenance of the bit of the sandbox they have dominion over.

The Sensible Computer Security Policy reason is the principle of Least Privilege. When the question, “Do these people/does this group need permissions to this resource?” is answered “No.”, then the principle of Least Privilege dictates that they not be given the access they don’t need. This Security Principle falls under the overall category of Risk Management. The fewer potential risks (i.e. fewer accounts sitting around waiting to be hacked that have permissions they don’t necessarily need), the fewer potential security vulnerabilities sit around waiting to be exploited by Joe Q. Attacker.

It should be noted that in the annals of computer attackers, the long-neglected account that just happens to be a local or domain administrator and just happens to have a really easy to guess password is the holy grail, and almost every computer system has at least one. So do what you can to manage your risks and reduce the number of holy grails that attackers can use to compromise your system.

Anyway, so for whatever reasons, you’ve decided that you wish to implement the policy that Local Administrators on the SQL Server are not allowed to be System Administrators (aka sa) within the SQL Server/Application itself. Note that while it appears that Microsoft “supports” this configuration, it’s not specifically allowed for in Microsoft’s relevant Knowledge Base articles, so if you do go this way, be on the lookout for potential complications. See that other article I mentioned and linked to above for an example of an unexpected consequence.

If you remove BUILTIN\Administrators from your SharePoint 2003 server’s SQL Server Logins, or remove the sa permissions from that group, you will hose up your Full Text Search in SQL Server, which of course (say it with me) will screw up your Full Text Search in your Windows SharePoint Services 2003 sites. (Because Windows SharePoint Services 2003 uses SQL Full Text Search to do its searching.)

How do you fix this?

According to KB Article 317746, if you don’t wish to add BUILTIN\Administrators back to the SQL Server Logins, you still have an out. You must:

  • Add the System Administrators Server Role to the account you are using as the Service Account for SQL Server.
  • Add the Local System account (NT AUTHORITY\System) to the SQL Server Logins.
  • Add the System Administrators Server Role to the Local System account (NT AUTHORITY\System).
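If you’d rather script the three steps than click through Enterprise Manager, they map onto the standard sp_grantlogin and sp_addsrvrolemember system stored procedures. The Python below is only a sketch that generates the T-SQL for someone with System Administrator rights to review and run. The function names are mine, and the service account name in the usage note is a made-up placeholder.

```python
def quote_tsql(value):
    """Quote a value as a T-SQL Unicode string literal, doubling single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def full_text_fix_statements(sql_service_account,
                             local_system=r"NT AUTHORITY\SYSTEM"):
    """Return the T-SQL statements for the three steps, in order."""
    return [
        # Step 1: the SQL Server service account gets the sysadmin server role.
        f"EXEC sp_addsrvrolemember {quote_tsql(sql_service_account)}, N'sysadmin'",
        # Step 2: Local System gets a SQL Server login.
        f"EXEC sp_grantlogin {quote_tsql(local_system)}",
        # Step 3: Local System gets the sysadmin server role.
        f"EXEC sp_addsrvrolemember {quote_tsql(local_system)}, N'sysadmin'",
    ]
```

For example, print("\n".join(full_text_fix_statements(r"CONTOSO\sqlsvc"))) prints a script you can paste into Query Analyzer, where CONTOSO\sqlsvc stands in for whatever account your SQL Server service actually runs as.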

You should not have to restart SQL Server after making this change. But you may also need to fix Full Text Search for other reasons, which I will elucidate in a (shortly to follow) article.

Changing SharePoint Service Accounts, Permissions and Troubles Found and Conquered Therein

(Note: Links to KB Articles open in new windows)

So I have a client who is security conscious enough to ask that we make SharePoint work even when the servers’ Local Administrators (members of BUILTIN\Administrators) are not members of the System Administrators Server Role in SQL Server. That is a departure from the default configuration for SQL Server, in which the BUILTIN\Administrators group has the checkbox enabled next to the “System Administrators” role in the Server Roles tab of its Login properties.

After this helped break Full Text Search (KB Article 317746) and we fixed it, all was well and good, or so we thought.

It’s not completely obvious, but in SharePoint Portal Server 2003 running on Windows 2003 Enterprise SP1 and above (KB Article 555309), it’s required that the SharePoint Service Account be a member of the Local Administrators group on each server in the farm. This would appear to imply that the SharePoint Service Account should also be a System Administrator (role) in the SQL Server that hosts the configuration and content databases.

I believe, by running into it head first, we have corroborated this assertion.

Moving from a bad situation (where all three of our environments – Dev/Pilot, QA and Production – are running on the same service account whose password will expire in about 2 months), to a better one (we plan 2 service accounts for each environment), means we have to change the service accounts for both SQL Server services, and for SharePoint services in each environment.

(By the way: It’s actually recommended that if possible, service accounts associated with Enterprise servers like SharePoint be set with passwords that don’t expire. If your security policy requires it, you can then plan to change the password on a regular basis, but you won’t get caught out if you simply don’t have time to do it. The slightly more antagonistic policy is to make the automatic expiration/reset loom over your operations people like an unassailable threat that motivates them to change the password regularly, manually, before the deadline. If you live in a more antagonistic sort of place like we do, at least have the common decency not to keep folks from changing the password early. Making the password change be required to happen on a certain day is the kind of sadism that breeds MAD IT CONSULTANTS! MAD, I TELL YOU!)

In doing so, we ran up against the problem of not having the SharePoint service account be a System Administrator in the Server Roles in SQL Server. We followed the KB Articles (SharePoint: KB Article 837813, SQL: KB Article 283811 – For SQL we actually used the SQL Enterprise Manager – it’s the easier way – try SQL Books Online for guides) for changing the service accounts. When we went to SharePoint Central Administration to finish up the account changing process, instead of the Central Administration page we expected, we got a prompt to disconnect from the configuration database.

So, in a fit of troubleshooting pique, we followed the prompt and got ourselves into a deeper hole. Because at the instant that we disconnected, the server we were working on got out of synch from the rest of the server farm.

And how did we get out?

First, the primary fix is to make sure that if you’re not going to give all the Local Administrators System Administrator access in SQL Server, at least make sure that the SharePoint Service Account has System Administrator access. That’s the first thing.
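A quick way to confirm that first thing before touching Central Administration is to ask SQL Server directly with the built-in IS_SRVROLEMEMBER function. The Python below is a sketch that only builds the query and interprets its result. The helper names are my own, the account in the usage note is a placeholder, and you would run the query through whatever client you normally use.

```python
def quote_tsql(value):
    """Quote a value as a T-SQL Unicode string literal, doubling single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def sysadmin_check_query(login):
    """Build a query that reports whether login holds the sysadmin server role."""
    return (
        "SELECT IS_SRVROLEMEMBER(N'sysadmin', "
        f"{quote_tsql(login)}) AS is_sysadmin"
    )


def describe_result(value):
    """Translate the value returned by IS_SRVROLEMEMBER into plain words."""
    if value == 1:
        return "member of sysadmin"
    if value == 0:
        return "not a member of sysadmin"
    # SQL Server returns NULL when the role or the login is not valid.
    return "role or login not valid"
```

For example, sysadmin_check_query(r"CONTOSO\spsvc") gives you the query for a hypothetical SharePoint service account. If the answer comes back as anything other than 1, fix the role membership before you go near that disconnect prompt.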

Second, we need to fix the mess. If we hadn’t disconnected from the configuration database when the opportunity presented itself, restoring the System Administrator access for the service account in question would have fixed the whole kit and caboodle (requiring, perhaps, closing and opening IE, or forcing a cache refresh [hold down Ctrl and click the Refresh button] to see the results in IE).

Instead, since we disconnected the database at probably the worst time, the server that disconnected thinks it’s disconnected, but the rest of the farm doesn’t. Weirdly enough, the broken server acts like it thinks it’s both part and not part of the farm. I managed to get the server to the point where it serves content just fine from WSS, but SharePoint Central Administration thinks it isn’t connected to the configuration database (i.e. it’s broken).

When you try to connect to the existing configuration database, you get a weird and inappropriate error. Usually you get the error, “Unexpected error occurred.” (This error actually tells you a lot. If you google for it, you’ll find that most of the issues associated with this symptom are to do with network connectivity – this isn’t utterly bizarre – from SharePoint’s perspective, a SQL permissions error can look a lot like a network error, and whatever’s bunged up in the broken synchronization is also probably interfering and giving SharePoint weird results, which probably just go in the network bucket, because SharePoint can’t figure out where else to put it.)

Weirder still, it seems like the whole of the problem exists somewhere in the broken server’s configuration, with none of the actual problem on the working parts of the farm at all.

Why do I say this? Because the fix is really simple:

  1. Create a new configuration database with the broken server.
  2. Disconnect from the new configuration database.
  3. Connect to the old configuration database.
  4. You probably want to delete the new, unused configuration database too, but the things are pretty small, storage-wise.

If the synch error really were partly on the original configuration database and partly on the broken server, this shouldn’t fix the problem, because it would still exist in some part on the original configuration database. But in fact, this works.

Now, if, like us, you were in the middle of changing accounts for your farm, just continue on your merry way. Since you only need to make that System Administrator Server Role for the SharePoint Service Account fix once (assuming you’re only using one SQL Server in the farm), you should be good to go with the rest of the servers in the farm.

Since I should be doing this tomorrow in our QA environment (Dev’s down right now), I’ll let you know if there’s more to know, when I’ve figured it out.

Wish me luck!