cocoa

AppleScript: Endianness, FREFs, ARGH!

The fun started yesterday when I was testing an application, and I noticed this little gem in Xcode's debug console:

CoreEndianFlipData: error -4940 returned for rsrc type FREF
     (id 128, length 7, native = no)

At about the time this message appeared, my application seemed to stop responding. I didn't know what was causing that, but this mysterious message had to be a clue. I was also, sometimes, getting a "Choose Application" window appearing on the screen, for no readily apparent reason.

I scratched my head for a minute and asked Google what it thought about it. Google wasn't sure. The FREF suggested some old-school pre-X Mac OS programming, but I wasn't doing anything of the sort. The part about endian flipping suggested an Intel vs. PowerPC issue, but I've been doing everything universal for a while now, so that shouldn't be it.

So I asked a team of experts. Thanks to IRC I was able to get some friends on the case, and Daniel Jalkut, Jon Wight, Manton Reece and I spent an interesting and bizarre afternoon trying to figure out what the hell was going on.

Early theories held that, while I wasn't personally using FREFs, perhaps I had included some code that did. That briefly made NDResourceFork a suspect. NDResourceFork is a Cocoa-style interface for dealing with HFS+ resources, and one of its associated classes mentioned FREFs. That ended up being a red herring; I have a graphical front-end app and a back-end worker app, and NDResourceFork is used in the front end while the error was coming from the back end. It's in my project but not actually in the application printing the error message.

So we looked at what the error message appeared to be trying to tell us. Clearly, something must be going on with FREFs and/or switching from one endian style to another. Fortunately the experts know a thing or two about old-style Mac stuff, and before long I was setting breakpoints on functions like GetResource() and GetIndResource(), and using PrintResourceChain() in the debugger. I believe these functions are older than Mac OS X, and I had never heard of them before; heck, some of them aren't even documented anymore. I also found I could reproduce the problem in the debugger, and it's always nice when the application behaves the same way in a debugger as when it's running normally.

That led to a surprising discovery: the mystery message was occurring when my application was making AppleScript calls to another application via NSAppleScript. In a way that just confused things for a while, though. There didn't seem to be any reason for this script to be using FREFs. And I verified that my application, the target application, and the AppleScript (compiled with osacompile) were all universal binaries, so there shouldn't have been any question of endianness. It was about at this time I noticed that, for some reason, the bug only manifested itself when the AppleScript was targeting OmniGraffle Pro. Cue ominous music here.

The debugging also led to some seriously weird results in some cases. Trying PrintResourceChain() at the gdb prompt showed resources loaded from completely unrelated applications-- not mine, not the target app, not the source of a scripting addition, not anything that should have been remotely involved.

Trying to pin down exactly when the mystery message appeared, Daniel suggested I set a breakpoint on write(), which is pretty near the lowest-level print statement that should get used by nearly everything printing anything. That was when I regretted all of those NSLog() statements I had added for what Jon Wight called "nuke 'em from orbit style debugging". Some careful disabling and reenabling of breakpoints got me through that without too much trouble, though. And, aha! CoreEndianFlipData appeared in the stack trace! And that was a direct result of my own NSAppleScript call, albeit about 45 levels down in the stack (not an exaggeration, BTW) from there. As for what was going on in between, well, that took some interpreting.

Daniel and Jon both noticed the presence of AEVTsysoppcb in the stack which is, they tell me, AppleScript deciding it needs to ask you where the target application is. The "choose application" window loads icons for all applications. And THAT, apparently, is where FREFs come into all of this. Remember the FREFs? This started with an error message about FREFs. I was now able to start reproducing the error message from Script Editor, which simplified things a bit and, frankly, made me feel a little better by knowing this might not be my fault.

Reducing the AppleScript to the bare essentials needed to demonstrate the behavior, I get this:

if application "OmniGraffle Pro" is running then
	beep
end if

Do this in Script Editor, and a "Choose Application" window appears, asking "Where is OmniGraffle Pro?", even if you have it installed. Click "Cancel", and Script Editor crashes. Watch the console, and you'll see the mystery message that started all of this. Of course OmniGraffle's full name is "OmniGraffle Professional", but fixing that doesn't change the behavior.

Out of curiosity I tried a bunch of other applications. The only other one I've found that causes this is RealPlayer. There might be others.

Apparently then, the chain of events is:

  1. My app calls its AppleScript, targeting OmniGraffle.
  2. The AppleScript tries to find out if OmniGraffle is running.
  3. For some reason it can't figure this out, because it can't find OmniGraffle. I can only guess this is because of something weird in OmniGraffle's Info.plist, though I'm not sure what.
  4. AppleScript helpfully asks the user where to find it.
  5. The "Choose Application" window, in trying to find application icons, runs smack into that unrelated application that PrintResourceChain() showed. It prints out a weird message about FREFs and endianness.
  6. If the user cancels, AppleScript freaks out and explodes, killing innocent bystanders like Script Editor or my application.

Now, all I want to do is run this damn AppleScript. If I can find a workaround, I don't actually care why OmniGraffle is making the existing script choke. Daniel pointed out that if an AppleScript says "application appname" somewhere, it causes a full name resolution for the application. If I could make the script check for a running application without that, no name resolution would happen, and tragedy would be avoided. The pre-Leopard way to check for a running application does just that:

tell application "System Events"
	return first application process of application "System Events" whose name is "OmniGraffle"
end tell

Unfortunately it's not workable for me. My script's not actually about OmniGraffle; it targets a bunch of different applications, and it gets the application name from NSWorkspace. NSWorkspace tells me that OmniGraffle is named "OmniGraffle Pro". But AppleScript sees it as "OmniGraffle Professional", and if I use anything but that then it can't find the process. AppleScript then reports that it's not running, even if it is. There's no name resolution, which is nice, but there's also no reliable result.

I'm still beating on this a bit, but it looks like another Leopard-ism may save the day. Beginning with 10.5 it's possible to target an AppleScript based on an application's bundle ID instead of its name. Bundle IDs are, fortunately, consistent where application names may not be. And since they're not application names, they don't get resolved in the same way. So I can do something like this:

if application id "com.omnigroup.OmniGrafflePro" is running then
	beep
end if

And... it works! I think. I need to do some testing to see if it's as reliable as I need it to be. But it looks like what I need.

I'll be reporting this to Apple, because whatever OmniGraffle is doing, AppleScript shouldn't crash and burn like that. I'll probably also report it to Omni, who may well be interested to know.

I'd also like to thank Daniel, Jon, and Manton for taking so much time to help track this down. I don't think I would have got this far without their help.

Update: Daniel Jalkut pointed out that it's possible to address applications by bundle ID in AppleScript on Mac OS X 10.4, although it involves what Apple's AppleScript release notes describe as "...a multi-line incantation using Finder." That incantation turns out to be something like this:

tell application "Finder"
	set appname to displayed name of application file id "com.omnigroup.OmniGrafflePro"
end tell
tell application "System Events"
	if exists process appname then
		beep
	end if
end tell

Framework Signing Update

I recently wrote about problems using Leopard code signing with Mac OS X frameworks. I've since gotten feedback on my bug report. It looks like the problem isn't so much that frameworks can't be signed but that the correct signing procedure isn't documented.

The code signing documentation indicates that bundles should be signed. Frameworks are bundles, so if you're looking to sign your code you'll likely be tempted to sign a framework like this:

$ codesign -s "authority info" Sparkle.framework

But if you do that you run into the confusing situation I encountered, with your framework structure modified and difficulty knowing if the bundle is valid or not.

The feedback I got on my bug report explains that code signing should be done differently for versioned bundles like frameworks and... whatever other versioned bundles there might be. Although frameworks commonly contain only one version, they're designed so that multiple versions can be present. When signing a framework then, you sign the specific version, not the entire framework bundle. So instead of the above, you do something like this:

$ codesign -s "authority info" Sparkle.framework/Versions/A

If there are other versions, sign them separately.

This leaves the framework structure unmodified:

$ find Sparkle.framework/ -name Sparkle -exec ls -l {} \;
lrwxr-xr-x  1 tph  wheel  24 Nov 29 11:44 Sparkle.framework//Sparkle -> Versions/Current/Sparkle
-rwxr-xr-x  1 tph  wheel  242928 Nov 29 11:44 Sparkle.framework//Versions/A/Sparkle

Of course when verifying the signature, you also need to verify the versions independently of the framework bundle. The framework itself won't have a valid signature, but that's not what you should be looking at anyway:

$ codesign -vvv Sparkle.framework/Versions/A
Sparkle.framework/Versions/A: valid on disk
$ codesign -vvv Sparkle.framework/
Sparkle.framework/: code or signature modified

This seems to make sense. A framework is designed to allow multiple independent bundles, with symbolic links to indicate which is current. So, sign each bundle on its own.


Don't Sign that Framework

Yesterday I was working on a forthcoming update to Chimey and I noticed something odd. Chimey of course makes use of SparklePlus for automatic updates, and after a test run of my build-for-release script, the Sparkle framework no longer had its usual structure.

Normally a framework has one or more binaries, with a symbolic link pointing to the current version. From the command line you'd expect something like this:

$ find Sparkle.framework/ -name Sparkle -exec ls -l {} \;
lrwxr-xr-x  1 tph  wheel  24 Nov 21 11:20 Sparkle.framework//Sparkle -> Versions/Current/Sparkle
-rwxr-xr-x  1 tph  wheel  233088 Nov 21 11:20 Sparkle.framework//Versions/A/Sparkle

Instead I was seeing this:

$ find Sparkle.framework/ -name Sparkle -exec ls -l {} \;
-rwxr-xr-x  1 tph  wheel  242928 Nov 21 11:20 Sparkle.framework//Sparkle
-rwxr-xr-x  1 tph  wheel  233088 Nov 21 11:20 Sparkle.framework//Versions/A/Sparkle

Yow, how the hell did that happen? I thought it might have something to do with copying the framework, either when compiling or when building the disk image, but that wouldn't account for the different file sizes. A quick check on the current version of Chimey showed that this was something new, not something I'd been doing all along without realizing it.

So what was different? My build-for-release script now signs my code using Leopard code signing.

Apple's documentation on code signing indicates that "You should sign every program in your product, including applications, tools, hidden helper tools, utilities and so forth." Chimey's main bundle includes a preference pane and two helper tools, so I was making sure to sign all of them. The main documentation doesn't mention frameworks specifically, but the Code Signing Release notes indicate that "You may also sign any libraries, frameworks, plugins, and scripts you ship, whether they are delivered with an application or separately." Framework signatures don't get checked yet but might be in the future. I had made my script as future-proof as possible by signing the Sparkle framework now.

But if you sign a framework, the codesign tool modifies the structure of the framework, as I found with Sparkle. Just for sanity's sake I made sure that this affects any Mac OS X framework and is not some kind of Sparkle-specific behavior.

And the file sizes? I can guess what's going on, but I made sure:

$ codesign -vvv Sparkle.framework/Sparkle 
Sparkle.framework/Sparkle: valid on disk
$ codesign -vvv Sparkle.framework/Versions/A/Sparkle
Sparkle.framework/Versions/A/Sparkle: code object is not signed

Not only is codesign changing the framework structure, it's also leaving the framework in an inconsistent state with regard to whether it's signed. Instead of one binary I've got two, and even though I signed the framework, one of those two is still unsigned.

At the same time, checking on the framework bundle still returns a valid signature, despite the presence of unsigned code in there:

$ codesign -vvv Sparkle.framework
Sparkle.framework: valid on disk

It's a good thing that Leopard doesn't currently check on framework signatures. For now it seems it's probably best not to bother signing a framework. Although codesign leaves you with something that should work, it's not clear that it's actually doing anything useful, and it's bloating the framework size in the process.

This has been filed as bug #5609522 with Apple, in case anyone from Apple reads this.
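
A build-for-release script could at least detect the flattened layout shown above before shipping. What follows is a minimal sketch in C under stated assumptions, not anything codesign or Apple provides: the helper names is_symlink and framework_binary_is_link are hypothetical, and the check relies only on what the listings above show, namely that the top-level binary of an intact framework is a symbolic link rather than a regular file.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Returns 1 if `path` is a symbolic link, 0 if it is anything else,
   and -1 if lstat() fails (for example, the path does not exist).
   lstat() is used instead of stat() so the link itself is examined. */
int is_symlink(const char *path)
{
    struct stat info;
    if (lstat(path, &info) != 0) {
        return -1;
    }
    return S_ISLNK(info.st_mode) ? 1 : 0;
}

/* Hypothetical check: builds "<framework>/<name>" and reports whether
   that top-level binary is still a symlink. Same return values as
   is_symlink(), plus -1 if the combined path is too long. */
int framework_binary_is_link(const char *framework, const char *name)
{
    char path[1024];
    int n = snprintf(path, sizeof path, "%s/%s", framework, name);
    if (n < 0 || (size_t)n >= sizeof path) {
        return -1;
    }
    return is_symlink(path);
}
```

Calling framework_binary_is_link("Sparkle.framework", "Sparkle") should return 1 for an untouched framework and 0 for one whose top-level link has been replaced by a regular file.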


SparklePlus has Moved

Mac developers using the SparklePlus auto-update framework in their applications should note that the project has moved to a new online home, at Google Code hosting. Previously the project was at ironcoder.org, but a number of factors (not least of which was trouble with the hosting provider) made Google seem like a better choice. The old SparklePlus mailing list has also been superseded by a Google group.

The project page includes some migration information for those of you who have been using SparklePlus already.

Sparkle, for those of you who might not know, is a cool Mac OS X framework that makes it easy to have applications check for updated versions, and then optionally download and install those updates. It was written by Andy Matuschak. SparklePlus expands on Sparkle by adding anonymous system information with the update checks-- things like what version of Mac OS X the software is running on, and how much RAM is available. That kind of information is useful to developers when they're trying to figure out what version(s) of Mac OS X (etc...) they need to support. Without it, a developer can only guess if any of their users are still running older versions of Mac OS X, and whether dropping support for a version is likely to cause trouble for people.

The Adium team has a nice example of what can be done with this information at their Sparkle+ stats page.

Code Quickie: Redirect NSLog

When writing Cocoa code, the usual function for writing simple text messages outside the GUI is NSLog(), rather than something like fprintf(). Mainly this is because NSLog() adds the "%@" format specifier that (usually) works nicely when printing Cocoa objects. This is especially useful with caveman-style debugging, where instead of using a debugger you just add a ton of print statements and then pore over the results to see what went wrong. That can be ugly but most developers have resorted to it at some point.

Sometimes, though, you don't want those messages to just go to the console and (eventually) disappear. It can be useful to make them go somewhere for later reference. The trick here is realizing that NSLog() sends its messages to Unix's standard error for the process, which in a GUI app gets sent to the system console. Standard error is malleable, though, and can point wherever you want it to point. So:

#include <stdio.h>
#include <sys/stat.h>

// Set permissions for our NSLog file
umask(022);
// Send stderr to our file
FILE *newStderr = freopen("/tmp/redirect.log", "a", stderr);

NSLog(@"This goes to the file");

The effect of the call to freopen() is to open the file at the specified path, and make the already-existing stderr stream use it. Voila! In this case I'm using a mode of "a" so that I'll append to the file if it exists; if you'd prefer to overwrite it, change that to "w". Note that since we're redirecting standard error, anything else that uses stderr will also be redirected.

Even better, you can temporarily redirect NSLog, yet later restore normal NSLog/stderr operation if desired. This involves saving a reference to stderr before redirecting it, and then later restoring stderr to its original state.

#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

// Set permissions for our NSLog file
umask(022);

// Save stderr so it can be restored.
int stderrSave = dup(STDERR_FILENO);

// Send stderr to our file
FILE *newStderr = freopen("/tmp/redirect.log", "a", stderr);

NSLog(@"This goes to the file");

// Flush before restoring stderr
fflush(stderr);

// Now restore stderr, so new output goes to console.
dup2(stderrSave, STDERR_FILENO);
close(stderrSave);

// This NSLog will go to the console.
NSLog(@"This goes to the console");
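
The same save-and-restore idea can be wrapped up in a pair of helpers. This is a minimal sketch rather than the code above: it swaps freopen() for open() plus dup2(), so that a failure to open the log file leaves stderr untouched. The names redirect_stderr_to and restore_stderr are made up for illustration, and the sketch assumes that anything writing to standard error (NSLog() included, as described above) follows file descriptor 2.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: point file descriptor 2 at `path`, appending.
   Returns a saved copy of the old stderr descriptor, or -1 on failure.
   On failure stderr is left as it was. */
int redirect_stderr_to(const char *path)
{
    /* 0644 is further restricted by the process umask. */
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd == -1) {
        return -1;
    }

    /* Push out anything still buffered for the old destination. */
    fflush(stderr);

    int saved = dup(STDERR_FILENO);
    if (saved == -1) {
        close(fd);
        return -1;
    }
    if (dup2(fd, STDERR_FILENO) == -1) {
        close(fd);
        close(saved);
        return -1;
    }
    close(fd); /* Descriptor 2 now refers to the file; fd is redundant. */
    return saved;
}

/* Hypothetical helper: undo redirect_stderr_to() using its return value.
   Returns 0 on success, -1 on failure. */
int restore_stderr(int saved)
{
    fflush(stderr);
    if (dup2(saved, STDERR_FILENO) == -1) {
        return -1;
    }
    close(saved);
    return 0;
}
```

Typical use would be to call redirect_stderr_to("/tmp/redirect.log"), keep the returned descriptor, log as usual, and then pass that descriptor to restore_stderr() when the output should go back to the console.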


Atomic Bird, LLC