Syntax Coloring For Fun And Profit

At the last NYC Cocoaheads meeting, I did a presentation entitled “Let’s Build a Text Editor: A Practical Introduction to the Cocoa Text System.” In it, I basically go through building a code editor from scratch, pointing out how to use the Cocoa text system along the way. I’m not going to reprise the whole thing here. The first part was an introduction to the basics of creating a document-based app and fitting the text system into that and the second part dealt with adding line numbering, which, if you’ve been paying attention at home, I covered here a while back. Instead, I’m going to focus on the third part, which is implementing syntax coloring.

One major aspect of syntax coloring is actually parsing the text. And guess what? I’m not going to cover that here either. There are books you should read on the topic. Like many people during their college years, I learned from the Dragon book though there are probably books more specific to parsing that you can find out there. Keep in mind, though, that you may not have to write as sophisticated a parser as you think. Sometimes, a lexical scanner is all that you need. Just parse the different elements you want to highlight (comments, strings, keywords, etc.) and ignore the syntactic structure. In some cases, this may be enough.

The focus of this article will be getting the Cocoa text system to do the coloring based on what your parser parses. Before we start, you should download the example project as this article refers to it throughout. I’ve created various snapshots at key points to show the progression. If you want, you can go through the various snapshots to see how the app was built up but for the purposes of this article, start at the snapshot called “Version 4”.

The Basic Approach

At the core of Cocoa text system is the NSTextStorage object. It stores the actual text as well as any formatting attributes. Being a subclass of NSMutableAttributedString, NSTextStorage allows you to apply different attributes to ranges of characters. For syntax highlighting, all we care about is the NSForegroundColorAttributeName attribute, which is the attribute that affects the font color.

If you look at the –parse: method in NoodleSyntaxHighlighter, you’ll see that it scans for C-style comments (/*...*/) and string literals ("..."). When it finds the range of a particular entity, it calls -addAttributes:range: on the NSTextStorage to annotate the characters with the appropriate color based on which entity we found. Note that this is how you modify any attributed string; it’s not specific to NSTextStorage. To make sure we keep up with the changes, we implement NSTextStorageDelegate protocol’s -textStorageDidProcessEditing: method. There’s an equivalent NSTextStorageDidProcessEditingNotification though I believe modifying the text storage is only allowed via the delegate method.

If you compile and run at this point, you’ll see that it pretty much works as advertised. We’re done, right? Not exactly. Try loading in a large-ish document. Around a couple thousand lines or so. Start typing really fast. It might depend on your machine, but you’ll find that the typing lags a bit. It seems that modifying the NSTextStorage is not quite as efficient as we’d like. Luckily, there’s a better way to do this.

Using Temporary Attributes

NSLayoutManager has a thing called temporary attributes. These are attributes just like attributes you use on an attributed string, except that they’re, well, temporary. They are more lightweight and designed specifically for cases like this, where the attributes are only used for display and aren’t something intrinsic to the document itself. It’s used for features like spellchecking and other places where you need to make temporary annotations to the text.

Imagine writing a rich text editor. You don’t want to make display-only changes to the NSTextStorage since those changes could end up being saved with the document. Sure, the code editor in this example doesn’t allow rich text but you can see where semantic division lies. The NSTextStorage is your “model” and you don’t want something like syntax highlighting, a “view” feature, to change the model.

In the example project, restore snapshot “Version 5”. This code isn’t much different. It’s adding attributes as before but instead of adding them to the NSTextStorage, they are added via the NSLayoutManager. Now, compile and run it and, if it wasn’t automatically restored from before, load up the large document you used earlier. You should find that it keeps up with your typing this time around.

Now you could stop here but I’d like to take it a step further. You have this nice parser and all. Surely, the ability to pick out different entities in the document is useful for more than just syntax coloring. Maybe you want to add more smarts to your editor. Maybe triple clicks will select the whole comment or string literal instead of just a paragraph/line. Maybe you want to know where code blocks begin and end for code folding. Or maybe you want special tool tips when hovering over different entities.

You can ask the layout manager for the temporary attributes at any character position. Just check for the color that’s set there and deduce which entity (comment or string) it is. Suppose, though, that the user can change the colors in the preferences. If entities are set to be the same color, this method won’t work.

Creating Your Own Custom Attributes

There’s nothing saying that you can’t create and add your own attributes to an attributed string. Sure, only ones defined by Cocoa will be used by the system for display but let’s decide for the moment to annotate our text semantically. Instead of setting the NSForegroundColorAttributeName, we set our own custom attribute to indicate the type of element we’ve found.

Restore snapshot “Version 6” in the example project. At the top of the NoodleSyntaxHighlighter implementation you’ll see that I’ve defined our own attribute (“noodleElementType”) and the two possible values (“comment” and “string”). The -parse: method is the same as before except instead of setting colors, we set our own custom attribute, marking regions of text as comments or strings.

That’s great but how do we get our colors? If you look at the NSLayoutManagerDelegate protocol, you’ll see a method called -layoutManager:shouldUseTemporaryAttributes:forDrawingToScreen:atCharacterIndex: effectiveRange:. What this method does is allow you to substitute different attributes for the ones that are there already. In our implementation, we check what element type we set during the parse phase, look up its color and return it as the NSForegroundColor. So, our NSTextStorage has semantic markup but when displayed, we can substitute in the proper display attributes to get it to color properly.

• • •

Now, there’s one thing I glossed over which is an obvious optimization. Notice how we re-parse the whole document on every change. When you get the text storage notification, you can query it for the range of characters that was changed via its -editedRange method. You can use this information to limit the range of the document you have to re-parse. What you can do is look at what element is there already (from your annotations from the last parse pass) and start parsing from the beginning of that element. You can even tighten it up further by limiting how far it has to parse. I leave implementing this as an exercise for the reader.

As you can see, doing the coloring part isn’t so bad though there are some slightly subtle details to keep in mind. The hard part is the parsing which I’ve conveniently (for myself) left for you to figure out on your own. Enjoy.

And in case you missed the link to the example project above:

Category: Cocoa, OS X, Programming, User Interface 2 comments »

2 Responses to “Syntax Coloring For Fun And Profit”

  1. Adam

    Since temporary attributes are not supported on iOS what is the recommended approach to formatting the text without modifying the textStorage?

  2. Tim Larkin

    Thanks for this code, which allowed me quickly and lightly to color words in my application’s tiny language.

    I have one recommendation. Although the docs suggest that the textStorageWillProcessEditing delegate notification is not sent when the text is changed programmatically, my experience with OS 10.10.4 shows that applying temporary attributes causes the delegate notification to be raised. Unless your notification method checks whether the text has actually been changed, you will get the notification in an infinite loop.

Leave a Reply

Back to top