Objective-C Guide For Developers, Part 4

Now that we know how to create our own classes, we will go over some useful features of the language that help with organizing class interfaces and managing memory.

Table of contents

Categories
Protocols
ARC and memory management

Categories

Objective-C has a powerful and useful feature that many other languages lack: categories. It’s good programming practice to keep the inheritance hierarchy as shallow as possible, since inheritance introduces complexity when overriding methods. The common way to do this is to use composition (objects that use other objects) and leave inheritance to cases where it’s necessary. For example, instead of subclassing NSArray, it’s better to write a class that uses an NSArray instance internally. Objective-C, though, offers another option alongside composition: categories, which allow you to add methods to existing classes.

This includes any class, so you can also add methods to classes from external frameworks, including the ones provided by Apple. This is very powerful: not only do you not need to subclass a class to add behavior to it, but the methods you add are also available to all of its subclasses, which subclassing alone cannot give you. Moreover, your methods are available on instances used internally by other classes. For example, UIViewController objects create their own UIView instance if you don’t provide one, and UILabel objects have their own instance of the UIFont class. With categories, your methods will also be available on these instances created by classes you don’t own.

The category declaration uses the @interface keyword like the class declaration but does not indicate any inheritance. Instead, it specifies the name of the category in parentheses:

@interface ClassName (CategoryName)

@end

A category then has an @implementation section like a normal class, where you put the additional method implementations:

@implementation ClassName (CategoryName)

@end

A category usually has its own .h and .m files, like a normal class. The file names combine the name of the class and the name of the category, separated by a +, in the form ClassName+CategoryName.h (or .m).

At runtime, there’s no difference between a method added by a category and one that is implemented by the original class.

Let’s say for example that we want to add a method to NSString to know if a string starts with a capital letter. The declaration of the category would be as follows:

#import <Foundation/Foundation.h>

@interface NSString (Capitals)

- (BOOL)startsWithACapitalLetter;

@end

And this is the implementation:

#import "NSString+Capitals.h"

@implementation NSString (Capitals)

- (BOOL)startsWithACapitalLetter {
    // An empty string has no first character; characterAtIndex: would throw.
    if (self.length == 0) {
        return NO;
    }
    unichar firstCharacter = [self characterAtIndex:0];
    return [[NSCharacterSet uppercaseLetterCharacterSet]
        characterIsMember:firstCharacter];
}

@end

You can then call this method on any NSString instance, even those coming from literals:

NSString *myCapitalizedString = @"This string starts with a capital letter";
if ([myCapitalizedString startsWithACapitalLetter]) {
    ...
}

Categories can add methods to classes, but not instance variables. So if you need to add functionality to a class that requires storing some value, a category alone is not enough: the usual option is to create a new subclass (the runtime’s associated objects are another, more advanced, route).

With properties the behavior is partial. As we have seen, in Objective-C a property adds accessor methods to a class. This works in categories too, so a category can declare new properties for a class. But a category cannot add new instance variables to a class, and this holds true for properties as well: properties added through a category are not backed by instance variables like normal properties are. So, when you add properties through a category, you always have to provide the accessors yourself, since the compiler will not synthesize them for you. Moreover, these accessors can only reference existing instance variables.
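As a minimal sketch of a category property, the example below adds a read-only wordCount property to NSString and writes its getter by hand. The category name Words and the property wordCount are hypothetical, chosen only for illustration. The value is computed from the string itself, since no new instance variable is available.

```objc
#import <Foundation/Foundation.h>

// Hypothetical category (Words) and property (wordCount), for illustration only.
@interface NSString (Words)

// Declares the accessor -wordCount; no instance variable is created.
@property (readonly) NSUInteger wordCount;

@end

@implementation NSString (Words)

// The getter is written by hand: the compiler does not synthesize it in a category.
- (NSUInteger)wordCount {
    NSCharacterSet *separators = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSUInteger count = 0;
    // Consecutive separators produce empty pieces, which are skipped.
    for (NSString *piece in [self componentsSeparatedByCharactersInSet:separators]) {
        if (piece.length > 0) {
            count++;
        }
    }
    return count;
}

@end
```

The property can then be read with the usual dot syntax, for example someString.wordCount.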

Be careful not to override existing methods in categories. Although I’ve seen some developers claim that this is fine, it’s not. According to Apple’s documentation, if a method in a category has the same name as an existing method, which implementation is used at runtime is undefined, so you can never be sure that yours is the one that wins. Categories are not a valid way to override methods.

To avoid name collisions when you declare a method on a class you don’t own, it’s best practice to prepend a prefix to the method name. The prefix is lowercase, to respect the conventions for method names, and is usually separated from the rest of the name by an underscore.
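Applied to the earlier NSString example, a prefixed version might look like the sketch below. The prefix xyz and the category name XYZCapitals are placeholders; a real project would use its own prefix.

```objc
#import <Foundation/Foundation.h>

// "xyz" is a placeholder prefix, not a real convention of any framework.
@interface NSString (XYZCapitals)

- (BOOL)xyz_startsWithACapitalLetter;

@end

@implementation NSString (XYZCapitals)

- (BOOL)xyz_startsWithACapitalLetter {
    // Guard against the empty string, which has no first character.
    if (self.length == 0) {
        return NO;
    }
    unichar firstCharacter = [self characterAtIndex:0];
    return [[NSCharacterSet uppercaseLetterCharacterSet]
        characterIsMember:firstCharacter];
}

@end
```

If Apple ever adds its own startsWithACapitalLetter method to NSString, the prefixed name keeps the two from clashing.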

Interface extension

A special type of category is the class interface extension, also known as an anonymous category. An interface extension can only be added to your own classes, and the methods it declares are usually implemented in the class’s own @implementation block instead of a separate category implementation. An interface extension is declared without specifying a category name in the parentheses:

@interface ClassName ()

@end

What is special about it is that, unlike other categories, an interface extension can declare new instance variables, and the properties it declares behave like properties declared directly in the class interface (they are backed by instance variables and their accessor methods are automatically synthesized by the compiler).

Interface extensions are used to declare private information for a class. While other languages have a special keyword for this, Objective-C solves this problem with interface extensions. They also make it possible to expose partially private methods and properties to selected classes only, by declaring the interface extension in a separate header file that is imported only by those classes. This is how Apple declares its own private API, which is not available to other developers.
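The following sketch shows an interface extension used to keep a property and a method private. The Counter class and all of its members are hypothetical. In a real project the first @interface would live in Counter.h and everything else in Counter.m.

```objc
#import <Foundation/Foundation.h>

// Counter.h (hypothetical class): the public interface.
@interface Counter : NSObject

- (void)increment;
- (NSUInteger)currentValue;

@end

// Counter.m: the interface extension, visible only inside this file.
@interface Counter ()

// Backed by a synthesized instance variable, unlike a category property.
@property NSUInteger count;

- (void)logCount;

@end

@implementation Counter

- (void)increment {
    self.count = self.count + 1;
    [self logCount];
}

- (NSUInteger)currentValue {
    return self.count;
}

// Declared in the extension, implemented in the main @implementation block.
- (void)logCount {
    NSLog(@"count is now %lu", (unsigned long)self.count);
}

@end
```

Code that imports only the public interface sees increment and currentValue, but the compiler does not let it use count or logCount.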

Protocols

Sometimes you need to declare a minimum interface that a class needs to implement to interact with another class. A class interface or a category declares methods that are specific to a class, while a protocol declares properties and methods that are independent of any class and can be implemented by many different classes. Other languages have a feature similar to protocols (Java calls them interfaces, which might generate some confusion at first if you are a Java developer, since interfaces are a different thing in Objective-C). When a class conforms to a protocol, it must implement the required methods declared by it.

A very common example is the UITableView class. UITableView is a class found in iOS that displays a vertically scrollable list of items. If you use an iPhone or an iPad you have already come across one, since it’s ubiquitous. Since UITableView is a generic class used to display many different kinds of lists, with varied presentation of the items, all this information needs to be provided to the table view by some other objects.

UITableView defines two protocols, UITableViewDataSource and UITableViewDelegate, which declare the methods it expects two other objects, called the data source and the delegate, to implement so that it can retrieve the information it needs. Since the protocols are separate, the two roles can be played by separate classes, but they are often the same class.

The data source implements methods that tell the table view how many sections and items there should be, and it provides the items when the table view asks for them in order to display them on screen. The delegate instead provides information on the presentation of these items, like the kind and size of the views used to represent them (called cells).

Any class (usually implemented by you) can be the data source or delegate of a table view and to do so it needs to conform to these two protocols.

A protocol is declared with the @protocol keyword:

@protocol ProtocolName

@end

Inside the protocol interface you declare the methods that a conforming class needs to implement. It is possible to declare optional methods in a protocol that a conforming class can implement only if it needs to. You do so using the @optional directive in the protocol declaration:

@protocol ProtocolName

// list of required methods

@optional

// list of optional methods

@end

There is also a @required directive to switch back to declaring required methods, but it’s better not to switch back and forth for the readability of the protocol. If you mark some method as optional, you will have to check if the receiving object implements the method before calling it, or you will get an exception. You check this by using the respondsToSelector: method of NSObject (so it’s available to every class). This method takes a selector as a parameter, which you can obtain with the @selector() directive around a method name, in this way:

if ([object respondsToSelector:@selector(someMethod)]) {
    [object someMethod];
}

To indicate that a class conforms to a protocol, the protocol name is placed in angle brackets in the class interface:

@interface ClassName : Superclass <Protocol>

@end

A class can conform to multiple protocols, which are then separated by commas inside the angle brackets:

@interface ClassName : Superclass <Protocol, SecondProtocol, ThirdProtocol>

@end

The same syntax is used to declare that a variable or a property contains an object that must conform to one or more protocols:

id <protocol-list> variableName;

or:

@property id<protocol-list> propertyName;

In this way the compiler will check that the object stored in the variable or property conforms to the protocol, helping to avoid programming errors.

Protocols can conform to other protocols, to include the methods declared in the latter. You specify this conformance in the same way you do for a class:

@protocol ProtocolName <protocol-list>

@end
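The pieces above can be put together in a small sketch: a protocol with one required and one optional method, a class that conforms to it, and a function that accepts any conforming object. The names Greeter, EnglishGreeter and Conversation are made up for this example. The protocol conforms to the NSObject protocol so that respondsToSelector: can be called on a value of type id<Greeter>.

```objc
#import <Foundation/Foundation.h>

// Hypothetical protocol; conforming to <NSObject> makes respondsToSelector:
// available on any id<Greeter>.
@protocol Greeter <NSObject>

- (NSString *)greeting;      // required

@optional
- (NSString *)farewell;      // optional

@end

// Implements only the required method.
@interface EnglishGreeter : NSObject <Greeter>
@end

@implementation EnglishGreeter

- (NSString *)greeting {
    return @"Hello";
}

@end

// Accepts any object that conforms to Greeter, whatever its class.
static NSString *Conversation(id<Greeter> greeter) {
    NSString *text = [greeter greeting];
    // farewell is optional, so check before calling it.
    if ([greeter respondsToSelector:@selector(farewell)]) {
        text = [text stringByAppendingFormat:@" / %@", [greeter farewell]];
    }
    return text;
}
```

Because EnglishGreeter does not implement farewell, Conversation skips that call instead of raising an exception.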

ARC and memory management

Other languages usually take one of two approaches to memory management: either it is left completely to the developer (like in C or C++) or it is handled by a garbage collector (like in Java, C#, Python or Ruby).

In the first case the developer has to know when to allocate and, especially, when to release memory “manually”, while avoiding accessing memory that has not been allocated yet or releasing memory that is still in use. These tasks are tedious and error prone and might lead to crashes, unexpected behavior, or leaks that eventually fill up all the available memory.

In the case of the garbage collector, the developer hands memory management over to a process that periodically scans the memory and releases what is no longer used. This relieves a lot of pain, so it has become the preferred way in modern languages, but it still has some pitfalls. To know which parts of memory are in use, the garbage collector follows the references between the objects that are in memory at a given time. Anything the program can still reach is kept, so the developer has to pay attention not to hold on to references that are no longer needed (for example in a long-lived collection), or that memory is never released.

Objective-C comes from a history of semi-automatic memory management. For a long time Apple used an in-between approach called reference counting. Reference counting works this way: whenever an object needs to keep a reference to another object, it retains it. Retaining an object increases its count of references by one. When the object is not needed anymore, each object that retained it has to release it. Releasing decrements the count by one. When an object reaches a retain count of 0, it gets removed from memory by the runtime.

Retaining and releasing are still the responsibility of the developer and, if done wrong, they still lead to accessing deallocated memory (which usually causes crashes) or to memory leaks. The benefit is that reference counting allows the developer to think about memory locally, asking when an object needs to retain another, instead of globally. This alleviates much of the pain and, paired with some common programming patterns, was much safer and easier than fully manual memory management.

For a period Apple offered garbage collection on the Mac, starting with Mac OS X 10.5. But the iPhone, which came out at about the same time, had resources too limited to run a garbage collector. One downside is that the garbage collector needs to run periodically, and program execution is halted while it does, to avoid problems with changing references. This is usually not perceived on a normal computer, but on a phone with limited resources it freezes apps for some time, leading to a bad user experience. Another downside is that the memory allocated by the program keeps accumulating until the garbage collector is activated, which is again a problem on a device with a very limited amount of memory. For these reasons, when the iOS SDK was released, Apple stayed with reference counting on iOS.

In modern Objective-C, memory management is done through what is called ARC. Manual reference counting is still supported for legacy code, but since ARC works back to iOS 4 and Mac OS 10.6, manual reference counting should not be needed anymore and we will not look at how it works.

So, what is ARC? As I said, reference counting is guided by common patterns and best practices on when retain and release should be performed and on how to name methods that involve reference counting. For this reason Apple saw an opportunity to automate it and introduced Automatic Reference Counting, or ARC.

ARC takes reference counting out of the developer’s hands and automates it in the compiler. The benefit is that, in addition to taking responsibility for tedious memory management away from the programmer, ARC does its work at compile time, when the binary of the app is created, so no separate runtime process is needed that might slow down the device. What the compiler actually does is add the proper retain and release calls to the code where they are needed.

ARC has been highly optimized, so it generally works faster than memory management done manually. Moreover, it forces some memory checks into the compiler, which reports problems that the developer must fix before the app compiles, removing many memory management errors. Since at this point in time ARC is supported on the vast majority of machines and devices, it is advised to migrate all code bases to it, so you will probably never have to learn manual reference counting. Xcode has a tool to automate this transition as much as possible.

ARC still suffers from one pitfall, though. If two or more objects reference each other in a circle, a retain cycle is created and the memory used by those objects is never released, even when nothing else references them.

To avoid retain cycles, Objective-C has some lifetime qualifiers. For properties, two qualifiers exist: strong and weak. The default is strong, which signals that the referenced object needs to be kept in memory as long as that reference exists. Thus, the standard declaration we saw for properties

@property Class *propertyName;

is the same as

@property (strong) Class *propertyName;

If you need to create a reference cycle to make two (or more) objects communicate with each other in a circular manner, one of the two needs to have a weak reference to the other. This is used a lot, for example, in the delegate design pattern, a very common pattern in Objective-C. To avoid a retain cycle, one of the classes still uses a strong property:

@class ClassB; // forward declaration, since ClassB is defined later

@interface ClassA : NSObject

@property ClassB *objectB;

@end

while the other uses a weak one:

@interface ClassB : NSObject

@property (weak) ClassA *objectA;

@end

When nothing else references objectA anymore, it is removed from memory, because weak references are not counted. objectA is not retained by objectB, so it can be released as soon as no other object references it. When objectA is gone, its strong reference to objectB is removed, so objectB is removed too (if it’s not referenced strongly from anywhere else). When an object referenced weakly is removed from memory, all the weak references pointing to it are automatically set to nil, so the referencing objects can still safely send messages through them (messages to nil do nothing).
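The behavior described above can be observed with a short sketch. It assumes the code is compiled with ARC enabled, and the Parent and Child classes and the MakeOrphan function are hypothetical names standing in for ClassA and ClassB.

```objc
#import <Foundation/Foundation.h>

@class Child;

@interface Parent : NSObject
@property (strong) Child *child;    // the parent owns the child
@end

@interface Child : NSObject
@property (weak) Parent *parent;    // back-reference that does not retain
@end

@implementation Parent
@end

@implementation Child
@end

// Returns a child whose parent has already been released.
static Child *MakeOrphan(void) {
    Child *child = [Child new];
    @autoreleasepool {
        Parent *parent = [Parent new];
        parent.child = child;
        child.parent = parent;
    }   // the only strong reference to the parent is gone by this point
    return child;
}
```

After MakeOrphan returns, reading the parent property of the returned child gives nil rather than a dangling pointer. If the parent property were strong instead, the two objects would keep each other alive.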

When using normal variables or instance variables, the corresponding lifetime qualifiers are __strong and __weak. As per Apple documentation, the qualifier needs to be specified after the * in the declaration, with this syntax:

NSArray * __weak array;

Although the documentation says that the compiler “forgives” other variants, it’s better not to use them since you never know when in the future the compiler won’t be so kind anymore.

Pay attention to __weak variables. When there is no strong reference to the object they point to, that object is deallocated immediately, leading to this common problem:

NSMutableArray * __weak strings = [NSMutableArray new];
[strings addObject:@"Hello"]; // On this line strings is already nil

On the second line, the array does not exist anymore and the variable is already nil. This is because, even though the second line uses the variable, nothing holds a strong reference to the array when it is created, so it is deallocated immediately. Other cases are more subtle and not so easy to spot, so pay attention when using __weak variables.
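One way to avoid the problem, sketched here under the assumption that ARC is enabled, is to keep a strong reference to the object for as long as the weak one is used. The function name CountThroughWeakReference is hypothetical.

```objc
#import <Foundation/Foundation.h>

static NSUInteger CountThroughWeakReference(void) {
    // The strong variable keeps the array alive at least until its last use below.
    NSMutableArray *strings = [NSMutableArray new];
    NSMutableArray * __weak weakStrings = strings;

    [weakStrings addObject:@"Hello"];   // weakStrings is not nil here
    return strings.count;               // last use of the strong reference
}
```

Here the weak variable still points to a live array, so the addObject: message is delivered instead of being silently sent to nil.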

There are two more qualifiers for variables: __unsafe_unretained and __autoreleasing. The first one works like __weak, but the variable is not set to nil when the object it references is deallocated. For this reason it’s unsafe (as the name implies): it leaves a “dangling” reference to deallocated memory. Calling a method through it leads to crashes, unlike calling a method on nil, which is safe. This qualifier exists only to support ARC on iOS 4 and Mac OS 10.6, where weak references are not available, so you will probably never need it. If you inherit old codebases, also pay attention to the assign qualifier on object properties, if you find any, because it is equivalent to __unsafe_unretained. Change it to weak (assign remains correct for non-object types such as integers).

The __autoreleasing qualifier is used in parameters of methods passed by reference, which we will see later.