Implementing equals()

May 18th, 2006


Among Java developers, there exist [different ideas->http://www.artima.com/intv/bloch17.html] about how to implement the `[equals->http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Object.html#equals(java.lang.Object)]()` method. The disagreement is about whether an instance of a subclass may ever be considered equal to an instance of its superclass.

# Two ways to see it

Some say that it should be done using the `instanceof` operator, like this.

    public boolean equals(Object obj) {
        if (obj instanceof MyClass) {
            MyClass that = (MyClass) obj;
            // compare this and that
        } else {
            return false;
        }
    }

Others say that it should be done by comparing the exact classes of the two objects, like this.

    public boolean equals(Object obj) {
        if (obj != null && obj.getClass() == this.getClass()) {
            MyClass that = (MyClass) obj;
            // compare this and that
        } else {
            return false;
        }
    }

The difference lies in how they treat subclasses. As `instanceof` returns `true` if `obj` is of the same class *or any subclass*, it will allow subclasses to be considered semantically identical to their superclasses. Comparing the result of `getClass()` on both objects will require that they are of exactly the same class. Using that method, no subclass can ever be semantically equal to its superclass.
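As a minimal sketch of that difference, the following compares the two checks side by side. The classes `Animal` and `Dog` and the helper names are hypothetical and exist only for illustration. Note that `instanceof` already evaluates to `false` for `null`, which is why only the `getClass()` variant needs an explicit null check.

```java
class TypeCheckDemo {
    // True when obj is an Animal or an instance of any subclass; false for null.
    static boolean matchesByInstanceof(Object obj) {
        return obj instanceof Animal;
    }

    // True only when obj's runtime class is exactly Animal; needs its own null check.
    static boolean matchesByGetClass(Object obj) {
        return obj != null && obj.getClass() == Animal.class;
    }

    public static void main(String[] args) {
        Object dog = new Dog();
        System.out.println(matchesByInstanceof(dog));  // true
        System.out.println(matchesByGetClass(dog));    // false
        System.out.println(matchesByInstanceof(null)); // false
    }
}

// Hypothetical class hierarchy used only for this illustration.
class Animal {}

class Dog extends Animal {}
```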

To understand why this difference matters, it is important to remember one of the requirements Java places on any implementation of `equals()`: the equality relation must be symmetric. In other words, for any non-null objects `a` and `b`, the expression `a.equals(b)` must return `true` *if and only if* `b.equals(a)` also does.

Because of this, the `getClass()` camp argues that since we can't predict how subclasses might want to implement `equals()`, we had better implement `equals()` in such a way that it only compares objects of the same class. That way a subclass can safely override `equals()` and implement it any way it wants. For example, a subclass might have additional fields which it wants to use in the equality comparison. Otherwise we might end up in a situation where, for example, the superclass says that it is equal to the subclass but not the other way around.
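The situation the `getClass()` camp worries about can be sketched as follows. The classes `Point` and `ColorPoint` are hypothetical and not taken from any of the linked articles. Both use `instanceof`, and the subclass adds a field to the comparison, which is enough to break symmetry.

```java
class SymmetryDemo {
    public static void main(String[] args) {
        Point p = new Point(1, 2);
        ColorPoint cp = new ColorPoint(1, 2, "red");
        System.out.println(p.equals(cp)); // true: Point ignores the color
        System.out.println(cp.equals(p)); // false: p is not a ColorPoint
    }
}

// Hypothetical superclass that compares coordinates only.
class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof Point)) return false;
        Point that = (Point) obj;
        return x == that.x && y == that.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}

// Hypothetical subclass that also wants its extra field compared.
class ColorPoint extends Point {
    final String color;

    ColorPoint(int x, int y, String color) {
        super(x, y);
        this.color = color;
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof ColorPoint)) return false;
        ColorPoint that = (ColorPoint) obj;
        return super.equals(that) && color.equals(that.color);
    }
}
```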

The argument from the `instanceof` proponents is that not allowing a subclass to be considered equal breaks the notion of [polymorphism->http://en.wikipedia.org/wiki/Polymorphism_in_object-oriented_programming]. It also violates the [Liskov substitution principle->http://en.wikipedia.org/wiki/Liskov_substitution_principle] as well as the [principle of least astonishment->http://en.wikipedia.org/wiki/Principle_of_least_astonishment].

# How I see it

Okay, enough with the summary. Now I'll throw my own two cents in.

I think this whole question pretty much falls apart if we just make a better distinction between interface and implementation. An interface decides *what* a method is supposed to do, while a particular implementation knows *how* to do it. In other words, it is up to the interface to define what equality means for that interface. Being separated from its implementation(s), the interface must therefore define `equals()` in terms of comparisons of its own members. Thus, it cannot require the other object to "be of the same class" without having inappropriate knowledge about at least one of its implementations.

One problem is that a class in Java is somewhat ambiguous as it both has an interface and provides an implementation for it. This makes the interface more implicit. If you think it makes things clearer, extract an interface from your superclass and let it define (through documentation and unit tests) the exact behavior of the implementation. Then implement that exact behavior in any way you see fit in your class.

## Example: Collections in Java

A good example of how things should work is the collections framework in Java. If we look at the documentation for the `equals()` method of e.g. the `[Set->http://java.sun.com/j2se/1.5.0/docs/api/java/util/Set.html]` interface, we see the following. Pay special attention to the last sentence.

> Returns true if the specified object is also a set, the two sets have the same size, and every
> member of the specified set is contained in this set (or equivalently, every member of this set
> is contained in the specified set). This definition ensures that the equals method works
> properly across different implementations of the set interface.

This behavior is defined by the interface `[Set->http://java.sun.com/j2se/1.5.0/docs/api/java/util/Set.html]`. It is then implemented in the abstract class `[AbstractSet->http://java.sun.com/j2se/1.5.0/docs/api/java/util/AbstractSet.html]`, which is in turn extended by `[HashSet->http://java.sun.com/j2se/1.5.0/docs/api/java/util/HashSet.html]` and `[TreeSet->http://java.sun.com/j2se/1.5.0/docs/api/java/util/TreeSet.html]`. Neither of the two subclasses overrides `equals()`.
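A small sketch of what this means in practice: a `HashSet` and a `TreeSet` holding the same elements are equal in both directions. The class and method names below are made up for the demonstration, and the diamond syntax assumes Java 7 or later.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class SetEqualityDemo {
    // Builds a HashSet from the given elements.
    static Set<String> hashSetOf(String... elements) {
        return new HashSet<>(Arrays.asList(elements));
    }

    // Builds a TreeSet from the given elements.
    static Set<String> treeSetOf(String... elements) {
        return new TreeSet<>(Arrays.asList(elements));
    }

    public static void main(String[] args) {
        Set<String> hashSet = hashSetOf("a", "b", "c");
        Set<String> treeSet = treeSetOf("c", "b", "a");

        // Equality is defined by the Set interface, not by the concrete class.
        System.out.println(hashSet.equals(treeSet)); // true
        System.out.println(treeSet.equals(hashSet)); // true

        // Set also specifies hashCode() as the sum of the element hash codes.
        System.out.println(hashSet.hashCode() == treeSet.hashCode()); // true
    }
}
```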

## Example: Employees and Managers

In a blog entry, Cay Horstmann uses [another example->http://www.artima.com/weblogs/viewpost.jsp?thread=4744]. However, he is arguing for using `getClass()` rather than `instanceof`. His example includes a class hierarchy where we have an `Employee` class as the base and then a `Manager` class which extends `Employee` by adding a `bonus` field. Expressed in code, it would look something like this.

    public class Employee {
        // …
        public boolean equals(Object other) {
            if (!(other instanceof Employee)) return false;
            // cast other to Employee and compare fields
            // …
        }
    }

    public class Manager extends Employee {
        // …
        public boolean equals(Object other) {
            if (!super.equals(other)) return false;
            // cast other to Manager and compare fields
            return bonus == ((Manager) other).bonus;
        }
    }

In other words, the subclass `Manager` changes the meaning of equality it inherited from its superclass `Employee`. Is this a sensible thing to do? As you might guess, I would say no. We (hopefully) decided when we wrote `Employee` that there was a given way to uniquely identify any specific employee. That might have been an employee id, a combination of first name and last name, or any other combination of `Employee` members. For the sake of discussion, let us say we compare the values of the function `getId()`. Being an `Employee` itself, what reason would `Manager` have to change this definition? To check if two `Manager` objects have the same id but different values for the field `bonus`? Well, if we do have two such objects, then *that* is the problem. We shouldn't have allowed them to exist at the same time. The job of `equals()` is not to guard us from having corrupt data in our model!
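Here is a rough sketch of the alternative argued for above, assuming that identity is decided by `getId()` alone. The field types, constructors and `getBonus()` are my own assumptions and not part of Horstmann's example, and `Long.hashCode` assumes Java 8 or later. `Manager` adds its `bonus` field but inherits `equals()` unchanged, so the relation stays symmetric.

```java
class EmployeeIdentityDemo {
    public static void main(String[] args) {
        Employee employee = new Employee(42L);
        Employee manager = new Manager(42L, 1000.0);
        System.out.println(employee.equals(manager)); // true
        System.out.println(manager.equals(employee)); // true
    }
}

class Employee {
    private final long id; // assumed to identify an employee uniquely

    Employee(long id) {
        this.id = id;
    }

    public long getId() {
        return id;
    }

    // Equality is defined once, in terms of Employee's own members.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Employee)) return false;
        return getId() == ((Employee) other).getId();
    }

    @Override
    public int hashCode() {
        return Long.hashCode(getId());
    }
}

class Manager extends Employee {
    private final double bonus; // extra state, deliberately not part of equality

    Manager(long id, double bonus) {
        super(id);
        this.bonus = bonus;
    }

    public double getBonus() {
        return bonus;
    }
    // No equals() override: a Manager is identified the same way as any Employee.
}
```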

## Final thoughts

If a subclass truly has to override `equals()` to provide a semantically different implementation, it might instead be a sign that the subclass should not be a subclass after all. A subclass is supposed to extend or alter the behavior of a superclass or implement the behavior of an interface, but always within the boundaries of all inherited interfaces.

One way to make sure subclasses do not break the contract is to make the `equals()` method in the superclass `final`. However, I think that is overly protective. There could be a valid reason for wanting to override `equals()` while still preserving its semantics. Performance optimization could be one such reason.

Thus, in my humble opinion, the correct way to implement `equals()` is as follows.

    public boolean equals(Object obj) {
        if (obj instanceof MyInterface) {
            MyInterface that = (MyInterface) obj;
            // compare public members of this and that
        } else {
            return false;
        }
    }
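To make the pattern concrete, here is one possible sketch in which a hypothetical `Version` interface plays the role of `MyInterface`. All names are invented for illustration, and the static interface methods assume Java 8 or later. The interface defines equality in terms of its own public members, and two unrelated implementations simply delegate to it.

```java
class InterfaceEqualsDemo {
    public static void main(String[] args) {
        Version a = new TupleVersion(1, 5);
        Version b = new ParsedVersion("1.5");
        System.out.println(a.equals(b));                  // true
        System.out.println(b.equals(a));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}

// Hypothetical interface standing in for MyInterface.
interface Version {
    int major();
    int minor();

    // The contract: two versions are equal when major and minor both match.
    static boolean sameVersion(Version self, Object obj) {
        if (!(obj instanceof Version)) return false;
        Version that = (Version) obj;
        return self.major() == that.major() && self.minor() == that.minor();
    }

    // Hash code derived from the same members, so it agrees with sameVersion.
    static int hash(Version v) {
        return 31 * v.major() + v.minor();
    }
}

// Stores the two numbers directly.
class TupleVersion implements Version {
    private final int major;
    private final int minor;

    TupleVersion(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    public int major() { return major; }
    public int minor() { return minor; }

    @Override
    public boolean equals(Object obj) { return Version.sameVersion(this, obj); }

    @Override
    public int hashCode() { return Version.hash(this); }
}

// Stores the text "major.minor" and parses it on demand.
class ParsedVersion implements Version {
    private final String text;

    ParsedVersion(String text) {
        this.text = text;
    }

    public int major() { return Integer.parseInt(text.substring(0, text.indexOf('.'))); }
    public int minor() { return Integer.parseInt(text.substring(text.indexOf('.') + 1)); }

    @Override
    public boolean equals(Object obj) { return Version.sameVersion(this, obj); }

    @Override
    public int hashCode() { return Version.hash(this); }
}
```

This mirrors the approach of the collections framework, where `Set` specifies both `equals()` and `hashCode()` at the interface level.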

However, my opinion might not be the same as yours, and your opinion might not be the same as mine (for symmetry purposes ;-) ). If you happen to have another opinion, feel free to comment or trackback!

Entry Filed under: Software engineering

1 Comment

  • 1. Loriel  |  May 19th, 2006 at 10:26

    You almost convinced me! =) But I still have the feeling that there ought to be exceptions… I'll think about it some more when I have the time…

