Navigation, the ability to organize behavior adaptively to move from one place to another, appeared early in the evolution of animals and occurs in all mobile species. At the simplest level, navigation may require only movement toward or away from a stimulus, but at a more sophisticated level, it involves the formation of complex internal representations of the environment, the subject’s position within it, the location of goals, the various routes from the current position to the goal, and possible obstacles along the way. The vast array of navigational capabilities across species has made it challenging for students of comparative cognition to formulate unifying frameworks to describe and understand these capabilities, although this variety also offers an exciting opportunity to ask hypothesis-driven comparative questions. A unifying framework, the navigation toolbox, is proposed to provide a way of formulating common underlying principles that operate across many different taxa. The toolbox contains a hierarchy of representations and processes, ranging in complexity from simple and phylogenetically old sensorimotor processes, through the formation of navigational “primitives” such as orientation or landmark recognition, up to complex cognitive constructs such as cognitive maps, and finally culminating in the human capacity for symbolic representation and language. Each element in the hierarchy is positioned at a given level by virtue of being constructed from elements in the lower levels and having newly synthesized spatial semantic content in its representations that was not present in the lower levels. In studying individual species, the challenge is to determine how given elements are implemented in that species, in view of its particular behavioral and anatomical constraints.
The challenge for the field as a whole is to understand the semantic structure of spatial representations in general, which ultimately entails understanding the behavioral and neural mechanisms by which semantic content is synthesized from sensory inputs, stored, and used to generate behavior.