A framework is presented for intelligent image interpretation in which the system's generic knowledge is separated from model-specific and task-specific knowledge, making the system easier to modify, extend, and tailor to a particular environment. The system incorporates three types of knowledge: knowledge about objects, about relations, and about reasoning and problem solving. Model knowledge is represented hierarchically at three levels: primitive geometric entities, perceptual structures, and volumetric and functional primitives. Knowledge of relations is used to form constraints, which in turn drive reasoning during interpretation. The reasoning process is described, and the handling of position and orientation is discussed. An example shows how the system uses bottom-up reasoning and top-down verification to match a set of images containing a single object to its model in the database.
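
The interplay of hierarchical model knowledge, relational constraints, and bottom-up hypothesis followed by top-down verification can be illustrated with a minimal sketch. It is not the system described here: the three level names come from the text, but every class, function, and model name (`Entity`, `Constraint`, `ObjectModel`, `hypothesize`, `verify`, `interpret`, "bracket", "rail"), the 5-degree tolerance, and the reduction of position and orientation to a single image-plane angle are assumptions made for illustration only.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Callable, List, Sequence, Tuple

# The three level names follow the text; all other names are hypothetical.
LEVELS = ("geometric", "perceptual", "volumetric_functional")


@dataclass(frozen=True)
class Entity:
    name: str
    kind: str         # e.g. "line"
    level: str        # one of LEVELS
    angle_deg: float  # orientation reduced to one angle (simplifying assumption)


@dataclass(frozen=True)
class Constraint:
    relation: str
    kinds: Tuple[str, str]
    test: Callable[[Entity, Entity], bool]


@dataclass(frozen=True)
class ObjectModel:
    name: str
    required_kinds: Tuple[str, ...]
    constraints: Tuple[Constraint, ...]


def angle_between(a: Entity, b: Entity) -> float:
    """Smallest angle between two undirected orientations, in [0, 90] degrees."""
    d = abs(a.angle_deg - b.angle_deg) % 180.0
    return min(d, 180.0 - d)


def hypothesize(entities: Sequence[Entity], models: Sequence[ObjectModel]) -> List[ObjectModel]:
    """Bottom-up step: keep models that share at least one entity kind with the scene."""
    seen = {e.kind for e in entities}
    return [m for m in models if seen & set(m.required_kinds)]


def verify(model: ObjectModel, entities: Sequence[Entity]) -> bool:
    """Top-down step: every required kind must be present and every
    constraint must hold for at least one ordered pair of entities."""
    seen = {e.kind for e in entities}
    if not set(model.required_kinds) <= seen:
        return False
    return all(
        any(a.kind == c.kinds[0] and b.kind == c.kinds[1] and c.test(a, b)
            for a, b in permutations(entities, 2))
        for c in model.constraints
    )


def interpret(entities: Sequence[Entity], models: Sequence[ObjectModel]) -> List[str]:
    """Names of hypothesized models that survive verification."""
    return [m.name for m in hypothesize(entities, models) if verify(m, entities)]


TOL = 5.0  # degrees; arbitrary tolerance chosen for this sketch

parallel = Constraint("parallel", ("line", "line"),
                      lambda a, b: angle_between(a, b) < TOL)
perpendicular = Constraint("perpendicular", ("line", "line"),
                           lambda a, b: abs(angle_between(a, b) - 90.0) < TOL)

# Toy model database and scene, both invented for illustration.
MODELS = [
    ObjectModel("bracket", ("line",), (perpendicular,)),
    ObjectModel("rail", ("line",), (parallel,)),
]
SCENE = [
    Entity("e1", "line", "geometric", 10.0),
    Entity("e2", "line", "geometric", 100.0),
]

print(interpret(SCENE, MODELS))  # ['bracket']
```

In this toy scene both models are hypothesized bottom-up because each involves line entities, but only "bracket" survives top-down verification, since the two observed lines are perpendicular rather than parallel. A fuller treatment would carry 3-D position and orientation and would populate the perceptual and volumetric levels as well.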