The Structure of Language

“Human language serves as a good example of the evolution of a robust, redundant, and relatively noise-insensitive means of social communication. Errors are corrected so effortlessly that often neither party is aware of the error or the correction...The result is a marvelously complex structure for social interaction and communication.”

- Don Norman, The Invisible Computer

To discuss accurately how language is used to communicate, we start with what makes up the structure of language. It will help to look at language as an onion: we will peel away layers to reveal the smaller, more basic layers beneath until we reach the core. With that in mind, let’s begin. A language’s lexicon is its vocabulary, the set of all words belonging to that language; put another way, the lexicon is the entire inventory of that language’s lexemes. Lexemes are, in short, words; each lexeme is the set of all forms taken by a single word (e.g. run, ran, runs, etc.), and lexemes are used to express sememes. A sememe, in turn, is a unit of the transmitted or intended meaning of a word (for example, “go,” “run,” and “skate” all share the semantic meaning of action). Sememes can be further deconstructed into morphemes, which are the smallest linguistic units that have semantic meaning. Tricky? It is here that we discover that our lexical onion has two cores, for we have both verbal and written language: phonemes are the smallest linguistically (i.e. audibly) distinctive units of sound and are thus used to describe spoken language, while graphemes are the most fundamental units of written language.

As this structure can be complicated, here is an example. Think of a word. Any word. Is that word “cough”? Good. If we were to write “cough”, we would use the letters c, o, u, g, and h. The written word “cough” is composed of five graphemes (the individual letters); however, it contains only three phonemes, as “ou” together make only one identifiable sound, as do “gh”. Our word “cough” is a morpheme (composed of phonemes), and could be used to express the sememes “sick” or “diseased.” Other forms of the word “cough” (such as “coughing” or “coughed”) are part of the same lexeme. And as “cough” is used in English, we can say that “cough” is part of the English language lexicon.

This study of lexicography is quite complex; however, it raises questions about language, and in turn about the various forms of language: verbal, written, and visual. What could we say are the graphemes of visual information? Or perhaps more importantly, what are the sememes of visual information? What is it in language that helps us visualize? What types of combinations of phonemes, sememes, and lexemes allow us to get a clear picture of what someone is telling us?


fig. 22. the “lexicography taxonomy onion”. 

One clear aspect of spoken and written language that helps us convey our message is syntax, or the systematic order in which we place the words that we speak or write. It is not enough merely to know the parts of speech; we must also know how to arrange them into an order that transmits our ideas. But how much of that order is culture-specific? Eva Belke, of the School of Life and Health Sciences at Aston University in Birmingham, UK, conducted a series of studies involving a referential communication task. This type of task requires participants to describe a target object in such a way that a listener could correctly pick out the same target object from a display of multiple objects. She prefaces these studies as follows:

“Making verbal reference to objects in the outside world is one of the fundamental functions of language. Depending on the complexity of the situation, referring expressions may differ with regard to their degree of elaboration. For instance, when speakers refer to an object in the context of several similar objects, they often have to specify it by means of a set of features that clearly distinguish it from the other objects” |Belke, 2006|.

But what of the way in which speakers vocalize these specific features?

One would assume that the dimensions of objects that are immediately visually available (e.g. color) would be described before something relative (e.g. size). However, the results of her studies revealed something intriguing about language syntax and cultural influence: the specificity of our descriptions is organized by proximity to the object we are describing. This holds true for both prenominal languages (such as German and English, where adjectives come before the noun they describe) and postnominal languages (such as Spanish, where descriptors follow the noun). As Belke concludes: “The dimensions that are easiest to detect (e.g. absolute dimensions) are commonly placed closer to the noun than other dimensions (e.g. relative dimensions). This stands in stark contrast to the assumption that language production is an incremental process” |Belke, 2006|. In her trials, participants typically included the visual dimensions of shape, color, and size when verbalizing descriptions of the target object on a computer screen display. These target objects were located in a field of other similar (though not identical) objects on the screen, meaning that participants had to differentiate the target object from its neighboring objects to a degree they believed to be sufficient (see fig.__ for an example display).

In daily speech, these findings mean that if someone were attempting to describe a blue boat that was large, the prenominal-language speaker would describe it as the “large, blue boat” while a postnominal-language speaker would describe it as the “boat blue large.” This is a seemingly small syntactic rearrangement, but the implications of these two descriptive techniques lie in their respective advantages in a communication scenario: prenominal descriptions create an advantage for the listener, while postnominal descriptions do so for the speaker.


fig. 23. comparison of prenominal and postnominal languages shows how each would describe the “big white boat” (prenominal) or the “boat white big” (postnominal). In both cases, the most absolute characteristics are placed closest to the subject of the statement.
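
The ordering rule described above can be made concrete with a small Python sketch. It is only an illustration, not Belke's method: the ABSOLUTENESS ranking and the describe function are hypothetical names introduced here, and the sketch assumes just two dimensions, with color treated as more absolute than size.

```python
# Hypothetical ranking (an assumption for illustration): a higher number
# means the dimension is more "absolute", i.e. detectable without
# comparing the object to its neighbors.
ABSOLUTENESS = {"size": 0, "color": 1}


def describe(noun, attributes, prenominal=True):
    """Build a description in which the most absolute attribute sits
    closest to the noun.

    attributes maps a dimension to a word, e.g.
    {"size": "large", "color": "blue"}.
    """
    # Order the dimensions from least absolute to most absolute.
    ordered = sorted(attributes, key=lambda dim: ABSOLUTENESS[dim])
    words = [attributes[dim] for dim in ordered]
    if prenominal:
        # Adjectives precede the noun, so the most absolute comes last.
        return " ".join(words + [noun])
    # Adjectives follow the noun, so the most absolute comes first.
    return " ".join([noun] + words[::-1])


print(describe("boat", {"size": "large", "color": "blue"}))
print(describe("boat", {"size": "large", "color": "blue"}, prenominal=False))
```

In both outputs the color word ends up adjacent to "boat" and the size word is farther away; only the side of the noun on which the descriptors sit changes.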

Listener-Advantage: by hearing a description that begins with relative information (information that relies upon one object’s comparison with other objects), the listener is allowed more time to narrow down the visually matching possibilities before hearing the name of the object being described. This convergent system, in which the set of objects matching the increasingly specific description shrinks with each word, allows the listener to essentially “zero in” on the correct object.

Speaker-Advantage: describing first the characteristics of an object that are more clearly defined (absolute dimensions) allows the speaker to embellish the target object description, creating a more complete and detailed account of that object. This postnominal system is divergent in nature and allows the speaker to “layer on” information to further describe the object.

Essentially, Belke describes the existence of a certain verbal spatial relationship, that of object-to-attribute, in which we can tell the level of certainty of the descriptors by how they are arranged, in this case by how close they are to the intended object. This understanding only helps us to an extent: we are aware of the relationship, but if we still do not know the correct verbal language, what are we to do? Here is one distinct example of how verbal language can translate into visual language: utilizing spatial relationships, even forcing them in some cases, is one way of visually presenting the semantic connections between various elements of an idea, a design, or a system.

This focus on the exchange of semantic meaning as a purpose of communication prompts the question: how do we process semantic meaning when we are listening to someone speak to us? Gerry Altmann of the University of York, UK, and Falk Huettig of Ghent University in Belgium offer some insight into this question and into how their answer blends with the effects of language on visual attention. Altmann and Huettig performed a series of experiments wherein participants were asked to view an array of four unrelated images and listen to a speaker read a sentence aloud. Using eye-tracking technology, they recorded the eye movements of the participants during the trials, from which they observed two interesting phenomena. The first is that as a spoken word unfolds (i.e. as it is being spoken), visual attention can be directed immediately toward the image in the array that is conceptually related to the spoken word. For instance, let’s say the array included pictures of a tree, a balloon, a sheep, and a car (fig. 24). If the spoken sentence was, “A man looked in the field, and there was a sheep,” then upon hearing the word “sheep” participants would look at the sheep in the visual array. As they explain, “Participants could orient their gaze toward an object’s spatial location because its structural representation matches the visual representation of the concept activated by the phonetic input” |Altmann, 2007|. This means that when we hear the word “sheep” we imagine what a sheep looks like, and we direct our eyes toward an available image of a sheep; indeed, all of this happens quickly, even as we are hearing the word.

fig. 24. an example of a typical four-quadrant visual array used in Altmann and Huettig’s semantic priming trials to accompany the statements, “A man looked in the field, and there was a sheep,” and “The shepherd looked in the field, and there was a sheep.”


This finding is amusing but not extraordinary. However, Altmann and Huettig also found that if the sentence was structured in such a way as to prime the participants toward a particular target image, then participants would direct their attention toward the target object even before hearing the target word. To use the sheep image example again, if the sentence were changed to “The shepherd looked in the field, and there was a sheep,” the mere mention of a shepherd would prime participants to direct their attention toward the image of the sheep because “sheep” and “shepherd” are semantically related (quite closely).

To examine how far these semantic similarities could be stretched, they included more trials wherein the array did not include an image of the target word, but an image that shared visual characteristics with what an image of the target word would look like. In the sheep example, the image of a sheep in the four-image display would be changed to something visually similar, perhaps a fluffy cloud (fig. 25). Altmann and Huettig’s findings revealed that upon hearing the sentence, “The shepherd looked in the field, and there was a sheep,” participants, when viewing an array of four pictures (of a tree, a balloon, a cloud, and a car), would direct more attention to the cloud than to the other non-“shepherd”-related images. This indicates that “shifts in overt visual attention occur towards items related to words in the language when there is some featural match between the target specification accessed by the spoken word and the properties of the objects in the visual display” |Altmann, 2007|.


fig. 25. an example of a conceptually mismatched four-quadrant visual array used in Altmann and Huettig’s semantic priming trials to accompany the statement, “The shepherd looked in the field, and there was a sheep.”

What Altmann and Huettig’s work reveals is that as we receive audible inputs (in the form of spoken, verbal language), we create in our minds a conceptual representation of that information, and we involuntarily tend to shift our visual attention to stimuli in our field of view that match our conceptual representation, either visually or semantically. The words of David Rowsell, author of “Design, Communication and the Functional Aesthetic”, seem appropriate here:

“When communicating, it is more than useful to have some idea of the state of mind of your audience. What beliefs, preconceptions and predilections does your audience have? Our words may be misunderstood, meanings can go astray. Mistakes of meaning can be avoided only if we put in some work on preparing our audience for what is to be said. Aesthetics is one such means of preparation at the disposal of designers” |Triggs, 1995|.

Rowsell touches on some of the same topics as the studies of Altmann and Huettig: by understanding who our audience is and how they think, we can overcome language mismatches by visually connecting our message with the conceptual representations that lie in the minds of our audience.