Emaki Productions: The website of Neil Cohn



Monday, January 25, 2010

Action Stars and Smoke-veiled fights

I've posted every now and again about a convention in comics that I've called "action stars", where a whole panel is replaced by a star-shaped "flash" that essentially represents "event happens here!" but doesn't show that event. I've likened this to a pronoun in the visual grammar, since it can replace the Peak events of the sequence, just like a pronoun can replace a noun (or noun phrase).

Over the past year I've run some successful experiments using action stars, and am planning a few more of them. But, I've also had the lingering question of whether there are any more of these "visual pronouns" out there...

And I think I've found one.

Another common piece of visual morphology is the "smoke-veiled fight" (alternative names welcomed), where a big puff of smoke is shown with arms and legs sticking out, to stand for a fight occurring, which can also take up a whole panel:



Some interesting contrasts can be made between the smoke-veiled fight (SVF) and action stars. First off, SVF panels are much more restricted; they can only appear for fights, whereas action stars can go on almost any Peak panel. We might write out this difference in meaning formally as:

Action Stars: [Event: X]
SVF: [Event: FIGHT(A,B,...n)]

This basically says that an action star carries the unspecified meaning that an Event "X" occurs, but SVF panels show an Event of "Fighting", consisting of at least characters A and B up to "n".

Also of interest is that while both depict "events", the nature of those events is intrinsically different. Action stars show a single event, while SVF panels show a duration. Notice, you can't glean the sense of duration from an action star, nor can you interpret the SVF as a single event. The difference is there, even though in neither one can you see what events are "actually" happening!


Tuesday, January 12, 2010

5 Card Nancy and Panel Transitions

One of Scott McCloud's wackier inventions is the game Five Card Nancy, which is based on the old comic strip Nancy. The basic premise of the game is that you can create lots of different (and fun) novel strips by combining random panels together. Scott recently posted an old collage he did that led to the game.

Of immediate note in his collage is that the sequence doesn't exactly make much sense, despite some cohesion between the panels. I'd say that it may have a narrative structure (i.e. visual grammar), but no meaning (semantics).

In some cases though, the juxtaposed panels do make sense, but the global meaning does not. In linguistics (borrowed from math), we'd call this a "first-order Markov chain", since only the units right next to each other have a connection. If each panel were connected to the two panels before it, it'd become a "second-order chain", etc...
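To make the first-order idea concrete, here is a minimal Python sketch. The panel labels and the `LINKS` set of acceptable adjacent pairs are invented for illustration (they are not taken from McCloud's collage), and requiring a link to each of the previous `order` panels is only a crude stand-in for a true higher-order chain. The point is just that a check over immediate neighbours can pass while a slightly wider check fails.

```python
# Hypothetical panels, each reduced to a label. LINKS lists the pairs a
# reader would accept as connected (made up for this sketch).
LINKS = {
    ("Nancy wakes", "Nancy yawns"),
    ("Nancy yawns", "Sluggo yawns"),
    ("Sluggo yawns", "Sluggo fishes"),
}


def locally_coherent(panels, order=1):
    """Check every panel against the `order` panels right before it.

    order=1 mimics a first-order chain: only immediate neighbours count.
    """
    for i in range(1, len(panels)):
        for back in range(1, order + 1):
            if i - back < 0:
                continue  # not enough earlier panels to compare with
            if (panels[i - back], panels[i]) not in LINKS:
                return False
    return True


strip = ["Nancy wakes", "Nancy yawns", "Sluggo yawns", "Sluggo fishes"]
first_order_ok = locally_coherent(strip, order=1)
second_order_ok = locally_coherent(strip, order=2)
print(first_order_ok, second_order_ok)
```

Every neighbouring pair in this toy strip is linked, so the first-order check passes, yet nothing connects Nancy waking up to Sluggo yawning two panels later.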

Markov chains were the primary way that people thought about language's grammar up until the 1950s, when Noam Chomsky showed that grammar needed to account for connections farther than just countable individual word relationships (an approach I then applied to comics' sequences).

Essentially, McCloud's theory of panel relationships is a first-order Markov chain theory. It only looks at juxtaposed relationships. Interestingly, his Five Card Nancy game follows the same characteristic. Since players put down one panel at a time, it appears as though they are just making choices linearly. However, I'm guessing that the higher scoring combos are all ones that gel on a global scale, not just a local connection.

Also, the limitation of the panel transition viewpoint is really highlighted by McCloud's Nancy collage. How can panel transitions be correct if only local connections make sense but ones further down the sequence do not? Though we may draw and read comics one panel at a time, it doesn't mean we don't build or project a bigger structure in our minds beyond the linear relations.


Monday, January 04, 2010

Storycards and visual grammar

My friend Alex sends along this link to a gift pack of "storycards". Basically, you can use these cards in sequences to create lots of different novel stories. The idea is similar to McCloud's Five Card Nancy game.

I'm interested in it for a few theoretical reasons. For example, having a stock set of units that can be combined in different ways is similar to language, where you have a set of words (vocabulary) that are combined in various ways (grammar). As the main thrust of linguistics in the last 50 years has told us, infinite possible sentences can be made with just a small set of vocabulary items, and that's basically the fun of such a card set!

However, even more interesting, it's very similar to a study I finished running a few years ago and am still working on getting written up. In it, my participants were given four panels from a Peanuts strip and asked to arrange them in an order that makes sense.

People were very good at getting the original order of the strip (around 90% if I remember correctly), though that's not what I was interested in. I was more interested in the errors that people made, and whether there were patterns to them. And, indeed, there were. Prior to testing, we had coded the panels for numerous narrative properties, and found that certain narrative categories got moved around in particular patterned ways.

What this showed was that people don't just make up sequences one panel at a time as this game suggests, but that elements of that order are conditioned by roles of panels. These roles are determined both by properties of individual panels, and the relations between images.

So, one-by-one reading/drawing, but guided by underlying complexity (grammar) beyond just linear relationships.


Sunday, November 15, 2009

Transition Overload!

I've frequently heard it said that every panel in a comic has to connect to every other panel. I've tried to go about showing the problems with individual transitions or McCloud's closure, but I have yet to tap into this issue on the blog.

Potentially, this could be at least somewhat the notion behind Groensteen's ideas of braiding and arthrology. "Restrained arthrology" says there are meaningful connections made between all juxtaposed panel relationships (i.e. what McCloud would call panel transitions), while "general arthrology" pushes this up to possible connections between all panels in a book ("braiding").

In my book, I toyed with a similar idea of multi-connected transitions for very specific examples, but cast it aside before proposing my alternative approach based on Chomsky's generative grammar. However, the "every panel with every other" viewpoints are far more unconstrained than my approach ever was.

One of the biggest problems with this "every panel with every other" as a theory of comprehension is that it would just overwhelm a person's working memory to keep that many things active in their head with no guiding structures. So, I figured it would be worth the exercise of showing how ridiculous such an assertion might be...

For an average book with 6 panels per page over 24 pages, this gives 144 panels. The number of connections between any two of those 144 panels is 144!/(2!•142!) = 10,296. A reader would build up to these 10,296 possible transitions additively, since each newly read panel would have to be connected to every panel before it while the mind retained them all in memory. Granted, not all panel relationships might need to establish an explicit "transition", but all connections would be necessary to at least confirm or deny the need for an explicit transition.
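The arithmetic can be checked with a few lines of Python. This is just a sketch of the counting argument above; the variable names are my own, and the loop shows the "additive" build-up, where the k-th panel read adds k - 1 new connections to the panels already held in memory.

```python
from math import comb

panels_per_page = 6
pages = 24
total_panels = panels_per_page * pages  # 144 panels in the book

# Unordered pairs of distinct panels: 144! / (2! * 142!)
pairwise_links = comb(total_panels, 2)

# Additive build-up: the k-th panel read must be related to the k - 1
# panels that came before it.
running = 0
for k in range(1, total_panels + 1):
    running += k - 1

print(total_panels, pairwise_links, running)
```

Both routes arrive at the same total, which grows quadratically with the length of the book.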

Without any underlying structure to guide such connections, this would be overwhelming for human memory to handle. Rather, there needs to be something explicit provided by the mind to manage (and group/subdivide) such connections, just like a grammar for language. Transitions and general principles of "arthrology" just won't do it.


Thursday, October 01, 2009

Panels connected by sequentiality

Derik has a short post that makes a nice note about how understanding of individual panels is sometimes conditioned by their context in a sequence.

I think this is a very important point that is well illustrated by his example. Sometimes, understanding of the elements in an individual panel relies on the information in other panels.

Almost all cultures and individuals have little trouble decoding most propositional information in images (i.e. that an image of a horse means "a horse", or that an image of a person is "a person", etc). However, certain individuals may have trouble comprehending the objects if their meaning is conditioned by a sequence. For example, this sort of meaning by context is often what children under four and other "non-visual language fluent" readers (or those fluent in a different type of system) struggle with.

Why is this important/interesting?

1) It lends validity to the idea that there is a fluency required for sequential image comprehension (and thus that there is a "system" guiding understanding to be fluent in).

2) It implies that even perceptual understanding (i.e. vision and object recognition) might rely on sequential understanding in these contexts, meaning that mere perception alone isn't enough to explain sequential image comprehension (i.e. again, a system for sequential images is necessary).

3) It hints that these sequential images were created to be in sequence and not just as random images strewn together. This is also a support against an image-to-image system of understanding like panel transitions, since transitions could function no matter what is thrown next to each other. This sort of execution has a more global scope: it's a whole sequence made to be a whole sequence, not just one after another.


Tuesday, August 11, 2009

Memory, Experience, and Comics Comprehension

In my last post, I discussed some traits of this quote by Chris Ware, found in this blog post:

“I don't like to think of my work as 'cinematic.' A movie is passive -- you're watching it, taking it in. Where a comic strip, it's completely active: you have to read it, search it for meaning, for the connection with your entire experience and your memory. Yes, you do have the illusion of watching something happen in a comic strip -- but if it's done well, it comes alive on the page like a novel. A novel is the most interactive thing ever created.”


The other thing I find interesting about this quote is that I have a hard time believing that people "imagine" things while reading comics that connect with their "entire experience" and "memory." There are two things that this quote implies:

1) That people are converting their reading experience into consciously clear interpretations (imagery, sounds, etc) while reading a comic (a notion that echoes McCloud's Closure).

2) That people's creation of meaning is entirely based on experience ("Empiricism").

Concerning the first point, I know when I read a comic, I don't necessarily feel like I "fill in" any missing imagery with mental imagery of my own. I don't visualize anything that isn't in the pages. I do understand it, and make the mental connections between and across images/words, but there is no additional imagery added. Novels do create this imagery (for some but not all people) because it isn't provided already.

This blog post has replied to my earlier posting expressing that Ware's meaning of "active" comprehension relates to this sort of filling in of sensory information that's missing. Again, I am hesitant to accept that people are actually imagining sounds, smells, motion, etc. while reading a comic.

Novels certainly allow people to create visual imagery — but vision is our primary sensory modality, so I find it unsurprising that this would happen. I'm less confident about the other senses.

SO....If you actually do feel like you create mental imagery while reading comics, I want to hear about it in the comments please!

On the second point, there is quite a lot of evidence that our understanding of meaning does not necessarily come from experience (and certainly not conscious experience). That's not to say all of it is innate, but there's a give and take between innate meaning and acquired meaning — the debate is over the percentages.

What I'd be more confident stating though is that when reading a comic, I doubt people are actively referencing overt memories or experiences in order to comprehend a sequence. Rather, they are drawing upon their abstract concepts — just like when they read a book, or yes, see a movie.


Sunday, August 02, 2009

How active is comic comprehension versus film?

Dash Shaw writes an interesting post delving into the "cinematic" nature of comics that explores thoughts from authors like Chris Ware with many insightful quotes.

Relevant to some of this discussion might be that some believe comics to have predated the film techniques. Or, the idea that this is a competence versus performance issue — that film uses the same mental structures as comics, just with a different presentation (this will be a topic of an upcoming post).

Indeed, in several experiments of mine I show comic panels one after another, one at a time where the participants have no control over the pacing. My participants have no difficulty understanding these or accepting them "as comics" (no one has ever questioned the labeling).

Most interesting though is this quote of Ware's from the post:
“I don't like to think of my work as 'cinematic.' A movie is passive -- you're watching it, taking it in. Where a comic strip, it's completely active: you have to read it, search it for meaning, for the connection with your entire experience and your memory. Yes, you do have the illusion of watching something happen in a comic strip -- but if it's done well, it comes alive on the page like a novel. A novel is the most interactive thing ever created.”

I have a lot of responses to this quote, but I'll save some for a later post. Right now, I want to question what "watching it, taking it in" means with regard to film comprehension that's different than the "active" comprehension of comics. This is a common thread in comparisons, so I wonder whether Ware (and many others who also do it) is conflating the presentation of a comic/film with its comprehension.

Is the sense that film is "less active" because its pace of viewing is not controlled by the viewer? This to me seems like a trivial thing in terms of comprehension. The process of understanding (i.e. piecing together the meaning between images, words, and/or sound) should remain roughly the same.

If comprehension were different, we would expect grossly different results if we presented the same comic strips in different ways in an experiment (that could use any number of measures of comprehension). Let's say we had three different methods:

1) a comic page where all panels were laid out in a grid, possibly controlled so that subsequent panels only appear when a button is pressed by the reader ("self paced reading")
2) a "self-paced reading" task where only one panel is on a screen at a time
3) a presentation with no participant control, where only one panel appears on a screen at a time for a designated amount of time

Now, I would expect no significant difference in the ability of people to comprehend these different scenarios. This is all about presentation, not the content of the strips, since those could stay the same across all of these (and other) presentation methods.

#3 on this list is essentially the same type of presentation that film uses. Granted, I will wholeheartedly agree, film's use of *moving* images certainly does change comprehension. However, there still have to be meaningful connections between and across film shots (be it live-action or animated). These would be the same "active" sort of connections that Ware describes. Indeed, you can substitute film shots for panels in the above three options and probably get the same sort of comprehension as you would for static comics. So, instead of issues of presentation, the focus of questions should instead be on issues of comprehension, like:

How does the comprehension of static versus moving sequential images differ?

How do moving images within a unit (shot vs. panel) change its comprehension?

How does the use of moving across a scene (as in panning, zooming, etc.) differ in comprehension from its static presentation in panels (or shots)?


AND... we can't really answer these questions without an adequate theory of how comprehension of sequential images works in the first place, which is essentially what my research for the past several years has been focused on.


Sunday, January 04, 2009

The interface of humor and narrative

This Zippy the Pinhead strip from a week or so ago cuts to the core of the canonical joke pattern in comic strips:



What's interesting to me about this is the choice of descriptors here. "Conflict" and "punchline" here have to do with joke telling, and could correspond to varying parts of an actual narrative arc.

For example, narrative often features a denouement at its end preceded by the Peak of actions. The Punchline could go into both Peak or denouement. In one case, the Punchline would be the apex of the actions, what the strip has led up to. In the case of a denouement, the Punchline would be a reaction to or resolution of that Peak.

So, what we actually have is two separate "schema" for narrative and for jokes:

Jokes: Intro-Set up–Conflict–Punchline

Narrative: Establisher-Initiation-Peak-Release

You could imagine these running parallel to each other, and then different parts hooking up into each other in varying ways. How exactly these schema interface depends on the desired pacing of the joke I suppose.


Monday, November 10, 2008

Action!

In discussing this post of mine with Derik, I realized that I should post on the technique I used of substituting a whole panel for an "action star," like this:



This usage is somewhat similar to what I talk about in this older article on metonymy, and the same phenomenon crops up overtly in McCloud's famous "Closure" example from Understanding Comics:



In both cases, we never see the action, because it's replaced by a panel that implies action took place, but replaces the image with some neutral information. In the case of the action star though, we associate that sign with events, so it further indicates the presence of an action, whereas in McCloud's example the text does most of this work, since the cityscape is entirely neutral.

I've recently been exploring this phenomenon a lot more, especially since I keep seeing it in Peanuts strips, first shown (I believe) in this one:



So, we now have this phenomenon where we know we can substitute certain types of panels for others to get an entailment of the actions. For storytelling, this is pretty cool, since it forces the reader to draw an inference about the actions (a result interesting enough that McCloud extended this out to all interactions between panels).

However, are there also restrictions on which types of panels we can replace? Since the action star essentially just means "events occurring!" but doesn't show them, it can be considered as a kind of "visual pronoun." Because of this, it can also be used as a diagnostic for determining certain categories of panels versus others. This "pro-form" replacement is a common technique in linguistics for determining grammatical categories: we can replace Noun Phrases with the pronoun "it", and Prepositional Phrases with "there":

1. Martin pushed the really huge boulder up a massive hill.
2. Martin pushed [it] up a massive hill.
3. Martin pushed the really huge boulder [there].
4. * Martin pushed the really huge boulder [it].
5. *Martin pushed [there] up a massive hill.

In 2 and 3, we can see that this substitution works fine, but when we reverse which ones we're substituting for, in 4 and 5, it sounds awful (indicated by the asterisks).
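The diagnostic can be mimicked in a few lines of Python. This is only a toy sketch: the chunking of the sentence, the category tags, and the `substitute` helper are my own assumptions for illustration, and real acceptability judgments depend on far more than a category match.

```python
# Toy parse of sentence 1: each chunk is tagged with a grammatical category.
SENTENCE = [
    ("Martin", "NP"),
    ("pushed", "V"),
    ("the really huge boulder", "NP"),
    ("up a massive hill", "PP"),
]

# Which category each pro-form can stand in for.
PRO_FORMS = {"it": "NP", "there": "PP"}


def substitute(sentence, index, pro_form):
    """Swap chunk `index` for a pro-form; return (string, acceptable)."""
    _, category = sentence[index]
    words = [text for text, _ in sentence]
    words[index] = f"[{pro_form}]"
    acceptable = PRO_FORMS[pro_form] == category
    mark = "" if acceptable else "* "  # asterisk marks a bad substitution
    return mark + " ".join(words) + ".", acceptable


examples = [
    substitute(SENTENCE, 2, "it"),
    substitute(SENTENCE, 3, "there"),
    substitute(SENTENCE, 3, "it"),
    substitute(SENTENCE, 2, "there"),
]
for text, _ in examples:
    print(text)
```

The four calls reproduce sentences 2 through 5: the pro-form is accepted only when its category matches the chunk it replaces.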

So... can we do this using an "action star" as a kind of visual "pronoun"? Check out these Peanuts strips, where the action star replaces either the second or third panels:

1. *
2. *
3.
4.

When the second panel is replaced, it does not seem to make much sense (nor would it make much sense in the first or final positions here either). However, substituting it for the third panels does work, hinting that those panels belong to a certain category where a culmination of an action occurs (even when that action isn't an "impact").

Note also that an approach using linear "transitions" between panels would be unable to express this: what would the action star be a transition of? A "Non-sequitur"? That wouldn't be able to capture the understanding of the event occurring in that panel. Rather, this hints that transitions (based on the relations between panels) are not the way sequences are understood, and that some sort of global narrative structure (with categories for actual panels) must underlie the sequence.


Sunday, November 02, 2008

Continuity across panels

Derik posts a quote from this article on Narration in Comics that discusses cognitive schema and comics. I'd read the article a while ago, but seized on this part of the quote:
"An extrinsic norm crucial to comics is the interpretation of a figure reappearing in several panels as one and the same figure shown at different moments in time (usually in chronological order)… Usually it is assumed that the event represented in the second panel happens after the event represented in the first one…"

This constraint is no doubt what led Saraceni to posit a principle of sequential images that weighs "new" versus "given" information across a sequence. It's also a type of constraint placed by Gestalt organization: Continuity.

However, what struck me on this reading of that quote, is that this schema is exactly the sort of thing that people who lack knowledge of the visual grammar (or who have a competing grammar) have trouble with.

For instance, kids below four years old seem to have no ability to make coherent sense of connecting juxtaposed panels — they can recognize the meaningful content of the things in each panel, but they can't seem to connect them as part of a narrative sequence. (They also seem unable to recognize any representations in the images that are predicated on understanding the causation between panels).

A comparable thing happens with the native Australians who use sand narratives. They draw their narratives unfolding in the same space over time, and when presented with juxtaposed panels, think that each panel is a new scene. For them, their own system inhibits this recognition of continuity across panels.

What's also striking is that there are tons of examples where this constraint is not upheld immediately — many (most?) sequences don't feature the same characters over and over in panels. This is one of the reasons that a linear approach to sequential images (like "panel transitions") just can't work.

For example, let's say panel 1 shows person A, then panels 2 through 4 show other things, then you're back to person A at panel 5. You can't just integrate 4 and 5, because you would have had to lose track of person A through 3 panels. Rather, you have to keep them and their actions in mind somehow. Transitions can't capture this relationship.
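As a rough illustration of that five-panel example, here is a Python sketch. The character sets are invented, and the "last seen" bookkeeping is only a simple stand-in for keeping a character in mind; it is not meant as the narrative structure itself. It just shows that comparing neighbours alone never relates the first and fifth panels.

```python
# Characters shown in each of five hypothetical panels (index 0 = panel 1).
panels = [{"A"}, {"B"}, {"C"}, {"B", "C"}, {"A"}]


def adjacent_links(panels):
    """Pairs (i, i + 1) of neighbouring panels that share a character."""
    return [
        (i, i + 1)
        for i in range(len(panels) - 1)
        if panels[i] & panels[i + 1]
    ]


def last_seen_links(panels):
    """Link each panel to the most recent earlier panel with the same character."""
    last_seen = {}
    links = []
    for i, characters in enumerate(panels):
        for name in sorted(characters):
            if name in last_seen:
                links.append((last_seen[name], i, name))
            last_seen[name] = i
    return links


local = adjacent_links(panels)
tracked = last_seen_links(panels)
print(local)
print(tracked)
```

Only the tracked version recovers the link from the first panel to the fifth for person A, and it does so by holding information across the intervening panels.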

There has to be a way of upholding this constraint of continuity across longer distances — which thus requires a bigger system than linear sequence alone provides.


Thursday, June 19, 2008

Garfield experimentalism

Apparently we're upon the 30th anniversary of Jim Davis' Garfield strip. As a ten year old I was pretty obsessed with the Garfield books, and can probably mark meeting Jim Davis at the ABA as a highlight of my fourth grade life.

Perhaps unsurprisingly, I've gotten about seven emails from people linking to the Garfield Minus Garfield strips, which I first saw a few years ago even. I was always partial to the Is Garfield Dead? premise, though Nothing Garfield strips are interesting too (though Barfield does give me a good chuckle).

More theory related, the Garfield Generator is a great example of a few points of my research. It shows that there is an overarching coherent structure built into the whole strip (at least sometimes in this case), even when the immediate linear relationships don't make much sense. This is somewhat similar to the famous Chomsky sentence "Colorless green ideas sleep furiously", and I'm actually basing my next big experiment on 6-panel-long Peanuts strips of this same nature.

In some cases with the generator though, you can easily tell that the position of the panel is somewhere it doesn't belong. The thematic role of the panel belies its canonical positioning.

Anyhow, Happy Birthday Garfield, and thanks for the early influences on my comics obsessions!


Monday, June 02, 2008

Comics as a Binary Language

Laraudogoitia, Jon Pérez. 2008. The Comic as a Binary Language: An Hypothesis on Comic Structure. Journal of Quantitative Linguistics 15 (2):111-135.

This study examines the structure of comics by converting the contents of panels into binary code. Across a broad number of European comics, a panel holding the protagonist of a story ("lead character") is given a "+" while a panel without is given a "-". The author then uses a series of computations to examine the regularity of sequences where the protagonist does or does not appear, or if there is constancy to the amount that they appear throughout a book.

The results show that there is a quasi-regularity to sequences that feature the protagonist or don't feature the protagonist. That is, there are "runs" of sequences with protagonists, then runs without.
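To show what such a coding looks like in practice, here is a small Python sketch. The twelve-panel string is invented, and grouping it into runs is only my guess at the simplest version of the idea; the paper's actual computations are more involved and are not reproduced here.

```python
from itertools import groupby


def runs(coding):
    """Split a +/- string into (symbol, length) runs."""
    return [(symbol, len(list(group))) for symbol, group in groupby(coding)]


# Invented coding of 12 panels: "+" = lead character present, "-" = absent.
coding = "+++--++++---"
result = runs(coding)
print(result)
```

Note that the coding only records surface presence or absence in linear order, so any grouping of panels into larger units is invisible to it.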

While interesting for coming up with a positive result, and very creative for applying computational methods to comics (something I don't think has otherwise been done), I find numerous problems with this paper.

First, why should we assume that Protagonist vs. Non-protagonist is a meaningful binary juxtaposition? In some ways it reflects my distinction between Active and Inactive (or Passive) entities in a panel (originally based on Natsume's distinction of "positive" vs. "negative" entities). However, my breakdown is superficially "things that move across panels" versus "things that don't." Protagonists could fall into either one of those categories given the appropriate sequence.

But... what if there is more than one protagonist? What if a scene shift happens where a new character becomes the lead character — this would just be coded as a consistent "-"?

Mostly though, I am unsure of what is interesting about these results. The visual language in comics features consistent "runs" of protagonist or non-protagonist panels: so what?

The analysis throughout focuses only on linear sequences based largely on Markovian chains, but I think my work has strived to show that sequences of images cannot simply be considered linear sequences. They have hierarchic structures guiding them — which such a binary analysis of the surface elements would be unable to show.

This study is an interesting first attempt at using computational methods to analyze visual language structure, and I love that the research has now begun permeating to such extents. Hopefully further studies will bring more interesting results.


Sunday, May 25, 2008

Random!... panel sequences that is

As long as we're on the topic of comics that people clip out for me, here's another one that my advisor passed along. For some reason, he's rather partial to Zippy the Pinhead (I think because of the philosophy jokes), and this one caught his eye. Particularly this first panel over to the side.

Zippy it seems comes from the Non-sequitur school of panel transitions (if you're into that sort of thing).

What makes this fun for me is that my next experiment is actually going to use various scrambled strips to help illustrate the differences in processing between those and normal strips (plus some other more complex strip types).

Not much is out there about this sort of research, but one study did show that people's comprehension of sequential "picture stories" (Mercer Mayer stories) correlated with their comprehension for text. Skilled readers showed a drop in recollection for scrambled compared to regular sequences. However, unskilled readers showed no comprehension differences at all.

I'm a bit dubious that fluency in visual language is comparable to general comprehension skills (they used no measure for graphic fluency), but this study at least showed some support for a domain general capacity.


Friday, April 25, 2008

Podcast: "Grammar" in visual language

I've done another podcast with the folks at VizThink, this time debating Yuri Engelhardt and Dave Gray on what constitutes a visual language and the nature of visual language grammar.

This new format allows you to skip around to different chapters to jump straight to parts of interest. (Please note, I object to the insinuation in the chapter title that "comics" can equal "visual language"):






I think that there is something I strived to point out throughout the discussion that I didn't articulate well enough, but to explain it I'll have to do a mini-linguistics lesson.

In the podcast, Yuri pointed out the view that language has two main parts: a set of units (lexicon) and a set of combinatorial rules (grammar). This view of two components is essentially Chomsky's view of grammar, and organizationally looks something like the diagram to the side.



In this traditional view, syntax/grammar is the component that offshoots meaning, and only syntax has properties for combining elements together. I said that I agreed with this notion, but really I don't. When I mentioned that I subscribe to a view from Chomsky's student, Ray Jackendoff (my teacher), I should perhaps have elaborated on the differences between those perceptions more, because they are extremely important and can resolve some of the conflict of the debate.

Jackendoff's view of grammar is different. This "Parallel Architecture" says that the mind has three main interfacing components: modality (auditory/manual/graphic), syntax, and conceptual structure (meaning). The "lexicon" is distributed across the interfaces between all three of these structures; it doesn't have its own "place." And, importantly, each of these structures has that capacity for infinite combinations, not just syntax. (Note the similarities to my listing of properties of Language). This would look like this:



Much of our debate focused around whether or not single images (diagrams) have "grammar." My objection is that it does not function like "syntax" does in a verbal grammar, though I acknowledge that there might be a hierarchy or a combinatorial system there. If you subscribe to the Chomskyan view of grammar, you're forced to say that the combinatorial element "is syntax," which is exactly what Yuri is doing:



If you follow the Parallel Architecture (as I do), syntax is not the only element that creates hierarchies. They all do. So, combinations within a single image or diagram are "grammar" insofar as phonology is the "grammar" of sound. Essentially, Yuri's "visual grammar" is the combination system within the graphic structures, which is why I kept prodding about the difference between it and just the system of perception (and why most of its "constraints" are based on iconicity). This instead looks like this:



In contrast, my grammar for visual language needs a combinatorial system for individual images and for combining them together, looking like this:



To the extent that the narrative structure takes concepts and a modality and orders them coherently, it functions the same as syntax in verbal language. This is "visual language grammar" analogous to the way that syntax is verbal language grammar (nouns and verbs). But, all three structures have combinatorial properties. They don't all make reasonable analogies to saying that they are like "grammar" in the syntactic sense, but they may be combinatorial.

(This is also why you can say that "gestures are to sign language what individual images are to visual language" in the context of sequential images, but not for individual images. There is no developmental/fluency gap like this for "... visual objects are to individual images". I.e., people don't learn how to draw simple graphic signs and yet remain unable to put them into a diagrammatic arrangement.)

Making this shift in perception buys you a lot: It explains why single images may have hierarchy (like perception/phonology), but don't have grammar (like syntax). It addresses why most of that structure is guided by iconic and indexical constraints. And, it also may give you a leg up in describing combinatorial aspects of images beyond diagrams (which occur within panels).

Finally, it is worth noting that not just aspects of language have consistent patterned units that appear hierarchic in structure within our cognitive system. This also appears in music, event structure, vision, social structure, and a myriad of other domains (discussed well here). But, we don't have to call them "languages" because of this broad similarity.

Suggested reading:
Foundations of Language and
Language, Consciousness, Culture, both by Ray Jackendoff


Wednesday, April 02, 2008

Time and The Torch

On this page I found another great example of a page by Jae Lee that defies the "temporal mapping" idea that successive panels are successive moments:

I'm unaware of the full context of the page, but the Human Torch is flying around some big monster of sorts and creates the number "4" (for Fantastic Four no doubt) in his path. Doing so, his path begins by violating a constraint of page layout, entering at the bottom of the page, and then flies over his own path, which crosses a panel he's already been in.

I'm not sure I agree with the analysis given on that blog, mainly because I think appealing to McCloud's transitions and closure only hurts his otherwise fairly good discussion.

Now, I don't want to suggest here that there is not time being shown here, but I think that there are two considerations that need to be reoriented.

First, let's not talk about "time," let's talk about "events." To the human mind time is only an extrapolation of events. Thinking in terms of a clicking-clock type of absolutist Time is not on the same level with the understanding of time constructed in a person's head. From understanding events, we can tell that time passes, not the other way around.

Second, panels do not necessarily have to equal moments. Rather, panels function as "attention units" grouping important information into meaningful chunks. These chunks don't have to be moments, but they do highlight relevant information in ways that the author intends.

This is exactly the case in this example. The interesting thing is that the flow of events runs counter to the standard reading path of panels in order to create the "4" emblem. If reading left-to-right as if these were independent moments, this would make no sense whatsoever. But, because this display uses image constancy (breaking up a single image into parts... what I'd call a Divisional panel, the understanding of which is what Gestalt psychology would call Closure), the panels only serve to divide up the conceptual space of the image to highlight the Torch at different positions within the space.

Yes, the countering of events vs. panels is a bit funky, but it's also a creative use of playing the two off each other to reveal their functions.


Note: For those more interested in these types of examples about Time, most of these ideas are written about more extensively in my paper Time Frames... Or Not. Attention Units are discussed more in A Visual Lexicon.


Tuesday, April 01, 2008

Some links and whatnots

Steven Seagle has a decent piece up at the First Second blog about visual storytelling. He nicely taps into a simplified version of some of the same things that I've been pushing for my theory of visual grammar. The exercise he uses to rearrange panels is very reminiscent of linguistics methods, and is also a good one that shows how a broader structure exists above and beyond the so-called 'transitions' between panels.

Along these lines, Matt Madden and Jessica Abel will soon have a "how to" book coming out about comics. I've been hearing that it's somewhat theory oriented, so the book should be an interesting read. So, keep an eye out in the coming months.

Finally, keep an eye on this very site in the next week. I will finally (finally!) be posting the results of my experiment about comic page layouts, and I'm shooting for next Monday. This one has been a long time coming: I first ran the experiment almost 4 years ago and have been working on the paper since last spring! The project tested whether or not people read comic page layouts using the "left-to-right and down" path like text. A preview: the answer is "not really."

Finally, last month was my most trafficked month ever, so, thanks to everyone that's been reading my site lately!


Monday, March 17, 2008

Closure's assumptions

Patric continues his defining of "comics" with a discussion of "closure." I've talked before about the problems with the idea of closure, but it strikes me that there are a few underlying issues that people run into when addressing these issues:

1. They assume that time passes between panels, despite there being no evidence that each panel represents a "moment in time." With this assumption in place, it forces people to assume that some "moment" also lies between the panels, when no hidden moment may exist. I wrote my essay "Time Frames...Or Not" about various reasons why this assumption isn't true.

Even McCloud bungles this. While in one place he tries to say that "panels=moments" because "time=space", in his own transitions he includes three that have nothing to do with time at all! (Subject, Aspect, and Non-Sequitur transitions). For the adamant, what are the moments and what are the transitions in this "comic"?

2. People are just looking at the relationships of two juxtaposed panels. Most stabs at sequential meaning, like Patric's or Derik's, have just talked about two-panel pairs. But, rarely are sequences confined to two panels.

Just because we experience reading sequences of images linearly doesn't mean that is how we understand them. In most cases, we can easily acknowledge that whole sequences mean something beyond just paired panels. Looking beyond the scope of immediate panel relations quickly forces a rethinking of the accuracy of a view about closure/transitions.

Here are a few illustrative exercises that people can do to think more about these issues (and are things I did when first getting into this seriously):

1. Actually try to catalogue the "transitions" in a full comic à la McCloud's counting. Note any problems in the categories and where descriptions become more difficult.

2. Take comic pages/strips and sketch out the different relationships of every panel to each other. Which panels need connections, which don't? What do the relationships tell you?

If anyone actually does this, I'd love to hear about their results. In the meantime, if people are curious about my alternatives to closure, I recommend watching this video.


Tuesday, February 26, 2008

Podcast: The Functions of Panels

The last podcast I did with the VizThink folks was so fun I decided to do another. This one is about the various functional roles that panels play in the visual language used in comics. Among the topics I hit are:

• focusing information within panels
• navigating page layouts
• visual "storytelling"
• text-image relationships

It's a slightly pared down and also expanded (at the same time!) version of the talk I gave at the VizThink conference. Enjoy!


Monday, January 21, 2008

Connectedness in comics

Weber, Heinz J. 1989. Elements of Text-Based and Image-Based Connectedness in Comic Stories, and Some Analogies to Cinema and Written Text. Paper read at Text and Connectedness: Proceedings of the Conference on Connexity and Coherence, July 16-21 1984, at Urbino, Italy.

Weber attempts to create a textual model of comic communication drawing on cinema studies and text research (similar to the intents of Saraceni and others).

He describes three sections: graphic, cinematographic, and textual, as well as the intersections between them. He also postulates several degrees of “connectedness”: conformity, sequential and integrated connexity, cohesion, and coherence.

Conformity deals with the arrangement of panels (conventionalized formats), while connexity can relate to the internal relationships of elements within panels, layout issues, divisional panels, metonymic panels, the shifting of a balloon's tail to different roots, etc. Cohesion depends on causality between the succession of panels (syntax/semantics). Finally, coherence deals with pragmatic relations between “text external” elements.

Like other papers of this ilk, it provides a broad scale analysis of aspects of comics' structure, yet doesn't delineate them carefully enough to detail the componential role they might play. Instead, it uses overarching "principles" to tie them all together to create a goulash of comic structure.


Tuesday, December 11, 2007

Some Peanuts patterns

It's the end of the semester, and as usual things are crazy. I've finally got my Peanuts experiment up and running, which means people are coming in to participate. A lot of people. The experiment lasts an hour, and between last week and this week, I'll run 24 subjects, which means I'll have lots of data to pore over during winter break.

In the meantime, I also finished coding several strips from Peanuts, and have found several interesting things. For this experiment, I culled 180 strips from the first two Complete Peanuts volumes (kindly donated by Fantagraphics), which were either silent or I altered to become silent. I then coded them all panel-by-panel. That's 720 panels, and yes, it took me all semester.

So, what did I find in my sample? Well, there is some interesting stuff...

Most of what I coded for has to do with narrative structure, or what I would call visual grammar. I'm hoping my redone terminology is transparent enough to follow here.

Out of 180 strips, 140 of them (78%) used "Establishers" to set up information in the first panel. Conversely, 123 of them (68%) finished with a "Release" where the tension of the narrative dissipates. 135 (75%) also use an "Initial" as the second panel, which initiates the actions of the strip. 112 (60%) finish the strip with a "Peak" — the height of narrative tension.

Of 180 strips, 50% (90) use the overall structure of "Establisher-Initial-Peak-Release." The next highest isn't even close, with only 13 strips using the pattern "Establisher-Initial-Initial-Peak."
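For anyone who wants to check the arithmetic, the shares can be recomputed from the raw counts. This is only a minimal Python sketch: the names `TOTAL_STRIPS`, `counts`, and `share` are made up for illustration, and the numbers are simply the counts quoted in this post. Recomputed this way, the figures may differ slightly from the rounded ones in the text (the Peak count, for instance, works out to roughly 62%).

```python
# Counts as quoted in the post; all variable names are illustrative only.
TOTAL_STRIPS = 180
PANELS_PER_STRIP = 4

counts = {
    "Establisher in first panel": 140,
    "Initial in second panel": 135,
    "ends with Release": 123,
    "ends with Peak": 112,
    "Establisher-Initial-Peak-Release": 90,
    "Establisher-Initial-Initial-Peak": 13,
}

def share(count, total=TOTAL_STRIPS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)

# Percentage of the sample for each coded property or pattern.
shares = {label: share(n) for label, n in counts.items()}
total_panels = TOTAL_STRIPS * PANELS_PER_STRIP

for label, pct in shares.items():
    print(f"{label}: {counts[label]}/{TOTAL_STRIPS} = {pct}%")
print(f"panels coded: {total_panels}")
```

Keeping the raw counts next to the percentages makes it easy to re-run the tally if the coding of any strip changes.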

The "E-I-P-R" pattern is what I think of as the canonical narrative arc (which on a macro scale resembles the traditional "narrative arc" of plotlines). All this aligns even more interestingly to coding I did of event structures for each character's actions, but describing all that here might be a little overkill.

Just as a reminder, this is a very specific sample of strips and shouldn't be construed as making any sort of claims overall about Peanuts. Nonetheless, it's fun to see what info the strips alone hold. Now I'm even more excited to see what the results of my study show about people's behavior in relation to these coded predictions.


Sunday, December 02, 2007

New Essay: Japanese Visual Language

For the first time in a long while, I've got a new essay up for download. This one discusses the visual language that underlies manga, and will be part of the Manga: The Essential Reader collection published next year by Continuum Books. Here's the abstract:

Over the past two decades, manga has exploded in readership beyond Japan, and its style has captured the interest of young artists all over. But, what exactly are the properties of this "style" beyond the surface of big eyes and "backward" reading? This paper explores the structural elements of the Japanese Visual Language (JVL) that comprises the "manga style" — ranging from looking at the “big eyes, small mouth” schema as a “standard” dialect, to examining the graphic emblems that form manga’s conventional visual vocabulary. Particular focus will be given to JVL grammar — the system that creates meaning via sequential images — and how it differs from the visual languages from other parts of the world. On the whole, manga provide an excellent forum for understanding the scope of the visual language paradigm.


Monday, November 12, 2007

Capturing vs. Generating Comics

A good friend of mine who works for the company that produces Second Life sends over this link about using the Comic Life software with Second Life screenshots. I've expressed my displeasure with Comic Life before, but I haven't really thought about comic creation of video game clips before.

Something about it rubs me the wrong way... And I think it's the same issue that I have with why "photo comics" don't work, and why only some CGI comics feel comfortable.

The problem is that they don't come from some sort of conceptual basis. They are just capturing events in the (virtual-)world and then displaying them in segmented parts. But, contrary to regular comic sequences, they aren't produced to be sequential.

(This may be the same reason that pin-up/cover artists don't always translate to being good "storytellers": they are used to drawing single images, not sequences. Or: they have good visual vocab, not so good visual grammar.)

The capturing vs. generating sequences makes a huge difference, since in one you are actively setting out to express concepts visually, and the other you're just collecting whatever actions might be given to you. In fact, I'm guessing that the CGI comics that read the best (and there are some good ones) are the ones that were first drawn in thumbnails or layouts. The actual "visual language production" occurs at the thumbnail stage. The rest is all just refinement. These "event capturing" comics bypass the stage where visual grammar is deployed.

Of course, the grammar could be deployed "online" in the process of that CGI comic being created, but I doubt most who do this have much capacity for visual grammar in the first place. They use it thinking that it is an alternative to having graphic fluency, only their non-fluency then shows through in CGI instead of poor drawings.

In many ways this issue is similar to an Internalist vs. Externalist debate in linguistics/philosophy as to where meaning comes from. Traditional philosophy/linguistics (and I think? a commonsense view of meaning?) has held that meaning of sentences is derived from the truth value of how that sentence relates to the "real" world. The Internalist side (including my advisor) says that those meanings only connect to concepts in a person's head, regardless of their truth value to the world.

"Capturing of events" for comics is much like the Externalist viewpoint — sequences of images are depictions of some form of events, and it doesn't matter how they get depicted. The Internalist side would be the opposite: Sequences of images are derived from the conceptual expression of a human mind, and reflect the fluency of that mind.


Wednesday, August 29, 2007

Video: Visual Grammar

The conference proceedings for the Visual and Iconic Languages Conference from a few weeks ago are now posted online. On the site you can download the slides from my talk, and they've also posted full video of the conference proceedings. I've embedded my talk below, and also on my site.

I highly recommend watching this video if you are at all interested in my overall theories or in how sequences of images communicate. I describe what exactly I mean by "visual language" fairly clearly, and why it is different from "comics."

Most of the talk though is essentially a snippet of my developing theory of visual grammar (how sequences of images communicate), including my arguments for why panel transitions don't work and my alternative model. I don't plan to post an essay of this work online for a while (I've been tinkering with it for four years), so this is the best place to find these ideas.

My talk runs for the first 45 minutes of this video, followed by 15 minutes of questions (the second hour is someone else). If you download the slides (large pdf) to flip through at the same time, they might be clearer than on the screen. Enjoy!


Friday, August 17, 2007

The Comic Strip and Film Language

Lacassin, Francis. 1972. The Comic Strip and Film Language. Film Quarterly, 26(1), 11-23.

In this piece translated by David Kunzle, French theorist Francis Lacassin discusses the similarities between the "syntax" of film and comics, noting that they both use "shots" as their base units. For him, this includes things like various degrees of framing (long shot, close up, medium shot), dynamic use of what could be multipanel representations ("panning"), as well as semantic alterations, like subjective viewpoints. (I would argue that this isn't "syntax" at all... but that's a larger post).

He argues that though film and comics emerged around the same time, these techniques came first in comics — not the other way around, as is often argued — and that they may have been autonomous developments not influencing each other at all.

He writes: "It is more reasonable to suppose that comic strip and cinema have both separately drawn the elements of their respective languages from the common stock accumulated in the course of the centuries by the plastic and graphic arts." (14)

To this I would ask: is it really through historical development, or is this just a reflection of the structuring of people's minds/brains?

He hypothesizes also that film and comics both accomplish their sequential meaning by use of the film theory of montage, which for Lacassin appears to cover most things that do not appear similar to real-world perception.

In his own section at the end, Kunzle criticizes Lacassin for claiming comics were invented before cinema, while those framing techniques are cited as being used by authors two generations before that "birth" of comics. Kunzle then discusses the work of 1800s artists Töpffer, Doré, and Busch, noting that they used various techniques like close-ups and polymorphic representation (where one character is repeated in a single frame showing the unfolding of action), among others.


Monday, July 23, 2007

Blasphemy!

As I've mentioned before, I'm currently designing a series of experiments that will use Peanuts strips for stimuli (kindly donated by Fantagraphics). Lately, I've been doing the arduous process of scanning them in and coding them.

Since I'm only trying to look at how sequences of images create meaning, and because the inclusion of text changes things, I've been trying to only use silent strips. Luckily, Peanuts has a lot of those. Additionally, I've also been using strips that have minimal text or could be (gasp!) manipulated to have no text.

I've been taking these strips and deleting the text, then touching up the mouths, etc. so that they work alright silently. It's been quite fun to make sure it all looks like Schulz's style. On the one hand I keep thinking "this is so cool!" and on the other I can't help but think, "Blasphemy!" for defiling the originals.

Most of the time I cut and paste from one panel to another, like putting a frown over an open mouth. In most cases, not much at all is lost in the meaning. It's usually like erasing a word balloon saying "Good Grief!" to just Charlie Brown frowning. The meaning (and humor) stays pretty much the same. I think this is actually a testament to how great Schulz's visual humor was, that I could go in and muck with things, yet the original meanings still come through.

Now... once I start swapping panels around or creating new strips out of panels from a variety of strips, then that might be another story...

(Amusingly, my father actually asked me if there were ethical issues with doing that... my advisor just thought it was pretty amusing. I'd have to think there are fewer ethical issues involved with retouching comic strips than experimenting on animals like the lab next door, but if I start seeing protests out front I'll reconsider).


Monday, June 11, 2007

Review: The Language of Comics by Mario Saraceni

Saraceni, Mario. 2003. The Language of Comics. New York, NY: Routledge.

Saraceni's The Language of Comics is one of the few books that attempts to present a holistic theory of how comics work, and draws upon work from "applied linguistics" no less. The book is actually a stripped down version of his dissertation (as is his article "Relatedness" from the Graphic Novel collection). (And good luck finding the dissertation... I had to print it from microfilm on interlibrary loan).

Unfortunately, much of his approach feels like grafting McCloud's work onto ideas in applied linguistics in a simplistic (and uncited) way. For instance, he proposes a gradation between semiotic types (like symbols and icons), which can well be compared to McCloud’s Big Triangle.

He treats the sequential aspect of panels as equal to sentences, giving them a “discourse theory” type analysis (like the dissertation by Stainbrook). Saraceni claims meaning is created through commonality between elements in panels, alternating with successive new and given information. He also uses "semantic fields" (connected meanings: like how "snow, caroling, pine trees, and presents" invokes "Christmas") to unite panels not encompassed by this information structure.

However, in doing so, he eschews the role of linear sequentiality, yet provides no argument for why people do indeed read in consistent sequences. The result is essentially a watered down version of McCloud's closure, which it is: his dissertation has the theory in full, and it does exactly shoehorn discourse theory onto McCloud's transitions. Some of his insights here are useful and enlightening, yet they deal entirely with "exceptions." He rarely discusses "run of the mill" things like the depictions of events, instead culling his examples from very experimental comics storytelling, like Peter Kuper's The System.

Other chapters cover things like word balloons (perceived as equivalents to direct quotes) and drawings of eyes to buttress a discussion of subjective and objective viewpoints. The final chapter is about computers, which seems out of the blue and has next to nothing to do with comics.

Since it uses applied linguistics, much of the book feels attuned to what might be useful for literary studies. To this extent it might work very well. However, as a theory of "meaning" it falls short, largely because it does not address any type of cognitive system, and lacks even McCloud's precision of surface categorization.

My biggest gripe about the book is that it is presented as an introductory textbook as part of the Intertext textbook series, has no citations outside a "recommended reading" list in the end, and is written with a matter of fact tone that presents it as an authoritative stance on the body of knowledge of this field. The truth is that no such body of knowledge exists at this point for "comics theory." Right now, we're in that period of science where lots of different viewpoints are popping up, just waiting for an encompassing paradigm shift to sweep in and take over.

Even if I were to come out with a book of my full theories, it wouldn't be a de facto textbook because it would be my views drawing upon that body of research. As a result, to those "in the know," the format and style make this book seem misleading in its intents for fronting Saraceni's views as well established scholarship.

To end on a good note: Though I think it fails in not using a cognitive approach, I do like that the book tries to use concepts from linguistics with "comics." It shows that this type of approach is not just intuitive to me, but to others as well, and locating the book in the broader field of linguistics is good for the field as a whole.


Wednesday, May 02, 2007

Coercion... of meaning!

Today I gave my big first year project presentation to the psychology department. From what everyone has said, it went very well. Of course, the project itself is still underway, and I will be running several more subjects in the lab, while probably continuing the study as a whole through the summer. This shot is of me and my advisors from afterwards. (R to L: Ray Jackendoff, Phil Holcomb, Me, Gina Kuperberg):

As I've mentioned before, my talk involved looking at the "Event Related Potentials" or ERPs involved with processing a certain type of linguistic phenomenon called "semantic coercion." ERPs are a measure of the electrical activity of the brain. We don't get a good fix on specific brain areas that are at work, like in fMRI, but we do get very detailed analysis of the time course of events and certain waveforms do seem to indicate types of brain functioning in contrast to each other. We measure this electrical activity by sticking a cap of electrodes on people and feeding the signals into a computer, which then averages out the noise over several subjects and trials to give a smooth wave for time locked events. Here's me in the cap...
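The averaging step can be illustrated with a toy simulation. This is a minimal sketch under made-up assumptions, not the actual analysis pipeline used in the lab: the waveform shape, noise level, trial count, and the function names (`true_wave`, `simulate_trial`, `average_trials`, `rms_error`) are all hypothetical. It only shows why averaging many time-locked trials leaves a wave much closer to the underlying signal than any single noisy trial.

```python
import math
import random

def true_wave(n_samples=100):
    """Hypothetical underlying response: a single smooth bump."""
    return [math.exp(-((t - 40) / 10) ** 2) for t in range(n_samples)]

def simulate_trial(signal, noise_sd, rng):
    """One time-locked trial: the same signal plus independent Gaussian noise."""
    return [s + rng.gauss(0, noise_sd) for s in signal]

def average_trials(trials):
    """Average sample-by-sample across trials."""
    n = len(trials)
    return [sum(samples) / n for samples in zip(*trials)]

def rms_error(wave, signal):
    """Root-mean-square difference between a wave and the true signal."""
    return math.sqrt(sum((w - s) ** 2 for w, s in zip(wave, signal)) / len(signal))

rng = random.Random(0)  # fixed seed so the run is repeatable
signal = true_wave()
trials = [simulate_trial(signal, noise_sd=2.0, rng=rng) for _ in range(200)]
averaged = average_trials(trials)

single_err = rms_error(trials[0], signal)
averaged_err = rms_error(averaged, signal)
print(f"single trial error: {single_err:.2f}")
print(f"averaged error:     {averaged_err:.2f}")
```

Because the noise here is independent from trial to trial, it shrinks roughly with the square root of the number of trials averaged, while the time-locked signal stays put.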

So, I looked at these brain waves for semantic coercion, which involves the extraction of "hidden" meaning from sentences like The chef finished the chicken before the main course. Someone can't literally "finish a chicken," they have to finish doing some action with it, like cooking. Since the event isn't stated outright, it's said to be "coerced" from the combination of the verb "finished" and the direct object "the chicken." Here's a waveform from one of the sites on that cap that I got in the experiment:

While this is interesting as a linguistic phenomenon, I think it's really just a warm up for more comic related studies. Since I couldn't resist, I even opened my talk by showing this strip:

Now, if you look carefully, coercion happens here too. We never see the event of Snoopy catching the ball, yet we know the event happens based on the information provided by the other panels. In addition to other things, coercion is perhaps one of the things that McCloud was trying to get at with his notion of "closure." In many ways, coercion here is an invisible meaning that is created out of the visible components of the graphic sequence. Graphically, it's the stuff that happens "out of view" of the panels. The problem is that McCloud extended this to the (linear) relationships between all image sequences, which just doesn't work.

So, if I do find anything fairly robust in the ERPs for verbal coercion, perhaps a study of visual language coercion could be on the horizon as well? Or perhaps a theoretical paper first...


Thursday, March 22, 2007

You're a good grammatical construction, Charlie Brown

I've recently been thinking quite a lot about how best to start testing my theories of visual language grammar. Since I'm in a psychology program, I've got to actually think of experiments that might yield reliable and significant results (hopefully).

One of the main ideas I had was starting off using a corpus of comic strips, so I wouldn't be biasing the study with my own drawings. I hit on the thought that Peanuts strips would be perfect for this since 1) there are a ton of wordless ones, 2) they're well recognized culturally, 3) they use a fairly simple bare bones structure, and 4) they nearly always have 4 panels.

So, thanks to a very kind donation from Fantagraphics, I am now poring through several volumes of The Complete Peanuts strips in search of all the wordless/minimal text ones I can find (there are a lot!). Hopefully, by summer I should be testing people's intuitions on the grammar of these strips, and eventually looking at their brainwaves while processing them (fun!).

One of the things that has jumped out at me is how so many of the strips use systematic patterns that I haven't noticed before. Previously, I've talked about the visual grammatical pattern of the 'Set up - Beat - Punchline' construction (as coined by Neal VonFlue). This is the pattern that sets up the joke with dialogue, then has a pause panel, then ends with the punchline. Well, Schulz seems to use a few other patterns a lot as well.

The most intriguing to me is one that is almost exactly like the SBP pattern, only the "beat/pause" panel isn't actually a pause: it's an "action" panel (SAP?). Instead of a passive type "rest," the space is filled by some wordless action that sets up the payoff with the final panel punchline. I've only looked at the oldest of the collections (the 1950s) and have only seen a few actual SBP constructions. I'm curious whether or not this SAP pattern preceded/led to the SBP one.

Another pattern has the first three panels as wordless depictions of an event, only to have a final panel with a punchline that explains or comments on the actions. This one happens extremely frequently, and sometimes takes on an additional characteristic of having the first panel depicting an action as well. It starts with an event that sets up the primary event that unfolds in the rest of the panels.

Patterns like this are fun to find, but can also be challenging theoretically. At least as far as developing a model for my visual grammar, sometimes I'm unsure how to notate certain panels, and often debate which notation is more correct. Imagine trying not only to work out how best to describe how nouns and verbs combine, but also whether or not things are nouns and verbs in the first place and/or whether those categories are appropriate at all (when there's good evidence for all).

And, unlike with homework, there is often no answer key that I can check with someone else (except, hopefully, what my experiments will reveal). I've always found this "working without a net" to be a little scary, but at the same time exciting since it portends new and uncharted territory. I suppose it's the feeling of truly doing science instead of just learning it.


Note: As long as I'm giving thanks for donations, I should also mention the kind contributions of TopShelf, Drawn & Quarterly, Top Cow, Oni Press, and Dark Horse Comics. Their generosity will make a huge difference in these visual language studies and is greatly appreciated!! If you are from another company and would like to donate to this cause, please contact me...


Monday, December 18, 2006

Dunkin' pictograms

Speaking of pictographic writing systems that will never be universal, Fabricari sends along this logo from Dunkin' Donuts:



I've been seeing that logo for over a year now, and never quite got that the images were supposed to be a pictographic sentence until a few weeks ago! I had just thought that it was a bunch of pretty pictures. Of course, given that I rarely if ever go to Dunkin' Donuts, I never really put in the effort to try and decode it.

Though, while we're here, I might as well use it as an example as to why pictographs fail as universal systems. Outright, as I mentioned in my last post, the grammar here completely mimics that of English. Of course, that was the intent in this case since it's a slogan, which is why there are four units for four words. But, notice also that this conversion makes a very important decision: it chooses to transcribe "America runs on DD" as a verbal-->visual mapping, rather than siphoning the concept behind the idea into the visual modality to then adapt to its own traits. Moving on...

Let's start with nouns. First off, the DD is only understandable if you know the association to the company. The map of America could be read as "map" or some such, but is even more interesting since it is a metonymy. It uses "America" to mean "the people in America."

The verb "run" nicely shows how you can't visually show an action without also showing an object. It's tough to show "run" without also showing the "runner." Verbal grammar (by virtue of its symbolic nature) likes to divide these pieces into [ACTOR]-[ACTION]. In visual grammar this division doesn't work as well (being iconic, not symbolic), instead becoming [ACTOR:state1]-[ACTOR:state2], where "state2" shows the fruition of the action.

Finally, the preposition "on" isn't visually converted at all. I find this to be particularly telling, since it immediately shows the English context. I imagine also that the makers of the logo struggled with it, since this usage of "on" is not the spatial kind ("on top of") but is tied to the verb.

In fact, the interpretation of "run" as an action here (like "run down the street") is wholly off, since they don't mean that Americans "use their legs to run on top of Dunkin' Donuts." Rather, they are using a construction "run on" (arguably not two units) that means roughly "to be powered by." The "person running" image then becomes a "double rebus" --> first mapping the sound pattern to the image, then the image's literal meaning to its "metaphorical" meaning.

To come back around to my initial statement about mimicking English grammar, this actually can't even do that since the slogan doesn't use concrete elements. A literal reading of this ends up being totally bizarre (bracketed by panel, with lowercase text adding clarifying info):

[The country of AMERICA][uses its legs to RUN][ON top of][DUNKIN' DONUTS]

SO... this example only goes to reinforce how hard it is to "accurately" map verbal expressions to visual signs — both individual signs and grammatical sequences — especially when it involves metonymic and metaphorical expressions (add those semantic aspects to the list then). While English speakers might be able to figure this one out, can anyone possibly imagine this working on a global scale?


Monday, November 27, 2006

Time essay analyzed

In Derik's continuing exploration of panel transitions (Part 1, Part 2, Part 3), he does an interesting job of dissecting my latest essay "Time Frames... Or Not." To keep things localized, I'll make my responses there, but he seems to have done a fairly thorough job of it. Worth perusing.

Also, Blogger has helpfully decided to finally add tags, so I'll be trying to work those into all my past posts in due time.


Wednesday, November 08, 2006

Problems with Transitions

Over at Derik's blog he's been examining McCloud's panel transitions based on influence from film theory (Part 1, Part 2, Part 3, Part 4, Part 5 ...more to come).

While Derik only does it a little bit, the application of film theory to panel transitions isn't altogether new. John Barber essentially grafted McCloud's and my own (old model of) transitions onto Eisenstein's thesis/antithesis/synthesis model in his master's thesis. This was then argued against by Ben Woo in his thesis, which dismisses it more because modern film theory does than through any explicit argument against Barber's thesis. I'm not up on my modern film theory that much, but I believe Eisenstein is fairly passé at this point anyhow.

A few months ago I started noticing how similar Eisenstein's montage was to the cognitive linguistics notion of "Blending." Blending takes two concepts and extracts parts from them to create a new entailment. A classic example is "The surgeon was a butcher": both surgeons and butchers are skilled at cutting flesh/meat, yet when combined together they elicit a meaning that the surgeon was sloppy. This is just like the 1+1=3 idea from montage.

And it certainly does appear across panels. I had a whole section on blending in my paper A Force of Change. Though, I think that the structures governing sequential understanding (i.e. syntax and semantics) are different from this.

Really, Eisenstein's montage and McCloud's closure are kind of like the film/comics equivalent of ether; a magical "mental" substance that doesn't really exist and that glosses over any real substance the mind might actually be contributing. They're like pop-science: a simple easy explanation for a very complex phenomenon. It's just like how Freud and Jung are still thought of by laypeople as being what psychology is about, even though modern thinking has left their theories far behind. In fact, I'd venture to say they're more used by humanities/social sciences these days than psychology or cognitive science.

Of course, I've been railing on the panel transition approach for quite a while now, over the course of several alternative models. And, it's not just the idea of transitions that has problems: it's any approach that only takes into account panels that are immediately adjacent to each other. Any linear approach to the idea of creating meaning in sequential images will ultimately fail.

As I mentioned on one of Derik's posts, the major shift comes in what one is looking at. Instead of looking at panels' immediate surroundings and basing the system around those juxtapositions, we can instead acknowledge that whole sequences mean things (events/actions/situations/ideas). From there, it becomes a matter of identifying what functions different panels play in creating that overall meaning. Just because we read and write panels linearly doesn't mean that's how we understand them.

Nevertheless, it's interesting to watch Derik go through steps in his thinking in relation to what I did. He named it "rethinking transitions" so it'll be fun to see what his rethinking leads to.

Updated 12/1 with additional links to further entries


Tuesday, September 19, 2006

Tim and Time

So, well timed for my new essay, "Time Frames... or Not", on why panels don't equal moments (and time does not equal space), Tim Godek posts this excellent and simple example of a temporal paradox. (I'm not reposting it here because the file size is rather large, so go look yourself!)

Since the three "moments" of the event happen across the same background, a "temporal map" reading would force the foreground figure to be hovering in front of the background, or, the background shifting behind the character. I think that we're forced to reconcile that the person is doing an action that occupies a singular space (i.e. sitting and thinking) while the background does not remain consistent behind that static foreground space. So, either fore/background is shifting or it creates a paradox of temporal progression where we parse the foreground figure in his own "conceptual space" separate from the background.

This also relates back to the "positive/active" versus "negative/passive" elements I talked about in my "A Visual Lexicon" paper. I think what makes this strip work without a "shifting" interpretation is that the Tim character is Active in the sequence compared with the Passive background. So, our focus of attention is held on him rather than on the consistency and oddness of the background.

In either reading, there is something in the content that elicits a "not normal" reading (i.e. moving when sitting vs. static yet background conflict). If anyone finds more of these temporal paradox type examples, definitely send them my way.


Monday, September 04, 2006

New Essay: Time Frames... Or Not

Wow, it's been a really long time since I last posted a new downloadable essay. Well, if you've been anxiously awaiting one, today is your lucky day! I've just posted my latest theoretical offering, "Time Frames... Or Not," where I tackle the assumptions that lead to the (false) belief that successive panels equal moments in time. Here's the full abstract:
The juxtaposition of two images often produces the illusory sense of time passing, as found in the visual language used in modern comic books. While this linear sequence may seem on the surface to present a succession of individual moments, the understanding of graphic narrative is hardly so simple. This paper will explore how the linearity of reading panels and the iconicity of images create various assumptions about the conveyance of meaning across sequential images relation to space and time.

Astute and long-time readers of this blog will remember that I mentioned writing this paper way back in January of this year. It's good to finally get it done and out!

Like many, I'll be back to school come Tuesday, so hope you all enjoy the day off!


Monday, June 19, 2006

San Diego, coming soon!!

Thanks go to Scott McCloud for kindly linking to my latest Comixpedia article on "visual rhyming." He suggests "Eye Rhymes" as another good term, and I agree, it does sound pretty cool.

He also notes that he'll be speaking at the San Diego ComicCon on Friday from 12-1pm. I know I'll be there, and so should you: Scott gives great talks. And immediately after Scott's presentation, you should walk over and see me talk at 1:15 to 2:30 on the Visual Language panel of the Comic Arts Conference!

This year's panel should be fantastic. I can't wait, because my talk will be the best presentation I've ever done. I don't want to overhype it, but I've been working really hard on it and it should be great. Called "The Secret of Sequence," it will finally unveil most of my model of Visual Language Grammar, so, if you're curious at all about "how sequential images create meaning," then you won't want to miss it.

The panel will also have two other talks about theory related things as well. I'll post more about them, and the location, as the date nears.


Thursday, June 15, 2006

Thoughts on Time

I've had a couple of very interesting (and heated) discussions about "time" with friends recently that might be interesting to share. What had inspired one of these discussions was a book I'd read that implied that humans' ability for narrative gave way to ideas of past, present, and future, and thus a sense of time that is unavailable to animals.

My friend, ever the philosophical champion of animals (bless his heart), responded that this isn't the case at all. He noted this example: his cat can watch its prey go behind an object and will know it's going to come out the other side, so it goes to the other side to trap it when it comes out. So, since it can predict the result from a starting action, it has a sense of a future, and thus a sense of time.

I disagreed with this categorization, thinking that a conception of "time" as a thing is different from the cause and effect knowledge of events in daily life. He replied that it's just a matter of degree, not kind.

My thoughts right now are that there are two things going on here:

1. The subjective experience of time through the events that happen during the day.
2. An abstract sense of time as something objective and outside just experience.

Whether it’s a matter of degree or kind, some type of symbolic activity (be it language or otherwise) is necessary in order to move from a conception of the first type to the second.

So, what does this have to do with visual language?

Well, many people believe that panels somehow equate to moments, and that "movement through space is movement through time." I believe that the second sense of time (#2 above) underlies these assumptions. The whole sequence of panels (or just physical space) is looked at as an abstract passage of time in which the actual panels (or internal parts of panels) are seen only as parts that fulfil the expectations of the broader whole.

In contrast, a view of panels akin to #1 focuses only on panels as segments of events. That is, the understanding of panels in sequence is far closer to the experiential sense of events than to an overarching sense of time (as I suggested way back in my Buddhism essay).


Thursday, February 23, 2006

Timing

Newsarama hosts the first part of three articles about Time in sequential art written by Joanna Estep. The piece is very well presented, and I like how systematic her analysis is, especially her use of diagrams to push along the theory. It's well worth reading, and I look forward to seeing what her next installments bring.

However, I also want to point out that it makes certain assumptions that are largely passed on from the Eisner/McCloud tradition. Mainly, it holds that "one panel = one moment," which simply isn't the case if you actually look at sequences of images from books (as opposed to just mental theorizing – of which I've been guilty too). There is nothing about two panels that dictates time is passing – only content that implies temporal succession can yield this result. And, once you see that many panel sequences don't inherently push time along, you realize that problems arise in any linear notions of time across panels.

Following this, it also reinforces the idea that "spatial distance = temporal distance." I had some thoughts on this like four years ago that I've never really worked into a full-blown paper, but the basic idea is that panel sizes create a rhythmic structure for reading. To really see if this is true I'd need to do eye-tracking studies though…

I'll hopefully be posting an essay I've been working on about Time myself sometime soon, but till then my old essay Visual Syntactic Structures (and book Early Writings...) delves into these things for anyone interested.

Update: I now see that Timing Part 2 is posted too. Again, worth reading, but it continues the assumptions in McCloud that "reading time = fictitious (i.e. mental) time." I'm also curious why she includes her "hierarchies within images" as being related to time, since she doesn't measure any increase or decrease thereof. I agree with this: I don't think foregrounding is related to time at all, though I do think it's related to distinguishing things like who is the focused actor and who is subsidiary.

Update #2: Timing Part 3 is up now, rounding out the articles. This one is about the integration of text. I'm not sure what real relevance it has for the understanding of Time after stripping away the assumptions I talked about above, but she certainly has some interesting things to say about composition and reading orders. Go read.


Monday, February 13, 2006

Mayan Visual Language?

I haven't done a review for a while, so here's an absolutely fascinating one (again, listed in my bibliography):

Nielsen, Jesper, and Wichmann, Søren. 2000. America’s First Comics? Techniques, Contents, and Functions of Sequential Text-Image Pairings in the Classic Maya Period. In Comics and Culture: Analytical and Theoretical Approaches to Comics. Magnussen, Anne, and Hans-Christian Christiansen (Eds.).

This absolutely fascinating article provides a structural analysis of what could be interpreted as Mayan Visual Language. Some examples very clearly use the VL grammatical categories I've been researching, such as this one here (read here, R-to-L, click for high res.):



Most of these artifacts were taken from “vessels” (vases), so the sequentiality of the reading (layout) would be gained by turning the object itself. This is reminiscent of the 5000 year old goblet found in Tehran with sequential art on it. The authors also speculate on the usage of speed-lines and speech balloons, which have semantic variation in representation (speech balloons turn into flames used to show anger – a notable conceptual metaphor in its own right).

They also note that writing and images sometimes exist independently of each other, but by and large these are overshadowed by text-image pairings with sequential art. The reverence placed on image-text pairings is interesting in contrast to their Western counterparts:
"In Western society, the combination of text and image was, for centuries, considered a debased form of communication. Only artists who directed their work towards a mass audience, predominently the lower classes, dared venture into text-image pairings. The Mayas, however, considered the combination of text and image the most exquisite and exclusive form of artistic communication, and reserved it for elite consumption only." (p.73)

Would that we achieve what they had. All in all an absolutely amazing piece. I wish more analyses on cultural systems were done like this.


Friday, January 13, 2006

Problems with Closure, part 4

In my last post, I pointed out the assumption that pictures are not connected to any mental apparatus. I now continue on to show how that affects analysis of sequential images…

Assumption #3: Absence of Mind

By minimizing the contribution of the mind, a simple theory like closure can easily emerge. The images’ meanings are “out there in the world,” so all the mind needs to contribute is possible ways to pull those meanings together. Since no mind is found in the actual images, it’s placed instead between the images. Transitions just become a surface grafted onto this encompassing unifying process, where the “mind” “fills in the gaps.”

But, what is it "filling in the gaps" with? It must carry some information in order to do this.

Of course, the non-mental explanation says that we understand closure because we’ve had experiences in life that allow us to combine events in images. True enough. This is an appeal to the things being referenced. However, it still can’t escape the mental part of receiving those experiences and drawing upon them to understand images (i.e. doesn’t the mind then have to do something in order to make those experiences understood?).

This view casts the mind as a “magic box.” Stuff goes in, a conscious understanding is reached, but how did it do it? Cognition! Ok, yes, that’s true, but now tell me what that cognition is and how it works. You can’t just say “the mind does it” – you need to say what the mind does to be able to say that “it” does anything. Otherwise you’re just making an empty statement.

Closure doesn’t really say anything about the content of the panels, saying that meaning is created in the space between them. It cedes out a non-role to the “mind,” thereby passing the buck of meaning making to the ether. This makes closure essentially a faux cognitive process. And this is also why it can be extended to apply to just about anything at all.

Instead of a non-principle like “closure,” we can lay out mental schemas for events (and more) in our minds that allow for understanding sequential panels. Rather than a generalized magic that the “mystical mind” performs, this actually identifies the contribution of the mind.

My first model had three of these:

1) Environmental Phrase: unified various environmental elements at the same state
2) View Phrase: combined the same element at the same state
3) Temporal Phrase: unified elements of state changes

These "phrase structures" could then embed into each other, forming a hierarchy showing exactly what the mind brings to the table. While the panels are linear, the structures of understanding are not. Note also that formulating these rules inherently places constraints on which sequences can come out.
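To make the "linear panels, non-linear structure" point concrete, here is a minimal sketch in Python of how such embedding might be represented. Everything in it is an assumption made for illustration: the `Panel` and `Phrase` classes, the helper functions, and the four-panel strip are all hypothetical and are not taken from the original model or its notation.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Panel:
    """A single panel, identified by its position in the reading order."""
    index: int
    description: str


@dataclass
class Phrase:
    """A grouping of panels and/or smaller phrases (hypothetical class).

    `kind` is one of the three phrase types listed above:
    "Environmental", "View", or "Temporal".
    """
    kind: str
    children: List[Union["Phrase", Panel]] = field(default_factory=list)


def linear_order(node):
    """Flatten the hierarchy back into the left-to-right reading order."""
    if isinstance(node, Panel):
        return [node.index]
    order = []
    for child in node.children:
        order.extend(linear_order(child))
    return order


def depth(node):
    """Count the phrase levels sitting above the deepest panel."""
    if isinstance(node, Panel):
        return 0
    return 1 + max((depth(child) for child in node.children), default=0)


# Hypothetical four-panel strip, invented for this sketch:
# panels 1-2 show different parts of one scene at the same state,
# panels 3-4 show the same figure at the same state,
# and the two groups together are assumed to mark a change of state.
scene = Phrase("Environmental", [Panel(1, "a street"), Panel(2, "a doorway")])
figure = Phrase("View", [Panel(3, "figure, wide shot"), Panel(4, "figure, close-up")])
sequence = Phrase("Temporal", [scene, figure])

print(linear_order(sequence))  # [1, 2, 3, 4]
print(depth(sequence))         # 2
```

The panels still come out in reading order 1 through 4, but the grouping that organizes them is two levels deep, which is the sense in which the surface is linear while the understanding is not.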

My newer approach builds off of this further to stipulate actual grammatical roles, while rejecting the schemas above (because they don’t work entirely). You can see a glimpse of this new approach in the essay "Initial Refiner Projection", though that’s only a small part of it.

In all of these, a contribution of the mind is identified. It is not magically glossed over, and the power of meaning making is granted to the images themselves in concert with given mental rules.

Once you come to this conclusion though, it raises some other important questions:

Where do these mental schemas come from? (learning or genetics?)
How many are there, and how do they work?
Do these structures connect to other mental domains?

All of these are very important questions, and just the sort of thing that will hopefully occupy a good deal of time and effort in cognitive science in the years to come.

Problems with Closure: Part 1, Part 2, Part 3


Tuesday, January 10, 2006

Problems with Closure, part 3

In my last post I pointed out that pictures are not believed to have constraints on them, and that the mind must place constraints on any sort of understanding:

Assumption #2: The Veil of Iconicity

This assumption is that pictures are “out there in the world,” not learned information, and thus not mental phenomena. McCloud shows this underlying belief by stating:

“Pictures are received information. We need no formal education to “get the message.” The message is instantaneous. Writing is perceived information. It takes time and specialized knowledge to decode the abstract symbols of language.” (p. 49)


This belief is formed because images are most often iconic, meaning that they derive their meaning through resemblance to what they reference. A picture of a person is known to refer to a person because we know what people look like in the world. Note, there are three parts to this equation: the picture of a person, people in the world, and the concept of people in our minds.

However, just because they look like what they mean, it doesn’t mean that pictures aren’t conceptual information. Through this resemblance, we forget that it actually requires a mind to understand these images, and thereby discount its contribution to understanding. Images just seem like what we experience in the world: we don’t seem to need any special understanding to know the world, so thus we don’t need special understanding to know images.

Upon closer reflection, this is somewhat of a ridiculous mistake. If I draw a picture, how can it not be connected to my mental understandings? It came out from my mind, why wouldn’t its reception need to go through my mind too!? I had to learn how to draw, doesn’t that mean I had to learn how to understand drawings too!?

A considerable number of studies have shown that the understanding of images is clearly not so transparent. Often, this is found in native communities like Australian or Amazon aborigines who couldn’t/can’t understand aspects of “Western” representation. In the past, this was haughtily used to justify labeling their intelligence as "primitive" compared to Ameri-Europeans. Really, this is just a case of not having fluency in the conventionality of a graphic system (natives for the Western system(s), and Westerners for the native systems). Science is rife with these sorts of examples treating the world “objectively” while really being unable to see beyond the petri dish that oneself is standing in.

Because images look like what they represent, we gloss over the mental component for understanding them, and that component is in turn misplaced for sequential images. I’ll take this up in my next post.

Problems with Closure: Part 1, Part 2, Part 4
