AI Newscasters and Barthes’ Third Meaning


---------------------
Robbie Fordyce
20 March, 2024

---------------------

On the use of AI in news content creation in Indonesia.

---------------------
  1. Overview
  2. Diversity
  3. Production
  4. Reception
01. Overview

In 2023, TVOne Indonesia began using "AI newscasters" to deliver its news content. While it's easy to see the oddness of these agents, I think their use raises a few important points about news production and public access.

I'm indebted to my student Rivi, who is currently researching this in depth for his Masters thesis and who brought the topic to my attention.
Example Video from TV One:


As I understand it, the content is created using a phased approach. First, standard reportage and content production strategies are used to create a script. Second, a presenter is recorded reading out the script. Third, the recording is fed into a tool that generates a digital agent synced with the audio. An obvious benefit to the company here is that this production method integrates well with existing media production strategies for news companies.
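
To make the phased approach concrete, here is a minimal illustrative sketch in Python. Everything in it - the function names, the file paths, the avatar_id parameter - is hypothetical and stands in for whatever reportage workflow and face-mapping tool TV One actually uses; it simply shows how the three phases hand off to one another.

```python
# A purely illustrative sketch of the three-phase workflow described above.
# All names (write_script, record_presenter, generate_newscast, avatar_id,
# the file paths) are hypothetical placeholders, not TV One's actual tooling.

from dataclasses import dataclass


@dataclass
class Script:
    text: str


@dataclass
class Recording:
    audio_path: str  # the presenter reading the script aloud
    video_path: str  # a simple face capture of the presenter


def write_script(reportage_notes: str) -> Script:
    """Phase 1: conventional reportage and editorial work produce a script."""
    return Script(text=reportage_notes)


def record_presenter(script: Script) -> Recording:
    """Phase 2: a presenter reads the script; audio and face are captured.

    Note that this needs no studio: a clear voice recording and a webcam-style
    face capture are enough for the next phase.
    """
    return Recording(audio_path="read.wav", video_path="face.mp4")


def generate_newscast(recording: Recording, avatar_id: str) -> str:
    """Phase 3: a face-mapping tool masks a digital agent onto the capture,
    lip-synced to the recorded audio. Returns the path of the finished clip."""
    output_path = f"newscast_{avatar_id}.mp4"
    # ...hand recording.audio_path and recording.video_path to the mapping tool...
    return output_path


if __name__ == "__main__":
    script = write_script("Flooding has been reported in several districts...")
    capture = record_presenter(script)
    print(generate_newscast(capture, avatar_id="presenter_03"))
```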

Is this AI? I would say no. There's nothing here that specifically requires AI to generate the newscaster. This appears to be nothing more than a video mask mapped onto a screen capture of the human speaker. If AI were involved, it would be unnecessary and - at least at this stage of AI development - more likely to introduce errors or issues into the broadcast. There may be AI used in generating the stills or the script, I suppose, but this seems simply to use existing, conventional facial mapping technology.

That said, this doesn't mean we can't evaluate it in terms that relate to AI, as we're looking at content production strategies where AI is relevant. If this helps us explore the role that AI might take, or identify relevant questions, then that's still useful. I'm going to speak to three categories that stand out to me: diversity, production, and reception.

02. Diversity

One thing that stands out about the videos is that the content has been generated in a way that speaks to the diversity of Indonesia's peoples. Scrolling through the videos, it's easy to see that while some identities repeat, there is a range of identities represented across the agents. Indonesia is an interesting country for this exploration of news production because it has a huge diversity of ethnicities and languages. I'm not personally placed to identify the ethnicities or the consistency of language here, but the simple fact that this has happened is interesting in itself.

Examples: screenshots of AI newscasters

Irrespective of whether this is AI, the current value of diverse leading actors in any context is at least partly framed as a concern over the conditions of production. Certainly, actors who represent specific identities or ethnicities have a role in enabling recognition of participation, inclusion, or other virtues for people who come from those identities. However, this gains value because it signals participation in the conditions of production. The idea is that it shows that minorities are not excluded, that they receive paid roles, that they have a part in the production process, and that their labour is valued. This changes when those identities simply become representations without referent, where an identity is represented without people from that identity taking part in production.

03. Production

While AI is often discussed in terms of labour and workplace changes, there are also infrastructural impacts that don't tend to be discussed as often. The savings on labour can be associated with the removal of various biological demands of labour - safe workplaces, toilets, exhaustion, access to food - but AI also offers the removal of various aspects of the means of production. Labour is often threatened by a replacement framework (where a worker is replaced by a machine), yet that model still reproduces several infrastructural aspects of a studio, specifically expensive recording equipment and spaces. Assuming the process of creating this content is as described above, all the newscaster requires is reasonably clear audio and face capture. No lighting, no sound stage, no camera errors, no makeup, no complexities around going live to air. Other factors do remain - montage/editing, reportage, and presumably the producer - but removing the studio is a significant development.

Producing content in this way is interesting in terms of its efficiency. News can be created more quickly and more easily. I do not necessarily think this is a 'good thing' in the context of our current news environment, due to matters of trust and misinformation. A major news company is understood as trustworthy partly because of the financial commitments involved in building a studio and developing a brand, though this is often offset by political-economic concerns about the company's financial or political motivations. Nonetheless, the model we're seeing here offers a means for people to produce news content without a major investment in a studio or equipment.

04. Reception

Critical theories of imagery have been an important component of the study of ideology and capitalism over the last century. Within this, there are theories relevant to the study of artificial people, and there are two historical threads that I think are worth exploring.

The first is Masahiro Mori's theory of the uncanny valley (Wikipedia), which is a well-known idea but not strictly within the 'critical theory' canon. The idea is that we relate to the world around us in terms of familiarity and comfort, and that our comfort increases as the things we encounter become more relatable. Mori proposes that we have a subjective crisis when we encounter something that is nearly relatable but fails in some important way, leading to an inversion of the feeling of comfort. A good example is that a prosthetic hand may produce a feeling of unease in people while still being recognisable as a hand. There are political considerations to take in here: prosthetic hands would only generate discomfort if you are unfamiliar with them, and a deeply racist person may feel the uncanny valley in the presence of a person of another ethnicity. Certainly the way some online commentary discusses trans people would suggest that the commenters are perhaps experiencing something along these lines. The subjective, psychological component of this experience makes it hard to use as a definitive analytical tool, although it is useful for describing the experience of viewing AI-generated people. The core of it, for me, is that the uncanny valley relies on a person being unfamiliar with the thing being encountered. If AI-generated content becomes more familiar to people, or more convincing, then the explanatory value of the uncanny valley will decrease.

I'd like to point to a different logic: that of the third meaning. The third meaning is an idea from Roland Barthes, developed to explain how we encounter images. His argument is that when we encounter content, we first experience information, then we experience symbols, and then we experience a third thing. Sometimes he simply calls this the third meaning or image, and at other times he calls it the obtuse meaning.
Example: still from Akira Kurosawa’s Seven Samurai (1954)

What does this mean? Consider this still from Akira Kurosawa's Seven Samurai (1954). All three levels - the information, the symbolic, and the elusive 'third' type - are present.

The information is the first level of interpretation, which happens unconsciously when we look at the still. There are three people in the shot: one is framed close at a three-quarter angle with an intense expression, one is in mid-shot in profile and speaking, and the last is front-on in the middle-rear right of the shot with their head down. They are crouched, and there are words on the screen. A building is in the background.

The symbolic content is the interpreted, contextual meaning granted to the information: we can see that the expression is contextualised as a response to the talking, as the man becomes increasingly enraged by the threat to his life; the clothing seems to have a traditional print, and in the video it is clearly a worker's garb.

We're now left with the third, obtuse meaning, which Barthes himself was never especially clear about. The third meaning is more apparent in the video, but it is present in the still as well: the bald caps. The third meaning is a type of failure, insufficiency, awkwardness, or gap in the faith and trust we place in the conditions of production. The obtuse meaning is where the conditions of production become visible, and we can start to see how the material was produced in a manner that introduces doubt. It shapes our ability to suspend disbelief, or it exposes that we were tricked in the first place. We can see the bald caps, and that does something to how we value the rest of the content. The third meaning is also about reception, like the uncanny valley, but it has two benefits that I think place it above: it is socially contextualised to the conditions of production, so that we can start to understand how the environment was created, and it does not rely on the viewer being unfamiliar with what they are seeing.

For me, this is something that we can use to understand and evaluate generative AI as an audience experience. This would seem to be how most AI is evaluated: in terms of its falling short of an ideal, not in terms of an actual quality, trustworthiness, or exposing its plumbing.

05. References
  • Barthes, R. 1987. “The Third Meaning: Research Notes on Some Eisenstein Stills” in Image Music Text. Heath translation. Fontana Press: London. pp.52-68.
  • Kurosawa, A. 1954. Seven Samurai. Toho: Japan. [IMDB]
  • TV One AI. 2023. Various images and stills. YouTube.com. https://www.youtube.com/@tvoneai