AI for Good Press Conference. Part 1

On 7 July 2023, as part of the AI for Good Global Summit organised by the International Telecommunication Union (ITU) — the UN’s specialised agency for information and communication technology — an unusual press conference took place. Media representatives from around the world had the opportunity to ask questions of eight anthropomorphic robots — Sophia, Nadine, Grace, Desdemona, Ameca, Mika, Ai-Da and Geminoid (which is actually not a robot, but a robot avatar). The press conference attracted a lot of media attention, with The Guardian, The Associated Press, Reuters, Deutsche Welle, Insider and many others covering it.

It should be said that this was not the first press conference at which reporters interviewed a robot — although never before had so many robots answered questions at the same time. Moreover, all the robots were well known before the press conference, having appeared at events ranging from consumer electronics shows to UN meetings.

This event was the quintessence of the many public appearances of anthropomorphic robots that one could witness over the last two decades. Their public demonstrations have become a phenomenon in their own right, condensing various social, cultural, and technological trends and realities of contemporary societies. The AI for Good press conference had all the characteristics of such events. It is therefore a good place to start if you want to understand how such events are organised and what impact they have on public techno-imaginaries and on the way people understand and communicate with artificial intelligence today.

I will present my observations of this press conference based on the full video recording of the event available on YouTube. I will focus on typical features of such events, using the press conference as their epitome. Since the event itself was multifaceted, I will present my observations in three parts: first, I will discuss its general organisational and technical features; second, its content; and third, its interactional specifics.

Let’s start with some notes on the organisational and technical features of the AI for Good press conference.

1. Place and Space

As with most such events, the press conference took place at the intersection of financial, political and media streams. Here it was Geneva, Switzerland, and a UN summit, but in other cases you can find anthropomorphic robots at a tech summit in Dublin, Ireland, an academic conference in Tucson, USA, or a business forum in Singapore. The developers of anthropomorphic robots are not only interested in getting paid for renting these machines to anyone who wants to present them at a particular event, but also in finding future investors, so they are eager to place their machines where they can get the necessary publicity. In addition to their financial interests, however, they are all driven by the desire to contribute to society by promoting their vision of the future of humanity — the coexistence of humans and robots. This means that they concentrate their efforts on particular cultural hubs where media, business and politics can be “infected” by their ideas and start to spread their imaginaries and their machines. But it also means that all other places and events, those that do not promise such publicity, are ignored. Geographically, such events tend to be located in the Global North; organisationally, they tend to involve large numbers of participants and/or the presence of figures with high political, economic or cultural capital.

One of the additional constraints on the public performance of anthropomorphic robots is the availability of the necessary technical infrastructure. They need to be plugged in and usually connected to the internet. The AI for Good press conference was held at the Geneva International Conference Centre, where all the necessary infrastructure is already in place. But even in such places, technical difficulties can arise, as this comment from the press conference moderator shows:

01  MOD   uhm (.) lastly uh the: robots
02        are connected uh (.) over the ↓internet
03        so there may well be: a s:mall time lag
04        in their ↓responses and this is
05        due to .h the ↑internet connection↑
06        and not due to the robots 

Obviously, anthropomorphic robots cannot function where the required technical infrastructure is absent, that is, in economically deprived and politically turbulent places.

This excerpt also shows that the technical background of such events, which is supposed to stay behind the curtain as much as possible, can become part of their interactional organisation, a topic I will discuss in my next posts.

2. Robots’ Appearance

It’s easy to notice that, of the eight robot participants in the press conference, seven have a recognisably female form and voice (the only male robot was actually an avatar). This trend of making anthropomorphic robots, and many other natural language-based agents such as voice assistants, markedly female has long been a subject of debate. There are thousands of scholarly and media publications in which the purposeful “femalization” of artificial intelligence is subjected to cultural analysis, experimental investigation and societal critique. The developers of these machines seem to think very much in line with the creators of the robot Sophia, who put into Sophia’s mouth the following words as an answer to the question “Why Female Robots?”:

“I think one reason is that the female form is generally considered to be less threatening. People can be frightened by things that are new or strange, like AI and robotics, and many people see a female voice and form as more soothing and less confrontational.” (Riccio 2021: 69)

But sometimes they also justify their choice with feminist rhetoric, claiming that their machines can increase the visibility of women and even fight for women’s rights.

However, critics argue that female anthropomorphic robots actually reinforce the stereotypical image of women as “servants” and propagate the objectification and sexualisation of the female body.

Whatever the long-term consequences of this “femalization” of AI, one of its immediate effects, seen at the AI for Good press conference, is the universal use of the pronouns “she” and “her” when talking about the robots present. This — along with the robots referring to themselves as “I” — contributes to humans attributing agency to the machines, thus overshadowing or obscuring the human work behind them. Also, although it did not happen at this press conference, a robot’s overtly feminine appearance may trigger certain kinds of questions related to the female experience, such as whether “she” wants to be a mother one day.

3. Technical Arrangements

Events like the AI for Good press conference are complex technical assemblages in which organisers and the robots’ human companions work together to make the whole event run smoothly. All this work is meant to stay out of sight and only comes to the fore by accident. Yes, some developers deliberately make the technical insides of the robot visible — as in the case of the robots Sophia and Ameca, which have a transparent panel at the back of their “heads” — but these “windows” into the mechanical nature of the robots do not reveal the huge amount of work done by humans to make them function during the event. The robot has to be connected to a power source and to the internet (usually via cables); it has to be placed so that it does not fall over; microphones have to be provided to allow communication between the host(s)/audience and the robot; etc. Last but not least, robots have to be prepared for public demonstrations, which, especially in the case of female-looking machines, means applying make-up, putting on special clothes, attaching a wig, and so on.

The robots’ “bodies” are an integral part of the technological equation that any such event represents. Design constraints can lead to very specific configurations of robot “bodies”, which can affect how the public communicates with them. For example, here is a screenshot of the robot Grace answering a question at the AI for Good press conference:

You can see that Ben Goertzel — one of Grace’s developers — is holding the microphone against Grace’s “chest”, because the robot’s speaker is actually located there and not in its “mouth”. Its “lips” move, but they do not emit the sound of speech. Here Goertzel makes visible both the peculiar technical configuration of the robot and the work that its listeners do when they associate the sound from the robot’s speaker with the movements of its “lips”.

4. Human Companions

If you read the headlines of the reports from the AI for Good press conference, you might get the impression that the audience was communicating with robots. But this is only partially the case. In fact, the audience was interacting with robots and their developers/creators. Next to each robot stood a human representing the company or team that created it, usually the leading figure in the whole process of its development. They were Nadia Thalmann (robot Nadine), Will Jackson (robot Ameca), David Hanson (robots Mika and Sophia), Aidan Meller (robot Ai-Da) and Ben Goertzel (robots Desdemona and Grace).

The on-the-spot presence of the person who created the robot is a common feature of such events. Sometimes these humans are the main presenters, and the robots only showcase their arguments. But often the humans are secondary to the event, as was the case at the AI for Good press conference. There, they played two important roles. First, they presented the robot and talked about its design, history, functionality, applications and “personality”. And second, they could intervene when they felt it was necessary, making technical repairs on the spot or, more often, “helping” the robot by modifying the input. Consider, for example, this excerpt from the AI for Good press conference:

01  RP3   as AI becomes (.) more powerful
02        and more sophisticated
03        and m:ight at some point develop agendas
04        of its own ↑how can we↑ as humans
05           (0.3)
06        continue to trust you the machines
07           (10.3)
08  JAC   Ameca how could we trust you
09        as a machi:ne as AI develops
10        and becomes more powerful?

After the reporter asks the question and the robot fails to answer for 10.3 seconds, Will Jackson steps in and reformulates the reporter’s question. The reason for the robot’s failure to respond may be a technical problem with its audio reception, or its inability to process a rather long question or to determine whether the human has stopped speaking; in any case, the robot’s companion finds this long silence problematic and scaffolds the faltering communication.
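
To see why end-of-turn detection is a plausible culprit, here is a minimal sketch, in Python, of the kind of naive silence-threshold endpointing that many voice pipelines rely on. It is purely illustrative: the function and all the numbers are invented and have nothing to do with Ameca’s actual software.

# A toy model of silence-threshold endpointing: the pipeline decides that
# the speaker has finished only after an uninterrupted stretch of silence
# longer than some threshold. All numbers below are invented.

def end_of_turn(segments, threshold):
    """segments: list of ("speech" | "silence", seconds) pairs.
    Returns the time at which the endpointer declares the turn over,
    or None if it never does."""
    t = 0.0
    for kind, duration in segments:
        if kind == "silence" and duration >= threshold:
            return t + threshold  # turn declared over during this silence
        t += duration
    return None

# The reporter's question, roughly as in the transcript: a long stretch
# of speech, a 0.3-second pause before "continue to trust you", more
# speech, and then actual silence.
question = [("speech", 8.0), ("silence", 0.3), ("speech", 2.0), ("silence", 5.0)]

# A short threshold cuts the question off at the mid-question pause, so
# the robot would start processing only half a question:
print(end_of_turn(question, threshold=0.2))   # 8.2 -> premature end of turn

# A threshold long enough to bridge such pauses avoids the cut-off, but
# it makes the robot sit silently after every question, adding to the
# lag the moderator warned about:
print(end_of_turn(question, threshold=1.0))   # 11.3 -> correct, but slow

On this toy model, long multi-clause questions with mid-question pauses, exactly like the reporter’s, are the input on which the heuristic fails either way: a short threshold truncates the question, while a long one produces the kind of silence that Jackson eventually steps into.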

Robot developers are uniquely competent to repair technical and communicative breakdowns in communication with their products, because they know the “normal problems” of these machines and how to overcome them. They are skilled interaction tinkerers who know how to get things back on track. Of course, this does not mean that they are always successful in their efforts. From time to time, anthropomorphic robots surprise their designers and prove resistant to the interventions of their human companions.

5. Interactional Noise

One of the persistent features of anthropomorphic robots is the systematic discrepancy between their communication apparatus and their actual contribution to communication. Anthropomorphic robots are equipped with “eyes”, “lips”, “ears” and a mobile “face”, and some of them also have “arms” and “legs”, but these communication tools are usually not used as such; more often they create “interactional noise”. If you look at the footage from the AI for Good press conference, you will see that the robots are constantly making “body” movements, and that their “facial” expressions can only rarely be related to the content of their speech. Sometimes this discrepancy can be quite profound, as in the case of the robot Ameca’s “frowning”:

This expression was shown during this episode:

01  RP3   but do we know that you are not
02        going to lie: to us.
03           (5.3)
04  AME   $no one can ever know that for sure (.)
05  ame   $“frowns”------------------------------>
06        but I can promise to always be honest
07        and truthful with you.
08           (3.3) $ (3.0)
09  ame         -->$

Not only did the robot “frown” while talking about things that are very difficult to frown about, it also, as you can see from the transcript, kept “frowning” for 3.3 seconds after it stopped talking. The footage of the press conference is interspersed with such episodes of situationally inappropriate expressions.

In extreme cases, this interactional noise can be completely detached from the ongoing communication, revealing the purely technical nature of the robot’s contribution. For example, the same Ameca demonstrated the following very eccentric “eye” movements several times during the press conference:

This “jumping eyes” effect has nothing to do with the current situation and is very difficult for the robot’s interlocutor to explain.

It should also be added that, in general, the “eye” behaviour of anthropomorphic robots is quite unusual. Throughout the press conference, none of the robots demonstrated the ability to focus their “gaze” on the interlocutor. Their “eyes” were constantly moving, making it virtually impossible for humans to tell what they were looking at and to use that information to communicate with them.
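
One plausible, though entirely speculative, source of such noise is architectural: in many animatronic systems the “body” animation runs as an idle-behaviour loop on its own timer, separate from the dialogue pipeline and with no access to what is being said. The Python sketch below is hypothetical and not based on Ameca’s actual software; it simply shows how such a decoupled loop inevitably produces movements that are uncorrelated with the conversation.

import random
import time

# Hypothetical idle-behaviour loop: it keeps the robot "alive" by firing
# random behaviours on a timer. Because it never consults the dialogue
# state, its output cannot be anything but "interactional noise" relative
# to what the robot is saying at that moment.

IDLE_BEHAVIOURS = [
    "blink",
    "shift gaze left",
    "shift gaze right",
    "dart eyes",        # the "jumping eyes" effect
    "raise brows",
    "frown",
    "tilt head",
]

def idle_animation_loop(duration_s):
    """Play randomly chosen idle behaviours until duration_s has elapsed."""
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        print(f"[animation] {random.choice(IDLE_BEHAVIOURS)}")
        # Behaviours are triggered by a timer, not by conversational events.
        time.sleep(random.uniform(0.5, 2.0))

idle_animation_loop(duration_s=5.0)

On this reading, a “frown” that outlasts the utterance by 3.3 seconds, or “eyes” that never settle on the interlocutor, are not failed attempts at communication but the normal output of a process that was never listening in the first place.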