16 Comments
Mar 18 · Liked by Rohit Krishnan

The example you provide of Midjourney creating a stereotypical depiction of, in this case, “an Indian man” can maybe be ~50% attributed to the default style parameters that it applies to all images. When these parameters are turned off (by adding --style raw and --stylize 0 to the end of the prompt), the results are much more varied, boring, and realistic. Midjourney has ostensibly set these parameters as the default in order to “beautify” the images it generates, but any attempt to automate a normative vision of beauty will be, by definition, stereotypical. Human artists might always have a total monopoly on art that is simultaneously beautiful and subversive.
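
For reference, here is roughly what that looks like in practice. The /imagine command and the --style raw / --stylize 0 parameters are Midjourney's own syntax; the prompt wording below is just an illustration. The first line gets Midjourney's default stylization, the second turns it off:

```
/imagine prompt: a portrait of an Indian man
/imagine prompt: a portrait of an Indian man --style raw --stylize 0
```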


I find it interesting that Gemini, coming well after other competitive products, and with everything Google has in terms of data, infrastructure, talent, good "process" (I assume), and an incentive to get this right, tripped so badly. I see this as Google's "New Coke" moment. For consumer-facing AI products at the intersection of company values, technology, and politics, the go/no-go criteria have to be defined very differently than for, say, B2B applications. And company culture influences these criteria, so I'm very sympathetic to Ben Thompson's view that the existing culture will have to change, which may not be possible with current leadership.

And I agree that Google was probably a bit unlucky; other AI companies will have the same hurdles to cross. Interesting times nevertheless!

author

I'm not sure they had a good process, personally. I'd venture they leant on their smarts instead of brute-forcing solutions, which doesn't work with LLMs. Hence you end up speedrunning all the same mistakes.


In which case my comment on culture (given the reputational risks involved after all the known issues with 'hallucinations' as well as legal challenges OpenAI/others are facing) is even more pertinent.


ChatGPT’s text-based answers seem generally more neutral. What is OpenAI doing right that Google is doing wrong?

author

An enormous amount, I imagine, gained from multiple experiments over multiple years and releases now.

Feb 29 · Liked by Rohit Krishnan

"When you yell AIs to be nicer, or focus on a diverse world, or to be law abiding, or to not say hateful things, these all interact with each other in weird ways" (tell?) and "bureaucracy meeting the urge to ship fast" I think are much needed notes of empathy for people and companies trying to solve hard problems.

As a (very) average programmer, I know how hard it is to write correct code, get it to run reliably, ship it on time, learn from user feedback, etc. and things like AGI are many orders of magnitude more difficult to get right than, say, simple webapps.

This freak-out (or maybe it's just the corners I inhabit) over embarrassing but hardly consequential errors is startling to observe.


Isn’t the problem here that we’re trying to think of these LLMs as having a single personality instead of a collection of a large number of personalities? The solution then would be to expose them as a large collection of personalities instead of a single one.

If I’m a subsistence farmer in Africa looking for advice on some issue my crops are having, I don’t want the solution that would be appropriate for an industrial-scale farmer in the US. Ideally the UX for these LLMs should require you to first choose who you want to talk to and then ask the question. With that type of UX, even the inappropriate images you included in your post could be considered quite appropriate provided, say, you chose an alternate-history fiction writer as the personality you’re talking to.
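
To make that concrete, here is a minimal sketch of what "choose who you're talking to first" could look like, where the chosen personality is nothing more than a system prompt selected before the question is sent. This assumes the OpenAI Python client purely for illustration; the persona names and descriptions are made up, and any chat API would work the same way.

```python
from openai import OpenAI

# Hypothetical personas: each one is just a system prompt.
PERSONAS = {
    "smallholder_agronomist": (
        "You advise subsistence farmers working small plots by hand, with "
        "little capital, no heavy machinery, and unreliable irrigation."
    ),
    "industrial_agronomist": (
        "You advise large commercial farms in the US with access to heavy "
        "machinery, credit, and precision-agriculture tooling."
    ),
}

def ask(persona: str, question: str) -> str:
    """Route the question through the chosen personality."""
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",  # any chat model would do
        messages=[
            {"role": "system", "content": PERSONAS[persona]},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

# ask("smallholder_agronomist", "My maize leaves are yellowing - what should I check first?")
```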

author

They're already trained on a large collection of personalities, perhaps the largest ever. Turns out that provides not just situation-specific intelligence but also a whole lot of weirder associations that we can't predict.


Even the gotcha image from Bing had 2 white male soldiers in the 4 that were picked.

I think there's a huge difference between data-derived stereotypes paired with a reinforcement training program that attempts to counter them, and Gemini eliminating the stereotype, and with it an entire people group. This was manifestly obvious to anyone who generated images. The fact that the model was released publicly doesn't point to bureaucracy intermixed with urgency. It points to a myopic worldview best represented by "the median member of the San Francisco Board of Supervisors." This is an institutional failure and the backlash is justified.

author

Sure. I'm saying the institutional failure isn't easy to point to without saying where or why. Did they just prompt it heavily? Or screw up the training data? Or tune it wrong? It definitely isn't trivial, and seems a continuation of the same types of errors that all of them fell prey to. Counting the number of white people isn't instructive enough about failure modes imo.


Where: I think the failure was appending a lot of politically correct language to each prompt.

Why: Ultimately I think you have to look at people's motivations and the incentives in the institution. Clearly they screwed up the training process because they had strong ideological motivations and blind spots.

I don't think I need to identify every (or even any) cause to identify a car accident or notice the pattern and blame BA's culture and decision-making.

It's not just a continuation of the same types of errors; it's so egregious as to be qualitatively different.

author

My point was they all add this, and the solution is to continue to add it but make sure that the model understands why you are adding it and the contexts in which it should apply, which is more complicated than just leaving it blank.
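
To be concrete about the difference (and to be clear, nobody outside Google knows what Gemini's actual instructions said, so the two lines below are purely hypothetical illustrations): the first is the blanket sort of instruction that gets appended to every request, the second is conditioned on context.

```
Blunt append:   "Always depict people of diverse genders and ethnicities."
Context-aware:  "When a request specifies a particular person, historical era, or
                 place, depict it faithfully; vary depictions only where the request
                 leaves them unspecified."
```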

Feb 29 · Liked by Rohit Krishnan

Not sure you should add it, but I might be wrong.

Either way, Google's failure wasn't just a difference of degree, but a huge lapse indicative of systemic cultural rot. Other LLMs would actually generate a white male, but Gemini wouldn't (at least in all the examples I've seen).


Why did we expect image generators to be historically accurate? I for one didn't, and I loved that this happened, even though it points to much bigger issues of diversity in AI and tech (mostly white and Asian males, not that that's bad, but we need to have more variety, and this issue points to "overcompensation" in the sense of "don't look at us, our systems are not racist").

Although, to give Google and other companies some credit, I do not want my system to be misused by white supremacists to spread stupid and harmful propaganda and harmful biases (white = good, white = beautiful, white = pure, and shit like that).

So in conclusion, Google did not screw up. The press and media think they screwed up, and Google, like the geek it is, let itself be bullied by them. Google showed extremely poor leadership in even accepting that they screwed up. They could have virtuously signaled their way out of the situation by being more strategic: "what the hell else did you want us to do? we are trying to prevent neo-Nazis and white supremacists from abusing our systems, stop playing the white victim" or whatever.

author

I think the annoyance was that a) people did not get what they asked for, and b) people got some of what they did not expect. The latter's mostly fine (like an Asian Jesus or whatever), but the former is the actual issue. And especially the part of it that came from ham-handed, prompt-engineered diversity creation, instead of it being either more subtle, by training the model better, or more overt, by publicising the prompts.
