Happy Friday and Happy Thanksgiving to those who celebrate it! (Question: what is the collective noun for a whole bunch of conversations? Shall we reclaim ‘twitter’ as literally no one else is using it any more?) I attended my first Thanksgiving dinner this week in London and had a twitter of fascinating conversations with VCs, startups, bankers and lawyers, most of whom felt that the UK government’s current policies will hinder economic growth.
In brighter news, last year’s ‘Barbenheimer’ box office pitted boys against girls, while last weekend, Gladiator / Wicked (‘Glicked’) pitted theatre gays against loincloth gays, taking more than $200m of largely pink pounds over their opening weekend. And if you want to know more about tribes, read on. But first this week, we look into how AI models are benchmarked, find out how much money it takes to be successful and bust some common crypto myths... Let’s get into it!
The layout and premise of the newsletter are simple: an end-of-the-week sheep-dip of tech, culture, policy and research stories, which I hope you enjoy. If you think friends or colleagues would benefit, please share with them so they can subscribe on Substack or LinkedIn.
Best wishes, Alex.
1. Tech innovation
Every time a new AI model is released, we are reliably told how it has exceeded previously established performance metrics. When OpenAI launched GPT-4o a few months back, it did so amid fanfare that it beat pretty much every other LLM against almost all recognised benchmarks.
However, new research from a group of academics (Reuel, Hardy et al) argues that not all benchmarks are the same: their quality depends on their design and usability.
These benchmarks matter because they are used to showcase AI safety and self-censorship, which in turn reassures regulators, who don’t have the capacity or wherewithal to scrutinise the rapid churn of new models as they hit the market.
And to be clear, government is relying on these benchmarks to do a lot of the regulatory heavy-lifting: the AI Safety Institute in the UK references benchmarks in its guidelines and the incoming European AI Act refers to benchmarks as key to identifying whether or not models present systemic risks (in which case they are likely to be subject to higher levels of regulatory scrutiny).
Rather like those billboard adverts that declare a product to be “the UK’s most recommended shampoo” above small print that admits 14 out of 17 customers in Milton Keynes between 6 and 11 November said they would recommend it to a friend, AI benchmarks can be a law unto themselves: at best inaccessible, at worst deliberately impenetrable. As the authors point out, “there exist large quality differences and that commonly used benchmarks suffer from significant issues.”
The new paper outlines an assessment framework covering 46 best practices across an AI benchmark's lifecycle and evaluates 24 AI benchmarks against it. The authors have also published what may be the first repository of benchmark assessments to support benchmark comparability, so that LLMs in particular can be measured in the same way moving forward.
2. Culture
Is there a secret to financial success? Just over half of Americans (52%) think so, although the average salary considered successful is $270k per year, according to new research from U.S. financial planning firm Empower.
But it’s not just money. It’s what money can buy that makes people feel they’ve made it: only 27% rank wealth as the highest measure of financial success. Rather, most Americans (59%) say happiness is the most important benchmark, meaning being able to spend money on the things and experiences that bring the most joy and doing what you love, followed by the luxury of free time (35%) to pursue personal passions.
Respondents said success was about the ‘factor of four’: hard work (84%); talent (65%); who you know (55%); and luck and circumstance (51%).
Pay yourself first, say over one third of people surveyed (35%), by putting money away and saving for retirement (a convenient finding for research from a financial planning firm). Meanwhile, for almost one in five Gen Z and Millennial workers (19%), a secret to success is to ‘fake it ’til you make it.’
3. Policy & Research
Crypto gets a bad rap, with media stories about blockchains being havens for criminals and having no real-world use cases. Now cyber intelligence firm Chainalysis have produced a Crypto Myth Busting Report, which explores some of this received wisdom and goes some way to debunking it with analysis and insight, including:
Myth #1: Crypto is only used by criminals.
Reality: With stronger law enforcement and growing regulatory oversight in multiple jurisdictions, illicit crypto transactions have dropped to less than 1% of total activity.
Myth #2: Crypto is anonymous and untraceable.
Reality: Bitcoin and other cryptocurrencies are pseudonymous, not anonymous. Transactions are tied to publicly visible addresses and are recorded on a public ledger. Know Your Customer (KYC) regulations and blockchain analysis tools enable tracing of activities and transactions.
Myth #3: Crypto has no real-world use cases.
Reality: Cryptocurrencies are proving particularly crucial in emerging markets for remittances and as a stable asset during economic instability. They've facilitated millions in aid during crises and offer a reliable alternative for preserving wealth.
The full report can be found here.
4. Watch/read/listen
Tribalism is a misunderstood buzzword. We've all heard pundits bemoan its rise, and it's been blamed for everything from political polarisation to workplace discrimination. But as cultural psychologist and Columbia professor Michael Morris argues in Tribal, released last week, our tribal instincts are humanity's secret weapon.
He claims ours is the only species to live in tribes: groups glued together by their distinctive cultures that can grow to a scale far beyond clans and bands. Countries, churches, political parties, and companies are all tribes, and tribal instincts explain our loyalties to them and the hidden ways that they affect our thoughts, actions, and identities.
Weaving together deep research, current and historical events, as well as stories from business and politics, Morris cuts across conventional wisdom to completely reframe how we think about our tribes. Bracing and hopeful, Tribal unlocks the deepest secrets of our psychology and gives us the tools to manage our misunderstood superpower.
5. Playbook picks & worthy clicks
Amazon doubles down on AI startup Anthropic with another $4bn (Reuters)
Data centres powering artificial intelligence could use more electricity than entire cities (CNBC)
Jamie Dimon predicts the next generation of employees will work 3.5 days a week (Yahoo)
MPs brand UK financial regulator 'incompetent' (BBC)
Bluesky has been accused of breaching EU data rules (FT)
Distributed computing is the next big thing (Quanta Magazine)
How OpenAI stress-tests its large language models (MIT Technology Review)
Microsoft is about to turn 50 (WIRED)
Now, you can follow Digital Culture Playbook on LinkedIn (please do!)
Wishing you all a fab weekend!