

Inside Heineken’s limited-edition NA beer campaign at the US Open

Heineken is betting on special limited-edition tennis-themed cans to drive more sales of its non-alcoholic beer during the U.S. Open tennis tournament. Last year, the limited-edition cans — called Heineken’s 0.0 L0ve.L0ve cans — sold out during the first week of the U.S. Open, where they were sold exclusively. In turn, Heineken’s on-site NA sales at the tournament grew 25% over 2023. Now the beer giant wants to bring that buzz outside the stadium and onto retail shelves for at-home tennis fans — the limited-edition can will be sold at retail nationwide over the coming weeks. It will also still be available throughout the grounds of the U.S. Open, which takes place from Aug. 18 to Sept. 8.

The L0ve.L0ve campaign also comes at a time when Heineken is increasing its focus on capturing a bigger share of the NA beer market in the U.S. Heineken 0.0 is currently the top-selling NA beer globally. According to Nielsen data, the brand experienced 25.2% sales growth in 2024, on top of 32.3% in 2023. Heineken’s 0.0 beer has grown significantly in the last few years; since 2021, Heineken says, 0.0 has sold over 11.2 million cases.

Maggie Timoney, president and CEO of Heineken USA, told Modern Retail it took the company decades to perfect its 0.0 beer, which first launched in 2017 in Europe and in 2019 in the U.S. Now that it has been scaled globally, Heineken’s USA division wants to grow it by reaching more people via cultural moments.

The name is a nod to what’s known as a “love-all” in tennis, or a 0-0 score between two players. The can was first made available exclusively on-site at last year’s U.S. Open, but this year, retail availability brings the L0ve.L0ve campaign outside of the tennis atmosphere in New York, allowing more fans to purchase these limited-edition cans. “We’ve been a sponsor of the U.S. Open for over three decades,” Timoney said. Outside of the limited-edition cans, Heineken 0.0 has been sold at U.S. Open matches for the last five years. “This is a huge sponsorship that’s not just on a national platform in the United States, it’s a global event.” Timoney said sampling is the most effective way to introduce people to Heineken 0.0, with repeat purchase rates currently at about 44%. That’s a big motivator for Heineken to dedicate more ad dollars and marketing initiatives to 0.0.

Timoney said that last year, the idea for the L0ve.L0ve cans came up right before the matches, so the company wasn’t prepared for the high demand throughout the two-week tournament. “The supply was the main challenge, and we ran out of stock the first week,” Timoney said. This year, she said, Heineken worked ahead of time to secure inventory distribution for the tournament as well as for the company’s retail partners, which include major grocery chains and independent stores. “This year, we had time to do a whole retail program and present it to retailers late last year, and they’re really excited about it,” Timoney said.

The L0ve.L0ve campaign during the U.S. Open is one example among several ways Heineken is trying to break through with its nonalcoholic offerings during cultural moments. “It’s another way to activate and shine the spotlight on Heineken 0.0,” Timoney said. “We were also at Coachella this year, where Heineken 0.0 was up 125.5% over the prior year.” The 0.0 beer was also featured prominently in the “F1” film this summer. Being a nonalcoholic product opens doors for marketing opportunities in public spaces; depending on the region, it’s often difficult to offer free alcoholic samples. NA beer has allowed Heineken to explore more activations, which Timoney said is important for converting customers.

To support the U.S. Open L0ve.L0ve cans, Heineken is doing a Grand Central Station takeover during the tournament, from Aug. 24 to Sept. 7. The takeover will include out-of-home ads throughout the station and appearances by influencer partners to promote it. The company will also hand out L0ve.L0ve beer cans at multiple locations throughout Grand Central, including the Graybar Passage near Track 17 and the dining concourse. Timoney said many match attendees get to the Queens stadiums via the famous transit hub, making it ideal for a sampling activation. “We’re making the commuters’ life a bit more refreshing by handing it out in a very unique way,” she said.

But marketing is just one part of the equation to drive more NA sales. Heineken is one of the brewers that has invested in improving the quality of non-alcoholic beers over the years. Improved recipes and greater variety have attracted people who are interested in cutting back on alcohol without sacrificing taste. In turn, NA beer options are becoming more prominent on menus, as they now appeal to a wider audience. “It took 20 years for our master brewers to crack the taste of a non-alcoholic beer that can be part of a lifestyle of different occasions,” Timoney explained. Twenty years ago, Timoney said, non-alcoholic beer’s share of the off-premise beer market in the U.S. was about 0.4%. “We are now at about 1.3%,” she said. “We see now that 90% of shoppers who buy alcohol also buy a non-alcoholic beer.”

According to a January 2025 Beer Institute survey, 60% of Americans see low- and non-alcoholic beer as viable alternatives for long-term moderation, a 2% increase over 2024. Beer is also leading the category over other NA options, with 22% of respondents preferring non-alcoholic beer over other NA beverages; meanwhile, 10% prefer NA liquor and 13% prefer NA wine. Nonalcoholic beer sales are also beginning to eclipse some traditional beer varieties, most recently surpassing ale by volume. Oren Bitan, chair of Buchalter’s wine, beer and spirits industry law group, said that NA beer is a “bright spot” in the declining beer market. “It makes perfect sense that a big supplier like Heineken would double down in the space and try and capture some of that market share,” Bitan said. On-premise marketing at entertainment and sporting events is a major opportunity for trial, he continued.


Esquire’s Michael Sebastian on the workday of a magazine editor in the age of TikTok and AI

By Tony Case  •  August 20, 2025  •  Photo by Guy Aroch

In the midtown Manhattan offices of Hearst Tower, surrounded by the familiar chaos of print production — stacks of back issues, proof pages scattered across the desk, the constant hum of editorial collaboration — Michael Sebastian presides over one of America’s most iconic magazines. With a twist.

At 44, the editor in chief of Esquire represents something of a paradox: a digital native who came up through the ranks of online journalism, now stewarding a 92-year-old print publication (and its various digital products) that recently notched the industry’s loftiest honors. This spring, Esquire picked up both the Pulitzer Prize for Feature Writing — awarded to journalist Mark Warren for his article “Death of a Small-Town Pastor,” about a small-town minister and mayor who died by suicide after his secret online life was exposed — and the National Magazine Award for General Excellence.

For Sebastian, the awards are validation of a philosophy that bridges old and new media — that great storytelling transcends format, and that heritage brands can thrive in the digital age without abandoning their print DNA. Sebastian joined Esquire eight years ago as digital director, after running digital news operations for the magazine’s parent, Hearst Magazines, contributing content to the websites of 18 titles that also include Harper’s Bazaar, Cosmopolitan and Town & Country. Yet taking charge of Esquire was a singular challenge: modernizing a media institution without losing what made it iconic. The formula for that, he’s learned, lies not in choosing between print and digital but in making them work together, as inextricably linked parts of a brand ecosystem.

WorkLife sat down with Sebastian to talk about what the typical workday of a magazine editor looks like now, the intersection of legacy media and digital innovation (including, naturally, the onslaught of AI), and what it takes to keep a heritage brand culturally relevant in an age of infinite content.

You come from a digital background and now you’re also running a print magazine. How has that perspective shaped your approach to Esquire?

I like to say that our digital manifestation — and I don’t just mean the website anymore; obviously, it’s all of these channels — is the beating heart of the brand, but the print magazine is the soul of the brand. It’s also the flagship store on Fifth Avenue. We need that because it differentiates us from an Instagram account or a website. Our audience likes it, the subscriber numbers remain strong, writers love to be in print, photographers love to see their stuff in print, and celebrities like to see themselves in print.

How do these platforms work together strategically?

All of these things fuel each other. When we book a celebrity for a cover, the thought is always: How is this going to perform in the digital environment? Five years ago, it was about website traffic. That remains important, but it’s also about engagement on Instagram, what kind of video we can produce that’s going to live on TikTok and YouTube. Print now serves digital very much, but if you don’t treat print like a bespoke product, readers and advertisers will recognize that.

What does the typical workday look like for a magazine editor now versus the past?

There are far fewer martinis in the afternoon! But seriously, no two days are the same. I oversee the brand — setting strategy, making sure we’re executing on everything from a big feature story to what we’re publishing on Instagram. I could be having lunch with an advertiser or meeting with the team about stories, but I also like to dip into the weeds. Yesterday I spent time reviewing digital headlines for all the celebrity profiles in our “Mavericks of Hollywood” package [the subject of Esquire’s September cover, featuring Leonardo DiCaprio], tweaking those with an editor. Maybe that’s below my pay grade, but I think it’s important.

Describe your own day.

During the school year, I have the early shift with my kids — I have a 10-year-old and a 6-year-old — so I’m making breakfast, getting them dressed, getting them to the bus. Then I’m in the office between 9 and 10, staying until 7 or 8. There’s still a lot of travel — [Esquire creative director] Nick [Sullivan] and I go to Florence, Milan and Paris for the [fashion] shows in June and January, so I’m gone most of those months.

How is AI changing magazine publishing?

We have access to enterprise ChatGPT and internal editorial guidelines, but we’re not using AI to create content. There’s a quote I love: “I don’t want AI to write and make art for me. I want AI to do my dishes and my laundry so I can have more time to make literature and art.” That’s exactly how we’ve approached this. How can we use tools to free up writers, editors, photographers and designers so they can do what they’re most skilled at and passionate about? I use it for research when preparing for interviews, or to surface things from our digital archive. But AI can’t report a story. It can’t go to the Bowery Hotel bar and paint a scene. It can’t replace taste or experience. When Mark Warren writes about people who have lost someone they love, AI hasn’t had its heart broken or been in love.

How has the pace of change in general affected magazine work?

The pace of change that once lasted decades, then years, now seems to happen daily or weekly. You have to be up to that challenge to work here — and you have to love it like an ER doctor loves it, because you’re going to be thrown curveballs every day. If you just want to do what happened 30 years ago, it’s not going to work. But if you have that passion in your belly, it’s a great job.

Why are you confident about the future of magazines?

Esquire is a Tiffany brand.


Future of TV Briefing: WTF is co-viewing measurement?

This Future of TV Briefing covers the latest in streaming and TV for Digiday+ members and is distributed over email every Wednesday at 10 a.m. ET.

This week’s Future of TV Briefing looks at what co-viewing measurement is and — more importantly — why it’s so problematic.

Co-viewing measurement is a necessary evil in the TV and streaming ad market. Ad buyers and sellers want to know how many people may have seen an ad. Which is understandable, but there’s no perfect way of counting how many people were actually in the room when an ad aired. Instead, co-viewing measurement relies on probabilistic modeling — a solution that is as problematic as it is necessary.

What is co-viewing measurement?

Co-viewing measurement is exactly what it sounds like: measuring how many people are in the room together when an ad or program airs on screen.

How is co-viewing measured?

There are two main methods of measuring co-viewing, but both effectively take a direct co-viewing measurement from a smaller sample of viewers and then project that across the broader TV and streaming audience base.

One method for taking the direct measurement is to have a sample of viewers physically log their TV watching. Nielsen deploys this through what are called personal people meters, in which the sample viewers press a button to tell Nielsen when they are in the room and are about to start watching something, how many people are in the room with them, as well as when any of them leave the room.

The other direct measurement method is to have devices, such as a camera, in the same room as a TV that detect when the TV is on and scan the room for the number of people in it. This is the methodology used by TVision, which provides that data to one of Nielsen’s chief rivals, VideoAmp. Little surprise, then, that TVision commissioned marketing firm Matter More Media to produce a study on co-viewing measurement that calls into question Nielsen’s co-viewing measurement methodology without calling out Nielsen by name.

What’s the problem with Nielsen’s co-viewing measurement methodology?

Humans. It largely relies on people actively logging their TV watching, including how many people they are watching with. Nielsen does have guardrails in place, like monitoring audio to determine if anyone may have left the room and then flagging those in the room to log any changes in who’s watching. But again, ultimately it’s up to people to remember to log the measurements, which is the criticism TVision and Matter More Media lob at this methodology in their study.

“Pushing buttons requires compliance, where TVision is a passive system that is automatically measuring who’s in the room,” said Kelly Abcarian, the former Nielsen executive and current Matter More Media chief strategy officer who wrote the TVision study.

The study cites a simulation conducted by TVision and the Coalition for Innovative Media Measurement that found active co-viewing measurements and passive co-viewing measurements matched 56% of the time, which means that 44% of the time one of the methodologies is under- or over-counting compared to the other.

So TVision’s co-viewing measurement methodology is better?

Not necessarily. The camera-based co-viewing measurement system is passive, so it’s not reliant on people actively logging their TV watching. It also measures on a second-by-second basis, so it’s less prone to viewership gaps. But it still requires the camera to be able to accurately analyze the number of people in the room.

Also, TVision bases its measurement on a sample of 6,000 households, which is less than a third of the number of households carrying Nielsen’s personal people meters. Moreover, while the industry’s measurement arbiter, the Media Rating Council (MRC), has audited Nielsen’s co-viewing measurement methodology, “I haven’t audited TVision. I don’t know how accurate it is, how good it is at measuring people’s faces, whether there’s bias [among the people who] even agree to have these things in their homes,” said Ron Pinelli, svp of digital research and standards at MRC.

OK, so both methodologies have their problems. Isn’t there a better way?

Sure. Either company — or both — could install cameras in every room with a TV in every household in the U.S. and also equip every person with a personal people meter to log their TV watching. It’ll be like taking inventory every time you want to chill on the couch.

How’s that?

A nightmare.

But is this a U.S.-only issue? Like, don’t they measure TV and streaming co-viewing in the U.K.?

They do. But the U.K. has Barb, an independent organization that provides the singular standard for co-viewing measurement that all ad buyers and sellers accept. There’s still the problem of the co-viewing measurement being projected across the entire market from a smaller panel of viewers, but because it’s one measurement that’s accepted by all parties, there’s less concern about any one party being disproportionately affected.

Hmm. So why doesn’t the U.S. have a Barb?

Why doesn’t the U.S. have a monarchy?

Wut?

The U.S. effectively had a Barb in Nielsen. It was the measurement standard for TV viewership used by ad buyers and sellers as the basis for transactions. But then Nielsen messed up on its measuring during the pandemic, was called out for it, lost its MRC accreditation (which has since been reinstated) and opened the door for the likes of VideoAmp, Comscore and iSpot.tv to vie to usurp Nielsen’s position as the industry’s primary measurement provider. And now — for as much as Nielsen still dominates the measurement currency market — the industry lives in a multi-currency measurement system akin to a multi-party political system.

Oh-kay… You mentioned MRC being “the industry’s measurement arbiter.” Can’t they do something about this?

Great question. And one I asked Pinelli and MRC CEO and executive director George Ivie.

What’d they say?

They know that co-viewing measurement is imperfect. “We were just talking to a big advertiser earlier this week who was telling us like, this is their biggest source of pain is
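To make the projection step concrete, here is a minimal illustrative sketch (in Python) of how a panel-derived co-viewing factor might be applied to household-level ad impressions. The numbers, function names and the simple average-viewers-per-session model are assumptions for illustration; they are not Nielsen’s, TVision’s or VideoAmp’s actual methodology.

```python
# Illustrative only: project a co-viewing factor measured in a small panel
# onto household-level impressions for the wider market.

def coviewing_factor(panel_sessions):
    """Average number of people in the room per viewing session,
    as logged (or detected) in the measurement panel."""
    total_viewers = sum(s["people_in_room"] for s in panel_sessions)
    return total_viewers / len(panel_sessions)

def project_audience(household_impressions, factor):
    """Scale household impressions into estimated person-level impressions."""
    return household_impressions * factor

# Hypothetical panel data: each session records how many people were watching.
panel_sessions = [
    {"household": "A", "people_in_room": 1},
    {"household": "B", "people_in_room": 3},
    {"household": "C", "people_in_room": 2},
    {"household": "D", "people_in_room": 2},
]

factor = coviewing_factor(panel_sessions)            # 2.0 viewers per session
estimated_people = project_audience(5_000_000, factor)
print(f"co-viewing factor: {factor:.2f}")
print(f"estimated person-level impressions: {estimated_people:,.0f}")
```

In practice the factor varies by daypart, genre and platform, which is exactly why the under- and over-counting disputes described above matter so much to buyers and sellers.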


Nasa and IBM apply artificial intelligence to tackle solar digital disruption

Solar storms and flares can have a big impact on digital society, which is why IBM and Nasa are forecasting solar weather with AI

By Cliff Saran, Managing Editor
Published: 20 Aug 2025 14:00

IBM and Nasa have made available on Hugging Face an open source artificial intelligence (AI) model called Surya, trained to understand and predict how solar activity affects Earth and space-based technology. Surya applies AI to solar image interpretation and space weather forecasting research. According to IBM, it can be used to help protect GPS navigation, power grids and telecommunications from the Sun’s ever-changing nature.

Digital technology that powers modern society is vulnerable to space weather: a systemic risk scenario created by Lloyd’s showed that the global economy could be exposed to losses of $2.4tn over a five-year period, while a hypothetical solar storm could lead to $17bn of economic damage. Solar flares and coronal mass ejections can damage satellites and spacecraft, and endanger astronauts stationed beyond Earth. They may cause satellite hardware to fail, damaging solar panels and circuits. Solar weather can also affect airline travel, due to navigational errors and the potential risk of radiation for airline crew and passengers. Disruption to GPS due to solar weather could impact agriculture, leading to lower food production.

Surya is a 366 million-parameter foundation model for heliophysics – the study of the Sun and its effect on the Solar System. It was pre-trained using data from the Atmospheric Imaging Assembly and Helioseismic and Magnetic Imager instruments on Nasa’s Solar Dynamics Observatory (SDO) mission, which was launched in 2010. The model uses self-supervised learning to identify patterns in unlabelled solar data, which eliminates the need for experts to categorise thousands of complex solar events manually, and it was trained on nine years of high-resolution solar observations from Nasa’s SDO. These solar images are 10 times larger than typical AI training data, so Surya required a custom technology architecture to handle the massive scale of the dataset while maintaining efficiency.

In a paper discussing the model, researchers from IBM and Nasa said Surya learned general-purpose solar representations that capture both “the fine-scale variability of magnetic fields and the large-scale dynamics of the solar atmosphere”. They claimed the pre-training strategy enabled the model to perform zero-shot forecasting of solar activity, representing a shift from narrowly focused, task-specific models to a more versatile and scalable approach for heliophysics.

Traditional solar weather prediction relies on partial satellite views of the Sun’s surface, making accurate forecasting extremely difficult. Surya addresses this limitation by training on the largest curated heliophysics dataset, which IBM and Nasa said was designed to help researchers better study and evaluate critical space weather prediction tasks.

“We are advancing data-driven science by embedding Nasa’s deep scientific expertise into cutting-edge AI models,” said Kevin Murphy, chief science data officer at Nasa’s headquarters in Washington. “By developing a foundation model trained on Nasa’s heliophysics data, we’re making it easier to analyse the complexities of the Sun’s behaviour with unprecedented speed and precision. This model empowers broader understanding of how solar activity impacts critical systems and technologies that we all rely on here on Earth.”

By releasing Surya on Hugging Face, IBM and Nasa said they were “democratising access to advanced tools for understanding and forecasting solar weather and scientific exploration”, encouraging the development of specialised applications for different regions and industries.

Juan Bernabe-Moreno, director of IBM Research Europe for UK and Ireland, said: “This AI model gives us unprecedented capability to anticipate what’s coming, and is not just a technological achievement, but a critical step toward protecting our technological civilisation from the star that sustains us.”
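Because the weights are distributed through Hugging Face, fetching them is a short script with the standard huggingface_hub client. The snippet below is a minimal sketch: the repository identifier shown is an assumption and should be checked against the actual listing on the IBM and Nasa organisation pages on the hub.

```python
# Minimal sketch: fetch the published model files from Hugging Face.
# The repo_id below is a placeholder assumption, not a confirmed identifier.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="nasa-ibm-ai4science/Surya-1.0",  # assumed repo name - verify on the hub
    revision="main",
)
print(f"Model files downloaded to: {local_dir}")
# From here, the weights would be loaded with the project's own code, since
# Surya uses a custom architecture rather than a stock transformers class.
```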


Interview: David Walmsley, chief digital and technology officer, Pandora

David Walmsley, chief digital and technology officer at jewellery retailer Pandora, is reflecting on four years of sparkling progress. His team has built a platform for digital transformation. Now he’s eager to help push the business into new data-enabled areas.

When Computer Weekly last spoke with Walmsley in early 2021, he’d recently transitioned from being chief digital and omnichannel officer at Pandora to his current role. He’d been tasked with using his broad knowledge to help the company embrace technological innovation for high-quality customer experiences.

“I was brought into Pandora to sort out e-commerce and digital, which is what I’d been doing in John Lewis and Marks & Spencer,” he says. “I inherited a traditional IT function that relied on outsourcing and that didn’t have a great track record in delivering big programmes and change. That’s what I’ve been addressing during the past four years, and we’ve been having a lot of fun ever since.”

Walmsley describes a long list of projects that have been completed. However, the biggest shift is cultural. He’s insourced engineering and architecture capabilities and built a strong product management function. The overarching strategy is to ensure that people in technology focus on business outcomes.

“Yes, we’ve built some big stuff, but the bigger change is that we now concentrate on how we use technology to drive conversion, customer satisfaction, shelf-edge availability, manufacturing, and more,” he says. “The approach is all about asking, ‘How do you drive the outcome and line the technology up behind that?’”

While the solution to a business challenge could involve implementing a big IT platform, the answer at other times might involve clever combinations of existing data and technology.

“So, we don’t exist to build big stuff, we exist to drive outcomes,” he says. “And that’s the big achievement. While we have a strong internal technology team, we also have great partners. Our successes are not just about doing clever stuff ourselves and hoping we know best. Our partnerships with Salesforce, SAP and Microsoft have been part of the fuel that’s powered us forward.”

Developing a digital stack

When Walmsley last spoke with Computer Weekly, he discussed the importance of using agile-focused development techniques to create new experiences. That approach is still important to his team’s work at Pandora four years later, but with a twist. Each business domain becomes a product line where stakeholders are integrated with digital and technology professionals in a team. This team runs quarterly planning meetings and establishes sprints and granular delivery points. However, Walmsley was keen to avoid stricter approaches, such as the Scaled Agile Framework, that place tighter constraints on workflows.

“We created this phrase, ‘Pandora, the agile way’. We wanted to create our flavour of agile. We do continuous improvement and development from the engineers’ desktops, and automation at scale. Our systems are, in the main, autonomous and loosely coupled. But then there are variations in between,” he says.

“Take the ERP programme with SAP. That’s a massive transformation. Yes, you can be iterative in certain areas during the work. But building an ERP system is at the other end of the spectrum. Yet across this spectrum, our digital stack is the most leading-edge I’ve had in my career. It would hold up against any major enterprise e-commerce stack.”

Pandora’s Digital Hub in Copenhagen plays a key part in this development process. Walmsley explained in 2021 how the hub was established during the coronavirus pandemic as a route to e-commerce innovation. The hub, which employed 120 people four years ago, is now home to over 300 IT, digital and data analytics professionals. Walmsley sets these developments within the context of the firm’s IT hiring decisions.

“We took an extra floor of the building, so that hub has been expanding over the last couple of years,” he says. “We’ve also been building IT talent and knowledge. We use the phrase sustainable technology, which is about keeping a check on total headcounts and costs. That approach is important for how we frame technology’s contribution to the business.”

Building data foundations

Walmsley says his technology team’s work involves a mixture of legacy streamlining at the back end and pushing change at the front end.

“My opportunity as a digital leader is to drive the top line,” he says. “All the complexity is at the back end. I bias my days towards doing clever stuff at the front end. However, the back-end digital transformation, across areas like ERP and manufacturing technology, is critical.”

Walmsley is proud of the work his team has completed on the firm’s data foundations. These foundations will help Pandora embrace emerging technologies, particularly artificial intelligence (AI). As a member of the company’s executive team, Walmsley works with the senior leadership team to develop a platform for change.

“Our investment in data foundations is a fundamental multiplier to where we’re going next,” he says. “We’ve got three big bets in AI.”

Those bets include working with Salesforce and its Agentforce product to develop agentic AI solutions for selling and service. Pandora is eight months into that relationship, and the technology already manages a significant chunk of online customer service requests. The second big bet is product development, where the team explores how AI can help bring new products to market. Finally, the third bet is back-end automation across the organisation, using productivity-boosting technologies from providers such as Microsoft and SAP.

Walmsley says Pandora’s explorations lead him to conclude that successful exploitation of AI requires a targeted approach. CIOs can suffer from a fear of missing out when it comes to AI, especially when chief executives and other C-suite members pile on the pressure for innovation.


Metropolitan Police contract with Fujitsu is ‘potential conflict of interest’ amid Post Office probe

Freedom of information request reveals sub-contract between Fujitsu and the police force leading the nationwide investigation of the IT firm’s part in the Post Office scandal

By Karl Flinders, Chief reporter and senior editor EMEA
Published: 20 Aug 2025 12:09

The nationwide police investigation into Fujitsu’s role in the Post Office scandal faces conflict-of-interest challenges, due to the IT supplier’s contract with the lead investigating police force.

A freedom of information (FOI) request has revealed that, through a subcontract in the Metropolitan Police’s deal with DHL Supply Chain to provide Met police officers with uniforms, Fujitsu supplies and supports the online ordering platform, known as Uniform Hub.

The subcontract creates a potential conflict of interest because the Met Police lead the nationwide Post Office Horizon scandal criminal probe, known as Operation Olympos, which includes investigating Fujitsu and former employees. Operation Olympos was set up in the aftermath of the broadcast of the Post Office scandal-based drama Mr Bates vs the Post Office, and the public anger that followed.

A legal source told Computer Weekly this creates “at least” a potential conflict of interest. The Metropolitan Police Service said it has an indirect commercial arrangement with the company Fujitsu, adding: “This is not confirmation of any corporate entities under criminal investigation…”

According to the FOI response, through the Uniform Hub contract, Fujitsu captures, stores and processes human resources (HR) files that contain information relating to officers. The data includes personal and employment details such as employee number and user ID, name, email address, gender, last hire date and termination date if relevant. It also includes information on what Met police officers are working on, the start date of current assignments, rank bands and job roles.

The contract, worth £123m, was first signed in December 2015 and is due to end in March 2027, with a tender process underway for a new contract starting in April 2027.

A legal source told Computer Weekly: “This situation creates at least a potential conflict of interest for the police force, and possibly an actual conflict depending on how the relationships are managed. The legal issues centre on impartiality, bias, misconduct and procurement law. The force must take active steps to identify, declare and manage the conflict to uphold legal and ethical standards.”

In its FOI response, the Met Police said the National Uniform Managed Service (NUMS) supplier conducts regular independent audits on sub-contractors, along with carrying out independent IT health checks annually or in response to a significant incident or change to the Uniform Hub. Fujitsu, as the NUMS supplier’s IT partner, is Police Assured Secure Facility (PASF) approved and subject to re-certification every three years, it added.

Separately to Operation Olympos, but in relation to the Post Office scandal, former Fujitsu staff have been under investigation since 2020. As Computer Weekly revealed that year, the Met Police began assessing evidence of potential perjury offences committed by Fujitsu staff in criminal trials of subpostmasters prosecuted for accounting errors caused by a computer system. In November 2021, it opened a criminal investigation into Fujitsu staff who gave evidence in trials of subpostmasters: tech workers Gareth Jenkins and Anne Chambers. Operation Olympos widens investigations into the actions of Fujitsu and its staff.

In an update on Operation Olympos in December 2024, Met Police commander Stephen Clayman, who is leading the investigation, said police will “go where the evidence takes” them, with no person or crime out of the scope of the investigation. In June 2025, police said they were investigating 45 people in relation to potential crimes committed in the scandal, with seven formally identified as suspects.

The sub-contract with the Met Police is an example of Fujitsu carrying out public sector business under the public radar. In January 2024, soon after the ITV drama about the Post Office scandal was broadcast, and following the public backlash, Fujitsu made a gesture to the UK government that it would pause bidding on government contracts. But there has been anger among the public and politicians as the firm has continued to win lucrative business.

In April 2024, Computer Weekly revealed that Fujitsu instructed its staff how to work around its own rules, with going through partners seen as a valid approach. The company’s then head of public sector, Dave Riley, told staff that the Cabinet Office position at the time, where Fujitsu bid with a partner, meant it was up to the partner to decide if they were comfortable working with Fujitsu, so such bids were not “currently subject to the gateway checks of us bidding”.

The Post Office scandal was first exposed by Computer Weekly in 2009, revealing the stories of seven subpostmasters and the problems they suffered due to the accounting software.


UK chip strategy needs an AI acceleration slant

Analysis for the government shows gaps in Labour’s AI plan of action, but the big opportunity is in optoelectronics

By Cliff Saran, Managing Editor
Published: 20 Aug 2025 12:01

Buried in a 12-page analysis prepared for the government by the Council for Science and Technology (CST) is recognition that Labour’s action plan for artificial intelligence (AI) opportunities lacks any concrete support for semiconductors.

In June, Labour unveiled its industrial strategy, which includes £19m of funding to establish a UK semiconductor centre that will serve as a single point of contact for global firms and governments to engage with the UK semiconductor sector. The Department for Science, Innovation and Technology (DSIT) said the centre would help ambitious firms scale up, form new partnerships and strengthen the UK’s role in global supply chains – helping to grow the economy.

But while the UK starts to build a viable semiconductor sector, the world is moving ahead at an incredible pace due to the growth of AI and the need for AI acceleration chips. The CST believes the UK needs to develop a workforce of chip designers, especially in the area of optoelectronics, which it predicts will be essential to provide the high-speed interconnects required to enable the connectivity of large numbers of graphics processing units (GPUs) to support advances in AI inference and training. AI chips are forecast to be the largest growth area in the chip industry for the next decade.

The authors of the Council for Science and Technology’s analysis, Advice on building a sovereign AI chip design industry in the UK, noted that there are disproportionate opportunities for businesses – and nations – with the right capabilities, given that six of the seven largest companies in the world are investing billions where they perceive low-hanging fruit for more efficient, faster, lower-power AI chips.

The government has a 50-point plan of action for AI, which includes establishing AI growth zones. The CST’s analysis states that “the plan is quiet on UK-designed chips for AI despite the opportunity and the risks”.

While there are disproportionate opportunities, the UK should look at the possibility of having a stake in AI chips, which the Council for Science and Technology said would “also help us secure our hardware supply chain for domestic commercial and military applications, in an uncertain era of tariffs and export restrictions”. This is now more relevant than ever, given recent policy changes in the US, and the risk that US semiconductor technology may be a key bargaining chip that US trade negotiators put into play to exert pressure on trading partners.

Earlier in August, Associated Press reported that Nvidia and AMD had agreed to share 15% of their revenues from chip sales to China with the US government, as part of a deal to secure export licences for the semiconductors. Some industry watchers have remarked that this sets a dangerous precedent. Policy institute Chatham House warned that while the US administration has argued the case for restricting export of high-end semiconductors to China on grounds of national security, the levy being imposed by the Trump administration represents a way to exert pressure on certain countries.

“The deal sets a concerning precedent with long-term ramifications. It suggests that other companies active in strategic industries could potentially in future pay their way out of burdensome and complex export control regimes, even if they involve key US national security concerns,” wrote Katja Bego, a senior research fellow in Chatham House’s International Security programme, in a recent post on the Chatham House website. She warned that this kind of arrangement could also set the scene for the Trump administration to more broadly exercise its ability to control export licensing to influence companies whose supply chains involve the US, such as the high-tech sector.

Compound semiconductors

The UK government’s semiconductor strategy sees a big role for compound semiconductors. However, the Council for Science and Technology believes optoelectronics – compound semiconductor technology heavily used in AI acceleration for ultra-fast connectivity of GPUs – should be prioritised. Its analysis points out that targeted investment could require trade-offs.

“Although the UK has a history of investing in compound materials, DSIT should consider deprioritising them in favour of activity in support of AI chips. Within the field of compound materials, optoelectronics should continue to be a relatively high priority as the market for it is expected to grow at a greater rate,” said the CST.

According to the Council for Science and Technology, data communications inside a single cloud datacentre are about 10,000 times larger than the entire public internet. “A single rack of AI accelerators communicates about 10 times faster than an equivalent rack of CPUs [central processing units]. Hence, AI systems already require more communications and are growing faster still,” said the report.

The report’s authors said this is good news for the UK, due to the nation’s strength in all aspects of optoelectronics, from optical subsystems and modules to specialised manufacturing. “We can anticipate far larger growth in manufacturing processes for optoelectronics than other compound manufacturing,” they wrote. The Council for Science and Technology also recommended that the government explore investment in advanced chip assembly and packaging.


Google spins up agentic SOC to speed up incident management

Google Cloud elaborates on its vision for securing artificial intelligence, unveiling new protections and capabilities across its product suite

By Alex Scroxton, Security Editor
Published: 19 Aug 2025 17:59

At Google Cloud’s virtual Security Summit this week, the organisation shared more details of its expanding vision around safeguarding artificial intelligence (AI), both in terms of deploying AI’s capabilities in the service of improving resilience, with new agentic security operations centre (SOC) capabilities and features, and of securing its customers’ future AI development projects. Google leadership spoke of an “unprecedented” opportunity for organisations to redefine their security postures and reduce risk around their AI investments.

The firm’s vision of the agentic SOC is an “integrated experience” whereby detection engineering workflows are streamlined by AI agents that optimise data pipelines and automate alert triage, investigation and response, in a system whereby the agents are able to coordinate their actions in support of a shared goal.

Its new alert investigation agent, which was first announced back at Google Cloud Next in April but enters preview today for a number of users, will supposedly enrich events, analyse command line interfaces (CLIs), and build process trees based on the work of the human analysts at Google Cloud’s Mandiant unit. The resulting alert summaries will be accompanied by recommendations for human defenders, which Google believes may help them drastically cut down both manual effort and response times.

“We’re excited about the new capabilities that we’re bringing to market across our security portfolio to help organisations not only continue to innovate with AI, but also leverage AI to keep their organisation secure,” Google Cloud’s Naveed Makhani, product lead for security AI, told Computer Weekly. “One of the biggest security improvements that we’re announcing is within our AI Protection solution. As organisations rapidly adopt AI, we’re developing new capabilities to help them keep their initiatives secure,” added Makhani.

In this space, Google today announced three new capabilities within its Agentspace and Agent Builder tools that it hopes will protect customer-developed AI agents. These include new agent inventory and risk identification capabilities to help security teams better spot potential vulnerabilities, misconfigurations or dodgy interactions among their agents, better safeguards against prompt injection and jailbreaking attacks, and enhanced threat detection within Security Command Centre.

Elsewhere, Google added enhancements to its Unified Security (GUS) offering – also unveiled earlier this year – including a security operations laboratory feature offering early access to experimental AI tools for threat parsing, detection and response, dashboards to better visualise, analyse and act on security data, and the porting of security features present in the Android version of its Chrome browser to Apple’s iOS. Trusted Cloud, meanwhile, gains several updates around compliance, posture management, risk reporting, agentic identity and access management (IAM), data protection and network security.

AI consulting

Based on Mandiant data suggesting its human analysts are increasingly seeing customer demand for guidance around cyber security for AI applications, Google will also introduce more AI-specific offerings within the overall solution set offered by Mandiant’s consultants. “Mandiant Consulting now provides risk-based AI governance, pre-deployment guidance for AI environment hardening and AI threat modelling. Partnering with Mandiant can empower organisations to embrace AI technologies while mitigating security risks,” said Google.


Stop benchmarking in the lab: Inclusion Arena shows how LLMs perform in production

August 19, 2025

Benchmarks have become essential for enterprises, allowing them to choose models whose performance matches their needs. But not all benchmarks are built the same, and many test models against static datasets or testing environments.

Researchers from Inclusion AI, which is affiliated with Alibaba’s Ant Group, proposed a new model leaderboard and benchmark that focuses more on a model’s performance in real-life scenarios. They argue that LLMs need a leaderboard that takes into account how people use them and how much people prefer their answers, rather than just the static knowledge capabilities models have. In a paper, the researchers laid out the foundation for Inclusion Arena, which ranks models based on user preferences.

“To address these gaps, we propose Inclusion Arena, a live leaderboard that bridges real-world AI-powered applications with state-of-the-art LLMs and MLLMs. Unlike crowdsourced platforms, our system randomly triggers model battles during multi-turn human-AI dialogues in real-world apps,” the paper said.

Inclusion Arena stands out among other model leaderboards, such as MMLU and OpenLLM, due to its real-life aspect and its unique method of ranking models. It employs the Bradley-Terry modeling method, similar to the one used by Chatbot Arena.

Inclusion Arena works by integrating the benchmark into AI applications to gather datasets and conduct human evaluations. The researchers admit that “the number of initially integrated AI-powered applications is limited, but we aim to build an open alliance to expand the ecosystem.”

By now, most people are familiar with the leaderboards and benchmarks touting the performance of each new LLM released by companies like OpenAI, Google or Anthropic. VentureBeat is no stranger to these leaderboards, since some models, like xAI’s Grok 3, show their might by topping the Chatbot Arena leaderboard. The Inclusion AI researchers argue that their new leaderboard “ensures evaluations reflect practical usage scenarios,” so enterprises have better information about the models they plan to choose.

Using the Bradley-Terry method

Inclusion Arena draws inspiration from Chatbot Arena, utilizing the Bradley-Terry method, while Chatbot Arena also employs the Elo ranking method concurrently. Most leaderboards rely on the Elo method to set rankings and performance; Elo refers to the Elo rating in chess, which determines the relative skill of players. Both Elo and Bradley-Terry are probabilistic frameworks, but the researchers said Bradley-Terry produces more stable ratings.

“The Bradley-Terry model provides a robust framework for inferring latent abilities from pairwise comparison outcomes,” the paper said. “However, in practical scenarios, particularly with a large and growing number of models, the prospect of exhaustive pairwise comparisons becomes computationally prohibitive and resource-intensive. This highlights a critical need for intelligent battle strategies that maximize information gain within a limited budget.”

To make ranking more efficient in the face of a large number of LLMs, Inclusion Arena has two other components: the placement match mechanism and proximity sampling. The placement match mechanism estimates an initial ranking for new models registered for the leaderboard. Proximity sampling then limits comparisons to models within the same trust region.

How it works

Inclusion Arena’s framework integrates into AI-powered applications. Currently, two apps are integrated with Inclusion Arena: the character chat app Joyland and the education communication app T-Box. When people use the apps, their prompts are sent to multiple LLMs behind the scenes for responses. The users then choose which answer they like best, though they don’t know which model generated it. The framework uses these user preferences to generate pairs of models for comparison. The Bradley-Terry algorithm is then used to calculate a score for each model, which leads to the final leaderboard.

Inclusion AI capped its experiment at data up to July 2025, comprising 501,003 pairwise comparisons. According to the initial experiments with Inclusion Arena, the most performant models are Anthropic’s Claude 3.7 Sonnet, DeepSeek v3-0324, Claude 3.5 Sonnet, DeepSeek v3 and Qwen Max-0125. Of course, this was data from two apps with more than 46,611 active users, according to the paper. The researchers said they can create a more robust and precise leaderboard with more data.

More leaderboards, more choices

The increasing number of models being released makes it more challenging for enterprises to select which LLMs to begin evaluating. Leaderboards and benchmarks guide technical decision-makers to models that could provide the best performance for their needs. Of course, organizations should then conduct internal evaluations to ensure the LLMs are effective for their applications. Leaderboards also give an idea of the broader LLM landscape, highlighting which models are becoming competitive compared to their peers. Recent benchmarks such as RewardBench 2 from the Allen Institute for AI attempt to align models with real-life use cases for enterprises.
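For readers who want to see what a Bradley-Terry fit looks like in practice, below is a minimal, self-contained sketch that estimates model strengths from pairwise win counts using the classic iterative maximum-likelihood (minorization-maximization) update. It is illustrative only: the model names and win counts are invented, and it omits Inclusion Arena's placement matches and proximity sampling.

```python
# Illustrative Bradley-Terry fit from pairwise win counts (hypothetical data).
from collections import defaultdict

# wins[(a, b)] = number of times model a was preferred over model b
wins = {
    ("model-alpha", "model-beta"): 70, ("model-beta", "model-alpha"): 30,
    ("model-alpha", "model-gamma"): 60, ("model-gamma", "model-alpha"): 40,
    ("model-beta", "model-gamma"): 55, ("model-gamma", "model-beta"): 45,
}

models = sorted({m for pair in wins for m in pair})
strength = {m: 1.0 for m in models}  # initial latent "ability" scores

# total comparisons between each unordered pair of models
games = defaultdict(float)
for (a, b), w in wins.items():
    games[frozenset((a, b))] += w

for _ in range(200):  # minorization-maximization updates until convergence
    new = {}
    for m in models:
        total_wins = sum(w for (a, _), w in wins.items() if a == m)
        denom = sum(
            games[frozenset((m, o))] / (strength[m] + strength[o])
            for o in models if o != m
        )
        new[m] = total_wins / denom
    norm = sum(new.values())  # normalize so scores are comparable across runs
    strength = {m: v / norm for m, v in new.items()}

for m in sorted(models, key=strength.get, reverse=True):
    print(f"{m}: {strength[m]:.3f}")
```

Production systems such as Chatbot Arena and Inclusion Arena layer confidence intervals and smarter pairing strategies on top of this core estimator.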


LLMs generate ‘fluent nonsense’ when reasoning outside their training zone

A new study from Arizona State University researchers suggests that the celebrated “Chain-of-Thought” (CoT) reasoning in Large Language Models (LLMs) may be more of a “brittle mirage” than genuine intelligence. The research builds on a growing body of work questioning the depth of LLM reasoning, but it takes a unique “data distribution” lens to test where and why CoT breaks down systematically. Crucially for application builders, the paper goes beyond critique to offer clear, practical guidance on how to account for these limitations when developing LLM-powered applications, from testing strategies to the role of fine-tuning.

CoT prompting, which asks an LLM to “think step by step,” has shown impressive results on complex tasks, leading to the perception that models are engaging in human-like inferential processes. However, a closer inspection often reveals logical inconsistencies that challenge this view. Various studies show that LLMs frequently rely on surface-level semantics and clues rather than logical procedures. The models generate plausible-sounding logic by repeating token patterns they have seen during training. Still, this approach often fails on tasks that deviate from familiar templates or when irrelevant information is introduced.

Despite these observations, the researchers of the new study argue that “a systematic understanding of why and when CoT reasoning fails is still a mystery,” which their study aims to address. Previous work has already shown that LLMs struggle to generalize their reasoning abilities. As the paper notes, “theoretical and empirical evidence shows that CoT generalizes well only when test inputs share latent structures with training data; otherwise, performance declines sharply.”

A new lens on LLM reasoning

The ASU researchers propose a new lens to view this problem: CoT isn’t an act of reasoning but a sophisticated form of pattern matching, fundamentally bound by the statistical patterns in its training data. They posit that “CoT’s success stems not from a model’s inherent reasoning capacity, but from its ability to generalize conditionally to out-of-distribution (OOD) test cases that are structurally similar to in-distribution exemplars.” In other words, an LLM is good at applying old patterns to new data that looks similar, but not at solving truly novel problems.

The data distribution lens

To test this hypothesis, the researchers dissected CoT’s capabilities across three dimensions of “distributional shift” (changes between the training data and the test data). First, they tested “task generalization” to see if a model could apply a learned reasoning process to a new type of task. Second, they examined “length generalization” to determine if it could handle reasoning chains that are significantly longer or shorter than those it was trained on. Finally, they assessed “format generalization” to measure how sensitive the model is to minor changes in the prompt’s wording or structure.
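One lightweight way to act on that three-dimensional view is to generate shifted variants of an existing evaluation set and compare pass rates in and out of distribution. The sketch below is not the paper's DataAlchemy framework; it is a hypothetical harness whose perturbation functions and example task are assumptions for illustration.

```python
# Hypothetical OOD evaluation harness: perturb an eval set along the three
# shift dimensions the ASU paper describes (task, length, format).
import random

def length_shift(_example):
    """Longer reasoning chain: ask for more steps than the training regime."""
    ops = " + ".join(str(random.randint(1, 9)) for _ in range(12))
    return {"prompt": f"Compute step by step: {ops}", "dimension": "length"}

def format_shift(example):
    """Same task, superficially different wording/structure of the prompt."""
    prompt = example["prompt"].upper().replace("STEP BY STEP", "one step at a time")
    return {"prompt": prompt, "dimension": "format"}

def task_shift(_example):
    """A structurally new task type not present in the original set."""
    return {"prompt": "Sort these digits descending, explaining each swap: 3 1 4 1 5",
            "dimension": "task"}

def evaluate(model_fn, eval_set):
    """Record in-distribution accuracy and issue one probe per shift dimension."""
    results = {"in_distribution": [], "length": [], "format": [], "task": []}
    for ex in eval_set:
        results["in_distribution"].append(model_fn(ex["prompt"]) == ex["answer"])
        for shift in (length_shift, format_shift, task_shift):
            variant = shift(ex)
            # Scoring shifted variants needs its own answer key or judge;
            # here the raw model output is simply collected for review.
            results[variant["dimension"]].append(model_fn(variant["prompt"]))
    return results

# Usage: evaluate(my_llm_call, [{"prompt": "Compute step by step: 2 + 3", "answer": "5"}])
```

A real harness would also need ground-truth answers or a judge for the shifted prompts; the point is simply that each shift dimension gets its own systematic probe rather than being sampled incidentally.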
For their analysis, they developed a framework called DataAlchemy to train smaller LLMs from scratch in a controlled environment, allowing them to precisely measure how performance degrades when pushed beyond the training data.

“The data distribution lens and controlled environment are both central to what we were trying to convey,” Chengshuai Zhao, doctoral student at ASU and co-author of the paper, told VentureBeat. “We hope to create a space where the public, researchers, and developers can freely explore and probe the nature of LLMs and advance the boundaries of human knowledge.”

The mirage confirmed

Based on their findings, the researchers conclude that CoT reasoning is a “sophisticated form of structured pattern matching, fundamentally bounded by the data distribution seen during training.” When tested even slightly outside this distribution, performance collapses. What looks like structured reasoning is more of a mirage, “emerging from memorized or interpolated patterns in the training data rather than logical inference.”

The breakdown was consistent across all three dimensions. On new tasks, models failed to generalize and instead replicated the closest patterns they had seen during training. When faced with reasoning chains of different lengths, they struggled, often trying to artificially add or remove steps to match the length of their training examples. Finally, their performance proved highly sensitive to superficial changes in the prompt, especially variations in core elements and instructions.

Interestingly, the researchers found that these failures could be quickly fixed. By fine-tuning the models on a very small sample of the new, unseen data through supervised fine-tuning (SFT), performance on that specific type of problem increased rapidly. However, this quick fix further supports the pattern-matching theory, suggesting the model isn’t learning to reason more abstractly but is instead just memorizing a new pattern to overcome a specific weakness.

Takeaways for the enterprise

The researchers offer a direct warning to practitioners, highlighting “the risk of relying on CoT as a plug-and-play solution for reasoning tasks and caution against equating CoT-style output with human thinking.” They provide three key pieces of advice for developers building applications with LLMs.

1) Guard against over-reliance and false confidence. CoT should not be treated as a reliable module for reasoning in high-stakes fields like finance or legal analysis. LLMs can produce “fluent nonsense” (plausible but logically flawed reasoning) that is more deceptive than an outright incorrect answer. The authors stress that “sufficient auditing from domain experts is indispensable.” “The advance of science should remain human-centered—machines can assist, but discovery still thrives on humanity and curiosity,” Zhao said.

2) Prioritize out-of-distribution (OOD) testing. Standard validation, where test data mirrors training data, is not enough to measure true robustness. Developers must implement rigorous testing that systematically probes for failures across task, length, and format variations.

3) Recognize fine-tuning as a patch, not a panacea. While supervised fine-tuning (SFT) can quickly “patch” a model’s performance on a specific new distribution, it amounts to memorizing one more pattern rather than producing more general reasoning.

