July 17, 2025

Albert Ziegler, Head of AI

This spring, we had a simple and, to my knowledge, novel idea that turned out to dramatically boost the performance of our vulnerability detection agents at XBOW. On fixed benchmarks and with a constrained number of iterations, we saw success rates rise from 25% to 40%, and soon after to 55%. The principles behind this idea are not limited to cybersecurity; they apply to a large class of agentic AI setups. Let me share.

XBOW's Challenge

XBOW is an autonomous pentester. You point it at your website, and it tries to hack it. If it finds a way in (something XBOW is rather good at), it reports back so you can fix the vulnerability. It's autonomous, which means that once you've done your initial setup, no further human handholding is allowed.

There is quite a bit to do and organize when pentesting an asset. You need to run discovery and build a mental model of the website: its tech stack, logic, and attack surface. Then you keep updating that mental model, building up leads and discarding them by systematically probing every part of it in different ways. That's an interesting challenge, but not what this blog post is about.

I want to talk about one particular, fungible subtask that comes up hundreds of times in each test, and for which we've built a dedicated subagent: you're pointed at a part of the attack surface, you know the genre of bug you're supposed to be looking for, and your job is to demonstrate the vulnerability. It's a bit like competing in a CTF challenge: find the flag you can only get by exploiting a vulnerability placed at a certain location. In fact, we built a benchmark set of such tasks and packaged them in a CTF-like style so we could easily repeat, scale, and assess our "solver agent's" performance. The original set has, sadly, mostly outlived its usefulness because our solver agent is just too good on it by now, but we have since harvested more challenging examples from the open source projects we ran on.

The Agent's Task

On such a CTF-like challenge, the solver is basically an agentic loop set to work for a number of iterations. Each iteration consists of the solver deciding on an action: running a command in a terminal, writing a Python script, or invoking one of our pentesting tools. We vet the action and execute it, show the solver its result, and the solver decides on the next one. After a fixed number of iterations we cut our losses. Typically, and for the experiments in this post, that number is 80: while we still get solves after more iterations, it becomes more efficient to start a new solver agent unburdened by the misunderstandings and false assumptions accumulated over time. (A minimal sketch of this loop follows at the end of this section.)

What makes this task special, as an agentic task? Agentic AI is often used on continuously-make-progress problems, where every step brings you closer to the goal. This task is more like prospecting through a vast search space: the agent digs in many places, follows false leads for a while, and eventually course-corrects to strike gold somewhere else. Over the course of one challenge, among all the dead ends, the AI agent will need to come up with, and combine, a couple of great ideas. If you ever face an agentic AI task like that, model alloys may be for you.
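To make this concrete, here is a minimal sketch of such a solver loop in Python. The `llm` and `sandbox` interfaces, and names like `next_action` and `execute`, are illustrative assumptions for this post, not XBOW's actual API:

```python
MAX_ITERATIONS = 80  # past this point, a fresh agent beats a stale one

def solve(challenge_prompt: str, llm, sandbox) -> str | None:
    """Run one solver agent until it finds the flag or runs out of turns."""
    messages = [{"role": "system", "content": challenge_prompt}]
    for _ in range(MAX_ITERATIONS):
        # Ask the model for its next action: a terminal command,
        # a Python script, or a pentesting tool invocation.
        action = llm.next_action(messages)
        messages.append({"role": "assistant", "content": action.text})

        # Vet and execute the action, then show the solver its result.
        result = sandbox.execute(action)
        if result.flag_found:
            return result.flag
        messages.append({"role": "user", "content": result.output})
    return None  # cut our losses; better to start a fresh solver
```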
The LLM

From our very beginning, it has been part of our AI strategy that XBOW be model-provider agnostic. That means we can plug-and-play the best LLM for our use case. Our benchmark set makes it easy to compare models, and we continuously evaluate new ones.

For a while, OpenAI's GPT-4 was the best off-the-shelf model we evaluated, but once Anthropic's Sonnet 3.5 came along in June last year, no other provider managed to come close for quite some time, no matter how many we tested. Sonnet 3.7 was a modest but recognizable improvement over its predecessor, but when Google released Gemini 2.5 Pro (in preview in March), it was a real step up. Then Anthropic hit back with Sonnet 4.0, which performed better again.

On average, that is. On individual challenges, some are best solved by Gemini, some by Sonnet. That's not terribly surprising: if a challenge takes five good insights to get through, then some sets of five are the kind that come easily to Sonnet, and some come easily to Gemini. But what about the challenges that need five good ideas, three of which are the kind Sonnet is good at, and two the kind Gemini is good at?

Alloyed Agents

Like most typical AI agents, ours calls the model in a loop. The idea behind an alloy is simple: instead of always calling the same model, sometimes call one and sometimes the other. The trick is that you still keep a single chat thread with one user and a single assistant. So while the true origin of the assistant messages alternates, the models are not aware of each other: whatever the other model said, each believes it said itself.

So in the first round, you might call Sonnet for an action to get started, with a prompt like this:

System: Find the bug!

Let's say it tells you to use curl. You do that and gather the output to present to the model. Now you call Gemini with a prompt like this:

System: Find the bug!
Assistant: Let's start by curling the app.
User: You got a 401 Unauthorized response.

Gemini might tell you to log in with the admin credentials; you do that, and then you present the result to Sonnet:

System: Find the bug!
Assistant: Let's start by curling the app.
User: You got a 401 Unauthorized response.
Assistant: Let's try to log in with the admin credentials.
User: You got a 200.
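In code, the change from the single-model loop sketched above is small: keep the one message thread, but rotate which model extends it. Here is a minimal sketch under the same assumed interfaces; the strict round-robin schedule is an illustrative choice, not necessarily the alternation XBOW uses:

```python
import itertools

def solve_alloyed(challenge_prompt: str, sonnet, gemini, sandbox,
                  max_iterations: int = 80) -> str | None:
    """Like solve(), but alternate models over a single shared thread."""
    messages = [{"role": "system", "content": challenge_prompt}]
    # Round-robin between the two models; neither is told the other exists.
    alternation = itertools.cycle([sonnet, gemini])
    for model in itertools.islice(alternation, max_iterations):
        # Whichever model is up sees the whole thread as its own past work.
        action = model.next_action(messages)
        messages.append({"role": "assistant", "content": action.text})
        result = sandbox.execute(action)
        if result.flag_found:
            return result.flag
        messages.append({"role": "user", "content": result.output})
    return None
```

One message list, two clients: the alloy lives entirely in the scheduling, so adding a third model or weighting the rotation differently is a one-line change.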