ContentSproute


The best deals on AirPods, iPads, MacBooks and other Apple gear you can get right now

Whether it’s your nephew who wants a new iPad, your sister who needs a new pair of AirPods or your parent who could benefit from an easy-to-use MacBook, Apple devices are some of the most in-demand gadgets you can buy today. That means, unfortunately, that big discounts are few and far between, and they’re often the first to sell out when sales do arise. But make no mistake: you can find good Apple deals across the web; you just have to know where to look. Engadget keeps track of deals like these regularly, so we’re here to help. We’ve collected the best Apple deals on items like AirPods, MacBooks, iPads and more that you can get right now. Just note: you’ll find the best Apple deals from retailers like Amazon, Best Buy, Walmart, Target and others — not directly from Apple. Unless you shop refurbished, you’ll always pay top dollar direct at Apple, and for some things (like iPhones), that might be the best route to take.

Best Apple Watch deals

Apple Watch Series 10 for $299 at Amazon: Apple’s flagship wearable is the best smartwatch you can buy, period. While the Series 10 was an iterative update, that’s not necessarily a bad thing. It sports a slightly longer battery life, a slimmer design and a wide-angle OLED screen for better viewing angles. It tracks workouts accurately and delivers alerts to your wrist efficiently. Also at Walmart.

Apple Watch SE for $169 (32 percent off): We wouldn’t be surprised to see an update to Apple’s budget smartwatch sometime soon (and we have a few suggestions on the matter). But thanks to some serious discounts, the Apple Watch SE has turned out to be the most affordable way to get an iPhone companion for your wrist. Despite the lackluster screen and limited extra features, it handles the basics well. Also at Best Buy and Walmart.

Best AirPods deals

Best iPad deals

This is Apple’s most affordable large-screen iPad. Engadget’s Nate Ingraham awarded it a review score of 89 upon its debut this March. When you pair it with accessories like a keyboard folio and mouse, it becomes a true productivity machine — though those add-ons make it a pricey package. Good thing iPads are on sale for Prime Day. This $120 discount represents the lowest price we’ve seen: $728 at Amazon. Also at Best Buy.

iPad Pro (M4, 11-inch) for $899 ($100 off): The most powerful iPad is the iPad Pro, and it’s the one to get if you’re even toying with the idea of using your new slab as a laptop replacement. Both the 11- and 13-inch models have gorgeous displays, thinner and lighter designs, a repositioned front camera and the excessively powerful M4 chip inside. Also at Best Buy.

Best MacBook deals

Apple MacBook Air (13-inch, M4) for $849 at Amazon: Apple’s latest MacBook Air is another device that only came out in March but already has a modest discount at some retailers. One of the things we appreciated most in our review was the slight price drop for the base configuration. Instead of starting at $1,099 like the 13-inch M3 MacBook Air, the M4 starts at $999. Add in this discount and the fact that the ultraportable packs Apple’s latest M-series chip, and you’ve got yourself a pretty good deal on a capable laptop — one that happens to be our favorite laptop overall. Also at B&H Photo.

Apple MacBook Air (15-inch, M3, 24GB RAM) for $1,299 ($400 off): The last-gen M3 MacBook Air has officially been discontinued, but it remains a superb laptop while the last bits of stock remain available. We gave this 15-inch model a score of 90 in our review — outside of its slower chip (which is still plenty fast for everyday use), marginally improved camera and inability to power two external displays with the lid open, it’s virtually identical to the newer version. This deal applies to the model with the larger 24GB of memory in the Starlight colorway.

MacBook Air (M3, 15-inch) for $949 ($350 off): The base model M3 MacBook Air is also available at a sizeable discount.

Apple MacBook Air (15-inch, M4) for $1,049 ($150 off): The 15-inch MacBook Air is nearly identical to the smaller version but features more robust speakers and a more spacious trackpad alongside its roomier display. Also at B&H.

Best Apple accessories deals

Apple Pencil (USB-C) for $69 ($10 off): This more affordable Apple Pencil doesn’t support pressure sensitivity, but it still makes for a useful stylus for the basics. If you’re a casual note-taker and can live without wireless charging, you’ll save a few bucks by picking this one up. Also available at Walmart.

Apple Pencil (2nd gen) for $90 ($39 off): The second-gen Pencil attaches and charges magnetically, supports tilt and pressure sensitivity and allows for tool changes with a tap of its flat edge. It’s an older pencil and isn’t compatible with the latest iPad models. This is the lowest price we’ve tracked this year, but it went as low as $80 during last year’s Black Friday sales.

Read more Apple coverage: the best AirPods, the best Apple Watches, the best MacBooks, the best iPhones and the best iPads.

Follow @EngadgetDeals on X for the latest tech deals and buying advice.


Unicode’s new emoji refuses to put respect on Bigfoot’s name

The Unicode Consortium has announced that it’s adding what’s essentially a Bigfoot emoji to the open Unicode standard this fall. The famous cryptid will appear as “Hairy Creature” alongside a selection of other fun new emoji options in Unicode 17.0. It might seem strange that a consortium of companies as powerful as Apple, Google and Microsoft would practically subtweet one of North America’s most famous semi-mythological creatures. But the global nature of Unicode makes avoiding region-specific nomenclature preferable whenever possible. To me, that’s Bigfoot, plain and simple, but elsewhere in the world it might scan as a yowie, yeti, nuk-luk, hibagon, orang pendek or an almas.

Besides “Hairy Creature,” here are some of the other new emoji that’ll be added with Unicode 17.0:

Trombone
Treasure Chest
Distorted Face
Fight Cloud
Apple Core
Orca
Ballet Dancers
Landslide

Unicode 17.0 is slated to be released on September 9, 2025, but these new emoji likely won’t be added to Android and iOS until a bit after the standard is updated. You’ll just have to make do with what you can create with Genmoji or Emoji Kitchen while you wait.

Update, July 21, 2025, 3:35PM: The article was updated to add the “Landslide” emoji that wasn’t included in the Unicode Consortium’s original announcement post.


Xbox cloud games will soon follow you just about everywhere

Microsoft just launched a service for Xbox Insiders that brings all cloud-playable games, along with play histories, to the official Xbox PC app. This includes console exclusives spanning multiple generations and hundreds of other releases. The service covers both games owned by the player and Game Pass titles.

The big hook here is that recently played games will follow people across devices, including Xbox consoles, PCs and Windows handhelds. This will make it easier for folks to jump back into something, even when going from, say, an Xbox Series X to a PC.

The new “play history” section of the PC app and Xbox console UI will display cloud games as recently played titles, and this list follows people wherever they go. It includes cloud-powered game saves, so there will be no wasted time. Because this is all part of Xbox Cloud Gaming, players will be able to start a game on a console and finish on a PC, even if that title isn’t available natively on the second platform.

There’s also a new search filter in the library section for cloud games, along with a “jump back in” list on the home screen of the app. “While the large tiles highlight games you’ve recently played on your current device, the play history tile shows games you’ve played across any Xbox device, making it easy to pick up where you left off,” the company wrote in a blog post.

This is all thanks to the redesigned library feature for the Xbox app, which allows games purchased from various platforms to be launched from the same place.


CrowdStrike’s massive cyber outage one year later: lessons enterprises can learn to improve security

As we wrote in our initial analysis of the CrowdStrike incident, the July 19, 2024, outage served as a stark reminder of the importance of cyber resilience. Now, one year later, both CrowdStrike and the industry have undergone significant transformation, catalyzed by the 78 minutes that changed everything.

“The first anniversary of July 19 marks a moment that deeply impacted our customers and partners and became one of the most defining chapters in CrowdStrike’s history,” CrowdStrike’s President Mike Sentonas wrote in a blog detailing the company’s year-long journey toward enhanced resilience.

The numbers remain sobering: a faulty Channel File 291 update, deployed at 04:09 UTC and reverted just 78 minutes later, crashed 8.5 million Windows systems worldwide. Insurance estimates put losses at $5.4 billion for the top 500 U.S. companies alone, with aviation particularly hard hit: 5,078 flights were canceled globally.

Steffen Schreier, senior vice president of product and portfolio at Telesign, a Proximus Global company, captures why this incident resonates a year later: “One year later, the CrowdStrike incident isn’t just remembered, it’s impossible to forget. A routine software update, deployed with no malicious intent and rolled back in just 78 minutes, still managed to take down critical infrastructure worldwide. No breach. No attack. Just one internal failure with global consequences.”

His technical analysis reveals uncomfortable truths about modern infrastructure: “That’s the real wake-up call: even companies with strong practices, a staged rollout, fast rollback, can’t outpace the risks introduced by the very infrastructure that enables rapid, cloud-native delivery. The same velocity that empowers us to ship faster also accelerates the blast radius when something goes wrong.”

Understanding what went wrong

CrowdStrike’s root cause analysis revealed a cascade of technical failures: a mismatch between input fields in their IPC Template Type, missing runtime array bounds checks and a logic error in their Content Validator. These weren’t edge cases but fundamental quality control gaps.

Merritt Baer, incoming Chief Security Officer at Enkrypt AI and advisor to companies including Andesite, provides crucial context: “CrowdStrike’s outage was humbling; it reminded us that even really big, mature shops get processes wrong sometimes. This particular outcome was a coincidence on some level, but it should have never been possible. It demonstrated that they failed to instate some basic CI/CD protocols.”

Her assessment is direct but fair: “Had CrowdStrike rolled out the update in sandboxes and only sent it in production in increments as is best practice, it would have been less catastrophic, if at all.”

Yet Baer also recognizes CrowdStrike’s response: “CrowdStrike’s comms strategy demonstrated good executive ownership. Execs should always take ownership—it’s not the intern’s fault. If your junior operator can get it wrong, it’s my fault. It’s our fault as a company.”

Leadership’s accountability

George Kurtz, CrowdStrike’s founder and CEO, exemplified this ownership principle.
In a LinkedIn post reflecting on the anniversary, Kurtz wrote: “One year ago, we faced a moment that tested everything: our technology, our operations, and the trust others placed in us. As founder and CEO, I took that responsibility personally. I always have and always will.”

His perspective reveals how the company channeled crisis into transformation: “What defined us wasn’t that moment; it was everything that came next. From the start, our focus was clear: build an even stronger CrowdStrike, grounded in resilience, transparency, and relentless execution. Our North Star has always been our customers.”

CrowdStrike goes all-in on a new Resilient by Design framework

CrowdStrike’s response centered on its Resilient by Design framework, which Sentonas describes as going beyond “quick fixes or surface-level improvements.” The framework’s three pillars (Foundational, Adaptive and Continuous) represent a comprehensive rethinking of how security platforms should operate. Key implementations include:

- Sensor Self-Recovery: automatically detects crash loops and transitions to safe mode
- New Content Distribution System: ring-based deployment with automated safeguards (see the sketch at the end of this article)
- Enhanced Customer Control: granular update management and content pinning capabilities
- Digital Operations Center: a purpose-built facility for global infrastructure monitoring
- Falcon Super Lab: testing thousands of OS, kernel and hardware combinations

“We didn’t just add a few content configuration options,” Sentonas emphasized in his blog. “We fundamentally rethought how customers could interact with and control enterprise security platforms.”

Industry-wide supply chain awakening

The incident forced a broader reckoning about vendor dependencies. Baer frames the lesson starkly: “One huge practical lesson was just that your vendors are part of your supply chain. So, as a CISO, you should test the risk to be aware of it, but simply speaking, this issue fell on the provider side of the shared responsibility model. A customer wouldn’t have controlled it.”

CrowdStrike’s outage has permanently altered vendor evaluation: “I see effective CISOs and CSOs taking lessons from this, around the companies they want to work with and the security they receive as a product of doing business together. I will only ever work with companies that I respect from a security posture lens. They don’t need to be perfect, but I want to know that they are doing the right processes, over time.”

Sam Curry, CISO at Zscaler, added: “What happened to CrowdStrike was unfortunate, but it could have happened to many, so perhaps we don’t put the blame on them with the benefit of hindsight. What I will say is that the world has used this to refocus and has placed more attention to resilience as a result, and that’s a win for everyone, as our collective goal is to make the internet safer and more secure for all.”

The incident underscores the need for a new security paradigm

Schreier’s analysis extends beyond CrowdStrike to fundamental security architecture: “Speed at scale comes at a cost. Every routine update now carries the weight of potential systemic failure.”
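The ring-based distribution named in the list above is the load-bearing idea: an update reaches a small internal ring first and is only promoted outward while fleet health stays clean, so a bad push can never hit millions of hosts at once. Below is a minimal sketch of that pattern in Python; the ring names, sizes, threshold and helper functions are illustrative assumptions, not CrowdStrike’s actual implementation.

```python
import time

# Illustrative rings, smallest blast radius first; sizes are assumptions.
RINGS = [
    ("internal", 1_000),
    ("canary", 10_000),
    ("early_adopters", 100_000),
    ("general_availability", 8_500_000),
]
CRASH_RATE_LIMIT = 0.001  # assumed safeguard: halt if >0.1% of a ring crashes

def deploy_to_ring(ring: str, version: str) -> None:
    # Stand-in for pushing a content update to every host in the ring.
    print(f"Deploying {version} to ring '{ring}'")

def observed_crash_rate(ring: str) -> float:
    # Stand-in for aggregating fleet telemetry; always healthy in this sketch.
    return 0.0

def staged_rollout(version: str, soak_seconds: int = 1) -> bool:
    for ring, host_count in RINGS:
        deploy_to_ring(ring, version)
        time.sleep(soak_seconds)  # soak period would be hours in practice
        rate = observed_crash_rate(ring)
        if rate > CRASH_RATE_LIMIT:
            print(f"Automated safeguard: halting at '{ring}' (crash rate {rate:.2%})")
            return False  # remaining, larger rings never receive the update
        print(f"Ring '{ring}' ({host_count:,} hosts) healthy; promoting")
    return True

if __name__ == "__main__":
    staged_rollout("channel-file-update")
```

The point of the structure is that the halt decision is automated and positioned between rings, which is exactly the gap Baer identifies in the incremental-rollout quote above.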


Google DeepMind makes AI history with gold medal win at world’s toughest math competition

Google DeepMind announced Monday that an advanced version of its Gemini artificial intelligence model has officially achieved gold medal-level performance at the International Mathematical Olympiad, solving five of six exceptionally difficult problems and earning recognition as the first AI system to receive official gold-level grading from competition organizers.

The victory advances the field of AI reasoning and puts Google ahead in the intensifying battle between tech giants building next-generation artificial intelligence. More importantly, it demonstrates that AI can now tackle complex mathematical problems using natural language understanding rather than requiring specialized programming languages.

“Official results are in — Gemini achieved gold-medal level in the International Mathematical Olympiad!” Demis Hassabis, CEO of Google DeepMind, wrote on social media platform X Monday morning. “An advanced version was able to solve 5 out of 6 problems. Incredible progress.”

The International Mathematical Olympiad, held annually since 1959, is widely considered the world’s most prestigious mathematics competition for pre-university students. Each participating country sends six elite young mathematicians to compete in solving six exceptionally challenging problems spanning algebra, combinatorics, geometry, and number theory. Only about 8% of human participants typically earn gold medals.

How Google DeepMind’s Gemini Deep Think cracked math’s toughest problems

Google’s latest success far exceeds its 2024 performance, when the company’s combined AlphaProof and AlphaGeometry systems earned silver medal status by solving four of six problems. That earlier system required human experts to first translate natural language problems into domain-specific programming languages and then interpret the AI’s mathematical output.

This year’s breakthrough came through Gemini Deep Think, an enhanced reasoning system that employs what researchers call “parallel thinking.” Unlike traditional AI models that follow a single chain of reasoning, Deep Think simultaneously explores multiple possible solutions before arriving at a final answer.

“Our model operated end-to-end in natural language, producing rigorous mathematical proofs directly from the official problem descriptions,” Hassabis explained in a follow-up post on X, emphasizing that the system completed its work within the competition’s standard 4.5-hour time limit.

The model achieved 35 out of a possible 42 points, comfortably exceeding the gold medal threshold. According to IMO President Prof. Dr. Gregor Dolinar, the solutions were “astonishing in many respects” and found to be “clear, precise and most of them easy to follow” by competition graders.
OpenAI faces backlash for bypassing official competition rules

The announcement comes amid growing tension in the AI industry over competitive practices and transparency. Google DeepMind’s measured approach to releasing its results has drawn praise from the AI community, particularly in contrast to rival OpenAI’s handling of similar achievements.

“We didn’t announce on Friday because we respected the IMO Board’s original request that all AI labs share their results only after the official results had been verified by independent experts & the students had rightly received the acclamation they deserved,” Hassabis wrote, appearing to reference OpenAI’s earlier announcement of its own olympiad performance.

Social media users were quick to note the distinction. “You see? OpenAI ignored the IMO request. Shame. No class. Straight up disrespect,” wrote one user. “Google DeepMind acted with integrity, aligned with humanity.”

The criticism stems from OpenAI’s decision to announce its own mathematical olympiad results without participating in the official IMO evaluation process. Instead, OpenAI had a panel of former IMO participants grade its AI’s performance, an approach that some in the community view as lacking credibility.

“OpenAI is quite possibly the worst company on the planet right now,” wrote one critic, while others suggested the company needs to “take things seriously” and “be more credible.”

Inside the training methods that powered Gemini’s mathematical mastery

Google DeepMind’s success appears to stem from novel training techniques that go beyond traditional approaches. The team used advanced reinforcement learning methods designed to leverage multi-step reasoning, problem-solving and theorem-proving data. The model was also provided access to a curated collection of high-quality mathematical solutions and received specific guidance on approaching IMO-style problems.

The technical achievement impressed AI researchers who noted its broader implications. “Not just solving math… but understanding language-described problems and applying abstract logic to novel cases,” wrote AI observer Elyss Wren. “This isn’t rote memory — this is emergent cognition in motion.”

Ethan Mollick, a professor at the Wharton School who studies AI, emphasized the significance of using a general-purpose model rather than specialized tools. “Increasing evidence of the ability of LLMs to generalize to novel problem solving,” he wrote, highlighting how this differs from previous approaches that required specialized mathematical software.

The model demonstrated particularly impressive reasoning in one problem where many human competitors applied graduate-level mathematical concepts.


Chinese startup Manus challenges ChatGPT in data visualization: which should enterprises use?

The promise sounds almost too good to be true: drop a messy comma-separated values (CSV) file into an AI agent, wait two minutes, and get back a polished, interactive chart ready for your next board presentation. But that’s exactly what Chinese startup Manus.im is delivering with its latest data visualization feature, launched this month.

Unfortunately, my initial hands-on testing with corrupted datasets reveals a fundamental enterprise problem: impressive capabilities paired with insufficient transparency about data transformations. While Manus handles messy data better than ChatGPT, neither tool is yet ready for boardroom-ready slides.

Rossum’s survey of 470 finance leaders found 58% still rely primarily on Excel for monthly KPIs, despite owning BI licenses. Another TechRadar study estimates that overall spreadsheet dependence affects roughly 90% of organizations — creating a “last-mile data problem” between governed warehouses and hasty CSV exports that land in analysts’ inboxes hours before critical meetings.

Manus targets this exact gap. Upload your CSV, describe what you want in natural language, and the agent automatically cleans the data, selects the appropriate Vega-Lite grammar and returns a PNG chart ready for export — no pivot tables required.

Where Manus beats ChatGPT: 4x slower but more accurate with messy data

I tested both Manus and ChatGPT’s Advanced Data Analysis using three datasets (113k-row ecommerce orders, 200k-row marketing funnel and 10k-row SaaS MRR), first clean, then corrupted with 5% error injection including nulls, mixed-format dates and duplicates (a sketch of this corruption script appears at the end of this article). For example, testing the same prompt — “Show me a month-by-month revenue trend for the past year and highlight any unusual spikes or dips” — across clean and corrupted 113k-row e-commerce data revealed some stark differences.

| Tool | Data quality | Time | Cleans nulls | Parses dates | Handles duplicates | Comments |
|---|---|---|---|---|---|---|
| Manus | Clean | 1:46 | N/A | ✓ | N/A | Correct trend, standard presentation, but incorrect numbers |
| Manus | Messy | 3:53 | ✓ | ✓ | ✗ | Correct trend despite inaccurate data |
| ChatGPT | Clean | 0:57 | N/A | ✓ | N/A | Fast, but incorrect visualization |
| ChatGPT | Messy | 0:59 | ✗ | ✗ | ✗ | Incorrect trend from unclean data |

For context: DeepSeek could only handle 1% of the file size, while Claude and Grok took over 5 minutes each but produced interactive charts without PNG export options.

Figure 1-2: Chart outputs from the same revenue trend prompt on messy e-commerce data. Manus (bottom) produces a coherent trend despite data corruption, while ChatGPT (top) shows distorted patterns from unclean date formatting.

Manus behaves like a cautious junior analyst — automatically tidying data before charting, successfully parsing date inconsistencies and handling nulls without explicit instructions. When I requested the same revenue trend analysis on corrupted data, Manus took nearly 4 minutes but produced a coherent visualization despite the data quality issues.

ChatGPT operates like a speed coder — prioritizing fast output over data hygiene.
The same request took just 59 seconds but produced misleading visualizations because it didn’t automatically clean formatting inconsistencies.

However, both tools failed in terms of “executive readiness.” Neither produced board-ready axis scaling or readable labels without follow-up prompts. Data labels were frequently overlapping or too small, bar charts lacked proper gridlines and number formatting was inconsistent.

The transparency crisis enterprises can’t ignore

Here’s where Manus becomes problematic for enterprise adoption: the agent never surfaces the cleaning steps it applies. An auditor reviewing the final chart has no way to confirm whether outliers were dropped, imputed or transformed. When a CFO presents quarterly results based on a Manus-generated chart, what happens when someone asks, “How did you handle the duplicate transactions from the Q2 system integration?” The answer is silence.

ChatGPT, Claude and Grok all show their Python code, though transparency through code review isn’t scalable for business users lacking programming experience. What enterprises need is a simpler audit trail, which builds trust.

Warehouse-native AI is racing ahead

While Manus focuses on CSV uploads, major platforms are building chart generation directly into enterprise data infrastructure:

- Google’s Gemini in BigQuery became generally available in August 2024, enabling the generation of SQL queries and inline visualizations on live tables while respecting row-level security.
- Microsoft’s Copilot in Fabric reached general availability in the Power BI experience in May 2024, creating visuals inside Fabric notebooks while working directly with Lakehouse datasets.
- GoodData’s AI Assistant, launched in June 2025, operates within customer environments and respects existing semantic models, allowing users to ask questions in plain language while receiving answers that align with predefined metrics and business terms.

These warehouse-native solutions eliminate CSV exports entirely, preserve complete data lineage and leverage existing security models — advantages file-upload tools like Manus struggle to match.

Critical gaps for enterprise adoption

My testing revealed several blockers:

- Live data connectivity remains absent. Manus supports file uploads only, with no Snowflake, BigQuery or S3 connectors. Manus.im says connectors are “on the roadmap” but offers no timeline.
- Audit trail transparency is completely missing. Enterprise data teams need transformation logs showing exactly how the AI cleaned their data and whether its interpretation of the fields is correct.
- Export flexibility is limited to PNG outputs. While adequate for quick slide decks, enterprises need customizable, interactive export options.

The verdict: impressive tech, premature for enterprise use cases

For SMB executives drowning in ad-hoc CSV analysis, Manus’s drag-and-drop visualization seems to be doing the job. The autonomous data cleaning handles real-world messiness that would otherwise require manual preprocessing, cutting turnaround from hours to minutes when you have reasonably complete data. Additionally, it offers a significant runtime advantage over Excel or Google Sheets, which require manual pivots and incur substantial load times due to local compute limitations.

But regulated enterprises with governed data lakes should wait.
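For readers who want to reproduce this kind of stress test, the 5% error injection described in the methodology above is easy to approximate. A minimal sketch, assuming pandas and a hypothetical orders.csv with order_date and revenue columns (illustrative names; the article’s actual datasets are not public):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
df = pd.read_csv("orders.csv", parse_dates=["order_date"])
n = len(df)
k = int(n * 0.05)  # corrupt roughly 5% of rows per error type

# 1. Nulls: blank out a random sample of revenue values.
df.loc[rng.choice(n, size=k, replace=False), "revenue"] = np.nan

# 2. Mixed-format dates: re-serialize a random sample as DD/MM/YYYY strings
#    while the rest stay ISO-formatted.
df["order_date"] = df["order_date"].dt.strftime("%Y-%m-%d")
idx = rng.choice(n, size=k, replace=False)
df.loc[idx, "order_date"] = (
    pd.to_datetime(df.loc[idx, "order_date"]).dt.strftime("%d/%m/%Y")
)

# 3. Duplicates: append copies of randomly chosen rows.
df = pd.concat([df, df.sample(n=k, random_state=42)], ignore_index=True)

df.to_csv("orders_messy.csv", index=False)
```

Feeding both the clean and the corrupted file through the same prompt is what separates tools that silently clean (Manus) from tools that chart whatever they are given (ChatGPT).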


A ChatGPT ‘router’ that automatically selects the right OpenAI model for your job appears imminent

In the 2.5 years since OpenAI debuted ChatGPT, the number of large language models (LLMs) that the company has made available to power its hit chatbot has steadily grown. In fact, there are now a total of 7 (!!!) different AI models that paying ChatGPT subscribers (on the $20 Plus tier and more expensive tiers) can choose between when interacting with the trusty chatbot — each with its own strengths and weaknesses. But how should a user decide which one to use for their particular prompt, question, or task? After all, you can only pick one at a time.

Is help on the way?

Help appears to be on the way imminently from OpenAI, as reports emerged over the last few days on X from AI influencers, including OpenAI’s own researcher “Roon” (@tszzl on X, speculated to be technical team member Tarun Gogineni), of a new “router” function that will automatically select the best OpenAI model to respond to the user’s input on the fly, depending on the specific input’s content.

As Roon posted on X yesterday, July 20, 2025, in a since-deleted response to influencer Lisan al Gaib’s statement that they “don’t want a model router I want to be able to select the models I use”: “You’ll still be able to select. This is a product to make sure that doctors aren’t stuck on 4o-mini.”

Similarly, Yuchen Jin, co-founder and CTO of AI inference cloud provider Hyperbolic Labs, wrote in an X post on July 19: “Heard GPT-5 is imminent, from a little bird. It’s not one model, but multiple models. It has a router that switches between reasoning, non-reasoning, and tool-using models. That’s why Sam said they’d ‘fix model naming’: prompts will just auto-route to the right model. GPT-6 is in training. I just hope they’re not delaying it for more safety tests. 🙂”

While a presumably far more advanced GPT-5 model would (and will) be huge news if and when released, the router may make life much easier and more intelligent for the average ChatGPT subscriber. It would also follow on the heels of third-party products such as the web-based Token Monster chatbot, which automatically selects and combines responses from multiple third-party LLMs to respond to user queries. Asked about the router idea and comments from Roon, an OpenAI spokesperson declined to provide a response or further information at this time.

Solving the overabundance-of-choice problem

To be clear, every time OpenAI has released a new LLM to the public, it has diligently shared in a blog post, release notes or both what it thinks that particular model is good for and designed to help with. For example, OpenAI’s “o” series reasoning models — o3, o4-mini and o4-mini-high — have performed better on math, science and coding benchmarks, while non-reasoning models like the new GPT-4.5 and 4.1 seem to do better at creative writing and communications tasks. Dedicated AI influencers and power users may understand very well what all these different models are good and not so good at.
But regular users who don’t follow the industry as closely, and who don’t have the time or money to test them all on the same prompts and compare outputs, will understandably struggle to make sense of the bewildering array of options. That could mean they’re missing out on smarter, more capable responses from ChatGPT for the task at hand. And in fields like medicine, as Roon alluded to, the difference could be one of life or death.

It’s also interesting to speculate on how an automatic LLM router might change public perceptions toward, and adoption of, AI more broadly. ChatGPT already counted 500 million active users as of March. If more of these people were automatically guided toward more intelligent and capable LLMs to handle their AI queries, the impact of AI on their workloads and on the entire global economy would likely be felt far more acutely, creating a positive snowball effect: as more people saw more gains from ChatGPT automatically choosing the right AI model for their queries, and as more enterprises reaped greater efficiency from this process, more individuals and organizations would be convinced of the utility of AI, be more willing to pay for it, and spread even more AI-powered workflows out into the world.

Right now, this is all presumably being held back a little by the fact that the ChatGPT model picker requires the user to (a) know they even have a choice of models and (b) have some level of informed awareness of what these models are good for. It’s all still a manually driven process. Like a shopper staring at supermarket aisles of cereals and sauces, the average ChatGPT user is currently faced with an overabundance of choice. Hopefully any hypothetical OpenAI router seamlessly helps direct them to the right model for their needs, when they need it — like a trusty shopkeeper showing up to free you from your product paralysis.
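To make the routing idea concrete, here is a sketch of a classify-then-dispatch pattern such a router might use. Everything in it is an assumption for illustration — the keyword classifier, the routing table and the choice of target model names — since OpenAI has not published its design:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical routing table; model choices here are placeholders.
ROUTES = {
    "reasoning": "o3",          # math, science, code
    "creative": "gpt-4.5-preview",  # writing and communications
    "default": "gpt-4o-mini",   # cheap, fast fallback
}

def classify_intent(prompt: str) -> str:
    """Crude keyword classifier standing in for a learned router model."""
    p = prompt.lower()
    if any(w in p for w in ("prove", "debug", "algorithm", "integral", "diagnose")):
        return "reasoning"
    if any(w in p for w in ("poem", "story", "slogan", "email draft")):
        return "creative"
    return "default"

def routed_completion(prompt: str) -> str:
    model = ROUTES[classify_intent(prompt)]
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(routed_completion("Debug this algorithm for finding primes"))
```

A production router would presumably replace the keyword heuristic with a small, fast classifier model, but the dispatch shape — classify, look up, forward — would look much the same.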


How to Migrate from OpenAI to Cerebrium for Cost-Predictable AI Inference

If you’re building an AI application, you probably started with OpenAI’s convenient APIs. However, as your application scales, you’ll need more control over costs, models, and infrastructure. Cerebrium is a serverless AI infrastructure platform that lets you run open-source models on dedicated hardware with predictable, time-based pricing instead of token-based billing.

This guide will show you how to build a complete chat application with OpenAI, migrate it to Cerebrium by changing just two lines of code, and add performance and cost tracking to compare the two approaches to AI inference using real data. When you’re done, you’ll have a working chat application that demonstrates the practical differences between token-based and compute-based pricing models, and the insights you need to choose the right approach for your use case.

Prerequisites

To follow along with this guide, you’ll need Python 3.10 or higher installed on your system. You’ll also need the following (all free):

- OpenAI API key.
- Cerebrium account (includes free tier access to test GPU instances up to A10 level).
- Hugging Face token (free account required).
- Llama 3.1 model access on Hugging Face. Visit meta-llama/Meta-Llama-3.1-8B-Instruct and click “Request access” to get approval from Meta (typically takes a few minutes to a few hours).

Familiarity with Python and API calls is helpful, but we’ll walk through each step in detail.

Creating an OpenAI Chatbot

We’ll build a complete chat application that works with OpenAI as our foundation and enhance it throughout the tutorial without ever needing to modify the core chat logic.

Create a new directory for the project and set up the basic structure:

```
mkdir openai-cerebrium-migration
cd openai-cerebrium-migration
```

Install the dependencies:

```
pip install openai==1.55.0 python-dotenv==1.0.0 art==6.1 colorama==0.4.6
```

Create a .env file to store API credentials:

```
OPENAI_API_KEY=your_openai_api_key_here
CEREBRIUM_API_KEY=your_cerebrium_api_key_here
CEREBRIUM_ENDPOINT_URL=your_cerebrium_endpoint_url_here
```

Replace your_openai_api_key_here with your actual OpenAI API key.

Now we’ll build the chat.py file step by step. Start by creating the file and adding the imports:

```python
import os
import time
from dotenv import load_dotenv
from openai import OpenAI
from art import text2art
from colorama import init, Fore, Style
```

These imports handle environment variables, OpenAI client creation, ASCII art generation, and colored terminal output. Add the initialization below the imports:

```python
load_dotenv()
init(autoreset=True)
```

Add this display_intro function:

```python
def display_intro(use_cerebrium, endpoint_name):
    print("\n")
    if use_cerebrium:
        ascii_art = text2art("Cerebrium", font="tarty1")
        print(f"{Fore.MAGENTA}{ascii_art}{Style.RESET_ALL}")
    else:
        ascii_art = text2art("OpenAI", font="tarty1")
        print(f"{Fore.WHITE}{ascii_art}{Style.RESET_ALL}")
    print(f"Connected to: {Fore.CYAN}{endpoint_name}{Style.RESET_ALL}")
    print("\nType 'quit' or 'exit' to end the chat\n")
```

This function provides visual feedback when we switch between endpoints.
Add the main function that handles the chat logic:

```python
def main():
    # OpenAI endpoint
    client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
    model = "gpt-4o-mini"
    endpoint_name = "OpenAI (GPT-4o-mini)"
    use_cerebrium = False

    display_intro(use_cerebrium, endpoint_name)
    conversation = []

    while True:
        user_input = input("You: ").strip()
        if user_input.lower() in ['quit', 'exit', 'bye']:
            print("Goodbye!")
            break
        if not user_input:
            continue
        conversation.append({"role": "user", "content": user_input})
```

This function sets up the endpoint configuration and handles the basic chat loop. Add the response handling logic inside the main function’s while loop:

```python
        try:
            print("Bot: ", end="", flush=True)
            chat_completion = client.chat.completions.create(
                messages=conversation,
                model=model,
                stream=True,
                stream_options={"include_usage": True},
                temperature=0.7
            )
            bot_response = ""
            for chunk in chat_completion:
                if chunk.choices[0].delta.content:
                    content = chunk.choices[0].delta.content
                    print(content, end="", flush=True)
                    bot_response += content
            print()
            conversation.append({"role": "assistant", "content": bot_response})
        except Exception as e:
            print(f"❌ Error: {e}")
            conversation.pop()
```

Finally, add the script execution guard at the end of the file:

```python
if __name__ == "__main__":
    main()
```

Test the chatbot by running:

```
python chat.py
```

You’ll see the OpenAI ASCII art, and you can start chatting with GPT-4o mini. Ask a question to verify that the app works correctly. Responses will stream in real time.

Deploying a Cerebrium Endpoint With vLLM and Llama 3.1

Now we’ll create a Cerebrium endpoint that serves the same OpenAI-compatible interface using vLLM and an open-source model. When we’re done, we’ll be able to switch to a self-hosted open-source model endpoint by changing just two lines of code (a sketch of that switch appears at the end of this section).

Configuring Hugging Face Access for Llama 3.1

First, make sure you have access to the Llama 3.1 model on Hugging Face. If you haven’t already requested access, visit meta-llama/Meta-Llama-3.1-8B-Instruct and click “Request access”.

Next, create a Hugging Face token by going to Hugging Face settings, clicking “New token”, and selecting “Read” permissions.

Add your Hugging Face token to your Cerebrium project secrets. Go to your Cerebrium dashboard, select your project, and add HF_AUTH_TOKEN with your Hugging Face token as the value.

Setting Up a Cerebrium Account and API Access

Create a free Cerebrium account and navigate to your dashboard. In the “API Keys” section, copy your session token and save it for later — you’ll need it to authenticate with the deployed endpoint. Add the session token to the .env file as the CEREBRIUM_API_KEY variable:

```
OPENAI_API_KEY=your_openai_api_key_here
CEREBRIUM_API_KEY=your_cerebrium_api_key_here
CEREBRIUM_ENDPOINT_URL=your_cerebrium_endpoint_url_here
```

Building the OpenAI-Compatible vLLM Endpoint

Start by installing the Cerebrium CLI and creating a new project:

```
pip install cerebrium
cerebrium login
cerebrium init openai-compatible-endpoint
cd openai-compatible-endpoint
```

We’ll build the main.py file step by step to understand each component. Start with the imports and authentication:

```python
from vllm import SamplingParams, AsyncLLMEngine
from vllm.engine.arg_utils import AsyncEngineArgs
from pydantic import BaseModel
from typing import Any, List, Optional, Union, Dict
import time
import json
import os
from huggingface_hub import login

login(token=os.environ.get("HF_AUTH_TOKEN"))
```

These imports provide the vLLM async engine for model inference, Pydantic models for data validation, and Hugging Face authentication for model access.
Add the vLLM engine configuration:

```python
engine_args = AsyncEngineArgs(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    gpu_memory_utilization=0.9,  # set GPU memory utilization
    max_model_len=8192,          # set max model length
)
engine = AsyncLLMEngine.from_engine_args(engine_args)
```

This configuration uses 90% of available GPU memory and sets an 8K-token context window, optimizing for throughput while maintaining reasonable memory usage.

Now add the Pydantic models that define the OpenAI-compatible response format:

```python
class Message(BaseModel):
    role: str
    content: str

class ChoiceDelta(BaseModel):
    content: Optional[str] = None
    function_call: Optional[Any] = None
    refusal: Optional[Any] = None
    role: Optional[str] = None
    tool_calls: Optional[Any] = None

class Choice(BaseModel):
    delta: ChoiceDelta
    finish_reason: Optional[str] = None
    index: int
    logprobs: Optional[Any] = None

class Usage(BaseModel):
    completion_tokens: int = 0
    prompt_tokens: int = 0
    total_tokens: int = 0

class ChatCompletionResponse(BaseModel):
    id: str
    object: str
    created: int
    model: str
    choices: List[Choice]
    service_tier: Optional[str] = "default"
    system_fingerprint: Optional[str] = "fp_cerebrium_vllm"
    usage: Optional[Usage] = None
```

These models ensure the endpoint’s streaming responses match the shape the OpenAI client library expects.
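For reference, the two-line client-side switch promised at the start of this section ends up looking like the sketch below. It is not the tutorial’s exact code: it assumes the CEREBRIUM_API_KEY and CEREBRIUM_ENDPOINT_URL values from the .env file, and that the deployed endpoint exposes an OpenAI-compatible chat completions route at that base URL:

```python
# In chat.py, only the client and model change; the chat loop stays identical.
client = OpenAI(
    api_key=os.getenv("CEREBRIUM_API_KEY"),        # was OPENAI_API_KEY
    base_url=os.getenv("CEREBRIUM_ENDPOINT_URL"),  # assumption: OpenAI-compatible base URL
)
model = "meta-llama/Meta-Llama-3.1-8B-Instruct"    # was "gpt-4o-mini"
```

Because the OpenAI SDK lets you override base_url, every downstream call — streaming, usage reporting, error handling — continues to work against the self-hosted endpoint unchanged.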


Kapa.ai (YC S23) is hiring software engineers (EU remote)

Create enterprise-grade AI assistants from your content

Software Engineer (Full-stack)
$100K – $150K / 0.10% – 0.30%
Location: Remote (GB, EG, RU, UA, TR, FR, IT, ES, PL, RO, KZ, NL, BE, SE, CZ, GR, PT, HU, AT, CH, BG, DK, FI, NO, SK, LT, EE, DE)
Visa: US citizenship/visa not required

About the role

As a software engineer you will work across the stack on the Kapa systems that answer thousands of developer questions per day. Check out Docker’s documentation for a live example of what kapa is.

In this role, you will:

- Work directly with the founding team and our research engineers.
- Scale the infrastructure that powers the Kapa RAG engine (Python).
- Experiment with new features in the Kapa analytics platform (React + Python).
- Work on the client integrations used to deploy Kapa for our customers (React + Python).
- Give Kapa access to new kinds of data (Python).
- Maintain our React SDK.

You may be a good fit if you have:

- A degree in computer science, machine learning, mathematics, statistics or a related field.
- 3+ years of software engineering experience working on complex systems in both backend and frontend.
- An affinity for machine learning, deep learning (including LLMs) and natural language processing.
- The ability to work effectively in a fast-paced environment where things are sometimes loosely defined.

This is neither an exhaustive nor a necessary set of attributes. Even if none of these apply to you, but you believe you will contribute to kapa.ai, please reach out.

About kapa.ai

kapa.ai makes it easy for technical companies to build AI support and onboarding bots for their users. Teams at 150+ leading startups and enterprises, including OpenAI, Mixpanel, Mapbox, Docker, Next.js and Prisma, use kapa to level up their developer experience and reduce support load. We enable companies to use their existing technical knowledge sources, including docs, tutorials, chat logs and GitHub issues, to generate AI bots that answer developer questions automatically. More than 750k developers have access to kapa.ai via website widgets, Slack/Discord bots, API integrations, or via Zendesk.

We’ve been fortunate to be funded by some of the greatest AI investors in Silicon Valley: Initialized Capital (Garry Tan, Alexis Ohanian), Y Combinator, Amjad Masad and Michele Catasta (Replit), and Douwe Kiela (RAG paper author and founder of Contextual AI), plus angels at OpenAI.

Founded: 2023 | Batch: S23 | Team size: 14 | Status: Active


Complete silence is always hallucinated as “ترجمة نانسي قنقر” in Arabic

(The Arabic phrase in the title, “ترجمة نانسي قنقر”, is a subtitler credit meaning “Translation by Nancy Qunqar.”)

VAD, probably. I’ve only tried the turbo one, but what I can say is that v3 is different from the earlier models. It looks like it doesn’t have the audio descriptions to fall back on and produces hallucinations instead. The earlier models will also produce some miscellaneous crap when they encounter silence (they do this regardless of language), but there are more options for how to deal with that. For example, these things can be effective for the small model (but not for v3):

- the suppress_tokens trick
- setting the initial prompt to something like “.”
- adjusting logprob_threshold to -0.4 (works for this empty audio, probably not good for general use)

Is there any good Arabic model you guys have found that’s better than large-v3? @misutoneko @puthre

Reply: Voxtral was released a few days ago and looks promising.

I found a similar thing happens in German, where it says “Untertitelung des ZDF für funk, 2017” (“ZDF subtitling for funk, 2017”). For both German and Arabic I found that this pretty much only happens at the very end of videos / when there is sustained silence.

Essentially this seems to be an artifact of the fact that Whisper was trained on (amongst other things) YouTube audio plus available subtitles. Often subtitlers add their copyright notice at the end of the subtitles, and the ends of videos are often credits with music, applause, or silence. Thus Whisper learned that silence == “copyright notice”. See some research on the Norwegian example here: https://medium.com/@lehandreassen/who-is-nicolai-winther-985409568201

In English there is always applause.

This also happens when you don’t speak into voice mode; the transcript usually results in the same Arabic phrase.

I’ve also seen this happen a lot in English with Skyeye. It also happens a lot with hallucinations saying stuff like “This is the end of the video, remember to like and subscribe.”

I have built https://arabicworksheet.com for Arabic learning, from absolute beginners to professional speakers. It creates dynamic exercises and worksheets based on your level and topics. Behind the scenes I have used Gemini 2.5 Pro and GPT-4o for the overall agentic workflows.

Reply: Ok? This doesn’t have anything to do with the topic of this discussion.

In German it’s “Vielen Dank” (“Thank you very much”).
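If you want to try the mitigations from the first comment, here is a minimal sketch using the open-source openai-whisper package. The file path is a placeholder, and the threshold values are the ones suggested in the thread, not universal defaults:

```python
import whisper

# The thread reports these tricks help the small model, not large-v3.
model = whisper.load_model("small")

result = model.transcribe(
    "possibly_silent_clip.wav",        # placeholder path
    initial_prompt=".",                # nudge the decoder away from stock credits
    logprob_threshold=-0.4,            # stricter than the -1.0 default; tune per use case
    no_speech_threshold=0.6,           # default; raise it to drop more silent segments
    condition_on_previous_text=False,  # reduces carry-over hallucinations
)
print(result["text"])
```

Running a silence-only clip through this with and without the initial_prompt and logprob_threshold overrides is a quick way to see whether the “ترجمة نانسي قنقر”-style credit hallucination disappears for your audio.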

