
Tensordyne's 3nm Napier AI Chip Promises 13x Higher Token Throughput Than Blackwell & Blazes Past Rubin With 1000 Tokens/s In Multi-Trillion Parameter Models

The Hot Take: I really hope something comes soon to alleviate all this nonsense AI is causing.

US-based AI company Tensordyne has announced the successful tape-out of its Napier chip, which it claims will demolish NVIDIA's Blackwell and Rubin chips with leading token throughput and efficiency. Tensordyne’s new Napier AI chip arrives with one clear mission: to make NVIDIA’s Blackwell and Rubin chips look considerably less impressive. The Napier chip will be the core component of the Tensordyne Napier TDN system, which is designed in collaboration with Broadcom and HPE Juniper Networks. The Napier platform has one goal: to unify AI through novel logarithmic AI math, a tightly integrated memory architecture, and a high-performance scale-up interconnect that […]

Read full article at https://wccftech.com/tensordyne-3nm-napier-ai-chip-13x-higher-token-throughput-blackwell-blazes-past-rubin/

Read the full article

AWS Graviton5 Debuts with 192 Arm Cores and PCIe 6.0

The Hot Take: Arm seems to be breaking out everywhere: Fujitsu, Nvidia, AWS and Arm itself. Qualcomm seems to be playing catch-up in the server market, from the looks of it.

AWS has provided a first look at its next-generation Graviton5 processor, a custom server CPU developed by Annapurna Labs for deployment across the company's cloud computing platform and AI inference infrastructure.

Read the full article

Microsoft is killing the Copilot+ PC advantage, brings Windows 11’s local AI to RTX 30+ PCs with 6GB vRAM

The Hot Take: Now we know why M$ is trying to squeeze out every ounce of performance in Windows 11.....

Microsoft says you’ll be able to run Windows 11’s local Language Model APIs on non-Copilot+ PCs as long as you meet the new hardware requirement: an RTX 30+ GPU with 6GB of VRAM. It’s a major change, as it means Copilot+ PCs’ advantages are getting “thin,” and I wouldn’t be surprised if Microsoft drops the NPU requirement entirely in the future.

Copilot+ PCs officially debuted on June 18, 2024, and they’ve been driving sales for PC makers. However, it’s not because of the “Copilot” or “NPU” factor. It’s largely because newer PCs are now sold as “Copilot+ PCs,” so even a regular laptop purchase gets counted as proof that AI PCs are taking off. For a PC to meet the “Copilot+ PC” requirement, it would need to have 16GB of RAM, an SSD, and at least a 40 TOPS NPU. For those unaware, an NPU (Neural Processing Unit) is a chip designed to run AI models, specializing in efficiency rather than raw power. On the other hand, a GPU is a heavy-duty processor designed for massive parallel tasks.

What is a “Copilot+ PC?”

Microsoft sold you Copilot+ PCs as the only way to run local AI, but that was never…

Read the full article

Chinese military has been acquiring Nvidia chips, even post-Washington export controls, research claims — multiple institutions linked to the PLA asked for Nvidia AI chips, according to publicly available documents

The Hot Take: Tell me something I didn't know already. Why else would the GPU market go crazy price-wise?

A business-intelligence researcher said that the Chinese military has been actively acquiring Nvidia AI chips, even after the U.S. put export controls on them. Public documents show that some institutions ask for these chips either through the specifications they demand or by directly asking for Nvidia chips by name.

Read the full article

Intel details long-awaited Crescent Island AI GPU at Computex, boasts up to 480 GB of LPDDR5X to combat memory shortages — company shares more details of its Xe3P inference accelerator at Computex

The Hot Take: Intel is moving fast to make up lost ground on this front, for sure. From the looks of it, they're trying to hit the price sweet spot too.

Intel revealed more details of its next-gen Data Center GPU, code-named Crescent Island, at Computex 2026. This inference-optimized chip will feature up to 480GB of LPDDR5X memory for efficient handling of massive AI contexts.

Read the full article

NVIDIA Loses Ground With AI Engineers as Cooling and Power Costs Push Hyperscalers Toward Custom ASICs, Evercore Warns

The Hot Take: When these start getting traction we'll get GPUs to drop in price.....

While AI GPU giant NVIDIA's chips are widely believed to offer superior total cost of ownership (TCO) compared to custom AI chip alternatives, analysts from Evercore ISI believe that AI engineers are unimpressed by them. NVIDIA CEO Jensen Huang has defended his firm's AI chip price points on multiple occasions by claiming that they offer better performance efficiency compared to peers. However, according to the Evercore report, AI engineers are also focused on other metrics, such as the cost of cooling the chips, when deciding which products to use.

Power Consumption & Cooling Are Important For NVIDIA's AI Chip Costs, […]

Read full article at https://wccftech.com/nvidia-loses-ground-with-ai-engineers-as-cooling-and-power-costs-push-hyperscalers-toward-custom-asics-evercore-warns/

Read the full article

'Changing of the Guard'? AMD, Intel, and Micron Soar While Nvidia Lags

The Hot Take: AMD seems to be outperforming Intel and Nvidia on the market, while Nvidia is still the preferred AI holy grail? Just seems odd.

While Nvidia has dominated the "infrastructure boom" since 2022's launch of ChatGPT and "the generative AI craze," CNBC writes that "This week offered the starkest illustration yet of what Mizuho analyst Jordan Klein said could be a 'changing of the guard in AI.'" Chipmakers Advanced Micro Devices and Intel notched gains of about 25%, while memory maker Micron jumped more than 37% and fiber-optic cable maker Corning climbed about 18%. All four of those companies have more than doubled in value this year, with Intel leading the way, up well over 200%. Nvidia, meanwhile, is only slightly ahead of the Nasdaq in 2026, gaining 15% for the year, aided by an 8% rally this week. In spreading the wealth to a wider swath of hardware companies, investors are clearly betting that the bull market in AI has long legs and that data centers are going to need a wider array of advanced components for years to come. Memory has been the biggest theme of late due to a global shortage that's driven up prices and turned Micron, a 47-year-old company tucked in a sleepy corner of the semiconductor market, into one of the hottest trades over the past 12 months. Micron blew past an $800 billion market capitalization for the first time this week, and the stock is now up over 750% in the past year. CEO Sanjay Mehrotra told CNBC in March that key customers are only getting "50% to two-thirds of their requirements" because of supply issues. The memory market is largely dominated by Micron, along with Korea-based Samsung and SK Hynix, which are also both in the midst of historic rallies... Bank of America estimates the data center CPU market could more than double from $27 billion in 2025 to $60 billion in 2030. AMD's quarterly results this week underscored the emerging trend, as earnings, revenue and guidance sailed past estimates on strong data center growth.
The company has long led the CPU charge, and CEO Lisa Su said on the earnings call that AMD now expects 35% growth over the next three to five years in the server CPU market, up from a forecast of 18% growth that the company provided in November. The article cites two other big movers: Intel "is in the midst of a revival sparked by a major investment from the U.S. government last year. Intel's stock had its best month on record in April, more than doubling, and has continued notching massive gains, rising 33% in the early days of May." Nvidia still remains the world's most valuable company "and is expected to show revenue growth of 70% this fiscal year," the article points out — adding that companies like Corning are also benefiting from Nvidia partnerships. "Glass maker Corning, which celebrated its 175th anniversary this week, signed a massive deal with Nvidia on Wednesday that involves the development of three new U.S. factories dedicated entirely to optical technologies... likely a major step in Nvidia's move away from copper cables and towards fiber-optic cables as it builds out its rack-scale systems." Read more of this story at Slashdot.
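
As a rough sanity check on the Bank of America figures quoted above ($27 billion in 2025 to $60 billion in 2030), the implied compound annual growth rate can be worked out in a few lines. This is only a sketch: the function name `implied_cagr` is made up for illustration, and it assumes smooth compounding over the five years, which the source does not claim.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the constant annual growth rate that turns start_value into end_value."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the excerpt: $27B (2025) to $60B (2030), i.e. five years.
# Smooth year-over-year compounding is an assumption, not something the source states.
rate = implied_cagr(27.0, 60.0, 2030 - 2025)
print(f"Implied annual growth: {rate:.1%}")
```

Under those assumptions, "more than double" by 2030 works out to roughly 17% growth a year.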

Read the full article

Claude hitches ride on SpaceX's datacenter capacity

The Hot Take: AI usage is growing steadily and fast, it would appear.

Anthropic is partnering with SpaceX to ease capacity constraints that have stranded Claude customers, a gesture that may soothe developer discontent about service availability and cost. Ami Vora, chief product officer at Anthropic, announced the expanded rate limits during Code for Claude, a developer event livestreamed from San Francisco. "As of today, we are increasing rate limits for developers on Claude Code and the Claude Platform," said Vora. "More specifically, we are doubling Claude Code's five-hour rate limits for Pro, Max, Team, and seat-based enterprise plans. And we're raising our API limits considerably for Claude Opus." Anthropic is also ending its peak hours limit reduction on Claude Code for Pro and Max accounts. The AI biz is able to do this, she explained, thanks to a partnership with SpaceX that expands available inference capacity. Anthropic has struck a deal to use "all the capacity of [SpaceX’s] Colossus 1 data center." According to SpaceX, "Colossus 1 features over 220,000 Nvidia GPUs, including dense deployments of H100, H200, and next-generation GB200 accelerators." The deal adds more than 300 megawatts of new capacity within the month and follows similar compute arrangements with Amazon and Google/Broadcom. The company's insatiable hunger for processing power may even take it into space. Anthropic says that it "expressed interest in partnering with SpaceX to develop multiple gigawatts of orbital AI compute capacity." In recent months, Anthropic has struggled to meet unexpected demand for Claude services – its models became sufficiently capable to win over skeptical developers and usage patterns shifted as a result of the popularity of OpenClaw's long-running agents. "Year over year, API volume is up nearly 17x on the cloud platform," said Vora. "And on Claude Code, the average developer is now spending 20 hours per week running Claude." Amid this growing popularity, Anthropic has also wrestled with bugs that affected model performance. 
During her presentation, Vora tempered expectations by noting that no new model would be announced. Instead, she presided over a review of new and recent Claude features in an effort to frame model improvements as exponential. The salient exponent here would be two – the doubling of Claude's five-hour rate limits. Model performance, as measured by benchmarks, has been incremental. Opus 4.7 is a few percentage points better than Opus 4.6 in various measurements, not twice as capable or more. That didn't stop Vora from claiming, "even though model capabilities are improving on an exponential, most organizations are still adopting AI on a linear path." Vora's use of "exponential" may be more of a thematic framing device than a literal assertion of progress, a device to draw a contrast between Claude's capabilities and a more cautious pace of corporate AI adoption. She cast the upcoming feature review as an opportunity for customers to see where Claude development is headed, "So you can plan for it and ride the exponential with us." The remainder of the presentation consisted of a summary of recent Claude feature improvements. These include: multi-agent orchestration, outcomes, and dreaming – a capability that showed up in the recent Claude Code source leak. "With Dreaming," explained Angela Jiang, head of product for the Claude platform, "Claude is actually able to self-learn. It's able to actually inspect over its previous sessions, figure out skills that it missed, lessons it should have learned, and actually apply those directly to memory on its own." Boris Cherny, head of Claude Code, took a turn on stage to remind everyone about Routines, a way to trigger and run Claude jobs locally or on cloud servers. "Routines can be run on a schedule, they can be kicked off by webhooks, or they can even be kicked off by arbitrary API calls, you can run them locally on your machine or on remote cloud compute," he said. 
Cherny said, "for me personally, a lot of my code nowadays is written by routines. I'm not the one doing the prompting. I'm the one creating a routine that does the prompting." Who wouldn't want to "ride the exponential" when one's company is paying the API bill?

Read the full article

Rambus Bets on Time Division Multiplexing to Fix PCIe 7.0 for AI Workloads, As GPUs Starve for Data

The Hot Take: AI and GPUs need that bandwidth.

Rambus wants to address ongoing AI bandwidth problems with its new PCIe 7.0 Switch IP that features Time Division Multiplexing.

Rambus Introduces PCIe 7.0 Switch IP with Time Division Multiplexing for Scalable AI and Data Center Infrastructure

Press Release: Rambus, a premier chip and silicon IP provider making data faster and safer, today announced the Rambus PCIe 7.0 Switch IP with Time Division Multiplexing (TDM), a new addition to its advanced interconnect IP portfolio designed to address the rapidly escalating bandwidth, latency, and scalability requirements of AI, cloud, and high-performance computing (HPC) systems. As AI infrastructure grows in scale and architectural complexity, […]

Read full article at https://wccftech.com/rambus-bets-on-time-division-multiplexing-to-fix-pcie-7-0-for-ai-workloads/

Read the full article