Dell Deskside Agentic AI (Image © Dell)

Transition from cloud to on-premises AI infrastructure

The trend towards agentic AI architectures has led to an increase in token usage, which can make cloud-based strategies financially prohibitive for many organizations. By moving inference to local hardware, organizations can keep their costs more predictable and retain more control over sensitive intellectual property. Dell estimates that organizations can reduce their expenses by up to 87% over two years compared to using cloud APIs, with some configurations breaking even in as little as three months.
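The break-even and savings figures above come down to simple arithmetic, which can be sketched as follows. This is a minimal illustration, not Dell's cost model: the function names and all dollar amounts are hypothetical, and the sketch assumes flat monthly costs with no financing, depreciation or staffing overhead.

```python
import math


def break_even_months(hardware_cost, monthly_cloud_cost, monthly_local_cost=0.0):
    """Return the first whole month in which local spend no longer exceeds cloud spend.

    Returns None when local running costs are not lower than the cloud bill,
    because the hardware purchase is then never recovered.
    """
    monthly_saving = monthly_cloud_cost - monthly_local_cost
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_cost / monthly_saving)


def savings_fraction(hardware_cost, monthly_cloud_cost, monthly_local_cost, months):
    """Return the fraction of cloud spend saved over the given horizon."""
    cloud_total = monthly_cloud_cost * months
    local_total = hardware_cost + monthly_local_cost * months
    return 1.0 - local_total / cloud_total


# Hypothetical inputs chosen only for illustration (not Dell pricing):
# a 30,000 one-off hardware purchase replacing a 10,000 monthly API bill.
months = break_even_months(30_000, 10_000)
saved = savings_fraction(30_000, 10_000, 0.0, 24)
print(f"break-even after {months} months, {saved:.1%} saved over 24 months")
```

With these illustrative inputs the purchase pays for itself in three months. Real results depend heavily on token volume, power and operating costs, which the sketch folds into a single `monthly_local_cost` value.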

The system uses NVIDIA OpenShell to manage these on-premises deployments. This provides a unified layer of security and policy enforcement that works across the entire hardware stack - from individual desktop workstations to Dell PowerEdge XE servers.

Dell Deskside Agentic AI comprises the following components:

  • Dell Pro Max with GB10: Compact and energy-efficient system for prototyping smaller, single-agent workloads, suitable for models with 30 to 200 billion parameters.
  • Dell Pro Precision 9: Enterprise workstation tower with Intel Xeon 600 processors for workstations and up to five NVIDIA RTX PRO Blackwell Workstation Edition GPUs, providing scalable performance for demanding GPU workloads and supporting models with 30 to 500 billion parameters.
  • Dell Pro Max with GB300: Powered by the NVIDIA GB300 Grace Blackwell Ultra Desktop Superchip and Dell's exclusive MaxCool technology for maximum efficiency, this platform is purpose-built for inferencing state-of-the-art AI models with 120 billion to 1 trillion parameters.
  • NVIDIA NemoClaw Reference Stack: An open-source foundation for securely managing always-on AI agents, based on OpenClaw - the agent-based framework that enables persistent, autonomous, multi-stage AI workflows on local hardware. The stack combines powerful NVIDIA Nemotron open models for reasoning and coding with the secure runtime environment of OpenShell - all part of the NVIDIA Agent Toolkit for creating and coordinating long-running agents.
  • Dell Services: Comprehensive guidance through the entire agentic AI lifecycle - from initial strategy and hardware deployment to workflow customization, agent prioritization and ongoing optimization - to accelerate time to production readiness and close internal AI skills gaps.

Specialized hardware for diverse workloads

The solution is designed to support open-weight models with 30 billion to 1 trillion parameters, spread across three primary hardware tiers:

The Dell Pro Max with GB10 is a compact, energy-efficient system designed for small-scale prototyping, supporting models with 30 billion to 200 billion parameters. For more intensive workloads, the Dell Pro Precision 9 features Intel Xeon 600 processors and supports up to five NVIDIA RTX PRO Blackwell GPUs, expanding model capacity up to 500 billion parameters.

For the largest models, the Dell Pro Max with GB300 uses the NVIDIA GB300 Grace Blackwell Ultra Desktop Superchip. This platform features Dell's MaxCool technology to handle the thermal and efficiency requirements of models with 120 billion to 1 trillion parameters.
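The three tiers overlap in the model sizes they cover, so a given model may have more than one candidate platform. A minimal sketch of that lookup, using only the parameter ranges quoted above; the `Tier` type and `candidate_tiers` function are hypothetical names introduced for illustration and not part of any Dell or NVIDIA tooling:

```python
from typing import NamedTuple


class Tier(NamedTuple):
    name: str
    min_params_b: int  # smallest supported model, in billions of parameters
    max_params_b: int  # largest supported model, in billions of parameters


# Ranges as stated in the article (1 trillion = 1000 billion).
TIERS = [
    Tier("Dell Pro Max with GB10", 30, 200),
    Tier("Dell Pro Precision 9", 30, 500),
    Tier("Dell Pro Max with GB300", 120, 1000),
]


def candidate_tiers(model_params_b):
    """Return the names of all tiers whose stated range covers the model size."""
    return [
        tier.name
        for tier in TIERS
        if tier.min_params_b <= model_params_b <= tier.max_params_b
    ]


print(candidate_tiers(70))   # covered by the GB10 and Precision 9 ranges
print(candidate_tiers(700))  # covered by the GB300 range only
```

The stated ranges are only a first filter. Quantization, context length and the number of concurrent agents also affect which platform is a practical fit.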

Dell Pro Max with GB300 (Image © Dell)

Dell Pro Max GB300 technical specifications

  • Processor: NVIDIA® Grace™ 72 Core Neoverse™ V2
  • Operating system: Ubuntu 24.04 LTS with NVIDIA AI Developer Tools
  • Graphics: NVIDIA DGX B300, 252 GB HBM3e (integrated); NVIDIA RTX PRO 2000 Blackwell, 16 GB GDDR7 (PCIe card)
  • Memory: 496 GB, LPDDR5X, 6400 MT/s, SOCAMM
  • Storage: 16 TB (4 x 4 TB) SSD, Gen4, SED-capable
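To relate the memory figures above to the quoted model sizes, a rough estimate of weight storage is parameters multiplied by bytes per parameter. The sketch below is a back-of-the-envelope check only. It assumes the 252 GB of HBM3e and 496 GB of LPDDR5X can be treated as one pool, and it ignores the KV cache, activations and runtime overhead, all of which need additional memory in practice. The function names are hypothetical.

```python
HBM_GB = 252    # HBM3e capacity from the specification list
LPDDR_GB = 496  # LPDDR5X capacity from the specification list


def weight_memory_gb(params_billion, bits_per_param):
    """Approximate weight storage in decimal GB: 1e9 params * bits / 8 bytes."""
    return params_billion * bits_per_param / 8


def weights_fit(params_billion, bits_per_param, budget_gb=HBM_GB + LPDDR_GB):
    """Check whether the weights alone fit in the assumed pooled memory budget."""
    return weight_memory_gb(params_billion, bits_per_param) <= budget_gb


# A 1-trillion-parameter model at three common weight precisions.
for bits in (16, 8, 4):
    needed = weight_memory_gb(1000, bits)
    print(f"{bits}-bit weights: {needed:.0f} GB, fits: {weights_fit(1000, bits)}")
```

Under these assumptions, the weights of a 1-trillion-parameter model fit in the combined 748 GB only at around 4 bits per parameter, which suggests the upper end of the quoted range relies on quantized weights.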

The infrastructure is supported by the NVIDIA NemoClaw reference stack, an open-source foundation that enables persistent, autonomous, multi-stage AI workflows. This stack integrates Nemotron open models for reasoning and coding with the secure OpenShell runtime environment.

For highly regulated industries, Dell and NVIDIA have released the AI-Q 2.0 reference architecture. This framework is based on the Dell AI Data Platform and provides a proven foundation for deploying multi-agent workflows for decision support and complex research tasks.

To facilitate the transition from pilot to production, Dell Services provides comprehensive consulting on hardware deployment, workflow tuning and agent prioritization.

Availability

Dell Deskside Agentic AI, the Dell-NVIDIA AI-Q 2.0 reference architecture and NVIDIA OpenShell are available for immediate deployment.