About
NVIDIA TensorRT is an SDK for optimizing and deploying deep learning models across a range of NVIDIA GPU hardware, from data centers to edge devices. Using techniques such as quantization and kernel tuning, TensorRT reduces inference latency and increases throughput relative to CPU-only deployments. It targets developers building performance-critical applications and large language models that need efficient GPU acceleration.
Focus Areas
Technology Focus
ML infrastructure, LLM, developer tools
Quick Stats
Founded 2017
Status private
Type enterprise
Explore More
Similar Companies
Feast
San Francisco, United States
82%
3 shared categories
3 shared AI types
Same country
ML Infrastructure, MLOps, +1
Baseten
San Francisco, United States
82%
3 shared categories
3 shared AI types
Same country
ML Infrastructure, AI Infrastructure, +1
BentoML
San Francisco, United States
82%
3 shared categories
3 shared AI types
Same country
ML Infrastructure, MLOps, +1
Kubeflow
Mountain View, United States
82%
3 shared categories
3 shared AI types
Same country
ML Infrastructure, MLOps, +1