Sr. Consultant

Novisync, Inc
  • IT
  • Full-time
  • Shift

Must have skills: LLM and Kubernetes

Project Tasks: AI Operations Platform Consultant

  • Experience deploying, managing, operating, and troubleshooting containerized services at scale on Kubernetes for mission-critical applications (OpenShift)
  • Experience deploying, configuring, and tuning LLMs using TensorRT-LLM and Triton Inference Server
  • Managing MLOps/LLMOps pipelines, using TensorRT-LLM and Triton Inference Server to deploy inference services in production
  • Setup and operation of AI inference service monitoring for performance and availability
  • Experience deploying and troubleshooting LLMs on a containerized platform, including monitoring and load balancing
  • Operation and support of MLOps/LLMOps pipelines, using TensorRT-LLM and Triton Inference Server to deploy inference services in production
  • Experience with standard processes for operating a mission-critical system: incident management, change management, event management, etc.
  • Managing scalable infrastructure for deploying and managing LLMs
  • Deploying models in production environments, including containerization, microservices, and API design
  • Knowledge of Triton Inference Server, including its architecture, configuration, and deployment
  • Model optimization techniques using Triton with TensorRT-LLM
  • Model optimization techniques, including pruning, quantization, and knowledge distillation