Join Us in Shaping the Future of AI
At FriendliAI, we’re transforming how AI runs at scale: making inference faster, cheaper, and easier for everyone. Our mission is to democratize access to production-grade AI so teams can focus on what truly matters: building great products. We believe the boldest innovations come from great teams, and we’re always seeking passionate, humble, and curious builders to join us.

“One of the best parts of FriendliAI is seeing how your idea or experiment can turn into something customers love.”
Seungha Jeon, Software Engineer
Open positions
We’d love to hear from you. Reach out anytime at careers@friendli.ai.
Engineering
Engineering / Inference Systems
About the job
We are seeking a highly technical Inference Engine Engineer to optimize the performance and efficiency of our core inference engine. You will focus on designing, implementing, and optimizing GPU kernels and supporting infrastructure for next-generation generative and agentic AI workloads. Your work will directly power the most latency-critical and compute-intensive systems deployed by our customers.
The ideal candidate is an exceptional engineer with a strong foundation in GPU programming and compiler infrastructure. You enjoy pushing performance boundaries and have experience supporting production-scale machine learning applications.
Key Responsibilities
- Design and optimize custom GPU kernels for AI workloads (e.g., transformer and diffusion models)
- Contribute to the development of FriendliAI's kernel compiler, memory planner, runtime, and other core components
- Collaborate with cloud and infrastructure engineers to ensure end-to-end inference performance
- Analyze performance bottlenecks across the software and hardware stack, and implement targeted optimizations
- Drive support for new model architectures and tensor compute patterns
- Maintain production-grade performance infrastructure, including profiling, benchmarking, and validation tools
Qualifications
- 5+ years of experience in production or high-impact research environments
- Production-level expertise in Python and C++
- Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent
- Experience developing machine learning frameworks or performance-critical runtime systems
- Hands-on experience writing and optimizing GPU kernels
- Hands-on experience profiling GPU kernels
- Experience working with generative AI models such as transformer and diffusion models
Preferred Experience
- Experience developing machine learning compilers or code generation systems
- Familiarity with dynamic shape compilation, memory planning, and kernel fusion
- Contributions to inference engines, compilers, or high-performance numerical libraries
- Understanding of multi-GPU and distributed inference strategies
Benefits
- A front-row seat to the AI infrastructure revolution
- Opportunity to work cross-functionally with top-tier engineers, product leaders, and go-to-market teams
- Competitive compensation
- Premium hardware and health support benefits
- A highly collaborative, fast-moving team where your impact is immediate and visible
About us
FriendliAI, a San Mateo, CA-based startup, is building the next-generation AI inference platform that accelerates the deployment of large language and multimodal models with unmatched performance and efficiency. Our infrastructure supports high-throughput, low-latency AI workloads for organizations worldwide. We are also integrated with the Hugging Face platform, allowing instant access to over 450,000 open-source models. We are on a mission to deliver the world's best platform for AI inference.
About the job
FriendliAI is looking for a GPU Kernel Engineer to design, build, and optimize the low-level compute kernels that power our large-scale, GPU-accelerated AI inference platform. You will deliver world-class inference speed across NVIDIA and AMD GPUs. With our recent $20M funding, we are scaling our team to meet market demand.
This is a deeply technical, high-impact role where you will write GPU code and implement advanced optimizations. As part of our engine team, you will contribute directly to the company's proprietary inference engine, which supports over 450,000 models on Hugging Face. You will work with the inventors of continuous batching and collaborate with the platform team to deploy your work into production.
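To give a flavor of the kernel work described here, below is a minimal, illustrative GPU kernel written in Triton's Python DSL (one of the frameworks listed under preferred experience). It is a sketch only, not FriendliAI engine code, and the role itself centers on CUDA/C++:

```python
# Minimal Triton vector-add kernel, following the standard Triton tutorial
# pattern. Illustrative only; production kernels (GEMM, attention, routing)
# are far more involved.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                        # one program per block
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                        # guard the tail block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n_elements = out.numel()
    grid = (triton.cdiv(n_elements, 1024),)
    add_kernel[grid](x, y, out, n_elements, BLOCK_SIZE=1024)
    return out
```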
Key Responsibilities
- Design, implement, and optimize high-performance GPU kernels for AI inference (e.g., GEMM, attention, routing)
- Develop and maintain GPU code in CUDA and C++, including low-level assembly when needed
- Implement reduced-precision and quantized kernels (FP8/FP4) for low-latency or high-throughput inference
- Benchmark and ensure cross-vendor performance parity between NVIDIA and AMD hardware
- Contribute to internal GPU libraries and tune performance-critical components
- Accelerate multi-modal model pipelines
- Investigate and integrate next-generation GPU features
Qualifications
- 3+ years of experience in GPU programming, HPC, or performance-critical systems
- Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field
- Strong proficiency in CUDA for NVIDIA GPUs or ROCm/HIP for AMD GPUs
- Deep understanding of GPU architecture: warps, threads, memory hierarchy, synchronization, and latency-throughput trade-offs
- Proficiency in C++
- Experience with GPU profiling and performance tuning
- Strong numerical background with understanding of precision trade-offs and quantization techniques
Preferred Experience
- Experience optimizing transformer, multi-modal, or Mixture-of-Experts (MoE) architectures at the kernel level
- Familiarity with the latest GPU libraries and frameworks (CUTLASS, Triton, …)
- Inter-GPU communication programming experience
- Open-source contributions related to GPU performance or ML acceleration
- Research or conference presentations on GPU optimization, HPC, or numerical computing
Benefits
- A front-row seat to the AI infrastructure revolution
- Opportunity to work cross-functionally with top-tier engineers, product leaders, and go-to-market teams
- Competitive compensation
- Premium hardware and health support benefits
- A highly collaborative, fast-moving team where your impact is immediate and visible
About us
FriendliAI, a San Mateo, CA-based startup, is building the next-generation AI inference platform that accelerates the deployment of large language and multimodal models with unmatched performance and efficiency. Our infrastructure supports high-throughput, low-latency AI workloads for organizations worldwide. We are also integrated with the Hugging Face platform, allowing instant access to over 450,000 open-source models. We are on a mission to deliver the world's best platform for AI inference.
Engineering / Core Product
About the job
We are seeking a frontend engineer with a passion for high-performance UI and AI applications. This role focuses on crafting scalable, accessible, and optimized interfaces that empower users to manage generative and agentic AI deployments. You will play a key role in shaping user experience across our developer dashboard, agent builder, and observability tools.
An ideal candidate thrives at the intersection of frontend engineering and AI. You are deeply comfortable with scalable component systems, accessibility standards, and modern TypeScript-based architectures. You enjoy crafting elegant interfaces as much as working with agentic frameworks and AI-native interaction patterns.
Key Responsibilities
- Design and implement high-performance, accessible, and SEO-friendly user interfaces
- Develop and maintain reusable, scalable UI components across our TypeScript monorepo
- Collaborate with design and product to deliver intuitive workflows for AI/LLM development and deployment
- Work closely with backend teams to integrate UI with real-time inference, orchestration, and monitoring APIs
- Contribute to and evolve our design system and documentation
- Participate in architectural reviews and code quality initiatives
- Continuously iterate on frontend performance, accessibility, and user feedback
Qualifications
- 3+ years of frontend engineering experience in production systems
- Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent
- Expert-level knowledge of TypeScript, React, and component-driven architecture
- Proven track record building accessible (a11y) and SEO-optimized SPAs and web apps
- Strong understanding of monorepo maintenance, code reuse patterns, and scalable UI systems
- Familiarity with data visualization frameworks, real-time APIs, and frontend performance monitoring tools
Preferred Experience
- Experience working with LLMs, prompt-driven UIs, or agentic orchestration frameworks
- Background in frontend systems for developer tools or AI platforms
- Strong documentation and writing skills (for internal tools or user education)
- Passion for design systems, usability, and performance optimization
- Experience building developer-facing products or technical UIs
- Understanding of AI agent architectures or multimodal UI patterns
Benefits
- A front-row seat to the AI infrastructure revolution
- Opportunity to work cross-functionally with top-tier engineers, product leaders, and go-to-market teams
- Competitive compensation
- Premium hardware and health support benefits
- A highly collaborative, fast-moving team where your impact is immediate and visible
About us
FriendliAI, a San Mateo, CA-based startup, is building the next-generation AI inference platform that accelerates the deployment of large language and multimodal models with unmatched performance and efficiency. Our infrastructure supports high-throughput, low-latency AI workloads for organizations worldwide. We are also integrated with the Hugging Face platform, allowing instant access to over 450,000 open-source models. We are on a mission to deliver the world's best platform for AI inference.
About the job
We're seeking a Full-Stack Engineer to design, build, and scale our web platform, which serves as the core interface for deploying multimodal models, observing workloads, and building agent workflows. You'll collaborate closely with product, infrastructure, and design teams to create high-performance, developer-friendly, and enterprise-ready tools.
We are looking for a hands-on, talented full-stack engineer, eager to work at the intersection of infrastructure, developer experience, and AI applications. A great candidate is a strong collaborator who enjoys working across the stack, cares deeply about developer workflows, and wants to help define the future of AI adoption.
Key Responsibilities
- Design, build, and maintain web applications and tools for AI model deployment, monitoring, and performance optimization
- Develop clean, scalable, and robust APIs powering AI agents, workflows, and user-facing systems
- Collaborate with infrastructure engineers to integrate backend systems with deployment and orchestration pipelines
- Optimize the performance and usability of web interfaces
- Drive code quality through automated testing, CI/CD, and code reviews
- Contribute to architecture and design decisions that shape our platform's long-term direction
- Identify and resolve technical debt and improve system reliability in production systems
Qualifications
- 5+ years of industry experience in full-stack or backend engineering
- Bachelor's or Master's degree in Computer Science, Computer Engineering, or equivalent
- Fluency in TypeScript and Python; expertise with React/Next.js
- Strong backend experience with FastAPI or similar Python frameworks
- Proven expertise in delivering production-scale full-stack applications
- Proficiency in designing data models, writing SQL, and working with PostgreSQL
- Deep understanding of modern web frameworks and component-driven architecture
- Strong API design experience across gRPC/REST/GraphQL in production systems
- Solid foundation in cloud-native development
- Familiarity with OpenTelemetry tracing, metrics, and structured logging
- Knowledge of web security, authentication, RBAC, and multi-tenant SaaS systems
Preferred Experience
- Familiarity with LLM-based workflows, tool invocation, or agentic systems
- Familiarity with Kubernetes for container orchestration, including deploying, scaling, and managing containerized applications in production environments
- Experience working in startups or fast-paced environments with ownership and ambiguity
- Experience building developer-facing SDKs/CLIs
- Passion for developer experience and enabling AI adoption
Benefits
- A front-row seat to the AI infrastructure revolution
- Opportunity to work cross-functionally with top-tier engineers, product leaders, and go-to-market teams
- Competitive compensation
- Premium hardware and health support benefits
- A highly collaborative, fast-moving team where your impact is immediate and visible
About us
FriendliAI, a San Mateo, CA-based startup, is building the next-generation AI inference platform that accelerates the deployment of large language and multimodal models with unmatched performance and efficiency. Our infrastructure supports high-throughput, low-latency AI workloads for organizations worldwide. We are also integrated with the Hugging Face platform, allowing instant access to over 450,000 open-source models. We are on a mission to deliver the world's best platform for AI inference.
Engineering / Infrastructure
About the job
FriendliAI is looking for an engineer to architect and build the security foundations of a multi-tenant serverless compute platform running on Kubernetes with hardened container isolation. As a Software Engineer, Platform Security, you will own the security design for both control plane and data plane, implement guardrails as code, and partner closely with Platform/SRE/Infra teams to ship a secure-by-default developer experience. This is a hands-on builder and architect role.
Key Responsibilities
- Design and implement the security architecture for a Kubernetes-based, multi-tenant serverless platform
- Build guardrails as code using Terraform and Helm
- Establish network segmentation and service-mesh policy
- Develop secrets and key management patterns
- Implement runtime detections for containerized workloads
- Design and roll out IAM & workload identity management
- Threat model new features and changes to the overall platform
- Collaborate with engineering to debug production issues, lead post-incident hardening, and automate evidence for compliance controls
Qualifications
- 5+ years securing cloud-native platforms with a focus on distributed systems and multi-tenancy
- Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent
- Strong hands-on experience with AWS (IAM, VPC/networking, logging/monitoring, …) and Kubernetes/container security
- Proficiency with Terraform and Helm
- Experience building or operating runtime detections for containers/functions
- Strong technical background in backend systems, cloud infrastructure, or AI tooling
- Clear written and verbal communication; ability to translate security and business requirements into robust systems
Preferred Experience
- Knative or similar serverless-on-K8s frameworks
- Container hardening and sandboxing
- Policy-as-code and admission policy design at scale
- eBPF-based observability, detections, and signal tuning
- Multi-cloud exposure and cross-cloud identity approaches
- Hands-on contributions to SOC 2 / HIPAA control automation and audit evidence pipelines
Benefits
- A front-row seat to the AI infrastructure revolution
- Opportunity to work cross-functionally with top-tier engineers, product leaders, and go-to-market teams
- Competitive compensation
- Premium hardware and health support benefits
- A highly collaborative, fast-moving team where your impact is immediate and visible
About us
FriendliAI, a San Mateo, CA-based startup, is building the next-generation AI inference platform that accelerates the deployment of large language and multimodal models with unmatched performance and efficiency. Our infrastructure supports high-throughput, low-latency AI workloads for organizations worldwide. We are also integrated with the Hugging Face platform, allowing instant access to over 450,000 open-source models. We are on a mission to deliver the world's best platform for AI inference.
About the job
FriendliAI is looking for an engineer to design, build, and operate the foundations of our large-scale, GPU-accelerated AI inference platform. As a Software Engineer, SRE, you will be responsible for ensuring the reliability, scalability, and efficiency of our cloud-native systems. You'll work at the intersection of infrastructure, developer platforms, and reliability engineering—building tools and processes that empower engineering teams to ship with confidence.
This is a hands-on role where you will manage large Kubernetes fleets, design resilient cloud architectures, and lead efforts in observability, CI/CD automation, and service reliability.
Key Responsibilities
- Design, build, and operate cloud-native infrastructure (primarily AWS) using Terraform and Helm
- Manage and scale multi-cluster Kubernetes environments with strong reliability, security, and cost-efficiency in mind
- Develop, automate, and maintain CI/CD systems (Argo CD, Argo Rollouts, Spinnaker, GitHub Actions)
- Enhance reliability through service mesh (Istio) operations, traffic routing optimization, and secure mTLS-based communication
- Build and operate distributed data systems (e.g., Redis, Vault) with multi-AZ resiliency
- Improve developer velocity through internal deployment platforms, canary rollouts, and self-service tooling
- Implement observability practices (Datadog, metrics, logging, alerting) as infrastructure-as-code
- Lead post-incident reviews, reliability improvements, and production hardening
- Partner closely with product, infra, and security teams to deliver a reliable-by-default developer experience
Qualifications
- 3+ years of experience as an SRE, DevOps, or Infrastructure Engineer operating large-scale cloud systems
- Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent
- Proficiency in AWS, Kubernetes, Terraform, Helm
- Hands-on experience with service mesh (Istio), CI/CD platforms (Argo, Spinnaker, GitHub Actions), and infrastructure automation
- Strong debugging skills across distributed systems (networking, containers, databases, observability)
- Programming skills in Go, Java, or Python, with the ability to develop tools and automation
- Solid understanding of cloud networking, identity, and security fundamentals
Preferred Experience
- Multi-cloud or hybrid-cloud operations
- Experience building internal developer platforms on Kubernetes
- Knowledge of Redis, or other distributed data stores
- Policy-as-code (OPA/Gatekeeper, Kyverno) or advanced workload identity patterns
- Cost optimization strategies for large-scale workloads (GPU or networking-intensive)
- Prior contributions to incident management, SLO/SLI design, or chaos engineering
Benefits
- A front-row seat to the AI infrastructure revolution
- Opportunity to work cross-functionally with top-tier engineers, product leaders, and go-to-market teams
- Competitive compensation
- Premium hardware and health support benefits
- A highly collaborative, fast-moving team where your impact is immediate and visible
About us
FriendliAI, a San Mateo, CA-based startup, is building the next-generation AI inference platform that accelerates the deployment of large language and multimodal models with unmatched performance and efficiency. Our infrastructure supports high-throughput, low-latency AI workloads for organizations worldwide. We are also integrated with the Hugging Face platform, allowing instant access to over 450,000 open-source models. We are on a mission to deliver the world's best platform for AI inference.
About the job
FriendliAI is hiring a Billing Engineer to build our next-generation, in-house metering and billing platform. The billing system will power complex, rapidly evolving pricing across subscriptions, usage, tiers, discounts, overages, and custom enterprise contracts. You'll be the primary contributor implementing event-driven pipelines, pricing/rating logic, and deep Stripe integrations. You'll work with other software engineers on key decisions while staying hands-on across Python/Go services, Kafka-based event pipelines, and cloud operations on AWS (plus other cloud services).
We are looking for someone who loves turning messy real-world usage into precise, auditable invoices, and who designs for correctness, configurability, and change as our pricing strategy continues to evolve.
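As a concrete, purely hypothetical illustration of the rating logic described above, the sketch below rates usage against graduated tiers. Tier boundaries, prices, and names are invented for illustration and are not FriendliAI's actual pricing; the production system also handles proration, discounts, and mid-cycle plan changes:

```python
# Hypothetical graduated-tier rating sketch. All numbers are illustrative.
from decimal import Decimal

# (upper bound in usage units, price per unit); None means "no upper bound".
TIERS = [
    (Decimal("1000000"), Decimal("0.000010")),
    (Decimal("10000000"), Decimal("0.000008")),
    (None, Decimal("0.000006")),
]

def rate_usage(units: Decimal) -> Decimal:
    """Charge each slice of usage at its own tier price (graduated pricing)."""
    charge = Decimal("0")
    lower = Decimal("0")
    for upper, unit_price in TIERS:
        if upper is None or units <= upper:
            charge += (units - lower) * unit_price
            break
        charge += (upper - lower) * unit_price
        lower = upper
    return charge

print(rate_usage(Decimal("2500000")))  # 1M units at the first tier, 1.5M at the second
```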
Key Responsibilities
- Design and implement Kafka-backed metering and event pipelines
- Implement pricing and rating logic: encode plans, tiers, proration, discounts, custom contracts, and more, with safe support for mid-cycle price and plan changes
- Own Stripe integrations: payments, invoicing, webhooks, retries, refunds, and reconciliation workflows
- Model the billing domain by developing data models for product catalogs, entitlements, and charge items
- Implement various payment strategies
- Ensure the accuracy of the billing system and provide observability into it
Qualifications
- 3+ years of backend engineering experience building production services
- Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent
- Proficiency in Python; strong software design and code quality
- Experience with event streaming (Kafka or similar) and distributed system concepts
- Hands-on experience integrating Stripe Payments & Invoicing (or equivalent payment platforms)
- Strong SQL and data modeling skills; comfort with large-scale usage data and financial accuracy
- Experience with ClickHouse (or similar OLAP systems) for high-volume usage aggregation and analytics
- API design expertise with gRPC and REST: resource modeling, versioning, backward compatibility, authn/z, pagination, idempotency
- Cloud experience with AWS and familiarity with containers and CI/CD
- Clear written and verbal communication; ability to translate pricing/business rules into robust systems
Preferred Experience
- Background in a fintech company or experience building in-house usage-based billing platforms
- Knowledge of dunning strategies, dispute/chargeback workflows, and revenue assurance
- Experience integrating with cloud marketplaces (e.g., AWS marketplace), including private offers, SaaS/usage metering and entitlements, fulfillment APIs, and downstream invoicing
- Experience with infrastructure-as-code and security best practices
Benefits
- A front-row seat to the AI infrastructure revolution
- Opportunity to work cross-functionally with top-tier engineers, product leaders, and go-to-market teams
- Competitive compensation
- Premium hardware and health support benefits
- A highly collaborative, fast-moving team where your impact is immediate and visible
About us
FriendliAI, a San Mateo, CA-based startup, is building the next-generation AI inference platform that accelerates the deployment of large language and multimodal models with unmatched performance and efficiency. Our infrastructure supports high-throughput, low-latency AI workloads for organizations worldwide. We are also integrated with the Hugging Face platform, allowing instant access to over 450,000 open-source models. We are on a mission to deliver the world's best platform for AI inference.
Engineering / Applied Engineering
About the job
FriendliAI is hiring a Python Engineer to build developer tools for our users and internal engineers. The software you build will be the primary way developers integrate with our inference and agent platform, and your work will remove friction for internal teams and external users. You'll design ergonomic, stable APIs, ship reliable releases to PyPI, and drive a top-tier developer experience across documentation, examples, and tooling. You'll collaborate with frontend engineers on end-to-end workflows and partner with product and engineering teams.
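For a sense of what "ergonomic and stable" means in practice, here is a minimal, hypothetical sketch of a typed client surface. The class, endpoint, and field names are invented for illustration and are not the actual Friendli SDK API:

```python
# Hypothetical SDK client sketch: one typed, documented call instead of raw
# HTTP plumbing. Names and endpoints are placeholders, not the real API.
import httpx

class ExampleClient:
    def __init__(self, api_key: str, base_url: str = "https://api.example.com/v1"):
        self._http = httpx.Client(
            base_url=base_url,
            headers={"Authorization": f"Bearer {api_key}"},
            timeout=30.0,
        )

    def complete(self, model: str, prompt: str) -> str:
        """Return the completion text, raising a clear error on HTTP failure."""
        resp = self._http.post("/completions", json={"model": model, "prompt": prompt})
        resp.raise_for_status()
        return resp.json()["text"]
```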
Key Responsibilities
- Own the Python SDK lifecycle: design its APIs and implement the client side of the SDK
- Build and maintain a cross-platform CLI with a modern interface, including helpful error messages and good UX
- Manage packaging and distribution pipelines
- Develop and maintain internal developer tools related to DevOps
- Create examples, templates, and guides that help developers effectively utilize our software bundles
Qualifications
- 3+ years of professional Python engineering building libraries, SDKs, or developer tools used in production
- Demonstrated SDK/CLI ownership, including API ergonomics, versioning, deprecation policies, telemetry, and debugging customer issues
- Strong Python fundamentals, including asyncio, typing, packaging, and testing
- Web API fluency in REST and gRPC
- Experience working with Python monorepos
- Clear written communication skills and capable of turning complex features into clean APIs and concise docs
Preferred Experience
- Maintainer or significant OSS contributor in Python libraries (tooling, SDK, CLI, etc.)
- Experience with Python AST manipulation, metaprogramming, and packaging
- Performance work (profiling, streaming, backpressure) or multi-platform build experience
- Cross-language exposure (TypeScript/Node, Go)
- Containers & local dev tooling (Docker); familiarity with Kubernetes basics
- Experience with LLM/agent SDKs or inference workflows
Benefits
- A front-row seat to the AI infrastructure revolution
- Opportunity to work cross-functionally with top-tier engineers, product leaders, and go-to-market teams
- Competitive compensation
- Premium hardware and health support benefits
- A highly collaborative, fast-moving team where your impact is immediate and visible
About us
FriendliAI, a San Mateo, CA-based startup, is building the next-generation AI inference platform that accelerates the deployment of large language and multimodal models with unmatched performance and efficiency. Our infrastructure supports high-throughput, low-latency AI workloads for organizations worldwide. We are also integrated with the Hugging Face platform, allowing instant access to over 450,000 open-source models. We are on a mission to deliver the world's best platform for AI inference.
About the job
We're seeking a Customer Success Engineer who is passionate about helping customers achieve success with cutting-edge AI infrastructure. You'll act as a trusted technical advisor to our users, ensuring smooth onboarding, ongoing support, and optimal product usage. You'll work across product, engineering, and support to represent the voice of the customer and help shape a delightful developer experience.
The ideal candidate is an empathetic engineer with strong communication skills who thrives at the intersection of technology and user advocacy. You enjoy helping developers succeed, debugging real-world issues, and turning feedback into action.
Key Responsibilities
- Guide new users through onboarding, setup, and best practices
- Document technical learnings, common patterns, and solutions derived from real customer interactions
- Transform lessons learned into clear, actionable knowledge for both internal teams and external users
- Create and maintain documentation, tutorials, and sample projects
- Provide technical support via Slack, email, and meetings, helping users troubleshoot and resolve issues quickly
- Collaborate with engineering to debug production issues and prioritize fixes
- Help users architect scalable and efficient deployments using FriendliAI
- Lead customer-facing technical sessions, demos, and Q&As
Qualifications
- 2+ years in a developer-facing or technical support role (Customer Success, Solutions Engineering, etc.)
- Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent
- Strong technical background in backend systems, cloud infrastructure, or AI tooling
- Excellent written and verbal communication skills
- Proficiency in Python and familiarity with LLM ecosystems (e.g., Hugging Face, LangChain)
- Experience working with APIs, CLI tools, and deployment workflows
- Ability to explain complex technical concepts clearly and concisely
Preferred Experience
- Strong technical writing skills with experience authoring user-facing or internal technical content
- Experience writing developer documentation or educational content
- Prior experience working with enterprise customers or managing technical escalations
- Contributions to developer communities or open source
- Familiarity with model serving platforms and inference workflows
- Hands-on experience building with agentic or autonomous AI frameworks
Benefits
- A front-row seat to the AI infrastructure revolution
- Opportunity to work cross-functionally with top-tier engineers, product leaders, and go-to-market teams
- Competitive compensation
- Premium hardware and health support benefits
- A highly collaborative, fast-moving team where your impact is immediate and visible
About us
FriendliAI, a San Mateo, CA-based startup, is building the next-generation AI inference platform that accelerates the deployment of large language and multimodal models with unmatched performance and efficiency. Our infrastructure supports high-throughput, low-latency AI workloads for organizations worldwide. We are also integrated with the Hugging Face platform, allowing instant access to over 450,000 open-source models. We are on a mission to deliver the world's best platform for AI inference.
About the job
We're seeking an Agent Engineer to design and build agentic features in our platform, including document understanding, advanced RAG, and customer support automation. You will develop not only the agent components but also the Friendli Agent API, the core developer interface for building and extending agent applications. As an Agent Engineer, you will also build agent applications that serve as production-ready examples of how agents can solve real-world problems.
Applications will be primarily written in Python and serve as reference implementations for our customers and community. You will work with various open-source models, unlocking their potential inside AI agent applications, and collaborate with platform and infrastructure teams to ensure seamless integration at scale.
We are looking for a hands-on engineer passionate about building agent systems and enabling developers to adopt AI easily. You should be comfortable creating agent applications that highlight what's possible. The ideal candidate is curious about and experienced with open-source models, and enjoys turning them into reliable, high-impact features.
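As a minimal, hypothetical illustration of the tool-invocation pattern such an agent API might expose (names and structures below are invented for this sketch, not the Friendli Agent API):

```python
# Tiny tool-registry sketch: register Python functions and dispatch a
# model-emitted tool call by name. Illustrative only.
import json
from typing import Callable

TOOLS: dict[str, Callable[..., str]] = {}

def tool(fn: Callable[..., str]) -> Callable[..., str]:
    """Register a function so an agent can call it by name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def search_docs(query: str) -> str:
    # Placeholder retrieval step; a real agent would query a vector store.
    return f"(top passages for: {query})"

def dispatch(tool_call_json: str) -> str:
    """Execute a tool call of the form {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    return TOOLS[call["name"]](**call["arguments"])

print(dispatch('{"name": "search_docs", "arguments": {"query": "advanced RAG"}}'))
```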
Key Responsibilities
- Design, build, and maintain agent APIs and applications that deliver document understanding and other high-value features
- Evaluate and integrate open-source models to power production-ready agent features where possible
- Develop reference agent applications to showcase workflows and accelerate customer adoption
- Collaborate with backend and infrastructure teams to integrate agents with deployment, orchestration, and monitoring systems
- Ensure APIs are robust, developer-friendly, and enterprise-ready through strong design principles and documentation
- Continuously improve the reliability, scalability, and performance of agent features in production
Qualifications
- 3+ years of experience in software engineering, preferably in backend, ML systems, or API development
- Bachelor's or Master's degree in Computer Science, Computer Engineering, or equivalent
- Strong programming skills in Python; experience with various Python frameworks
- Solid understanding of LLM workflows, agent patterns, or tool invocation systems
- Experience designing and delivering production APIs
- Familiarity with open-source LLMs and multimodal models (Hugging Face, LangChain, LlamaIndex, etc.)
- Strong foundations in cloud-native development
Preferred Experience
- Experience with document understanding pipelines (e.g., OCR, RAG, summarization, structured extraction)
- Familiarity with Kubernetes or container orchestration in production
- Built or contributed to agent frameworks, SDKs, or CLIs
- Have worked in a startup or fast-paced environments with ownership and ambiguity
- Passion for developer experience and enabling AI adoption
Benefits
- A front-row seat to the AI infrastructure revolution
- Opportunity to work cross-functionally with top-tier engineers, product leaders, and go-to-market teams
- Competitive compensation
- Premium hardware and health support benefits
- A highly collaborative, fast-moving team where your impact is immediate and visible
About us
FriendliAI, a San Mateo, CA-based startup, is building the next-generation AI inference platform that accelerates the deployment of large language and multimodal models with unmatched performance and efficiency. Our infrastructure supports high-throughput, low-latency AI workloads for organizations worldwide. We are also integrated with the Hugging Face platform, allowing instant access to over 450,000 open-source models. We are on a mission to deliver the world's best platform for AI inference.
Sales
About the job
As a GTM professional at FriendliAI, you will be instrumental in driving our market expansion by securing high-value enterprise accounts and building strategic relationships with leading AI-forward companies. You'll focus on helping enterprises leverage FriendliAI's inference serving platform to achieve transformative cost savings and performance improvements in their AI deployments.
What we offer
- Highly competitive base salary plus uncapped commission structure tied to enterprise deal value
- Equity in a rapidly growing AI infrastructure company
- Comprehensive benefits package
- Collaborative environment with exposure to cutting-edge AI technologies
Key Responsibilities
- Own the end-to-end sales process with enterprises, from generating pipeline to closing high-value, strategic enterprise deals focused on AI inference and serving solutions
- Identify and capitalize on growth opportunities within existing accounts, becoming a trusted advisor for AI infrastructure optimization
- Manage technical POCs that demonstrate FriendliAI's superior performance and cost advantages, collaborating with internal engineering teams and enterprise clients
- Consistently generate and manage a robust pipeline by identifying enterprises with significant inference workloads and positioning FriendliAI's solutions strategically
- Lead tech community engagements, representing FriendliAI at premier AI/ML conferences, meetups, and industry events (AI Engineer World's Fair, AWS re:Invent, etc.)
- Cultivate strategic relationships with AI engineers, ML platform leaders, and technical decision-makers at target enterprises through community involvement
- Organize technical events: workshops, technical demos, and meetups showcasing FriendliAI
- Collaborate on technical content, case studies, and speaking opportunities that establish FriendliAI's market leadership
- Bring the voice of enterprise customers into product discussions, ensuring FriendliAI's roadmap aligns with market needs and competitive positioning
- Work closely with engineering teams to understand and articulate complex technical differentiators around inference optimization
- Analyze the competitive landscape and provide insights on enterprise AI infrastructure trends and requirements
Qualifications
- 5+ years of B2B sales experience
- Proven track record of closing $300K+ ARR deals and consistently exceeding revenue targets
- Strong technical background in AI/ML infrastructure, cloud platforms, or developer tools
- Deep understanding of LLM deployment, inference serving, and AI/ML workflows
- Active participation in AI/ML tech communities with demonstrated networking abilities
- Experience managing technical POCs and multi-stakeholder enterprise sales cycles
- Excellent communication skills for engaging both C-level executives and technical teams
Preferred Experience
- AI infrastructure, MLOps, or developer tools company background
- Existing network within enterprise AI/ML teams
- Technical degree or equivalent practical experience in AI/ML space
- Track record of organizing or speaking at tech meetups and industry conferences
About us
FriendliAI, a San Mateo, CA-based startup, is building the next-generation AI inference platform that accelerates the deployment of large language and multimodal models with unmatched performance and efficiency. Our infrastructure supports high-throughput, low-latency AI workloads for organizations worldwide. We are also integrated with the Hugging Face platform, allowing instant access to over 450,000 open-source models. We are on a mission to deliver the world's best platform for AI inference.
Marketing
About the job
FriendliAI is hiring a Technical Writer to craft clear, engaging, and accurate content that helps developers, customers, and partners understand and succeed with our AI platform. You'll be the bridge between complex technical concepts and accessible information. From developer documentation to customer-facing guides, your work will remove friction, drive adoption, and communicate FriendliAI's value across multiple channels.
You'll collaborate with engineers, product managers, and go-to-market teams to create documentation, marketing assets, and campaign content. Your writing will serve both technical and business audiences, ranging from SDK reference docs to blog posts, whitepapers, newsletters, and partner materials.
Key Responsibilities
- Plan, create, and manage content across various platforms, including documentation, websites, blogs, newsletters, and social channels
- Write clear guides, examples, and references that help developers adopt and integrate FriendliAI products effectively
- Research potential customers, competitors, and industry trends to inform the direction of content
- Support existing customers with updated resources
- Continuously update and improve content to reflect new product features, industry trends, and AI best practices
Qualifications
- Bachelor's degree in Business, Marketing, Technical Communication, or a related field
- 3+ years of experience in technical writing, developer documentation, or content creation in the technology industry
- Excellent written and verbal communication skills in English, with a proven ability to clearly explain complex concepts
- Ability to work independently, prioritize, and manage multiple projects in a fast-paced environment
- Proficiency with content management and publishing tools (e.g., CMS, Markdown, Git)
Preferred Experience
- Familiarity with or interest in AI (especially generative AI) and IT B2B SaaS
- Experience writing for developer tools, SDKs, or APIs
- Developer relations experience
- Experience publishing academic papers
- Comfortable collaborating across engineering, product, and go-to-market teams
Benefits
- A front-row seat to the AI infrastructure revolution
- Opportunity to work cross-functionally with top-tier engineers, product leaders, and go-to-market teams
- Competitive compensation
- Premium hardware and health support benefits
- A highly collaborative, fast-moving team where your impact is immediate and visible
About us
FriendliAI, a San Mateo, CA-based startup, is building the next-generation AI inference platform that accelerates the deployment of large language and multimodal models with unmatched performance and efficiency. Our infrastructure supports high-throughput, low-latency AI workloads for organizations worldwide. We are also integrated with the Hugging Face platform, allowing instant access to over 450,000 open-source models. We are on a mission to deliver the world's best platform for AI inference.
About the job
As a Technical Marketing Engineer, you will bridge the gap between marketing and engineering, creating technical marketing strategies grounded in engineering principles. You will be instrumental in developing and implementing marketing plans that resonate with technical audiences, ensuring our products are presented accurately and attractively.
What we offer
- Highly competitive base salary plus uncapped commission structure tied to enterprise deal value
- Equity in a rapidly growing AI infrastructure company
- Comprehensive benefits package
- Collaborative environment with exposure to cutting-edge AI technologies
Key Responsibilities
- Coordinate with the marketing team on marketing goals and strategies
- Integrate technical products such as data analytics software into marketing strategies and campaigns
- Research and develop a thorough understanding of the company's competitors and their activities in the marketplace
- Identify the technical requirements for potential projects and document issues and feasibility in detailed reports
- Produce user manuals for the marketing team and clients
- Oversee the development of promotional marketing materials and monitor product performance
Preferred Experience
- Proficiency in programming languages such as Python or TypeScript is a plus
- Deep understanding of SEO and web accessibility
- Proven ability to design and architect solutions, not just execute
- Proficient in digital marketing tools and analytics platforms
About us
FriendliAI, a San Mateo, CA-based startup, is building the next-generation AI inference platform that accelerates the deployment of large language and multimodal models with unmatched performance and efficiency. Our infrastructure supports high-throughput, low-latency AI workloads for organizations worldwide. We are also integrated with the Hugging Face platform, allowing instant access to over 450,000 open-source models. We are on a mission to deliver the world's best platform for AI inference.