KTek Resourcing · 20 hours ago
Senior Tools Architect – Enterprise Observability & AIOps (Hands-On)
KTek Resourcing is seeking a Senior Tools Architect specializing in Enterprise Observability and AIOps. The role involves designing and implementing observability platforms, integrating them with ServiceNow, and leading incident management strategy while mentoring a team of observability engineers.
Responsibilities
Architect and personally implement a unified observability platform covering on-prem, multi-cloud (Azure + AWS), containers, and SaaS applications
Own the full Moogsoft AIOps deployment: ingestion pipelines, situation clustering, noise reduction, automated remediation workflows, and integration with incident/response tools
Design, build, and maintain enterprise 'single-pane-of-glass' dashboards (executive, NOC/SOC, service-owner, and engineering views) in tools such as Grafana, Datadog, Dynatrace, New Relic, or Lightstep
Lead deep bi-directional integration between observability tools and ServiceNow (Event Management, CMDB, Incident, Change, ITSM workflows, Service Mapping, Discovery)
Drive event correlation, alerting rationalization, and elimination of alert fatigue using Moogsoft and supporting tools
Build and maintain, hands-on, data ingestion pipelines (metrics, events, logs, traces) using Prometheus, OpenTelemetry, Fluent Bit/Fluentd, Elastic, Splunk, Datadog agents, etc.
Create and present observability maturity roadmaps, AIOps business cases, SLA/SLO reporting, and tool rationalization plans to C-level executives and the board, in person in San Jose
Own licensing strategy, cost governance (FinOps for observability), and vendor relationships across the entire stack
Mentor observability engineers and act as the final escalation owner for major incidents and platform issues
Qualifications
Required
10+ years in enterprise IT operations with 6+ years owning large-scale observability and AIOps platforms (5,000+ servers, 50,000+ containers, multi-region)
Deep, hands-on expertise with Moogsoft AIOps (recent versions) – you have built or rebuilt Moogsoft environments from scratch, tuned clustering algorithms, and delivered >80% noise reduction
Proven track record building and operating enterprise dashboards that are used daily by executives, NOC/SOC, and engineering teams
Expert-level ServiceNow integration experience: Event Management (event rules, alert grouping, MID servers); bi-directional incident sync with Moogsoft or other tools; CMDB population via Discovery and Service Mapping; custom ServiceNow dashboards and Performance Analytics
Broad modern observability stack experience (at least four of the following required): Metrics & Dashboards: Grafana (advanced), Datadog, Dynatrace, New Relic, Lightstep; Logs & Tracing: Elastic Stack, Splunk, Loki, OpenTelemetry; Cloud-native: Azure Monitor, CloudWatch, Google Operations; AIOps: Moogsoft (mandatory), BigPanda, PagerDuty SignalFlow
Strong scripting/automation: Python (mandatory), Go, PowerShell, Ansible
Comfortable and effective presenting in person to senior leadership and war-room teams at the San Jose HQ on a daily basis
Certifications (at least two required): Moogsoft Certified Engineer or Architect, ServiceNow Certified Implementation Specialist – Event Management, Datadog Certified Architect, Dynatrace Associate/Pro, Grafana TCO, ITIL v4 Foundation or higher, etc.
Company
KTek Resourcing
KTek Resourcing is a staffing and recruiting company specializing in human resources staffing solutions for the healthcare and oil and gas industries.
H1B Sponsorship
KTek Resourcing has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (2)
2023 (1)
2021 (1)
Funding
Current Stage
Growth Stage
Company data provided by Crunchbase