Career Area:
Engineering
Job Description:
Your Work Shapes the World at Caterpillar Inc.
When you join Caterpillar, you're joining a global team who cares not just about the work we do – but also about each other. We are the makers, problem solvers, and future world builders who are creating stronger, more sustainable communities. We don't just talk about progress and innovation here – we make it happen, with our customers, where we work and live. Together, we are building a better world, so we can all enjoy living in it.
Own the end-to-end architecture and operating model for the data platform powering analytics and AI. This includes data repositories (lakehouse/warehouse), enterprise reporting strategy, AI-ready data capabilities, engineering platform design, and the integration and operationalization of data science models (ML & GenAI). You'll translate strategic business outcomes into scalable, secure, and cost-efficient data and AI solutions, enabling self-service, trust, and rapid delivery.
Key Responsibilities
Data Repository Architecture & Governance
Define the target-state architecture (lakehouse/warehouse, zones: raw/bronze, curated/silver, serving/gold) and data product strategy.
Establish data modeling standards (dimensional/star, data vault, wide tables, Delta/Iceberg/Hudi) for batch and streaming.
Implement metadata, cataloging, lineage, and data contracts; standardize schema evolution and versioning.
Drive security & compliance (RBAC/ABAC, encryption, masking, PII handling, consent/retention) and cost/performance optimization.
Set and enforce SLAs/SLOs for availability, freshness, and reliability with full data observability (logs, metrics, traces).
Enterprise Reporting & BI Strategy
Define the semantic layer and metric store to standardize KPI definitions across BI tools (Power BI/Tableau/Looker).
Build governed datasets and self-service enablement models with certified dashboards and release/version management.
Create a data storytelling & usability framework (accessibility, performance, load patterns, caching).
Establish data product ownership, lifecycle management, SLAs, change control, and audit for BI assets.
Data Needs for AI
Architect AI-ready data pipelines: labeling, feature stores, embeddings, vector search, training/validation/inference data management.
Data Engineering Platform Architecture
Design the data engineering platform: ingestion (batch/streaming), transformation, orchestration, testing, observability, deployment.
Provide platform blueprints and reusable scaffolding/SDKs to accelerate pipeline development.
Establish DevEx standards: CI/CD for data pipelines, environment strategy (dev/test/prod), and infrastructure as code (Terraform, etc.).
Implement monitoring & alerting (freshness, volume anomalies, schema drift, SLA breaches), and DQ testing (Great Expectations/dbt tests).
Own capacity planning, cost governance, partitioning and file layout, and performance tuning.
Integrating Data Science Models (Operationalization)
Define model serving architectures (batch scoring, online inference, streaming) with A/B testing and canary deployments.
Build feature lifecycle management: creation, reuse, drift detection, backfills, and documentation.
Integrate models with data contracts, lineage, and model observability (data drift, performance, fairness, latency).
Enable low-latency APIs and event-driven inference; ensure scale, reliability, and rollback capabilities.
Qualifications
Extensive experience in data engineering/architecture, including experience with platform/solution architecture leading enterprise initiatives.
Deep expertise in distributed systems (Spark/Flink), streaming (Kafka/Kinesis), lakehouse (Delta/Iceberg/Hudi), and warehouses (Snowflake/BigQuery/Redshift/Synapse).
Strong hands-on experience with Python/Scala/SQL, orchestration (Airflow/Dagster/ADF), dbt, and CI/CD (GitHub Actions/Azure DevOps/Jenkins).
Experience with semantic/metric layers and BI tools (Power BI/Tableau/Looker).
Knowledge of MLOps & LLMOps (preferred).
Security & compliance mindset: IAM/RBAC, encryption, data masking, privacy laws (GDPR/DPDP), Responsible AI practices.
Proven stakeholder leadership; excellent communication and product mindset.
Core Competencies
End-to-End Architecture: Designs across data, analytics, and AI workflows and operating models.
Platform Product Thinking: Treats the platform and data assets as products (roadmaps, SLAs, adoption).
Reliability & Performance: Builds resilient, cost-efficient systems with clear metrics.
Enablement & Influence: Drives cross-functional alignment and self-service adoption.
Risk & Compliance: Balances innovation with governance, security, and regulatory needs.
Benefits:
Competitive remuneration package including a great bonus structure and share options.
Intentional career development with exposure to global teams and markets.
A strong commitment to safety and your wellbeing.
An inclusive workplace culture focused on quality, customer service and the environment.
A commitment to diversity and inclusion, equal opportunity, and equal outcome.
The opportunity to do truly meaningful work in a supportive, constructive culture that encourages you to make the most of your talents.
Caterpillar of Australia is not currently hiring foreign national applicants that require or will require sponsorship.
Relocation is available for this position.
Posting Dates:
December 3, 2025 - December 20, 2025
Caterpillar is an Equal Opportunity Employer. Qualified applicants of any age are encouraged to apply.
Not ready to apply? Join our Talent Community.