Staff Data Engineer, Merchandising Catalog & Taxonomy (IC) at Attachments King
About The Role
Attachments King is an eCommerce startup in the Heavy Equipment Industry developing proprietary software that flexibly discovers compatibility between equipment and host machine components.
We’re hiring a Staff Data Engineer in an Individual Contributor role (no direct reports) to build and operate the single source of truth for a high‑SKU construction equipment catalog: taxonomy, product ingestion, price/availability pipelines from messy non‑API sources, and the automation layer that scales SKU count, price discovery, and data quality for a small team with lean headcount.
This is an in-office role based in San Francisco, CA. Expect it to extend past the standard 40-hour week of a typical 9-5 job: we work long hours, hold weekend work sessions, and prioritize a results-driven culture.
Salary, Equity, and Benefits
Base Pay: $245,000 / year
Equity Offered: 2.00% (Options, 1yr Cliff, 4yr vest)
- No funding raised; most recent 409A FMV is $10M.
Total Compensation: $295,000 / year
- TC excludes potential refreshes; equity valued at 409A on grant date, amortized over 4 years
Employer-provided Health Insurance
Employer-provided 401k Plan
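The total compensation figure above can be reproduced from the listed numbers. This is a minimal sketch that assumes the 2.00% grant is valued at the $10M 409A FMV and spread evenly over the 4-year vest, as the TC note describes; the variable names are illustrative, and the calculation ignores strike price, dilution, and taxes.

```python
# Reproduce the stated total compensation from the figures in the posting.
BASE_PAY = 245_000            # USD per year
EQUITY_BASIS_POINTS = 200     # 2.00% expressed in basis points (avoids float rounding)
FMV_409A = 10_000_000         # most recent 409A fair market value, USD
VEST_YEARS = 4

# Value of the full grant at the 409A valuation: 2% of $10M.
equity_grant_value = FMV_409A * EQUITY_BASIS_POINTS // 10_000

# Amortize the grant evenly across the vesting period.
equity_per_year = equity_grant_value // VEST_YEARS

total_comp = BASE_PAY + equity_per_year

print(f"Equity grant value: ${equity_grant_value:,}")
print(f"Equity per year:    ${equity_per_year:,}")
print(f"Total compensation: ${total_comp:,}")
```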
Day‑to‑day scope
- Taxonomy & PIM modeling: Own category trees, attributes, variants, compatibility metadata, and normalization rules (GS1/UNSPSC awareness; custom facets for consumer browse paths).
- Data ingestion (messy source formats): Build resilient pipelines for CSV/Excel, email attachments, SFTP, scraped HTML, PDFs, and images.
- Transform & validate: Typed, idempotent ETL/ELT with schema evolution and contract-based QA.
- Pricing & availability: Schedulers/agents to detect deltas, reconcile conflicts, discover competing listings, and publish to Shopify with guardrails for margin protection.
- Images: Automation for background removal, resizing, deduping, and attribute extraction (e.g., dimensions, metadata).
- Analytics: Build merchandising dashboards (assortment growth, price competitiveness, availability, metadata quality).
- Operations & SRE: Observability, alerting, backfills, SLAs/SLOs, rollback strategies, and cost control.
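To make the pricing bullet above concrete, here is a minimal sketch of delta detection with a margin guardrail. The `PriceRecord` type, the `plan_updates` function, and the 15% margin floor are hypothetical choices for illustration and are not taken from Attachments King's systems; the actual publish step to Shopify is left out.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class PriceRecord:
    sku: str
    cost: Decimal   # vendor cost for the SKU
    price: Decimal  # proposed sell price


# Hypothetical guardrail: hold any update whose gross margin falls below 15%.
MIN_MARGIN = Decimal("0.15")


def gross_margin(rec: PriceRecord) -> Decimal:
    """Gross margin as a fraction of sell price."""
    return (rec.price - rec.cost) / rec.price


def plan_updates(current: dict[str, Decimal], incoming: list[PriceRecord]):
    """Split incoming prices into (to_publish, held).

    Records whose price already matches the published price are skipped,
    so re-running the same batch after publishing produces no new writes.
    """
    to_publish, held = [], []
    for rec in incoming:
        if current.get(rec.sku) == rec.price:
            continue  # no delta, nothing to do
        # Check for non-positive prices first to avoid dividing by zero.
        if rec.price <= 0 or gross_margin(rec) < MIN_MARGIN:
            held.append(rec)  # route to manual review instead of publishing
        else:
            to_publish.append(rec)
    return to_publish, held
```

Holding low-margin updates for review rather than dropping them keeps the conflict visible, and skipping unchanged prices is one simple way to keep the publish step idempotent.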
Current Platforms
- AWS (native-first): S3, DynamoDB, Neptune, Lambda, Step Functions, ECS/Fargate, EventBridge, SQS/SNS, CloudWatch, SSM Parameter Store.
- IaC: AWS CDK v2 (Python / TypeScript)
- ECommerce Platform: Shopify Plus
- Analytics: Power BI / Microsoft Fabric
- AI Tooling: Cursor, Devin, Graphite, Personal ChatGPT Pro / Claude Max plans
Requests for, and use of, additional AI tools are heavily encouraged.
Core outcomes
30 days:
- Ship a production ingestion → normalization → enrichment → publish pipeline for all existing SKUs (2,200); stand up initial PIM data model with faceted attributes optimized for search/browse; wire price & availability watchers for all current vendors (files, web pages, emails, competitor websites).
- Baseline data quality with automated contracts & tests; initial operational dashboards (latency, freshness, fill rates, failure rates).
90 days:
- SKU count increased 5x (from 2,200 to 11,000); coverage expanded to support the top 100 product families and machine categories, rank-ordered by search traffic demand; image set completeness > 95% for top movers; pricing latency < 15 minutes for tracked vendors; vendor onboarding time < 48 hours from first file to live SKUs.
- AI/agent workflows auto‑extract attributes from PDFs/images; continuous taxonomy evolution with zero-downtime migrations.
365 days:
- Deliver $9.27M in annual revenue, 100% attributable to zero-touch online orders of managed SKUs.
Must‑have requirements
- 7 years building production data systems (or commensurate impact): Python (pandas/polars), SQL (Postgres/Redshift/Snowflake/BigQuery), orchestration (Step Functions/Airflow/Prefect), eventing (SQS/Kafka), object storage (S3), CI/CD, containerization.
- Ecommerce catalog expertise: PIM concepts (attribute schemas, variants/SKU creation, canonicalization, dedup), Shopify Admin/GraphQL, metafields, collections, feed health.
- Non‑API data wrangling at scale: Selenium/Playwright for scraping (with robots/legal etiquette, rotation, backoff), email/SFTP ingestion, PDF OCR, document parsing.
- Data quality & contracts: Great Expectations, Pydantic (typed models), versioned schemas, migration plans, data diffing, idempotency as a base case.
- Image processing: PIL/Pillow, OpenCV, ImageMagick; batch pipelines and basic color/contrast/compositing.
- Analytics: Power BI and/or Tableau; metric design for merchandising (coverage, freshness, price index, conversion lift).
- AI/agentic workflows: Retrieval tool‑use agents to extract attributes, reconcile conflicts, propose taxonomy changes; prompt chaining; evaluation harnesses; safe‑ops patterns for deterministic fallbacks.
- Search relevance & indexing: Relevance tuning for catalog search (Meilisearch/Elastic/OpenSearch) and faceted navigation.
- AWS: S3, Lambda, Glue/Athena, Step Functions, ECS/Fargate, CloudWatch; IaC via the CDK; strong cost/performance instincts.
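As an illustration of the typed models and data contracts named in the requirements above, the sketch below parses a messy vendor CSV into validated records and collects rejects instead of failing the whole file. It uses only the standard library to stay self-contained; in practice a library such as Pydantic would be a natural fit. The column names (`sku`, `price`, `in_stock`) and the `VendorRow` type are assumptions for the example, not a real vendor format.

```python
import csv
import io
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation


@dataclass(frozen=True)
class VendorRow:
    sku: str
    price: Decimal
    in_stock: bool


def parse_row(raw: dict) -> VendorRow:
    """Normalize one raw CSV row into a typed record, or raise ValueError."""
    sku = (raw.get("sku") or "").strip().upper()
    if not sku:
        raise ValueError("missing sku")

    # Strip currency symbols and thousands separators before parsing.
    price_text = (raw.get("price") or "").replace("$", "").replace(",", "").strip()
    try:
        price = Decimal(price_text)
    except InvalidOperation:
        raise ValueError(f"bad price: {raw.get('price')!r}")
    if not price.is_finite() or price <= 0:
        raise ValueError(f"non-positive price: {price_text!r}")

    in_stock = (raw.get("in_stock") or "").strip().lower() in {"y", "yes", "true", "1"}
    return VendorRow(sku=sku, price=price, in_stock=in_stock)


def load_vendor_csv(text: str):
    """Return (good_rows, rejected), where rejected holds (line_number, reason)."""
    good, rejected = [], []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, raw in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        try:
            good.append(parse_row(raw))
        except ValueError as exc:
            rejected.append((line_no, str(exc)))
    return good, rejected
```

Keeping rejects alongside their line numbers gives a fill-rate and failure-rate signal for free, which is the kind of metric the operational dashboards would track.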
Nice‑to‑haves
- Experience with homegrown PIMs
- Vendor EDI familiarity; GS1 barcoding; UNSPSC mapping.
You Might Thrive Here If...
- You are incredibly ambitious
- You are a self-starter and intensely curious
- You are hard-working and relentless, frequently going above and beyond in previous or current roles
- You are driven by achievement and energized by big, industry-disrupting challenges
- You want a "hardcore" work environment
- You want to leave a positive impact on the world
About Attachments King
Attachments King is E-Commerce for Heavy Machinery Attachments. We're pushing the boundaries of the construction industry with innovative proprietary technology that drastically improves the customer experience when purchasing heavy equipment. We firmly prioritize a hard-working, results-driven culture.
Our bar for talent is high, and we do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status. If you are remarkably good at what you do, you belong on our team.
For US Based Candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.
This is the most important time to be alive in human history. Join us, and be a part of something incredible.