What are the responsibilities and job description for the Knowledge Graph Engineer position at e-IT Professionals Corp.?
Job Title: Knowledge Graph Engineer
Location: Frisco, TX or Atlanta, GA
Mode: Contract
Should have worked at any of these companies: Google, Meta, Amazon, Apple, Netflix, Microsoft, Nvidia, Salesforce, Oracle, IBM, Intel, Cisco, Adobe, Palantir, Snowflake, Databricks, Twitter / X, LinkedIn, Uber, Airbnb
Must have: Graph DB (Neptune, Neo4j, or TigerGraph); Gremlin or Cypher
Good to have: Neptune-specific experience; Flink; embeddings
About the Role
We are looking for an experienced Knowledge Graph Engineer to design, build, and scale a production-grade property graph platform that powers customer segmentation, device intelligence, and household-level insights. You will own the full lifecycle, from schema design and bulk ingestion to real-time CDC pipelines and graph embeddings, working closely with the segment engine team to deliver high-performance traversal queries and ML-ready embeddings at scale.
Key Responsibilities
Schema Design: Architect the property graph schema, defining node types (Customer, Device, Account, Plan, Offer) and edge types (HAS_DEVICE, ON_PLAN, SHARES_HOUSEHOLD, SIMILAR_TO), and ensuring optimal cardinality and partition key design for scale.
Bulk Load Pipeline: Build and validate the initial bulk load job across the ingestion stack (e.g., Delta Lake → S3 staging → Neptune bulk loader, or equivalent technology).
Real-Time CDC Pipeline: Implement a change data capture pipeline (e.g., Cosmos DB Change Feed → Kafka → Flink → Neptune writer) with an end-to-end lag target of under 60 seconds.
Query Development: Write and optimize Gremlin traversal queries for household segmentation, device-sharing patterns, and account-linked segmentation use cases.
Index Strategy: Design vertex-centric indexes and leverage Neptune Analytics HNSW for embedding-based similarity lookups.
Graph Embeddings: Build a Node2Vec embedding pipeline (SageMaker or Databricks) and load SIMILAR_TO edges to support ML-driven similarity features.
Documentation: Document schema definitions, traversal patterns, and query performance benchmarks for consumption by the segment engine team.
Must-Have Skills & Qualifications
4 years of graph database engineering experience with production Gremlin / TinkerPop expertise.
AWS Neptune or an equivalent cloud graph database: bulk loader operations, instance sizing, HA configuration, and VPC networking.
Apache Kafka and Apache Flink for CDC pipeline design and implementation.
Property graph data modelling: entity resolution, edge cardinality, and partition key design.
Graph traversal performance profiling at scale (100M nodes).
Nice-to-Have Skills
Graph embedding algorithms: Node2Vec, GraphSAGE, or similar.
Neptune Analytics experience for graph analytics workloads.
Neo4j migration or comparative architecture experience (trade-offs vs. Neptune at scale).
Python (gremlinpython) and Java traversal source authoring.
AWS SageMaker or Azure Databricks for embedding model training.
What We Offer
Opportunity to architect and own a greenfield knowledge graph platform at enterprise scale.
Work with cutting-edge graph and ML technologies across AWS, Kafka, Flink, and SageMaker ecosystems.
Collaborate with data engineering, ML, and product teams to drive real customer and business impact.
Competitive compensation, flexible work arrangements, and a culture of continuous learning.
Salary: $40 - $60