What are the responsibilities and job description for the Data Solutions Architect position at Insight Global?
• Demonstrates knowledge of data solutions, data platforms, and data management & governance practices and standards.
• Uses architecture tools to create design artifacts, including but not limited to data models/architecture, API design, and data solutions architecture.
• Leads strategy & architecture for one or more data management products such as Data Catalog, Data Lineage, Data Feeds Registry, Data Quality and related data services.
• Designs data products to provide high-quality data for quantitative modeling solutions for Global Risk Management (GRM).
• Applies knowledge to perform regular assessments of the health and maturity of data & information capabilities for the GRM domain.
• Evangelizes and designs new data solutions and capabilities to support risk lines of business.
• Participates in efforts to define the mission, goals, critical success factors, principles, and procedures for data strategy and information architecture.
• Participates in building the business cases required to secure approval and funding for the Data and Analytical efforts.
• Understands the end-to-end change impact by managing linkages from information capabilities to technical assets (operational and analytical).
• Maintains integrated logical data models and data flows to understand data and its interdependencies regardless of its usage pattern.
6 years of experience understanding and working with Data Platforms and Data Applications.
o Hands-on experience developing actual prototypes:
- (Needed for 1 opening only) Proficient in GenAI frameworks such as LangChain, LlamaIndex, OpenAI's Assistants API, FAISS, RAG (Retrieval-Augmented Generation), Azure OpenAI Service, Azure Cognitive Search, and vector databases like Pinecone, VectorDB, AstraDB, and Azure Cosmos DB for MongoDB vCore (a minimal vector-search sketch follows this list).
- Proficient in data modeling methodologies including dimensional modeling (Star and Snowflake schemas), Data Vault, 3NF, Kimball and Inmon approaches, and designing slowly changing dimensions (SCDs) for tracking historical data changes.
- Expertise in Big Data tools including Hive for SQL-like queries on Hadoop, Pig for data flow scripting, Flink for stateful computations, Storm for real-time stream processing, the MapReduce programming paradigm, Kafka for distributed event streaming, and PySpark for Python-based Spark programming, enabling efficient processing of large-scale datasets.
- Experience designing and building big data architectures and data platforms.
- Strong proficiency in Python for data processing and scripting.
- Practical experience with Apache Spark for large-scale data transformation (see the PySpark sketch after this list).
- Ability to develop data visualizations and dashboards with JavaScript.
- Familiarity with data pipeline management, data cataloging, data lineage, and data quality tools
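For the GenAI and vector-database requirement above, here is a minimal sketch of the retrieval step behind RAG, assuming faiss-cpu and numpy are installed. The embed() helper and the sample documents are placeholders invented for illustration, not part of the posting; in practice you would call a real embedding model (for example an Azure OpenAI embedding deployment).

```python
# Minimal RAG retrieval sketch: index documents as vectors, then fetch the nearest
# ones for a query. embed() is a stand-in for a real embedding model.
import numpy as np
import faiss

DIM = 8  # toy embedding size; real models use 1536+ dimensions

def embed(texts):
    """Placeholder embedding: deterministic random vectors, one per text."""
    rows = [np.random.default_rng(abs(hash(t)) % (2**32)).random(DIM) for t in texts]
    return np.array(rows, dtype="float32")

# Index a small document corpus (illustrative strings only).
docs = [
    "Counterparty credit risk exposure report",
    "Data quality rules for the loan feed",
    "Market risk VaR model inputs",
]
index = faiss.IndexFlatL2(DIM)   # exact L2 search over dense vectors
index.add(embed(docs))

# Retrieve the top-2 documents most similar to a user question; in a full RAG
# pipeline these would be passed to an LLM as grounding context.
query = embed(["which feed has data quality rules?"])
distances, ids = index.search(query, 2)
for rank, doc_id in enumerate(ids[0]):
    print(rank, docs[doc_id], distances[0][rank])
```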
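For the data modeling and Apache Spark requirements above, the following is a minimal PySpark sketch, assuming a local Spark session. It flags the newest row per customer as current, which is the core move behind a Type 2 slowly changing dimension; the table and column names are illustrative only.

```python
# Flag the most recent row per customer as the "current" dimension record.
from pyspark.sql import SparkSession, functions as F, Window

spark = SparkSession.builder.appName("scd2-sketch").getOrCreate()

updates = spark.createDataFrame(
    [
        ("C001", "Gold", "2024-01-01"),
        ("C001", "Platinum", "2024-06-01"),
        ("C002", "Silver", "2024-03-15"),
    ],
    ["customer_id", "tier", "effective_date"],
)

# Rank rows per customer by effective_date, newest first.
w = Window.partitionBy("customer_id").orderBy(F.col("effective_date").desc())

dim_customer = (
    updates
    .withColumn("rn", F.row_number().over(w))
    .withColumn("is_current", F.col("rn") == 1)  # only the newest row is current
    .drop("rn")
)

dim_customer.show()
spark.stop()
```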
• Knowledge of graph processing and graph stores is a plus (a small lineage-graph sketch follows at the end of this list).
• Basic understanding of Java Spring Boot is good, but not a requirement.
• Exposure to open data, linked data, frameworks such as Apache Jena, and cloud platforms is a huge plus.
• Collibra experience
• Experience with open metadata frameworks such as Apache Atlas, DataHub, Marquez, and OpenLineage.
• Prior risk experience (consumer side of risk).
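As a small illustration of the graph-processing and lineage items above, the sketch below models dataset lineage as a directed graph with networkx; the dataset names are hypothetical. Upstream and downstream impact questions reduce to graph traversals, the same queries a lineage store or graph database answers at larger scale.

```python
# Lineage as a directed graph: edges point from upstream datasets to downstream ones.
import networkx as nx

lineage = nx.DiGraph()
lineage.add_edge("raw.loan_feed", "curated.loans")          # feed -> curated table
lineage.add_edge("curated.loans", "mart.credit_exposure")   # curated -> risk mart
lineage.add_edge("ref.customer", "mart.credit_exposure")

# Everything downstream of the raw feed (impact analysis for a schema change).
print(nx.descendants(lineage, "raw.loan_feed"))
# Everything upstream of the risk mart (where did this number come from?).
print(nx.ancestors(lineage, "mart.credit_exposure"))
```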