What are the responsibilities and job description for the Data Scientist III, Cancer Genomics Research Laboratory (CGR) position at Frederick National Laboratory for Cancer Research?
Job ID: req4544
Employee Type: exempt full-time
Division: Clinical Research Directorate
Facility: Rockville: 9615 MedCtrDr
Location: 9615 Medical Center Drive, Rockville, MD 20850 USA
The Frederick National Laboratory is operated by Leidos Biomedical Research, Inc. The lab addresses some of the most urgent and intractable problems in the biomedical sciences in cancer and AIDS, drug development and first-in-human clinical trials, applications of nanotechnology in medicine, and rapid response to emerging threats of infectious diseases.
Accountability, Compassion, Collaboration, Dedication, Integrity and Versatility; it's the FNL way.
Program Description
The Cancer Genomics Research Laboratory (CGR) investigates the contribution of germline and somatic genetic variation to cancer susceptibility and outcomes in support of the NCI's Division of Cancer Epidemiology and Genetics (DCEG), the world’s most comprehensive cancer epidemiology research group. CGR is located at the NCI-Shady Grove campus in Rockville, MD and is operated by Leidos Biomedical Research, Inc. We care deeply about discovering the genetic and environmental determinants of cancer, and new approaches to cancer prevention, through our contributions to the molecular, genetic, and epidemiologic research of the 70 investigators in DCEG. CGR staff form a large multidisciplinary team of laboratory staff, informaticists, data analysts, and project managers who work in concert with epidemiologists, biostatisticians, and research scientists within the DCEG intramural research program. CGR focuses on generating high-quality data to support a wide range of sequencing and GWAS studies at the population level. These efforts produce large and complex bioinformatics and data analysis outputs, with large volumes of multi-year data that require careful curation and presentation to ensure long-term preservation and usability. The Data Scientist III will be responsible for the continuation, refinement, and creation of the data curation procedures and methods needed to properly preserve the high-quality data and analyses generated within CGR/DCEG and to bring data in line with the FAIR principles (Findable, Accessible, Interoperable, and Reusable). This will include the development and deployment of those procedures for new data generation, as well as the application of those procedures to key datasets.
The successful candidate must demonstrate the technical and interpersonal skills to work with CGR/DCEG staff of varied skillsets in order to understand the background of legacy data and how it is used, as well as the scientific and technical know-how to access and evaluate all data. We are seeking an enthusiastic and driven professional with a passion for understanding state-of-the-art genomic technologies used in cancer research and applying that knowledge to high-quality data curation practices. The candidate is required to collaborate with others to build on existing principles and procedures, and establish new ones, that protect the data through long-term curation while facilitating its usefulness and accessibility for research by CGR and its collaborators. If you want to understand and preserve high-quality cancer susceptibility data while making it more FAIR, come and help that data deliver additional impact and drive further understanding of cancer genetics.
Key Roles/Responsibilities
- Collaborate with CGR project managers, bioinformaticians, laboratory staff, DCEG investigators and data science experts to continue, refine, and develop data management procedures and controls for the management and curation of CGR-generated datasets as per the FAIR Principles.
- Develop and refine best practices and lists of metadata needed for the archival and long-term curation of CGR/DCEG data.
- Evaluate all datasets in CGR to effectively apply established curation procedures and allow for standardization, harmonization, and general usability of these data.
- Evolve and continue development of a repository of CGR sample metadata to facilitate accessibility and reuse of data.
- Collaborate with established data storage groups within NCI, FNLCR, and NIH to facilitate the archival and curation of datasets while ensuring proper storage and accessibility of the data is maintained.
- Collaborate closely with DCEG PIs and CGR staff to facilitate and develop a culture in support of the FAIR principles to provide greater data accessibility and usability.
- Coordinate as needed with staff at CGR and DCEG to support posting of publication-associated data to repositories in line with NIH Data Management and Sharing policies.
To be considered for this position, you must minimally meet the knowledge, skills, and abilities listed below:
- Possession of Bachelor’s degree from an accredited college/university according to the Council for Higher Education Accreditation (CHEA) or four (4) years relevant experience in lieu of degree. Foreign degrees must be evaluated for U.S. equivalency.
- In addition to the education requirement, a minimum of five (5) years of progressively responsible experience.
- Team-oriented with excellent written and verbal communication skills, organizational skills, and strong attention to detail; ability to interface with many collaborators across multiple roles including project managers, PIs, and bioinformaticians.
- Demonstrated ability to work with and understand genetic and epidemiological datasets and how they are used.
- Proficiency in programming languages, such as Python, R, bash and SQL.
- Demonstrated experience with a variety of systems (e.g. HPC, GPU, Cloud), platforms, and environments (e.g. native, Conda, containerization) to help facilitate data assessment and understanding of user data access.
- Demonstrated experience with version control and code management systems such as Git.
- Familiarity with data repositories and/or database systems such as MySQL, PostgreSQL, MS SQL Server, Oracle.
- Ability to learn new topics quickly and apply the concepts to drive progress.
- Ability to independently organize meetings and track project progress over time.
- Strong interpersonal skills for working in large teams with different backgrounds.
- Strong written and presentation skills to summarize complex topics effectively and clearly.
- Flexibility in understanding the needs of the projects and adapting accordingly.
- Familiarity with FAIR principles and related best practices.
- Ability to obtain and maintain a security clearance.
Candidates with these desired skills will be given preferential consideration:
- Master's degree or PhD.
- Experience managing Omics datasets.
- Experience managing and establishing FAIR-based data curation procedures.
- Experience harmonizing metadata of large datasets from multiple sources.
- Experience navigating file systems to find and identify data and file metrics.
- Familiarity with web application development languages such as TypeScript, JavaScript, etc.
- Familiarity with cloud computing environments such as AWS and GCP, and cloud storage services such as AWS S3 and GCP GCS.
- Familiarity with genomic data commons, such as NCI GDC, AnVIL, and the NHLBI BioData Catalyst data platform.
- Familiarity with Omics data analysis procedures to establish an understanding of how the data is used and what is needed to perform analyses.
- Familiarity with project management tools for documentation and communication (Jira, Teams, Slack, etc.) to document status and activities.
- Familiarity with laboratory management systems.
- Familiarity with differing storage platforms and performance tiers.
- Familiarity with common bioinformatics tools and workflows for processing of GWAS and sequencing-based genomic data. Understanding of the QC metrics, file formats, and best practices for data management and archiving.
All qualified applicants will receive consideration for employment without regard to sex, race, ethnicity, color, age, national origin, citizenship, religion, physical or mental disability, medical condition, genetic information, pregnancy, family structure, marital status, ancestry, domestic partner status, sexual orientation, gender identity or expression, veteran or military status, or any other basis prohibited by law. Leidos will also consider for employment qualified applicants with criminal histories consistent with relevant laws.
Pay And Benefits
Pay and benefits are fundamental to any career decision. That's why we craft compensation packages that reflect the importance of the work we do for our customers. Employment benefits include competitive compensation, Health and Wellness programs, Income Protection, Paid Leave and Retirement.
113,500.00 - 162,533.00 USD
The posted pay range for this job is a general guideline and not a guarantee of compensation or salary. Additional factors considered in extending an offer include, but are not limited to, responsibilities of the job, education, experience, knowledge, skills, and abilities as well as internal equity, and alignment with market data.
The salary range posted is a full-time equivalent salary and will vary depending on scheduled hours for part-time positions.