Data Scientist

Company: Blue Ocean Ventures
Location: West Sacramento
Posted on: January 15, 2023

Job Description:

Remote position

Will convert to a full-time role

Data Scientist (Level 04)
Having 1 to 4 years of experience in industry (including internships in an industrial setting completed while in school) and strong collaborative and interpersonal skills, a Data Scientist knows and uses the basic tools of the craft (data ingestion, data cleansing, feature engineering, data filtering and aggregation, statistical estimation and hypothesis testing, and machine learning) to support GWT's Medicaid clients by wrangling healthcare and other client data assets and then analyzing them to surface basic, impactful, actionable insights. This level doesn't require prior experience with healthcare, but it's a strong plus coming in, and learning this content domain starting on day one is part and parcel of the ongoing task.
Must Have:
Has worked several times with datasets of at least 100K rows
Has solid experience wrangling "Ugly Data"
Has used all basic SQL features (select, join, where, group by, etc.) on non-training databases
Has used Machine Learning on at least two non-training data sets (that is, not a teaching set such as Iris)
Has used Statistics on at least two non-training data sets
Solid Python including syntax and ML libraries
Frequently Occurring Plusses at This Level:
Experience using Jupyter Notebooks
Familiar with the use of GitHub, GitLab, or other source control software
Experience with Distributed Computing
Experience with Healthcare data
Mainstream BI dashboarding tools such as Power BI, Tableau, or SAS
Strong Object-Oriented Design (C++, C#, Python, etc.)
Functional Programming
Excellent written and verbal communication skills
Please Highlight Any of These More Advanced Levels of Experience:
Expert knowledge of SQL queries to process large amounts of data in a set-based fashion
Thorough understanding of various SQL windowing functions through regular use
Solid understanding of indexing, stored procedures, materialized views, etc.
Has worked with datasets 10M to 10B rows in size, some of them ugly/messy
Solid experience with the use of DevSecOps procedures for iterative development
Experience doing Data Science leveraging the elasticity in a major Cloud environment
Has good knowledge of Repeated Measures Statistical methods including forecasting
Performed Data Science with Healthcare data
Performance Optimization for Data Science development and in production
Working knowledge of stacks and CQRS data structures, and state machines that use them
Knowledge of Semi-Structured and Unstructured data, schema-on-read, parsers, NLP packages
Artificial Intelligence / Neural Network Experience a strong plus
Comfortable presenting to business stakeholders and executives

