About us:
At Grok (www.grokstream.com), we come to work every day because we want to solve the most challenging problems in machine learning for IT service organizations: making machine learning actionable, making it easy to use and deploy, and allowing organizations to take ownership of their machine learning destiny.
We are looking for a Senior Data Scientist / ML Engineer who will support our product development and sales efforts by applying and developing ML techniques to automate operational processes. The ideal candidate is adept at using large data sets to find opportunities for product and process optimization and using models to test the effectiveness of different courses of action. They must have strong experience using a variety of data mining/data analysis methods, using a variety of data tools, building and implementing models, using/creating algorithms and creating/running simulations. They must have a proven ability to drive business results with their data-based insights. They must be comfortable working with a wide range of stakeholders and functional teams. The right candidate will have a passion for discovering solutions hidden in large data sets and working with stakeholders to improve business outcomes.
Responsibilities:
- Take existing research based on classical statistical learning (Random Forest) and improve the existing models.
- Mine and analyze customer operational data to drive optimization and improvement of product development.
- Assess the effectiveness and accuracy of new data sources and data gathering techniques.
- Develop custom data models and algorithms to apply to data sets.
- Use predictive modeling to improve and optimize customer experiences.
- Coordinate with different functional teams to implement models and monitor outcomes.
- Develop processes and tools to monitor and analyze model performance and data accuracy.
Required:
- Master’s or PhD in Statistics, Mathematics, Computer Science or Physical Sciences.
- Intermediate spoken English skills.
- 3-5 years of research or industry experience building statistical models.
- Experience with the following software/tools:
  - Python (coding knowledge and experience)
  - Python libraries: Jupyter, SciPy, NumPy
  - Classical machine learning methods: Random Forest, Gradient Boosted Trees, Hierarchical Clustering
- Strong problem-solving skills with an emphasis on product development.
- Knowledge of a variety of machine learning techniques (clustering, decision tree learning, artificial neural networks, etc.) and their real-world advantages/drawbacks.
- Knowledge of advanced statistical techniques and concepts (regression, properties of distributions, statistical tests and proper usage, etc.) and experience with applications.
- A drive to learn and master new technologies and techniques.
- Experience creating and using advanced machine learning algorithms and statistics: regression, simulation, scenario analysis, modeling, clustering, decision trees, neural networks, etc.
- Experience visualizing/presenting data for stakeholders.