Job Title : Data Scientist
Duration : 6+ month contract (with extension; no end date to the contract)
Work Location : Toronto, ON (Hybrid: 3 days in office, 9am to 5pm)
Job Description :
The Data Scientist is a role in the Digitalization Program Team and reports to the Project Director of Digitalization. They will play a pivotal role in building a roadmap of data analysis and science for operational data to drive planning and decision-making insights.
The bulk of the work will be data modelling, management and problem analysis, data exploration and preparation, data collection and integration, and model operationalization, all of which will define the future of the client's data practice.
Candidates need to be motivated, self-driven, curious and creative. Because the client is still building up its data analytics practice, the role could also cover the related roles of Data Engineer and DevOps Engineer at times.
Data Scientist roles and responsibilities include :
Data mining or extracting usable data from valuable data sources
Using machine learning tools to select features, create and optimize classifiers
Carrying out preprocessing of structured and unstructured data
Enhancing data collection procedures to include all relevant information for developing analytic systems
Processing, cleansing, and validating the integrity of data to be used for analysis
Analyzing large amounts of information to find patterns and solutions
Developing prediction systems and machine learning algorithms
Presenting results in a clear manner
Collaborating with business and IT teams
Developing ETL and data integration processes as needed
Key Requirements
A successful candidate will have the expertise and skills described below :
Education and Training
A bachelor’s or master’s degree in computer science, data science, operations research, statistics, applied mathematics, or a related quantitative field is preferred.
Equivalent experience and education in areas such as economics, engineering, or physics are acceptable. Experience in more than one area is strongly preferred.
Candidates must have a specialization in ML, AI, cognitive science or data science.
Previous Business Experience
Candidates should have six or more years of relevant experience successfully executing data science projects, preferably in the domains of customer behavior prediction, transportation, and logistics operations.
A specialization in text analytics, image recognition, graph analysis or other specialized ML techniques such as deep learning is preferred.
Ideally, the candidates are adept in agile methodologies and well-versed in applying DevOps / MLOps methods to the construction of ML and data science pipelines.
Candidates should demonstrate data wrangling experience applied in successfully implemented projects.
Candidates should exhibit significant project experience in applying ML and data science to business functions such as operations planning, customer journey analytics, and marketing analytics.
Candidates need to demonstrate that they were instrumental in executing significant end-to-end data science projects.
Candidates should have demonstrated the ability to manage large data science projects including many data sources and diverse teams.
Candidates should have a growing track record of coaching other team members and providing training for knowledge transfer.
IT Knowledge / Skills
Coding knowledge and experience in several languages and tools such as R, Python / Jupyter, SAS, Java, Scala, C++, Excel and MATLAB.
Experience with popular database programming languages, including SQL and PL / SQL for relational databases, and with emerging non-relational (NoSQL / Hadoop-oriented) databases such as MongoDB and Cassandra.
Experience with distributed data / computing tools : MapReduce, Hadoop, Hive, Kafka, MySQL.
Experience with operationalizing ML workflows using specialized MLOps frameworks such as Kubeflow, MLflow, Liminal, Seldon Core, or general task orchestration frameworks such as Airflow, Luigi, Argo and others.
This may also include MLOps tools such as Domino Data Lab, IBM, TIBCO, Superwise.AI, Arthur.AI, Modzy, ModelOp and others.
Experience working across multiple deployment environments, including cloud, on-premises and multiple operating systems.
Machine Learning and Data Science Knowledge / Skills
Experience in one or more of the following commercial / open-source data discovery / analysis platforms : RStudio, Spark, KNIME, RapidMiner, Alteryx, Dataiku, H2O, Microsoft AzureML.
Knowledge and experience in statistical and data mining techniques : generalized linear model (GLM) / regression, random forest, boosting, trees, text mining, hierarchical clustering, deep learning, convolutional neural network (CNN), recurrent neural network (RNN), t-distributed stochastic neighbor embedding (t-SNE), graph analysis.
Interpersonal Skills and Characteristics
All candidates must be self-driven, curious, and creative.
All candidates must demonstrate the ability to work in diverse, cross-functional teams in a dynamic business environment.
All candidates should be confident, energetic self-starters, with strong moderation and communication skills.
All candidates should exhibit superior presentation skills, including storytelling and other techniques to guide and inspire.
All candidates should exhibit a strong business sense / acumen and be driven to business success.
All candidates should have a track record in launching innovative projects, gaining the respect of stakeholders at all levels and roles within the company.