Job Title : Python Backend Software Engineer
Location : Mississauga, ON
Description : Background
The Visualization and Interactive Data Analysis group within gRED Computational Catalysts is a group of scientists and engineers who build interfaces to help scientists better understand data.
The development of high-throughput methods to profile the genome, screen compounds, and automatically collect images rapidly generates vast amounts of data that enable us to better understand the underlying causes of disease and identify treatments.
However, translating these data into insights to identify drug targets and candidates remains challenging. The sheer size of these data necessitates better techniques to query, explore, and analyze them.
Moreover, these datasets are often high-dimensional, requiring the integration of multiple data modalities to understand their meaning.
We combine our passion for visualization, information processing, and user-centered design with expertise in manipulating data to extract scientific insights and the engineering skills to bring this vision to life.
Working closely with scientists who are experts in a particular disease area, we build easy-to-use tools to enable exploration and interpretation of large, heterogeneous data and analyses.
We also develop and share new methods to visualize and interact with data. Within this group, you'll lead backend engineering design and development to quickly access large amounts of data within interactive visualization applications.
As the team is distributed between the US (San Francisco) and Canada (Vancouver), the successful candidate should work in the Eastern or Pacific Time Zone.
Responsibilities
- Develop and maintain highly performant, scalable systems capable of transforming, analyzing, and querying data from distributed sources to feed data visualization interfaces
- Create processes to schedule, execute, and monitor data transformation workflows
- Design, implement, and maintain APIs to quickly access data from a web-based application
- Collaboratively and pragmatically solve scientific software engineering challenges within interactive data analysis and visualization
- Work with computational scientists, biologists, and other software engineers to elucidate the emerging needs of our scientists, whether they are working at the keyboard or the bench
- Collaborate with distributed scientific and engineering teams to support your software development efforts
- Contribute to the broader scientific community through open-source software development
Required Qualifications
- BS or higher in Bioinformatics, Computer Science or related fields
- Expertise (+ years of experience) in Python, including designing and developing high-performance systems and packages
- Expertise in building, deploying, maintaining, and monitoring APIs
- Expertise in designing, running, and maintaining workflow processes, containers, schedulers, and systems on on-premises servers and in the cloud
- Experience with new and efficient file formats for large data
- Experience with scientific computing packages (SciPy, NumPy, pandas, etc.)
- Proficiency with cloud infrastructure, particularly AWS, to establish APIs and data services or databases
- Expertise in storing and extracting large amounts of data via cloud-based systems, including S3 buckets
- Demonstrated adherence to best practices in software engineering, particularly usability, version control, testing, and appropriate use of abstraction
- Passion for continuous learning and teaching others
- As the team is distributed between the US and Canada, the successful candidate should work in the Eastern or Pacific Time Zone.
Nice-to-haves
- Familiarity with formal build / release / deploy and continuous integration frameworks
- Experience with Kubernetes, AWS Lambda, or other FaaS and containerized workloads
- Maintaining deployment infrastructure (reproducible, and IaaS), monitoring of events, and system maintenance
- Data wrangling, processing, and analysis in Python and / or R
- Biological domain knowledge, specifically in single-cell genomics
- Familiarity with MultiAssayExperiment and other representations of biological information
- Experience building interactive visualization applications using modern frameworks and technologies (React, Vue, Svelte; WebGL)
- Building interactive data apps in R and Python (Shiny, Streamlit, etc.)