
HSIO Functional Validation Engineer - Data Center GPU

Advanced Micro Devices, Inc
MARKHAM, Ontario, Canada
$150K a year (estimated)
Full-time

WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world.

Our mission is to build great products that accelerate next-generation computing experiences: the building blocks for the data center, artificial intelligence, PCs, gaming, and embedded.

Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world’s most important challenges.

We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives.

AMD together we advance

THE ROLE:

The Data Center GPU Group is looking for self-driven Functional Validation Engineers.

You will work on High Speed IO (HSIO) interfaces that are crucial to enabling AI-driven processing demands, such as PCIe and AMD Infinity Fabric.

As a Functional Validation (FV) Engineer, you will cover digital IP controllers and link protocols, HSIO performance, and RAS.

The scope of work spans all product life-cycle phases: pre-silicon readiness, post-silicon bring-up and validation, device interoperability, and customer high-volume deployment.

You will have opportunities to build capabilities in test creation, automation, and debug. Our team fosters continuous innovation and is committed to nurturing your career growth.

THE PERSON:

You are a driven HSIO Validation Engineer with a passion for pushing boundaries in the pursuit of finding bugs.

You should be very comfortable deep diving into HSIO link controllers, protocol, traffic, and device interoperability. Complete comfort operating in Linux-based environments is required for data center products.

KEY RESPONSIBILITIES:

Develop test procedures for Functional Validation of the PCIe Gen5 interface (or similar).
Hands-on debug with PCIe state machines, logic analyzers, exercisers, and LTSSM issues.
Hands-on experience with testing HSIO performance and power features.
Metrics-driven validation and test characterization for stress and effectiveness.
Develop automation in Python.
Implement improvements towards time-to-quality and time-to-root-cause.
Engage with external partners and customers for tools development and debug.
Technical reporting to communicate program status.
Drive to meet program milestones and customer deliverables.

IDEAL CANDIDATE:

5+ years of experience in the semiconductor industry in a validation role.
Experience with PCIe Gen5 Protocol, Performance, and Power features.
Experience with SoC digital logic design for link controllers and traffic arbitration.
Experience with system error handling and management (RAS) for HSIO.
Experience with HSIO interoperability testing between root, endpoint, switches, re-timers, and other devices.
Strong coding skills (Python).
Must have strong analytical skills for test creation and debug.
Must have strong communication and collaboration skills.
Must be a self-starter and be able to independently drive tasks to completion.
Experience in the Data Center industry is a valuable asset.

ACADEMIC CREDENTIALS:

Bachelor’s or Master’s degree in Electrical Engineering (or equivalent field).

LOCATION: Markham, ON

#LI-SL2 #LI-HYBRID

Benefits offered are described: AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services.

AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law.

We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.

30+ days ago