OrthoRoute Logo

OrthoRoute - GPU Accelerated Autorouting for KiCad

You shouldn't trust the autorouter, but at least this one is faster

You can download this project from the Github repository

This is a project born out of necessity. Another thing I was working on needed an enormous backplane. A PCB with sixteen connectors, with 1100 pins on each connector. That’s 17,600 individual pads. Here, just take a look:

a view of the backplane, before routing the PCB

Look at that shit. Hand routing this would take a month. I tried FreeRouting, the KiCad autorouter plugin, for a laugh and it routed 4% of the traces in seven hours. If that trend held, which it wouldn’t, that would be a month of autorouting. I had a few options, all of which would take weeks.

When confronted with a task that will take months, always choose the more interesting path.

Project Overview

OrthoRoute is a GPU-accelerated PCB autorouter, designed for parallel routing of massive circuit boards. Autorouting is a parallel problem, and nearly everyone doing serious work has a few spare CUDA cores sitting around. Unlike most autorouters such as Altium Situs, FreeRouting, and a dozen EE-focused B2B SaaS startups, OrthoRoute uses GPUs for parallelizing the task of connecting pads with traces.

I don’t know why no one has thought to put wavefront expansion in a GPU before. I seriously feel like I’m taking crazy pills. Autorouting algorithms are ideal for parallel operation.

OrthoRoute is designed as a KiCad plugin, and heavily leverages the new-ish KiCad IPC API and kicad-python bindings for the IPC API.

Key Features:

Screenshots

Screenshot 1, showing an Arduino clone

Performance

Unsurprisingly, doing extremely parallel operations in a GPU is fast. But how does it compare to other Autorouters?

SOMETHING ABOUT PERFORMANCE WHEN I GET THE AUTOROUTING DONE

Why A GPU-Accelerated Autorouter Is Dumb

GPUs are really good at parallel problems. And you would think autorouting is a very parallel problem. It’s just finding a path between two points on a graph. There are algorithms that are embarrassingly parallel that just do this. This is true, but there’s a lot you’re not considering.

Instead of mapping an entire (blank) PCB into a GPU’s memory and drawing traces around obstacles, an autorouter is about finding a path under constraints that are always changing.

If you have three nets, route(A→B), route(C→D), and route(E→F), you start out by routing the direct path (A→B). But (C→D) can’t take the direct path between those points, because it’s blocked by (A→B). Now (E→F) is blocked by both previous routes, so it takes a worse path.

It’s like the traveling salesman problem, but all the salesmen can’t take the same road. Also there are thousands of salesmen.

There’s a reason this is the hardest problem in computer science. This is why people have been working on autorouters for sixty years, and they all suck.

GPUs are mostly terrible for autorouting. Should net (A→B) be higher priority than (C→D)? GPUs hate branching logic. You can’t route (C→D) until you’ve routed (A→B), so that embarrassingly parallel problem is actually pretty small. Now deal with Design Rules. If you don’t want a trace to intersect another trace of a different net, you apply the design rules. But this changes when you go to the next net! You’re constantly redefining Design Rules, which kills any GPU efficiency.

But all is not lost. There’s exactly one part of autorouting that’s actually parallel, and a useful case to deploy a GPU. Lee’s Wavefront expansion. You route your traces on a fine-pitch grid, and propagate a ‘wave’ through the grid. Each cell in the wave can be processed independently. Shortest path wins, put your trace there. That’s what I’m using the GPU for, and the CPU for everything else. Yeah, it’s faster, but it’s not great. Don’t trust the autorouter, but at least this one is fast.

Animated GIF of Wavefront Expansion

Why OrthoRoute is Great on a GPU.

But the entire point of this project isn’t to build a general autorouter. I built this to route one board. And I’m doing it with ‘Manhattan routing’. Manhattan routing is embarrassingly parallel because it’s geometrically constrained. I have 16 identical parts, each with 1100 SMD pads, arranged in a regular grid. That’s 17,600 pads that need to connect to each other in very predictable patterns.

PUT A PICTURE OF ORTHOGINAL ROUTING HERE

Instead of the “route anything anywhere” problem of general autorouting, I have layers with dedicated directions: horizontal traces on one layer, vertical on the next, horizontal again, and so on. When a net needs to change direction, it drops a via and moves to the appropriate layer. No complex pathfinding required, just geometric moves on a regular grid.

This is why it’s called OrthoRoute. It’s just routing through a grid of traces. There’s no DRC needed, because drawing the grid of traces is defined by DRC.

Implementation

The following are the implementation details of OrthoRoute. How I built it, and why I made the decisions I did. If you want to never write a plugin for KiCad, skip this section.

Pre-KiCad 9.0 had a SWIG-based plugin system. Compared to the new IPC plugin system released with KiCad 9, there are serious deficits. The SWIG-based system was locked to the Python environment bundled with KiCad. Process isolation, threading, and performance constraints abound. Doing GPU programming with CuPy, while not impossible, is difficult.

The new IPC plugin system for KiCad is a godsend. The basic structure of the OrthoRoute plugin looks something like this:

Orthoroute architecture

The OrthoRoute plugin communicates with KiCad via the IPC API over a Unix socket. This API is basically a bunch of C++ classes that gives me access to board data – nets, pads, copper pour geometry, airwires, and everything else. This allows me to build a second model of a PCB inside a Python script and model it however I want.

Bug-finding and Path-finding.

After using the IPC API to extract the board, I had everything. Drills, pads, and copper pours. The next step was to implement path finding. This is the process of selecting an airwire, and finding a path between its beginning and end. This is what autorouters do, so I’m using standard algorithms like Wavefront expansion, and a few other things I picked up from a VLSI textbook. All I needed to do was convert that list of pads and traces and keepouts to a map of where the autorouter could draw traces.

The key insight here, and I think this is pretty damn smart, is that a copper ground plane is the map of where traces can go. If I create a board with a few parts on it and put down a ground plane, that ground plane defines where the autorouter can draw traces.

While the kicad-python library has functions that will return the polygon of a copper pour, I can’t use that. Not all boards will have a copper pour. What I really need to do is reverse engineer the process of making these copper pours myself. I need a way to extract the pads, traces, and keepouts and applying DRC constraints to them.

That’s exactly what I did. First I learned how to extract the copper pour from a board, giving me a 5000-point polygon of copper with holes representing pads, traces, and clearances. Then, I reverse engineered this to create a ‘virtual’ copper pour – one that could be generated even if the KiCad board file doesn’t have a copper pour. By comparing this ‘virtual’ copper pour to the ground truth of the copper pour polygon extracted from the board, I could validate that my ‘virtual’ copper pour was correct. This gives me an exact map of where the autorouter can lay down copper.

Development of the virtual copper pour extraction showing real thermal relief vs virtual algorithm vs difference map

Pathfinding wasn’t the issue. That’s easy because there are published algorithms. What really gave me a lot of trouble is figuring out where the pathfinding was valid. I’m sure someone else would have spent months going through IPC specifications to figure out where an autorouter should route traces. But I leveraged the simple fact that this data is already generated by KiCad – the copper pour itself – and just replicated it.

If there are any KiCad devs out there reading this, this is what you should implement in the next iteration kicad-python. A get_free_routing_space(int copper_layer) function. Something that returns the polygon of a copper pour on boards that don’t have a copper pour, for each layer of copper.

back