# Modular Design
This page describes how to use Lighthouse in a modular way, inside and outside of the Lighthouse repository. Not everything described here is implemented or final yet; this page captures the intent.
Lighthouse is a pip-installable Python package that allows you to build MLIR compilers. It contains all the infrastructure to:
- Load models from various ingress frameworks
- Select, iterate, auto-tune and run various schedules and pipelines
- Prepare, collect dependencies and run on various targets
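As a rough illustration of those three stages, a driver loop might look like the sketch below. This is not the actual Lighthouse API: every name, class, and "pass" here is invented for illustration, with trivial string transforms standing in for real MLIR passes.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Pipeline:
    name: str
    passes: list  # ordered pass callables, each mapping IR text -> IR text

def ingress(model: str) -> str:
    # Stand-in for the real ingress (e.g. Torch-MLIR lowering a model to MLIR).
    return f"module {{ // lowered from {model} }}"

def run_pipeline(ir: str, pipeline: Pipeline) -> str:
    for p in pipeline.passes:
        ir = p(ir)
    return ir

def select_pipeline(candidates, score: Callable[[str], int], ir: str) -> str:
    # "Select, iterate, auto-tune": try each candidate, keep the best-scoring output.
    return min((run_pipeline(ir, c) for c in candidates), key=score)

ir = ingress("resnet18")
pipelines = [
    Pipeline("noop", []),
    Pipeline("strip", [lambda s: s.replace("// lowered from resnet18 ", "")]),
]
best = select_pipeline(pipelines, score=len, ir=ir)
print(best)
```

In the real package, `run_pipeline` would dispatch to MLIR pass pipelines and `score` would come from profiling a target, but the shape of the loop (ingress, candidate schedules, a selection criterion) is the point.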
A separate compiler tool acts as a driver: it selects the options, brings additional functionality (schedules, transforms, and potentially even whole dialects), and makes sure that the appropriate libraries and drivers are in the right place for the Lighthouse modules to use.
Lighthouse depends on other projects within the LLVM umbrella, mainly MLIR and Torch-MLIR. MLIR is our core compiler infrastructure, while Torch-MLIR gives us the ingress from Torch, XLA and ONNX into MLIR dialects such as Linalg or TOSA. Therefore, Lighthouse depends on those projects and pulls them in as Python dependencies. It also brings in other dependencies, such as PyTorch, GPU drivers, OpenMP and other libraries that the ingress or egress portions may use.
Furthermore, you can add dependencies of your own infrastructure. When building a compiler with Lighthouse, you can build your tools (for example, other MLIR-based compilers) and pipe them through the scheduler, or consume the generated output to execute on a proprietary hardware platform.
A number of utilities can be added on top of Lighthouse over time, for example extractors for models from Hugging Face, KernelBench, etc. that use the ingress modules in Lighthouse. Certain compilers, especially the examples in the Lighthouse repository, may in time provide reusable examples of high-level concepts that get added to Lighthouse, so users can pick them up at package installation.
Another key design goal of the project is to common up scheduling decisions: to improve compatibility between passes and transforms, to inform canonicalization, and to reduce complexity. The end result is that a collection of common patterns will emerge, and these can also be included in the Lighthouse package, so that compilers using it benefit directly, without having to replicate them downstream.
These are not core functionality, but they do add value to the project.
Inside the Lighthouse repository, we have the primary users, i.e. compilers, which help us validate our design. They're primary not just because they'll be tested with every Lighthouse (and MLIR) commit, but also because they should represent the "upstream intent" on how to use MLIR tools and dialects. Since there can be many different intents, there will be different tools, or the same tools will be used differently.
One example is an mlir-opt-like tool, that takes in MLIR and outputs MLIR and runs various passes/schedules in between. This is the core tool for our local testing: it allows us to validate that changes in Lighthouse/MLIR will not break our assumptions and invariants across the different schedules.
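The shape of such an opt-like tool can be sketched in a few lines. This is a toy, not the actual tool: the pass registry and the string-transform "passes" are invented stand-ins, where a real driver would dispatch to MLIR pass pipelines.

```python
import argparse
import io

# Toy pass registry; real entries would be MLIR pass pipelines.
PASSES = {
    "strip-comments": lambda ir: "\n".join(
        line for line in ir.splitlines() if not line.lstrip().startswith("//")
    ),
    "canonicalize": lambda ir: ir.strip() + "\n",
}

def opt(ir: str, pass_names: list) -> str:
    """mlir-opt style: MLIR text in, passes in the middle, MLIR text out."""
    for name in pass_names:
        ir = PASSES[name](ir)
    return ir

def main(argv, stdin, stdout):
    parser = argparse.ArgumentParser(description="toy opt-like driver")
    parser.add_argument("--passes", default="", help="comma-separated pass names")
    args = parser.parse_args(argv)
    names = [n for n in args.passes.split(",") if n]
    stdout.write(opt(stdin.read(), names))

# Demonstrate with in-memory streams instead of real stdin/stdout:
buf = io.StringIO()
main(["--passes", "strip-comments,canonicalize"],
     io.StringIO("// comment\nmodule {}\n"), buf)
print(buf.getvalue())
```

Because the tool is pure text-to-text, it slots directly into LIT/FileCheck-style testing: feed it IR, check the IR that comes out.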
Another example is an ingress tool, which uses the Lighthouse "front-end" to take in various modules and convert them to MLIR dialects. Since we already have many such tests in Torch-MLIR, this tool would use Lighthouse, which in turn uses Torch-MLIR, to take in a model from various places and spit out MLIR. The testing is in the gather, not the conversion.
Finally, there is a compiler tool that fully utilizes the Lighthouse functionality, shares implementation with the tools above, and provides a more general driver to test end-to-end execution.
Like MLIR and LLVM, Lighthouse needs two types of tests: output verification and execution.
The former is done by running tools and checking the IR or output against a golden reference (LIT, grep, diff, etc.), and can be done in any CI loop. These tests are quick and run on CPUs, so they can use the standard free instances in GitHub, running on every commit and even in pre-commit CI.
The latter needs special hardware, so it must run on crafted builders. The tests, and the way to reproduce them, must all be upstream, so that anyone with that particular hardware can run them. But those CI loops should run separately. Their results can be reported back into the project (e.g. as badges in the README).
These tests should use the tools above, which will use Lighthouse modules, their dependencies and add-ons. The more comprehensive the tests, the better.
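The golden-reference check in the first kind of test can be sketched with nothing but the standard library; the file names below are placeholders, and a real harness would obtain `actual` by running one of the tools on a test input.

```python
import difflib

def check_against_golden(actual: str, golden: str) -> bool:
    """Compare tool output against a golden reference; print a unified diff on mismatch."""
    if actual == golden:
        return True
    diff = difflib.unified_diff(
        golden.splitlines(keepends=True),
        actual.splitlines(keepends=True),
        fromfile="golden.mlir",  # placeholder file names
        tofile="actual.mlir",
    )
    print("".join(diff))
    return False

ok = check_against_golden("module {}\n", "module {}\n")
bad = check_against_golden("module {}\n", "module { func.func @f() }\n")
print(ok, bad)
```

This is exactly the kind of check that needs no special hardware: it is deterministic, text-based, and cheap enough to run on every commit.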
Using the upstream tools as examples, downstream tools should focus on what they're aiming for (just an opt tool or a full compiler) and then copy the style of the upstream Lighthouse usage. This is because Lighthouse is the infrastructure, and any compiler that uses it is just a wrapper driver around it.
If a tool's complexity starts growing and gets replicated downstream, we have two choices:
- Offload the complexity into Lighthouse, since it's already used by other users
- Change the model to overload the tool
Choosing option (2) above is non-trivial. For example, many Clang clones struggle with merge conflicts and with upstreaming deltas, ending up caught between two bad choices.
Equally, if Lighthouse's complexity grows and non-Lighthouse MLIR users are repeating it, that's a strong signal to move that complexity into MLIR.
If you have a Lighthouse based project and you want to upstream a piece of code, you need to analyse what kind of code this is to decide where it goes. By default, if it can go to MLIR, then it should go there. This is especially true for things that increase merge conflicts, such as changes in API, dialect operations, etc.
Lighthouse is adding infrastructure to create dialects and transforms in Python for experimentation and prototyping, not to keep long-term deltas away from MLIR. So, if users have developed dialects in their forks, they should prepare a patch for MLIR and land it there directly, not through Lighthouse, then pull it all the way down and use it.