by: Kevin Broløs & Chris Cave
(Feyn version 3.0 or newer)
The QLattice is a supervised machine learning tool for symbolic regression, developed by Abzu and inspired by Richard Feynman's path integral formulation. That's why we've named our Python library Feyn, and the Q in QLattice is for Quantum.
It composes functions together to build mathematical models between the inputs and output in your dataset. The functions vary from elementary ones such as addition, multiplication and squaring, to more complex ones such as natural logarithm, exponential and tanh.
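As a minimal sketch of what "composing functions" means, the snippet below builds one small model by hand from a few of the elementary functions named above. The model itself and the names `model`, `rows` and `predictions` are made up for illustration; this is not how Feyn represents models internally.

```python
import math

# A few elementary building blocks of the kind mentioned above.
def add(a, b):
    return a + b

def multiply(a, b):
    return a * b

def square(a):
    return a * a

def model(x1, x2):
    """Hypothetical composed model: tanh(x1 * x2 + x1**2)."""
    return math.tanh(add(multiply(x1, x2), square(x1)))

# Illustrative input rows (x1, x2); not real data.
rows = [(0.0, 1.0), (0.5, 2.0), (1.0, -1.0)]
predictions = [model(x1, x2) for x1, x2 in rows]
```

The whole model stays a single readable formula, which is the property symbolic regression is after.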
Overall, symbolic regression approaches tend to maintain high performance while still generalising well, which separates them from other popular models such as random forests. Our own benchmarks show similar results.
Feyn is the Python module for using the QLattice and for training models that have been sampled from the QLattice.
When sampling models, you define criteria in Feyn that these models must meet. Some examples include: whether it is a classification or regression problem, which features you want to learn about, which functions you want to include, how complex the models may be, and other such constraints.
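To make the idea of sampling criteria concrete, here is a toy sketch that checks candidate models against a set of constraints. It is not Feyn's API: the classes `Candidate` and `Criteria`, their fields, and the notion of complexity as a plain integer are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    kind: str             # "regression" or "classification"
    features: frozenset   # input names the model uses
    functions: frozenset  # function names the model uses
    complexity: int       # assumed here: some integer size measure

@dataclass(frozen=True)
class Criteria:
    kind: str
    allowed_features: frozenset
    allowed_functions: frozenset
    max_complexity: int

    def accepts(self, c: Candidate) -> bool:
        """A candidate passes only if it satisfies every constraint."""
        return (
            c.kind == self.kind
            and c.features <= self.allowed_features
            and c.functions <= self.allowed_functions
            and c.complexity <= self.max_complexity
        )

criteria = Criteria(
    kind="regression",
    allowed_features=frozenset({"age", "dose"}),
    allowed_functions=frozenset({"add", "multiply", "exp"}),
    max_complexity=4,
)
small = Candidate("regression", frozenset({"age"}), frozenset({"exp"}), 2)
too_big = Candidate("regression", frozenset({"age"}), frozenset({"exp"}), 9)
wrong_fn = Candidate("regression", frozenset({"dose"}), frozenset({"tanh"}), 2)
accepted = [c for c in (small, too_big, wrong_fn) if criteria.accepts(c)]
```

Only `small` passes: `too_big` exceeds the complexity limit and `wrong_fn` uses a function that was not allowed.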
The fitting process
A typical process looks like this:
- You sample a few thousand models at a time from a QLattice.
- You fit them all using a version of backpropagation, and evaluate them on some criteria (such as a loss function or an information criterion).
- You discard the worst models.
- You update the QLattice with the structures of the best models.
- You start over from the first step, adding a fresh batch of samples to compete with the models you kept from the previous loop.
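The loop above can be sketched on a toy problem. This is not Feyn's implementation: the search space is reduced to four hypothetical structures of the form y = a*f(x) + b, plain gradient descent stands in for the fitting step, mean squared error is the only criterion, and a dictionary of weights stands in for the QLattice. The names `STRUCTURES`, `fit` and `search` are assumptions.

```python
import math
import random

# Hypothetical search space: each "structure" is one function f in y = a*f(x) + b.
STRUCTURES = {
    "identity": lambda x: x,
    "square": lambda x: x * x,
    "tanh": math.tanh,
    "exp": math.exp,
}

def fit(name, xs, ys, steps=1000, lr=0.05):
    """Fit a and b by plain gradient descent on the mean squared error."""
    f = STRUCTURES[name]
    feats = [f(x) for x in xs]
    a, b, n = 0.0, 0.0, len(xs)
    for _ in range(steps):
        errs = [a * z + b - y for z, y in zip(feats, ys)]
        a -= lr * 2 * sum(e * z for e, z in zip(errs, feats)) / n
        b -= lr * 2 * sum(errs) / n
    mse = sum((a * z + b - y) ** 2 for z, y in zip(feats, ys)) / n
    return {"structure": name, "a": a, "b": b, "mse": mse}

def search(xs, ys, epochs=5, batch=6, keep=3, seed=0):
    rng = random.Random(seed)
    weights = {name: 1.0 for name in STRUCTURES}  # uniform to begin with
    kept = []
    for _ in range(epochs):
        names = list(weights)
        # 1. Sample a batch of structures from the current distribution.
        sampled = rng.choices(names, [weights[n] for n in names], k=batch)
        # 2. Fit the new samples; they compete with the models kept so far.
        models = kept + [fit(name, xs, ys) for name in sampled]
        # 3. Discard the worst models.
        kept = sorted(models, key=lambda m: m["mse"])[:keep]
        # 4. Reinforce the structures of the survivors.
        for m in kept:
            weights[m["structure"]] += 1.0
    return kept[0], weights

# Toy data generated from y = 2*x**2 + 1.
xs = [i / 10 for i in range(-10, 11)]
ys = [2 * x * x + 1 for x in xs]
best, weights = search(xs, ys)
```

On this data, the `square` structure fits almost perfectly whenever it is sampled, so it tends to survive the pruning step and have its weight reinforced, which makes it more likely to be sampled again.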
You can consider a QLattice as a probability distribution from which models are sampled. Initially, this distribution is uniform, and it is tuned after each update call. Repeating this process helps the QLattice converge by shaping the distribution towards better solutions.
Every step in this process happens locally on your machine.
Why not just brute-force?
The space of all possible models is potentially infinite, which makes brute-forcing the solution intractable for all but the simplest datasets. This is why you update the QLattice with the best model structures so far. You can also narrow the search space by being specific about which relationships to investigate, and by restricting the types of models the QLattice will produce.
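A rough count shows how quickly the space grows. The sketch below counts syntactically distinct expression trees up to a given depth, assuming a model is either an input feature, a unary function applied to a smaller model, or a binary function applied to two smaller models. The function `count_models` is illustrative, it treats `a + b` and `b + a` as different trees, and it ignores the continuous parameters inside each model, so it understates the real space.

```python
def count_models(n_features, n_unary, n_binary, max_depth):
    """Count expression trees of depth at most max_depth."""
    count = n_features  # depth 0: a model that is just one input feature
    for _ in range(max_depth):
        # A deeper tree is a feature, a unary function on a shallower tree,
        # or a binary function on two shallower trees.
        count = n_features + n_unary * count + n_binary * count * count
    return count

# Three features, four unary functions, two binary functions.
for depth in range(5):
    print(depth, count_models(3, 4, 2, depth))
```

With three features, four unary functions and two binary functions, the count goes from 3 to 33 to 2313, passes ten million at depth 3, and exceeds 10^14 at depth 4.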
What about privacy?
Every step of the process when using Feyn happens locally on your machine. You can even run this without an internet connection.
In particular, this means that your data is never exchanged and never leaves your machine.
Understanding the models
The resulting models are represented by unidirectional, acyclic graphs that cleanly visualize what happens in the mathematical equation for everyone to understand. On top of this, we have a suite of plots and tools to help you dig deeper into the models you get, and help you understand not only the relationships better, but also the tradeoffs, biases and support levels present in your model.
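To illustrate how an acyclic graph maps onto an equation, here is a small sketch that stores a model as a list of nodes in topological order, evaluates it, and renders it as a formula. The node format, the `GRAPH` example and the helper names are assumptions made for illustration, not Feyn's internal representation.

```python
import math

UNARY = {"exp": math.exp, "tanh": math.tanh, "log": math.log}
BINARY = {"add": lambda a, b: a + b, "multiply": lambda a, b: a * b}

# Each node is (operation, indices of the nodes feeding into it).
# Nodes are listed so that every node appears after its parents.
GRAPH = [
    ("input:x1", ()),
    ("input:x2", ()),
    ("multiply", (0, 1)),
    ("tanh", (2,)),
    ("add", (3, 0)),  # the last node is the model output
]

def evaluate(graph, inputs):
    """Compute the output by visiting nodes in order."""
    values = []
    for op, parents in graph:
        if op.startswith("input:"):
            values.append(inputs[op.split(":", 1)[1]])
        elif op in UNARY:
            values.append(UNARY[op](values[parents[0]]))
        else:
            values.append(BINARY[op](values[parents[0]], values[parents[1]]))
    return values[-1]

def to_equation(graph):
    """Render the same graph as a nested formula string."""
    exprs = []
    for op, parents in graph:
        if op.startswith("input:"):
            exprs.append(op.split(":", 1)[1])
        else:
            exprs.append(f"{op}({', '.join(exprs[p] for p in parents)})")
    return exprs[-1]

value = evaluate(GRAPH, {"x1": 0.5, "x2": 2.0})
equation = to_equation(GRAPH)
```

Because the graph and the formula are two views of the same object, anything you read off the picture can be checked against the equation and the other way round.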
This makes the QLattice especially useful when you want insights and intend to investigate relationships between your features.
The QLattice in a nutshell
The QLattice is an environment to simulate discrete paths from multiple inputs to an output. It does this in a finite multi-dimensional lattice-space. This is where the inspiration from Feynman's path integral comes in.
The QLattice simulates inputs taking a path through the lattice space before emerging to an output. If you do this until a solid path has been shaped, you'll eventually converge to the path most likely to explain the problem you're trying to model. Along the path that we take, we'll randomly sample from a selection of interactions: functions that transform their inputs into a new output.
Interactions are the basic computation units of each model. They take in data, transform it and then emit it to be used in the next interaction. Here are the current possible interactions:
We determine the interactions based on probabilities, guided by repeated reinforcement of the best solutions provided by the QLattice as you fit the hundreds of thousands of models that are discovered. During repeated reinforcement, islands will form in the QLattice space, each with its own independent evolution. This narrows the search space and gives way to many separate evolutionary spaces. A benefit of this process is that the user helps decide which models are useful and which paths will be reinforced. The user also decides how to constrain the decision space, giving the user full control over the shapes the models will take.
Altogether, this approach has some benefits, such as:
- there are far fewer nodes and connections than in a typical neural network.
- there are functions you wouldn't normally see in a neural network.
- the models are more inspectable, simpler and less prone to overfitting.
- the models are mathematical formulas, allowing you to reason about the consequences of your hypothesis.
- the models that have been tried are diverse and you can trust that nothing has been overlooked during training.
If there's a signal, the QLattice will find it, so you can trust its answer on whether your problem is best solved with a complex non-linear mathematical equation or a simple linear model.