For Feyn versions 1.2+
Just a tiny bit of quantum mechanics
We typically think that when an object moves from one place to another, it follows the path of least resistance. Usually, that is a straight line.
This intuition breaks down in the world of the very small, such as the movement of photons and electrons. According to Richard Feynman, photons do not just take one path from point A to point B: they take every path. Each path contributes to the probability of the photon arriving at point B, and the path with the highest contribution is the one of least resistance, usually a straight line.
So...what is a QLattice?
It is the environment that considers all possible models of a dataset and finds that 'path of least resistance', i.e. the model that fits the problem best.
The QLattice is very much inspired by Richard Feynman's path integral formulation.
That's pretty damn Feyn, where do I play?
If you can't contain yourself anymore, you can play right here. As we are inspired by Richard Feynman, our Python library, Feyn, is pronounced like 'fine', as in 'so damn fine'.
Let's talk more about QGraphs
A QGraph is an unbounded list of models. A model looks a little bit like a standard neural network. It is similar because:
- models take inputs from features and output a prediction;
- they have a number of nodes with functions in them;
- we use backpropagation to fit each model to the data.
There are also differences from neural networks:
- there are far fewer nodes and connections (and we also call the nodes interactions);
- there are functions you wouldn't normally see in a neural network;
- NO ONE-HOT ENCODING NEEDED! If a feature takes categorical values, Feyn automatically encodes them.
Here's a neat little model from the regression example that only takes two categorical features:
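To make the "nodes with functions in them" idea concrete, here is a minimal sketch in plain Python. This is not the Feyn API and the weights are made up; it only illustrates what a single two-input node with a tanh function computes:

```python
import math

def tanh_interaction(x1, x2, w1, w2, bias):
    """A hypothetical two-input node: weighted sum passed through tanh."""
    return math.tanh(w1 * x1 + w2 * x2 + bias)

# Combine two (already encoded) feature values into one output.
out = tanh_interaction(0.5, -1.0, w1=0.8, w2=0.3, bias=0.1)
```

Fitting such a model means adjusting `w1`, `w2` and `bias` so the outputs match the data.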
Ok, but HOW do I play?
Suppose a data scientist has some data that she would like to model using Feyn and a QLattice. First, she extracts a QGraph from the QLattice. Remember, this is an unbounded list of potential models for the dataset. Next, she fits a subset of these models to the data and updates the QLattice with the best one.
This has the effect of telling the QLattice that this was a good model: more of this, please! The search space of the QLattice narrows and focuses in this direction. The next time she fits the QGraph, it will be on models that are more relevant to the dataset.
Can you tell me a bit more?
Of course we can! You should get to know more about:
- semantic types (stypes);
- interactions in a model.
Before we produce a QGraph, we first need to say what type of data the inputs and output are. This tells each model what type of data to expect and, from that, how to behave. We call these semantic types, or stypes. There are two stypes:
- Numerical. This is for features that are continuous in nature, such as height, number of rooms, latitude and longitude.
- Categorical. This is for features that are discrete in nature, such as neighbourhood or room type.
If no stype is assigned to a feature, it defaults to numerical. In practice this means that our data scientist, with her features and target variable, only needs to assign an stype to each categorical feature. This is how the model knows to encode the categories automatically.
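As an illustration of the idea behind the automatic encoding (this is a sketch, not Feyn's internal implementation, and the category names and weights are made up), a categorical feature can become a number by learning one weight per category, with no one-hot columns involved:

```python
# Illustrative sketch only: each category gets its own learnable weight,
# so a categorical value is just looked up and becomes a single number.
category_weights = {"Manhattan": 1.7, "Brooklyn": 0.9, "Queens": 0.4}

def encode_categorical(value, weights):
    # Categories not seen during training fall back to 0.0
    return weights.get(value, 0.0)

x = encode_categorical("Brooklyn", category_weights)  # -> 0.9
```

The weights themselves would be fitted to the data, just like the other weights in the model.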
Interactions are the basic computation units of each model. They take in data, transform it and then spit it out to be used in the next interaction. In the model above, there's only one interaction: tanh. Here are the others:
There is also a linear interaction. Each interaction (such as tanh, linear and multiply) has weights and biases which can be optimised using the backpropagation algorithm. This is what happens when we fit the models to our data. You can find out more about fitting in the guides to the side.
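As a rough sketch of what fitting weights and biases means, here is ordinary gradient descent on a single linear node. This is plain Python on toy data, not Feyn's actual optimiser:

```python
# Fit y = w*x + b to toy data by gradient descent on the squared error.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # generated by y = 2x + 1

w, b, lr = 0.0, 0.0, 0.05
for _ in range(2000):
    for x, y in zip(xs, ys):
        err = (w * x + b) - y      # prediction error for this sample
        w -= lr * err * x          # gradient of squared error w.r.t. w
        b -= lr * err              # ... and w.r.t. b

# w and b converge towards 2 and 1
```

Backpropagation does the same thing through a whole graph of interactions: it computes the gradient of the error with respect to every weight and bias, then nudges them downhill.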
Ok I think I've followed but what's the workflow like?
Every workflow follows the same basic pattern:
1. extract a QGraph from the QLattice;
2. fit each model in the QGraph to your data;
3. update the QLattice with the best model;
4. return to 2.
Of course you can play around with this workflow as much as you like but the basic one is: Extract, Fit, Update, Repeat.
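The Extract, Fit, Update, Repeat loop can be sketched with stand-in functions. These stubs are illustrative only; none of the names below are the Feyn API:

```python
import random

def extract_qgraph():
    # Stand-in: pretend a QGraph is a list of candidate "models",
    # each represented here by nothing more than a random score.
    return [random.random() for _ in range(10)]

def fit(qgraph):
    # Stand-in: "fitting" ranks the candidates; lower score is better.
    return sorted(qgraph)

best_models = []                   # stand-in for the QLattice's learnings
for _ in range(5):                 # Extract, Fit, Update, Repeat
    qgraph = extract_qgraph()      # 1. extract
    fitted = fit(qgraph)           # 2. fit
    best_models.append(fitted[0])  # 3. update with the best model
```

In the real library, step 3 is what narrows the QLattice's search space, so each new extraction yields more relevant models.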
You can find out more about updating in the guides to the right.
Great, so why do I want it?
After a job well done, when you've got a good model from the workflow above, those learnings are still there in the QLattice. The next time you return to it, it will pick up from where it left off. This has some awesome implications.
If you have multiple problems within the same domain, it is likely that they are related. The QLattice can therefore apply learnings from one problem to another, which may be an improvement over learning each problem in isolation.
Many independent teachers
Data is often distributed by nature and then centralised for training. With a QLattice, it does not have to be so. You could have one part of a dataset and your friend could have another part. Instead of centralising the data, you can both access the same QLattice. If you both update the QLattice with good models, the learnings are stored centrally, so your data doesn't have to be.
What other great things can it do?
Well I'm glad you asked.
As there are fewer interactions, the models are easier to read than a neural network. The QLattice tends towards models with fewer interactions and lower complexity, so that its predictions are more explainable.
Where's yo data at?
It's wherever you decide to have it. You don't have to upload or transfer it to us to work with the QLattice. The QLattice just stores learnings, so we have no need to store your data with us. You keep your data. It doesn't go anywhere. It stays with you.