The FF Neural Exploratory Lab.

So, for my first ever blog post, I want to talk about a project of mine that I have very recently begun working on, one that I hope to scale up and turn into a group within my university’s AI student club and/or data-science club.

I dove into machine learning about a year ago, and I admit I did so with the wrong intentions - to get a job. I know for a fact that many of the people reading this blog got into the field at some point for the same or related reasons (after all, the world is indeed becoming more ‘AI-centric’). But even with the wrong intentions, I fell in love. I found a field that gave me more than a purpose to live my life with; it gave me more than a resolve to simply build models and ride the AI fad that is extremely viral right now; it forced me to think about my life, about the future of our species, and about where we might be in the next three to four years.

Until very recently, I was admittedly a bit ‘aimless’. I knew that I loved doing math, and I had taken nearly every advanced course at Waterloo to challenge and improve myself, but plenty of problems kept me from learning math, statistics, or computer science the way I wanted to - from eroded fundamentals to the wrong intentions; from learning too much at once to trying to learn too fast - the list was endless. Each of these problems, and how I solved them, warrants its own blog post!

Nonetheless, I knew I wanted to do something with neural networks because they fascinated me, yet I didn’t know what to do with them. I had taken STAT240 (Advanced Intro. to Probability) and fallen in love with probability theory as well, so much so that I now know I want to pursue it further, along with statistical theory (thanks to the course I am taking right now, STAT231). All I knew from my genius friends at Waterloo was that building models means putting lots of bits and pieces together - from choosing an architecture to choosing an optimizer; from how you initialize your weights to setting the initial values of the hyperparameters - and that a lot of the science, human capital, and investment capital goes into exactly this.
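To make those “bits and pieces” concrete, here is a minimal PyTorch sketch of the kind of assembly I mean: an architecture, a weight-initialisation scheme, an optimizer, and a few hyperparameters. The specific numbers and choices here are arbitrary placeholders for illustration, not recommendations.

```python
import torch
import torch.nn as nn

# Hyperparameters (chosen arbitrarily for illustration)
hidden_width = 128
learning_rate = 1e-3
num_classes = 10

# Architecture: a small fully-connected network
model = nn.Sequential(
    nn.Linear(784, hidden_width),
    nn.ReLU(),
    nn.Linear(hidden_width, num_classes),
)

# Weight initialisation: one of many possible schemes
for layer in model:
    if isinstance(layer, nn.Linear):
        nn.init.kaiming_uniform_(layer.weight, nonlinearity="relu")
        nn.init.zeros_(layer.bias)

# Optimizer choice
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
```

Every one of these decisions could have gone differently, and much of applied ML is about making them well.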

This side of ML is known as applied ML, and while I do love the applied sciences, I am more interested in the axioms and theoretical bases a field is founded upon. Any form of engineering, for instance, is fundamentally rooted in physics (even chemistry is physics at its core) or, arguably, mathematics. To my disappointment, I didn’t find any solid treatment of the theoretical foundations of deep learning. There’s a fantastic graduate textbook by Charu C. Aggarwal called ‘Neural Networks and Deep Learning’ that I began reading last July and have read more of over the past few months, yet even there I didn’t find the kind of theoretical foundation upon which deep learning rests.


I believe that such a foundation lets you understand the far more complex phenomena that appear later in a field of study - in this case, neural networks. Furthermore, given the trajectory this field, and consequently the world, is on, we’re bound to reach a point where machines smarter than us are the new dominant life-form on this planet. What they might do to us, only time can tell. But simple game theory tells you that this arms race is already too late to stop, and no number of open letters from institutions like the Future of Life Institute or others can put this genie back in the bottle.

Clearly, then, it is necessary to understand how these systems work. You might wonder, ‘Wait. Don’t we already know how these things work? I mean, we build them!’ And I’d say no, not even in the slightest.

We know how simple models like linear, multiple, or logistic regression work because we do have theoretical foundations for them; we have no such foundations for neural networks.

Neural networks act like black boxes. We assemble the components I mentioned above, throw compute at them, and with time they learn to do what they’re intended to do. But they often end up with surprising properties and structure. For instance, in a 2018 study, Ziad Obermeyer and colleagues experimented with a neural network that detects diabetic retinopathy simply by looking at images of a patient’s eye. It certainly accomplished this, but it also, unexpectedly, learnt to detect a person’s sex from the same images. The network was never trained to do that.
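One common way researchers check for this kind of unintended knowledge is a linear probe: take the hidden activations of the already-trained network and see whether a simple linear classifier can read the attribute off of them. The sketch below is not the study’s actual method, just an illustration of the idea; `trained_model`, `hidden_layer`, `images`, and `sex_labels` are hypothetical stand-ins for a trained network, one of its intermediate layers, a batch of inputs, and labels for the attribute being probed.

```python
import torch
import torch.nn as nn

activations = {}

def save_activation(name):
    # Forward hook that stashes a layer's output for later inspection
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

# Record the output of an intermediate layer of the (already trained) model
handle = trained_model.hidden_layer.register_forward_hook(save_activation("hidden"))
with torch.no_grad():
    trained_model(images)
handle.remove()

# Fit a linear probe on those activations to predict the attribute.
# If the probe does well, the network represents that attribute internally,
# even though it was never asked to learn it.
probe = nn.Linear(activations["hidden"].shape[1], 2)
probe_optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for _ in range(100):
    probe_optimizer.zero_grad()
    loss = loss_fn(probe(activations["hidden"]), sex_labels)
    loss.backward()
    probe_optimizer.step()
```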

There are more examples like this, but I’ll spare you the details. The representations neural networks acquire through training, the correlations between features they learn (like in the example above), the specialisations their units/‘neurons’ develop for processing certain features in the input, and so on, all suggest that these networks are something like alien intelligences.

This is certainly my sentiment, and it is one shared by pioneers in this field like Neel Nanda (of DeepMind) and Chris Olah (of Anthropic). Simply put, we are about to enter an age where models like these will overtake humans in capability, yet we have no clue how they think and function. Recent research, like Anthropic’s ‘Tracing the thoughts of an LLM’ and ‘Toy Models of Superposition’, echoes this feeling all the more.

This is why, to me, AI alignment and mechanistic interpretability are the need of the hour, and why this is among the most important problems of our time. Making models is cool and all, but understanding them from the inside out is even cooler.

This is why I started my project - the FF Neural Exploratory Lab.

[The ‘FF’ stands for ‘Future Foundation’, a nod to my favourite comic-book team (you may have also seen them on the ‘About’ page here), so shout-out to them!]

I began this project as a way to learn about this field by first establishing a strong base: starting with simple, fully-connected neural networks, experimenting on them and related architectures, and asking questions; then scaling up to tinker with more advanced architectures like Transformers and LLMs. It is not an actual lab, not yet anyway, but it is my personal project.

The cherry on the cake is that this field rewards you if you already work in applied ML. Poking around, tinkering, and exploring the internals of networks that have been trained a certain way, on certain data, fed a particular input, asked to perform a task, and so on, is exactly what is needed to establish strong foundations here. As Neel Nanda puts it, “There’s a lot of low-hanging fruit in mechanistic interpretability.”
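To give a flavour of what that poking around can look like in practice, here is a tiny, illustrative sketch: for a single hidden unit in a hypothetical trained fully-connected model, find the examples in a batch that activate it most strongly. `trained_model`, `first_layer`, and `dataset_batch` are placeholders I’ve made up for the sketch, not a fixed API.

```python
import torch

with torch.no_grad():
    # Activations of the first hidden layer for a batch of inputs
    hidden = torch.relu(trained_model.first_layer(dataset_batch))

unit_index = 7  # pick any unit to inspect
top_values, top_examples = torch.topk(hidden[:, unit_index], k=5)

# The indices of the inputs that most excite this unit; looking at them
# is a first, crude hint at what the unit might have specialised in.
print("Inputs that most excite unit", unit_index, ":", top_examples.tolist())
```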

Thus, all the work I conduct (including two experiments I’ve written about in the past on LinkedIn) will be written up and posted here. And last but not least, I will also post my thoughts on papers published in this field, as well as my correspondence with researchers in the space.

So, if you are interested in this kind of work, please feel free to reach out to me via email or on LinkedIn. I, too, am a beginner and would love to hear from you and about the results of experiments you’ve conducted. In the future, I plan to scale up the project and open it up as a group that anyone can join to work on projects in this field. When I do, I will be sure to post an update here!

Thanks for reading.