Self-supervised denoising of calcium imaging data

This blog post is about algorithms based on deep networks to denoise raw calcium imaging movies. More specifically, I will write about the difficulties of interpreting their outputs, and about how to address these limitations in future work. I will also share my own experience with denoising calcium imaging data from astrocytes in mice.


Averages from calcium imaging or representative movies in publications often look great. In reality, however, two-photon calcium imaging is often limited primarily by low signal-to-noise ratios. There are many cases where recordings are dominated by shot noise to the extent that almost no structure is visible from a single frame. This can be due to a weakly expressing transgenic line; or on-purpose low expression levels to avoid calcium buffering; or it can be due to the fact that the microscope scans so fast across a large field of view or volume that it only picks up a few photons per neuron.

The advent of new algorithms to denoise calcium imaging movies

So, would it not be great to get rid of the noise in these noise-dominated movies using some magical deep learning algorithm? This seems to be the promise of a whole set of algorithms that were designed to denoise noisy images and recover the true signal using supervised [1] and, more recently, self-supervised deep learning [2-3]. Recently, there have also been a few applications of these algorithms for calcium imaging [4-6]. The following tweet by Jérôme Lecoq was the first time I saw the results of such algorithms:

The results indeed look very impressive. The implementations of these algorithms were subsequently published by Lecoq et al. (DeepInterpolation) [4] and, independently and with a very similar approach, by Li et al. (DeepCAD) [5]. Since then, a few follow-ups have been published that compare the performance of the two algorithms and improve the performance of DeepCAD [6], or that propose a sort of hybrid of supervised and unsupervised approaches to denoise calcium imaging data [7].

Despite the great-looking results, I was a bit skeptical. Not because of a practical but rather because of a theoretical concern: there is no free lunch. Why should the output suddenly be less noisy than the input? How can the apparent information content increase? Let’s have a closer look at how the algorithm works to address this question.

Taking into account additional information to improve pixel value estimates

Let’s start very simple: Everybody will agree that the measured intensity for a pixel is a good estimate for the true fluorescence or calcium concentration at this location. However, we can refine our estimate very easily using some background information. What kind of background information? For example, if a high intensity value occurs in a region without any labeling, with all other time points of this pixel having almost zero value, we can be rather certain that this outlier is due to a falsely detected photon and not due to a true fluorescence signal. If, however, all surrounding pixels had high intensity values and the pixel of interest did not, we could also correct our estimate of this pixel’s intensity value using (1) our experience about the spatial structures that we want to observe and (2) the information gained from the surrounding pixels. Therefore, refining our estimate of the pixel’s intensity simply means taking into account a prior about what we expect the pixel’s intensity to be.
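As a toy illustration of this idea (and not of what the denoising networks actually compute), here is a minimal Python sketch that combines one noisy pixel measurement with a prior expectation. It assumes a Gaussian prior and Gaussian measurement noise, which is a simplification since photon counts are closer to Poisson; the function name and all numbers are hypothetical.

```python
def posterior_estimate(measured, meas_var, prior_mean, prior_var):
    """Combine a noisy measurement with a Gaussian prior (precision-weighted mean)."""
    # Weight given to the measurement: small if the measurement is much
    # noisier than the prior is wide
    w = prior_var / (prior_var + meas_var)
    post_mean = w * measured + (1.0 - w) * prior_mean
    post_var = prior_var * meas_var / (prior_var + meas_var)
    return post_mean, post_var

# Hypothetical numbers: a pixel in an unlabeled region that is usually close
# to zero suddenly shows a high value in one noisy frame
outlier_mean, outlier_var = posterior_estimate(
    measured=50.0, meas_var=100.0, prior_mean=1.0, prior_var=4.0
)
print(outlier_mean, outlier_var)
```

With these numbers, the refined estimate stays close to the prior expectation, because a single noisy measurement carries little weight against a narrow prior.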

Methods based on self-supervised deep networks perform more or less such a procedure, and it is in my opinion a very reasonable way to obtain a better estimate for a pixel’s intensity. As a small difference, they only use the surrounding frames (adjacent in time) and not the pixel intensity itself (therefore lacking this Bayesian idea of improving an estimate using prior information). Despite this interesting small difference, it is clear that such denoising will – in principle – work. The network then uses deep learning to gain knowledge about what to expect in a given context; practically speaking, the prior knowledge will be contained in the network’s weights and extracted from a lot of raw data during learning. Using such a procedure, the estimate of the pixel’s intensity value will, probably under most conditions, be better than the raw intensity value.

A side note: Computing the SNR from raw and denoised pixels

From that, it is also clear that neighboring pixels of a denoised movie are correlated since their original values have influenced each other. It is therefore not justified to compare something like an SNR based on single pixels or single frames between raw and denoised data, because in one case (raw data) adjacent data points are truly independent measurements, while in the other (denoised data) they are not. Both DeepInterpolation [4] and DeepCAD [5] used such SNR measures, which reflect the visual look and feel but are, in my opinion, not a good quantification of how much signal and how much noise is in the data. But this is just a side note.
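To make this side note more tangible, here is a small NumPy simulation. It is only a sketch: a simple 3 x 3 box filter stands in for the denoiser (it is not DeepInterpolation or DeepCAD), the movie contains pure noise, and all sizes are arbitrary choices. The point is that the per-pixel noise drops a lot, while the noise of a trace averaged over an ROI barely improves, because the "denoised" pixels are no longer independent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pure-noise "movie": 200 frames of 64 x 64 pixels, no signal at all
raw = rng.normal(0.0, 1.0, size=(200, 64, 64))

def box_smooth(movie, k=3):
    """Average each pixel with its k x k spatial neighborhood (periodic borders)."""
    out = np.zeros_like(movie)
    r = k // 2
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out += np.roll(movie, (dy, dx), axis=(1, 2))
    return out / k**2

def neighbor_corr(movie):
    """Correlation between horizontally adjacent pixels, pooled over the movie."""
    a = movie[:, :, :-1].ravel()
    b = movie[:, :, 1:].ravel()
    return np.corrcoef(a, b)[0, 1]

def roi_trace_std(movie):
    """Noise of a trace averaged over a 9 x 9 pixel ROI."""
    return movie[:, 20:29, 20:29].mean(axis=(1, 2)).std()

smoothed = box_smooth(raw)

pixel_gain = raw.std() / smoothed.std()                   # per-pixel noise reduction
roi_gain = roi_trace_std(raw) / roi_trace_std(smoothed)   # ROI-level noise reduction

print("neighbor correlation raw/smoothed:", neighbor_corr(raw), neighbor_corr(smoothed))
print("per-pixel gain:", pixel_gain, "ROI-level gain:", roi_gain)
```

A single-pixel SNR would suggest a roughly threefold improvement here, whereas the ROI-averaged trace, which is what most analyses use, gains only a small fraction of that.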

Denoising can make plausible point estimates that are however artifacts

However, there is a remaining real problem. Let’s take some time to understand it. Clearly, the estimated intensity value is only a point estimate. So we don’t know anything about the confidence of the network in inferring exactly this pixel intensity and not a different intensity value. Deep networks have often been shown to hallucinate familiar patterns when they were unconstrained by input. It is therefore not clear from looking at the results whether the network was very confident about all pixel intensities or whether it just made up something plausible because the input did not constrain the output sufficiently.

To make this rather vague concern a bit more concrete, here is an example of a calcium recording that I performed a few years ago (adult zebrafish forebrain, GCaMP6f). On the left side, you can see the full FOV, on the right side a zoom-in.
In the movie, there is first a version based on raw data, then the same raw data but with a smoothing temporal average, and finally a version denoised using the DeepInterpolation algorithm [4]. To provide optimal conditions, I did not use a default network provided by the authors but retrained it on the same data to which I applied the algorithm afterwards.

First, the apparent denoising is impressive, and it is easy to imagine that an algorithm performing automated source extraction will perform better for the denoised movie than for the raw movie.
When we look more carefully and with more patience, a few odd things pop out. In particular, the neuronal structures seem to “wobble around” a bit. Here is a short extract of a zoom-in into the denoised movie:

Neurons are densely packed in this region, such that the cytoplasms filled by GCaMP generate an almost hexagonal pattern when you slice through it with the imaging plane. In the excerpt above, there is indeed a sort of hexagonal pattern in each frame. However, the cell boundaries are shifting around from frame to frame. This shifting of boundaries can be particularly well seen for the cell boundary between the right-most neuron and its neighbor to the left. From the perspective of an intelligent human observer, these shifting boundaries are obviously wrong – it is clear that the brain and its neurons do not move.

So, what happened? The network indeed inferred some structural pattern from the noise, but it arrived at different conclusions for different time points. The network made the most likely guess for each timepoint given the (little) information it was provided, but the inconsistency of the morphological pattern shows that the network made up something plausible that however is partially wrong.

Solution (1): Taking into account the overall time-averaged fluorescence

To fix this problem specifically, the network could take into account not only surrounding pixels, but also the overall mean fluorescence (average across all movie frames) in order to make an educated guess about pixel intensity. As human observers, we do this automatically, and that’s why we can spot the artifact to start with. With the information about the overall anatomy, the network would have the same prior as the human observer and would be able to produce outputs that do not include such artifacts.

Solution (2): Taking into account the uncertainty of point estimates

However, the more general problem of the network to fill up uncertain situations with seemingly plausible but sometimes very wrong point estimates still persists. The only difference is that a human observer probably would be unable to identify the generated artifacts.

A real solution to the problem is to properly deal with uncertainties (for reference, here is a review of uncertainty in deep networks). This means that the network needs to be able to estimate not only the most likely intensity values for each pixel but also the confidence intervals for each value. With a confidence interval for each pixel value, one could compute the confidence interval for e.g. the time course of a neuron’s calcium ΔF/F averaged across an ROI. The computational problem here is that the error ranges for each pixel do not just add as independent errors, resulting in a standard error of the mean, since the values and confidence intervals for adjacent pixels are dependent on each other. I assume that a straightforward analytical treatment might be too tricky and some sort of Monte Carlo-based simulation would work better here. This would make it possible to use the denoised movie to derive e.g. a temporal ΔF/F trace of a neuron together with an uncertainty corridor of the estimated trace.
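A minimal sketch of such a Monte Carlo treatment could look as follows. It assumes that the per-pixel uncertainties and their correlations are already known and Gaussian, which is exactly the part that a real method would have to estimate; the equal pairwise correlation and all numbers are hypothetical. For this deliberately simple covariance a closed-form answer exists and is used as a check, but this would not be the case for a realistic, spatially structured covariance.

```python
import numpy as np

rng = np.random.default_rng(1)

n_pix = 25    # pixels in a hypothetical ROI
sigma = 1.0   # per-pixel uncertainty (standard deviation), assumed equal for all pixels
rho = 0.6     # assumed correlation between any two pixels of the ROI

# Covariance matrix with equal pairwise correlation (a toy assumption)
cov = sigma**2 * (rho * np.ones((n_pix, n_pix)) + (1 - rho) * np.eye(n_pix))

# Monte Carlo: draw many plausible sets of pixel errors and average across the ROI
samples = rng.multivariate_normal(np.zeros(n_pix), cov, size=20000)
mc_std = samples.mean(axis=1).std()

# Standard error of the mean, valid only if pixels were independent
naive_sem = sigma / np.sqrt(n_pix)

# Closed form for this toy covariance, used only to check the simulation
exact_std = sigma * np.sqrt(rho + (1 - rho) / n_pix)

print("naive SEM:", naive_sem, "Monte Carlo:", mc_std, "closed form:", exact_std)
```

In this toy case, the uncertainty of the ROI average is several times larger than the naive standard error of the mean would suggest.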

To sum it up, at this point it seems that there is a need not only for tools that provide faster and more beautiful denoised images, but even more so for procedures that properly deal with the uncertainty of estimates when the output is insufficiently constrained by the input. Without such tools, analyses based on denoised data must be carefully checked for their susceptibility to such artifacts.

Practical aspects: Using denoising for astrocytic calcium imaging

In a recent preprint [8], I used such methods (DeepInterpolation [4]) to denoise calcium recordings from hippocampal astrocytes. Astrocytes are rather sensitive to laser-induced heating, and I therefore applied low excitation power, resulting in relatively noisy raw recordings. One main goal of the study was to analyze not only ROIs drawn around somata but also the spatio-temporal calcium dynamics from somatic and distal compartments, ideally with a precision of a single pixel.

To be able to quantify such patterns, it was essential to denoise the raw movie (see Figure 6e and supplemental Figure S8 in [8]). Briefly, it worked really nicely:

It was however crucial to carefully look at both raw and denoised data to understand what was going on, and to consider potential artifacts with respect to downstream analyses. In my case, it helped that the central result of the paper, that calcium signals propagated from distal to proximal compartments under certain conditions, was based on analyses averaged over time (due to the use of correlation functions). Such averaging is likely to undo any harm introduced by small artifacts generated by denoising. In addition, I carefully looked at raw together with denoised data and thought about possible artifacts that might be introduced by denoising.

The second aspect to notice is that the algorithm was rather difficult to use, required a GPU with large memory, and even then was very slow. This has improved a bit since then, but the hardware requirements are still high. An alternative algorithm [5] seems to have slightly lower requirements on hardware, and the authors of [5] also developed a modified version of their algorithm that seems to be much faster, at least for inference [6].


The development of methods to denoise imaging data is a very interesting field, and I look forward to seeing more work in this direction. Specifically, I hope that the two possible developments mentioned above (taking into account the time-averaged fluorescence and dealing properly with uncertainty) will be properly explored by other groups.

Researchers who apply denoising techniques are themselves often very well aware of potential pitfalls and hallucinations generated by U-Nets or other related techniques. For example, Laine et al. [9] end their review of deep learning-based denoising techniques with this note of caution:

“Therefore, we do not recommend, at this stage, performing intensity-based quantification on denoised images but rather to go back to the raw [images] as much as possible to avoid artefacts.”

With “quantification”, they do not refer to the computation of ΔF/F but rather to studies that quantify e.g. localized protein expression in cells. But should the computation of ΔF/F values have less strict standards?

There are a few cases where potential problems and artifacts are immediately obvious for the application of denoising methods to calcium imaging data. Self-supervised denoising uses the raw data to learn the most likely intensity value given the surrounding context. As a consequence, there will be a tendency to suppress outliers. This is not bad by itself because such outliers are most likely just noise. But there might also be biologically relevant outliers: rare local calcium events on a small branch of a dendritic tree; or unusually shaped calcium events due to intracellularly recruited calcium; or unexpected decoupling of two adjacent neurons that are otherwise strongly coupled by electrical synapses. If the raw SNR is not high enough, the network will take such events as unlikely to be true and discard them in favor of something more normal.

As always, it is the experimenter who is responsible for making sure that such concerns are considered. To this end, some basic understanding of the available tools and their limitations is required. Hopefully this blog post helps to make a step in this direction!


  1. Weigert, M., Schmidt, U., Boothe, T., Müller, A., Dibrov, A., Jain, A., Wilhelm, B., Schmidt, D., Broaddus, C., Culley, S. and Rocha-Martins, M. Content-aware image restoration: pushing the limits of fluorescence microscopy. Nature Methods 15(12). 2018.
  2. Krull, A., Buchholz, T.O. and Jug, F. Noise2Void - learning denoising from single noisy images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
  3. Lehtinen, J., Munkberg, J., Hasselgren, J., Laine, S., Karras, T., Aittala, M. and Aila, T. Noise2Noise: Learning image restoration without clean data. arXiv. 2019.
  4. Lecoq, J., Oliver, M., Siegle, J.H., Orlova, N., Ledochowitsch, P. and Koch, C. Removing independent noise in systems neuroscience data using DeepInterpolation. Nature Methods 18(11). 2021.
  5. Li, X., Zhang, G., Wu, J., Zhang, Y., Zhao, Z., Lin, X., Qiao, H., Xie, H., Wang, H., Fang, L. and Dai, Q. Reinforcing neuron extraction and spike inference in calcium imaging using deep self-supervised denoising. Nature Methods 18(11). 2021.
  6. Li, X., Li, Y., Zhou, Y., Wu, J., Zhao, Z., Fan, J., Deng, F., Wu, Z., Xiao, G., He, J. and Zhang, Y. Real-time denoising of fluorescence time-lapse imaging enables high-sensitivity observations of biological dynamics beyond the shot-noise limit. bioRxiv. 2022.
  7. Chaudhary, S., Moon, S. and Lu, H. Fast, Efficient, and Accurate Neuro-Imaging Denoising via Deep Learning. bioRxiv / [Update September 2022: Nature Communications]. 2022.
  8. Rupprecht, P., Lewis, C. and Helmchen, F. Centripetal integration of past events by hippocampal astrocytes. bioRxiv. 2022.
  9. Laine, R.F., Jacquemet, G. and Krull, A. Imaging in focus: an introduction to denoising bioimages in the era of deep learning. The International Journal of Biochemistry & Cell Biology 140. 2021.
Posted in Calcium Imaging, Data analysis, Imaging, machine learning, Microscopy, Neuronal activity, Review, zebrafish

Video introduction to CASCADE

I recently recorded a short video talk about CASCADE, our supervised method to infer spike rates from calcium imaging data (Github / paper / preprint).

The video includes short video tutorials of our Colab Notebooks to explore the ground truth database and to test the algorithm without installation in the cloud.

Check it out:

Please note that you can also increase the playback speed of the video. There is also a reference in the video to a previous blog post on noise levels.

Posted in Calcium Imaging, Data analysis, electrophysiology, Imaging, machine learning, Microscopy, Neuronal activity

Simple geometrical optics to understand and design point-scanning microscopes

Custom-built microscopes have become more and more sophisticated over the last years to provide a larger FOV, better resolution through some flavor of adaptive optics, or simply more neurons simultaneously. Professional optical engineers are hired to design the ideal lens combination or the objectives with Zemax, a software package that can simulate the propagation of light through lens systems based on wave optics.

Unfortunately, these complex design optimizations might discourage users from trying to understand their microscopes themselves. In this blog post, I will give a few examples of how optical paths of microscopes can be understood and, to some extent, also designed using simple geometrical optics. Geometrical optics, or ray optics, is accessible to anybody who is willing to understand a small equation or two.

Three examples: (1) How to compute the beam size at the back focal plane of the objective. (2) How to compute the field of view size of a point scanning microscope. (3) How to compute the axial focus shift using a remotely positioned tunable lens.

All examples are based on a typical point scanning microscope as used for two-photon microscopy.

(1) How to compute the beam size at the back focal plane of the objective

The beam size at the back focal plane is the limiting factor for the resolution. The resolution is determined by the numerical aperture (NA) focusing onto the sample, and a smaller beam diameter at the “back side” of the objective can result in an effectively lower NA compared to what is possible with the same objective and larger beam diameter.

In general, it is therefore the goal to overfill the back focal aperture with the beam (check out this review if you want to know more). However, especially for video-rate two-photon microscopy, one of the scan mirrors is a resonant scanner, which usually comes with a usable aperture of ca. 5 mm. Often, there are only two lenses between scan mirror and objective, scan lens and tube lens. The two black lines illustrate the “boundaries” of the beam:

If the beam diameter at the scan mirror is 5 mm, the beam diameter d_{BFA} at the objective’s back aperture will be, using simple trigonometry:

d_{BFA} = f_t/f_s \cdot 5 mm

Therefore, for a typical ratio f_t:f_s of 4:1 or 3:1, you will get a beam size at the objective’s back aperture of 20 mm or 15 mm. This is enough to overfill a typical 40x objective (objective back aperture 8-12 mm, depending on the objective’s NA), but barely enough for a 16x objective or even lower magnification with reasonably high NA (e.g., for NA=0.8, the back aperture is around 20 mm or higher).

Therefore, when you design a beam path or buy a microscope, it is important to plan ahead which objectives you are going to use with it. And this simple equation tells you how large the beam will be at the objective’s back aperture’s location.
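The equation is simple enough to do in your head, but a small helper can be handy when comparing several lens and objective combinations. The sketch below assumes ideal thin lenses; the function names and the focal lengths (50, 150 and 200 mm) are hypothetical choices that merely reproduce the 4:1 and 3:1 ratios.

```python
def beam_diameter_at_back_aperture(d_scan_mm, f_scan_mm, f_tube_mm):
    """Beam diameter behind the scan lens / tube lens telescope (ideal thin lenses)."""
    return d_scan_mm * f_tube_mm / f_scan_mm

def overfills(beam_mm, back_aperture_mm):
    """True if the beam is at least as large as the objective's back aperture."""
    return beam_mm >= back_aperture_mm

# Hypothetical focal lengths giving the 4:1 and 3:1 ratios
d_4to1 = beam_diameter_at_back_aperture(5.0, f_scan_mm=50.0, f_tube_mm=200.0)
d_3to1 = beam_diameter_at_back_aperture(5.0, f_scan_mm=50.0, f_tube_mm=150.0)

print(d_4to1, d_3to1)            # beam diameters in mm
print(overfills(d_3to1, 12.0))   # 40x objective, back aperture up to about 12 mm
print(overfills(d_3to1, 20.0))   # 16x objective with NA 0.8, about 20 mm
```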

(2) How to compute the field of view size of a point scanning microscope

Another very simple calculation to do is to compute the expected size of the field of view (FOV) of your microscope design. This calculation is based on the very same lenses as above, plus the objective. In this case, different rays (colored) illustrate different deflections through the mirror, not the boundaries of the beam as above:

The deflection angle range of the beam, α, is de-magnified by the scan lens/tube lens system to the smaller angle β and then propagates to the objective. The top-bottom spread of the scanned beams on the left indicates the size of the FOV. Using simple trigonometry based on the angle and the distances of lenses to their focal point, one can state (but see also Fabian’s comment below the blog post):

\tan(\alpha) = d/f_s

\tan(\beta) = d/f_t

\tan(\beta) = s_{FOV}/f_o

From which one can derive the expected size of the FOV:

s_{FOV} = \tan(\beta) \cdot f_o = \tan(\alpha) \cdot f_o \cdot f_s/f_t

Interestingly, the FOV size depends linearly on the ratio f_s/f_t. As we have seen above, the beam diameter at the back aperture of the objective depends linearly on f_t/f_s, the inverse. This results in a trade-off: when you try to overfill the back aperture by increasing f_t/f_s, you will automatically decrease the maximal FOV size. It is therefore important to know beforehand what is more important, high NA (and therefore resolution) or a large FOV size. This is determined to some extent, although not exclusively, by the choice of scan lens and tube lens.

To give some real numbers, the deflection angle of a typical resonant scanner is 26°. Let’s say we have f_t/f_s=4. The “effective focal length” of the objective is often neither obvious nor indicated. As a rule of thumb, the magnification (e.g., 16x from Nikon) together with the appropriate tube lens (e.g., 200 mm) can be used to compute the focal length as 200 mm/16 = 12.5 mm. Where does the 200 mm come from? This is a value that is company-specific. Most companies use 200 mm as this standard tube lens focal length, while for example Olympus uses 180 mm as default. (As a side-effect, a 20x objective from Olympus has a shorter focal length than a 20x objective from Nikon.)

Together, we have f_o=12.5 mm, α=26°, f_t/f_s=4, arriving at s_{FOV}=1.5 mm as an estimate for the maximally achievable FOV using these components.
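The same numbers can be plugged into a few lines of Python. This is just the formula from above under the same idealized assumptions (thin lenses, α taken as the optical deflection angle); the function names are hypothetical.

```python
import math

def objective_focal_length_mm(magnification, tube_lens_mm=200.0):
    """Rule of thumb: focal length = design tube lens focal length / magnification."""
    return tube_lens_mm / magnification

def fov_size_mm(alpha_deg, f_obj_mm, ft_over_fs):
    """s_FOV = tan(alpha) * f_o * f_s / f_t, with alpha the optical deflection angle."""
    return math.tan(math.radians(alpha_deg)) * f_obj_mm / ft_over_fs

f_obj = objective_focal_length_mm(16)             # 16x objective, 200 mm tube lens
fov = fov_size_mm(26.0, f_obj, ft_over_fs=4.0)    # roughly 1.5 mm

print(f_obj, round(fov, 2))
```

Changing `ft_over_fs` from 4 to 3 in this sketch directly shows the trade-off discussed above: the FOV grows by a factor of 4/3, while the beam at the back aperture shrinks by the same factor.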

As another side note, when you look up the maximal scan angles of a scanner, there is often confusion between “mechanical” and “optical” scan angle. When a scanner moves mechanically over 10°, the beam will be optically deflected by twice the amount, 20°. For FOV calculations, the optical scan angle should be used, of course.

(3) How to compute the axial focus shift using a remotely positioned tunable lens

The third and final example for geometrical optics is a bit more involved, but it might be interesting also for less ambitious microscope tinkerers to get the gist of it.

A few weeks ago, together with PhD student Gwen Schoenfeld in the lab of Fritjof Helmchen, I wanted to re-design a two-photon microscope in order to enable remote focusing using an electro-tunable lens. Here’s the basic microscope “design” that I started with:

As you can see, the objective, scan lens and tube lens are as simple as in the previous examples. Behind the scan lens (f_s), there is only a slow galvo scanner (for the y-axis), while the fast resonant scanner (for the x-axis) is behind a 1.5:1 relay. There are two ideas behind this “relay” configuration. First, a relay between the two scan mirrors makes it possible to position them more precisely at the focus of the scan lenses (there exist also some more advanced relay systems, but this is a science by itself). Second, the relay system here is magnifying and enlarges the beam size from 5 mm (at the resonant scanner) to 7.5 mm (at the slow galvo scanner). In the end, this results in a smaller FOV size for the x-axis but in a larger beam size at the back of the objective and therefore better resolution.

To insert a remote tunable lens into this system, we had to do this in an optically conjugate plane of the back focal plane. This required the addition of yet another relay. This time, we chose a de-magnifying relay system. The tunable lens had a larger free aperture than the resonant scanner, so it made sense to use the full aperture. Also, as you will see below, this de-magnifying relay system considerably increases the resulting z-scanning range using the tunable lens. For the tunable lens itself, we chose a tunable lens together with a negative offset lens, resulting in a default behavior of the lens as if it did not exist (at least to a first approximation).

Now, before choosing the parts, I wanted to know which z-scanning range would be expected for a given combination of parts. My idea, although there might be simpler ways to get the answer, was to compute the complex beam parameter of the laser beam after passing through the entire lens system. The beam waist and therefore the focus can then be computed as the point where the real part of the beam parameter is zero (check out the Wikipedia article for more details).

To calculate the complex beam parameter after passing through the system, I used geometrical optics, more precisely ABCD optics. ABCD optics is a formalism that expresses geometrical optics with 2×2 matrices. A lens or a stretch of free propagation space is represented as a simple matrix, and the beam propagation (distance from the center line and angle) is then computed by multiplying all the matrices. For the system above, this means the multiplication of 16 matrices, which is not difficult but takes very long by hand. The perfect tool for that is Wolfram’s Mathematica, which is not free but can be tested as a free trial for 30 days per e-mail address. All the details of this calculation are in a Mathematica Notebook uploaded to GitHub.
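For readers without Mathematica, the formalism itself can be sketched numerically in a few lines of Python with NumPy. This is not the notebook mentioned above but a minimal, independent illustration with thin lenses; the function names, focal lengths, beam radius and wavelength are all hypothetical. It first checks that a 4f relay simply magnifies the beam, and then locates the focus of a collimated Gaussian beam behind an objective as the point where the real part of the complex beam parameter q is zero.

```python
import numpy as np

def lens(f):
    """ABCD matrix of an ideal thin lens with focal length f."""
    return np.array([[1.0, 0.0], [-1.0 / f, 1.0]])

def space(d):
    """ABCD matrix of free propagation over a distance d."""
    return np.array([[1.0, d], [0.0, 1.0]])

def system_matrix(*elements):
    """Combine elements, listed in the order in which the beam meets them."""
    m = np.eye(2)
    for e in elements:
        m = e @ m   # later elements are multiplied from the left
    return m

def propagate_q(q, m):
    """Transform the complex beam parameter: q' = (A q + B) / (C q + D)."""
    (a, b), (c, d) = m
    return (a * q + b) / (c * q + d)

# 4f relay with hypothetical focal lengths (de-magnifying, f2 / f1 = 2/3)
f1, f2 = 150.0, 100.0
relay = system_matrix(space(f1), lens(f1), space(f1 + f2), lens(f2), space(f2))
print(relay)   # diagonal matrix with entries -f2/f1 and -f1/f2

# Collimated Gaussian beam (waist at the lens) focused by an objective
wavelength_mm = 920e-6             # 920 nm, a typical two-photon wavelength
w0_mm = 2.5                        # assumed beam radius at the objective
z_rayleigh = np.pi * w0_mm**2 / wavelength_mm
f_obj = 12.5

q_after_lens = propagate_q(1j * z_rayleigh, lens(f_obj))
focus_distance = -q_after_lens.real   # propagation needed until Re(q) = 0
print(focus_distance)                 # very close to f_obj
```

Adding further elements (relays, a tunable lens) only means adding more entries to the `system_matrix` call; the focus is then again found from the real part of the transformed beam parameter.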

Briefly, the result of this rather lengthy mathematical calculation is a simple result, with the main actors being the axial z-shift in the sample (z) and the effective focal length of the tunable lens (f_{ETL}):

z = -\frac{f_o^2 \cdot f_s^2 \cdot f_{r1}^2 \cdot f_{r3}^2}{f_{ETL} \cdot f_t^2 \cdot f_{r2}^2 \cdot f_{r4}^2}

Using this formula and a set of candidate lenses, it is then possible to compute the z-scanning range:

Here, the blue trace is using the formula above. For the red trace, I displaced the ETL slightly from the conjugate position, making the transfer function more linear but the z-range also smaller, and showing the power and flexibility of the ABCD approach using Mathematica.

Of course, this calculation does not give the resolution for any of these configurations. To compute the actual resolution of an optical system, you will have to work with real (not perfect) lenses, using ZEMAX or similar software that can import and simulate lens data from e.g. Thorlabs. But it is a nice playground to develop an intuition. For example, from the equation it is clear that the z-range depends on the magnification of the relay lenses quadratically, not linearly. Also, the objective focal length enters this equation squared. Therefore, if you have such a remote scanning system where a 16x objective results in 200 μm of z-range, switching to a 40x objective will reduce the z-range to only 32 μm!
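The quadratic dependence on the objective focal length can be checked quickly with the formula for z. In the sketch below, the function is a direct transcription of that formula, whereas all focal length values are hypothetical placeholders chosen only to show the scaling; they are not the parts used in the actual microscope.

```python
def z_shift(f_etl, f_o, f_s, f_t, f_r1, f_r2, f_r3, f_r4):
    """Axial focus shift in the sample for a tunable lens with focal length f_etl."""
    numerator = (f_o * f_s * f_r1 * f_r3) ** 2
    denominator = f_etl * (f_t * f_r2 * f_r4) ** 2
    return -numerator / denominator

def objective_focal_length_mm(magnification, tube_lens_mm=200.0):
    """Rule of thumb: focal length = design tube lens focal length / magnification."""
    return tube_lens_mm / magnification

# Hypothetical lens set (all focal lengths in mm), for illustration only
lenses = dict(f_s=50.0, f_t=200.0, f_r1=150.0, f_r2=100.0, f_r3=100.0, f_r4=100.0)
f_etl = -500.0   # hypothetical effective focal length of the tunable lens

z_16x = z_shift(f_etl, f_o=objective_focal_length_mm(16), **lenses)
z_40x = z_shift(f_etl, f_o=objective_focal_length_mm(40), **lenses)

ratio = z_40x / z_16x        # equals (16 / 40)**2, independent of the other lenses
print(ratio, 200.0 * ratio)  # a 200 um range with 16x shrinks accordingly with 40x
```

The ratio does not depend on any of the other lenses, which is why the 200 μm to 32 μm estimate holds regardless of the specific relay design.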

This third example for the use of geometrical optics goes a bit deeper, and the complex beam parameter is, to be honest, not really geometrical optics but rather Gaussian optics (which can however use the power of geometrical ABCD optics, for complicated reasons).

In 2016, I used these mathematical methods for some calculations in our paper on remote z-scanning with a voice coil motor, and it helped me a lot to perform some clean analyses without resorting to ZEMAX.

Apart from that, the calculations of beam size and FOV size are very simple even for beginners and a nice starting point for a better understanding of one’s point scanning microscope.

Posted in Calcium Imaging, Imaging, Microscopy

Public peer review files

Peer review is probably the most obscure part of the publication of scientific results. In this blog post, I would like to make the point that the best way to learn about it, apart from being directly involved, is to read public peer review files. In addition, I will recommend some interesting or surprising peer review files, mostly for systems neuroscience, but also for some optics papers.

Public peer review files should be the standard

Peer review takes place before official “publication” in a peer-reviewed journal and is therefore usually not accessible to the reader. During the last years, this practice has changed, and more and more journals are now offering the peer review and rebuttal letters as supplementary files. This is done for example by the journal eLife, but also Nature Communications, sometimes for the SfN journals, and Nature. For some journals, like Nature, the authors can opt out of the peer review file publication (here is Nature’s policy). But, honestly, if you are not willing to share the reviews publicly, what do you want to hide? I think it should become a mandatory standard to share the reviews, with only specific reasons justifying an opt-out. (Update 2022-02-14: a Twitter-discussion on this topic.)

What to learn from peer review files

For a young researcher such as a PhD student, it is rare to be involved in the review process, which, therefore, remains a black box. For me, it was fascinating to read my first peer review files on eLife maybe 5 years ago. It felt like the entire smooth surface of this paper started to crumble and give way to a richer and more nuanced point of view. Looking back at this paper, this nuanced view was also somewhat included in the paper, but in such a smooth manner that it was difficult to extract without absolute expert knowledge.

Nowadays, I rarely read eLife papers until also the peer review section is online (which comes a couple days after the preliminary pdf). The reviewer comments provide an additional point of view that is very helpful in particular when I cannot fully judge the paper myself.

Additionally, reading such review files helps to write better manuscripts, better rebuttal letters, and also better reviews. Also, gaining experience from these files prepares one a bit for the sometimes very difficult experience of receiving reviews. Plus, reading those reviews makes the entire process a bit more transparent. When I wrote my first reviews for journals, I had seen only three or four reviews for my own co-author papers; but I had seen many more as public review files.

Let’s look into some examples that show what can happen during peer review. Paper links are in the title, links to the review files/sections thereafter.

Toroidal topology of grid cell activity (Review file)

When the preprint of this paper by Gardner et al. from the Moser lab appeared, I put it on my reading list, but never actually read it fully because I did not feel on top of the vast literature about continuous attractor models, and I did not even know whether a “toroidal” geometry of the activity space would be a surprising result or not.

Checking the peer review files after publication at Nature provided exactly this entry point that I had been missing before. In retrospect, all the information needed is also in the paper, but the more direct and less formal language in the reviews took me by the hand and showed me directly what the paper is about. The summaries at the beginning of each review provide a very useful second perspective on the main messages of the paper.

Deep physical neural networks (Review file)

A few weeks ago, I noticed an interesting article published at Nature on “deep physical neural networks”. I checked the author’s Twitter thread on the paper and found the topic intriguing, but slightly beyond my area of expertise. The review file provided me exactly with the lacking context and critical external opinion that I needed to form a somewhat less vague opinion of the specific advance made by the paper. Really helpful!

Place cells respond to moving bars in immobile rats (Review file)

This review file contains an entire story by itself. In the first round of review, reviewer #1 was rather critical, while the other two reviewers were almost convinced already. The authors, in an 8-month revision period, finally managed to address most points brought up by the reviewers very decently. They do this, as has apparently become common practice for rebuttal letters to high-impact journals, in a very long and detailed letter (the entire review file is 44 pages).

But after this round of reviews and rebuttals, suddenly, reviewer #3 changes his/her opinion entirely:

“Unfortunately my evaluation of this paper has changed since the first submission due to comments from the other reviewers and because of literature I have discovered since then. This paper presents a set of reasonably well performed and analyzed experiments, but I no longer think the main results are novel or surprising. I therefore do not recommend publication in Nature and think this paper is better suited for a specialist journal.”

This is just the beginning of a long comment on why the reviewer thinks the paper is not novel enough any more. This is of course a nightmare for the authors. In the end, the authors do their best to address these concerns of novelty. Reviewer #3 remains unconvinced. The editor decides to ignore these concerns and goes with the recommendation of the other reviewers to publish the manuscript. Check it out yourself to form your own opinion on this discussion!

Non-telecentric two-photon microscopy (Review file)

A much wilder story is hidden in the peer review file of this optics paper from Tom Baden’s lab, which was finally published at Nature Communications. The first set of reviews at Nat Comm is still pretty normal, but then something goes wrong. Reviewer #4, who is – very obviously – highly competent but also a bit obsessed with details and annoyed by imprecision, has a few things to complain about, mostly about a few relatively small unsubstantiated claims and some minor errors that do not affect the main idea of the manuscript. However, the authors do not agree with this opinion, and a long and slowly escalating discussion between an annoyed reviewer and frustrated authors evolves. Check it out yourself. If you don’t get stomach pain while reading, you have my full admiration. At some point, reviewer #4 writes:

“In the last round of review, I wrote detailed comments, color-rebuttal, in a 16-page PDF, with the hope that they would be of help to the authors to make the manuscript better. The response I received on this round, surprisingly, is only 5 pages, and the authors reluctantly chose, on purpose or randomly, 5 comments to address, and ignored all other comments I curated. PLEASE RE-WRITE your response by addressing all comments I gave last time.”

Upon “editorial guidance”, the authors refrain from doing so. It all ends with mutual passive aggression (“No further comment” from both sides) – and acceptance for publication as an article.

Read it yourself to see how crazy and tiring peer review can be, and consider yourself how this situation could have been avoided by the authors or the reviewers. However, in the end, this review file is also a contribution to the scientific discussion (e.g., about proper PSF measurements) and therefore valuable by itself. It is a painful document, but also useful and full of insights.

Three-photon microscopy (Paper 1, Reviews; Paper 2, Reviews)

Three-photon microscopy is a relatively new field still, and when these two papers came out in Nature Communications and eLife, respectively, I was very happy to be provided with the additional context and details in the peer review files. I found especially the discussion in the Nat Comm paper (paper 1) about potential concerns for three-photon imaging very interesting.

Somato-dendritic coupling of V1 L5 neurons (Review file)

A few years ago, I had covered a preprint by Francioni et al. from the Rochefort lab on my blog. This study was later published in eLife, and since I already liked this work, I was very curious about the comments of the reviewers, their concerns, and the authors’ replies. It is nice to get additional insights into such interesting studies!

Juxtacellular opto-tagging of CA1 neurons (Review file)

The review file of this beautiful methods paper from the Burgalossi lab tells an interesting story. Apparently, the authors had included an additional experimental sub-study based on cfos in the paper, but the reviewers were not convinced by the results. They therefore suggested – very surprising to me! – the acceptance of the paper for publication, but only after deletion of this subsection. I would not have guessed that such an unexpected and helpful consensus can be reached during the review process. This was probably helped by the fact that at eLife, it is common practice that editors and reviewers discuss their reviews with one another.

Nonlinear transient amplification (Review file)

Purely theoretical (neuroscience) papers are often challenging because it is difficult to fully judge the novelty, even when the concepts and ideas and equations are transparent. This paper by Wu and Zenke is conceptually close to what I have studied experimentally during my PhD (paper link in case you’re interested), so I was happy that this paper got published at eLife, with the review publicly available. A very useful secondary perspective!

In vivo calcium imaging in CA3 (Review file)

This is – so far – the only paper with public review file where I have been involved as an author. I wrote some sections of the rebuttal letter, actually without knowing that the reviews and rebuttals would be openly available afterwards. Unfortunately, the journal (eNeuro) messed up the formatting in such a horrible way that the review file becomes almost unreadable (which is a pity, because our rebuttal letter was very nicely written). This mess-up shows that there is still some progress to make, also in terms of infrastructure.

In general, I hope that public review files will become more common in the future, to the extent that non-public review files will be a thing of the past entirely. Public reviews make the editorial process more transparent, they open up the discussion behind the paper, lower the barriers for junior researchers with less peer review experience, and do not have, to my understanding, any major negative side-effects.


Annual report of my intuition about the brain (2021)

How does the brain work and how can we understand it? I want to make it a habit to report some of the thoughts about the brain that marked me most during the past twelve months at the end of each year – with the hope to advance and structure the progress in the part of my understanding of the brain that is not immediately reflected in journal publications. Enjoy the read! And check out previous year-end write-ups: 2018, 2019, 2020, 2021.

During the last year, I have continued to work on the ideas described during previous year-end write-ups, resulting in a project proposal that is currently under evaluation. I will use this year’s write-up to talk about something different, although related, a recent book by Peter Robin Hiesinger: The Self-Assembling Brain.

Hiesinger, based in Berlin, is working in the field of developmental neurobiology. However, this book is rather a cross-over between multiple disciplines, ranging from developmental neurobiology, circuit neuroscience, artificial intelligence and robotics to many side-branches of the mentioned disciplines. Hiesinger masterfully assembles the perspectives of the different fields around his own points of interest. For example, his introductory discussion about the emergence of the field of artificial intelligence in the 1950s is one of the most insightful accounts that I have read about this period. He tells the stories of how key figures like von Neumann, Minsky, Rosenblatt or McCarthy and their relationships and personalities influenced the further development of the field.

The main hypothesis of Hiesinger’s book is that the genetic code does not encode the endpoint of the system (e.g., the map of brain areas, the default state network, thalamocortical loops, interneuron connectivity, etc.). According to him, and I think that most neuroscientists would agree, the neuronal circuits of the brain are not directly encoded in the genetic code. Instead, the simple genetic code needs to unfold in time in order to generate the complex brain. More importantly, it is, according to Hiesinger, necessary to actually run the code in order to find out what the endpoint of the system is. Let’s pick two analogies brought up in the book to illustrate this unfolding idea.

First, in the preface Hiesinger describes how an alien not familiar with life on earth finds an apple seed. Upon analysis of the apple seed, the alien realizes that there are complex and intricate genetic codes in the apple seed, and it starts to see beauty and meaning in these patterns. However, the analysis based on its structural content would not enable the alien to predict the purpose of the apple seed. This is only possible by development (unfolding) of the seed into an apple tree. Unfolding therefore is the addition of both time and energy to the seed.

Second, Hiesinger connects the unfolding idea with the field of cellular automata, and in particular with the early work of Stephen Wolfram, a very influential but also controversial personality of complexity research, and his cellular automaton named rule 110. The 110 automaton is a very simple rule (the rule is described in this wikipedia article) that is applied to a row of 1’s and 0’s and results in another binary row. The resulting row is again subject to rule 110, etc., leading to a two-dimensional pattern as computed here in Matlab:

The pattern is surprisingly complex, despite the simplicity of the rule. For example, how can one explain the large solitary black triangle in the middle right? How the vertical line of equally sized triangles in the center that ends so abruptly? The answers are not obvious. These examples show that a very simple rule can lead to very complex patterns. From Hiesinger’s point of view, it is important to state that the endpoint of the system, let’s say line 267, cannot be derived from the rule – unless it is developed (unfolded) for exactly 267 iterations. Hiesinger believes that this analogy can be transferred to the relationship between the genetic code and the architecture of the brain.
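For readers who want to unfold the automaton themselves, here is a minimal sketch in Python (not the Matlab code used for the figure). The periodic boundaries, the starting row with a single 1 in the rightmost cell and the function names `rule110_step` and `unfold` are my own assumptions for illustration.

```python
import numpy as np

def rule110_step(row):
    """Apply rule 110 once to a 1D array of 0s and 1s (periodic boundaries assumed)."""
    left = np.roll(row, 1)    # left neighbour of each cell
    right = np.roll(row, -1)  # right neighbour of each cell
    # Encode each (left, centre, right) neighbourhood as a number from 0 to 7
    neighbourhood = 4 * left + 2 * row + right
    # Bit k of the number 110 (binary 01101110) is the new state for neighbourhood k
    return np.right_shift(110, neighbourhood) & 1

def unfold(n_iterations, width):
    """Unfold rule 110 from a single 1 in the rightmost cell; returns all rows."""
    rows = np.zeros((n_iterations + 1, width), dtype=np.int64)
    rows[0, -1] = 1
    for t in range(n_iterations):
        rows[t + 1] = rule110_step(rows[t])
    return rows

# The only way to obtain row 267 is to run all 267 iterations
pattern = unfold(n_iterations=267, width=300)
print(pattern[267])
```

The array `pattern` can be displayed as an image (one row per iteration) to get a two-dimensional pattern of nested triangles. The width of 300 cells is chosen so that the pattern, which grows by one cell to the left per iteration, does not wrap around within 267 iterations.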

The rest of Hiesinger’s book discusses the implications of this concept. As a side-effect, Hiesinger illustrates how complex the genome is in comparison with the simple rule 110 automaton. Not only is the code richer and more complex, but it is also, due to transcription factor cascades that include feedback loops, a system of rules where rules, unlike rule 110, change over time with development. Therefore, according to Hiesinger, the classical research in developmental biology that tries to match single genes (or a few genes) onto a specific function is ill-guided. He convincingly argues that the examples for such relationships that have been found as “classical” examples for the field (e.g., genes coding for cell adhesion molecules involved in establishing synaptic specificity) are probably the exception rather than the rule.

The implication of the unfolding hypothesis for research on artificial intelligence is, interestingly, very similar. That is, to stop treating intelligent systems like engineered systems, where the rules can be fully designed. Since the connection between the generative rules and the resulting endpoint system cannot be understood unless their unfolding in time is observed, Hiesinger is in favor of research that embraces this limitation. He suggests building models based on a to-be-optimized (“genetic”) code and, letting go of full control, making them unfold in time to generate an artificial intelligence. Of course, this idea is reminiscent of the existing field of evolutionary algorithms. However, in classic evolutionary algorithms, evolving properties of the code are more or less directly mapped to properties of the network or the agent. If I understood the book right, it would be in Hiesinger’s spirit to make this mapping more indirect through developmental steps that allow for higher complexity, even though it would also obfuscate the mechanistic connection between rules and models.

Overall, I find Hiesinger’s approach interesting. He shows mastery of other fields as well, and he makes a convincing point that the idea of the unfolding code, the self-assembling brain, is reasonable; he also brings up examples of research that goes in that direction. However, as a note of caution to myself, accepting the idea of self-assembly seemed a bit like giving in when faced with complexity. There is a long history of complexity research that agreed on the fact that things are too complex to be understood. Giving in resulted in giving vague names to the complex phenomena, which seemed to explain away the unknown but in reality only gave it a name. For example, the concepts of emergence, autopoiesis or the free energy principle are in my opinion relatively abstract and non-concrete concepts that contributed to the halting of effective research by preventing incremental progress on more comprehensible questions. I get similar vibes when Hiesinger states that the connections between the self-organizing rules and the resulting product are too complex to be understood and require unfolding in time. The conclusion of this statement is either that everything is solved, because the final explanation is unfolding in time of a code that cannot be understood; or it is that nothing can be solved because it is too complex. In both cases, there seems to be some sort of logical dead end. But this just as a note of caution to myself.

So, what is the use of the unfolding hypothesis about the organization and self-assembly of the brain? I think it is useful because it might help guide future efforts. I agree with Hiesinger that the field of “artificial intelligence” should shift its focus to self-organized and growing neuronal networks. In my opinion, work focusing on evolutionary algorithms, actor-based reinforcement learning (e.g., something called neuroevolution), neural architecture search or more generally AutoML goes in the right direction. Right now it seems a long shot to say this, but my guess is that these forms of artificial neuronal networks will become dominant within 10 years, potentially replacing artificial neuronal networks based on backpropagation. – After finishing the write-up, I came across a blog post by Sebastian Risi that is a good starting point with up-to-date references on self-assembling algorithms from the perspective of computer science and machine learning – check it out if you want to know more.

For neurobiology, on the other hand, the unfolding hypothesis means that an understanding of the brain requires understanding of its self-assembly. Self-assembly can happen, as Hiesinger stresses, during development, but it can also happen in the developed circuit through neuronal plasticity (synaptic plasticity on short and long time scales, as well as intrinsic plasticity). I have written about this self-organizing aspect of neuronal circuits in my last year’s write-up. Beyond that, if we were to accept the unfolding hypothesis as central to the organization of the brain, we would also be pressured to drop some of the beautiful models of the brain that are based on engineering concepts like modularity. For example, the idea of the cortical column, the canonical microcircuit, or the concept of segregated neuronal cell types. All those concepts have been obviously very useful frameworks or hypotheses to advance our understanding of the brain, but if the unfolding of the brain is indeed the main concept of its assembly, these engineering concepts are unlikely (although not impossible) to turn out to be true.

It is possible that most of the ideas are already contained in the first few pages, and the rest of the book is less dense and often feels a bit redundant. But especially the historical perspective in the beginning and also some later discussions are very interesting. Language-wise, the book could have benefitted from a bit more interference by the editor to avoid unnatural-sounding sentences, especially during the first couple of pages. But this is only a minor drawback of an otherwise clear and nice presentation.

The fictional characters of a systems neuroscientist (Alfred), an AI researcher (Pramesh), a developmental biologist (Minda) and a robotics researcher (Aki) discuss how developmental growth could be implemented for artificial neuronal networks.

The book is structured into ten “seminars”, each of which is a slightly confusing mix of book chapter and lecture style. Each of the “seminars” is accompanied by a staged discussion between four actors: a developmental biologist, an AI researcher, a circuit neuroscientist and a robotics engineer (see the photo above). Theoretically, this is a great idea. In practice, it works only half of the time, and the book loses a bit of its natural flow because a clear direction is missing. However, these small drawbacks are acceptable because the main ideas are interesting and enthusiastically presented.

Altogether, Hiesinger’s book is worth the time to read it, and I can recommend it to anybody interested in the intersection of biological brains, artificial neuronal networks and self-organized systems.


Large-scale calcium imaging & noise levels

Calcium imaging based on two-photon scanning microscopy is a standard method to record the activity of neurons in the living brain. Due to the point-scanning approach, sampling speed is limited and the dwell time on a single neuron reduces with the number of recorded neurons. Therefore, one needs to trade off the number of quasi-simultaneously imaged neurons versus the shot noise level of these recordings.

To give a simplified example, one can distribute the laser power in space and time over 100 neurons at 30 Hz, or 1000 neurons at 3 Hz. Due to the lower sampling rate, the signal-to-noise-ratio (SNR) of the 1000 neurons will decrease as well.

A standardized noise level

To compare the shot noise levels across recordings, in our recent paper (Rupprecht et al., 2021) we took advantage of the fact that the slow calcium signal is typically very similar between adjacent frames. Therefore, the noise level can be estimated by

\nu  = \frac{Median_t \mid \Delta F/F_{t+1} - \Delta F/F_t \mid}{\sqrt{f_r}}

The median makes sure to exclude outliers that stem from the fast onset dynamics of calcium signals. The normalization by the square root of the frame rate f_r renders the metric comparable across datasets with different frame rates.

Why the square root? Because shot noise decreases with the number of sampling points with a square root dependency. The only downside of this measure is that the units seem a bit arbitrary (% for dF/F, divided by the square root of seconds), but this does not make it less useful. To compute it on a raw dF/F trace (percent dF/F, no neuropil subtraction applied), simply use this one-liner in Matlab:

noise_level = median(abs(diff(dFF_trace)))/sqrt(framerate)

Or in Python:

import numpy as np
noise_level = np.median(np.abs(np.diff(dFF_trace)))/np.sqrt(framerate)

If you want to know more about this metric, check out the Methods part of our paper for more details (bioRxiv / Nature Neuroscience, subsection “Computation of noise levels”).
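To illustrate why the normalization by the square root of the frame rate makes recordings comparable, here is a small sketch with synthetic data. It assumes a trace that consists of pure Gaussian noise (no calcium signal), and it mimics a 10-fold slower recording by averaging blocks of 10 frames; both are simplifications made up for this example and not taken from real data.

```python
import numpy as np

def noise_level(dff_trace, framerate):
    """Standardized noise level: median absolute frame-to-frame difference / sqrt(frame rate)."""
    return np.median(np.abs(np.diff(dff_trace))) / np.sqrt(framerate)

rng = np.random.default_rng(0)

# Synthetic trace: Gaussian noise in % dF/F, sampled at 30 Hz
framerate_fast = 30.0
trace_fast = rng.normal(loc=0.0, scale=5.0, size=60000)

# Same trace at 3 Hz: average non-overlapping blocks of 10 frames
binning = 10
trace_slow = trace_fast.reshape(-1, binning).mean(axis=1)
framerate_slow = framerate_fast / binning

# Without normalization, the slow trace looks less noisy ...
raw_fast = np.median(np.abs(np.diff(trace_fast)))
raw_slow = np.median(np.abs(np.diff(trace_slow)))

# ... with normalization, both traces give approximately the same noise level
nu_fast = noise_level(trace_fast, framerate_fast)
nu_slow = noise_level(trace_slow, framerate_slow)

print(raw_fast, raw_slow)
print(nu_fast, nu_slow)
```

The raw frame-to-frame differences differ by roughly the square root of the binning factor, while the two normalized noise levels agree up to sampling error.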

The metric \nu comes in handy if you want to compare the shot noise levels between calcium imaging datasets and understand whether noise levels are relatively high or low. So, what is a “high” noise level?

Comparison of noise levels and neuron numbers across datasets

I collected a couple of publicly available datasets (links and descriptions in the appendix of the blog post) and extracted both the numbers of simultaneously recorded neurons and the shot noise level \nu. Each data point stands for one animal, except for the MICrONS dataset, where each data point stands for a separate session in the same animal.

As a reference, I used the Allen Brain Institute Visual Coding dataset. For excitatory neurons, typically 100-200 neurons were recorded with a standard noise level of 1 (units omitted for simplicity). If you distribute the photons across an increasing number of neurons, the shot noise levels should increase with the square root of this multiple (indicated by the black line). Datasets with inhibitory neurons (de Vries et al., red) have by experimental design fewer neurons and therefore lie above the line.

A dataset that I recorded in zebrafish with typically 800-1500 neurons per recording lies pretty much on this line. So does the MICrONS dataset, where a mesoscope was used to record from several thousand cortical neurons simultaneously at the cost of a lower frame rate and therefore higher noise levels, and the dataset by Sofroniew et al., which contains ca. 3000 neurons, all recorded from one plane in a large FOV.

Two datasets acquired by Pachitariu and colleagues stand out a bit by pushing the number of simultaneously recorded neurons. In 2018, this came at the expense of increased noise levels (pink). In 2019 (a single mouse; grey), despite a dataset with ca. 20,000 simultaneously recorded neurons, the noise level was impressively low.

In regular experiments, in order to mitigate possible laser-induced photodamage or problems due to overexpression of indicators, noise levels should not be minimized at the cost of physiological damage. For example, the mouse from the MICrONS dataset was later used for dense EM reconstruction; any sort of damage to the tissue, which might be invisible at first glance, could complicate subsequent diffusive penetration with heavy metals or the cutting of nanometer-thick slices. As a bottom line, there are often good reasons not to go for the highest signal yield.

Spike inference for high noise levels

To give an idea about the noise level, here is an example for the MICrONS dataset. Due to the noisiness of the recordings (noise level of ca. 8-9), only large transients can be reliably detected. I used spike inference through CASCADE to de-noise the recording. It is also clear from this example that CASCADE extracts useful information, but won’t be able to recover anything close to single-spike precision for such a noise level.

Above are shown the smooth inferred spike rates (orange) and also the discrete inferred spikes (black). The discrete spikes (black) are nice to look at, but due to the high noise levels, the discretization into binary spikes is mostly overfitting to noise and should be avoided for real analyses. For analyses, I would use the inferred spike rate (orange).


The noise level \nu can be used to quantitatively compare noise levels across recordings. I hope that other people can use this noise level metric \nu for their work.

As a note of caution, \nu should never be the sole criterion for data quality. Other factors like neuropil contamination, spatial resolution, movement artifacts, potential downsides of over-expression, etc. also play important roles. Low shot noise levels are not a guarantee for anything. High shot noise levels, on the other hand, are always undesirable.


Appendix: Details about the data shown in the scatter plot

de Vries et al. (2020; red and black) describes the Allen Visual Coding Observatory dataset. It includes recordings from more than 100 mice with different transgenic backgrounds in different layers of visual-related cortices. Red dots are datasets from mice that only expressed calcium indicators in interneurons, while black dots are datasets with cortical principal neurons of different layers. The datasets are highly standardized and of low shot noise levels (standardized level of ca. 1.0), with relatively few neurons per dataset (100-200).

Rupprecht et al. (unpublished; green) is a small dataset in transgenic Thy-1 mice in hippocampal CA1 that I recorded as a small pilot earlier this year. The number of manually selected neurons is around 400-500, at a standardized noise level of 2.0-3.0. With virally induced expression and with higher laser power (here, I used only 20 mW), lower noise levels and higher cell counts could be easily achieved in CA1.

Rupprecht et al. (2021; violet) is a dataset using the small dye indicator OGB-1 injected in the homolog of olfactory cortex in adult zebrafish. At low laser powers of ca. 30 mW, 800-1500 neurons were recorded simultaneously at a standardized noise level of 2.0-4.0.

Sofroniew et al. (2016; light green) recorded a bit more than 3000 neurons simultaneously at a relatively low imaging rate (1.96 Hz). Different from all other datasets with >1000 neurons shown in the plot, they recorded only from one single but very large field of view. All neuronal ROIs had been drawn manually, which I really appreciate.

Pachitariu et al. (2018; pink) is a dataset recorded at a relatively low imaging rate (2.5 Hz), covering ca. 10,000 neurons simultaneously. The standardized noise level seems to be rather high according to my calculations.

Pachitariu et al. (2019; black) is a similar dataset that contains ca. 20,000 neurons, but at a much lower standardized noise level (4.0-5.0). The improvement compared to the 2018 dataset was later explained by Marius Pachitariu in this tweet.

MICrONS et al. (2021; red) is a dataset from a single mouse, each dot representing a different session. 8 imaging planes were recorded simultaneously at laser powers that would not damage the tissue, in order to preserve the brain for later slicing, with the ultimate goal to image the ultrastructure using electron microscopes. The number of simultaneously imaged neurons comes close to 10,000, resulting in a relatively high standardized noise level of 7.0-10.0.
[Update, November 2021] As has become clear after a discussion with Jake Reimer on Github, the MICrONS data that I used were not properly normalized; it was not proper dF/F but with a background subtraction. The noise measure for this dataset is therefore not very meaningful, unfortunately. My guess is that the true noise level is in the same order of magnitude as shown in the plot above, but I cannot tell for sure.

The black line indicates how the noise level scales with the number of neurons. For n_1 = 150 neurons (Allen dataset, de Vries et al.), a standardized noise level of \nu_1 = 1.0 can be assumed. For higher numbers of neurons n_2, the noise level \nu_2 scales with \nu_2 = \nu_1*\sqrt{n_2/n_1}. Deviations from the line indicate where recording conditions were better or worse compared to these “typical” conditions.
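The scaling of the black line can be written down in a few lines of Python. This is only a sketch of the formula above; the function name `expected_noise_level` is made up for this example, and the reference values n_1 = 150 and \nu_1 = 1.0 are the ones stated in the text.

```python
import math

def expected_noise_level(n_neurons, n_ref=150, nu_ref=1.0):
    """Noise level on the black line: nu_2 = nu_1 * sqrt(n_2 / n_1)."""
    return nu_ref * math.sqrt(n_neurons / n_ref)

# Expected noise levels for a few neuron counts
for n_neurons in (150, 1500, 10000, 20000):
    print(n_neurons, round(expected_noise_level(n_neurons), 2))
```

Comparing a dataset with this value shows on which side of the line it lies. For example, the 2019 dataset by Pachitariu et al. with ca. 20,000 neurons and a standardized noise level of 4.0-5.0 lies clearly below the value that the line predicts for this number of neurons.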


5 reasons why to use Cascade for spike inference

Our paper on A database and deep learning toolbox for noise-optimized, generalized spike inference from calcium imaging is out now in Nature Neuroscience. It consists of a large and diverse ground truth database with simultaneous calcium imaging and juxtacellular recordings across almost 300 neurons. We used the database to train a supervised algorithm (“Cascade”) to infer spike rates from calcium imaging data.

If you are into calcium imaging, here are 5 reasons why you should use Cascade.

1. You don’t have to install it

You are not familiar with Python? Or you don’t want to install dependencies? No problem. Click this link. The link will bring you to an online interface where you can work with Cascade. You can upload your data (dF/F traces) as *.mat- or *.npy-files and download the results (inferred spike rates). Trying out the package is as quick as it can get. The online interface is a Google Colaboratory and therefore runs on servers provided by Google for free.

However, you can also install Cascade locally on your computer. Just go to the Github page and follow the installation instructions (tested on Ubuntu, Windows and Mac). This might be useful if you want to integrate spike inference into an existing workflow. People including myself have used Cascade together with CaImAn or Suite2p. You can use the dF/F output of these packages and apply the Cascade “predict” function as seen in the demo scripts. It’s a single-line addition of code to your pipeline.

Personally, I used Cascade with a GPU for the analyses in the paper. For daily work with calcium imaging recordings in mice (typically a few hours of recordings and a few hundreds of neurons), I run it on a computer without dedicated GPU because it’s fast enough on any reasonable CPU (seconds to minutes). Sometimes, I also use the online Colaboratory Notebook. The results are identical, whether I use a local installation or the Notebook, since the code is identical.

2. You don’t have to tune parameters

For each neuron from a data set, the algorithm automatically detects the noise level and chooses an appropriately trained model. The models are trained across many different conditions and indicators. This broadly trained model ideally generalizes to unseen data. Check out Figure 3 in the paper if you want to know more about the details.
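The idea of matching each neuron to a model by its noise level can be sketched as follows. This is not the actual Cascade code, only an illustration under assumptions: the function names, the synthetic noise-only traces and the list of trained noise levels (2 to 8) are hypothetical, and the noise level is computed with the metric \nu from the previous blog post.

```python
import numpy as np

def noise_levels_per_neuron(dff, framerate):
    """Noise level for each row of dff (neurons x timepoints, in % dF/F); NaNs are ignored."""
    frame_to_frame = np.abs(np.diff(dff, axis=1))
    return np.nanmedian(frame_to_frame, axis=1) / np.sqrt(framerate)

def pick_model_noise_level(nu, trained_levels):
    """For each neuron, return the trained noise level closest to its measured noise level."""
    trained_levels = np.asarray(trained_levels)
    closest = np.argmin(np.abs(nu[:, None] - trained_levels[None, :]), axis=1)
    return trained_levels[closest]

rng = np.random.default_rng(0)

# Three synthetic neurons with different amounts of Gaussian noise (% dF/F), recorded at 30 Hz
framerate = 30.0
noise_sd = np.array([12.0, 25.0, 45.0])
dff = rng.normal(loc=0.0, scale=noise_sd[:, None], size=(3, 9000))

# Hypothetical set of noise levels for which models were trained
trained_levels = [2, 3, 4, 5, 6, 7, 8]

nu = noise_levels_per_neuron(dff, framerate)
chosen = pick_model_noise_level(nu, trained_levels)
print(nu, chosen)
```

In this sketch, each neuron would be processed with the model that was trained on data resampled to the chosen noise level, so that noisy and clean neurons from the same recording are not forced through the same model.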

To get started, you have to choose a model based on the frame rate of your recordings and on the temporal resolution that you want to achieve with spike inference. The FAQ, which can be found both on the Github Readme and at the end of the Colab Notebook, gives you guidance if any doubts remain.

3. Estimate absolute spike rates

Have you ever wondered whether a specific calcium transient corresponded to a single action potential or a burst of many action potentials? At least it would be nice to have some estimate of spike rates for a given recording. Cascade gives you this estimate.

The next question is a bit more tricky: how precise is that estimate? – It is as exact as it can get if you apply an algorithm to calcium imaging data where you do not have an associated ground truth. We have quantified this precision in the paper in terms of correlation (=variance explained), absolute errors and typical biases (see Figure 3 and Extended Data Figure 4). In the end, the typical errors depend on the data quality and noise levels. However, you should expect that the true spike rate might be as low as 0.5 times or as high as 2.0 times the spike rate estimated by Cascade. On average across neurons, the error will be lower. This is not single-spike precision, but, as bad as it sounds, this is as good as it gets. The imprecision is, among other things, due to the unpredictable heterogeneity of the spike-calcium-relationship across neurons.

However, absolute estimates with a certain imprecision, which you get from Cascade, are still better than results that do not have an immediate meaning (dF/F scores).

4. Improve temporal precision

Spike inference, also referred to as temporal “deconvolution” of calcium recordings, improves the temporal resolution by getting rid of the slow calcium transient. The slower the calcium transient, the more pronounced the improvement of deconvolution. And, yes, the algorithm generalizes well across short and long time constants.

Applications where I have used Cascade myself to achieve improved temporal resolution:

a) Detection of fast sequences on a sub-second time scale (check out Figure 5e in our toolbox paper)

b) Detection of swimming-locked neuronal activity in head-fixed adult zebrafish. Adult zebrafish move on a sub-second time scale, faster than the calcium indicators we used in their brains at room temperature.

c) Locking of neuronal activity to oscillations in the hippocampus (ongoing work, not yet published).

If your observations are masked by slow indicator transients, give it a shot and try out Cascade.

5. De-noise your recordings

When I used Cascade for the first time, I was surprised how well it de-noised recordings. The Cascade paper is full of examples where this is validated with ground truth.

Or, have a look at the spike rate predictions for the Allen Brain Observatory data set (Figure 6b,f,g,h; Extended Data Figure 10). Shot noise is removed, and slower transients due to movement artifacts are rejected. The algorithm simply has learned very well what an action potential looks like.

One of the most striking examples, however, was when we tested the effect of Cascade on population imaging analyses (Supplementary Figure 11, the only figure not yet included in the preprint). To this end, we used NAOMi to simulate neuronal population patterns and analyzed how well the correlations between neuron pairs were predicted from dF/F traces (red) or from spike rates inferred with Cascade (blue). For dF/F traces, correlations were often either overestimated (among other reasons, due to slow calcium transients) or underestimated (due to overwhelming noise). Pairwise correlations computed from Cascade’s spike rates are simply closer to the true correlations.
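The comparison of pairwise correlations can be sketched in a few lines. This is only a toy illustration under my own assumptions: the random Poisson "spike rates", the exponential kernel and the noise level are made-up stand-ins, not NAOMi simulations and not Cascade output, and the helper names are hypothetical.

```python
import numpy as np


def pairwise_correlations(traces):
    """Pearson correlation for every neuron pair; traces has shape (neurons, time)."""
    corr = np.corrcoef(traces)
    upper = np.triu_indices(corr.shape[0], k=1)  # each pair once, no diagonal
    return corr[upper]


def correlation_error(estimate, truth):
    """Mean absolute difference between estimated and true pairwise correlations."""
    diff = pairwise_correlations(estimate) - pairwise_correlations(truth)
    return float(np.mean(np.abs(diff)))


# Toy data (assumption): 5 neurons, 2000 time points of Poisson spike counts.
rng = np.random.default_rng(0)
rates = rng.poisson(0.1, size=(5, 2000)).astype(float)

# Toy dF/F (assumption): spikes convolved with a slow exponential kernel plus noise.
kernel = np.exp(-np.arange(60) / 20.0)
dff = np.array([np.convolve(r, kernel)[: rates.shape[1]] for r in rates])
dff += rng.normal(0.0, 0.5, size=dff.shape)

print("error of dF/F-based correlations:", correlation_error(dff, rates))
```

To use this with real data, one would pass the dF/F traces and the inferred spike rates as `estimate` and a ground truth (if available) as `truth`.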

Therefore, if you want to get the best out of your 2P calcium imaging data, I would recommend using Cascade. The result is simply closer to the true neuronal activity.


Fast scanning, triplet states and photon yield

In point-scanning microscopy like two-photon or confocal microscopy, a focused laser beam is scanned across the field of view and thereby sequentially recovers an image of the object. In this blog post, I will discuss the idea that scanning faster across the field of view would increase the total amount of collected fluorescence. This idea is based on the experimental finding that high-intensity laser light could induce long-lived and non-fluorescent triplet states of the fluorophore molecule while scanning the sample; when the laser is scanned only slowly across the sample, it would therefore try to image fluorophores that are already in their “dark” triplet states. Imaging dark fluorophores would result in an overall decreased fluorescence yield. I tested this hypothesis directly with resonant scanning two-photon microscopy in living brain tissue with typical fluorophores (GCaMP6f, OGB-1). The main result from these experiments is that I could not find a substantial effect of the triplet states under these realistic conditions, and therefore no advantage in terms of fluorescence yield gained by an increased scan speed.


Fluorophores are molecules that can re-emit light after absorbing a photon themselves. This effect of fluorescence can be described as a state change of the fluorophore from ground state to an excited state upon absorption of the incoming photon, and as a state change from an excited state back to the ground state, together with emission of the outgoing photon (see this Wikipedia article). The lifetime of the fluorescent state is often a few nanoseconds. However, there are often additional excited states that are more long-lived, such as so-called triplet states. The transition probabilities from ground state to triplet state (and vice versa) are very low because such transitions require a spin flip and are therefore suppressed by quantum-mechanical spin selection rules. The lower transition probabilities make these states less likely to occur during fluorescence microscopy but also render the triplet state longer-lived once it is attained. These long-lived triplet states could have rather undesired consequences, since a fluorophore in the triplet state is unable to absorb or emit photons, therefore becoming “dark” or non-functional from the experimenter’s perspective.

When triplet states play a prominent role in laser-scanning microscopy, one can avoid their detrimental effects by simply scanning faster. Scanning slowly across the sample means that the same fluorophores are hit with light over and over within a short time window. In such a scenario, some of the fluorophores are already in a dark triplet state, resulting in lower overall fluorescence. Faster scanning avoids this problem, and it has been shown for confocal microscopy (Borlinghaus, 2006) as well as STED microscopy (Schneider et al., 2015; Wu et al., 2015) that a large signal increase can be achieved simply by faster scanning.

For two-photon microscopy, the situation is less clear. It has been shown that scanning at all compared to non-scanning two-photon fluorescence correlation spectroscopy results in higher photon yield (Petrášek and Schwille, 2008), but this specific fluorescence configuration cannot be translated easily to two-photon imaging used by neuroscientists. However, there is one publication that demonstrated that microsecond-long triplet states do play a large role for two-photon microscopy (Donnert et al., 2007). This paper has been cited as a standard reference to justify the advantages of fast scanning approaches (e.g., Chen et al., 2011), and some two-photon microscopy approaches directly designed their scanning approaches to reduce triplet-induced reduction of fluorescence yield (Gautam et al., 2015; Castanares et al., 2016; Karpf et al., 2020). However, another study showed an effect that seemed to be contradictory to the Donnert et al. results (Ji et al., 2008), and the interpretation of both papers was discussed repeatedly, e.g., by Andrew Hires and on Labrigger. The consensus, if there was any, seemed to be that it probably depends on the sample. The experiments by Donnert et al. had been done with GFP fixed on a coverslip, but one can easily imagine that a fluorescent protein in vivo might behave differently.

So I decided to probe these results with typical samples and procedures used by neuroscientists – using video-rate calcium imaging of neurons in the living brain. I did not do this out of pure curiosity about photophysics, but because of the implications of these photophysics for the design of two-photon imaging modalities. If triplet states were indeed an important factor for two-photon imaging of such samples as suggested by Donnert et al., a slightly modified scanning scheme (or parallelized scanning schemes, as reviewed by Weissenburger and Vaziri, 2018, Figure 3) or adapted laser repetition rates would have huge benefits for the total fluorescence yield.

The main finding of the Donnert et al. paper was the induction of dark triplet states by the scanning laser, and a key prediction was that faster scanning would decrease the probability that an imaged fluorophore is in a dark state; in short, that faster scanning would increase fluorescence yield. Therefore, I wanted to systematically understand whether fluorescence for two-photon microscopy depends on scan speed. To study this dependence experimentally in a typical in vivo sample, I used a resonant scanning microscope. For a resonant scanning microscope, the physical scan speed across the sample can be adjusted by changing the scan amplitude (often called the Zoom setting). High zoom would result in low scan speed across the sample, while low zoom would result in high scan speed, enabling a simple characterization of scan speed versus fluorescence yield. According to Donnert et al., higher scan speed would result in higher fluorescence yield compared to slower scan speed when imaging the same spot.

Experimental approach

For a resonant scanning microscope, the scan speed can be derived from the position x(t) of the laser beam focus:

x(t) = x_0 \cdot \sin( t \cdot f_{res} \cdot 2 \pi ),

with the resonant frequency (f_{res} = 8 kHz) and the amplitude (x_0 = 500 μm), which determines roughly the size of the field of view (FOV, here: 1000 μm). The scan speed v(t) is the derivative of the position x(t) with respect to time and therefore also follows a sinusoidal trajectory that reaches its peak in the center of the FOV:

v(t) = x_0 \cdot f_{res} \cdot 2 \pi \cdot \cos( t \cdot f_{res} \cdot 2 \pi ),

with the speed at the center of the FOV given by

v_{max} = x_0\cdot f_{res} \cdot 2 \pi.

For higher zoom values (i.e., a smaller FOV), the value is reduced by the same factor (at least in the microscopy software I used for these experiments!). Thereby, it is possible to test a wide range of scan speeds just by changing the zoom setting of the resonant scanner. Since the resonant scanner can span zoom levels between 1x and 20x (for much higher zoom settings, a resonant scanner can become unstable due to the small scan amplitude), a range of speeds between 25 μm/μs and 1.25 μm/μs is spanned.

This speed should be compared to the lateral resolution of the microscope, which, for the microscope used in this experiment, is around 0.4 μm FWHM (full width at half maximum of the point spread function). To scan over this resolution-limited spot, the resonant scanner needs 0.016 – 0.32 μs. For a pulsed laser with a repetition rate of 80 MHz with 12.5 ns between two pulses, this corresponds to 1.3 – 26 pulses per resolution-limited spot. Let’s call this number the cumulative pulse count per resolution-limited spot (CPC) from now on.
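The arithmetic behind these numbers can be written down as a short sketch. All parameter values are taken from the text; the function names are hypothetical, and the 1/zoom scaling of the scan amplitude is the behavior described above for the software I used, which may differ on other systems.

```python
import math

X0_UM = 500.0        # scan amplitude at zoom 1 (from the text)
F_RES_HZ = 8_000.0   # resonant frequency (from the text)
FWHM_UM = 0.4        # lateral resolution, FWHM (from the text)
REP_RATE_HZ = 80e6   # laser repetition rate (from the text)


def peak_scan_speed_um_per_us(zoom):
    """Peak speed in the FOV center: v_max = x_0 * f_res * 2*pi, with x_0 scaled by 1/zoom."""
    v_um_per_s = (X0_UM / zoom) * F_RES_HZ * 2.0 * math.pi
    return v_um_per_s * 1e-6  # convert um/s to um/us


def cumulative_pulse_count(zoom):
    """Number of laser pulses while the focus crosses one resolution-limited spot."""
    dwell_us = FWHM_UM / peak_scan_speed_um_per_us(zoom)
    pulse_interval_us = 1e6 / REP_RATE_HZ  # 12.5 ns expressed in microseconds
    return dwell_us / pulse_interval_us


for zoom in (1, 5, 10, 20):
    speed = peak_scan_speed_um_per_us(zoom)
    cpc = cumulative_pulse_count(zoom)
    print(f"zoom {zoom:2d}: v_max = {speed:5.2f} um/us, CPC = {cpc:4.1f}")
```

Without intermediate rounding, the CPC comes out slightly below the rounded values quoted above (roughly 1.3 at zoom 1 and 25 at zoom 20), which does not change the argument.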

Intuitively, this means that each time a laser pulse tries to excite a fluorophore, this fluorophore has very recently already been hit by a number n of previous pulses, approximately n = CPC/2. If there is a chance that a pulse drives the fluorophore into a dark state that makes the fluorophore non-excitable for a handful of microseconds (e.g., a triplet state), then the CPC can be used to calculate the number of fluorophores remaining to be excited:

N_{remaining} = N_0 \cdot e^{-CPC/\lambda}

λ is a constant that depends on the applied laser intensity and the fluorophore itself. More concretely, λ is the number of pulses after which the fraction of excitable fluorophores has dropped to 1/e; it is approximately the inverse of the probability per pulse that a fluorophore goes into the dark state. The exponential decay with CPC should hold true as long as these events occur within a time window that is shorter than the recovery timescale of the dark state (for triplet states, this is ca. 1 μs). The main consequence is that the fluorescence yield is proportional to the remaining number of excitable fluorophores, N_{remaining}.

Therefore, if these events that generate dark triplet states were very unlikely (e.g., λ ~ 500), fast scanning would not really make a difference. However, if λ was ~1, the effects would be dramatic. How can we distinguish these scenarios?
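To get a feeling for the two scenarios, here is a minimal sketch that evaluates the exponential model from above at the two extreme CPC values of the resonant scanner (1.3 and 26, from the text). The function names are hypothetical, and the model is taken at face value without any recovery of the dark state between pulses.

```python
import math


def remaining_fraction(cpc, lam):
    """Fraction of still excitable fluorophores, N_remaining / N_0 = exp(-CPC / lambda)."""
    return math.exp(-cpc / lam)


def fast_scan_gain(lam, cpc_fast=1.3, cpc_slow=26.0):
    """Predicted ratio of fluorescence yield at zoom 1 (fast) versus zoom 20 (slow)."""
    return remaining_fraction(cpc_fast, lam) / remaining_fraction(cpc_slow, lam)


for lam in (500.0, 1.0):
    print(
        f"lambda = {lam:5.0f}: "
        f"fast = {remaining_fraction(1.3, lam):.4f}, "
        f"slow = {remaining_fraction(26.0, lam):.2e}, "
        f"gain = {fast_scan_gain(lam):.3g}"
    )
```

For λ ~ 500, the predicted gain from fast scanning is only a few percent, while for λ ~ 1 almost all fluorescence would be lost at high zoom. A measurement of fluorescence versus CPC should therefore separate these scenarios easily.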

Experimental implementation, part I

With a resonant scanning microscope, the experiment is easy to perform for a dye in a solution. You simply have to generate a homogeneous sample and image it first with high zoom and then with low zoom, then compare the fluorescence in the central region where the scan speed is maximal.

Unfortunately, biological samples are not homogeneous, and also not stationary in the case of calcium indicators in living neurons. To image in a biological but still rather homogeneous sample, I used a transgenic GCaMP6f zebrafish line. I imaged in an explant of a dorsal forebrain region that I knew was labeled very densely and very homogeneously (described by Huang et al., 2020). But the neuronal somata with the nice nuclear exclusion of GCaMP generated a lot of undesired variability.

The solution to circumvent these sources of variability is systematic averaging. In a first approach, I took advantage of the fact that I had written large parts of the microscopy control software myself. I wrote a helper program that performed continuous imaging but randomly moved the stage to different positions every few seconds, thereby averaging across all these inhomogeneously labeled FOVs:

After a minute, the program automatically changed the Zoom setting to a different, random value and recorded again the video with intermittent stage movements:

To see an effect, I performed these experiments with rather high laser power (50-80 mW below the objective) and at rather low wavelengths (typically 800 nm, which had been used previously by Donnert et al.).

In the following plot, each data point corresponds to one movie as shown above. Keep in mind that a zoom setting corresponds to a specific CPC. For example, a zoom setting of 1 corresponds to a CPC of 1.3. However, I could not see any dependence of the total fluorescence on the cumulative pulse count CPC (left), so it was not worth determining a numeric value for λ. Interestingly, the total fluorescence decreased slightly but clearly visibly over time (right), but since the zoom setting sequence was randomized, this did not reflect any dependence on the CPC:

This finding also held true for repetitions of the same experiment at different locations of the fish brain and with slightly changed wavelength or average power:

Next, I performed the same experiment with a different fluorophore. I injected OGB-1 into the homolog of piriform cortex of zebrafish. This is a large and relatively uniform region in the zebrafish forebrain that allows the dye to diffuse more or less homogeneously, at least if you’re lucky. When I analyzed the experiments, I found again no visible effect, no matter the laser power or the laser wavelength:

Together, these experiments strongly suggested that there is indeed no effect of triplet states and therefore no benefit of fast scanning to increase the fluorescence yield.

Experimental implementation, part II

However, to convince myself further of this experimental finding, which seemed to be at odds with my expectations from the Donnert et al. paper, I used a second, slightly more direct approach. Instead of comparing sequential recordings at different zoom levels, I thought it would be interesting to record the fluorescence while changing the zoom level continuously. This would enable me to measure the dependence of fluorescence on CPC more quickly and therefore also for a larger set of power settings and wavelengths. To continuously change the zoom setting, I used a DAQ board to generate a low-frequency sine signal that modulated the zoom level from 1 (peak of the sine) to ~20 (trough of the sine) with a period of ca. 13 seconds. (To keep track of the sine signal, I also connected the command signal to the second input channel of the microscope.) This is what these experiments looked like:

Of course it is important to use not the entire FOV but only the central vertical stripe that remains more or less stationary. I used only a small vertical window of 7 pixels for the analysis. A single experiment produced a result such as the one shown below, plotting CPC (top) and fluorescence (bottom). The fluorescence clearly shows some variability stemming from active neurons (after all, we’re still dealing with a living brain here!):

In the plot above, no obvious relationship between CPC and fluorescence can be seen, and when I changed the power at 920 nm between 15 mW and 60 mW (this was the maximum that I could get with this system), I could not see any effect. I therefore show here all experiments performed at 920 nm, pooled across power settings and across two fish (total of ca. 30 recordings, each a few minutes):

For some experiments, the maximum zoom level was around 14, which I extended for a subset of experiments to something closer to 20.
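The analysis of these continuous-zoom recordings can be sketched as follows. This is not the original analysis code but a minimal reconstruction under several assumptions that I label explicitly: the scan amplitude is assumed to follow the recorded command signal linearly (peak = zoom 1, trough = zoom 20), the fast scan axis is assumed to run along the image columns, and the movie is replaced by synthetic Poisson noise. All function names are hypothetical.

```python
import numpy as np

CPC_AT_ZOOM_1 = 1.3  # pulses per resolution-limited spot at zoom 1 (from the text)


def zoom_from_command(command, zoom_min=1.0, zoom_max=20.0):
    """Map the recorded sine command to a zoom level.

    Assumption: the scan amplitude is linear in the command signal, so the
    peak of the sine gives zoom_min and the trough gives zoom_max.
    """
    c = np.asarray(command, dtype=float)
    norm = (c - c.min()) / (c.max() - c.min())  # 0 at trough, 1 at peak
    rel_amplitude = 1.0 / zoom_max + norm * (1.0 / zoom_min - 1.0 / zoom_max)
    return 1.0 / rel_amplitude


def central_stripe_trace(movie, width=7):
    """Mean fluorescence per frame in the central stripe of `width` columns.

    movie has shape (frames, rows, columns); the fast axis is assumed to be
    the column axis, so the central columns are scanned at peak speed.
    """
    start = (movie.shape[2] - width) // 2
    return movie[:, :, start:start + width].mean(axis=(1, 2))


def bin_by_cpc(cpc, fluorescence, n_bins=10):
    """Average fluorescence in equally spaced CPC bins (NaN for empty bins)."""
    edges = np.linspace(cpc.min(), cpc.max(), n_bins + 1)
    idx = np.clip(np.digitize(cpc, edges) - 1, 0, n_bins - 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    means = np.array([
        fluorescence[idx == k].mean() if np.any(idx == k) else np.nan
        for k in range(n_bins)
    ])
    return centers, means


# Synthetic stand-in for one recording: 60 s at 30 Hz, sine period of 13 s.
rng = np.random.default_rng(1)
t = np.arange(0.0, 60.0, 1.0 / 30.0)
command = np.sin(2.0 * np.pi * t / 13.0)
movie = rng.poisson(5.0, size=(t.size, 32, 64)).astype(float)

zoom = zoom_from_command(command)
cpc = CPC_AT_ZOOM_1 * zoom
trace = central_stripe_trace(movie, width=7)
centers, means = bin_by_cpc(cpc, trace)
print(np.round(centers, 1))
print(np.round(means, 2))
```

With real data, a CPC-dependent triplet effect would show up as a systematic decrease of the binned fluorescence towards high CPC values.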

I performed the same experiment also at 800 nm. I could also increase the laser power, simply because the laser provided more power at this wavelength range. However, these levels cannot go to arbitrary values. At a certain threshold that also depends on the duration of the exposure, all neurons across the zebrafish’s brain region become bright, resulting in a wave of high-calcium neurons that propagates through the brain. To avoid this, I used a maximum power of 75 mW at 800 nm. The result, again pooled across laser powers between 40 and 75 mW, showed an effect, albeit a bit subtle:

Fluorescence was indeed slightly higher for very low CPCs. However, the effect was much smaller than expected from experiments in vitro by Donnert et al. Overall, such a small effect, which in addition only appeared at the less relevant wavelength of 800 nm, seemed of little relevance for practical purposes.

Therefore, with fluorescence depending on CPC only in minor ways and under non-typical imaging conditions, the suggested triplet states do not seem to be relevant for in vivo calcium imaging situations; and as a consequence, one would be ill-advised to assume that faster scanning increases the fluorescence yield for two-photon microscopy by avoiding these μs-long dark states.

As a side-note, when I tried to analyze the power-dependency of this weak effect observed at 800 nm, I came across a weird effect. There was indeed some sort of increased fluorescence at lower zoom levels. However, this increase came with a delay of several seconds during the above experiments, resulting in a hysteresis:

Due to this longer time-scale, this effect has nothing to do with the short-lived μs-triplet states but is something different entirely. Photophysics is really complicated! This additional observation made me stop experiments, because I realized that these things would be more difficult to figure out, on top of a very small and probably irrelevant triplet effect.


I did not find any evidence for a substantial effect of triplet states during typical conditions (calcium imaging of neurons with a Ti:Sa laser and <100 mW power at the sample). I therefore do not see a benefit in terms of fluorescence yield by faster scanning. The triplet states on the timescale of few microseconds that had been observed by Donnert et al. for fixed samples do not seem to play a major role under the investigated conditions. It is always challenging to convincingly show the absence of an effect, but my experimental results convinced me to not further investigate fast scanning and multiplexing schemes as a means to increase fluorescence yield.

[Update August 2021: Christian Wilms pointed out that the most obvious difference between my experiments and the Donnert et al. study is that the dye molecules were freely diffusible in my experiments but fixed in the Donnert et al. experiments. Consistent with that, he also noticed that the paper from Ji et al., which found results contradictory to the Donnert et al. paper, was mostly based on experiments with freely diffusible dyes, while STED experiments, which clearly showed the effect, are mostly based on fixed fluorophores.]

Multiplexing or other modified distributions of excitation photons in time and space might still be able to increase fluorescence yield for certain fluorophores and conditions, but probably not under conditions similar to the ones I investigated.

The experiments described above are not fully systematic and cover only a specific parameter regime of excitation wavelengths, fluorophores and laser powers. However, anybody with a resonantly scanning two-photon microscope can easily reproduce these findings for any other scenario. Simply switch between high and low zoom settings and check whether the brightness in the center of the FOV changed substantially or not. Quantification of possible effects requires careful averaging, but a quick and qualitative confirmation or refutation of the above findings would be very easy to do for any experimenter.


The experiments described above were carried out in the lab of Rainer Friedrich at the FMI in Basel. I’m thankful to Christian Wilms, who encouraged me to analyze and write up these experiments after a discussion on Twitter.


Research and Intuition

So far, I have been very fortunate with my scientific long-term mentors and supervisors: both of them are kind, open, creative and stunningly intelligent. I could not wish for more. However, when asked about a role model, I would mention a person who influenced my take on research, during a time when I was still studying physics, probably more than others: Pina Bausch.

Pina Bausch was a dancer and choreographer who mostly worked in the small town of Wuppertal, Germany, where she developed her own way of modern dance. Her works are creative and inventive in very unexpected ways, and the way she explored body movements as a dancer struck me as surprisingly similar to what I think is research.

Research in its purest form is the exploration of the unknown, the discovery of what is not yet discovered, without a clear path ahead. The question that I’m working on in the broadest sense, “How does the brain work?”, enters the unknown very quickly as soon as you take the question seriously. How, in general, can we see what cannot be seen yet, how can we find ideas that do not yet exist?

Pina Bausch was a master in this art. Her craft was not science or biology but dancing. However, I think one can learn some lessons from her. It was typical of her to explore her own movements and to “invent” new movements, like wrist movements or coordinated movements of elbows and the head, or simply a slowed-down or delayed movement of the fingers. In regular life we use a rather limited and predefined combination of motor actions, and it takes some creativity to come up with movements that are unexpected and new but still interesting. One way to find new ways to move would be to consciously become aware of one’s own patterns and limitations and then try to systematically break those rules. However, Pina Bausch performed this discovery process in a different way. Her research was not guided by intellectual deduction or conclusion, but by her intuition. In 1992, she said:

“Ich weiß nämlich immer, wonach ich suche, aber ich weiß es eher mit meinem Gefühl als mit meinem Kopf.”

“Because I always know what I’m searching for. But I know it with my heart and with my feeling rather than with my brain.”

This might come across as a bit naive at first glance. Sure, an artist uses her heart, a scientist uses his brain, that sounds more or less normal, doesn’t it? However, when I saw Pina Bausch do this kind of searching, that is, when she danced, I was very impressed.

She seemed to rely on her intuition in every single moment of her explorations; and when I heard her talk about it (unfortunately, I’m only aware of interviews in German without translation), it was also clear that she did not have and did not need an explanation of what was going on. Most impressively for me, her way of exploring the unknown really struck me as similar to what is going on in a researcher, no matter the subject. What made her such an excellent researcher?

To me, it seems that the prerequisites of her impressive ability are the following: First of all, of course, a deeply ingrained knowledge of and skill with her art, together with an honest care about the details. There’s no intuition without experience and knowledge. Second, an openness to whatever random things might happen, coming from the outside or from within, and a willingness to embrace them. Third, an acceptance of the fact that she doesn’t really know what she’s doing. Or, to put this differently, a certain humility in the face of what is going to happen and what is going on in her own subconsciousness. I believe that these are qualities that also make for a good researcher in science.

It also reflects my own experience of doing research (at least partially). Even when I was working with mathematical tools, for example when I was modeling diffusion processes in inhomogeneous media during my diploma thesis, I had the impression that my intuition was always a couple of steps ahead of myself. Often I could see the shape of the mathematical goal ahead of my derivations, and it would take me several days before I could get it down on paper.

Of course there are other ways to develop new ideas, and for some problems intuition also fails systematically (maybe complex systems?). And of course there are other kinds of research, for example the gradual optimization of methods, or the development of devices to solve a specific problem, or the broad and systematic screening of candidate genes or materials for a defined purpose.

These systematic and step-wise procedures are more predictable than “pure” research, and the grant-based scientific research reinforces this kind of research. In a grant proposal, there are typically a defined number of “aims”. The more clearly defined these aims are, the better the chances of the grant proposal to be accepted. This makes sense. It would be ridiculous to fund a project with loosely defined aims, especially if other, competing proposals have a clear and realistic goal.

However, this necessary side-effect of grant-based research narrows our perspective to a kind of research that can be more or less clearly described even before doing it. It also narrows down the way we talk about research and about results. We do not directly encourage young researchers to use and develop their intuition, as if this had nothing to do with the scientific process. In grants and progress reports and talks and papers, we try to use very concise, precise language, sharp and clean as steel (often completed by pieces of superficial math that are supposed to demonstrate precision), not only when describing our methods, but also when describing and interpreting the results. This is not bad by itself, but it also shapes the way we think about research, and it can lead to a situation where we internally might reject ideas or results that do not satisfy the desired clarity and cleanliness in a first step.

I think that researchers in “hard” sciences like neuroscience could also benefit from a technique that uses intuitive thinking, and at least I have learnt a lot from the way Pina Bausch approached her subject of study using these techniques. Ultimately, understanding in neuroscience should always aim for descriptions in terms of words or math. But the way towards this goal does not need to be guided by these clear ways of thinking alone. From my experience, the power of intuition is only unleashed if we accept that we cannot really understand the process itself. Therefore, I see the humility that Pina Bausch showed towards her own intuitive thought process not simply as a virtue of a human being, but rather as a tool and a way of thinking that enables creativity.


Online spike rate inference with Cascade

To infer spike rates from calcium imaging data for a time point t, knowledge about the calcium signal both before and after time t is required. Our algorithm Cascade (Github) uses by default a window that is symmetric in time and feeds this window into a small deep network to use the data points in the window for spike inference (schematic below taken from Fig. 2A of the preprint; CC-BY-NC 4.0):

However, if one wants to perform spike inference not as a post-processing step but rather during the experiment (“online spike inference”), it would be ideal to perform spike inference with a delay as short as possible. This would allow, for example, using the result of spike inference for a closed-loop interaction with the animal.

Dario Ringach recently came up with this interesting problem. With the Cascade algorithm already set up, I was curious to check very specifically: How many time points (i.e., imaging frames) are required after time point t to perform reliable spike inference?

Using GCaMP/mouse datasets from the large ground truth database (the database is again described in the preprint), I addressed this question directly by training separate models. For each model, the time window was shifted such that a variable number of data points after time t (between minimally 1 and maximally 32) were used for spike inference. Everything was evaluated at a typical frame rate of 30 Hz, and also at different noise levels of the recordings (color-coded below); a noise level of “2” is pretty decent, while a noise level of “8” is quite noisy. Noise levels are explained with examples (Fig. S3) and equations (Methods), again in the preprint.
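The idea of shifting the time window can be illustrated with a small sketch. This is not the code from the Cascade repository; the function names are hypothetical, and the total window size of 64 frames is an assumption that I chose such that 32 frames after t correspond to a symmetric window.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def extract_windows(trace, window_size, n_after):
    """Cut a calcium trace into windows that end n_after frames after time t.

    Each window contains window_size frames: the frame at time t, the frames
    before it, and exactly n_after frames after it. Returns the windows with
    shape (n_windows, window_size) and the index t that each window refers to.
    """
    trace = np.asarray(trace, dtype=float)
    n_before = window_size - n_after           # frames up to and including t
    windows = sliding_window_view(trace, window_size)
    t_index = np.arange(windows.shape[0]) + n_before - 1
    return windows, t_index


def delay_ms(n_after, frame_rate_hz):
    """Minimal delay of online inference: time needed to acquire n_after frames."""
    return 1000.0 * n_after / frame_rate_hz


trace = np.random.default_rng(2).normal(size=900)  # placeholder for a dF/F trace
for n_after in (1, 4, 8, 16, 32):
    windows, t_index = extract_windows(trace, window_size=64, n_after=n_after)
    print(n_after, windows.shape, f"{delay_ms(n_after, 30.0):.0f} ms")
```

In this picture, one separate model would be trained for each value of `n_after`, with each window as input and the ground truth spike rate at `t_index` as the target. The delay of an online implementation is then at least `n_after` frame periods, plus the computation time.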

The results are quite clear: For low noise levels (black curve, SEM across datasets as corridor), spike inference seems to reach a saturating performance (correlation with ground truth spike rates) around a value of almost 8 frames. This would result in a delay of almost 8*33 ms ≈ 260 ms after a spiking event (dashed line).

But let’s have a closer look. The above curve was averaged across 8 datasets, mixing different indicators (GCaMP6f and GCaMP6s) and induction methods (transgenic mouse lines and AAV-based induction). Below, I looked into the curve for each single dataset (for the noise level of 2).

It is immediately clear that for some datasets fewer frames after t are sufficient for almost optimal spike inference, for others not.

For the best datasets, optimal performance is already reached with 4 frames (left panel; delay of ca. 130 ms). These are datasets #10 and #11, which use the fast indicator GCaMP6f, here in addition expressed transgenically. The corresponding spike-triggered linear kernels (right side; copied from Fig. S1 of the preprint) are indeed faster than for other datasets.

Two datasets with GCaMP6s (datasets #15 and #16) stand out as non-ideal, requiring almost 16 frames after t before optimal performance is reached. Probably, expression levels in these experiments using AAV-based approaches were very high, resulting in calcium buffering and therefore slower transients. The corresponding spike-triggered linear kernels are indeed much slower than for the other GCaMP6s- or GCaMP6f-based datasets.

The script used to perform the above evaluations can be found on Cascade’s GitHub repository. Since each data point requires retraining the model from scratch, it cannot be run on a CPU in reasonable time. On an RTX 2080 Ti, the script took 2-3 days to complete.


  1. Only a few frames (down to 4 frames) after time t are sufficient to perform almost ideal spike inference. This is probably a consequence of the fact that the sharp step increase is more informative than the slow decay of a spike-triggered event.
  2. To optimize the experiment for online spike inference, it is helpful to use a fast indicator (e.g., GCaMP6f). It also seems that transgenic expression might be an advantage, since indicator expression and calcium buffering are typically lower for transgenic expression than for viral induction, preventing a slow-down of the indicator by overexpression.
