Heat #1: The Heat Equation (and a Classical Solver)

Manuel Madeira
Manuel Madeira Author
David Carvalho
David Carvalho Editor and Co-author
Fábio Cruz
Fábio Cruz Technical Reviewer
Heat #1

To mark Valentine’s day, we let ourselves get inspired by feelings of love and human warmth. Love is to be spread around, so they say.

Yeah, we’re sure you guessed it right — in this all-new blog series, we delve into the problem of … 🔥 heat diffusion on a plate! 🔥

Let us set the bar here — in the Heat series, we will take you step by step up to a point where you can gauge the potential of Neural Networks – in both strength and versatility – in solving Partial Differential Equations.

The Heat Series
A 3-Part Journey on ML for PDEs

For us, it’s about No Love Lost and so, in this 3-part series, we’ll use the Heat Equation to plough through. We

  • Heat #1: introduce the physics behind heat diffusion and use a classical numerical routine to approximate its solution across a 2D plate.

  • Heat #2: with that estimate of the solution as a benchmark, see how Machine-Learning algorithms fare for the same situation, by deploying a PINN (Physics-Inspired Neural Network).

  • Heat #3: explore further into how versatile these Neural Networks are in handling more complex geometries than the 2D domain and varying initial/boundary conditions;

This field is fast-growing but, most importantly is at its blissful infancy stage. Our dream, rooted in a belief, is that Machine Learning may be refined to excel in extremely complex and customizable cases.

This sophistication could potentially fare better in performance than every other classical solver, allowing us to go to lands classical frameworks have never granted us so far!

Alright. 🔥 Let’s get hot in here! 🔥

Heat #1: the Heat Equation
(and a Classical Solver)

Hello, I Love You
The Heat Equation: a warm-up

Everybody has an elementary feeling for what heat is. Heat diffusion is the process through which energy is transported in space due to gradients in temperature.

These considerations are reflected mathematically in the so-called Heat Equation, which sets how the temperature \(u(t, \mathbf{r})\) at a point in space \(\mathbf{r}\) and time instant \(t\) evolves. It must satisfy:

\[\begin{equation} \frac{\partial u (t, \mathbf{r})}{\partial t} - D \Delta u(t, \mathbf{r}) = 0. \end{equation}\]

The thermal diffusivity \(D\) controls how fast heat can spread around a neighborhood. For more realistic (and oftentimes less structured) media, their specifics result in a local diffusity \(D(t, \mathbf{r})\). In the Heat series, we consider homogeneous media, for which \(D\) is constant throughout.

Now, what about \(\Delta\) — the Laplacian operator? Let’s see how it acts on a function \(u\):

\[\Delta u = \nabla \cdot (\nabla u) = \sum_{i} \frac{\partial^2 u}{\partial {r_i}^2}\]

So we have \(n\) \(2^\rm{nd}\)-order partial spatial derivatives to account for but also a single \(1^\rm{st}\)-order partial derivative in time. That’s a large batch of differential operators right there. We can lump them into a single differential operator which governs the evolution of the solution we’re after:

\[\mathcal{L} \left[ \frac{\partial}{\partial t}, \frac{\partial^2}{\partial^2_{r_1}}, \dots \frac{\partial^2}{\partial^2_{r_n}} \right] u(t, \mathbf{r}) = 0\]

Alright — no doubt we’re in the presence of a Partial Differential Equation.

These beasts are not easy to tame. However, throughout the Heat series, we exploit the structural simplicity of the Heat Equation — alongside the fact it is both very intuitive to understand and easy to set up meaningfully.

Between Love and Hate
Heating across a 2D square plate

We’ll guide you on how to solve this equation in a simple yet realistic geometric setup: across a 2-dimensional plate of variable size [2].

As you will soon see, this domain is simple enough not only to allow intuitive visualization but also to provide the possibility of adding extra complexity without much effort.

So let’s have a go. Our spatial vector in 2 dimensions is \(\mathbf{r} = (x, y)\) and so the heat equation takes the form:

$$ \left[ \frac{\partial}{\partial t} - D \left( \frac{\partial^2}{\partial x ^2} + \frac{\partial^2}{\partial y ^2} \right) \right]u(t,x,y)= 0 $$

From a computational point of view, it will do us a favor to think in terms of the result upon applying each differential operator to the solution:

\[\begin{equation} u_t - D \left( u_{xx} + u_{yy} \right) = 0 \label{HE_2D} \end{equation}\]

where the partial derivatives are shorthanded to, say, \(\partial_{x} \equiv \partial / \partial x\) and \(\partial_{xx} = \partial_{x} \partial_{x}\).

But hold on! We can’t start solving this beast just yet…

Love is the Drug
Setting the boundaries (& initial condition)

We need more information to formulate completely this PDE. Initial and boundary conditions ensure that a unique solution exists by the function at particular points \((t,x,y)\) of the input space.

These are normally reasoned through intuition and hindsight. Throughout the Heat series, we employ Dirichlet boundary conditions, which ensure:

  • through the initial condition, that a certain temperature is fixed across all points on the plate at the initial time instant (when \(t=0\)). We’ll take \(u(0, x, y) = -1 ^\mathrm{o}C\).

  • through the the boundary conditions, that a temperature is fixed at all times only for points along the 4 edges of the plate (top, bottom, left and right).

    • We choose the energy source to act as to get the top edge to some hot temperature. We’ll take \(u(t, x, y_\mathrm{max}) = 1 ^\mathrm{o}C \ .\)
    • The remaining edges are held at some cold temperature. We’ll take
      $$\underbrace{u(t, x, y_\mathrm{min})}_{\text{Bottom}} = \underbrace{u(t, x_{\mathrm{min}}, y)}_{\text{Left}} = \underbrace{u(t, x_\mathrm{max}, y)}_{\text{Right}} = -1 ^\mathrm{o}C \ .$$

Fig. 1: The boundary and initial conditions used throughout the Heat series. Energy is pumped from the top edge onto an initially completely cold 2D plate. Credits: David Carvalho / Inductiva

The Power of Love
… and Classical Numerical Methods

As usual, finding governing equations from first principles is actually the easy part. Rather, solving them presents us major challenges. Why?

Most PDEs do not admit analytical, pen-and-paper solutions. Only a handful of cherry-picked differential operators can give rise to closed-form solutions. Lesson learntnumerical approximations must be used.

But which ones? In which conditions? For what type of PDE?

These are general and difficult questions to answer. Wildly different methods have been curated by the computationally-inclined in the Mathematical and Physical communities — not as recipes set on stone but rather frameworks prone to constant scrutiny.

Mesh-based methods are traditionally the dominant approaches. The main idea is to discretize the domain of interest into a set of mesh points and, with them, approximate the solution. A whole zoo of methods exists, in particular:

  • Finite Differences Methods (FDMs) replace the PDE expression with a discretized analogue along a grid with the aid of function differences.
  • Finite Elements Methods (FEMs) subdivide the problem domain in multiple elements and apply the equations to each one of them.
  • Finite Volume Methods (FVMs) builds a solution approximation based on the exact computation (taking advantage of the divergence theorem) of the average value of the solution function in each of the smaller sub-volumes in which the domain is partitioned.

Cupid shouldn’t show his preference here. All these methods have their advantages and shortcomings given the idiosyncrasies of each PDE and setup of the problem. Sometimes they may even coincide — for some simple scenarios (such as regular grids) FEMs and FDMs might end up being the same [1].

Typically, mesh-based classical routines tend to be very efficient in low-dimensional problems on regular geometries. We will then use this convenience to our advantage by applying a FDM in a FTCS (Forward in Time, Centered in Space) scheme.

Never Mesh with Love
Discretizing our domain

Deciding how to discretize the domain is also far from being set on stone: for a given instantiation of a PDE problem, it is typically one of the most complex steps in its resolution. Squares are cool and all but what if we want to simulate the Heat Equation in a mug or a pan?

We’ll make our lives easier (for now!) by using a regular grid to represent the domain. With the aid of the cube:

$$ (t,x,y) \in [t_\mathrm{min}, t_\mathrm{max}] \times [x_\mathrm{min}, x_\mathrm{max}] \times [y_\mathrm{min}, y_\mathrm{max}] $$

in a regular grid with \(N_t\), \(N_x\) and \(N_y\) points along the \(t\), \(x\) and \(y\)-axis, respectively, we can set their step intervals, defined by their regular spacing along their respective dimension:

$$ (\Delta t, \Delta x, \Delta y) = \left( \frac{t_\mathrm{max} - t_\mathrm{min}}{N_t-1}, \frac{x_\mathrm{max} - x_\mathrm{min}}{N_x-1}, \frac{y_\mathrm{max} - y_\mathrm{min}}{N_y-1} \right) $$

Consequently, input points \((t,x,y)\) become discretized as \((t_k, x_i, y_j)\) and associated with a node \([k,i,j]\). Here,

$$ (t_k, x_i, y_j) = \left\{ \begin{matrix} t_k = t_0 + k \Delta t \ \ \ \ \ \ \ \ \ \ \ \ \text{for} \ 0 \leq k \leq N_t-1 \\ x_i = x_0 + i \Delta x \ \ \ \ \ \ \ \ \ \ \ \text{for} \ \ 0 \leq i \leq N_x-1 \\ y_j = y_0 + j \Delta y \ \ \ \ \ \ \ \ \ \ \ \text{for} \ 0 \leq j \leq N_y-1 \end{matrix} \right. $$

It is in this pixelated world we will express how heat will diffuse away…

Eu Tenho Dois Amores (I Have Two Lovers)
From continuous to discrete derivatives

So, we now need to express a differential operator in a finite, discretized domain. How exactly do we discretize such abstract objects, like the (univariate) derivative \(f_x(x)\):

\[f_x(x) = \underset{\Delta x \to 0}{\mathrm{lim}} \; \frac{f(x+ \Delta x) - f(x)}{\Delta x}.\]

or a partial derivative with respect to, say, a coordinate \(x_j\):

$$ f_{x_j}(x_1, \dots, x_j,\ldots,x_N)= \underset{\Delta x_j \to 0}{\mathrm{lim}} \; \frac{f(x_1, \dots, x_j + \Delta x_j, \dots ,x_N) - f(x_1, \ldots,x_j, \dots, x_N)}{\Delta x_j} $$

Being on a grid means that we require information on a finite number of points in the domain. The FDM philosophy is to express (infinitesimal) differentials by (finite) differences between any nodes.

The continuous thus becomes discrete — the differential operator \(\mathcal{L}(t,x,y)\) becomes a function-valued discrete grid \(\equiv \mathcal{L}[k,i,j] = \mathcal{L}(t_k,x_i,y_j)\). Resulting from this, so will our solution be output as the grid \(u[k,i,j]\).

In essence, we must find an algorithm which can propagate the constrains set by the PDE from the initial conditions as faithfully as possible along the grid. In a more specific sense, you might wonder:

How to approximate \(\mathcal{L}\) reasonably to obtain \(u[k,i,j]\) for all nodes \([k,i,j]\)?

This is exactly what the FDM solver will accomplish. For that, it has to approximate the differential operators — \(u_{t}[k,i,j]\), \(u_{xx}[k,i,j]\) and \(u_{yy}[k,i,j]\) — at all nodes \([k,i,j]\).

Goodbye operators, hello grids!

You’ve Got (Not) to Hide Your Love Away
Cupid chooses a Finite Difference Method

Approximating differentials on a discrete set is also not a recipe set on stone. Let us look at two Finite Difference approximations to a function of a single variable \(x\):

Fig. 2: Approximation schemes of the derivative of \(f(x)\) at a grid point \(x_i\) using - (left) forward and (right) centered schemes. Note that either method differs from the actual value \(f_x(x_i)\), given by the slope of the solid straight line. Credits: David Carvalho / Inductiva

These will estimate the true instantaneous rate via neighboring differences. Expectedly, an error will be incurred in the process.

The price to pay in “dropping” the limit results in a 1st-order approximation, meaning this estimate scales linearly with the spacing \(\Delta x\). For our approximation to have a chance to succeed, we better sample the \(x\) axis with a high \(N_x\)!

You Can’t Hurry Love
Choosing how to spread heat

Given a particular PDE, different FDMs would iterate differently over the grid node by node to obtain estimates of the differential operators.
So, fixing a scheme is (again) delicate task.

You guessed it right - for this problem, we will use the 2 approximation schemes we showed before to evaluate the differential operators. They are combined in the FTCS scheme:

  • Forward in Time - the time partial derivative \(u_t\) uses the forward 1st order differences

    \[u^{k, i, j}_t = \frac{u^{k+1, i, j} - u^{k, i, j}}{\Delta t}\]
  • Centered in Space - the spatial partial derivatives \(u_{xx}\) and \(u_{yy}\) are are computed through a 2nd-order central difference by applying the centered difference twice:

    • We first approximate the (second-order) derivative (say, \(u_{xx}\)) via the centered difference:
    \[u^{k, i, j}_{xx} = \frac{u^{k, i+\frac{1}{2}, j}_x - u^{k, i-\frac{1}{2}, j}_x}{\Delta x}\]
    • And then express each first-order term \(u_x\) in the same fashion e.g.:
    \[u^{k, i, j}_{x} = \frac{u^{k, i+\frac{1}{2}, j} - u^{k, i-\frac{1}{2}, j}}{\Delta x}\]
    • With a bit of rearranging, both spatial derivatives become:
    $$ \begin{split} u_{xx}^{k,i,j} = \frac{u^{k, i+1, j} - 2 u^{k, i, j} + u^{k, i-1, j}}{(\Delta x)^2} && \ \ \ \ \ \ u_{yy}^{k,i,j}= \frac{u^{k,i,j+1} - 2 u^{k,i,j} + u^{k,i,j-1}}{(\Delta y)^2} \end{split} $$

Look — some terms are evaluated outside the original grid. However, you will notice that these fractional indexes only appear for the computation of intermediary constructions/variables [4].

Victim of Love
Heat diffusion on a grid

Phew! That was intense but we now have all the tools to start simulating heat flow! For that, we must:

  1. discretize the domain by sampling each dimension individually and considering all possible combinations of the coordinates, creating points \((t_k,x_i,y_j)\), indexed by nodes \([k,i,j]\).
  2. discretize the differential operators using a FTCS scheme.
  3. solve for the solution grid \(u[k,i,j]\) by using the iteration rules that propagate the solution across all nodes \([k,i,j]\).

In this fashion, we convert the 2D Heat Equation to this difference equation:

$$ u^{k+1, i, j} = \alpha (u^{k, i+1, j} + u^{k, i-1, j}) + \beta (u^{k, i, j+1} + u^{k, i, j-1}) + (1 - 2\alpha - 2\beta) u^{k, i, j} $$

where we can lump all parameters in the ratios:

\[\begin{split} \alpha \equiv D\frac{\Delta t}{(\Delta x)^2} \ \ \ \ \ \ && \ \ \ \ \ \ \ \beta \equiv D \frac{\Delta t}{(\Delta y)^2}. \end{split}\]

We can understand how knowledge about the function propagates along the grid with a stencil by depicting which input grid points are needed to iterate the algorithm so a solution estimate may be computed at all grid points.

Fig. 3: A stencil used by the FTCS scheme. At a given node \([k,i,j]\), the solution is propagated forward in time as \(u[k+1,i,j]\) by considering the 2-point centered derivatives in \(x\) (the neighboring nodes \([k,i+1,j]\) and \([k,i-1,j]\)) and in \(y\) (the nodes \([k,i,j+1]\) and \([k,i,j-1]\)).
Credits: Augusto Peres, Inductiva.

Time to Heat [Start]
Run our code

It’s time to play with all these concepts and see how our FDM approximation fares.

All our code can be accessed on our code snippets GitHub repository. There, you will find the heat_fdm.py file for simulations and plotting utilities in /utils.
Have a go yourself!

Fast(er) Love
The role of the \(D\)-ffusivity

We will pick a spatial discretization of, say, \(500 \times 500\) points. We can now run for different thermal diffusivities and see their effect. Below you can see the temperature profiles as we increase \(D\) — first with \(D=0.01\), then \(D=0.1\) and finally \(D=1\).

Fig. 4: Role of various \(D\) in the diffusion. Credits: Manuel Madeira / Inductiva

Love is in the air — but heat for sure is on the plate!
We can reason with these results.

  • At the beginning (for \(t=0\)), heat starts developing from the only points that are not at the common temperature (\(-1 \; ^\mathrm{o}C\)), where temperature gradients exist. These are points nearby the hot (top) edge.

  • From that point on, we can observe the progressive heat diffusion from the hot edge towards lower regions, until we reach a stationary state, where no further changes occur (and the system is in thermal equilibrium).

  • Heat does not spread evenly across the plate — its spread is faster in regions that are further from the cold regions. Given this symmetric setup, points along the central line between the edges allow for faster diffusion than regions nearby the edges. This will progressively wind down until we reach the cold edges and a paraboloid-like pattern is observed.

  • But more importantly — the higher the \(D\), the faster this spread occurs. This directly impacts the choice of the time sampling.

So — it seems that each internal parameter \(D\) requires its own discretization setup somehow. But in what way?

How Stable Is Your Love?
Setting stability criteria

We ran our first PDE classical solver. Yay! However, why are we happy with the results? Clearly, we can think of potential issues, such as:

  • for excessively sparse discretizations, our approximations of the differential operators will be miserably undersampled and consequently the trial solution will be shockingly wrong.
  • error buildup will take place as the iterative procedure tends to propagate and amplify errors between any iterations. In those cases, we say that the method is unstable.

However, these are the extreme instances. Were we lucky? If we are using “reasonable” discretizations, a question should make you scratch your head:

How to be sure a certain discretization provides a trustful approximation to the PDE solution?

Fortunately for us, you can notice that the discretization setup can be monitored exclusively through the ratios \(\alpha\) and \(\beta\)! This raises the question:

Can we somehow fine-tune them to ensure stability i.e. find admissible combinations of \((\Delta t, \Delta x, \Delta y)\) for a fixed \(D\)?

Well, theoretical studies can be performed to find a restricted space of acceptable discretization setups. For our case [3], Von-Neumann analysis establishes a surprisingly simple stability criterion:

\[\alpha + \beta \leq \frac{1}{2}\]

These two constants provide a straightforward procedure to be sure that we i) are not oversampling or undersampling a dimension with respect to any other and ii) have spacings that can capture meaningful changes of the solution.

This is not a loose condition.
Just to show you that we’re playing with fire, we cannot think of a more illustrative example than by ramping up that bound by a mere \(2 \%\) above the theoretical maximum i.e. for \(\alpha + \beta \approx 0.51\):

Fig. 5: In a discretization step for which \(\alpha + \beta > 0.5\), our algorithm is bound to fail. Credits: Manuel Madeira / Inductiva

Yup: for this setup, an “explosion” caused by the FDM instability can be seen propagating along the trial solution — quickly unfolding to some psychedelic burst!

At the very beginning, some similarities to the downwards diffusion pattern ( that we just saw previously) quickly fade away as the error buildup spreads away and faster in regions with already large errors.

It’s a race for disaster: the exponential buildup of the errors, even if sustained for only a handful of propagation steps, ensures that some nodes will experience huge growth (and \(u\) becomes unbounded) — \(u \mapsto 1 \mapsto 10^{15} \mapsto \dots 10^{80} \dots\), until the maximum computer precision is reached — at which point its value becomes a NaN (Not a Number). You can see that eventually all points are not even represented on the scale (they’re white)!

By the same token, let’s now see what happens if we’re close to the edge from within the admissible area. Dropping our bound by \(2 \%\), let’s heat the start button for \(\alpha + \beta \approx 0.49\).

Fig. 6: With our parameters recalibrated, stability has been restored! Credits: Manuel Madeira / Inductiva

Incredible! Not even a minimal sign of instability coming through.

We’re happy - we obtained seemingly satisfactory outputs in a reasonable way. The simulations take about \(10s\) and CPUs can handle the computation just fine.

But remember though: this is an extremely simple system… We could be considering this problem in 3D or at way more complex domains and boundaries. In an extreme limit, we could be using these FDMs in highly-nonlinear equations with hundreds of variables.

To Vectorize or Not To Vectorize
That is not the question

Understandingly, instructing how to compute many functions over a grid is not a trivial task — issues like computation power and memory storage can be bottlenecks already for discretization setups far from ideal.

We must understand how our implementation choices affect the algorithm performance and the output accuracy. In essence:

How much faster can we solve the PDE without compromising the output?

Using the running time as a metric, we’ll compare 3 mainstream implementations:

  • Nested For Cycles: we fill in the entries for the plate at the new time frame \(u^{k+1, i, j}\), by nesting 2 For cycles that iterate through the \(x\) and \(y\) axes from the previous time frame in \(u^{k, i, j}\).

  • Vectorized approach: we perform arithmetic operations on \(u^{k+1,i,j}\) by considering it as a linear combination of slightly shifted versions \(u^{k,i,j}\), multiplied by constants (known a priori), resulting in scalar multiplications and additions of matrices. NumPy implementations will be used.

  • Compiled approach: we can compile a vectorized approach, which allows sequences of operations to be optimized together and run at once. The code compilation was performed through JAX, using its just-in-time (JIT) compilation decorator.

The approach used is passed as a flag: --vectorization_strategy=my_strategy.
The default is set to numpy and the vectorized approach is used. When set to jax, a compiled version of the vectorized approach is used and, when set to serial, the nested For cycles approach is used.

After running each implementation for the same parameters, we obtain wildly different running times:

vectorization_strategy=serial - Time spent in FDM: 93.7518858909607s
vectorization_strategy=numpy - Time spent in FDM: 0.6520950794219971s
vectorization_strategy=jax - Time spent in FDM: 0.297029972076416s

Wow! By using the vectorized approach, we obtain more than a \(100\)-fold boost in running speed!

This is not surprising. The vectorized approach takes advantage SIMD (Single Instruction, Multiple Data) computations, where one instruction carries out the same operation on a number of operands in parallel.

This discrepancy is then easy to understand and a huge acceleration is observed, in constrast when ran with the nested For cycles, where each instruction is executed one at a time.

We are not done yet. By compiling the vectorized approach, we obtain yet another \(3\)-fold speed-up, further improving our metric! This improvement would be even more significant if we ran the compiled code in GPUs (currently, JAX still only supports CPU in Apple Silicon machines).

Computation resources are finite (and costly). Despite the huge boost granted by the compiled vectorized approach, this methodology improved our task by 2 orders of magnitude. Is that enough when dealing with far more complex PDEs in higher dimensions?

Limit to Your Love
How far can we push classical methods?

Despite our pleasant experience with a low-dimensional problem on a regular geometry, we should be careful thinking ahead. Classical methods for solving PDEs have long been recognized for suffering from some significant drawbacks and we will comment on a few:

Meshing
Gives Love a Bad Name

Meshing becomes a daunting task in more complex geometries. Different discretization setups can always be chosen but once one is fixed, we have limited ourselves to computations on the nodes generated by that particular setup. We can no longer apply arbitrary smaller steps to approximate the differential operators.

Worse still — depending on the behavior of the differential operator, we might need to focus on certain regions more than others. It’s unclear how these choices impact the quality of the approximations.

Discretizing the problem also creates a bias in favor of the grid points with respect to all others.

But what if there the PDE behaves differently outside the mesh?

The best we can do is to infer a solution through interpolation but this is admittedly unsatisfactory.

Scaling
Will Tear Us Apart

Classical algorithms scale terribly with the PDE dimensions and grid refinements.

For all the Complexity lovers out there — you know that sweeping the regular grid \([k,i,j]\) step-by-step requires algorithms of order \(\mathcal{O}(N_t N_x N_y)\).
By simply refining our grid by \(5\) times, we scale the number of nodes by a whopping factor \(5^3 = 125\) (!!!).

Similar bad news are expected as we consider simulations on PDEs for higher dimensions. For a problem of 3D heat diffusion, the complexity grows as \(\mathcal{O} (N_t N_x N_y N_z)\) and naturally, \(n\) dimensions scale as \(\mathcal{O}(N^d)\) — an exponential scaling of the algorithm with dimension.

Crazy Little Thing Called…
Feasibility

Even though we can rely on stability conditions, this does not mean we can deploy them — it may be computationally very, very, …, heavy or simply impossible.

As we saw, simulating a medium with higher \(D\) requires a finer time discretization. For some instances, satisfactory \(\Delta t\) might be too small to be practical and, in the extreme case, downright untenable!

This comes at a bitter price and is a major drawback of the FDM. Baby, was that a cheap Mon Cherrie knock-off?

All for Love
… and Machine Learning?

PDE simulations must be as fast and light as possible. But can classical methods, in general, get fast enough?

We can look farther ahead. Consider something extreme (but likely). Imagine our task is to find parameters that lead to optimal performance for some problem.

Let us assume we have access to the theoretically fastest PDE classical solver, foolproof to instabilities and extremely efficient in terms of memory and processing.

If hundreds, millions, …, billions of intermediate PDE calculations had to be performed with that algorithm, could we afford the computational budget? Would the time elapsed be reasonable?

Certainly not. It is not difficult to estimate runtimes comparable to our lifetime (run across millions of processors!) to achieve some real-life optimization tasks.

This would forbid us from iterating across the parameter space fast enough — Game over.

Efficiency in classical methods won’t provide the scaling we need to overcome this barrier.
A new paradigm is needed.

Fortunately, a large class of algorithms can handle this curse of dimensionality exceptionally well. Can you guess what we are talking about?
Yup, Deep Learning!

In the following posts of this series, we will address how we can leverage on those approaches to possibly overcome these issues raised by classical methods.

Until then, don’t melt away!

Advice For The Young At Heart
References & Remarks

[1] More on this FEM and FDM possible similarity here.
[2]. The FDM-based approach we present is loosely following this excellent example that inspired us.
[3]. More details of the derivation of this expression can be found here (pages 11/12).
[4] There are more complex cases where it is convenient to define such staggered grids with fractional offsets to the original (e.g for Maxwell’s Equations).

Love For…
Credits

We thank our Inductiva pals — Manuel Madeira (main author), David Carvalho (editor and co-author), Fábio Cruz (technical editor and reviewer) and Augusto Peres (reviewer and animation creator) for this post.

❤️❤️❤️️ Pounds of Love, from us at Inductiva. ❤️❤️❤️

Recent posts from our blog

Manuel Madeira

Manuel Madeira

David Carvalho

David Carvalho

Fábio Cruz

Fábio Cruz

In this 3rd and final part of the Heat series, we delve into the idea of enhancing generalizational power in Neural Networks so they can learn more complex aspects. We exemplify these ideas by running Physics Informed Neural Networks (PINNs) on a custom-designed domain and boundary condition.

Manuel Madeira

Manuel Madeira

David Carvalho

David Carvalho

Fábio Cruz

Fábio Cruz

In this 2nd part of the series, we show that Neural Networks can learn how to solve Partial Differential Equations! In particular, we use a PINN (Physics-Informed Neural Network) architecture to obtain the results we obtained with classical algorithms in Heat #1.