Finally We May Have a Path to the Fundamental Theory of Physics… and It's Beautiful [3/7]
The Graph of Causal Relationships
In the end it’s wonderfully elegant. But to get to the point where we can understand the elegant bigger picture we need to go through some detailed things. (It isn’t terribly surprising that a fundamental theory of physics—inevitably built on very abstract ideas—is somewhat complicated to explain, but so it goes.)
To keep things tolerably simple, I’m not going to talk directly about rules that operate on hypergraphs. Instead I’m going to talk about rules that operate on strings of characters. (To clarify: these are not the strings of string theory—although in a bizarre twist of “pun-becomes-science” I suspect that the continuum limit of the operations I discuss on character strings is actually related to string theory in the modern physics sense.)
OK, so let’s say we have the rule:
{A → BBB, BB → A}
This rule says that anywhere we see an A, we can replace it with BBB, and anywhere we see BB we can replace it with A. So now we can generate what we call the multiway system for this rule, and draw a “multiway graph” that shows everything that can happen:

At the first step, the only possibility is to use A→BBB to replace the A with BBB. But then there are two possibilities: replace either the first BB or the second BB—and these choices give different results. On the next step, though, all that can be done is to replace the A—in both cases giving BBBB.
So in other words, even though we in a sense had two paths of history that diverged in the multiway system, it took only one step for them to converge again. And if you trace through the picture above you’ll find out that’s what always happens with this rule: every pair of branches that is produced always merges, in this case after just one more step.
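By the way, it’s easy to generate multiway graphs like this for yourself. Here’s a minimal sketch in the Wolfram Language (step is just a helper name I’m introducing, and running for 4 steps is an arbitrary choice):

(* all results of applying the rule once, in every possible way *)
step[s_String] := DeleteDuplicates[StringReplaceList[s, {"A" -> "BBB", "BB" -> "A"}]]

(* the multiway graph of states reachable in 4 steps from "A" *)
NestGraph[step, "A", 4, VertexLabels -> Automatic]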
This kind of balance between branching and merging is a phenomenon I call “causal invariance”. And while it might seem like a detail here, it actually turns out that it’s at the core of why relativity works, why there’s a meaningful objective reality in quantum mechanics, and a host of other core features of fundamental physics.
But let’s explain why I call the property causal invariance. The picture above just shows what “state” (i.e. what string) leads to what other one. But at the risk of making the picture more complicated (and note that this is incredibly simple compared to the full hypergraph case), we can annotate the multiway graph by including the updating events that lead to each transition between states:

But now we can ask the question: what are the causal relationships between these events? In other words, what event needs to happen before some other event can happen? Or, said another way, what events must have happened in order to create the input that’s needed for some other event?
Let us go even further, and annotate the graph above by showing all the causal dependencies between events:

The orange lines in effect show which event has to happen before which other event—or what all the causal relationships in the multiway system are. And, yes, it’s complicated. But note that this picture shows the whole multiway system—with all possible paths of history—as well as the whole network of causal relationships within and between these paths.
But here’s the crucial thing about causal invariance: it implies that actually the graph of causal relationships is the same regardless of which path of history is followed. And that’s why I originally called this property “causal invariance”—because it says that with a rule like this, the causal properties are invariant with respect to different choices of the sequence in which updating is done.
And if one traced through the picture above (and went quite a few more steps), one would find that for every path of history, the causal graph representing causal relationships between events would always be:

or, drawn differently,

The Importance of Causal Invariance
To understand more about causal invariance, it’s useful to look at an even simpler example: the case of the rule BA→AB. This rule says that any time there’s a B followed by an A in a string, swap these characters around. In other words, this is a rule that tries to sort a string into alphabetical order, two characters at a time.
Let’s say we start with BBBAAA. Then here’s the multiway graph that shows all the things that can happen according to the rule:

There are lots of different paths that can be followed, depending on which BA in the string the rule is applied to at each step. But the important thing we see is that at the end all the paths merge, and we get a single final result: the sorted string AAABBB. And the fact that we get this single final result is a consequence of the causal invariance of the rule. In a case like this where there’s a final result (as opposed to just evolving forever), causal invariance basically says: it doesn’t matter what order you do all the updates in; the result you’ll get will always be the same.
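One can check this exhaustively with a few lines of Wolfram Language (a sketch; finals is just a name I’m using for a helper that collects all possible end states):

(* all strings reachable from s in which no BA remains to be replaced *)
finals[s_String] := With[{next = DeleteDuplicates[StringReplaceList[s, "BA" -> "AB"]]},
  If[next === {}, {s}, DeleteDuplicates[Join @@ (finals /@ next)]]]

finals["BBBAAA"]
(* {"AAABBB"}: every path of history ends at the same sorted string *)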
I’ve introduced causal invariance in the context of trying to find a model of fundamental physics—and I’ve said that it’s going to be critical to both relativity and quantum mechanics. But actually what amounts to causal invariance has been seen before in various different guises in mathematics, mathematical logic and computer science. (Its most common name is “confluence”, though there are some technical differences between this and what I call causal invariance.)
Think about expanding out an algebraic expression, like (x + (1 + x)^2)(x + 2)^2. You could expand one of the powers first, then multiply things out. Or you could multiply the terms first. It doesn’t matter what order you do the steps in; you’ll always get the same canonical form (which in this case Mathematica tells me is 4 + 16x + 17x^2 + 7x^3 + x^4). And this independence of orders is essentially causal invariance.
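And indeed you can check this directly:

Expand[(x + (1 + x)^2) (x + 2)^2]
(* 4 + 16 x + 17 x^2 + 7 x^3 + x^4 *)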
Here’s one more example. Imagine you’ve got some recursive definition, say
f[1]=f[2]=1;
f[n_]:=f[n-1]+f[n-2]
Now evaluate f[10]. First you get f[9]+f[8]. But what do you do next? Do you evaluate f[9], or f[8]? And then what? In the end, it doesn’t matter; you’ll always get 55. And this is another example of causal invariance.
When one thinks about parallel or asynchronous algorithms, causal invariance is an important property to have. Because it means one can do things in any order—say, depth-first, breadth-first, or whatever—and one will always get the same answer. And that’s what’s happening in our little sorting algorithm above.
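Here’s that order independence in action for the sorting rule (a sketch; randomSort is a hypothetical helper that applies the rule at a randomly chosen match each time):

(* keep applying BA -> AB at a randomly chosen match until no BA remains *)
randomSort[s_String] := NestWhile[
  StringReplacePart[#, "AB", RandomChoice[StringPosition[#, "BA"]]] &,
  s,
  StringContainsQ[#, "BA"] &]

Table[randomSort["BBBAAA"], 5]
(* {"AAABBB", "AAABBB", "AAABBB", "AAABBB", "AAABBB"} *)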
OK, but now let’s come back to causal relationships. Here’s the multiway system for the sorting process annotated with all causal relationships for all paths:

And, yes, it’s a mess. But because there’s causal invariance, we know something very important: this is basically just a lot of copies of the same causal graph—a simple grid:

(By the way—as the picture suggests—the cross-connections between these copies aren’t trivial, and later on we’ll see they’re associated with deep relations between relativity and quantum mechanics, that probably manifest themselves in the physics of black holes. But we’ll get to that later…)
OK, so every different way of applying the sorting rule is supposed to give the same causal graph. So here’s one example of how we might apply the rule starting with a particular initial string:

But now let’s show the graph of causal connections. And we see it’s just a grid:

Here are three other possible sequences of updates:

But now we see causal invariance in action: even though different updates occur at different times, the graph of causal relationships between updating events is always the same. And having seen this—in the context of a very simple example—we’re ready to talk about special relativity.
Deriving Special Relativity
It’s a typical first instinct in thinking about doing science: you imagine doing an experiment on a system, but you—as the “observer”—are outside the system. Of course if you’re thinking about modeling the whole universe and everything in it, this isn’t ultimately a reasonable way to think about things. Because the “observer” is inevitably part of the universe, and so has to be modeled just like everything else.
In our models what this means is that the “mind of the observer”, just like everything else in the universe, has to get updated through a series of updating events. There’s no absolute way for the observer to “know what’s going on in the universe”; all they ever experience is a series of updating events, that may happen to be affected by updating events occurring elsewhere in the universe. Or, said differently, all the observer can ever observe is the network of causal relationships between events—or the causal graph that we’ve been talking about.
So as a toy model let’s look at our BA→AB rule for strings. We might imagine that the string is laid out in space. But to our observer the only thing they know is the causal graph that represents causal relationships between events. And for the BA→AB system here’s one way we can draw that:

But now let’s think about how observers might “experience” this causal graph. Underneath, an observer is getting updated by some sequence of updating events. But even though that’s “really what’s going on”, to make sense of it, we can imagine our observers setting up internal “mental” models for what they see. And a pretty natural thing for observers like us to do is just to say “one set of things happens all across the universe, then another, and so on”. And we can translate this into saying that we imagine a series of “moments” in time, where things happen “simultaneously” across the universe—at least with some convention for defining what we mean by simultaneously. (And, yes, this part of what we’re doing is basically following what Einstein did when he originally proposed special relativity.)
Here’s a possible way of doing it:

One can describe this as a “foliation” of the causal graph. We’re dividing the causal graph into leaves or slices. And each slice our observers can consider to be a “successive moment in time”.
It’s important to note that there are some constraints on the foliation we can pick. The causal graph defines what event has to happen before what. And if our observers are going to have a chance of making sense of the world, it had better be the case that their notion of the progress of time aligns with what the causal graph says. So for example this foliation wouldn’t work—because basically it says that the time we assign to events is going to disagree with the order in which the causal graph says they have to happen:

But, so given the foliation above, what actual order of updating events does it imply? It basically just says: as many events as possible happen at the same time (i.e. in the same slice of the foliation), as in this picture:

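At least for a simple string system like this, that’s just what you get by doing all the non-overlapping replacements at once at each step, which is what StringReplace does (one particular “maximal” choice, scanning left to right):

NestList[StringReplace[#, "BA" -> "AB"] &, "BBBAAA", 5]
(* {"BBBAAA", "BBABAA", "BABABA", "ABABAB", "AABABB", "AAABBB"} *)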
OK, now let’s connect this to physics. The foliation we had above is relevant to observers who are somehow “stationary with respect to the universe” (the “cosmological rest frame”). One can imagine that as time progresses, the events a particular observer experiences are ones in a column going vertically down the page:

But now let’s think about an observer who is uniformly moving in space. They’ll experience a different sequence of events, say:

And that means that the foliation they’ll naturally construct will be different. From the “outside” we can draw it on the causal graph like this:

But to the observer each slice just represents a successive moment of time. And they don’t have any way to know how the causal graph was drawn. So they’ll construct their own version, where the slices are horizontal:

But now there’s a purely geometrical fact: to make this rearrangement, while preserving the basic structure (and here, angles) of the causal graph, each moment of time has to sample fewer events in the causal graph, by a factor of √(1 − β^2), where β is the angle that represents the velocity of the observer.
If you know about special relativity, you’ll recognize a lot of this. What we’ve been calling foliations correspond directly to relativity’s “reference frames”. And our foliations that represent motion are the standard inertial reference frames of special relativity.
But here’s the special thing that’s going on here: we can interpret all this discussion of foliations and reference frames in terms of the actual rules and evolution of our underlying system. So here now is the evolution of our string-sorting system in the “boosted reference frame” corresponding to an observer going at a certain speed:

And here’s the crucial point: because of causal invariance it doesn’t matter that we’re in a different reference frame—the causal graph for the system (and the way it eventually sorts the string) is exactly the same.
In special relativity, the key idea is that the “laws of physics” work the same in all inertial reference frames. But why should that be true? Well, in our systems, there’s an answer: it’s a consequence of causal invariance in the underlying rules. In other words, from the property of causal invariance, we’re able to derive relativity.
Normally in physics one puts in relativity by the way one sets up the mathematical structure of spacetime. But in our models we don’t start from anything like this, and in fact space and time are not even at all the same kind of thing. But what we can now see is that—because of causal invariance—relativity emerges in our models, with all the relationships between space and time that that implies.
So, for example, if we look at the picture of our string-sorting system above, we can see relativistic time dilation. In effect, because of the foliation we picked, time operates slower. Or, said another way, in the effort to sample space faster, our observer experiences slower updating of the system in time.
The speed of light c in our toy system is defined by the maximum rate at which information can propagate, which is determined by the rule, and in the case of this rule is one character per step. And in terms of this, we can then say that our foliation corresponds to a speed 0.3 c. But now we can look at the amount of time dilation, and it’s exactly the amount 1/√(1 − β^2) that relativity says it should be.
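Numerically, for β = 0.3:

1/Sqrt[1 - 0.3^2]
(* 1.04828 *)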
By the way, if we imagine trying to make our observer go “faster than light”, we can see that can’t work. Because there’s no way to tip the foliation at more than 45° in our picture, and still maintain the causal relationships implied by the causal graph.
OK, so in our toy model we can derive special relativity. But here’s the thing: this derivation isn’t specific to the toy model; it applies to any rule that has causal invariance. So even though we may be dealing with hypergraphs, not strings, and we may have a rule that shows all kinds of complicated behavior, if it ultimately has causal invariance, then (with various technical caveats, mostly about possible wildness in the causal graph) it will exhibit relativistic invariance, and a physics based on it will follow special relativity.
What Is Energy? What Is Mass?
In our model, everything in the universe—space, matter, whatever—is supposed to be represented by features of our evolving hypergraph. So within that hypergraph, is there a way to identify things that are familiar from current physics, like mass, or energy?
I have to say that although it’s a widespread concept in current physics, I’d never thought of energy as something fundamental. I’d just thought of it as an attribute that things (atoms, photons, whatever) can have. I never really thought of it as something that one could identify abstractly in the very structure of the universe.
So it came as a big surprise when we recently realized that actually in our model, there is something we can point to, and say “that’s energy!”, independent of what it’s the energy of. The technical statement is: energy corresponds to the flux of causal edges through spacelike hypersurfaces. And, by the way, momentum corresponds to the flux of causal edges through timelike hypersurfaces.
OK, so what does this mean? First, what’s a spacelike hypersurface? It’s actually a standard concept in general relativity, for which there’s a direct analogy in our models. Basically it’s what forms a slice in our foliation. Why is it called what it’s called? We can identify two kinds of directions: spacelike and timelike.
A spacelike direction is one that involves just moving in space—and it’s a direction where one can always reverse and go back. A timelike direction is one that involves also progressing through time—where one can’t go back. We can mark spacelike (—) and timelike (- -) hypersurfaces in the causal graph for our toy model:

(They might be called “surfaces”, except that “surfaces” are usually thought of as 2-dimensional, and in our 3-space + 1-time dimensional universe these foliation slices are 3-dimensional: hence the term “hypersurfaces”.)
OK, now let’s look at the picture. The “causal edges” are the causal connections between events, shown in the picture as lines joining the events. So when we talk about a “flux of causal edges through spacelike hypersurfaces”, what we’re talking about is the net number of causal edges that go down through the horizontal slices in the pictures.
In the toy model that’s trivial to see. But here’s a causal graph from a simple hypergraph model, where it’s already considerably more complicated:

(Our toy-model causal graph starts from a line of events because we set up a long string as the initial condition; this starts from a single event because it’s starting from a minimal initial condition.)
But when we put a foliation on this causal graph (thereby effectively defining our reference frame) we can start counting how many causal edges go down through successive (“spacelike”) slices:

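Here’s a crude sketch of that counting in the Wolfram Language (with made-up details: a tiny hand-drawn causal graph, and the “time” of an event taken to be just its graph distance from the initial event, rather than a carefully constructed foliation):

(* a tiny made-up causal graph: events 1..6, edges are causal connections *)
g = Graph[{1 -> 2, 1 -> 3, 2 -> 4, 3 -> 4, 2 -> 5, 4 -> 6, 5 -> 6}];

(* "time" of an event: its graph distance from the initial event *)
t[v_] := GraphDistance[g, 1, v]

(* flux through the spacelike slice between times k and k+1:
   the number of causal edges that cross it *)
flux[k_] := Count[EdgeList[g], e_ /; t[First[e]] <= k && t[Last[e]] > k]

flux /@ Range[0, 2]
(* {2, 3, 2} *)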
We can also ask how many causal edges go “sideways”, through timelike hypersurfaces:

OK, so why do we think these fluxes of edges correspond to energy and momentum? Imagine what happens if we change our foliation, say tipping it to correspond to motion at some velocity, as we did in the previous section. It takes a little bit of math, but what we find out is that our fluxes of causal edges transform with velocity basically just like we saw distance and time transform in the previous section.
In the standard derivation of relativistic mechanics, there’s a consistency argument that energy has to transform with velocity like time does, and momentum like distance. But now we actually have a structural reason for this to be the case. It’s a fundamental consequence of our whole setup, and of causal invariance. In traditional physics, one often says that position is the conjugate variable to momentum, and energy to time. And that’s something that’s burnt into the mathematical structure of the theory. But here it’s not something we’re burning in; it’s something we’re deriving from the underlying structure of our model.
And that means there’s ultimately a lot more we can say about it. For example, we might wonder what the “zero of energy” is. After all, if we look at one of our causal graphs, a lot of the causal edges are really just going into “maintaining the structure of space”. So if in a sense space is uniform, there’s inevitably a uniform “background flux” of causal edges associated with that. And whatever we consider to be “energy” corresponds to the fluctuations of that flux around its background value.
By the way, it’s worth mentioning what a “flux of causal edges” corresponds to. Each causal edge represents a causal connection between events, that is in a sense “carried” by some element in the underlying hypergraph (the “spatial hypergraph”). So a “flux of causal edges” is in effect the communication of activity (i.e. events), either in time (i.e. through spacelike hypersurfaces) or in space (i.e. through timelike hypersurfaces). And at least in some approximation we can then say that energy is associated with activity in the hypergraph that propagates information through time, while momentum is associated with activity that propagates information in space.
There’s a fundamental feature of our causal graphs that we haven’t mentioned yet—that’s related to information propagation. Start at any point (i.e. any event) in a causal graph. Then trace the causal connections from that event. You’ll get some kind of cone (here just in 2D):

The cone is more complicated in a more complicated causal graph. But you’ll always have something like it. And what it corresponds to physically is what’s normally called a light cone (or “forward light cone”). Assuming we’ve drawn our causal network so that events are somehow laid out in space across the page, then the light cone will show how information (as transmitted by light) can spread in space with time.
When the causal graph gets complicated, the whole setup with light cones gets complicated, as we’ll discuss for example in connection with black holes later. But for now, we can just say there are cones in our causal graph, and in effect the angle of these cones represents the maximum rate of information propagation in the system, which we can identify with the physical speed of light.
And in fact, not only can we identify light cones in our causal graph: in some sense we can think of our whole causal graph as just being a large number of “elementary light cones” all knitted together. And, as we mentioned, much of the structure that’s built necessarily goes into, in effect, “maintaining the structure of space”.
But let’s look more closely at our light cones. There are causal edges on their boundaries that in effect correspond to propagation at the speed of light—and that, in terms of the underlying hypergraph, correspond to events that “reach out” in the hypergraph, and “entrain” new elements as quickly as possible. But what about causal edges that are “more vertical”? These causal edges are associated with events that in a sense reuse elements in the hypergraph, without involving new ones.
And it looks like these causal edges have an important interpretation: they are associated with mass (or, more specifically, rest mass). OK, so the total flux of causal edges through spacelike hypersurfaces corresponds to energy. And now we’re saying that the flux of causal edges specifically in the timelike direction corresponds to rest mass. We can see what happens if we “tip our reference frames” just a bit, say corresponding to a velocity v ≪ c. Again, there’s a small amount of math, but it’s pretty easy to derive formulas for momentum (p) and energy (E). The speed of light c comes into the formulas because it defines the ratio of “horizontal” (i.e. spacelike) to “vertical” (i.e. timelike) distances on the causal graph. And for v small compared to c we get:
p ≃ m v
E ≃ m c^2 + m v^2/2
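These are just the low-velocity expansions of the familiar relativistic formulas, as one can verify:

Series[m v/Sqrt[1 - v^2/c^2], {v, 0, 2}]
(* m v + O[v]^3 *)

Series[m c^2/Sqrt[1 - v^2/c^2], {v, 0, 2}]
(* m c^2 + m v^2/2 + O[v]^3 *)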
So from these formulas we can see that just by thinking about causal graphs (and, yes, with a backdrop of causal invariance, and a whole host of detailed mathematical limit questions that we’re not discussing here), we’ve managed to derive a basic (and famous) fact about the relation between energy and mass:
E = m c^2
Sometimes in the standard formalism of physics, this relation by now seems more like a definition than something to derive. But in our model, it’s not just a definition, and in fact we can successfully derive it.