They were indeed a queer-looking party that assembled on the bank—the birds with draggled feathers, the animals with their fur clinging close to them, and all dripping wet, cross, and uncomfortable.

As soon as they reached the bank, some of the students left, running towards a forest to the right (some of them couldn’t fly with wet wings, and the others ran even without that excuse), and Alice was left with Thomas and his other set of students, whom she knew very little about. They were all feeling very wet, so for a while Alice forgot that she had to get out of there. She asked how they could all dry themselves, and it was Thomas who spoke up.
“We will run a Caucus Race,” said the Mouse.
Alice had no idea what a Caucus Race was, and neither did any of the other students. The Mouse continued.
“A Caucus Race is a race where there is no winner. Everyone runs as much as they want, and stops when they want to. There’s a twist, though. For this Caucus Race, everyone has to talk, while running, about what the word ‘Invariance’ means to them. Start!”
And what ensued was the biggest commotion Alice could imagine. Everyone was running all around the place and blabbering on about different things. Alice decided to talk about the Invariance Principle, one of her favorite problem-solving strategies from Mathematics. She went:
The Invariance Principle states that when a lot of things change, we should search for what does not change. For example, here’s a question (try it before you ask me the solution):
How do you tile an 8×8 chessboard with 1×2 dominoes when two diagonally opposite corner squares have been chipped off? The board would look like this:

After a while, Alice continued to tell the solution:
Let L be the number of uncovered light squares, and D be the number of uncovered dark ones. The number Z = D − L is 2 at the beginning (since the two chipped-off squares are both light). Each 1×2 domino we place covers exactly one light and one dark square, so Z remains invariant. To fully tile the chessboard, we would need Z to be 0 at the end. By the Invariance Principle, this can never happen.

Alice imagined an expression of wonder on her listeners’ faces and continued. Here’s another one:
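The invariant is easy to check in code. Here is a minimal sketch; the colouring convention, where square (r, c) is light when r + c is even, is an assumption made for illustration:

```python
# A quick check of the invariant Z = D - L, assuming the colouring
# where square (r, c) is light when (r + c) is even.
def parity_gap(removed):
    """Z = (uncovered dark) - (uncovered light) for an 8x8 board
    with the squares in `removed` chipped off."""
    light = sum(1 for r in range(8) for c in range(8)
                if (r + c) % 2 == 0 and (r, c) not in removed)
    dark = 64 - len(removed) - light
    return dark - light

# Chip off two diagonally opposite corners, (0, 0) and (7, 7): both light.
print(parity_gap({(0, 0), (7, 7)}))  # 2
```

Since each domino covers one light and one dark square, Z stays at 2 no matter how dominoes are placed, and a full tiling (which needs Z = 0) is impossible.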
In an International Doctoral Symposium on Deep Learning hosted by ELLIS, a lot of handshakes happened. Some people exchanged an odd number of handshakes (and were called odd people), while some exchanged an even number (called even people). Prove that throughout the 1-week symposium, there was always an even number of odd people. (Hint: the proof is invariant to the change in venue; it could have been the NeurIPS conference instead.)
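A small simulation illustrates the invariant behind the proof: each handshake flips the parity of exactly two people's counts, so the number of odd people changes by −2, 0, or +2 and stays even. The attendee and handshake counts below are arbitrary choices for the sketch:

```python
import random

# Each handshake increments two people's counts, flipping exactly two
# parities, so the number of "odd people" always stays even.
random.seed(0)
n_people = 20
counts = [0] * n_people
for _ in range(500):
    a, b = random.sample(range(n_people), 2)  # a handshake between a and b
    counts[a] += 1
    counts[b] += 1
    odd_people = sum(c % 2 for c in counts)
    assert odd_people % 2 == 0  # the invariant holds after every handshake

print(sum(c % 2 for c in counts) % 2)  # 0
```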
When Alice was finished, she was quite satisfied with herself. She looked around and tried to understand what the others were talking about. It was a complete jungle, but somehow she was able to understand multiple people speaking all at once. Alice thought it could be because of her consciousness spreading along multiple paths in this maze of a network.
The clearest voice was that of Claire the Crab. She was saying something like this:
An invariant is a property of an object which remains unchanged when a transformation is applied to the object. Closely related is the idea of an “equivariant” — a property that undergoes the same transformation instead of remaining unchanged. These two concepts can help us understand a lot of ideas in AI.
For example, CNNs exhibit both of these properties, at different stages. The convolution operation is equivariant under translation: shifting the input correspondingly shifts the output. This allows the network to pick up local structures such as edges, shapes, and patterns. As we move to later layers, the receptive field of each output grows, and (if we have traded away the equivariance carefully, with operations like pooling) we observe invariance under translation, which aids generalization on a number of tasks.
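The translation equivariance of convolution can be seen directly in a toy example. The sketch below uses a circular 1-D convolution so that the equivariance is exact (real CNNs only satisfy it approximately near image borders); the input and filter values are arbitrary:

```python
import numpy as np

def circ_conv(x, w):
    """Circular (wrap-around) 1-D convolution of signal x with filter w."""
    n = len(x)
    return np.array([sum(w[j] * x[(i - j) % n] for j in range(len(w)))
                     for i in range(n)])

x = np.array([0., 1., 2., 3., 0., 0., 0., 0.])
w = np.array([1., -1., 0.5])

# Equivariance: shifting the input by 2 shifts the output by 2.
shifted_then_conv = circ_conv(np.roll(x, 2), w)
conv_then_shifted = np.roll(circ_conv(x, w), 2)
print(np.allclose(shifted_then_conv, conv_then_shifted))  # True
```

Invariance would instead mean the output does not change at all under the shift; in a CNN that is what a global pooling layer on top of the equivariant feature maps provides.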
Similarly, another class of networks, Graph Neural Networks, is designed to be invariant (or equivariant, based on the task at hand) to permutations.
These invariances, though, are not properties of these models per se. Rather, they are properties of the problem we’re trying to solve. The problem (which includes the dataset) usually gives us a set of constraints. These can be used by the model as assumptions in order to learn to solve the problem, and are called the inductive bias of the model.

An object detection model, for instance, has to detect a cat at any position in the input image. This fact can be used by the model to share weights across sets of nearby input pixels (which is, in essence, a convolution!).
However, things can go really badly if the assumed inductive bias is itself incorrect. For example, the inductive bias that a series of binary questions is all you need, which works very well for a lot of problems that use tabular data (guess which models make use of this bias?), might be very inefficient for image processing.
Tricks like data augmentation and regularization are all ways of strengthening the inductive bias of the model, thereby making learning easier with less data.
Some problems (like language understanding) require models that have very little inductive bias. (Guess which models do so well there?)

Simultaneously, Alice was also listening to Frenchie the Fish. He did not have a tongue, though, and Alice could not make out a single word.
Then there was Yannic the Hedgehog. He was saying (or more like questioning, since all his statements were basically questions):
Since invariances due to inductive biases are a part of the problem, is it not quite possible that we AniML Scientists (AniML was the name of the lab where all these students worked) miss some of them? Could it be possible for the model to learn these inductive biases? Could there be learnable invariances in AI?
What are some inductive biases in reinforcement learning?
Just like the loss function, would these invariance or equivariances need to be differentiable?
If one were to create models that operate on sets, would they have some inductive bias my models can make use of? Since sets are unordered, wouldn’t it be permutation invariance/equivariance? Would it be possible for models to make use of this equivariance?
Alice later found out that the answer to this last question was a yes, through models such as Deep Sets.
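A minimal Deep Sets-style sketch shows where the permutation invariance comes from: a per-element encoder (phi), a sum-pooling step (the part where order cannot matter), and a set-level decoder (rho). The weights below are random placeholders for illustration, not a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)
W_phi = rng.normal(size=(1, 4))   # per-element encoder weights (phi)
W_rho = rng.normal(size=(4, 1))   # set-level decoder weights (rho)

def deep_set(xs):
    """rho(sum_i phi(x_i)): permutation-invariant by construction."""
    h = np.tanh(xs[:, None] @ W_phi)  # phi applied to each element
    pooled = h.sum(axis=0)            # sum pooling: order cannot matter
    return (pooled @ W_rho).item()    # rho on the pooled representation

xs = np.array([3.0, 1.0, 2.0])
print(np.isclose(deep_set(xs), deep_set(xs[::-1])))  # True
```

Swapping the sum for any other symmetric aggregation (mean, max) keeps the invariance; dropping the pooling step and feeding the elements in order would lose it.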
She was also listening to an otter. Interestingly, the otter was not talking about AI, or even Computer Science, for that matter. But for some reason, it was the most beautiful thing Alice had heard of them all:
Whenever the action of a physical system is invariant under a continuous transformation, we can derive a corresponding conservation law; this is Noether’s theorem. For example:
- invariance under spatial translation gives the law of conservation of linear momentum,
- invariance under rotation gives the law of conservation of angular momentum, and
- invariance under time translation gives the law of conservation of energy.
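The otter's rule of thumb is Noether's theorem, which can be written compactly in standard Lagrangian notation (the symbols below are the usual physics conventions, not part of the story):

```latex
% Noether's theorem (sketch): if the action S = \int L(q, \dot q, t)\,dt
% is invariant under the continuous transformation q \to q + \epsilon K(q),
% then the following quantity is conserved along solutions:
\[
  Q \;=\; \frac{\partial L}{\partial \dot q}\, K(q) \;=\; \text{constant in } t.
\]
% Spatial translation (K = 1) yields linear momentum, rotations yield
% angular momentum, and time-translation invariance yields energy.
```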
When they had been running half an hour or so and were quite dry again, the Dodo suddenly called out “The race is over!” and they all crowded around it, panting, and asking, “But who has won?”
This question the Dodo could not answer without a great deal of thought, and it sat for a long time with one finger pressed upon its forehead, while the rest waited in silence. At last, the Dodo said, “Everybody has won, and all must have prizes.”
“But who is to give the prizes?” quite a chorus of voices asked.
“Why, she, of course,” said the Dodo, pointing to Alice with one finger; and the whole party at once crowded around her, calling out in a confused way, “Prizes! Prizes!”
Alice had no idea what to do, and in despair, she put her hand in her pocket and pulled out a box of comfits (luckily the salt water had not got into it), and handed them around as prizes. There was exactly one a-piece, all around.
“But she must have a prize herself, you know,” said the Mouse.
“Of course,” the Dodo replied very gravely. “What else have you got in your pocket?” he went on, turning to Alice.
“Only a thimble,” said Alice sadly.
“Hand it over here,” said the Dodo.
Then they all crowded around her once more, while the Dodo solemnly presented the thimble, saying “We beg your acceptance of this elegant thimble;” and, when it had finished this short speech, they all cheered.
Alice thought the whole thing was very absurd, but they all looked so grave that she did not dare to laugh; and, as she could not think of anything to say, she simply bowed, and took the thimble, looking as solemn as she could.
Very soon, though, Alice mentioned her cat, Dinah, and everyone ran away. Alice was all alone and was once again on the verge of tears when she heard a little pattering of footsteps in the distance, and she looked up eagerly, half hoping that the Mouse had changed his mind, and was coming back to finish his story.
But it was the White Rabbit, trotting slowly back again, and looking anxiously about as it went as if it had lost something; and she heard it muttering to itself “The Duchess! The Duchess! Oh, my dear paws! Oh, my fur and whiskers! She’ll get me executed, as sure as ferrets are ferrets! Where can I have dropped them, I wonder?”