DSL Sketch #5

anshumanmohan · 2024-04-18T19:15:02Z

anshumanmohan
Apr 18, 2024
Maintainer

Background

As I mentioned towards the end of #3, we're eventually going to want a DSL that allows users to specify their policies without having to actually program (trees of) PIFOs. This DSL will expose exactly as much expressivity as we can support, after compilation, on the single PIFO tree that we will provision in hardware.

Here I have begun to sketch this DSL. Some of the policies it can express could already be supported on our simple existing hardware.

Syntax and Semantics

We assume for now that we only want to classify packets based on source and destination IP addresses. An address is constructed as:

addr ::= ip | addr or addr

Where ip is opaque and represented by capital letters, and the disjunction will become useful shortly.

We have a packet classification pass that we are going to elide. The packet parser of a P4 program can do this easily enough with a match-action table. We denote the class of packets with source addr_1 and destination addr_2 as addr_1 -> addr_2.

For now we only support three policies.

policy ::= fifo (addr -> addr)
         | fair (policy list)
         | strict (policy list)

That is:

fifo: Take a given class of packets and just transmit it in first come, first served order.
fair: Alternate between items from each of the sub-policies in the list provided. This is work conserving.
strict: Strictly prefer packets from the first sub-policy in the list to those from the second, and so on.
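
One way to make these three readings concrete is a small OCaml sketch that repeatedly picks the next packet to transmit. Everything in it is an illustrative assumption rather than the project's implementation: classes are opaque strings standing in for addr -> addr, a packet is just a class plus an arrival time, and fair is modeled as round-robin in which the child that was just served moves to the back of the line.

```ocaml
(* Hypothetical model: a class is an opaque name, a packet is a class plus
   an arrival time. Arrival times are assumed to be unique. *)
type clss = string
type pkt = { cls : clss; time : int }

type policy =
  | Fifo of clss list
  | Fair of policy list
  | Strict of policy list

(* [next p pending] picks the packet that [p] would transmit next and
   returns it together with the updated policy (only [Fair] changes). *)
let rec next (p : policy) (pending : pkt list) : (pkt * policy) option =
  match p with
  | Fifo cs ->
    (* Earliest arrival among the classes this leaf owns. *)
    let mine = List.filter (fun k -> List.mem k.cls cs) pending in
    (match List.sort (fun a b -> compare a.time b.time) mine with
     | [] -> None
     | k :: _ -> Some (k, p))
  | Strict ps ->
    (* The first child that has a packet wins; child order never changes. *)
    let rec go before = function
      | [] -> None
      | c :: rest ->
        (match next c pending with
         | Some (k, c') -> Some (k, Strict (List.rev_append before (c' :: rest)))
         | None -> go (c :: before) rest)
    in
    go [] ps
  | Fair ps ->
    (* Round-robin: the served child moves to the back. Idle children are
       skipped, which keeps the policy work conserving. *)
    let rec go skipped = function
      | [] -> None
      | c :: rest ->
        (match next c pending with
         | Some (k, c') -> Some (k, Fair (List.rev_append skipped rest @ [ c' ]))
         | None -> go (c :: skipped) rest)
    in
    go [] ps

(* Transmit until nothing that the policy mentions is left. *)
let rec drain (p : policy) (pending : pkt list) : pkt list =
  match next p pending with
  | None -> []
  | Some (k, p') -> k :: drain p' (List.filter (fun x -> x <> k) pending)
```

In this model a packet whose class is named by no leaf is never selected by next, so it simply stays behind when drain finishes.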

A program in our DSL will receive some number of classes of packets, each of the form addr -> addr. It must combine them into a single policy and return that policy. If classes of packets are never mentioned by the program, they will be dropped.

As an aside, we can support a small subset of this language already: policies that use the fifo construct and/or the fair construct with lists of length two. Such policies can be implemented using FIFOs and PIFOs that achieve fairness among two ranks, and we have these in our development.

Examples

Say our classifier has annotated the incoming packets in six possible ways:

A -> X
A -> Y
B -> X
B -> Y
C -> X
C -> Y

Program 1: Forward packets having source `A`; drop the rest.

return fifo (A -> X or Y)

When the disjunction covers all the possibilities, it may make sense to allow some syntactic sugar (e.g., A -> * or A -> _), but I'm just going to write it out in full for now.

Program 2: Fairly split traffic by source.

return fair (fifo (A -> X or Y),
             fifo (B -> X or Y),
             fifo (C -> X or Y))

Program 3: Fairly split traffic by source, and then prefer packets headed to `X` over those headed to `Y`.

left  = strict(fifo(A -> X),
               fifo(A -> Y))
mid   = strict(fifo(B -> X),
               fifo(B -> Y))
right = strict(fifo(C -> X),
               fifo(C -> Y))
return fair (left, mid, right)

Program 3 is beginning to reveal an interesting problem. The policy we have written is well-formed, and is in fact a refinement of the policy we wrote in Program 2. However, this is because the sub-policies we wrote were carefully specifying details of larger "glob" policies. For example, fifo (A -> X or Y) in Program 2 is a glob, and left in Program 3 further refines it.

Say a tired grad student wrote:

Program 4: Incorrect version of Program 3

left  = strict(fifo(A -> X),
               fifo(B -> Y))     // the only change
mid   = strict(fifo(B -> X),
               fifo(B -> Y))
right = strict(fifo(C -> X),
               fifo(C -> Y))
return fair (left, mid, right)

Now I'm not sure what should happen to packets going from A to Y. I presume they get dropped. Should this result in a warning? Should we just catch it based on the fact that fifo(B -> Y) appears twice? Should we change the language to make such mistakes harder to make? Let's talk!
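
One mechanical answer to the first two questions: expand every disjunction into concrete source/destination pairs, then compare the pairs the policy mentions against the pairs the classifier can produce. The OCaml sketch below does this; the constructor names and the helpers (mentioned, duplicates, dropped) are made up for illustration and are not part of any existing tool.

```ocaml
(* Illustrative encoding of the grammar above; all names are hypothetical. *)
type addr = Ip of string | Or of addr * addr

type policy =
  | Fifo of addr * addr          (* fifo (src -> dst) *)
  | Fair of policy list
  | Strict of policy list

(* Flatten a disjunction into the opaque addresses it contains. *)
let rec ips = function
  | Ip s -> [ s ]
  | Or (a, b) -> ips a @ ips b

(* Every (source, destination) class a policy mentions, repeats included. *)
let rec mentioned = function
  | Fifo (src, dst) ->
    List.concat_map (fun s -> List.map (fun d -> (s, d)) (ips dst)) (ips src)
  | Fair ps | Strict ps -> List.concat_map mentioned ps

(* Classes that appear more than once. *)
let duplicates xs =
  let rec go = function
    | a :: (b :: _ as rest) -> if a = b then a :: go rest else go rest
    | _ -> []
  in
  List.sort_uniq compare (go (List.sort compare xs))

(* Classes the classifier can produce but the policy never mentions. *)
let dropped universe xs = List.filter (fun c -> not (List.mem c xs)) universe

(* Program 4 from above, and the six classes the classifier emits. *)
let fifo s d = Fifo (Ip s, Ip d)

let program4 =
  Fair [ Strict [ fifo "A" "X"; fifo "B" "Y" ];
         Strict [ fifo "B" "X"; fifo "B" "Y" ];
         Strict [ fifo "C" "X"; fifo "C" "Y" ] ]

let universe =
  [ ("A", "X"); ("A", "Y"); ("B", "X"); ("B", "Y"); ("C", "X"); ("C", "Y") ]
```

On program4, duplicates (mentioned program4) reports only B -> Y, and dropped universe (mentioned program4) reports only A -> Y, so either signal could drive a warning.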

sampsyo · 2024-04-19T17:55:54Z

sampsyo
Apr 19, 2024
Maintainer

Thanks for outlining this!! This is super helpful and makes things concrete to talk about.

I want to propose a mid-level tweak here, which might help resolve the dropping/duplicating issue you highlight in Program 4. The idea is to further decouple the packet classification from the scheduling. Your proposal "bakes in" a specific notion of how to classify packets, based on source and destination address. My proposal is to make these classifications completely opaque. The result is a simpler language that is a little more annoying to use because you can't write A -> X and have that immediately conjure a specific classification; you instead have to name a classification A_to_X or whatever. Perhaps we can recover some of the niceness with syntactic sugar, but this is about the core formalism.

In this model, every DSL program would start with a list of classes, something like this:

classes A_to_X, A_to_Y, B_to_X, B_to_Y, C_to_X, C_to_Y;

The idea is that the P4 program is required to attach a single integer variable to every packet, in this case ranging from 0 to 5. That number gets mapped to the 6 names declared at the top of the scheduler program. Part of the upside here is that we leave the classification entirely up to the P4 program, so it can classify packets based on anything a P4 program can do. So the associated P4 program could do P4y things to set the class:

action set_schedule_class(bit<8> c) {
  schedule_data.class = c;
}
table schedule_class {
  key = { headers.source_addr: exact; }
  actions = { set_schedule_class; }
}

I dunno, I'm really not a P4 programmer. But something like this.
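
On the scheduler side, the integer that the P4 program attaches only needs to be resolved against the declaration order. A minimal OCaml sketch, assuming the tag is a zero-based index into the classes list (the name class_of_tag is hypothetical):

```ocaml
(* Declaration order fixes the numbering: A_to_X is 0, ..., C_to_Y is 5. *)
let classes = [ "A_to_X"; "A_to_Y"; "B_to_X"; "B_to_Y"; "C_to_X"; "C_to_Y" ]

(* Resolve a packet's integer tag to its class name; out-of-range tags
   have no class. *)
let class_of_tag (classes : string list) (tag : int) : string option =
  if tag < 0 then None else List.nth_opt classes tag
```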

Now that we have opaque names for our classes, back in our special new scheduling DSL we amend the grammar to this:

class ::= id
policy ::= fifo (class list)
         | fair (policy list)
         | strict (policy list)

So at the leaves, we write fifo(A_to_X, A_to_Y) or whatever. Like I said above, this sacrifices the nice A -> X or Y syntax you have, but maybe we can recover that with syntactic sugar.

Here's where I was going with all this, however: we make it a correctness constraint that every class appears exactly once in the policy. (Classes are linear, if you like.) This sidesteps all possible complications w/r/t one class including another one and therefore conflicting—all classes are mutually exclusive, so you can never use one twice. And you are likewise not allowed to leave one off; if you want to drop packets, you should do that in your P4 program instead, I guess.

Does this make sense?

0 replies

anshumanmohan · 2024-04-19T18:38:49Z

anshumanmohan
Apr 19, 2024
Maintainer Author

I like it! I just wonder if we can bring back the "to drop a class of packets, just don't name it in the policy" thing. That is, we make it a correctness constraint that every class appears at most once in the policy.

My angle here is that I don't want the P4 gadget to have the power to drop packets. I think a P4 gadget that just classifies packets would be more generic and more reusable.
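
The "at most once" rule is easy to check mechanically. The following OCaml sketch uses the grammar from the previous comment; the check function and its error messages are assumptions for illustration, not an existing API. It rejects a policy that repeats a class or uses an undeclared one, and otherwise returns the declared classes that the policy leaves out, i.e., the ones that would be dropped.

```ocaml
type clss = string

type policy =
  | Fifo of clss list
  | Fair of policy list
  | Strict of policy list

(* Classes named at the leaves, left to right, repeats included. *)
let rec classes_of = function
  | Fifo cs -> cs
  | Fair ps | Strict ps -> List.concat_map classes_of ps

(* [Ok dropped] if every class is declared and used at most once. *)
let check (declared : clss list) (p : policy) : (clss list, string) result =
  let used = classes_of p in
  let rec first_repeat seen = function
    | [] -> None
    | c :: rest ->
      if List.mem c seen then Some c else first_repeat (c :: seen) rest
  in
  match List.find_opt (fun c -> not (List.mem c declared)) used with
  | Some c -> Error ("undeclared class: " ^ c)
  | None ->
    (match first_repeat [] used with
     | Some c -> Error ("class used more than once: " ^ c)
     | None -> Ok (List.filter (fun c -> not (List.mem c used)) declared))
```

The stricter "exactly once" rule is the special case in which check returns Ok [].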

1 reply

sampsyo Apr 19, 2024
Maintainer

Sure, sounds equally reasonable.

anshumanmohan · 2024-05-28T00:17:23Z

anshumanmohan
May 28, 2024
Maintainer Author

Just writing out a quick idea that @sampsyo had during a meeting. The idea is for the DSL to be able to represent not only a source program (like a user might write) but also a program that has been compiled (in the Formal Abstractions sense) into a program that runs against a different topology but behaves identically.

So for example, the user might write the policy

fair(fifo (foo),
     fifo (bar),
     fifo (baz))

which is written against a ternary tree of height two.

Then we might compile this policy so it runs against a binary tree of minimum height. The catch is that we still want to express that new policy in the DSL. That is, we want to be able to look at and study the compiled policy in a DSL before we compile it (in the Calyx sense) down to an accelerator.

The compiled policy won't be as neat and tidy as our user-written one, but at least it'll be somewhat readable. We'll also be able to state and prove equivalence of two programs written in the same DSL. This is nicer than trying to directly prove that a program written in the DSL is equivalent to the program running on the accelerator.

The challenge, of course, is that when we compile policies running on one topology to run on another, the new control program often needs to make scheduling decisions that "look through" intermediate nodes. The DSL as presented above is not immediately going to be able to do what this comment describes. We will need a careful tweak.

5 replies

sampsyo May 28, 2024
Maintainer

Excellent. Indeed, the primary challenge here is in thinking through what a construct for the "special" nodes that "look through" other nodes would look like… and how to describe their semantics, in an informal but compositional way.

anshumanmohan May 28, 2024
Maintainer Author

I have a quick proposal for this "see through" system. The high-level idea is to actually leave alone the mother node that will be seeing through other nodes, and create a new policy construct t for transient nodes that will be seen through. The policy t is read, "I have descendants but don't have a policy; someone above me will see through me and give my descendants a policy".

Also, policies now need to be tagged with an integer that specifies how many descendants the policy orchestrates. So fair2 is different from fair3, and so on.

To lighten notation I am going to start writing the name of a single class, e.g., class, as a policy constructor. This is to be read the same as fifo1 ([class]), i.e., the policy that transmits the single class class in FIFO order.

Concretely:

class ::= id
policy ::= class
         | fifo2 (class list)
         | fair2 (policy list)
         | strict2 (policy list)
         ...
         | t2 (policy list)
         | t3 (policy list)
         ...

Say the user wrote

fair3(foo,
      bar,
      baz)

Compiled to run on a binary tree, this would turn into:

fair3(foo,
      t2(bar, baz))

So the policy is still fair3, but the list of policies provided is just two long:

foo
t2(bar, baz)

As a result, the t2 takes the rest of the shares on the assumption that it has enough descendants to consume those shares.

Consider a kinda wild case, where some opaque policy pol6 is written against a 6-wide tree of height 2, but is then compiled to run on a tall skinny binary tree of height 6.

Original:

pol6(A, B, C, D, E, F)

Compiled:

pol6(A,
     t5(B,
       t4(C,
         t3(D,
           t2(E, F)))))

Gross! And the Formal Abstractions compiler would not actually produce such a tall skinny tree. But you can sorta convince yourself that the compiled DSL policy above can easily be "folded" back into the original policy.

Let's do a more realistic case:

Original:

fair4(A, B, C, D)

Compiled:

fair4(t2(A, B),
      t2(C, D))

sampsyo May 29, 2024
Maintainer

Aha, that's an interesting idea. You're right that this has the upside of making it clear how to "collapse" the tree back into one where the transient nodes are removed.

One possible concern I have about this style is that it might mean that the semantics (and possibly compilation) of the non-transient nodes is now not very compositional. What I mean is that, to describe the semantics of the fair4 policy constructor, this would be a compositional statement:

Take 4 streams of packets as input. Fairly balance the streams so that, when all the streams are active, 1/4 of any window in the output on average belongs to each of the 4 input streams.

With transient nodes, the semantics become more case-by-case-y (i.e., not as compositional)… the semantics would have to roughly say something like:

Take all my children and recursively flatten them by removing transient nodes and taking all the children of those. Then fairly balance the traffic among those gathered-up children as above.

As in, the semantics would essentially boil down to "perform the folding @anshumanmohan outlined, and then execute that policy."
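
That folding step can be sketched directly. The OCaml below is a minimal model, assuming a cut-down grammar with only classes, fair, strict, and transient nodes; the constructor and function names are hypothetical. fold splices the children of every transient node into the nearest non-transient ancestor, and well_tagged checks that each integer tag matches the number of children a node has once folding is done.

```ocaml
type policy =
  | Class of string
  | Fair of int * policy list
  | Strict of int * policy list
  | T of int * policy list       (* transient node, e.g. t2 *)

(* Remove transient nodes by handing their children to the node above. *)
let rec fold = function
  | Class c -> Class c
  | Fair (n, ps) -> Fair (n, splice ps)
  | Strict (n, ps) -> Strict (n, splice ps)
  | T (n, ps) -> T (n, splice ps)   (* a transient root has nobody above it *)

and splice ps =
  List.concat_map (function T (_, qs) -> splice qs | p -> [ fold p ]) ps

(* Does every tag equal the number of direct children? *)
let rec well_tagged = function
  | Class _ -> true
  | Fair (n, ps) | Strict (n, ps) | T (n, ps) ->
    n = List.length ps && List.for_all well_tagged ps
```

For instance, folding Fair (4, [T (2, [A; B]); T (2, [C; D])]) gives Fair (4, [A; B; C; D]), which is well tagged, whereas the unfolded version is not, because fair4 has only two direct children there.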

This worries me a little bit w/r/t compilation because it makes it less clear how to implement these transient nodes… and how to implement fair4 in a way that doesn't require case-by-case reasoning about what its children are (transient or non-transient). However, maybe this is the most direct reflection of how the hardware needs to work? Maybe that's the thing to think through: roughly how do we think we will build the hardware for special/transient PIFO tree nodes, and how do we reflect that in the most straightforward way possible, semantically?

anshumanmohan May 29, 2024
Maintainer Author

Yes, totally fair! I think we'll have to wait and see what we actually do with the hardware, and that will inform this decision here. Glad we aired out these thoughts though!

anshumanmohan Jun 5, 2024
Maintainer Author

Just jotting down an idea from a discussion with Nate and Tobias: decouple the topology of the tree from the logical policy that we wish to run on it.

The user's dream is still written as before:

fair3(A, B, C)

We can infer trivially that this can be run against the following tree topology, which we'll call T1.

Node(Leaf, Leaf, Leaf)

But say we know that we only have a binary branching tree, T2, at our disposal:

Node(Node (Leaf, Leaf),
     Node (Leaf, Leaf))

We proceed by invoking the embedding algorithm from §6.1 of Formal Abstractions. This returns the binary tree of minimum height that T1 can embed into. We'll call it T3.

Node(Node (Leaf, Leaf),
     Leaf)

Informally, T3 "fits into" T2, so we're going to be okay.

A quick note: Leaf above was perhaps always shorthand for Node [], the node with no children, and the tree had only one constructor, Node of tree list. This may allow us to state my "fitting into" idea more precisely: T3 fits into T2 because Leaf = Node [] < Node (Leaf, Leaf). Sorta scrambling here, but hopefully the intuition is clear.
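
One way to pin down "fits into" under that single-constructor reading is sketched below in OCaml. This is only one candidate definition, chosen for simplicity: it matches children left to right and lets the larger tree have extra children on the right. A more permissive relation, such as any injection of children, is equally plausible.

```ocaml
(* A tree has one constructor; a leaf is the node with no children. *)
type tree = Node of tree list

let leaf = Node []

(* [fits small big]: every child of [small] fits into the child of [big]
   at the same position; [big] may have additional children. *)
let rec fits (Node cs) (Node ds) =
  let rec prefix cs ds =
    match cs, ds with
    | [], _ -> true
    | _, [] -> false
    | c :: cs', d :: ds' -> fits c d && prefix cs' ds'
  in
  prefix cs ds

(* The three topologies discussed above. *)
let t1 = Node [ leaf; leaf; leaf ]
let t2 = Node [ Node [ leaf; leaf ]; Node [ leaf; leaf ] ]
let t3 = Node [ Node [ leaf; leaf ]; leaf ]
```

Under this definition fits t3 t2 holds while fits t1 t2 does not, which matches the need to compile T1 before it can run on T2.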

Let's pop back up to the DSL. All policies are stated in two pieces: the logical dream and the topology the dream should run on. When the topology is the most obvious projection of the dream, it can be elided.

The user writes:

fair3(A, B, C)

which we desugar into:

fair3(A, B, C)
---
Node(Leaf, Leaf, Leaf)

trivially.
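
The "most obvious projection" used in this desugaring can be computed from the policy alone. A small OCaml sketch, assuming a policy type without integer tags and the single-constructor tree mentioned earlier in this comment (shape and leaves are hypothetical names):

```ocaml
type tree = Node of tree list

type policy =
  | Class of string
  | Fair of policy list
  | Strict of policy list

(* The topology a policy is implicitly written against: one node per
   policy constructor, one leaf per class. *)
let rec shape = function
  | Class _ -> Node []
  | Fair ps | Strict ps -> Node (List.map shape ps)

(* Number of leaves; a target topology needs at least one leaf per class. *)
let rec leaves (Node ts) =
  if ts = [] then 1 else List.fold_left (fun n t -> n + leaves t) 0 ts
```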

Then we run the Formal Abstractions compiler on the above. It actually operates on the shape alone and gives us:

fair3(A, B, C)
---
Node(Node (Leaf, Leaf),
     Leaf)

and comes with the guarantee that if

fair3(A, B, C)

could run on

Node(Leaf, Leaf, Leaf)

then it can run on

Node(Node (Leaf, Leaf),
     Leaf)

It does not actually compute the new scheduling brain that would be needed to do this; we are after all in DSL land. It just assures us that it will be possible to do so.

anshumanmohan · 2024-07-22T16:01:31Z

anshumanmohan
Jul 22, 2024
Maintainer Author

Just to keep this discussion current: we now have an AST and a parser; see the AST here.

Reproducing it for easy reference:

type clss = string
type var = string

type declare =
| DeclareClasses of clss list

type policy =
| Class of clss
| Fifo of policy list
| Fair of policy list
| Strict of policy list
| Var of var

type return =
| Return of policy

type assignment =
| Assn of var * policy

type program =
| Prog of declare * (assignment list) * return

That is, you declare your classes once, make a series of assignments if you want, and then write a return statement with a policy as its payload. The idea is that the assignments will be for sub-policies, and that these will accrete into a policy that you will eventually return.
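
To illustrate how assignments accrete into the returned policy, here is an OCaml sketch that inlines variables. The type definitions repeat the AST above; subst and inline are hypothetical helpers, not part of the actual parser. The sketch assumes each variable is assigned before it is used.

```ocaml
type clss = string
type var = string

type declare = DeclareClasses of clss list

type policy =
  | Class of clss
  | Fifo of policy list
  | Fair of policy list
  | Strict of policy list
  | Var of var

type return = Return of policy
type assignment = Assn of var * policy
type program = Prog of declare * assignment list * return

(* Replace every variable in a policy with the policy bound to it. *)
let rec subst (env : (var * policy) list) : policy -> policy = function
  | Class c -> Class c
  | Var v ->
    (match List.assoc_opt v env with
     | Some p -> p
     | None -> failwith ("unbound variable: " ^ v))
  | Fifo ps -> Fifo (List.map (subst env) ps)
  | Fair ps -> Fair (List.map (subst env) ps)
  | Strict ps -> Strict (List.map (subst env) ps)

(* Process assignments in order, then resolve the returned policy. *)
let inline (Prog (_, assns, Return p)) : policy =
  let env =
    List.fold_left (fun env (Assn (v, q)) -> (v, subst env q) :: env) [] assns
  in
  subst env p
```

The result of inline contains no Var nodes, so later passes only need to handle Class, Fifo, Fair, and Strict.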

I will note that Class of clss is a permitted policy; it just means "please emit packets from this one class in FIFO order". One could wrap this into a FIFO if one wanted, but that's unnecessary. The sample programs used to do this unnecessary wrapping but I have simplified that syntax.

If you actually want to combine two policies agnostically, i.e., you don't care about one crowding out another in the case of silences, you are welcome to write the policy Fifo[C1, C2]. This will just create a first-come-first-served free-for-all between packets of classes C1 and C2.

1 reply

anshumanmohan Jul 22, 2024
Maintainer Author

The AST and parser do not yet implement non-work-conserving algorithms or transient nodes.
