Allow concatenation of string literals at compile time #1299

Open: wants to merge 3 commits into base: main
75 changes: 59 additions & 16 deletions p4-16/spec/P4-16-spec.mdk
@@ -1304,11 +1304,18 @@ number of backslash characters (ASCII code 92). P4 does not make any
validity checks on strings (i.e., it does not check that strings
represent legal UTF-8 encodings).

Since P4 does not provide any operations on strings,
string literals are generally passed unchanged through the P4 compiler to
other third-party tools or compiler-backends, including the
terminating quotes. These tools can define their own handling of
escape sequences (e.g., how to specify Unicode characters, or handle
Since P4 does not allow strings to exist at runtime, string literals
Collaborator

I think this isn't really true anymore? Now we also concatenate them e.g.? And I'm guessing we plan to turn that into a single string literal before passing it along to backends (though maybe not?).

I think the point of this paragraph is to say that the P4 compiler won't interpret string literals. What does that mean in the new world with concatenation?

Collaborator

Probably terminating quotes are no longer left as-is?

are generally passed unchanged through the P4 compiler to
other third-party tools or compiler-backends. The compiler can, however,
perform compile-time concatenation (constant-folding) of concatenation
expressions into a single literal. When such concatenation is performed,
the binary representation of the string literals (excluding the quotes)
is concatenated in the order they appear in the source code. There are
no escape sequences that would be treated specially when strings are
concatenated.

The backends and other tools can define their own handling of escape
sequences (e.g., how to specify Unicode characters, or handle
unprintable ASCII characters).

Here are 3 examples of string literals:
@@ -1319,6 +1326,16 @@ Here are 3 examples of string literals:
line terminator"
~ End P4Example

Here is an example of a concatenation expression and an equivalent string
literal:

~ Begin P4Example
"one string \" with a quote inside;" ++ (" " ++ "another string")
// can be constant folded to
"one string \" with a quote inside; another string"
~ End P4Example
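
As a rough illustration of the folding rule described above (not part of the proposed spec text), the following Python sketch models concatenation at the level of source text: the enclosing quotes are dropped, the contents are joined in source order, and backslashes are copied through without interpretation. The helper names `strip_quotes` and `fold_string_literals` are hypothetical, and Python `str` stands in for the binary representation mentioned in the text.

```python
def strip_quotes(literal: str) -> str:
    """Return the raw contents of a source-level string literal, without its quotes."""
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError(f"not a string literal: {literal!r}")
    return literal[1:-1]


def fold_string_literals(literals):
    """Fold literals (source text, quotes included) into one literal.

    Escape sequences are not interpreted: a backslash is copied as-is.
    """
    body = "".join(strip_quotes(lit) for lit in literals)
    return '"' + body + '"'


# The three operands of the example above, in source order.
parts = [r'"one string \" with a quote inside;"', '" "', '"another string"']
folded = fold_string_literals(parts)
print(folded)
```

Because the sketch never decodes escapes, the `\"` sequence in the first operand survives unchanged in the folded literal, which matches the statement that no escape sequence is treated specially during concatenation.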


### Optional trailing commas { #sec-trailing-commas }

The P4 grammar allows several kinds of comma-separated lists to end in
@@ -1906,13 +1923,18 @@ Operations on values of type `match_kind` are described in Section
### The Boolean type { #sec-bool-type }

The Boolean type `bool` contains just two values, `false` and `true`.
Boolean values are not integers or bit-strings.
Boolean values are not integers or bit-strings. Operations that can
be performed on booleans are described in Section [#sec-bool-exprs].

### Strings { #sec-string-type }

The type `string` represents strings. There are no operations on
string values; one cannot declare variables with a `string` type.
Parameters with type `string` can be only directionless (see Section
The type `string` represents strings. The values of type `string` are
either string literals, or concatenations of multiple `string`-typed
expressions. Operations that can be performed on strings are described in
Section [#sec-string-ops].

One cannot declare variables with a `string` type. Parameters with
type `string` can be only directionless (see Section
[#sec-calling-convention]). P4 does not support string manipulation
in the dataplane; the `string` type is only allowed for describing
compile-time known values (i.e., string literals, as discussed in
@@ -3739,6 +3761,23 @@ finding this information in a section dedicated to type `varbit`.
Additionally, the maximum size of a variable-length bit-string can be determined at
compile-time (Section [#sec-minsizeinbits]).

## Operations on Strings { #sec-string-ops }

The only operation allowed on strings is concatenation, denoted by
`++`. For string concatenation, both operands must be strings and
the result is also a string. String concatenation can only be
performed at compile time.

~ Begin P4Example
extern void log(string message);

void foo(int<8> v) {
    // ...
    log("my log message " ++
        "continuation of the log message");
}
~ End P4Example
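
One way a compiler pass might enforce these rules is sketched below in Python, purely as an illustration. The node classes `StringLit`, `IntLit`, and `Concat` and the function `fold` are hypothetical and do not correspond to any existing compiler's API. The sketch models only the string case of `++`; `IntLit` stands in for any operand that is not a string.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class StringLit:
    value: str  # literal contents, without the enclosing quotes


@dataclass(frozen=True)
class IntLit:
    value: int  # stand-in for any non-string operand


@dataclass(frozen=True)
class Concat:
    left: "Expr"
    right: "Expr"


Expr = Union[StringLit, IntLit, Concat]


def fold(expr: Expr) -> StringLit:
    """Fold a tree of string concatenations into a single string literal.

    Raises TypeError if any operand is not a string, mirroring the rule
    that both operands of a string concatenation must be strings.
    """
    if isinstance(expr, StringLit):
        return expr
    if isinstance(expr, Concat):
        left = fold(expr.left)
        right = fold(expr.right)
        return StringLit(left.value + right.value)
    raise TypeError(f"operand is not a string: {expr!r}")


# Models the argument of log(...) in the example above.
message = Concat(StringLit("my log message "),
                 StringLit("continuation of the log message"))
folded_message = fold(message)
print(folded_message.value)
```

Folding bottom-up means nested expressions such as `a ++ (b ++ c)` reduce to one literal before the value reaches an extern or annotation, so nothing string-typed needs to exist at runtime.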

## Casts { #sec-casts }

P4 provides a limited set of casts between types. A cast is written
@@ -8486,9 +8525,10 @@ table t {

The `@name` annotation directs the compiler to use a different
local name when generating the external APIs used to manipulate a
language element from the control plane. This annotation takes a string literal
body. In the
following example, the fully-qualified name of the table is `c_inst.t1`.
language element from the control plane. This annotation takes a local
compile-time known value of type `string` (typically a string literal).
In the following example, the fully-qualified name of the table is
`c_inst.t1`.

~ Begin P4Example
control c( /* parameters omitted */ )() {
@@ -8587,11 +8627,13 @@ absence), allowing architecture-independent analysis of P4 programs.

The `deprecated` annotation has a required string argument that is a
message that will be printed by a compiler when a program is using the
deprecated construct. This is mostly useful for annotating library
constructs, such as externs.
deprecated construct. This is mostly useful for annotating library
constructs, such as externs. The parameter must be a local
compile-time known value of type `string`.

~ Begin P4Example
@deprecated("Please use the 'check' function instead")
#define DEPR_V1_2_2 "Deprecated in v1.2.2"
@deprecated("Please use the 'check' function instead. " ++ DEPR_V1_2_2)
extern Checker {
/* body omitted */
}
@@ -8602,7 +8644,8 @@ extern Checker {
The `noWarn` annotation has a required string argument that indicates
a compiler warning that will be inhibited. For example
`@noWarn("unused")` on a declaration will prevent a compiler warning
if that declaration is not used.
if that declaration is not used. The parameter must be a local
compile-time known value of type `string`.

## Target-specific annotations
