Disclaimer:
- These docs are unofficial and may be inaccurate or incomplete.
- Please file bugs at https://github.com/ntrel/cpp2/issues.
- At the time of writing, Cpp2 is an unstable experimental language, see:
Note: Some examples are snipped/adapted from: https://github.com/hsutter/cppfront/tree/main/regression-tests
Note: Examples here use C++23 std::println instead of std::cout. If you don't have it, you can use this definition:
std: namespace = { println: (args...) = (std::cout << ... << args) << "\n"; }
- Declarations
- Variables
- Modules
- Types
- Memory Safety
- Expressions
- Statements
- Functions
- User-Defined Types
- Templates
- Aliases
These are of the form:
- declaration:
  - identifier : type? = initializer
type can be omitted for type inference (though not at global scope).
x: int = 42; y := x;
A global declaration can be used before the line declaring it.
Cpp1 declarations can be mixed in the same file.
// Cpp2 x: int = 42; // Cpp1 int main() { return x; // use a Cpp2 definition }
A Cpp2 declaration cannot use Cpp1 declaration format internally:
// declare a function f: () = { int x; // error }
Note: cppfront
has a -p
switch to only allow pure Cpp2.
Use of an uninitialized variable is statically detected.
When the variable declaration specifies the type, initialization can be deferred to a later statement. Both branches of an if statement must initialize a variable, or neither.
x: int; y := x; // error, x is uninitialized if f() { x = 1; // initialization, not assignment } else { x = 0; // initialization required here too, otherwise an error } x = 2; // assignment
x: const int; x = 5; // initialization x = 6; // error
y: int = 7; z: const _ = y; // z is a `const int`
Note that x does not need to be initialized immediately; it can be deferred. This is particularly useful when using if branches to initialize the constant.
https://github.com/ntrel/cppfront/wiki/Design-note:-const-objects-by-default
A variable is implicitly moved on its last use when the use site syntax may accept an rvalue. This includes passing the variable as an argument to a function, but not a last use where the variable is the target of an assignment.
inc: (inout v: int) = v++; test2: () = { v := 42; inc(v); // OK, lvalue inc(v); // error, cannot pass rvalue }
This can be suppressed by adding a statement _ = v; after the final inc call.
Cpp2 files have the file extensions .cpp2 and .h2.
C++23 will support:
import std;
This will be implicitly done in Cpp2. For now common std headers are imported.
See also: User-Defined Types.
Use:
- std::array for fixed-size arrays.
- std::vector for dynamic arrays.
- std::span to reference consecutive elements from either.
A pointer to T has type *T. Pointer arithmetic is illegal.
Address of and dereference operators are postfix:
x: int = 42; p: *int = x&; y := p*;
This makes p-> obsolete - use p*. instead.
To distinguish these from binary & and *, use preceding whitespace.
new<T> gives unique_ptr by default:
p: std::unique_ptr<int> = new<int>; q: std::shared_ptr<int> = shared.new<int>;
Note: gc.new<T> will allocate from a garbage collected arena.
There is no delete operator. Raw pointers cannot own memory.
Initialization or assignment from null is an error:
q: *int = nullptr; // error
Instead of using null for *T, use std::optional<*T>.
By default, cppfront also detects a runtime null dereference, for example when dereferencing a pointer created in Cpp1 code.
int *ptr; f: () -> int = ptr*;
Calling f above produces:
Null safety violation: dynamic null dereference attempt detected
Cpp2 will not enforce a memory-safety subset 100%. It will diagnose or prevent type, bounds, initialization, and common lifetime memory-safety violations. This is done by:
- Runtime bounds checks
- Requiring each variable is initialized before use in every possible branch
- Not implemented yet: Compile-time tracking of a set of 'points-to' information for each pointer. When a pointed-to variable goes out of scope, the set is updated to replace the variable with an invalid item. Dereferencing a pointer with a set containing an invalid item is a compile-time error. See https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1179r1.pdf.
See:
- https://github.com/hsutter/cppfront#2015-lifetime-safety
- https://www.reddit.com/r/cpp/comments/16ummo8/cppfront_autumn_update/k2r3fto/
By default, cppfront does runtime bounds checks when indexing:
v: std::vector = (1, 2); i := v[-1]; // aborts program s: std::string = ("hi"); i = s[2]; // aborts program
Besides the pointer operators, Cpp2 also only uses postfix instead of prefix form for:
- ++
- --
- ~
Unlike Cpp1, the immediate result of postfix increment/decrement is the new value.
i := 0; assert(i++ == 1);
https://github.com/hsutter/cppfront/wiki/Design-note:-Postfix-operators
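For readers coming from Cpp1, the closest analogue of the Cpp2 i++ described above is Cpp1's prefix ++i, which also yields the updated value. The sketch below is plain Cpp1 for comparison only; both helper names are hypothetical and it is not meant to show what cppfront actually generates.

```cpp
// Hypothetical helper: yields the value that the Cpp2 expression `i++`
// is described as producing (the new value), using Cpp1 prefix ++.
inline int cpp2_style_increment(int& i) {
    return ++i;
}

// Cpp1 postfix ++ for contrast: yields the old value, then increments.
inline int cpp1_postfix_increment(int& i) {
    return i++;
}
```

Both helpers leave the variable incremented by one; they differ only in the value of the expression itself.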
A bracketed expression with a trailing $ inside a string will evaluate the expression, convert it to string and insert it into the string.
a := 2; b: std::optional<int> = 2; s: std::string = "a^2 + b = (a * a + b.value())$\n"; assert(s == "a^2 + b = 6\n");
Note: $ means 'capture' and is also used in closures and postconditions: https://github.com/hsutter/cppfront/wiki/Design-note%3A-Capture
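As a point of comparison, the interpolated string above can be built by hand in Cpp1 roughly as follows. This is only a sketch: the function name describe is hypothetical, and std::to_string is an assumed stand-in for whatever conversion cppfront really applies.

```cpp
#include <optional>
#include <string>

// Hand-written Cpp1 analogue of the Cpp2 string
//   "a^2 + b = (a * a + b.value())$\n"
// Assumption: std::to_string approximates the string conversion.
inline std::string describe(int a, std::optional<int> b) {
    return "a^2 + b = " + std::to_string(a * a + b.value()) + "\n";
}
```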
- anonymousVariable:
  - : type? = expression
f: (i: int) = { std::println("int"); } f: (i: short) = { std::println("short"); } main: () = { f(5); // int f(:short = 5); // short }
The last statement is equivalent to tmp: short = 5; f(tmp);.
- identifierExpression:
  - identifier
  - identifier < expressions >
  - expression :: identifierExpression
Whenever any kind of identifier expression is used where it could parse as a type, it must be enclosed in parentheses:
- id1 is parsed as a type
- (id1) is parsed as an expression
An identifier expression does not need parentheses where a type would not be valid. Other expressions never need parentheses as they could not be parsed as a valid type, e.g. literals, unary expressions etc.
- asExpression:
  - expression as type
x as T attempts:
- type conversion (if the type of x implicitly converts to T)
- customized conversion (using operator as<T>), useful for std::optional, std::variant etc.
- construction of T(x)
- dynamic casting (equivalent to Cpp1 dynamic_cast<T>(x) when x is a base class of T)
An exception is thrown if the expression is well-formed but the conversion is invalid.
c := 'A'; i: int = c as int; assert(i == 65); v := std::any(5); i = v as int; s := "hi" as std::string; assert(s.length() == 2);
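For the std::any case above, a rough Cpp1 counterpart of v as int is std::any_cast<int>(v), which throws std::bad_any_cast when the held type is different. The sketch below assumes that correspondence; the helper names as_int and as_int_throws are hypothetical.

```cpp
#include <any>
#include <string>

// Rough Cpp1 counterpart of `v as int` for a std::any.
// std::any_cast throws std::bad_any_cast if `v` does not hold an int.
inline int as_int(const std::any& v) {
    return std::any_cast<int>(v);
}

// Returns true when the conversion fails with an exception.
inline bool as_int_throws(const std::any& v) {
    try {
        (void)as_int(v);
        return false;
    } catch (const std::bad_any_cast&) {
        return true;
    }
}
```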
- isExpression:
  - type is (type | template)
  - expression is (type | expression | template)
Not implemented yet.
Test a type T matches another type - T is Target attempts:
- true when T is the same type as Target.
- true if T is a type that inherits from Target.
Test a type against a template - T is Template attempts:
- true if T is an instance of Template.
- Template<T> if the result is convertible to bool.
Note: Testing an identifier expression needs to use parentheses.
Test type of an expression - (x) is T attempts:
- true when the type of x is T
- x.operator is<T>()
(x) is void means x is empty.
assert(5 is int); i := 5; assert((i) is int); assert(!((i) is long)); v := std::any(); assert((v) is void); // `v.operator is<void>()` v = 5; assert((v) is int); // `v.operator is<int>()`
Test expression has a particular value - (x) is v attempts:
- x.operator is(v)
- x == v
- x as V == v where V is the type of v
- v(x) if the result is bool
i := 5; assert((i) is 5); v := std::any(i); assert((v) is 5);
The last lowering allows testing a value by calling a predicate function:
pred: (x: int) -> bool = x < 20; test_int: (i: int) = { if (i) is (pred) { std::println("(i)$ is less than 20"); } } main: () = { test_int(5); test_int(15); test_int(25); }
Note that pred is not a type identifier so it must be parenthesized.
Test an expression against a template - (x) is Template attempts:
- true if the type of x is an instance of Template.
- Template<(x)> if the result is convertible to bool.
- inspectExpression:
  - inspect constexpr? expression -> type { alternative+ }
- alternative:
  - alt-name? pattern = statement
  - alt-name? pattern { alternative+ }
- alt-name:
  - identifier :
- pattern:
  - is (type | expression | template)
  - as type
  - if expression
  - pattern || pattern
  - pattern && pattern
Only is alternatives without alt-name are implemented at the moment.
v : std::any = 12; main: () = { s: std::string; s = inspect v -> std::string { is 5 = "five"; is int = "some other integer"; is _ = "not an integer"; }; std::println(s); }
An inspect expression must have an is _ case.
Unimplemented: an inspect statement has the same grammar except there must be no -> type after the expression.
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2392r2.pdf
A variable can be explicitly moved. The move constructor of z will destroy x:
x: std::string = "hi"; z := (move x); assert(z == "hi"); assert(x == "");
See also Implicit Move on Last Use.
A condition expression does not require parentheses in Cpp2, though when a statement immediately follows a condition, a blockStatement is required.
- ifStatement:
  - if constexpr? expression blockStatement elseClause?
- elseClause:
  - else blockStatement
  - else ifStatement
if c1 { ... } else if c2 { ... } else { ... }
x := 1; assert(x == 1);
- parameterizedStatement:
- parameterList statement
A parameterized statement declares one or more variables that are defined only for the scope of statement.
(tmp := some_complex_expression) func(tmp, tmp); // tmp no longer in scope
Valid parameterStorage keywords are in, copy, inout.
- whileStatement:
  - while expression nextClause? blockStatement
- nextClause:
  - next expression
If next is present, its expression will be evaluated at the end of each loop iteration.
// prints: 0 1 2 (copy i := 0) while i < 3 next i++ { std::println(i); }
Note: The above is a parameterizedStatement.
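Put together, the parameterized while loop with a next clause reads much like a Cpp1 for loop: the variable is scoped to the loop and the next expression runs at the end of each iteration. The following is a hand-written Cpp1 comparison, not cppfront output; count_to_three is a hypothetical name, and values are collected instead of printed so the result can be checked.

```cpp
#include <vector>

// Cpp1 analogue of:
//   (copy i := 0) while i < 3 next i++ { std::println(i); }
// `i` exists only for the duration of the loop, and `i++` is
// evaluated at the end of every iteration.
inline std::vector<int> count_to_three() {
    std::vector<int> seen;
    for (int i = 0; i < 3; i++) {
        seen.push_back(i);  // stands in for std::println(i)
    }
    return seen;
}
```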
- doWhileStatement:
  - do blockStatement nextClause? while expression ;
// prints: 0 1 2 i := 0; do { std::println(i); } next i++ while i < 3;
- forStatement:
  - for expression nextClause? do ( parameter ) statement
The first expression must be a range. parameter is initialized from each element of the range. The parameter type is inferred. parameter can have inout parameterStorage.
vec: std::vector<int> = (1, 2, 3); for vec do (inout e) e++; assert(vec[0] == 2); for vec do (e) std::println(e);
The target of a break or continue statement can be a labelled loop.
outer: while true { j := 0; while j < 3 next j++ { if done() { break outer; } } }
- functionType:
  - parameterList returnSpec
- parameterList:
  - ( parameter? )
  - ( parameter (, parameter)+ )
- parameter:
  - parameterStorage? type
  - parameterStorage? identifier ...? : type
- returnSpec:
  - -> (forward | move)? type
  - -> parameterList
E.g. (int, float) -> bool.
- functionDeclaration:
  - identifier? : parameterList returnSpec? ;
  - identifier? : parameterList returnSpec? contracts? = functionInitializer
  - identifier? : parameterList expression ;
Function declarations extend the declaration form. Each parameter must have an identifier.
If returnSpec is missing with the first two forms, the function returns void. The return type can be inferred from the initializer by using -> _.
See also Template Functions.
- functionInitializer:
  - (expression ; | statement)
A function is initialized from a statement or an expression.
d: (i: int) = std::println(i); e: (i: int) = { std::println(i); } // same
If the function has a returnSpec, the expression form implies a return statement.
f: (i: int) -> int = return i; g: (i: int) -> int = i; // same
Lastly, -> _ = together can be omitted:
h: (i: int) i; // same as f and g
This form is useful for lambda functions.
When a function returns a parameterList, each parameter must be named. A function with multiple named return parameters returns a struct with a member for each parameter.
f: () -> (i: int, s: std::string) = { i = 10; s = "hi"; } main: () = { t := f(); assert(t.i == 10); assert(t.s == "hi"); }
- Unless a return parameter has a default value, it must be initialized in the function body.
- When only one return parameter is declared, the caller does not use member syntax to access the result.
f: () -> (ret: int = 42) = {} main: () = { assert(f() == 42); }
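In Cpp1 terms, multiple named return values behave like returning a small struct, while a single named return value is used like a plain value. The sketch below illustrates this by hand; named_result, f_named and f_single are hypothetical names and the struct is not the one cppfront generates.

```cpp
#include <string>

// Hypothetical struct standing in for the return type of
//   f: () -> (i: int, s: std::string)
struct named_result {
    int i;
    std::string s;
};

// Each named return value is assigned in the body, then the
// whole struct is returned to the caller.
inline named_result f_named() {
    named_result r{};
    r.i = 10;
    r.s = "hi";
    return r;
}

// A single named return with a default value, as in
//   f: () -> (ret: int = 42) = {}
// The caller sees a plain int, with no member access.
inline int f_single() {
    int ret = 42;
    return ret;
}
```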
- mainFunction:
  - main : ( args? ) (-> int)? = functionInitializer
If args is declared, it is a std::vector<std::string_view> containing each command-line argument to the program.
If a method doesn't exist when using method call syntax, and there is a function whose first parameter can take the type of the 'object' expression, then that function is called instead.
main: () -> int = { // call C functions myfile := fopen("xyzzy", "w"); myfile.fprintf("Hello %d!", 2); // fprintf(myfile, "Hello %d!", 2) myfile.fclose(); // fclose(myfile) }
- in - default, read-only. Will pass by reference when more efficient, otherwise pass by value.
- inout - pass by mutable reference.
- out - must be written to. Can accept an uninitialized argument, otherwise destroys the argument. The first assignment constructs the parameter. Used for constructors.
- move - argument can be moved from. Used for destructors.
- copy - argument can be copied from.
- forward - accepts lvalue or rvalue, pass by reference.
e: (i: int) = i++; // error, `i` is read-only f: (inout i: int) = i++; // mutate argument g: (out i: int) = { v := i; // error, `i` used before initialization // error, `i` was not initialized }
Functions can return by reference:
first: (forward v: std::vector<int>) -> forward int = v[0]; main: () -> int = { v : std::vector = (1,2,3); first(v) = 4; }
https://github.com/hsutter/cppfront/blob/main/regression-tests/mixed-parameter-passing.cpp2
vec: std::vector<int> = (); insert_at: (where: int, val: int) pre(0 <= where && where <= vec.ssize()) post(vec.ssize() == vec.ssize()$ + 1) = { vec.insert(vec.begin() + where, val); }
The postcondition compares the vector size at the end of the function call with an expression that captures the vector size at the start of the function call.
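To make the capture in the postcondition concrete, here is a hand-written Cpp1 approximation using assert. It is only a sketch: cppfront has its own contract machinery, the vector is passed as a parameter rather than being a global, and old_size is a hypothetical local that plays the role of vec.ssize()$.

```cpp
#include <cassert>
#include <vector>

// Approximates:
//   pre(0 <= where && where <= vec.ssize())
//   post(vec.ssize() == vec.ssize()$ + 1)
inline void insert_at(std::vector<int>& vec, int where, int val) {
    // Precondition: checked on entry.
    assert(0 <= where && where <= static_cast<int>(vec.size()));

    // The `$` capture: the size is recorded at the start of the call.
    const auto old_size = vec.size();

    vec.insert(vec.begin() + where, val);

    // Postcondition: compares the size at exit with the captured size.
    assert(vec.size() == old_size + 1);
}
```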
A single named return is useful to refer to a result in a postcondition:
f: () -> (ret: int) post(ret > 0) = { ret = 42; }
A function literal is declared like a named function, but omitting the leading identifier. Variables can be captured:
s: std::string = "Got: "; f := :(x) = { std::println(s$, x); }; f(5); f("str");
s$ means capture s by value. s&$* can be used to dereference the captured address of s.
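For comparison, capturing by value with s$ corresponds roughly to listing s in a Cpp1 lambda capture, and an untyped parameter corresponds to a generic (auto) parameter. The sketch below is a Cpp1 analogue under those assumptions; make_printer is a hypothetical name, and the text is returned rather than printed so it can be checked.

```cpp
#include <sstream>
#include <string>

// Cpp1 analogue of:
//   f := :(x) = { std::println(s$, x); };
// `[s]` copies `s` into the lambda, like the `s$` capture.
inline auto make_printer(std::string s) {
    return [s](const auto& x) {
        std::ostringstream out;
        out << s << x;      // works for any streamable x
        return out.str();
    };
}
```

Because the string is copied when the lambda is created, later changes to the original variable would not affect what the lambda produces.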
A template function declaration can have template parameters:
- functionTemplate:
  - identifier? : templateParameterList? parameterList returnSpec? requiresClause?
E.g. size: <T> (v: T) -> _ = v.length();
When a function parameter type is _, this implies a template with a corresponding type parameter.
A template function parameter can also be just identifier.
f: (x: _) = {} g: (x) = {} // same
print: (a0) = std::print(a0); print: (a0, args...) = { print(a0); print(", "); print(args...); } main: () = print(1, 2, 3);
type declares a user-defined type with data members and member functions. When the first parameter is this, it is an instance method.
myclass : type = { data: int = 42; more: std::string = std::to_string(42); // method print: (this) = { std::println("data: (data)$, more: (more)$"); } // non-const method inc: (inout this) = data++; } main: () = { x: myclass = (); x.print(); x.inc(); x.print(); }
Data members are private by default, whereas methods are public. Member declarations can be prefixed with private or public.
Official docs: https://github.com/hsutter/cppfront/wiki/Cpp2:-operator=,-this-&-that.
operator= with an out this first parameter is called for construction. When only one subsequent parameter is declared, assignment will also call this function.
operator=: (out this, i: int) = { this.data = i; } ... x: myclass = 99; x = 1;
With only one parameter move this, it is called to destroy the object:
operator=: (move this) = { std::println("destroying (data)$ and (more)$"); }
Objects are destroyed on last use, not end of scope.
base: type = { operator=: (out this, i: int) = {} } derived: type = { this: base = (5); // declare parent class & construct with `base(5)` }
- typeTemplate:
  - identifier? : templateParameterList? type requiresClause?
- templateParameterList:
  - < templateParameters >
- templateParameter
  - identifier ...? (: type)?
  - identifier : type
The first parameter form accepts a type.
The second parameter form accepts a value. To use a constant identifier as a template parameter, enclose it in parentheses:
f: <i: int> () -> _ = i; n: int == 5; ... std::println(f<(n)>());
n is a constant alias.
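A value template parameter fed by a constant alias looks roughly like this in Cpp1. The sketch is a hand-written analogue, not cppfront's generated code, and value_of is a hypothetical name for the function called f above.

```cpp
// Cpp1 analogue of:  f: <i: int> () -> _ = i;
template <int I>
constexpr auto value_of() {
    return I;
}

// Cpp1 analogue of the constant alias:  n: int == 5;
constexpr int n = 5;

// `n` is a compile-time constant, so it is usable as a template argument.
static_assert(value_of<n>() == 5, "n is usable as a template argument");
```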
- requiresClause:
  - requires constExpression
defaultValue: <T> () -> T requires std::regular<T> = { v: T = (); return v; } ... assert(defaultValue<int>() == 0);
Note: Using an inline concept for a type parameter is not supported yet.
- concept:
  - identifier : templateParameterList concept requiresClause? = constExpression ;
arithmetic: <T> concept = std::integral<T> || std::floating_point<T>; ... assert(arithmetic<i32>); assert(arithmetic<float>);
Aliases are defined using == rather than =.
- alias:
  - identifier : templateParameterList? type? == constExpression
  - identifier : templateParameterList? functionType == functionInitializer
  - identifier : templateParameterList? type == type
  - identifier : namespace == identifierExpression
The forms above are equivalent to the following Cpp1 declarations:
- constexpr variable
- constexpr function
- using type alias
- namespace alias
// constant template size: <T> size_t == sizeof(T); // compile-time function init: <T> () -> T == (); main: () = { static_assert(size<char> == 1); // constant aliases v := 5; //n :== v; // error, cannot read `v` at compile-time n :== 6; // OK myfunc :== main; static_assert(init<int>() == 0); view: type == std::string_view; N4: namespace == std::literals; }
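The four Cpp1 equivalents listed above can be written out by hand as follows. This is a sketch for comparison only and may differ from what cppfront emits; size_of is a hypothetical name chosen instead of size to avoid clashing with other declarations.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// constexpr variable (template)  ~  size: <T> size_t == sizeof(T);
template <typename T>
constexpr std::size_t size_of = sizeof(T);

// constexpr function (template)  ~  init: <T> () -> T == ();
template <typename T>
constexpr T init() {
    return T{};
}

// using type alias  ~  view: type == std::string_view;
using view = std::string_view;

// namespace alias  ~  N4: namespace == std::literals;
namespace N4 = std::literals;

static_assert(size_of<char> == 1, "sizeof(char) is 1 by definition");
static_assert(init<int>() == 0, "a value-initialized int is zero");
```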