


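It has been several years now since .NET started to change. .NET Core has reached 2.1 and has matured considerably, but there is one problem when using it: out of the box, you cannot save code generated with expression trees and the like to an assembly file. On the .NET Framework you can save it by specifying an option when you create the assembly, but .NET Core provides no such option. That is where the library named in the title of this post comes in. As you can tell from the fact that it lives in the System namespace, it may ship with Core itself in the future. For now, you install it via NuGet.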

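  1. First things first: create a PEHeaderBuilder. Basically, passing the original values through as they are should be fine.
  2. Copy the IL. This part is a bit tricky. It might work without any changes, but I transform it somewhat, using the Roslyn code as a reference. Don't forget to pick up Offset from MethodBodyStreamEncoder.MethodBody at this point. If you don't record it, you won't be able to compute the RVA (Relative Virtual Address) later.
  3. Create the MetadataBuilder. This step covers everything from instantiating MetadataBuilder to pushing the information for each table into it. This part alone should come to nearly 200 lines.
  4. Create a MetadataRootBuilder from the MetadataBuilder.
  5. After that, hand these to a ManagedPEBuilder, Serialize it into a BlobBuilder, and finally write it to a Stream with WriteContentTo. That should complete it.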
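
Why the offset from step 2 matters can be sketched independently of .NET. The following Python snippet is not the System.Reflection.Metadata API; ILStream, add_method_body and rva_of are hypothetical names for a toy stream that returns the offset of each appended method body. The 4-byte alignment and the base address 0x2050 are assumptions chosen for illustration, and method headers are left out for brevity.

```python
class ILStream:
    """Toy stand-in for a method-body stream (hypothetical, not the .NET API)."""

    def __init__(self, alignment=4):
        self.data = bytearray()
        self.alignment = alignment  # assumed alignment of each method body

    def add_method_body(self, il_bytes):
        # Pad so that the new body starts on an aligned boundary.
        padding = -len(self.data) % self.alignment
        self.data.extend(b"\x00" * padding)
        offset = len(self.data)  # this is the value that has to be recorded
        self.data.extend(il_bytes)
        return offset


def rva_of(body_offset, il_stream_rva):
    # RVA = address at which the stream is mapped + offset inside the stream.
    return il_stream_rva + body_offset


stream = ILStream()
offsets = {
    "Main": stream.add_method_body(bytes([0x00, 0x2A])),    # nop; ret
    "Helper": stream.add_method_body(bytes([0x17, 0x2A])),  # ldc.i4.1; ret
}

IL_STREAM_RVA = 0x2050  # hypothetical base address, for illustration only
rvas = {name: rva_of(offset, IL_STREAM_RVA) for name, offset in offsets.items()}
print({name: hex(rva) for name, rva in rvas.items()})
```

Without the recorded offsets there is nothing to add to the base address once the final layout is known, which is why losing them makes the RVA impossible to compute.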





To create a programming language, you need to understand and build four broad stages: the lexer, the parser, the analyzer, and the code generator. I'm not sure whether "analyzer" is a widely used name, though.

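Then, analyzing the meaning of the statements and expressions that have been built, and working out what they intend, is the job of the next stage, the analyzer. Taking the text of this blog as an example, it would be interpreted as "hello" = a greeting, "one month" = a period, "about" = a qualifier on the range of that period, and so on, and the meaning (semantics; I doubt it is called that in the natural-language field, though) is built up from that.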
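
To make the four stages concrete, here is a minimal sketch in Python for a toy language that only knows integers, +, * and parentheses. Everything in it (lex, parse, analyze, generate, run, the tuple-based tree and the stack-machine instructions) is made up for illustration and has nothing to do with how any real compiler is organized internally.

```python
import re

INT_MAX = 2**31 - 1


def lex(source):
    """Lexer: split the source text into (kind, value) tokens."""
    tokens = []
    for text in re.findall(r"\d+|\S", source):
        if text.isdigit():
            tokens.append(("int", int(text)))
        elif text in "+*()":
            tokens.append(("op", text))
        else:
            raise SyntaxError(f"unexpected character {text!r}")
    return tokens


def parse(tokens):
    """Parser: build a nested-tuple syntax tree by recursive descent."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("eof", None)

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():
        kind, value = advance()
        if kind == "int":
            return ("int", value)
        if (kind, value) == ("op", "("):
            node = expr()
            if advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    def term():
        node = factor()
        while peek() == ("op", "*"):
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek() == ("op", "+"):
            advance()
            node = ("+", node, term())
        return node

    tree = expr()
    if peek()[0] != "eof":
        raise SyntaxError("trailing input")
    return tree


def analyze(node):
    """Analyzer: check the meaning; here, every literal must fit a 32-bit int."""
    if node[0] == "int":
        if node[1] > INT_MAX:
            raise TypeError(f"{node[1]} does not fit in an int")
        return "int"
    analyze(node[1])
    analyze(node[2])
    return "int"


def generate(node, code=None):
    """Generator: emit instructions for a toy stack machine (post-order walk)."""
    code = [] if code is None else code
    if node[0] == "int":
        code.append(("push", node[1]))
    else:
        generate(node[1], code)
        generate(node[2], code)
        code.append(("add",) if node[0] == "+" else ("mul",))
    return code


def run(code):
    """Tiny interpreter for the generated instructions."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if instr[0] == "add" else left * right)
    return stack.pop()


tree = parse(lex("1 + 2 * 3"))
analyze(tree)
code = generate(tree)
print(code)
print(run(code))  # 7
```

Each stage consumes only the output of the previous one, which is what makes it possible to understand and build them one at a time.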


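So, how was it? Do you feel like creating a programming language yourself now? Creating a language mostly means writing a program that runs on the command line, so the hurdle may be high for those who are used to GUI-centric programming such as the Web. But once the language itself takes shape and you get to implementing user-assistance features such as code completion, GUI programming comes into it too. Of course, if you're building a practical language you probably won't keep going without some kind of goal, but implementing a subset of C just for fun might be amusing too ^ - ^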


My original programming language, Expresso -- The import statement and interoperability with other .NET languages

Hello, again. This is HAZAMA. And yet another blog post about Expresso.

In a previous blog post, I said that you can now use .NET as the standard library. I have since expanded the specification further, and Expresso now supports interoperability with other .NET languages via assemblies. I also revised the specification of the import statement, so I'll cover that as well.
Let's say we have the following C# source code:

// In TestInterface.cs
using System;
using System.Collections.Generic;

namespace InteroperabilityTest
{
    public interface TestInterface
    {
        void DoSomething();
        int GetSomeInt();
        List<int> GetIntList();
    }
}

// In InteroperabilityTest.cs
using System;
using System.Collections.Generic;

namespace InteroperabilityTest
{
    public class InteroperabilityTest : TestInterface
    {
        public void DoSomething()
        {
            Console.WriteLine("Hello from 'DoSomething'");
        }

        public List<int> GetIntList()
        {
            Console.WriteLine("GetIntList called");
            return new List<int>{1, 2, 3, 4, 5};
        }

        public int GetSomeInt()
        {
            Console.WriteLine("GetSomeInt called");
            return 100;
        }
    }
}

// In StaticTest.cs
using System;
using Expresso.Runtime.Builtins;

namespace InteroperabilityTest
{
    public class StaticTest
    {
        public static void DoSomething()
        {
            Console.WriteLine("Hello from StaticTest.DoSomething");
        }

        public static bool GetSomeBool()
        {
            Console.WriteLine("GetSomeBool called");
            return true;
        }

        public static ExpressoIntegerSequence GetSomeIntSeq()
        {
            Console.WriteLine("GetSomeIntSeq called");
            return new ExpressoIntegerSequence(1, 10, 1, true);
        }
    }
}

Assume that the DLL containing the above code is named InteroperabilityTest.dll, and that we write the following Expresso code:

module main;

import InteroperabilityTest.{InteroperabilityTest, StaticTest} from "./InteroperabilityTest.dll" as {InteroperabilityTest, StaticTest};

def main()
{
    let t = InteroperabilityTest{};
    let i = t.GetSomeInt();
    let list = t.GetIntList();

    let flag = StaticTest.GetSomeBool();
    let seq = StaticTest.GetSomeIntSeq();

    println(i, list, flag, seq);
}

Then we'll see the Console.WriteLine output, followed by 100, [1, 2, 3, 4, 5, ...], true and [1..11:1], on the console. As you can see, although an FFI (Foreign Function Interface) generally lets you call only functions, here you can create instances and even call instance methods. Of course, that's because both sides run on the same runtime environment, the CLR. Likewise, with the compiled DLL, you can create instances and call methods from other .NET languages such as C#. We take full advantage of the .NET environment, huh?
Added on Apr. 7 2018: You could already get properties, and setting them is now supported as well.
With that, we are almost at complete interoperability with other .NET languages. Added on Apr. 8 2018: You can now refer to enums defined in IL code. That means we have achieved complete interoperability with other .NET languages, I suppose.
C# can interoperate with C++, so we could interoperate with C++ from Expresso if we wrap it in C#.

Did you notice that the import statement has changed? In the previous post, the import clause took a string, but now it looks more like the ones in Python or maybe Rust (Rust calls them use statements, though). Written in EBNF, the import statement looks something like the following:

"import" ident [ "::" ( ident | '{' ident { ',' ident } '}' ) ] { '.' ( ident | '{' ident { ',' ident } '}' ) } [ "from" string_literal ] "as" ( ident | '{' ident { ',' ident } '}' ) ';'

This means that, unlike in Python, you can't omit the as clause. In Expresso, you have to alias imported names. Otherwise you would have to refer to them with names containing "::" or ".", and Expresso doesn't allow that. You use "::" when you refer to a type that belongs to a module. This is also true for other expressions.
On the other hand, when you refer to a type in an external assembly written in C#, you also specify its namespace. In addition, when doing so you can't import variables and functions directly. This is because IL code doesn't allow you to define variables or functions on assemblies.
The file name in the from clause is relative to the source file that contains the import statement. That means that, in the example above, InteroperabilityTest.dll is located in the same directory as the source file. This rule also applies when you refer to .exs source files.
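
The two rules above, the mandatory as clause and the path being relative to the importing file, can be illustrated with a small Python sketch. This is not the Expresso parser: IMPORT, names and parse_import are hypothetical helpers, the regular expression follows the EBNF above only loosely, the requirement of exactly one alias per imported name is my assumption, and /work/src/main.exs is a made-up path.

```python
import re
from pathlib import PurePosixPath

IDENT = r"[A-Za-z_]\w*"
# Either a single identifier or a brace-enclosed, comma-separated list of them.
GROUP = rf"(?:{IDENT}|\{{\s*{IDENT}(?:\s*,\s*{IDENT})*\s*\}})"
IMPORT = re.compile(
    rf"import\s+(?P<target>{IDENT}(?:::{GROUP})?(?:\.{GROUP})*)"
    rf'(?:\s+from\s+"(?P<file>[^"]*)")?'
    rf"\s+as\s+(?P<alias>{GROUP})\s*;"
)


def names(group):
    """Split `A` or `{A, B}` into a list of identifiers."""
    return re.findall(IDENT, group)


def parse_import(statement, source_file):
    match = IMPORT.fullmatch(statement.strip())
    if match is None:
        raise SyntaxError("invalid import statement (the as clause is mandatory)")
    # The last path segment holds the names that are actually imported.
    imported = names(re.split(r"::|\.", match["target"])[-1])
    aliases = names(match["alias"])
    if len(imported) != len(aliases):
        raise SyntaxError("each imported name needs exactly one alias")
    file = match["file"]
    # The file in the from clause is resolved against the importing file's directory.
    resolved = None if file is None else PurePosixPath(source_file).parent / file
    return dict(zip(aliases, imported)), resolved


statement = (
    'import InteroperabilityTest.{InteroperabilityTest, StaticTest} '
    'from "./InteroperabilityTest.dll" as {InteroperabilityTest, StaticTest};'
)
aliases, dll = parse_import(statement, "/work/src/main.exs")
print(aliases)
print(dll)
```

Dropping the as clause from the statement makes parse_import raise a SyntaxError, which mirrors the rule that imported names always have to be aliased.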

Now that we can interoperate with C#, I'm expecting to write the Expresso compiler itself in Expresso.




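Running it prints 100, [1, 2, 3, 4, 5, ...], true and [1..11:1], along with the Console.WriteLine output. As you can see, an FFI (Foreign Function Interface) normally lets you call only functions, but here you can also create instances and call instance methods. That's the benefit of sharing a runtime environment. Also, once the code is compiled to IL, you can call it from other languages such as C# using reflection. It is truly an "invincible" state. The strength of having adopted .NET really shows.
Added on 2018/4/7: Getting properties was already possible, but today I implemented setting as well, so properties are now fully supported. All that remains for complete compatibility is making enums usable, so we are one step away. Added on 2018/4/8: IL enums can now be referenced too. This should mean we have complete compatibility.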


My original programming language, Expresso -- The intseq type

Hi, this is HAZAMA. This is the 5th blog post about Expresso this month.

Even though I wasn't planning to write blog posts about specific types, I realized an interesting fact about one of them, so I'm writing this post.
Namely, the intseq type is a built-in type in Expresso, and it is similar to the xrange type in Python or the Range type in Rust. Put simply, it is a generator that produces integers in sequence, so the basic functionality is the same as that of the similar types in other programming languages. Although a range in Kotlin can be tested for inclusion with the in operator, an intseq in Expresso can't. in is a keyword in Expresso as well, but it is used only in for loops, to introduce the sequence on the right-hand side.

module main;

def main()
{
    for let i in 0..10 {    // start..end:step is a ternary operator. When step is omitted, it defaults to 1.
    }

    // If you set step to a negative integer, the sequence goes in the negative direction.
    // Note that we don't check whether it is valid. (Added on Apr. 5 2018: The compiler now checks this when the expression consists only of literals. In other words, it issues a warning if you write 0...9:-1 instead.)
    for let j in 9...0:-1 {
    }

    for let k in 0..10:3 {  // Of course, you can set step to more than 1.
    }

    // I wish we could write this as (0..10).select(|x| x);. Since Apr. 5 2018, it can also be written as [0..10, ...];
    let a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...];
    let a2 = a[2...4];
}

The above code prints [2, 3, 4] as expected (the actual output will be something like [List<int>, 2...4:1], because it prints the result of calling ToString on the slice). The type of a2 is a slice, which is also a built-in type. Rust has a slice type as well and also lets us write it like that, but as far as I know few other programming languages do, so I consider it a strong point of Expresso.
In Kotlin, ranges seem to be defined for floats as well (a sequence of floats doesn't make sense to me, so I may have this wrong), but in Expresso you can create them for ints only. Note that it might raise an exception when you try to create a range outside the range of int. (Added on Apr. 5 2018: Actually, the compiler complains, because a number outside the int range is interpreted as a double and the intseq expression doesn't accept a double.)
In addition, the range operator is one of the two ternary operators in Expresso. The other ternary operator is the conditional operator( ? : ).
Added on Apr. 5 2018: As shown in the comment on the initialization of a, you can now initialize a sequence with an intseq. Other languages such as Rust, Kotlin and Swift don't support this syntax, so as far as I know it's unique to Expresso. You can also create an array with [0..10]; and, of course, with a step: [0..10:2];.
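
For readers who want to play with these semantics, here is a minimal Python model of an intseq. It is a sketch under assumptions, not the actual ExpressoIntegerSequence: I assume that `..` excludes the end and `...` includes it (the examples in this post and the [1..11:1] printed for a sequence created with an inclusive upper bound of 10 suggest this), and IntSeq, is_consistent and slice_of are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntSeq:
    start: int
    end: int
    step: int = 1
    inclusive: bool = False  # assumed: `...` includes the end, `..` excludes it

    def __iter__(self):
        if self.step == 0:
            raise ValueError("step must not be 0")
        stop = self.end
        if self.inclusive:
            # Move the stop one past the end, in the direction of travel.
            stop += 1 if self.step > 0 else -1
        return iter(range(self.start, stop, self.step))

    def is_consistent(self):
        """Mimics the literal-only check: does step point from start towards end?"""
        return self.step != 0 and (self.end - self.start) * self.step >= 0


def slice_of(items, seq):
    """Pick the elements of items at the indices produced by seq."""
    return [items[i] for i in seq]


print(list(IntSeq(0, 10)))                        # 0..10
print(list(IntSeq(9, 0, -1, inclusive=True)))     # 9...0:-1
print(list(IntSeq(0, 10, 3)))                     # 0..10:3
print(IntSeq(0, 9, -1, inclusive=True).is_consistent())  # 0...9:-1 would warn

a = list(IntSeq(0, 10))                           # [0..10, ...]
print(slice_of(a, IntSeq(2, 4, inclusive=True)))  # a[2...4]
```

Unlike this model, which copies the elements, the a2 in the Expresso example is a slice value; the sketch only shows which indices are selected.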




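module main;

def main()
{
    for let i in 0..10 {    // start..end:step is a ternary operator. When step is omitted, it defaults to 1.
    }

    // If you make step negative, the sequence can also go in the negative direction.
    // Note that consistency isn't checked. (Added on 2018/4/5: Consistency is now checked, but only when the values are given as literals. Writing 0...9:-1 produces a warning. Nothing is checked when variables and the like are used.)
    for let j in 9...0:-1 {
    }

    for let k in 0..10:3 {  // Of course, step can be more than 1.
    }

    // In the future I'd like to be able to write this as (0..10).select(|x| x); or similar. (Added on 2018/4/5: It can now be written as [0..10, ...];)
    let a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...];
    let a2 = a[2...4];
}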

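The code above prints [2, 3, 4] as expected (well, since it's the result of ToString() on the slice type, it actually looks something like [List<int>, 2...4:1]). The type of a2 is the slice type, which is also built in. Rust has a slice type too, but no other language lets you write it this way (correction: Rust does allow this notation after all), so that is the point worth noting.
In Kotlin, ranges seem to be defined for floats as well (a sequence of floats doesn't seem to make sense to me, so maybe I'm imagining it), but in Expresso they exist for int only. Values outside that range will probably blow up (only the type is checked, so an out-of-range value can't be turned into an explicit error and the conversion to int should fail). (Added on 2018/4/5: This was wrong. In fact, a value outside the int range is interpreted as a double, and the compiler complains that intseq doesn't accept a double.)
Incidentally, the range operator is one of the two ternary operators in Expresso. The other is the well-known conditional operator ( ? : ).
Added on 2018/4/5: As the comment on the declaration of a says, you can now initialize a sequence with an intseq. Surprisingly, Rust, Kotlin, Swift and others haven't adopted this notation, so as far as I have observed it is a feature unique to Expresso. Writing [0..10]; gives you an array instead. Of course, you can also specify a step, as in [0..10:2];. It's something like a limited form of list comprehension.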


My original programming language, Expresso -- About functions

Hi, this is HAZAMA. This is yet another blog entry about Expresso this month. We'll be talking a bit about the grammar this time around.

Functions are one of the most common language constructs, and Expresso, of course, has them. Specifically, Expresso has module-level functions and methods in class definitions. They are implemented as static methods in module classes and as ordinary methods in class definitions, respectively.
module main;

def test()
{
    let a = 10;
    return a + 10;
}

def test2(n (- int)
{
    return n + 10;
}

def test3(n (- int) -> int
{
    return n + 20;
}

def test4(n (- int) -> int
{
    if n >= 100 {
        return n;
    } else {
        return test4(n + 10);
    }
}

def main()
{
    let a = test();
    let b = test2(20);
    let c = test3(20);
    let d = test4(80);

    println(a, b, c, d);
}

This code snippet comes from the test code again. Return types of Expresso functions are inferred from the body, but I'm wondering whether to change that (in the code above, all the functions are inferred to return an int). In Kotlin, the return type of a function with a block body can be left out only when it is Unit (void), and I haven't decided whether to make Expresso behave the same way.
Parameter types are inferred only when the parameters have default values (there are none in the code above). In order, the functions in the code above:

  1. takes no parameters, and the return type is implicit.
  2. takes a parameter, and the return type is implicit.
  3. takes a parameter, and the return type is explicit.
  4. takes a parameter, and the return type is explicit and it contains a recursive call.
I guess there are no special differences from functions in other programming languages.
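
To give a feel for what inferring a return type from the body involves, here is a minimal Python sketch. It is not how the Expresso compiler does it: the tuple-based expressions, expr_type, infer_return_type and the rule of ignoring recursive calls whose type isn't known yet are all assumptions made for illustration.

```python
def expr_type(expr, functions, in_progress):
    """Return the type name of an expression, or None if it is not known yet."""
    kind = expr[0]
    if kind == "int":    # integer literal
        return "int"
    if kind == "var":    # variable or parameter whose type is already known
        return expr[1]
    if kind == "call":   # call to another (or the same) function
        return infer_return_type(expr[1], functions, in_progress)
    if kind == "add":    # both operands must agree
        known = {expr_type(e, functions, in_progress) for e in expr[1:]} - {None}
        if len(known) > 1:
            raise TypeError("operands of + have different types")
        return known.pop() if known else None
    raise ValueError(f"unknown expression kind {kind!r}")


def infer_return_type(name, functions, in_progress=()):
    declared, returns = functions[name]
    if declared is not None:
        return declared  # an explicit annotation always wins
    if name in in_progress:
        return None      # recursive call: nothing to learn from it
    types = {expr_type(e, functions, in_progress + (name,)) for e in returns}
    types.discard(None)
    if len(types) != 1:
        raise TypeError(f"cannot infer the return type of {name}")
    return types.pop()


N = ("var", "int")  # the parameter n, annotated as int
functions = {
    # name: (declared return type or None, expressions of the return statements)
    "test": (None, [("add", ("var", "int"), ("int", 10))]),
    "test2": (None, [("add", N, ("int", 10))]),
    "test3": ("int", [("add", N, ("int", 20))]),
    "test4": ("int", [N, ("call", "test4")]),
    "test4_unannotated": (None, [N, ("call", "test4_unannotated")]),
}
for name in functions:
    print(name, "->", infer_return_type(name, functions))
```

In this model a recursive function can still be inferred as long as at least one return statement doesn't go through the recursion; a function that only ever returns a call to itself is rejected.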
As a side note, although the parser can parse variadic parameters, the compiler doesn't emit any code for them yet, so you can't use them. I added them for the print* functions, but I can't come up with many other use cases, so I haven't decided whether to implement them. Maybe I should, because doing so isn't that hard. Likewise, I haven't decided whether to implement so-called string interpolation. If it is implemented, it will replace the printFormat function. This is a difficult problem. (Added on Apr. 8 2018: String interpolation has been implemented and the printFormat function has been removed.)