Red Slime
Posted: Wed Mar 18, 2020 4:34 pm
As for the parser details, you certainly know better than me. The longest script I've ever written was 5 lines long and whatever I did was tailored to the reference material I've worked with.

I've added a function that removes all "none" nodes before attempting to compile because I realized I couldn't easily skip them at the last moment and also to support this syntax:
Code:
HSpeak> a(10
 .... > ,20
 .... > ,30
 .... > )
flow: do
  function: a
    number: 10
    number: 20
    number: 30

It makes things easier for joining lines. The consequence is that:
Code:
HSpeak> ,,,a,,,
flow: do
  value: a

But I think it's acceptable.
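
As a rough illustration of that pruning step, here is a minimal sketch. The `Node` class and the `strip_none` name are hypothetical stand-ins, not the compiler's actual code; the only thing taken from the post is the idea of dropping every "none" node before compiling.

```python
class Node:
    """Hypothetical stand-in for the compiler's AST node."""
    def __init__(self, type, children=None, leaf=None):
        self.type = type
        self.children = children
        self.leaf = leaf

def strip_none(node):
    """Recursively drop every child whose type is "none"."""
    if node.children:
        node.children = [strip_none(c) for c in node.children
                         if c.type != "none"]
    return node

# Roughly what ",,,a,,," might look like before pruning.
tree = Node("flow", [Node("none"), Node("value", None, "a"), Node("none")], "do")
strip_none(tree)
kinds = [(c.type, c.leaf) for c in tree.children]
```

After the call, only the `value: a` child is left under the `do` node, which matches the output shown above.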

For speed you may want to consider partial compilation. A hash can be created and stored for every script, and each script would be recompiled only when its hash changes.

As for Python performance in general, the only tool that works for me is Shed Skin. I don't know if it works with PLY, but it's similar to how the Euphoria compiler works, and it comes with some interesting examples.

In any case, I advise you against trying to optimize Python code for speed by hand. It actually does nothing and you lose the benefit of readability.
Metal King Slime
Posted: Thu Mar 19, 2020 1:27 am
I thought you were going to say that you removed none nodes in order to remove "variable" and "subscript", which can't be put in the hsz. They need to be removed so that the correct number of children is known for every node in compile_recurse.

I think a variant of your previous rule for concatenating lines, checking whether the last char on the previous line or the first char of the new line is a comma, could handle that better, and I think it's good enough to be a permanent solution.
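
A minimal sketch of such a joining rule, assuming it means "merge when either side of the line break is a comma". The function name is hypothetical, and a line holding only ")" would still need separate handling.

```python
def join_continuations(lines):
    """Merge a line into the previous one when the previous line ends
    with a comma or the new line starts with one."""
    merged = []
    for line in lines:
        text = line.strip()
        if merged and (merged[-1].endswith(",") or text.startswith(",")):
            merged[-1] += text
        else:
            merged.append(text)
    return merged

joined = join_continuations(["a(10,", "20", ",30)", "b"])
```

Here the first three input lines collapse into `a(10,20,30)` and `b` stays on its own line.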

Yes, I was actually considering (in HSpeak) hashing the source code of each script and only recompiling the ones that changed. The tricky part is that it also needs to know which globals and constants and other scripts the script depends upon. Far too much trouble to be worth it. Oh... but in hspeak_parse, it's parsing/lexing that's slow, compiling to hsz is fast. And it's only compiling to hsz that depends on constants/globals/scripts, so a cache of lexer/parser output could be implemented really easily (this is Python after all) and have a big impact.
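
A rough sketch of what such a parser-output cache could look like, using only the standard library. `ParseCache` and `fake_parse` are hypothetical names; the real lexer/parser call would take the place of `fake_parse`, and a persistent version would write the pickles to disk instead of a dict.

```python
import hashlib
import pickle

class ParseCache:
    """Caches parser output, keyed by a hash of the script source."""
    def __init__(self):
        self._store = {}   # hex digest -> pickled parse result
        self.misses = 0

    def get_ast(self, source, parse):
        key = hashlib.sha256(source.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = pickle.dumps(parse(source))
        # Unpickle a fresh copy so callers can't mutate the cached tree.
        return pickle.loads(self._store[key])

def fake_parse(source):
    # Hypothetical placeholder for the slow lexing/parsing step.
    return source.split()

cache = ParseCache()
first = cache.get_ast("show text box (1)", fake_parse)
second = cache.get_ast("show text box (1)", fake_parse)
```

Because the cached value depends only on the source text, it stays valid when constants, globals, or other scripts change; only the fast hsz step has to be redone.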

I have plenty of experience trying to optimise Python, and I agree that it typically doesn't work. My tool of choice is Cython, but I'm not going to attempt that. I noticed that this compiler actually runs 3x slower under PyPy than CPython! PyPy has pathological performance when doing syscalls, including file I/O, so I tried loading all the files into memory... that barely helped. I guess this proves that PLY is highly optimised, for CPython, as it's pessimised for PyPy.
Red Slime
Posted: Thu Mar 19, 2020 6:48 am
Quote:
removed none nodes in order to remove "variable" and "subscript"

I forgot about that. I knew the .hsz sizes don't match between compilers. Maybe that's the reason?

If you want to try some Python optimizations, you can get this. I wrote it a while ago. It loads Allegro 4 via ctypes and draws random 3D trees. Click and drag to change orientation and arrow keys to move. You can go crazy with cython on the matrix/vector classes.

----

In hspeak_parse I've split the "none" tag into "void" to indicate statements that are forbidden in arithmetic and "empty" to indicate an empty element.
In hspeak_ast I've added an AST_post function. For now it just interprets and removes "variable" calls but it could be used for more later.

----

I've made some changes to make it Python 2 compatible and I've written enough of the .hs generation code to be able to import scripts into the engine and make it execute something.

It shows the main menu in baconthulhu, then several errors about slices I don't know about. I was also able to compile and run my own script that uses strings and the keyboard.

I'm still missing "commands.bin" generation. For now I'm copying it from the original compiler. I have no idea what a "nonlocal" is and it also needs checks and re-factoring. But it works.
Metal King Slime
Posted: Sun Mar 22, 2020 3:48 pm
Oh, wow.
(You really should start making new posts instead of editing your previous post, I missed that update)
commands.bin is only used for certain error messages and is optional.

A bit surprised you went for Python 2 compatibility, but not unwelcome. I did just backport DWRL's Python 3 fixes to Python 2, but that was mostly because I think scons is normally installed as a Python 2 package (anyone who has compiled the OHRRPGCE successfully has it installed that way).

Nonlocal variable nodes are used when a subscript accesses variables from one of the scripts it's nested inside. (It's not possible to access variables in inner scripts; however, it is possible to call a sibling subscript, or a sibling of the parent/grandparent/etc.)
See https://rpg.hamsterrepublic.com/ohrrpgce/HSZ#Kind_8:_Nonlocal_Variable and https://rpg.hamsterrepublic.com/ohrrpgce/HSZ#Variable_IDs
Admittedly the first section is very hard to understand, but the second has an example.

I fixed a bunch of bugs:
-return, continue, and break must have exactly one child (the script interpreter doesn't check this, so you get various strange errors due to reading garbage)
-the first child to 'for' must be a variable ID ('reference' symbols), not an expression (there's still no error checking for invalidly using a script or global @ reference)
-variable IDs for local variables were numbered wrong (see the link above if you care)
-math functions like 'random', 'not', 'sqrt' didn't compile

After that, I had to replace "end" with ")" in "place stairs up"/"place stairs down" to get past those scripts.
That makes Baconthulhu run as far as calling "around x", which fails because that contains a 'switch' block. So I stopped there.
Red Slime
Posted: Sun Mar 22, 2020 4:34 pm
Python2 compatibility is just because I could. Did you try my amazing demo game? I'm very proud of it and what's in the script source is pretty much what works currently.

I also need to add support for string literals. Since we're still in this weird quarantine I'll read all your suggestions and I'll update this post later.

----

Ok. I think I got most of your bugfixes in. I'll double check later but it's definitely working much better now. After this I'll be doing some re-factoring. Link is the same as usual.
Metal King Slime
Posted: Mon Mar 23, 2020 4:38 pm
Oh, yes, very nice.

I think the biggest missing feature is default arguments for function and script calls (and argument count checking).

I'm amused that string_ref can be only a constant, not an integer literal, which is by far the most common case. Actually, it should be an arbitrary expression. Replacing 'string_ref' with 'expression' works beautifully except that it generates a conflict with "void : reference '=' expression" which breaks parsing of 'script' and 'plotscript'. But that rule for 'void' should not really exist anyway, rather, there would be a rule for 'script' and 'plotscript' header lines.

Really, when AST_state.build() is called from different contexts (outside any script, inside a script, defineconstant, etc) it should accept different grammars. But PLY only seems to let you set the start token when you build the parser. Should there be multiple parsers? After all, almost no grammar is shared between the different contexts except for commas separating tokens.

But keeping everything as a single parser/grammar could maybe be achieved by prefixing the input with a character telling the context, e.g.
Code:

def p_root_toplevel(p):
    """
    root : '*' script_header
         | '*' expression_list
    """
    AST_state.root = p[2]

def p_root_script(p):
    "root : '(' expression_list"
    AST_state.root = AST_node("flow", p[2], "do")

def p_root_define_block(p):
    "root : '<' expression_list"
    AST_state.root = p[2]

def p_script_header(p):
    "script_header : name_concat ',' arg_list"
    p[0] = [p[1]] + p[3]

def p_arg_list1(p):
    """
    arg_list : reference
             | default_value
    """
    p[0] = [p[1]]

def p_arg_list2(p):
    """
    arg_list : arg_list ',' reference
             | arg_list ',' default_value
    """
    p[0] = p[1] + [p[3]]

def p_default_value(p):
    "default_value : reference '=' expression"
    p[0] = AST_node("value", [p[3]], p[1].leaf)

I almost have this working...


"return" without a condition isn't allowed; it would be pointless because "return" does not exit the script, unlike any other language with a "return" statement. In HS "exit returning" returns a value and exits (I've been meaning to add "exit" which is truly equivalent to "return" in other languages, with an optional return value. Trivial but I haven't.)

Any reason you removed/omitted support for the 4-argument form of 'for'?

"true" in HS is 1, not -1 (it's FB that uses -1). But I see plotscr.hsd will override those builtin definitions of true/false.

It had never really occurred to me how many things there are that HSpeak supports that aren't needed for the majority of scripts but add a lot of complexity. Stuff like definescript, defineoperator and plotscrversion; @obsolete; assert and tracevalue...
Red Slime
Posted: Mon Mar 23, 2020 6:10 pm
Quote:
default arguments for function and script calls

That's already implemented. I think.
Quote:
argument count checking

That's not implemented. Most error reporting aside from syntax errors is not implemented. And even then any error is just ignored. It's not a bug, it's an entirely new concept in programming. The idea is that you assert your superiority to the machine by ignoring its complaints.
Quote:
string_ref can be only a constant, not an integer literal

I've just fixed it. Poorly.
Quote:
Should there be multiple parsers?

You can add more parsers. My intent was to have a rough line oriented, preprocessor like parser and a proper one.
Quote:
Any reason you removed/omitted support for the 4-argument form of 'for'?

No reason except for the fact that I'm not sure where I want to go. I think this may be a good time to fork the project if you want to retain or improve compatibility with existing scripts.

There is a lot that can be done with this framework to make the language more familiar for users. C-like operators, local variables introduced by assignment, string literals as arguments, arrays, all sorts of experimental features.

I don't know. For now I'm still in a phase where I'm surprised it does anything at all.

----

In a general sense there's a hard limit to how much of the behavior of a hand-written parser you can emulate with a machine-written one.
On the other hand, machine-written parsers have been so popular for so long that people assume a computer language is bound by their rules.
Metal King Slime
Posted: Tue Mar 24, 2020 1:28 am
It's nice that it continues on an error, because otherwise you wouldn't get anything useful out of it.
Reading about doing error reporting with PLY.

I guess you're referring to the fact that default arguments are written to scripts.txt, but the script compiler needs to add defaults for missing arguments to script, function, and math operator calls itself; the interpreter doesn't do so. (Math ops: "increment"/"decrement" can be called with a single argument. Very uncommon, but certain people like to use that form)

In fact, the script interpreter doesn't even use the default args in scripts.txt even when it should (when a plotscript is triggered), and that's a bug.


Quote:
You can add more parsers. My intent was to have a rough line oriented, preprocessor like parser and a proper one.

Right, that'll be a far cleaner solution.

Quote:
No reason except for the fact that I'm not sure where I want to go. I think this may be a good time to fork the project if you want to retain or improve compatibility with existing scripts.

There is a lot that can be done with this framework to make the language more familiar for users. C-like operators, local variables introduced by assignment, string literals as arguments, arrays, all sorts of experimental features.


So you're planning to do more than just re-factoring, then?

I want to add most of that stuff to HS/HSpeak anyway.
Introducing variables by assignment (with special syntax like "x <- 2" or "var x := 2") is something I'm considering, I find "variable" annoyingly verbose. And some alternative/new operator spellings, like binary - instead of -- if possible (argh), unary -, != instead of <> and %% as an alternative to ,mod, which is absolutely awful. %% rather than % because I want to reserve % for units, eg 4% == 0.04. Incidentally I want to add other units too, like "walk hero(me, left, 40px)", or "2 tiles" or "wait(0.3s)". They'll compile to e.g. "px(40)". That would be very nice.

I do have a HSpeak branch where I added object.member and array[index] syntax and string literals, which compiled down to kludges like readglobal instead of real objects/types, arrays and strings. But I want to add the real things, after I get past these graphics features.

So if you want to work on these sorts of things, but still diverge from HS, it might be better to have two fork points, for HS-current and HS-future.
Red Slime
Posted: Tue Mar 24, 2020 10:38 am
Quote:
It's nice that it continues on an error

I've told you errors are overrated.
Quote:
the script compiler needs to add defaults for missing arguments to script, function, and math operator calls itself

I suspected that. But I'm keeping track of the values, so it shouldn't be hard to do together with argument count checking in some AST_post function.
Quote:
So you're planning to do more than just re-factoring, then?

We're still not allowed to go outside for another week at least. I have no idea what I'm going to do in the future and I don't want to think about it.

----

There's one thing worth noting that I did yesterday. I wasn't happy with how the "gen" and "hs" modules were attached to the "ast" module, so now I've attached them to "tld" instead. That solved most of the circular dependency issues.

There's still a reason to let "parse" be attached to "ast" the way it is, but that has to do with how PLY works. All other imports are now just regular imports.

----

I've implemented default values and check for number of arguments for function and script calls. There's still no error reporting, but I do padding. Also, it was harder than expected, but I've got the engine to segfault reliably.
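
The padding step can be sketched roughly like this. `pad_arguments` is a hypothetical helper, not the actual code, and it assumes defaults are stored as a plain list with one entry per parameter.

```python
def pad_arguments(args, defaults):
    """Return args extended with the trailing defaults for any
    arguments the caller left out."""
    if len(args) > len(defaults):
        raise ValueError("too many arguments")
    return list(args) + list(defaults[len(args):])

padded = pad_arguments([5], [0, 1, 2])
```

With one argument supplied out of three, the last two defaults are appended; raising on extra arguments is where real error reporting would hook in.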

An important change I've made is that I finally had it with using simple lists in the AST_state dictionaries and so I've introduced the AST_call_signature class which is meant to store what's needed to assemble a function or a script call.

After this and unless I forgot some other major feature, the general structure should remain stable for a while.

----

While implementing these latest changes I've noticed that sometimes functions are defined with constants as default values but some of those constants are defined after they're referenced.

After some panic, I've considered doing a third compiler pass, but then I've found another solution. Instead of trying to resolve the list of default arguments into a list of integers at the time of the function definition, I attach the relevant piece of the AST to the call signature and I leave it there unresolved.

Later in AST_post_3 I recall it and I merge it with the piece of AST that comes with the function call. From there the rest of the compiler does the resolution as it normally would. Since the function and constant definitions are read during pass 1, but the compiler operates during pass 2, they are resolved independent of the ordering.

The code that does this is not horribly complicated, but I thought it was worth describing. For scripts I do almost the same thing except for an extra step where I swap the parameters with their default values or with a zero.
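
To make the ordering argument concrete, here is a heavily simplified sketch. `CallSignature`, `resolve` and `build_call` are hypothetical, and plain strings stand in for the unresolved AST fragments; the point is only that lookups happen in the second pass.

```python
class CallSignature:
    """Hypothetical simplification of AST_call_signature."""
    def __init__(self, name, default_exprs):
        self.name = name
        # Left unresolved: either an int or the name of a constant.
        self.default_exprs = default_exprs

constants = {}
signatures = {}

# Pass 1: definitions are recorded in source order. The default below
# names a constant that is only defined on the following line.
signatures["demo func"] = CallSignature("demo func", ["demo const"])
constants["demo const"] = 42

def resolve(expr):
    """Pass 2: constants are looked up only now, when all are known."""
    return constants[expr] if isinstance(expr, str) else expr

def build_call(name, args):
    sig = signatures[name]
    merged = list(args) + sig.default_exprs[len(args):]
    return [resolve(e) for e in merged]

call_args = build_call("demo func", [])
```

Since nothing is resolved while pass 1 is still collecting definitions, it doesn't matter that the constant appears after the function that uses it.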
Metal King Slime
Posted: Wed Mar 25, 2020 2:41 pm
Yes, I mentioned that HSpeak does 3 passes, first for all the defineconstants and also definetrigger, second to process other top-level blocks once the constants and the names "script" and "plotscript" are known, and third to compile scripts after all script call signatures are known. But as you said, doing it that way isn't necessary provided no constants are used for "number of arguments" in definefunction. Your solution looks pretty clean to me.

A few functions like srunscriptbyid allow a variable number of arguments, which is indicated with n_args == -1 and no default values (it's too bad definefunction doesn't allow specifying a min or max number of args; I want to replace it at some point).
To accommodate that, you just need to add "if p_func.n_args == -1: continue" to AST_post_3().
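
In isolation, that check might look like the sketch below. `argument_count_ok` is a hypothetical helper; it assumes that supplying fewer arguments than n_args is fine because the missing ones get padded with defaults.

```python
VARIADIC = -1  # n_args value marking a variable-argument function

def argument_count_ok(n_args, supplied):
    """Variadic functions accept any count; otherwise the caller may
    pass at most n_args arguments (missing ones get defaults)."""
    if n_args == VARIADIC:
        return True
    return supplied <= n_args

ok_variadic = argument_count_ok(VARIADIC, 12)
ok_too_many = argument_count_ok(2, 3)
```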

Quote:
Also, it was harder than expected, but I've got the engine to segfault reliably.

I don't know what you saw, but when I tried compiling and running baconthulhu.hss I got "encountered clean noop" (meaning it encountered the noop() function, which has id=0), which I traced back to random being miscompiled again. See fix in patch below.

I also noticed that ", and," etc binops weren't space- and case-insensitive, but the following generalisations of the regexps fix that.

Code:
--- a/hspeak_gen.py
+++ b/hspeak_gen.py
@@ -83,7 +83,7 @@ def kind_and_id(node):
         # compatibility
         if node.leaf in binop_table:
-            return KIND_FUNCTION, binop_table[node.leaf]
+            return KIND_MATH, binop_table[node.leaf]
         if node.leaf in unop_table:
-            return KIND_FUNCTION, unop_table[node.leaf]
+            return KIND_MATH, unop_table[node.leaf]
         if node.leaf in flow_table:
             return KIND_FLOW, flow_table[node.leaf]
@@ -91,6 +91,7 @@ def kind_and_id(node):
     if node.type == 'binop':
 
-        if node.leaf in binop_table:
-            return KIND_MATH, binop_table[node.leaf]
+        canonical = node.leaf.upper().replace(' ', '')
+        if canonical in binop_table:
+            return KIND_MATH, binop_table[canonical]
 
     if node.type == 'unop':
--- a/hspeak_parse.py
+++ b/hspeak_parse.py
@@ -46,4 +46,4 @@ t_LT_EQUAL = r'<='
 t_GT_EQUAL = r'>='
-t_BITWISE_AND = r',AND,'
-t_BITWISE_OR = r',OR,'
+t_BITWISE_AND = r'(?i),\s*AND\s*,'
+t_BITWISE_OR = r'(?i),\s*OR\s*,'
 t_BOOL_AND = r'&&'
@@ -55,5 +55,5 @@ t_LESS_THAN = r'<<'
 t_GREATER_THAN = r'>>'
-t_BITWISE_XOR = r',XOR,'
+t_BITWISE_XOR = r'(?i),\s*XOR\s*,'
 t_BOOL_XOR = r'\^\^'
-t_REMINDER = r',MOD,'
+t_REMINDER = r'(?i),\s*MOD\s*,'
 
 
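
The effect of those two changes can be checked outside PLY with the standard `re` module. This is a small sketch, not part of the patch; `canonical_binop` is a hypothetical helper that strips all whitespace (tabs included), whereas the patch only removes spaces.

```python
import re

# Same pattern as the patched t_BITWISE_AND token.
BITWISE_AND = re.compile(r'(?i),\s*AND\s*,')

def canonical_binop(text):
    """Drop all whitespace and upper-case, so ', and ,' -> ',AND,'."""
    return "".join(text.split()).upper()

match = BITWISE_AND.search("a , and , b")
key = canonical_binop(match.group(0))
```

The canonical form is what gets looked up in binop_table, so the table itself only needs the upper-case, space-free spellings.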
Red Slime
Posted: Wed Mar 25, 2020 4:07 pm
I hope I got your patch in correctly this time.

Going forward I want to make this version (same link) final. It is close to what I had in my mind when I started and I think we've made something really interesting for people who like to play this computer language game.

Feel free to fork and re-license it however you want. It's our stuff. Sorry if you forked it before yesterday's marathon, but the call signature/default argument issue took me by surprise.

I think I want to do something else for a while now, but I'm pleased with how far we were able to go.
Metal King Slime
Posted: Thu Mar 26, 2020 2:53 pm
OK. Thank you hugely for your work. How do you want to be credited, just "Lenny"?

I'll commit it to the OHRRPGCE SVN repo (downloading and uploading the source is really not fun, why did we do that?) and continue working on it there, now that I know I won't be conflicting with your changes. You can get an svn account by asking James. I'll make the license dual-licensed as BSD or the OHRRPGCE license (currently GPL). In the absence of a name, I might pick 'physpeak'.

I'm currently working on implementing 'switch' and am sure I'll shortly manage to get the Baconthulhu scripts working 100%. (451 out of 498 scripts compile currently, and they all look good to me.) After that I'll try to get it to pass all the test cases, and have other improvements in mind too, and of course as a remote REPL for Game. Check back in a couple days.

This may yet become HSpeak v4. I'm considering the practicality of it, and also looked into possible alternatives to PLY (e.g. SLY, PlyPlus, PyBison, Lark) to see what our options are/whether there's any advantage. It does look pretty feasible to switch if there's a good reason, since they're so similar.
Red Slime
Posted: Thu Mar 26, 2020 5:35 pm
Code:
just "Lenny"?

That would be fine.
Code:
get the Baconthulhu scripts working

I did it by converting all switch blocks to if/else blocks by hand. The only feature you can't work around is $msg + 10 = "". It generates the dungeon, the player moves, the menu works. There are glitches but it's almost playable.
Code:
SLY, PlyPlus, PyBison, Lark

The reason I made it Python 2 compatible at some point was to try to compile the parser with ShedSkin. It didn't work. Lark looks interesting. I don't know. In my search for a parser generator I stopped at PLY because it was advertised as being close to flex/bison but without all the setup required for a C project.
Code:
Check back in a couple days.

I'll be around unless the Russian army decides to invade my home, which these days is a possibility.

----

I've found a serious bug in AST_post_3. It's supposed to start like this:
Code:
def AST_post_3(node):

    if node.children:

        for child in node.children:

            # Only calls and bare names can refer to a script or function.
            if child.type not in ("function", "value"):
                continue

            # Find the call signature; scripts are checked before functions.
            if child.leaf in AST_state.scripts:
                p_func = AST_state.scripts[child.leaf]
            elif child.leaf in AST_state.functions:
                p_func = AST_state.functions[child.leaf]
            else:
                continue

The rest of the function is ok. I really don't know what happened there. I'm sure it was working at some point, then I must have hit undo one time too many.

----

It didn't occur to me initially but if you do this:
Code:
def p_string_op_1(p):
    "void : '$' string_ref '=' string_val"
    p[0] = AST_node("function", [p[2], p[4]], "setstringfromtable")
   
def p_string_op_2(p):
    "void : '$' string_ref '+' string_val"
    p[0] = AST_node("function", [p[2], p[4]], "appendstringfromtable")
   
def p_string_op_3(p):
    "void : '$' string_ref '+' expression"
    p[0] = AST_node("function", [p[2], p[4]], "concatenatestrings")

You can get rid of the "$a + b" vs. "a $+ b" syntax. It works and looks decent in code. Something like:
Code:
$msg = "Drank a "
get item name(tmp str, item)
$msg + tmp str
$msg + ". ("
append number(msg, inventory(item))
$msg + " left)"


----

I have a playable version of Baconthulhu here if you want to try it. I guess it's a demo for the new compiler and also a proposal for some language changes.
Metal King Slime
Posted: Sun Mar 29, 2020 8:29 am
Very nice.
Yes, I suppose that's a pretty reasonable alternative to $+ and $=... however, is that going to prevent $msg+i="" from parsing?

A git mirror is here.
We've diverged, but I want to merge some of your changes back in, particularly the file reorganisation.

I added better error reporting and made the repl usable by importing readline. I can't stand repls that don't use at least readline. But I'd like to switch to something more powerful, maybe prompt_toolkit, and add suggestions, argument hinting, and more.

I've implemented 'switch', both old and new syntaxes, and I think it's compiled correctly but haven't tested it.

I've discovered the joy of writing LALR(1) grammars. Adding switch caused a lot of seemingly unrelated shift/reduce and even a reduce/reduce conflict to crop up. Spent a lot of time today wrapping my head around it and finally properly reading through the PLY docs.
In the quest to fix those I've removed the 'empty' symbol (not yet committed to svn) and am trying to find a better solution for handling newlines and commas (which removing 'empty' broke). Instead of 'empty' nodes, the difference between "x" and "x()" is indicated by .children being None or [].
Red Slime
Posted: Sun Mar 29, 2020 10:01 am
I'm so sorry about breaking compatibility, but I just had to re-organize the operators and get rid of "--". I don't think it's going to be too hard to add some compatibility back later.
Quote:
particularly the file reorganisation

I thought you would have liked to include it in your PyGTK tool at some point.
Quote:
is that going to prevent $msg+i="" from parsing?

Yes. I've made it $(msg + i) = "" for expressions. It may be possible to do better by establishing the right priorities but I haven't put much effort into it.
Quote:
I've implemented 'switch'

I've had a long debate with myself about it. I won't tell you the whole story, but I was considering doing constant propagation and having it would have gotten in the way.
Quote:
children being None or []

In respect to what the PLY manual suggests, I've made some abuse of the fact that in Python both None and [] evaluate to False. So I haven't been too careful to differentiate them.
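
For anyone following along, the difference only shows up with an explicit identity test. A tiny sketch, where `describe` is a hypothetical helper and not code from either compiler:

```python
def describe(children):
    """Tell apart the two 'empty' cases that truthiness lumps together."""
    if children is None:
        return "bare name"        # written as: x
    if children == []:
        return "call, no args"    # written as: x()
    return "call with args"

# Both values are falsy, so "if node.children:" can't distinguish them.
both_falsy = (not None) and (not [])
```

Code that only uses `if node.children:` keeps working either way; code that needs to know whether parentheses were written has to use `is None`.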

----

I was aware of a long-standing bug: I couldn't find a negate function in the VM and, knowing it was never used anyway by the original compiler, I had it temporarily aliased to the unary "not". So I've taken a page from your manual and did this:
Code:
def p_unop(p):
    "expression : '-' expression %prec UMINUS"
    if p[2].type == "number":
        p[0] = AST_node("number", None, -p[2].leaf)
    else:
        p[0] = AST_node("binop", [p[2], AST_node("number", None, -1)], "*")

It's for stuff like:
Code:
HSpeak> -a(b)
function: do
  binop: *
    function: a
      value: b
    number: -1