Indent tool for HSpeak


lennyhome
Slime Knight
Posts: 115
Joined: Fri Feb 14, 2020 6:07 am

Post by lennyhome »

I'm trying to favor simplicity over coverage for now. If something silly like joining multiple lines with a ',' and pretending it's a list works 99% of the time, then why not? The way I'm currently stripping comments is also wrong. The language is never going to be exactly the same, but it could be made better in such a way that porting existing scripts would be trivial.

I've just done some support for "else if" right now, like so:

Code: Select all

HSpeak> if (a) then (b) else if (c) then (d)
root
  if
    group
      val: a
    val: b
    if
      group
        val: c
      val: d
And I've introduced the same issue that exists for Python tuples:

Code: Select all

>>> (1)
1
>>> (1,)
(1,)
Where 1-length tuples collapse to values. And because Python does that, that's a feature, not a bug.
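
For readers following along, the collapsing behaviour described above can be sketched in a few lines. This is only an illustration, not the code from the actual parser: the helper name make_group and the tuple-based node layout are assumptions made for the example.

```python
def make_group(items):
    """Build a group node from a list of child nodes.

    Hypothetical helper: a group with exactly one child collapses to
    that child, mirroring how Python evaluates (1) to 1.
    """
    if len(items) == 1:
        return items[0]
    return ("group", *items)


print(make_group([("val", "a")]))
print(make_group([("val", "a"), ("val", "b")]))
```

A trailing-comma form like Python's (1,) would need its own grammar rule if 1-length groups ever had to be preserved.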

I've put some BSD license in it, just to be sure, but feel free to choose or do whatever you like.
1k lines per second?
It's ok fast. Consider that I'm working on a Raspberry Pi 3.

Finally, you may have heard of stuff going on in Italy due to that zombie virus. Internet is ok for now and we're probably going to make it. Just... If I start to turn... Do whatever you need to do.

----

Code: Select all

expression_group : '(' ',' ')'
I don't think that does anything. I've put it there to deal with blank lines somewhat, but then I chose to skip them in the pre-parser. I wanted to allow this:

Code: Select all

HSpeak> (a,,b)
root
  group
    val: a
    none
    val: b
And I was thinking of a way to specify some sort of NULL. Maybe? There's still a lot to do to resolve the fine details.
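
As a rough picture of what the (a,,b) example above produces, here is a flat splitter that turns empty slots into "none" entries. It is a sketch only: split_group is a made-up name, it ignores nesting, and the real parser does this through grammar rules rather than string splitting.

```python
def split_group(text):
    """Split a flat '(a,,b)' group into nodes, mapping empty slots to none.

    Hypothetical helper for illustration; nested parentheses are not handled.
    """
    text = text.strip()
    if len(text) < 2 or not (text.startswith("(") and text.endswith(")")):
        raise ValueError("expected a parenthesised group")
    items = []
    for part in text[1:-1].split(","):
        name = part.strip()
        items.append(("val", name) if name else ("none",))
    return items


print(split_group("(a,,b)"))
```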
"do" block not following a for/while/switch
I've just added it.

Don't worry too much about:

Code: Select all

WARNING: 110 shift/reduce conflicts
I'll solve some of those, but they just mean that some rules of the same length depend on the order in which they're defined instead of being resolved explicitly via the priority table.

----

Since by law I'm only allowed to program and ****post on 4chan, I wanted to type a little bit more and correct myself. The order in which the rules are defined doesn't matter. In case of a conflict, the rules that match are those that form the longest chain possible and then are reduced right to left, so:

Code: Select all

HSpeak> if (a) then b + 1
root
  if
    group
      val: a
    binop: +
      val: b
      number: 1
That's the default behavior. Unfortunately (b + 1) is a useless statement. If you set a priority higher than + for THEN, this happens:

Code: Select all

HSpeak> if (a) then b + 1
root
  binop: +
    if
      group
        val: a
      val: b
    number: 1
Which is interesting because it makes it possible to use IF statements in the middle of expressions, like (a ? b : c) in C.
Anyway. I've explicitly disabled that behavior in the interest of my sanity. I just wanted to let you know it's possible.
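
The effect of the THEN priority on the two trees above can be reproduced with a tiny hand-written precedence parser. Note that this is a Pratt-style sketch, not the LALR machinery PLY generates, and the names (parse, then_bp, PLUS_BP) and the binding-power numbers are assumptions chosen for the illustration. The "group" wrapper around the condition is also dropped for brevity.

```python
import re

PLUS_BP = 10  # assumed binding power of '+'


def parse(src, then_bp):
    """Parse 'if (cond) then expr' where THEN has binding power then_bp."""
    toks = re.findall(r"[A-Za-z_]\w*|\d+|[()+]", src)
    pos = 0

    def peek():
        return toks[pos] if pos < len(toks) else None

    def take(expected=None):
        nonlocal pos
        if pos >= len(toks) or (expected is not None and toks[pos] != expected):
            raise SyntaxError(f"unexpected token at position {pos}")
        pos += 1
        return toks[pos - 1]

    def primary():
        tok = take()
        if tok == "(":
            node = expr(0)
            take(")")
            return node
        if tok == "if":
            cond = primary()
            take("then")
            # The body only absorbs operators that bind tighter than THEN.
            return ("if", cond, expr(then_bp))
        if tok.isdigit():
            return ("number", int(tok))
        return ("val", tok)

    def expr(min_bp):
        left = primary()
        while peek() == "+" and PLUS_BP > min_bp:
            take("+")
            left = ("binop", "+", left, expr(PLUS_BP))
        return left

    tree = expr(0)
    if pos != len(toks):
        raise SyntaxError("trailing tokens")
    return tree


print(parse("if (a) then b + 1", then_bp=0))   # '+' ends up inside the IF body
print(parse("if (a) then b + 1", then_bp=20))  # IF becomes the left operand of '+'
```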
Last edited by lennyhome on Thu Mar 12, 2020 6:47 pm, edited 3 times in total.
TMC
Metal King Slime
Posts: 4308
Joined: Sun Apr 10, 2011 9:19 am

Post by TMC »

Wow. So now you have little to do but play around with this. Until you run out of food. Please don't turn.

I'm in favour of adding the ternary operator to HS, but I don't know what syntax to use; maybe iif(condition, a, b) like FB uses.

I see you added a 'none' symbol, something that returns nothing. But this seems to cause some problems. For example I don't know why you have the second rule here:
none : IF expression_group THEN expression_group ELSE expression_group
| IF expression_group THEN expression_group ELSE none
Which allows stuff like "if(1) then(2) else x:=4"

.....however, I've been thinking about making the parentheses after many statements like 'if', 'then', etc. optional. I think it should be possible, while breaking few existing scripts, to allow

Code: Select all

    # I think the comma will be needed, but none after )
if wallbits, and, northwall, then do something, else something else(1)
# Maybe comma after ) could be optional:
if keypress(anykey) then presses += 1, else wait(1)
Maybe this is a can of worms (switch statements?); I haven't thought it through.

I'm playing around with the grammar now to see whether I can get that.

More importantly, I'm going to work on the Game<->REPL IPC.

(BTW, "x:=y", "x+=y", "x-=y", "$x="y"" and "$x+"y"" are all expressions that return a value. Sadly, in HSpeak :=/+=/-= are still left-associative. Which can be considered a bug. (And fixing it won't break any scripts))

Regarding the following...
statement : expression_list

expression_list : expression_group
| expression
| none
My guess is that expression_list : none is just for the purpose of typing statements like 'if' at the REPL, not serious?

As for tuples... I like destructuring assignments, and possibly also optional arguments at positions other than the end (like x(1, , 2), which is the intention you had for that?), so those could be useful.
lennyhome
Slime Knight
Posts: 115
Joined: Fri Feb 14, 2020 6:07 am

Post by lennyhome »

The reason for "none" is confusing at the moment. It allows empty tokens in lists:

Code: Select all

HSpeak> , (,,), ()
And I've used it as return value for IFs and assignments because "none" is allowed in lists but disallowed in arithmetic. That's its meaning.

The:

Code: Select all

none: IF expression_group THEN expression_group ELSE none
is there for this:

Code: Select all

HSpeak> if (a) then (b) else if (c) then (d) ...
It's an experiment.

You can make parentheses optional in the condition part but I didn't do it because of the ",AND," operator. Should it be aliased to "&&" using the lexer?

You can make more parentheses optional as long as it's not ambiguous with the priority of "THEN" and "ELSE" as binary operators. That's tricky, as I showed you earlier. You can also make more things, like single statements, chainable without parentheses.

The PLY manual says that in case of shift/reduce conflicts you should open "parser.out" and assume caffeine. If you're going to attempt stuff like the above, you're going to need it. Search for "conflict" in that file and it'll give you some indications.

I did what appeared to be strictly necessary in the interest of readability but at least now I understand why so many people make so many computer languages: this parser business is a game in itself.
Bob the Hamster
Lord of the Slimes
Posts: 7658
Joined: Tue Oct 16, 2007 2:34 pm
Location: Hamster Republic (Ontario Enclave)

Post by Bob the Hamster »

I like the iif(condition, trueval, falseval) syntax for ternaries.

I am wary of making parentheses optional. Is it a big benefit? I regret many of the optional vaguenesses already in HamsterSpeak.

Be careful with ,and, it is not the same as &&

,and, is bitwise
&& is logical

The same with ,or, and ||
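
To see why aliasing one to the other would change script behaviour, here is the difference expressed in Python terms. The function names are made up for the example, and the assumption that the logical form yields 1 or 0 is mine.

```python
def bitwise_and(a, b):
    """What ,and, does: combine the bits of both operands."""
    return a & b


def logical_and(a, b):
    """What && does: true only if both operands are non-zero (assumed to yield 1 or 0)."""
    return 1 if (a != 0 and b != 0) else 0


# 2 is 0b10 and 1 is 0b01: both are "true", but they share no bits.
print(bitwise_and(2, 1))  # 0
print(logical_and(2, 1))  # 1
```

So a condition like "if (a, and, b)" can be false even when both values are non-zero, which a silent alias to && would change.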
lennyhome
Slime Knight
Posts: 115
Joined: Fri Feb 14, 2020 6:07 am

Post by lennyhome »

Wow. So now you have little to do but play around with this. Until you run out of food.
Yes. I love my country and I love the massive damage it's doing to us to save a handful of extremely old people.
I am wary of making parentheses optional.
The good thing about this framework we're using is that it can predict without ambiguity if something is going to be ambiguous. It takes care of the known unknowns, the unknown unknowns and allows people and fish to coexist.
Be careful with ,and, it is not the same as &&
I've fixed it. I think.
Bob the Hamster
Lord of the Slimes
Posts: 7658
Joined: Tue Oct 16, 2007 2:34 pm
Location: Hamster Republic (Ontario Enclave)

Post by Bob the Hamster »

lennyhome wrote: The good thing about this framework we're using is that it can predict without ambiguity if something is going to be ambiguous. It takes care of the known unknowns, the unknown unknowns and allows people and fish to coexist.
Mermaids?
Okay, I'll trust y'all on this one if it means mermaids ;)
lennyhome wrote:
Be careful with ,and, it is not the same as &&
I've fixed it. I think.
Excellent!
TMC
Metal King Slime
Posts: 4308
Joined: Sun Apr 10, 2011 9:19 am

Post by TMC »

Ah yes, the extra commas due to and/or/mod may make it trickier to get the grammar right, but I would think it should still be possible. The change I'm suggesting is basically to disallow identifiers from starting with "if ", "then ", "else ", including the spaces. An alternative would be "if: ", "then: " etc.

Whether making the brackets optional is a good idea depends on whether it makes it easier or harder to write scripts. It's not just less typing: it may be easier or harder to read and may cause more confusing compiler error messages, but also (a primary motivation) having fewer brackets in your scripts means there's less chance of mismatching them. Even stronger, it could be required that the whole statement/expression following if/then/else be on the same line, so if you do mismatch brackets, e.g.:

Code: Select all

if flag, then set hero direction(me, npc direction(npc)
then you could get an error there.

But that syntax is so different that maybe it's too crazy an idea.

It's very common that people misplace brackets because the typical HamsterSpeak script has terrible formatting with crazy placement of brackets and newlines. Users always complain that HSpeak doesn't help them find the misplaced brackets. The fact that HSpeak splits compiling over so many passes means that it finds mismatched brackets very early - it only counts brackets to begin with, it doesn't actually pair them up! It would be possible to print out a detailed multi-line message showing how the brackets are paired (which is the only possible help I can think of) but HSpeak would need a little restructuring.
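
As a sketch of the "show how the brackets are paired" idea, a stack is enough to pair them up and report the position of every bracket left over. This is not how HSpeak works today; pair_brackets is a hypothetical helper, and it deliberately ignores comments and string literals.

```python
def pair_brackets(source):
    """Pair up ( and ) and report unmatched ones with line and column.

    Illustrative only: comments and string literals are not skipped.
    Returns (pairs, errors), where pairs holds ((line, col), (line, col)).
    """
    stack = []   # positions of '(' still waiting for a ')'
    pairs = []
    errors = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == "(":
                stack.append((lineno, col))
            elif ch == ")":
                if stack:
                    pairs.append((stack.pop(), (lineno, col)))
                else:
                    errors.append(f"line {lineno}, col {col}: ')' has no matching '('")
    for lineno, col in stack:
        errors.append(f"line {lineno}, col {col}: '(' is never closed")
    return pairs, errors


pairs, errors = pair_brackets("if flag, then set hero direction(me, npc direction(npc)")
for message in errors:
    print(message)
```

On the mismatched example from earlier in this post it reports the first "(" as never closed, which is at least a starting point for the user.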

I would try out and see how it behaves in practice... but this isn't important; if I'm going to change syntax I should be working on stuff like arrays, member access, floats, etc.
Last edited by TMC on Sat Mar 14, 2020 12:52 pm, edited 4 times in total.
lennyhome
Slime Knight
Posts: 115
Joined: Fri Feb 14, 2020 6:07 am

Post by lennyhome »

Whether making the brackets optional
I'm not going to do it, but it's also really not a concern.

In a general sense, the syntax I'm implementing is close to the original, but not the same. For example I can't deal with:

Code: Select all

if (a) then (b)
else (c)
Because of my Go-inspired automatic comma insertion, it results in a logical error due to the default clause in switch statements having the same syntax. I'll let somebody with higher IQ than me fix that. This however:

Code: Select all

if (a) then (
... b ...
) else (
... c ...
)
Is perfectly fine. And there are also many more subtle differences. Consider that I'm taking no code from the original compiler, rather I'm implementing features as I see them actually used.
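
To illustrate why the two layouts above behave differently, here is a guess at what a Go-style comma insertion pass could look like. The exact rules used by the pre-parser are not stated in this thread, so everything here (the name insert_commas and the rule "no comma after a line ending in '(' or ','") is an assumption.

```python
def insert_commas(source):
    """Join lines into one, adding a ',' after each line that looks complete.

    Assumed rule: a line ending in '(' or ',' continues onto the next
    line; every other non-blank line gets a trailing comma.
    """
    out = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines are skipped
        if stripped.endswith(("(", ",")):
            out.append(stripped)
        else:
            out.append(stripped + ",")
    return " ".join(out)


print(insert_commas("if (a) then (b)\nelse (c)"))
print(insert_commas("if (a) then (\nb\n) else (\nc\n)"))
```

Under that rule the first layout puts a comma between the "then" block and the "else", so the "else" starts a new statement, while the second keeps ") else (" on one line and stays attached.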

An important change is that I'm now doing 2-pass compilation. I have no idea if it's necessary but I wanted to do it and it came at very little cost. Scripts can now forward-reference constants, globals and each other.
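
For anyone wondering what the second pass buys, the idea in miniature: the first pass only records which names exist, and the second resolves references, so a script can call one that is defined later in the file. The data layout and the name compile_scripts are invented for this sketch.

```python
def compile_scripts(scripts):
    """Resolve calls between scripts in two passes.

    scripts maps a script name to the list of script names it calls
    (a made-up layout for illustration).
    """
    # Pass 1: assign an ID to every declared script, in definition order.
    ids = {name: index for index, name in enumerate(scripts)}

    # Pass 2: every name is known now, so forward references resolve.
    resolved = {}
    for name, calls in scripts.items():
        missing = [callee for callee in calls if callee not in ids]
        if missing:
            raise NameError(f"{name}: undefined script(s) {missing}")
        resolved[name] = [ids[callee] for callee in calls]
    return resolved


# "main" calls "helper" before "helper" is defined.
print(compile_scripts({"main": ["helper"], "helper": []}))
```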

----

Is this anywhere close to being right?

Code: Select all

HSpeak> if (a) then (b) else if (c) then (d) else if (e) then (f) else (g)
flow: return
  flow: if
    value: a
    flow: then
      value: b
    flow: if
      value: c
      flow: then
        value: d
      flow: if
        value: e
        flow: then
          value: f
        flow: else
          value: g
toHSZ: 270 bytes
If it is, then I'm only missing string support. Not quite, but... I've had to make a lot of changes and refactoring. Sorry about that.
Last edited by lennyhome on Sun Mar 15, 2020 2:15 pm, edited 6 times in total.
TMC
Metal King Slime
Posts: 4308
Joined: Sun Apr 10, 2011 9:19 am

Post by TMC »

Two passes, great; that is necessary.

Don't worry, your changes don't really conflict with me.

Well, I see you edited your post with a much better AST for the "else if".
"if" nodes (in hsz) always need 3 children. As generated by HSpeak, the 2nd and 3rd are always "then" and "else", so "if (a) then (b) else if (c) then (d) else (e)" becomes "if(a, then(b), else(if(c, then(d), else(e))))". However, looking at the interpreter source, I think the way you're compiling it, omitting the "else", will also work fine and is more efficient. I never thought of doing that.
Last edited by TMC on Sun Mar 15, 2020 3:01 pm, edited 1 time in total.
lennyhome
Slime Knight
Posts: 115
Joined: Fri Feb 14, 2020 6:07 am

Post by lennyhome »

I've determined that the parentheses in IF/WHILE conditions aren't optional. It has to do with name concatenation. Either you disallow reserved words in variable names altogether or you need the parentheses.

I've solved the last two remaining shift/reduce conflicts. I introduced them when, for whatever reason, I attempted to add support for nested lists a while ago. They weren't dangerous anyway, but nested lists aren't needed at the moment.

When you get a chance to look at "baconthulhu.txt" I think you'll be pleased. Most of the remaining errors are due to string support, more than one "#" on a line or use of begin/end inside scripts, which are all features I don't support at the moment.
omitting the "else"
I wasn't sure about that. I think I can do it either way.
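
The two layouts discussed above, with and without the explicit "else" wrapper around a nested "if", can be put side by side like this. The tuple node layout and both function names are assumptions for illustration, not code from either compiler.

```python
def hspeak_style(branches, final):
    """Every 'if' gets exactly (cond, then(...), else(...)) as children."""
    else_children = [final]
    node = None
    # Build from the innermost branch outwards.
    for cond, body in reversed(branches):
        node = ("if", cond, ("then", body), ("else", *else_children))
        else_children = [node]
    return node


def compact_style(branches, final):
    """A nested 'if' takes the third slot directly; only the last branch has an else."""
    tail = ("else", final)
    for cond, body in reversed(branches):
        tail = ("if", cond, ("then", body), tail)
    return tail


chain = [("a", "b"), ("c", "d")]
print(hspeak_style(chain, "e"))
print(compact_style(chain, "e"))
```

Either way each "if" node still has three children; the compact form just saves one "else" node per chained branch.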
TMC
Metal King Slime
Posts: 4308
Joined: Sun Apr 10, 2011 9:19 am

Post by TMC »

Yes, the parsed AST for baconthulhu.hss (and plotscr.hsd) looks great, as far as it works! And the grammar is a lot more straightforward and makes more sense to me.
I see you've been working on hspeak_gen.py too.

BTW, I just noticed you replaced the original lines from compile_recurse:

Code: Select all

         if kind in (KIND_FLOW, KIND_MATH, KIND_FUNCTION, KIND_SCRIPT):
             cmddata.append(len(self.children))
with

Code: Select all

    if not node._children:
        cmddata.append(0)
        return
But that's not right; only those four kinds should be followed by a child count. (Right now it only adds bloat but will still run, but it will be a problem in a couple weeks when I finally merge a branch which adds debug info to the .hsz format)
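
To make the rule concrete, here is a stripped-down emitter where only the four kinds listed above are followed by a child count. Everything in it is hypothetical: the kind codes are placeholders, the Node class is invented, and children are written inline for simplicity, which may not match how the real .hsz format lays them out.

```python
# Placeholder kind codes for illustration only; not taken from the .hsz format.
KIND_NUMBER, KIND_FLOW, KIND_MATH, KIND_FUNCTION, KIND_SCRIPT = range(5)
KINDS_WITH_CHILD_COUNT = (KIND_FLOW, KIND_MATH, KIND_FUNCTION, KIND_SCRIPT)


class Node:
    """Minimal stand-in for an AST node."""

    def __init__(self, kind, ident, children=()):
        self.kind = kind
        self.ident = ident
        self.children = list(children)


def compile_node(node, cmddata):
    """Append kind and id, then a child count only for kinds that take one."""
    cmddata.append(node.kind)
    cmddata.append(node.ident)
    if node.kind in KINDS_WITH_CHILD_COUNT:
        cmddata.append(len(node.children))
        for child in node.children:
            compile_node(child, cmddata)
    return cmddata


# A number is just [kind, value]; a call with no arguments still gets a 0 count.
print(compile_node(Node(KIND_NUMBER, 7), []))
print(compile_node(Node(KIND_FUNCTION, 3), []))
```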

Also BTW, $x="y" is not the same as the $= operator (which takes a string ID as the second arg instead of a string literal). They both compile to a function call (KIND_FUNCTION) instead of KIND_MATH like other operators. The $= and $+ operators are rarely used.
lennyhome wrote:I've determined that the parenthesis in IF/WHILE conditions aren't optional. Has to do with name concatenation. It's either you disallow reserved words at all in variable names or you need the parenthesis.
Not so! The reason removing the parentheses didn't work is because... you're not actually doing any name concatenation. t_NAME accepts spaces and then removes them, so apparently p_name_concat_1/2/3 are never used.

After removing spaces from t_NAME (although I think it's quite preferable that it does handle them), I only had to define condition like so:

Code: Select all

def p_condition(p):
    """condition : expression
                 | expression ','"""
    p[0] = p[1]
(This does produce some shift/reduce conflicts though; I haven't investigated.)
Result:

Code: Select all

HSpeak> while true, do ()
function: do
  flow: while
    value: true
    flow: do
      none
HSpeak> if key is pressed (key:x) then (wait(1))
function: do
  flow: if
    function: keyispressed
      value: key:x
    flow: then
      function: wait
        number: 1
HSpeak> if (1), then (2)
function: do
  flow: if
    number: 1
    flow: then
      number: 2
HSpeak> if 1+2 then()
function: do
  flow: if
    binop: +
      number: 1
      number: 2
    flow: then
      none
Would need another symbol to make brackets for 'then' optional too (don't want to make them optional for 'block').

This is fun.
Last edited by TMC on Mon Mar 16, 2020 5:09 pm, edited 8 times in total.
lennyhome
Slime Knight
Posts: 115
Joined: Fri Feb 14, 2020 6:07 am

Post by lennyhome »

To make it actually work I'm going to need help but so far I'm happy the code is legible and a good platform for experiments.

The reason I let the lexer handle some spaces in names is because of the "do se do" script. As for the additional parsing rules, they're all used to handle combinations of numbers, separators, keywords and non-keywords.
you replaced the original lines from compile_recurse
My bad. I've fixed it. I think.
This is fun.
Years ago I did something with flex/bison in C but I never got so in depth into it. It's totally a game in itself.
TMC
Metal King Slime
Posts: 4308
Joined: Sun Apr 10, 2011 9:19 am

Post by TMC »

Ah, an example of a script name starting with "do ". Actually, I think that's going to be far more common than identifiers starting with any other reserved word, so I guess "do" could be brackets-mandatory. Or use "do:" syntax instead... it's not python syntax, but it's pythonish!

I see what the other name concat rules are for, but you see that they weren't used?
I'd like to compare the speed difference between the name concat parser rules vs doing it all in t_NAME.

Shall I write the rest of the code for writing out .hs files (that is, writing the other lumps and the .hsz string table)?
Last edited by TMC on Tue Mar 17, 2020 3:39 pm, edited 1 time in total.
lennyhome
Slime Knight
Posts: 115
Joined: Fri Feb 14, 2020 6:07 am

Post by lennyhome »

Shall I write the rest of the code for writing out .hs files
At your leisure. Whatever and whenever you decide to do I'll integrate it.

----

Code: Select all

HSpeak> a : do : a
--> Used rule p_name_concat_4
--> Used rule p_name_concat_1
--> Used rule p_name_concat_1
--> Used rule p_name_concat_1
--> Used rule p_name_concat_3

Code: Select all

HSpeak> a : 2
--> Used rule p_name_concat_4
--> Used rule p_name_concat_1
--> Used rule p_name_concat_2
Those rules are all used even when you allow spaces in names in the lexer.
Last edited by lennyhome on Tue Mar 17, 2020 7:06 pm, edited 1 time in total.
TMC
Metal King Slime
Posts: 4308
Joined: Sun Apr 10, 2011 9:19 am

Post by TMC »

Oh, right, I forgot that t_NAME doesn't handle :, but it can:

Code: Select all

def t_NAME(t):
    r'[a-zA-Z_]([a-zA-Z0-9_ ]|:(?!=))*'
    t.value = t.value.replace(' ', '').lower()
    t.type = reserved.get(t.value, 'NAME')
    return t
I see name_concat also handles . for file extensions in include lines, but this will have to be removed when we start using . for member access (. is already disallowed in identifiers in HSpeak). On include lines it's also necessary to preserve whitespace and case and allow all other characters in general too, like - ( and ). "include" in HS is quite a mess; there are two allowed syntaxes: the file name can optionally be enclosed in quotes. HSpeak treats include lines as a special case and doesn't run them through the lexer unless the filename is quoted.
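
The t_NAME pattern above can be tried outside of PLY with just the re module. The wrapper lex_name below is a made-up helper for the demonstration; the regular expression itself is the one from the rule, and the point is that the (?!=) lookahead lets ":" be part of a name without swallowing the ":" of ":=".

```python
import re

# Same pattern as the t_NAME rule: letters, digits, '_' and spaces,
# plus ':' as long as it is not the start of ':='.
NAME_RE = re.compile(r'[a-zA-Z_]([a-zA-Z0-9_ ]|:(?!=))*')


def lex_name(text):
    """Return the normalised name at the start of text, or None if there is none."""
    match = NAME_RE.match(text)
    if match is None:
        return None
    return match.group(0).replace(' ', '').lower()


print(lex_name("key:x"))                   # key:x
print(lex_name("Key Is Pressed (key:x)"))  # keyispressed
print(lex_name("x := 1"))                  # x
```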

I see baconthulhu also has scripts called "for each effect", "do level up", "do idle save". Incidentally, I was planning to add a "for each" flow construct once we have iterables.

Anyway timing name_concat variants:

t_NAME handles spaces (As it was): 1.18s
t_NAME doesn't handle spaces: 1.30s (I added a rule to allow name_concat to start with IF, etc, so that it would compile)
t_NAME handles spaces, '.' and ':': 1.14s
As above, and replace name_concat symbol with NAME: 1.03s (The parser performs one less step)

So it doesn't make as big a difference as I'd hoped. But getting quite close to the speed of HSpeak.
(The only reason I care so much about speed is because of those huge Entrepreneur scripts. I spent a lot of time optimising HSpeak to handle them better. I think I sped it up around 3x, and HSpeak is compiled to C while Python is, uh, slow, so this is actually competing really well! The speed is good enough for all other games.)
Last edited by TMC on Wed Mar 18, 2020 8:00 am, edited 1 time in total.