Skip to content

Conversation

akinomyoga
Copy link
Collaborator

See commit messages.

Note: Now ([k1]=v1 [k2]=v2) is not a literal for BashAssoc

The construct (...) in a=(...) and a+=(...) always produces the new type InitializerList regardless of the forms of its items. The LHS object receives the initializer list and reads items one by one to modify its content. When the variable a is Undef, a is initialized to be an indexed array and processes the initializer list, so a=([k1]=v1 [k2]=v2) produces a sparse array (but not an associative array).

@akinomyoga akinomyoga force-pushed the InitializerList branch 4 times, most recently from 57c50bf to a444d81 Compare February 27, 2025 12:38
@akinomyoga
Copy link
Collaborator Author

Cleanup of ShArrayLiteral

Another thing to discuss is how much we should clean up the ShArrayLiteral-related codes. We no longer use ShArrayLiteral for the construct a=(...), but YSH seems to use ShArrayLiteral for another construct :| ... |, so we cannot simply remove ShArrayLiteral. Then, it is nontrivial which uses of ShArrayLiteral in the codebase could be removed.

I guess the definition in frontend/syntax.asdl and related functions in frontend/location.py should be kept:

./frontend/location.py:32:    ShArrayLiteral,
./frontend/location.py:220:        if case(word_part_e.ShArrayLiteral):
./frontend/location.py:221:            part = cast(ShArrayLiteral, UP_part)
./frontend/location.py:293:        if case(word_part_e.ShArrayLiteral):
./frontend/location.py:294:            part = cast(ShArrayLiteral, UP_part)
./frontend/location.py:491:        elif case(expr_e.ShArrayLiteral):
./frontend/location.py:492:            node = cast(ShArrayLiteral, UP_node)
./frontend/syntax.asdl:187:  ShArrayLiteral = (Token left, List[word] words, Token right)
./frontend/syntax.asdl:205:    ShArrayLiteral %ShArrayLiteral
./frontend/syntax.asdl:254:    # For word sequences command.Simple, ShArrayLiteral, for_iter.Words
./frontend/syntax.asdl:362:                                  # like ShArrayLiteral, but no location for %(
./frontend/syntax.asdl:554:  | ShArrayLiteral %ShArrayLiteral

I think the uses of ShArrayLiteral in ysh/expr_{parse,eval,to_ast}.py should also be kept:

./ysh/expr_eval.py:18:    ShArrayLiteral,
./ysh/expr_eval.py:1219:            elif case(expr_e.ShArrayLiteral):  # var x = :| foo *.py |
./ysh/expr_eval.py:1220:                node = cast(ShArrayLiteral, UP_node)
./ysh/expr_parse.py:5:                                       CommandSub, ShArrayLiteral,
./ysh/expr_parse.py:223:            lit_part = ShArrayLiteral(left_tok, words3, close_tok)
./ysh/expr_to_ast.py:14:    ShArrayLiteral,
./ysh/expr_to_ast.py:692:            return cast(ShArrayLiteral, pnode.GetChild(1).tok)
./ysh/expr_to_ast.py:695:            return cast(ShArrayLiteral, pnode.GetChild(1).tok)

I guessed the uses in osh/* can be removed in most cases, but it turned out that YSH calls word_ev.EvalWordSequence() here:

strs = self.word_ev.EvalWordSequence(words)

which means that the :| ... | literal can be processed by osh/word_eval.py. Then, we need to be careful about removing ShArrayLiteral-related codes in osh/word_eval.py, etc.

What I removed

I removed the processing of ShArrayLiteral in EvalRhsWord (by my guess it wouldn't be triggered by YSH). The error didn't arise so far, but I'm not sure if the test coverage is perfect.

What I left

There are still places where ShArrayLiteral is used in the OSH codebase. I was not sure if it can be safely removed, so I left them in the codebase in the present PR.

  1. The first place is _EvalWordPart (osh/word_eval.py):

oils/osh/word_eval.py

Lines 1845 to 1853 in a444d81

with tagswitch(part) as case:
if case(word_part_e.ShArrayLiteral):
part = cast(ShArrayLiteral, UP_part)
e_die("Unexpected array literal", loc.WordPart(part))
elif case(word_part_e.InitializerLiteral):
part = cast(word_part.InitializerLiteral, UP_part)
e_die("Unexpected associative array literal",
loc.WordPart(part))

This anyway produces an error message, so I think the removal of this part wouldn't affect correct scripts. However, I'm not sure if there is indeed a case where :| ... | comes here and we want to print the error message for wrong scripts.

  1. Another place is inside _EvalWordPart (osh/word_.py):

oils/osh/word_.py

Lines 99 to 107 in a444d81

elif case(word_part_e.ShArrayLiteral, word_part_e.InitializerLiteral,
word_part_e.ZshVarSub, word_part_e.CommandSub,
word_part_e.SimpleVarSub, word_part_e.BracedVarSub,
word_part_e.TildeSub, word_part_e.ArithSub,
word_part_e.ExtGlob, word_part_e.Splice,
word_part_e.ExprSub):
return False, '', False

  1. ShArrayLiteral is skipped in DoWordPart (tools/ysh_ify.py):

oils/tools/ysh_ify.py

Lines 1020 to 1026 in a444d81

with tagswitch(node) as case:
if case(word_part_e.ShArrayLiteral, word_part_e.InitializerLiteral,
word_part_e.TildeSub, word_part_e.ExtGlob):
pass
elif case(word_part_e.EscapedLiteral):

Renaming {Sh => Ysh}ArrayLiteral?

If ShArrayLiteral is only used by YSH's :| ... |, maybe we should rename it to YshArrayLiteral.

Copy link
Contributor

@andychu andychu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow thank you! I read over the docs ONLY

That helps and clarifies what it does ... I don't think I fully understood that

I will have more comments later once I look at the code

Copy link
Contributor

@andychu andychu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally looks good -- I mostly looked at the tests and the ASDL schema

I think we should rename ShArrayLiteral to YshArrayLiteral -- that is a good idea, since it's only used for YSH now

)

AssocPair = (CompoundWord key, CompoundWord value)
AssocPair = (CompoundWord key, CompoundWord value, bool has_plus)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed in frontend/syntax.asdl we have assign_op = Equal | PlusEqual, and AssignPair uses it

In core/runtime.asdl, AssignArg has bool has_plus

(One of them is static and one is dynamic)

It might be better to be consistent


IntBox = (int i)

InitializerValue = (str? key, str rval, bool plus_eq)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be cleaner to use frontend/syntax.asdl assign_op everywhere, although we can also refactor that separately probably

* [osh/word_parse] Allow mixed forms of initializer in BashAssocLiteral

  We allow mixed forms of initializer such as (1 2 [3]=4).  We extend
  the element type of "BashAssocLiteral.pairs" to "InitializerWord",
  which is either "ArrayWord" or "AssocPair". "ArrayWord" is the non
  array-assignment form of a word.  In addition, "AssocPair" now
  records also whether the word in initializer has the form "[]=" or
  "[]+=".

* [refactor frontend/syntax] Rename "{BashAssoc => Initializer}Literal"
The current OSH behavior of overwriting "assoc" with a new indexed
array for "assoc=(1 2)" conflicts with the Bash 5.1 feature
"assoc=(key value)".  The current master behavior will soon be changed
when the Bash 5.1 feature is implemented, yet we keep the current
master behavior for now to avoid changing the behavior too many times.
This commit adjusts the behavior to be backward compatible with the
current master version.
Now BashArray/BashAssoc cannot appear on the RHS. The construct =(...)
always produce InitializerList.
* [spec/assoc] Update spec tests for "assoc=([key]=value)"

  We now support the sparse-array initialization of the form
  "arr=([index]=value)", so an attempt of assigning an initializer
  list to an indexed array no longer overwrites the original array
  with BashAssoc.  With this change, the expected behaviors of three
  spec tests in spec/assoc.test.sh are changed.  We adjust the
  expected results of the three tests and copy them into
  spec/ble-sparse.test.sh.  We also leave BashAssoc versions of spec
  tests in spec/assoc.test.sh.

* [spec/ysh-printing] Update spec tests for "assoc=([key]=value)"
The initialier-list implementation resolves the inconsistency in the
sparse array representation of the form "declare -a sp=([2]=v)".  This
commit removes the "oils_failures_allowed" introduced in Ref. [1] (as
a part of PR 2257, which turned on the sparse array representation of
indexed arrays).

[1] bfa7497#diff-097d35f191fa3ada9a05d61d3020a2b3acb43e800141b67021094fb05d8e769e
OSH now implements the feature required by spec/assoc#37 "Implicit
increment of keys" so passes the test.
* [doc/ref/chap-osh-assign] Add descriptions about the initializer list
* [doc/{ref/{chap-type-method,toc-osh},known-differences,quirks}] Update
* [doc/ref/chap-osh-assign] Move the paragraph about =() vs +=()
* [doc/ref/chap-osh-assign] Fix a typo "asso{ => ci}ative"
* [doc/ref/chap-type-method] Mention sh-init-list for initialization/mutation
* [doc/ref/toc-osh] Update an example for "sh-init-list"
* [doc/ref/chap-osh-assign] Explain LHS type for the list initialization using a table
* [doc/ref/chap-osh-assign] Explain list initialization of BashArray/BashAssoc using examples
@akinomyoga
Copy link
Collaborator Author

I think we should rename ShArrayLiteral to YshArrayLiteral -- that is a good idea, since it's only used for YSH now

4756763

Copy link
Contributor

@andychu andychu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thank you! I fixed a couple typos in the docs

I think we should consider the renaming mentioned

And I may make some more tweaks to the docs, but this is a big step forward!

@andychu andychu merged commit 4991783 into soil-staging Mar 1, 2025
18 checks passed
@akinomyoga akinomyoga deleted the InitializerList branch March 1, 2025 21:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants