
dep_b 2y ago

Are they stored as 16 bit words before or after parsing the BASIC code?

vidarh 2y ago

The BASIC code is only in its full textual form on screen. The moment you press return on it, it's tokenized, and it's stored tokenized both in memory and when saved. Unlike modern systems, the full textual representation of the code is never stored anywhere.

dep_b (OP) 2y ago

When I SAVE a program in C64 BASIC and LOAD it again, the syntax doesn't change no matter what I do: add spaces or not, use shorthand or not, colons, etc. So I get the feeling that my whole program gets saved as a string and then parsed, not tokenized and saved.

Also, there is a line length limit in C64 BASIC that would be exceeded if certain shorthand were expanded, and it would be even more confusing for beginners to see their fully written keywords turned into shorthand after loading.

pgeorgi 2y ago

The keywords are tokenized, the line number is converted to a 16-bit integer, leading spaces are stripped (which is why some "formatted" BASIC uses ":" as the first character in a line, like the following), and everything else is kept intact.

10 for i = 1 to 10

20 : (arbitrary number of spaces) print "hello"

30 next
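To make that concrete, here is a rough Python sketch of the behaviour described above. It is not the actual ROM routine: the function name `tokenize_line` is made up, the keyword table is a tiny subset with token values quoted from memory (check a C64 BASIC V2 reference before relying on them), the characters are plain ASCII rather than PETSCII, and it assumes every line starts with a line number.

```python
import struct

# Tiny illustrative subset of keyword tokens (assumed values, not a full table).
TOKENS = {"FOR": 0x81, "NEXT": 0x82, "PRINT": 0x99, "TO": 0xA4}


def tokenize_line(text: str) -> bytes:
    """Return 16-bit little-endian line number + tokenized body + 0 terminator."""
    text = text.lstrip()
    digits = ""
    while text and text[0].isdigit():
        digits, text = digits + text[0], text[1:]
    out = bytearray(struct.pack("<H", int(digits)))

    body = text.lstrip()   # spaces between line number and first character are dropped
    upper = body.upper()   # used only for keyword matching
    i, in_quotes = 0, False
    while i < len(body):
        ch = body[i]
        if ch == '"':
            in_quotes = not in_quotes
        match = None
        if not in_quotes:  # string literals are kept byte for byte
            if ch == "?":
                match = ("?", TOKENS["PRINT"])  # shorthand maps to the same token
            else:
                for kw, tok in TOKENS.items():
                    if upper.startswith(kw, i):
                        match = (kw, tok)
                        break
        if match:
            out.append(match[1])
            i += len(match[0])
        else:
            out.append(ord(ch))  # everything else, including spaces, is kept
            i += 1
    out.append(0)
    return bytes(out)
```

With this sketch, `tokenize_line('1?')` and `tokenize_line('1 print')` produce the same bytes, while the spaces after the ":" in line 20 survive untouched.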

The shorthand issue is real, too:

1?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?:?

expands into six lines of "1 print:print:....:print" that you can't simply edit, because the limit is 80 characters (two screen lines).
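The arithmetic behind that can be sketched in Python. This is only an illustration under stated assumptions: `listed_form` and `fits_editor` are hypothetical names, only the "?" shorthand is expanded, and the listing is assumed to be the line number, one space, then the expanded text.

```python
LINE_LIMIT = 80  # editable line length mentioned in the thread (two 40-column rows)


def listed_form(line_number: int, body: str) -> str:
    """Expand the '?' shorthand to PRINT outside string literals, as a listing would."""
    out, in_quotes = [], False
    for ch in body:
        if ch == '"':
            in_quotes = not in_quotes
        out.append("PRINT" if ch == "?" and not in_quotes else ch)
    return f"{line_number} " + "".join(out)


def fits_editor(line_number: int, body: str) -> bool:
    """True if the listed form is still short enough to be re-entered as typed."""
    return len(listed_form(line_number, body)) <= LINE_LIMIT


# Twenty '?' statements take 39 characters to type but list as 121 characters.
body = ":".join(["?"] * 20)
print(len(body), len(listed_form(1, body)), fits_editor(1, body))
```

Every "?" grows from one character to five once listed, so a line that fit comfortably when typed can end up well past the limit.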

actionfromafar 2y ago

It's an accident of history this didn't continue. So many code style wars could have been avoided over the eons.

vidarh 2y ago

It wasn't that it didn't continue, but that this was unique to a branch of languages that were largely sidelined.

And the tokenization didn't prevent style differences anyway: as the article points out, it keeps spaces, for example. It only tokenized a few things, like keywords and line numbers.

(EDIT: In the late '90s I worked on a project written in Word BASIC. It was also tokenized, and that was used as an opportunity to translate the keywords in the localised versions of Word. But someone had managed to write a bunch of code in the Danish version, somehow export it as text, and import it into the Norwegian version. The languages are similar enough that it was really hard to tell: there was no syntax highlighting, and they'd edited a bunch before realising. I had the fun job of untangling it... Yay...)

