2022-08-29 :: We need more sub-languages
In my opinion, what we need is not new languages, but more sub-languages. By that I mean, either:
- A safe implementation of a sub-set of an unsafe language;
- An easier-to-learn and easier-to-work-with sub-set of a sizable language and its standard library.
Examples of possible applications:
- A SubC language for which it would be easier to write a compiler that makes the language safe, or at least safer than the full C specification. The rationale is that refactoring a C codebase to a sub-set of C would be easier than rewriting it in another language. Note that on rare occasions (especially in low-level software), some inherently unsafe code is necessary, and C allows it.
- Avoiding a proliferation of DSLs (Domain Specific Languages), for example by taking a sub-set of Java for a build system.
For related (and already implemented or prototyped) work on making deep-seated languages safer, I recommend looking at the papers written by Stephen Kell.
Specifically for the GNOME community, Kell writes about GObject and its
introspection system (GIR) in "Towards a dynamic object model within Unix
processes", with liballocs. I haven't tested it in practice against the
codebases I work on, but doing so would be valuable.
Why? A personal perspective
The workplace is more and more specialized. We cannot be generalists and specialists at the same time, and things like DevOps just put too much stress on already overwhelmed computer scientists. (That's where we must not hesitate to subcontract.)
Juggling between different languages and technologies can be difficult. I would rather be very proficient with a single main programming language and its standard library, with its main characteristics at my fingertips.
For build systems, the usual argument put forward for a DSL is that it's not
dependent on a specific language implementation (and a specific version of
it). But CMake has a somewhat special way to handle variables and scoping, and
Meson totally lacks any variable scope (all variables are global, which makes
me uncomfortable, although that feeling is not widely shared among Meson users
and authors).
So, why not a sub-set of the Java syntax, alongside a very small sub-set of
its standard library, plus some classes specific to the build system? Such
build instructions would run on a compliant Java runtime environment (JRE),
but alternative implementations would be feasible too, if a sufficiently
small sub-set of the language is carefully chosen. Most developers would at
least feel at home.
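To make the idea a bit more concrete, here is a minimal sketch of what such build instructions could look like. It is only an illustration: the classes Project and Executable, their methods, and the file and package names are all hypothetical and do not come from any existing build system. The description itself (the describe() method) deliberately uses only local variables, constructor calls and method calls, which is the kind of small sub-set an alternative implementation could support.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical build-system classes (assumption: no existing tool provides them).
final class Executable {
    final String name;
    final List<String> sources = new ArrayList<>();
    final List<String> dependencies = new ArrayList<>();

    Executable(String name) {
        this.name = name;
    }

    void addSource(String path) {
        sources.add(path);
    }

    void addDependency(String packageName) {
        dependencies.add(packageName);
    }
}

final class Project {
    final String name;
    final List<Executable> executables = new ArrayList<>();

    Project(String name) {
        this.name = name;
    }

    // Creates an executable target, registers it in the project and returns it.
    Executable addExecutable(String executableName) {
        Executable executable = new Executable(executableName);
        executables.add(executable);
        return executable;
    }
}

// The build description: what a project maintainer would write.
final class BuildDescription {
    static Project describe() {
        // Ordinary block-scoped local variables, nothing global.
        Project project = new Project("hello");
        Executable app = project.addExecutable("hello-app");
        app.addSource("src/main.c");
        app.addSource("src/utils.c");
        app.addDependency("glib-2.0");
        return project;
    }

    public static void main(String[] args) {
        // A real tool would generate build rules; this sketch only prints the targets.
        Project project = describe();
        for (Executable executable : project.executables) {
            System.out.println(executable.name + ": " + executable.sources
                    + " needs " + executable.dependencies);
        }
    }
}
```

Since the description only relies on a handful of constructs, the variable scoping rules are the ones Java developers already know, and no new syntax needs to be learned.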
Note: this is an opinion blog post; I don't intend to work in that area in the near future :-)
Edit: more thoughts
- For a potential SubC language: having a tool to locate unsafe parts in a codebase, and to add annotations to the source code (like with GObject Introspection) for what the C syntax lacks in terms of safety.
- Additionally, I think that for high-level code like a GTK-based library or application, more runtime checks (e.g., for memory handling: avoiding dangling references and leaks [1]) would not hurt performance too much and would be highly beneficial. The C specification also doesn't preclude such runtime checks.
- Above, I didn't want to criticize Meson or CMake; lots of people like them. It was more to give an example of avoiding Domain Specific Languages. And I know, Java is not the most appreciated language within the GNOME community :-) But it's a popular language and often taught during computer science studies.
- On a related matter, see the short article that I wrote about Mallard and Ducktype. Again, if you intend to write lots of documentation, Ducktype is a fine language. It really depends on whether you need to work with lots of different lightweight markup formats.
- A "sub-language" can also just be an older standard of an existing language, plus some additional compile-time or run-time checks. With time, languages become larger and larger, to the point that it becomes hard to master all the features of an initially easy-to-learn language like Python. Adapting the code to new standards can also be tedious work, especially when giving it a facelift to make it more idiomatic for the new ways of doing things.
- A "sub-language" could also just be applied to the documentation (including the API reference) and to text editor tooling (auto-completion, etc.). A related idea would be step-by-step learning, progressively increasing the language coverage.
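As a very rough illustration of the first point in this list (a tool to locate unsafe parts in a codebase), here is a minimal sketch. It rests on strong assumptions: the class UnsafeCallLocator and its deny-list of four C library functions are hypothetical, chosen only for the example, and the scan is purely textual, so it would also report matches inside comments and string literals. A real SubC tool would need to work on a parsed representation of the code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reports calls to a few C functions commonly considered unsafe.
final class UnsafeCallLocator {
    // Illustrative deny-list only (assumption); a real SubC definition would be far more precise.
    // The leading \b avoids matching longer names such as fgets or vsprintf.
    private static final Pattern UNSAFE_CALL =
            Pattern.compile("\\b(gets|strcpy|strcat|sprintf)\\s*\\(");

    // Returns one finding per match, formatted as "line N: function".
    static List<String> locate(String cSource) {
        List<String> findings = new ArrayList<>();
        String[] lines = cSource.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            Matcher matcher = UNSAFE_CALL.matcher(lines[i]);
            while (matcher.find()) {
                findings.add("line " + (i + 1) + ": " + matcher.group(1));
            }
        }
        return findings;
    }

    public static void main(String[] args) {
        String source = "char buf[8];\nstrcpy(buf, input);\nputs(buf);\n";
        for (String finding : locate(source)) {
            System.out.println(finding);
        }
    }
}
```

The findings could then serve as the list of places where either a refactoring or an explicit annotation is needed.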
P.S.: Note that I don't have practical experience implementing some of the
above; for example, I have never really designed a new programming language
or a sub-language. So take my thoughts with a grain of salt ;-) However, I do
have experience with high-level plumbing and GUI application development
(developer tools, mainly).
P.P.S.: I've also had a (bad) experience being a "generalist + specialist" on my own (for small projects in areas where I was not a specialist). But CS has no mercy; computers are quite square. All the details need to be written down (and it's unfeasible to anticipate all of them when asked for a deadline estimate, by the way), and a flaw is always just around the corner: it can haunt you at any, unpredictable, time. Hence my preference for a single main programming language plus possibly several sub-sets of it instead of DSLs (and for just being a developer or architect, not meddling with sysadmin stuff that I master far less well). And for working within a diverse team, being able to delegate work, subcontract as I already said, and so forth.
[1] For a good vocabulary in CS, I refer to "Concepts, Techniques, and Models
of Computer Programming" (Peter Van Roy and Seif Haridi). For patient readers
and happy bookworms (alongside a good cup of tea, coffee or whatever else,
but here it would be more like a whole pan!), I can recommend that book.
"Have fun, whoever gets to work on this :)"
- Andy Wingo
That's all for now, sometimes I feel like writing or reading, other times I feel more like programming! (in execution mode).
Last update: 2022-10-07
October 2022: rephrase and make some parts easier to understand.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Thanks for reading. Even though it's not possible to write comments on this blog, don't hesitate to drop me an email. I do read them, and I like to get feedback.